class: center, middle, inverse, title-slide

# Three models with latent variables
## parameter inference
### 吴燕丰
### 2018/03/28

---

# An example: Gaussian Mixtures

A mixture of two Gaussians:

`$$X=\begin{cases} X_1 \sim \mathcal{N}(\mu_1,\sigma_1^2) & \text{ if } L = 1;\\ X_2 \sim \mathcal{N}(\mu_2,\sigma_2^2) & \text{ if } L = 2. \end{cases}$$`

where `\(L\)` is a **latent variable** (whose value is unobservable),

`$$L=\begin{cases} 1 & \text{ w.p. } p; \\ 2 & \text{ w.p. } 1-p. \end{cases}$$`

--

**Parameter inference** methods:

- Maximum Likelihood Estimation
- Method of Moments
- Bayesian method

???

`$$p \mathcal{N}(\mu_1,\sigma_1^2) + (1-p) \mathcal{N}(\mu_2,\sigma_2^2)$$`

---

# Models with latent variables

.pull-left[
Advantages

- flexibility ↑
- closer to data
]

.pull-right[
Disadvantages

- tractability ↓
- computationally expensive
]

--

<br><br><br>

**Parameter inference**

- is difficult
- plays an important role

.center[in the application of such models]

???

# Why Statistical Model?

.pull-left[
Observation (DATA)

- variability
- partial explanation (easy)
- full explanation (hard)
]

.pull-right[
Explanation (MODEL)

- Is the world deterministic?
- Is the world random?
]

<br>

.center[ Trade-off between observation and (full) explanation ]

.center[<span style="font-size:2em;">↓</span>]

.center[a statistical model is more persuasive]

---

# OU processes

OU processes ( **OU_LV** )

`$$X_t = e^{-\lambda\Delta}X_{t-1} + \int_{(t-1)\Delta}^{t\Delta}e^{(s-t\Delta)\lambda}dL_s$$`

where `\(L_s\)` is an unobservable Lévy process.

<br>

--

A simple subclass of the above processes ( **OU_CP** ):

`$$X_t = e^{-\lambda\Delta}X_{t-1} +\sum_{i=N(\lambda(t-1)\Delta)+1}^{N(\lambda t\Delta)}e^{(s_i-t\Delta)\lambda}J_i$$`

Note:

1. `\(N(\cdot)\)` is a Poisson process, and `\(s_i\)` is the `\(i\)`-th arrival time
2. 
`\(J_i\sim \mathcal{J}(\cdot,b)\)`

---

# A trajectory of OU_CP

.center[]

---

# How to estimate the parameters (OU_CP)

Given a trajectory of `\(X_t\)`, `\(x_0,\cdots,x_N\)` (assume `\(\lambda\)` is known), define

`$$Y_t = X_t - e^{-\lambda\Delta}X_{t-1}$$`

Likelihood function:

`$$l(\theta|y_{1:N})=\prod_{t=1}^{N}f_Y(y_t)=?$$`

where `\(f_Y(\cdot)\)` is the p.d.f. of `\(Y_t\)`,

--

`$$f_Y(y|y>0)=\sum_{n=1}^{\infty}e^{-a\lambda\Delta}\frac{(a\lambda\Delta)^n}{n!}\int_{0}^{\lambda\Delta}\cdots\int_{0}^{\lambda\Delta}f_n(y,s_1,\cdots,s_n)\frac{1}{(\lambda\Delta)^n}ds_1\cdots ds_n$$`

`$$\begin{align*} f_n(y,s_1,\cdots,s_n)&\triangleq\sum_{i=1}^{n}\lambda_ie^{-y\lambda_i}\left( \prod_{j=1,j\neq i}^{n}\frac{\lambda_j}{\lambda_j-\lambda_i} \right) \quad (n>0, \lambda_j\neq \lambda_i)\\ \lambda_i&=\frac{b}{e^{s_i-\lambda\Delta}} \quad (s_i \in (0,\lambda\Delta]) \end{align*}$$`

---

# MLE vs. MM (OU_CP)

### MLE (Maximum Likelihood Estimation) is not well suited to OU processes

--

### MM (Method of Moments) is promising

--

`$$\begin{align*} E[Y_t]&=E\left[\sum_{i=N(\lambda(t-1)\Delta)+1}^{N(\lambda t\Delta)}e^{(s_i-t\Delta)\lambda}J_i\right]\\ &=a(1-e^{-\lambda\Delta})E[J_i]\\ E[Y_t^2]&= (\lambda a\Delta)E[e^{2(s_i-1)\lambda\Delta}]E[J_i^2]\\ &~~+(\lambda a\Delta)^2E^2[e^{(s_i-1)\lambda\Delta}]E^2[J_i]\\ E[Y_t^3]&=\cdots \end{align*}$$`

---

# Estimation of parameters (OU_CP)

Estimator of `\(\lambda\)`:

`$$\hat{\lambda}=\frac{1}{\Delta}\max_{1\le t \le N}\{\ln X_{t-1}-\ln X_t\}$$`

Method of Moments for the remaining parameters:

`$$\begin{align*} \hat{a} &= -\frac{1}{\hat{\lambda}\Delta}\ln\bar{1}_{\{Y(\hat{\lambda})=0\}}\\ E[J_i] &= \frac{\overline{Y}(\hat{\lambda})}{\hat{a}(1-e^{-\hat{\lambda}\Delta})} \end{align*}$$`

---

# Estimation of parameters (OU_LV)

Estimator of `\(\lambda\)`:

`$$\hat{\lambda}=-\frac{1}{\Delta}\log(\hat{\rho}_1)$$`

where `\(\hat{\rho}_1=\hat{\gamma}_1/\hat{\gamma}_0\)`,

`$$\begin{align*} \hat{\gamma}_0 
&=\frac{1}{N+1}\sum_{n=0}^{N}(X_n-\bar{X})(X_n-\bar{X}), \\ \hat{\gamma}_1 &= \frac{1}{N}\sum_{n=0}^{N-1}(X_{n+1}-\bar{X})(X_n-\bar{X}), \\ \bar{X} &=\frac{1}{N+1}\sum_{n=0}^NX_n. \end{align*}$$`

---

# Estimation of parameters (OU_LV)

Method of Moments for the remaining parameters:

`$$\begin{align*} E[Y_t] &= \cdots \\ E[Y_t^2] &= \cdots \\ E[Y_t^3] &= \cdots \end{align*}$$`

---

# Numerical results (OU_LV)

.center[]

---

# Stochastic volatility model (OU_CP type)

**Observable** `\(y_t\)`:

`$$y_t = \mu\Delta + \beta q_t + \sqrt{q_t}\epsilon_t, \quad t=1,\cdots,N.$$`

**Unobservable** `\(q_t,v_t,\epsilon_t\)`:

`$$\begin{align*} q_t &= \frac{1}{\lambda}[(z_{t}-z_{t-1})-(v_t-v_{t-1})],\\ v_t &= e^{-\lambda\Delta}v_{t-1} + \int_{(t-1)\Delta}^{t\Delta}e^{\lambda(s-t\Delta)}dz(\lambda s),\\ \epsilon_t & \sim \mathcal{N}(0,1). \end{align*}$$`

Note:

- `\(v_t\)` is an **OU_CP** process.
- `\(z(t)\)` is a compound Poisson process.

---

# Method of Moments

`$$\begin{align*} E[y_t] &= \mu\Delta + \beta E[q_t], \\ var(y_t) &= \beta^2\left( E[q_t^2] - E^2[q_t] \right) + E[q_t],\\ cov(y_t,y_{t-1}) & = \beta^2 cov(q_t,q_{t-1}), \\ cov(y_t^2,y_{t-1}) & = \beta^3 cov(q_t^2,q_{t-1}) + (\beta +2\mu\Delta\beta^2) cov(q_t,q_{t-1}),\\ cov(y_t,y_{t-1}^2) & = \beta^3 cov(q_t,q_{t-1}^2) + (\beta +2\mu\Delta\beta^2) cov(q_t,q_{t-1}). \end{align*}$$`

---

# Numerical results

sample `\(E[y_t]\)`, boxplot

.center[]

---

# Numerical results

sample `\(var(y_t)\)`, boxplot

.center[]

---

# Numerical results

sample `\(cov(y_t,y_{t-1})\)`, boxplot

.center[]

---

# Numerical results

sample `\(cov(y_t^2,y_{t-1})\)`, boxplot

.center[]

---

# Numerical results

sample `\(cov(y_t,y_{t-1}^2)\)`, boxplot

.center[]
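
---

# Sketch: the OU_CP and OU_LV estimators in code

A minimal Python sketch of the estimators from the "Estimation of parameters" slides, checked on a simulated OU_CP path. It is an illustration, not the code behind the numerical results. Assumptions that are not in the slides: the jumps `\(J_i\)` are exponential with rate `\(b\)`, and the function names, the `rel_tol` threshold used to decide that `\(Y_t=0\)` in floating point, and the parameter values are all hypothetical.

```python
import numpy as np


def simulate_ou_cp(n, lam, a, b, delta, x0=1.0, seed=0):
    """Simulate X_0..X_n of the OU_CP recursion.

    Assumption: jumps J_i ~ Exponential(rate b). The number of jumps per
    step is Poisson with mean a*lam*delta, as in the f_Y slide.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n + 1)
    x[0] = x0
    decay = np.exp(-lam * delta)
    for t in range(1, n + 1):
        k = rng.poisson(a * lam * delta)
        # Given k jumps, arrival times are uniform in the step;
        # u = t*delta - s_i is the time left until the end of the step.
        u = rng.uniform(0.0, delta, size=k)
        jumps = rng.exponential(1.0 / b, size=k)
        x[t] = decay * x[t - 1] + np.sum(np.exp(-lam * u) * jumps)
    return x


def estimate_ou_cp(x, delta, rel_tol=1e-9):
    """Return (lam_hat, a_hat, mean_J_hat) for a positive path x."""
    # lam_hat = (1/delta) * max_t { ln X_{t-1} - ln X_t }
    lam_hat = np.max(np.log(x[:-1]) - np.log(x[1:])) / delta
    y = x[1:] - np.exp(-lam_hat * delta) * x[:-1]
    # Fraction of steps with Y_t = 0 (no jump), up to rounding error.
    p0 = np.mean(np.abs(y) <= rel_tol * x[:-1])
    a_hat = -np.log(p0) / (lam_hat * delta)
    mean_j = np.mean(y) / (a_hat * (1.0 - np.exp(-lam_hat * delta)))
    return lam_hat, a_hat, mean_j


def estimate_lambda_acf(x, delta):
    """OU_LV estimator: lam_hat = -log(rho_1) / delta."""
    xc = x - np.mean(x)
    gamma0 = np.mean(xc * xc)                          # 1/(N+1) normalisation
    gamma1 = np.sum(xc[1:] * xc[:-1]) / (len(x) - 1)   # 1/N normalisation
    return -np.log(gamma1 / gamma0) / delta


x = simulate_ou_cp(n=20000, lam=1.0, a=2.0, b=1.5, delta=0.5, seed=1)
print(estimate_ou_cp(x, delta=0.5))       # compare with (1.0, 2.0, 1/1.5)
print(estimate_lambda_acf(x, delta=0.5))  # compare with 1.0
```

The max-based `\(\hat{\lambda}\)` needs at least one step without a jump, so it degrades when `\(a\lambda\Delta\)` is large; the autocorrelation-based estimator does not have that requirement but is noisier.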