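Input
\[ X \in \mathbb{R}^{n \times d} \quad \text{(} n \text{ samples, } d \text{ features)}, \qquad k \le d \quad \text{(number of components to keep)} \]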
Center data
For each feature \(j = 1, \ldots, d\):
Compute means: \(\mu_j = \text{mean}(X[:, j])\)
Center features: \(\tilde{X}[:, j] = X[:, j] - \mu_j\)
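The centering step can be sketched in NumPy as follows. The toy matrix `X` and the variable names `mu` and `X_centered` are illustrative assumptions, not part of the original notes; broadcasting replaces the explicit loop over features.

```python
import numpy as np

# Hypothetical toy data: n = 4 samples (rows), d = 2 features (columns).
X = np.array([[1.0, 2.0],
              [3.0, 6.0],
              [5.0, 10.0],
              [7.0, 14.0]])

mu = X.mean(axis=0)     # per-feature means mu_j, shape (d,)
X_centered = X - mu     # broadcasting subtracts mu_j from column j
```

After this step every column of `X_centered` has mean zero, which is what the covariance formula assumes.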
Compute covariance matrix
\[ \Sigma = \frac{1}{n - 1} \tilde{X}^T \tilde{X} \quad \text{(a } d \times d \text{ matrix)} \]
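A minimal sketch of the covariance computation, assuming randomly generated data in place of a real dataset. The comparison against `np.cov` is only a sanity check that the \(1/(n-1)\) normalisation matches NumPy's default.

```python
import numpy as np

# Hypothetical random data: n = 50 samples, d = 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))

n = X.shape[0]
X_centered = X - X.mean(axis=0)

# Sample covariance with the 1 / (n - 1) normalisation; result is d x d.
Sigma = X_centered.T @ X_centered / (n - 1)

# np.cov treats rows as variables by default, hence rowvar=False.
matches_numpy = np.allclose(Sigma, np.cov(X, rowvar=False))
```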
Find eigenvectors and eigenvalues
\[ (\Sigma - \lambda I) \cdot v = 0 \]
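In practice the eigenproblem is solved numerically rather than by hand. The sketch below uses `np.linalg.eigh`, which is intended for symmetric matrices such as a covariance matrix; the small matrix `Sigma` is a made-up example.

```python
import numpy as np

# Hypothetical 2 x 2 covariance matrix (symmetric, positive definite).
Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

# eigh returns eigenvalues in ascending order and unit-norm
# eigenvectors as the columns of eigvecs.
eigvals, eigvecs = np.linalg.eigh(Sigma)

# Check Sigma v = lambda v for every eigenpair at once:
# column i of eigvecs is scaled by eigvals[i].
residual = Sigma @ eigvecs - eigvecs * eigvals
```

Note that `eigh` returns eigenvalues in ascending order, so they need to be reversed before picking the leading components.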
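Select components
Sort the eigenvalues in descending order, \(\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d\), place the corresponding unit eigenvectors in the columns of \(V\), and keep the first \(k\) columns, \(V[:, 1:k]\).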
Project data
\[ Z = \overbrace{\tilde{X}}^{n \times d} \cdot \overbrace{V[:, 1:k]}^{d \times k} \quad \text{(an } n \times k \text{ matrix)} \]
Output
\[ Z \in \mathbb{R}^{n \times k} \]
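Putting the steps together, one possible end-to-end sketch looks like this. The function name `pca_fit_transform`, its return values, and the random test data are assumptions made for illustration. Keeping the eigenvectors with the `k` largest eigenvalues is the usual selection rule.

```python
import numpy as np

def pca_fit_transform(X, k):
    """Hypothetical helper: return (Z, mu, V_k, eigvals) for data X and k components."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Center data.
    mu = X.mean(axis=0)
    X_centered = X - mu

    # Covariance matrix (d x d).
    Sigma = X_centered.T @ X_centered / (n - 1)

    # Eigenvectors and eigenvalues (eigh: ascending order).
    eigvals, eigvecs = np.linalg.eigh(Sigma)

    # Select components: reorder by descending eigenvalue, keep the first k.
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    V_k = eigvecs[:, order[:k]]

    # Project data: (n x d) @ (d x k) -> (n x k).
    Z = X_centered @ V_k
    return Z, mu, V_k, eigvals

# Usage on hypothetical random data: n = 100, d = 5, k = 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Z, mu, V_k, eigvals = pca_fit_transform(X, k=2)
```

Returning `mu` and `V_k` alongside `Z` is a design choice: both are needed later to project unseen points or to reconstruct data.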
Bonus Exercise
Prove that the variance of the centered data projected onto a unit-norm eigenvector \(v\) of \(\Sigma\) is equal to the corresponding eigenvalue \(\lambda\).
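Before attempting the proof, it can help to check the claim numerically. The sketch below is not a proof: it only compares the sample variance of each projection with the matching eigenvalue on made-up data. The mixing matrix `A` is an arbitrary assumption used to give the features different variances.

```python
import numpy as np

# Hypothetical correlated data: n = 200 samples, d = 3 features.
rng = np.random.default_rng(1)
A = np.array([[2.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.3, 0.2]])
X = rng.normal(size=(200, 3)) @ A

X_centered = X - X.mean(axis=0)
Sigma = X_centered.T @ X_centered / (X.shape[0] - 1)
eigvals, eigvecs = np.linalg.eigh(Sigma)

# Column i holds the data projected onto the unit eigenvector i.
projected = X_centered @ eigvecs

# ddof=1 matches the 1 / (n - 1) normalisation used for Sigma.
variances = projected.var(axis=0, ddof=1)
```

Here `variances` and `eigvals` agree up to floating-point error, which is what the exercise asks you to show in general.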
Project new point \(x\)
\[ \overbrace{z}^{1 \times k} = \overbrace{(x - \mu)}^{1 \times d} \cdot \overbrace{V[:, 1:k]}^{d \times k} \]
Lossy reconstruction
\[ \overbrace{x_{\text{reconstructed}}}^{1 \times d} = \overbrace{z}^{1 \times k} \cdot \overbrace{V[:, 1:k]^T}^{k \times d} + \overbrace{\mu}^{1 \times d} \]
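The two formulas above can be sketched as a pair of small functions. The names `project` and `reconstruct`, and the random data used to fit \(\mu\) and \(V\), are illustrative assumptions. Note that the mean subtracted from a new point is the mean of the training data, not of the new point.

```python
import numpy as np

# Fit mu and V on hypothetical training data: n = 100, d = 4.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 4))
mu = X.mean(axis=0)
X_centered = X - mu
Sigma = X_centered.T @ X_centered / (X.shape[0] - 1)
eigvals, eigvecs = np.linalg.eigh(Sigma)
V = eigvecs[:, np.argsort(eigvals)[::-1]]   # columns sorted by descending eigenvalue

def project(x, mu, V, k):
    """z = (x - mu) V[:, :k], shape (k,)."""
    return (x - mu) @ V[:, :k]

def reconstruct(z, mu, V):
    """x_reconstructed = z V[:, :k]^T + mu, shape (d,)."""
    k = z.shape[-1]
    return z @ V[:, :k].T + mu

x_new = rng.normal(size=4)

z = project(x_new, mu, V, k=2)                            # keep 2 of 4 components
x_rec = reconstruct(z, mu, V)                             # lossy reconstruction
x_full = reconstruct(project(x_new, mu, V, k=4), mu, V)   # k = d

err_lossy = np.linalg.norm(x_rec - x_new)
err_full = np.linalg.norm(x_full - x_new)
```

With \(k = d\) the reconstruction recovers the point up to rounding, because \(V\) is orthogonal. With \(k < d\) the part of \(x - \mu\) lying along the discarded eigenvectors is lost, which is why the reconstruction is lossy.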