…, the stronger its ability to explain how the data points are distributed.

Suppose x1, x2, …, xn are training samples with zero mean. The goal of PCA is to find a set of directions in this space that capture the maximum amount of variance in the data.

The projection of each sample xj onto a normalized direction v (with ‖v‖ = 1) is

vT·xj,  j = 1, …, n
The variance of these projections is

(1/n)·Σj (vT·xj)² = vT·C·v

where C = (1/n)·Σj xj·xjT is the covariance matrix of the data.

The optimal direction v can be found by solving the following constrained optimization problem:

maximize vT·C·v  subject to  vT·v = 1

Introducing a Lagrange multiplier λ turns this into the eigenvalue problem:

Cv = λv

Projections of the data on the principal axes are called principal components, also known as PC scores. Note that because v is a unit vector, the PC score obtained by projecting a point xj onto the v axis is simply vT·xj. Moreover, the eigenvector v belonging to the largest eigenvalue λ gives the direction of the largest principal component. If one more direction is needed, take the next-largest λ; the eigenvector v corresponding to that λ indicates the direction of the second-largest variance, and so on.
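As a quick numerical illustration of the above (a minimal NumPy sketch on synthetic data, not from the original post): the unit eigenvector of C with the largest eigenvalue attains the maximum projected variance, and that variance equals λ.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic zero-mean 2-D samples, one row per sample, stretched along one axis.
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])
X = X - X.mean(axis=0)

C = X.T @ X / len(X)          # covariance matrix
lam, V = np.linalg.eigh(C)    # eigh returns ascending eigenvalues for symmetric C
v = V[:, -1]                  # unit eigenvector of the largest eigenvalue

# The projected variance along v equals the top eigenvalue λ ...
assert np.isclose(np.var(X @ v), lam[-1])
# ... and no random unit direction gives a larger projected variance.
for _ in range(100):
    w = rng.normal(size=2)
    w /= np.linalg.norm(w)
    assert np.var(X @ w) <= np.var(X @ v) + 1e-12
```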

Since C equals its own transpose, it is symmetric; for a symmetric matrix with N distinct eigenvalues, the eigenvectors corresponding to those eigenvalues are mutually orthogonal. Writing the vectors in Cv = λv in matrix form, i.e., applying matrix diagonalization (eigendecomposition), gives C = VΛVT, where V is a matrix of eigenvectors (each column is an eigenvector) and Λ is a diagonal matrix with eigenvalues λi in decreasing order on the diagonal. The eigenvectors are called principal axes or principal directions of the data.

Projections of the data on the principal axes are called principal components, also known as PC scores; these can be seen as new, transformed variables. The j-th principal component is given by the j-th column of XV. The coordinates of the i-th data point in the new PC space are given by the i-th row of XV.
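These relationships can be checked numerically. A minimal NumPy sketch, assuming zero-mean synthetic data: the score matrix XV has a diagonal covariance, so the PCs are uncorrelated, and the variance of the j-th PC equals λj.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
X = X - X.mean(axis=0)            # PCA assumes zero-mean data

C = X.T @ X / len(X)
lam, V = np.linalg.eigh(C)
# Reorder to decreasing eigenvalues, matching C = V Λ V^T above.
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]

scores = X @ V                    # row i = coordinates of x_i in PC space
# The covariance of the scores is the diagonal matrix Λ.
S = scores.T @ scores / len(X)
assert np.allclose(S, np.diag(lam))
```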

Kernel PCA

As in ordinary PCA, we start from the covariance matrix C, but this time we compute it in the target (feature) space rather than the original space:

C = (1/n)·Σj Φ(xj)·Φ(xj)T = (1/n)·XTX

where the j-th row of X is Φ(xj)T.

C and XTX thus have the same eigenvectors. The problem now is that Φ is implicit and unknown to us, so we need to find a way to solve the eigenvalue problem of XTX with the help of the kernel function K.

The eigenvalue problem for K = XXT is (XXT)u = λu. What we actually need is XTX, so we left-multiply both sides of this equation by XT to construct the expression we want:

XT(XXT)u = λXTu

(XTX)(XTu)=λ(XTu)
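This identity is easy to verify numerically. A small NumPy sketch with an arbitrary synthetic matrix X (for illustration only): if u is an eigenvector of XXT, then XTu is an eigenvector of XTX with the same eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(5, 8))        # 5 samples, 8 feature-space dimensions

K = X @ X.T                        # n x n Gram matrix (computable via kernels)
lam, U = np.linalg.eigh(K)
u, lam_top = U[:, -1], lam[-1]     # top eigenpair of X X^T

# As derived above: (X^T X)(X^T u) = λ (X^T u),
# so X^T u is an eigenvector of X^T X with the same eigenvalue λ.
v = X.T @ u
assert np.allclose((X.T @ X) @ v, lam_top * v)
```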

…PC score is vT·Φ(xj). The procedure is summarized as follows.

Solve the following eigenvalue problem:

Ku = λu

The projection of the test sample Φ(xj) onto the i-th eigenvector vi of C can be computed entirely through the kernel function:

viT·Φ(xj) = (1/√λi)·Σm ui,m·K(xm, xj)

where ui is the i-th unit-length eigenvector of K, and the factor 1/√λi normalizes vi = XTui to unit length (since ‖XTui‖² = uiT·K·ui = λi).
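To sanity-check this projection formula, here is a minimal NumPy sketch that uses the linear kernel K(x, y) = xT·y, so that Φ is the identity map and the kernel-based projection can be compared against an explicit projection onto vi = XTui/√λi (synthetic data; variable names are illustrative, not from the original post).

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(6, 4))
X = X - X.mean(axis=0)        # center the training data; with a linear kernel
                              # this also centers the Gram matrix K = X X^T

K = X @ X.T                   # linear kernel: K[i, j] = x_i . x_j
lam, U = np.linalg.eigh(K)
u_i, lam_i = U[:, -1], lam[-1]    # top unit eigenpair of K

y = rng.normal(size=4)        # a new test sample (here Φ(y) = y)
k_vec = X @ y                 # k_vec[m] = K(x_m, y)

# Kernel-side projection: (1/sqrt(λ_i)) Σ_m u_im K(x_m, y)
proj_kernel = (u_i @ k_vec) / np.sqrt(lam_i)

# Explicit projection onto the unit principal direction v_i = X^T u_i / sqrt(λ_i)
v_i = X.T @ u_i / np.sqrt(lam_i)
proj_explicit = v_i @ y

assert np.isclose(proj_kernel, proj_explicit)
```

With a nonlinear kernel the explicit side is unavailable, but the kernel-side computation stays exactly the same.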

【1】http://blog.csdn.net/baimafujinji/article/details/50373143

【2】Some of the figures are taken from Dr. 李政軒's online lecture videos

【3】http://blog.csdn.net/baimafujinji/article/details/50372906

【4】https://stats.stackexchange.com/questions/134282/relationship-between-svd-and-pca-how-to-use-svd-to-perform-pca

【5】http://blog.csdn.net/baimafujinji/article/details/79372911