Kernel-independent component analysis

In statistics, kernel-independent component analysis (kernel ICA) is an efficient algorithm for independent component analysis that estimates source components by optimizing a generalized variance contrast function based on representations in a reproducing kernel Hilbert space.[1][2] These contrast functions use the notion of mutual information as a measure of statistical independence.

Main idea

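Kernel ICA is based on the idea that correlations between two random variables can be represented in a reproducing kernel Hilbert space (RKHS), denoted by $\mathcal{F}$, associated with a feature map $L_x \colon \mathcal{F} \to \mathbb{R}$ defined for a fixed $x \in \mathbb{R}$. The $\mathcal{F}$-correlation between two random variables $X$ and $Y$ is defined as

$$\rho_{\mathcal{F}}(X, Y) = \max_{f, g \in \mathcal{F}} \operatorname{corr}\left(\langle L_X, f \rangle, \langle L_Y, g \rangle\right)$$

where the functions $f, g \colon \mathbb{R} \to \mathbb{R}$ range over $\mathcal{F}$ and

$$\operatorname{corr}\left(\langle L_X, f \rangle, \langle L_Y, g \rangle\right) := \frac{\operatorname{cov}(f(X), g(Y))}{\operatorname{var}(f(X))^{1/2} \operatorname{var}(g(Y))^{1/2}}$$

for fixed $f, g \in \mathcal{F}$.[1] Note that the reproducing property implies that $f(x) = \langle L_x, f \rangle$ for fixed $x \in \mathbb{R}$ and $f \in \mathcal{F}$.[3] It follows that the $\mathcal{F}$-correlation between two independent random variables is zero, since $\operatorname{cov}(f(X), g(Y)) = 0$ for every pair $f, g$ when $X$ and $Y$ are independent.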
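As an illustration of how such a correlation might be estimated from samples, the following is a minimal NumPy sketch of a regularized kernel canonical correlation between two one-dimensional samples, in the spirit of the estimators discussed in [1]. The Gaussian kernel, the function names `centered_gaussian_gram` and `regularized_f_correlation`, and the default values of `sigma` and `kappa` are assumptions made for this example, not part of the definition above.

```python
import numpy as np


def centered_gaussian_gram(x, sigma=1.0):
    """Gaussian-kernel Gram matrix of a 1-D sample, centered in feature space."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    sq_dists = (x - x.T) ** 2
    K = np.exp(-sq_dists / (2.0 * sigma ** 2))
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix I - (1/n) 1 1^T
    return H @ K @ H


def regularized_f_correlation(x, y, sigma=1.0, kappa=2e-2):
    """Regularized sample estimate of the F-correlation between x and y.

    Assumption of this sketch: each centered Gram matrix K is mapped to
    R = K (K + (n * kappa / 2) I)^{-1}, and the estimate is the largest
    singular value of R_x R_y.
    """
    n = len(x)
    Kx = centered_gaussian_gram(x, sigma)
    Ky = centered_gaussian_gram(y, sigma)
    reg = 0.5 * n * kappa * np.eye(n)
    # K and (K + reg)^{-1} commute, so solve() yields the same matrix R.
    Rx = np.linalg.solve(Kx + reg, Kx)
    Ry = np.linalg.solve(Ky + reg, Ky)
    return np.linalg.svd(Rx @ Ry, compute_uv=False)[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(200)
    print("independent:", regularized_f_correlation(x, rng.standard_normal(200)))
    print("dependent:  ", regularized_f_correlation(x, x ** 2))
```

The pair `x` and `x ** 2` is linearly almost uncorrelated yet strongly dependent, so one would expect its estimate to be clearly larger than the one for the independent pair. Because of finite-sample effects and regularization, the estimate for independent samples is small but not exactly zero.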

This notion of $\mathcal{F}$-correlations is used for defining contrast functions that are optimized in the kernel ICA algorithm. Specifically, if $\mathbf{X} := (x_{ij}) \in \mathbb{R}^{n \times m}$ is a prewhitened data matrix, that is, the sample mean of each column is zero and the sample covariance of the rows is the $m \times m$ identity matrix, kernel ICA estimates an $m \times m$ orthogonal matrix $\mathbf{A}$ so as to minimize finite-sample $\mathcal{F}$-correlations between the columns of $\mathbf{S} := \mathbf{X}\mathbf{A}^{\prime}$.
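The procedure described above can be sketched for the two-dimensional case, where every rotation is determined by a single angle. The example below prewhitens the data and then performs a plain grid search over rotation angles, keeping the angle with the smallest estimated correlation between the two output columns. The grid search is a simplification chosen for readability and stands in for the optimization used in the published algorithm. All function names, the Gaussian kernel, and the parameters `sigma`, `kappa` and `n_angles` are assumptions of this sketch.

```python
import numpy as np


def whiten(X):
    """Center the columns of X and rescale so the sample covariance is I.

    Assumes the sample covariance matrix has full rank.
    """
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / Xc.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    return Xc @ eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T


def centered_gram(x, sigma=1.0):
    """Centered Gaussian-kernel Gram matrix of a 1-D sample."""
    d2 = (x[:, None] - x[None, :]) ** 2
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    H = np.eye(len(x)) - 1.0 / len(x)  # centering matrix I - (1/n) 1 1^T
    return H @ K @ H


def f_correlation(x, y, sigma=1.0, kappa=2e-2):
    """Regularized sample estimate of the correlation between x and y."""
    n = len(x)
    reg = 0.5 * n * kappa * np.eye(n)
    Kx, Ky = centered_gram(x, sigma), centered_gram(y, sigma)
    Rx = np.linalg.solve(Kx + reg, Kx)
    Ry = np.linalg.solve(Ky + reg, Ky)
    return np.linalg.svd(Rx @ Ry, compute_uv=False)[0]


def rotation(theta):
    """2 x 2 rotation matrix, an orthogonal matrix parameterized by one angle."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


def kernel_ica_2d(X, n_angles=45):
    """Grid search for the rotation A that minimizes the contrast on S = Xw A'."""
    Xw = whiten(X)
    # The contrast is unchanged by swapping or negating columns, so angles
    # in [0, pi/2) cover all distinct candidates.
    thetas = np.linspace(0.0, np.pi / 2, n_angles, endpoint=False)
    scores = []
    for t in thetas:
        S = Xw @ rotation(t).T
        scores.append(f_correlation(S[:, 0], S[:, 1]))
    A = rotation(thetas[int(np.argmin(scores))])
    return A, Xw @ A.T
```

A call such as `A, S = kernel_ica_2d(X)` on an `n × 2` array `X` returns the selected orthogonal matrix and the corresponding estimated components. For more than two dimensions a grid search is no longer practical, and an optimizer over orthogonal matrices would be needed instead.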

References

  1. Bach, Francis R.; Jordan, Michael I. (2003). "Kernel independent component analysis" (PDF). The Journal of Machine Learning Research. 3: 1–48. doi:10.1162/153244303768966085.
  2. Bach, Francis R.; Jordan, Michael I. (2003). "Kernel independent component analysis". 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03) (PDF). Vol. 4. pp. IV-876-9. doi:10.1109/icassp.2003.1202783. ISBN 978-0-7803-7663-2. S2CID 7691428.
  3. Saitoh, Saburou (1988). Theory of Reproducing Kernels and Its Applications. Longman. ISBN 978-0582035645.