Applied Mathematics/Learning Theory

Theoretical Prediction Accuracy of Nonnegative Matrix Factorization


Non-negative matrix factorization ( NMF ) is a method of decomposing a matrix with nonnegative value entries into products of two non-negative value elements. Signal processing such as image and sound, bioinformatics, text mining, consumer analysis, etc. are widely used as composite data analysis technology. It is used as a knowledge discovery method in various fields numerically and experimentally. For example, it is applied to decomposition of power spectrum in signal processing, analysis of purchasing factors in consumer analysis and extraction of purchasing patterns such as representative products in the factor. Despite its wide applicability, its mathematical structure, in particular the estimation accuracy, has not been clarified much. If the theoretical value of the estimation precision is known, a criterion can be obtained as to whether or not the numerical experiment is successful. Conversely, if there is no theoretical value, the correctness of the result obtained by numerical calculation is not guaranteed at all.

In general Bayesian Learning, there is the theoretical method to clarify expected generalization error. Moreover, if the model is singular i.e. posterior cannot be approximated to normal distribution, Bayesian method is superior to maximum likelihood method and maximum a posteriori method in that the generalization error is smaller. That is why we study theoretical prediction accuracy of Bayesian NMF.

Main Theorem

Please see: ''Upper Bound of Bayesian Generalization Error in Non-Negative Matrix Factorization''. We also made the bound tighter in the conference paper in IEEE SSCI 2017.

Numerical Experiments

We carried out numerical experiments for estimating the exact value of the generalization error. Experiments results show that the exact value is strictly larger than the generalization error in reduced rank regrerssion if the non-negative rank is strictly larger than usual rank of true matrix. Or else, i.e. if the non-negative rank is equal to the rank, then it seemed that the generalization error is equal to the one of reduced rank regression.

I have written the master thesis in this research, submitted and finished defence.

Model Selection by singular Bayesian Information Criterion (sBIC)

Under Construction