5.7. Non-negative Matrix and Tensor Factorization
5.7.1. Introduction
Many of the most descriptive features of speech are expressed in terms of energy; for example, formants are peaks and the fundamental frequency is visible as a comb structure in the power spectrum. A basic property of such features is that they are non-negative, since negative energy is not physically realizable. However, most signal processing methods are designed for real-valued variables of either sign, and including non-negativity constraints in them is cumbersome.
Non-negative matrix factorization (NMF or NNMF) and its tensor-valued counterparts are a family of methods which explicitly assume that the input variables are non-negative, so that they are by construction applicable to energy signals. In some sense, NMF methods extend principal component analysis (PCA) and other subspace methods to non-negative signals.
5.7.2. Model definition
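Specifically, suppose that the power (or magnitude) spectrum of one window of a speech signal is represented as a non-negative vector $v_k$ of length $N$, and that the spectra of $K$ consecutive windows are collected as the columns of the matrix $V = [v_1, \dots, v_K]$. The NMF model approximates this matrix as a product of two non-negative matrices,

$$ V \approx WH, $$

where $W$ is of size $N \times M$, $H$ is of size $M \times K$, and all elements of $W$ and $H$ are non-negative. The idea is that the columns of $W$ form a set of spectral building blocks, and the columns of $H$ specify how strongly each building block is present in each window. Since the model order $M$ is typically chosen to be much smaller than $N$ and $K$, the product $WH$ is a low-rank approximation of $V$. The model is generally optimized by

$$ \min_{W,H} \left\| V - WH \right\|_F \quad \text{such that} \quad W \geq 0,\; H \geq 0. $$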
Here the norm refers to the Frobenius norm, which is defined as the square root of the sum of the squared elements, $\|A\|_F = \sqrt{\sum_{i,j} a_{ij}^2}$. The above optimization problem has no analytic solution, but it can be solved with iterative numerical methods, which are included in typical software libraries.
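As an illustration of such a numerical method, the following is a minimal sketch of the multiplicative update rules, one common iterative approach to NMF. It factorizes a non-negative matrix `V` into non-negative factors `W` and `H` and tracks the Frobenius norm of the residual. The function name `nmf_multiplicative`, the variable names, the toy data, and the iteration count are assumptions made for this example; library implementations may use different algorithms and stopping criteria.

```python
import numpy as np

def nmf_multiplicative(V, M, n_iter=200, eps=1e-12, seed=0):
    """Approximate non-negative V (N x K) as W @ H, with W (N x M) and H (M x K) non-negative."""
    rng = np.random.default_rng(seed)
    N, K = V.shape
    # Random non-negative initialization (hypothetical choice for this sketch)
    W = rng.random((N, M)) + eps
    H = rng.random((M, K)) + eps
    errors = []
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay non-negative
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Frobenius norm of the residual after this iteration
        errors.append(np.linalg.norm(V - W @ H, "fro"))
    return W, H, errors

# Toy "spectrogram": 64 frequency bins, 100 windows, built from 3 non-negative components
rng = np.random.default_rng(1)
V = rng.random((64, 3)) @ rng.random((3, 100))

W, H, errors = nmf_multiplicative(V, M=3)
print("residual after first iteration:", errors[0])
print("residual after last iteration: ", errors[-1])
```

The result depends on the random initialization, and the factors are unique only up to scaling and permutation of the components, so in practice it is common to run the optimization several times or to normalize the columns of `W`.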
5.7.3. Application
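A typical use of NMF-type algorithms is source separation, where we find the solution of the above optimization problem and then identify those dimensions of the factorization which correspond to the desired source. Retaining only those dimensions gives an estimate of the power (or magnitude) spectrum of that source.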
Note, however, that NMF-type methods extract only the power (or magnitude) spectrum of the desired signal. The input signal, in contrast, is usually a time-frequency representation which also has a phase component. After NMF estimation, we therefore also need an estimate of the phase component of the signal. Such methods will be discussed in the speech enhancement chapter of this document.
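The separation step described above can be sketched as follows, under the assumption that a factorization into non-negative factors `W` and `H` is already available and that we know which components belong to the desired source. The sketch uses a soft ratio mask, which is one common choice and is an assumption of this example; the helper `separate_components`, the index list `keep`, and the small matrices are hypothetical. Only the magnitude spectrum is estimated here; the phase is not handled.

```python
import numpy as np

def separate_components(V, W, H, keep, eps=1e-12):
    """Estimate the magnitude spectrogram of the source modeled by the components in `keep`."""
    V_model = W @ H                      # model of the full mixture
    V_source = W[:, keep] @ H[keep, :]   # model of the desired source only
    mask = V_source / (V_model + eps)    # ratio mask with values between 0 and 1
    return mask * V                      # mixture spectrogram weighted by the mask

# Hypothetical factors: 4 frequency bins, 2 components, 5 windows
W = np.array([[1.0, 0.0],
              [0.5, 0.2],
              [0.0, 1.0],
              [0.1, 0.8]])
H = np.array([[1.0, 0.5, 0.0, 0.2, 0.9],
              [0.0, 0.3, 1.0, 0.7, 0.1]])
V = W @ H   # mixture that the model explains exactly

# Keep only the first component as the desired source
S = separate_components(V, W, H, keep=[0])
print(np.round(S, 3))
```

Because the mask never exceeds one, the estimated source spectrum is never larger than the observed mixture in any time-frequency bin.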
For more information, see the Wikipedia article: Non-negative matrix factorization.