MATLAB File Help: cv.EM
cv.EM

Expectation Maximization Algorithm

The class implements the Expectation Maximization (EM) algorithm, which is used to train Gaussian Mixture Models (GMMs). See [Bilmes98] for details.
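
A minimal usage sketch (assuming the mexopencv calling conventions summarized in the property and method lists below; the data and the cluster count are arbitrary illustration choices):

    % two well-separated 2-D Gaussian blobs, 250 samples each
    samples = [randn(250,2)*0.5 + 1; randn(250,2)*0.5 - 1];

    em = cv.EM();
    em.ClustersNumber = 2;
    [logLik, labels] = em.trainEM(samples);  % fit the mixture
    mu = em.getMeans();                      % one mean a_k per row
    weights = em.getWeights();               % mixing weights PI_k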

Expectation Maximization

The Expectation Maximization (EM) algorithm estimates the parameters of a multivariate probability density function in the form of a Gaussian mixture distribution with a specified number of mixtures.

Consider a set of N feature vectors {x_1,x_2,...,x_N} from a d-dimensional Euclidean space, drawn from a Gaussian mixture:

p(x;a_k,S_k,PI_k) = sum_{k=1..m} PI_k * p_k(x),
    PI_k>=0,  sum_{k=1..m}(PI_k)=1

p_k(x) = phi(x;a_k,S_k)
       = 1/((2*pi)^(d/2) * |S_k|^(1/2)) *
         exp(-0.5 * (x - a_k)' * inv(S_k) * (x - a_k))

where m is the number of mixtures, p_k is the normal distribution density with mean a_k and covariance matrix S_k, and PI_k is the weight of the k-th mixture. Given the number of mixtures m and the samples x_i, i=1..N, the algorithm finds the maximum-likelihood estimates (MLE) of all the mixture parameters, that is, a_k, S_k and PI_k:
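
For concreteness, the density above can be evaluated directly in plain MATLAB; the parameters a, S, PI below are arbitrary placeholder values, not output of cv.EM:

    d = 2;  m = 3;                 % feature dimensionality, number of mixtures
    a  = randn(m, d);              % a_k: one mean per row
    S  = repmat(eye(d), [1 1 m]);  % S_k: one d-by-d covariance per mixture
    PI = ones(m,1) / m;            % PI_k: nonnegative weights summing to 1

    x = randn(1, d);               % a single feature vector
    p = 0;
    for k = 1:m
        r   = (x - a(k,:))';       % column vector x - a_k
        Sk  = S(:,:,k);
        phi = exp(-0.5 * (r' / Sk) * r) / ((2*pi)^(d/2) * sqrt(det(Sk)));
        p   = p + PI(k) * phi;     % accumulate PI_k * p_k(x)
    end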

L(x,theta) = log p(x,theta)
           = sum_{i=1..N} log(sum_{k=1..m} PI_k * p_k(x_i))
-> max over theta IN THETA,

THETA = {(a_k,S_k,PI_k): a_k IN R^d, S_k=S_k'>0, S_k IN R^(dxd),
                         PI_k>=0, sum_{k=1..m}(PI_k)=1}
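
The objective can likewise be evaluated for a whole sample set; this sketch continues the placeholder parameters a, S, PI from above:

    X = randn(100, d);             % N samples, one per row
    L = 0;
    for i = 1:size(X,1)
        pxi = 0;                   % mixture density at x_i
        for k = 1:m
            r   = (X(i,:) - a(k,:))';
            Sk  = S(:,:,k);
            pxi = pxi + PI(k) * exp(-0.5 * (r' / Sk) * r) ...
                        / ((2*pi)^(d/2) * sqrt(det(Sk)));
        end
        L = L + log(pxi);
    end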

The EM algorithm is an iterative procedure. Each iteration includes two steps. At the first step (Expectation step or E-step), you find a probability p_{i,k} (denoted alpha_{k,i} in the formula below) that sample i belongs to mixture k, using the currently available mixture parameter estimates:

alpha_{k,i} = ( PI_k * phi(x_i;a_k,S_k) ) /
              sum_{j=1..m} (PI_j * phi(x_i;a_j,S_j))
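
One E-step in plain MATLAB, continuing the sketch above; alpha(k,i) holds the responsibility of mixture k for sample i:

    N = size(X,1);
    alpha = zeros(m, N);
    for i = 1:N
        for k = 1:m
            r = (X(i,:) - a(k,:))';
            alpha(k,i) = PI(k) * exp(-0.5 * (r' / S(:,:,k)) * r) ...
                         / ((2*pi)^(d/2) * sqrt(det(S(:,:,k))));
        end
        alpha(:,i) = alpha(:,i) / sum(alpha(:,i));  % normalize over k
    end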

At the second step (Maximization step or M-step), the mixture parameter estimates are refined using the computed probabilities:

PI_k = (1/N) * sum_{i=1..N}(alpha_{k,i})
a_k = sum_{i=1..N}(alpha_{k,i}*x_i) / sum_{i=1..N}(alpha_{k,i})
S_k = sum_{i=1..N}(alpha_{k,i} * (x_i - a_k)*(x_i - a_k)') /
      sum_{i=1..N}(alpha_{k,i})
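
And the matching M-step, refining PI, a and S from the responsibilities just computed (note that S_k is re-estimated around the freshly updated mean a_k, as in the formulas above):

    for k = 1:m
        w = alpha(k,:);                 % alpha(k,i), i = 1..N
        PI(k)  = sum(w) / N;
        a(k,:) = (w * X) / sum(w);      % weighted mean of the samples
        R = X - repmat(a(k,:), N, 1);   % row i holds x_i - a_k
        S(:,:,k) = (R' * (R .* repmat(w', 1, d))) / sum(w);  % weighted covariance
    end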

Alternatively, the algorithm may start with the M-step when initial values for p_{i,k} can be provided. Another option, when p_{i,k} are unknown, is to use a simpler clustering algorithm to pre-cluster the input samples and thus obtain initial p_{i,k}. Often (including in machine learning) the k-means algorithm is used for that purpose, as sketched below.
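
A sketch of that initialization, assuming kmeans from the Statistics and Machine Learning Toolbox and assuming trainM takes the samples plus an N-by-m matrix of initial probabilities (continuing the variables above):

    N = size(X,1);
    idx = kmeans(X, m);                          % hard cluster labels 1..m
    probs0 = full(sparse((1:N)', idx, 1, N, m)); % one-hot initial p_{i,k}

    em = cv.EM();
    em.ClustersNumber = m;
    em.trainM(X, probs0);                        % start EM from the M-step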

One of the main problems of the EM algorithm is the large number of parameters to estimate. The majority of the parameters reside in the covariance matrices, which are d x d each, where d is the feature space dimensionality. However, in many practical problems the covariance matrices are close to diagonal, or even to mu_k*I, where I is the identity matrix and mu_k is a mixture-dependent "scale" parameter. So, a robust computation scheme could start with harder constraints on the covariance matrices and then use the estimated parameters as input for a less constrained optimization problem (often a diagonal covariance matrix is already a good enough approximation); a sketch of such a scheme follows.
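
A sketch of such a two-pass scheme with cv.EM; the 'Covs0' and 'Weights0' option names for trainE are assumptions based on the underlying OpenCV interface and may differ:

    % pass 1: spherical covariances (the hardest constraint)
    em1 = cv.EM();
    em1.ClustersNumber = m;
    em1.CovarianceMatrixType = 'Spherical';
    em1.trainEM(X);

    % pass 2: restart from those estimates with diagonal covariances
    em2 = cv.EM();
    em2.ClustersNumber = m;
    em2.CovarianceMatrixType = 'Diagonal';
    em2.trainE(X, em1.getMeans(), 'Covs0',em1.getCovs(), ...
        'Weights0',em1.getWeights());            % begin with the E-step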

References

[Bilmes98]:

J. A. Bilmes. "A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models". Technical Report TR-97-021, International Computer Science Institute and Computer Science Division, University of California at Berkeley, April 1998.

See also
Class Details
Superclasses handle
Sealed false
Construct on load false
Constructor Summary
EM EM constructor 
Property Summary
ClustersNumber The number of mixture components in the Gaussian mixture model. 
CovarianceMatrixType Constraint on covariance matrices which defines type of matrices. 
TermCriteria The termination criteria of the EM algorithm. 
id Object ID 
Method Summary
  addlistener Add listener for event. 
  calcError Computes error on the training or test dataset 
  clear Clears the algorithm state 
  delete Destructor 
  empty Returns true if the algorithm is empty 
  eq == (EQ) Test handle equality. 
  findobj Find objects matching specified conditions. 
  findprop Find property of MATLAB handle object. 
  ge >= (GE) Greater than or equal relation for handles. 
  getCovs Returns covariance matrices 
  getDefaultName Returns the algorithm string identifier 
  getMeans Returns the cluster centers (means of the Gaussian mixture) 
  getVarCount Returns the number of variables in training samples 
  getWeights Returns weights of the mixtures 
  gt > (GT) Greater than relation for handles. 
  isClassifier Returns true if the model is a classifier 
  isTrained Returns true if the model is trained 
Sealed   isvalid Test handle validity. 
  le <= (LE) Less than or equal relation for handles. 
  load Loads algorithm from a file or a string 
  lt < (LT) Less than relation for handles. 
  ne ~= (NE) Not equal relation for handles. 
  notify Notify listeners of event. 
  predict Predicts response(s) for the provided sample(s) 
  predict2 Returns log-likelihood values and indices of the most probable mixture component for given samples 
  save Saves the algorithm parameters to a file or a string 
  train Estimates the Gaussian mixture parameters from a sample set 
  trainE Estimates the Gaussian mixture parameters from a sample set, starting from the Expectation step 
  trainEM Estimates the Gaussian mixture parameters from a sample set 
  trainM Estimates the Gaussian mixture parameters from a sample set, starting from the Maximization step