How to find the best fit of a GMM trying multiple components in Matlab

In R, there is Mclust, which by default runs the EM algorithm for every number of components from G = 1 to G = 9. In Matlab, it seems that the EM algorithm is only run for a single given number of components. Is there either:

  1. A function like Mclust that lets me test different numbers of components at once, OR
  2. A way to separately fit G models, and then extract the one satisfying a criterion (e.g. lowest BIC)?

Thanks in advance.

1 Answer

You can use the fitgmdist function (Statistics and Machine Learning Toolbox) to fit one model per candidate number of components and then select the best-fitting model based on a criterion such as the lowest Bayesian Information Criterion (BIC). Here's an example of how you can do this:

G_max = 9; % Largest number of components to try (matches the upper limit of Mclust's default range)
BIC = zeros(1, G_max); % Array to store the BIC value for each number of components
GMModels = cell(1, G_max); % Cell array to store the fitted Gaussian mixture models

for i = 1:G_max
    GMModels{i} = fitgmdist(data, i); % Fit a Gaussian mixture model with i components (data is an n-by-d matrix)
    BIC(i) = GMModels{i}.BIC; % BIC of the fitted model
end

[~, best_model_idx] = min(BIC); % Find the index of the model with the lowest BIC
best_model = GMModels{best_model_idx}; % Select the best-fitting model

% You can then access the parameters of the best model using best_model.mu, best_model.Sigma, best_model.ComponentProportion, etc.

This code snippet fits one model for each candidate number of components, reads the BIC of each fitted model, and selects the model with the lowest BIC as the best fit.
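
Mclust also compares several covariance structures, and EM can converge to a poor local optimum or fail on ill-conditioned covariance estimates. The following is a minimal sketch of one way to get closer to that behaviour, not a drop-in equivalent of Mclust. The synthetic data, the upper bound maxG, and the choices of 5 replicates and a regularization value of 1e-6 are assumptions made purely for illustration.

```matlab
rng(0);                                      % fixed seed so the run is repeatable
data = [randn(200, 2); randn(200, 2) + 5];   % hypothetical data: two separated 2-D clusters

maxG     = 5;                                % hypothetical upper bound on the number of components
covTypes = {'full', 'diagonal'};             % covariance structures to compare
bic      = inf(numel(covTypes), maxG);       % Inf marks fits that failed
models   = cell(numel(covTypes), maxG);
opts     = statset('MaxIter', 500);          % allow more EM iterations than the default

for c = 1:numel(covTypes)
    for g = 1:maxG
        try
            models{c, g} = fitgmdist(data, g, ...
                'CovarianceType', covTypes{c}, ...
                'Replicates', 5, ...              % restart EM 5 times, keep the best fit
                'RegularizationValue', 1e-6, ...  % guards against ill-conditioned covariances
                'Options', opts);
            bic(c, g) = models{c, g}.BIC;
        catch
            % Fit failed; bic(c, g) stays Inf so this model is never selected
        end
    end
end

[~, idx]       = min(bic(:));                % lowest BIC over all fitted models
[bestC, bestG] = ind2sub(size(bic), idx);
bestModel      = models{bestC, bestG};
labels         = cluster(bestModel, data);   % hard cluster assignment for each row of data
```

Note that fitgmdist reports BIC so that lower is better, whereas mclust defines BIC with the opposite sign and selects the largest value, so the two sets of numbers are not directly comparable.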