GPmix.UniGaussianMixtureEnsemble
- class GPmix.UniGaussianMixtureEnsemble(n_clusters, init_method='kmeans', n_init=10, mom_epsilon=0.05)[source]
Bases: object
Consensus clustering using an ensemble of univariate Gaussian Mixture Models (GMMs).
This class fits univariate GMMs to multiple one-dimensional projections of the data, computes base clusterings, and combines them into a consensus clustering using spectral clustering on an affinity matrix built from binary membership matrices. Base clusterings are weighted by an estimate of their total misclassification probability.
- Parameters:
n_clusters (int) – Number of mixture components (clusters) to fit in each GMM and in the consensus clustering.
init_method ({"kmeans", "k-means++", "random", "random_from_data", "mom"}, optional) – Initialization method for GMM parameters. Default is "kmeans". The "mom" option uses method-of-moments initialization.
n_init (int, optional) – Number of initializations to perform for each GMM fit. The best result is kept. Default is 10.
mom_epsilon (float, optional) – Lower bound for GMM weights (and related constraints) when using init_method="mom". Ignored otherwise. Default is 5e-2.
- MoM_res
If init_method == 'mom', the method-of-moments solver results for each projection.
- Type:
- clustering_weights_
Weights assigned to each base clustering.
- Type:
ndarray of shape (n_projs,)
- labels_
Cluster labels assigned by the consensus clustering.
- Type:
ndarray of shape (n_samples,)
- max_cca_labels_
Permutation of predicted labels that yields the highest classification accuracy when compared to ground truth.
- Type:
Notes
The affinity matrix is constructed as a weighted sum of outer products of binary membership matrices (one per projection), where the weights are proportional to the inverse of each GMM’s estimated total misclassification probability.
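The construction described in this note can be sketched with NumPy alone. This is a minimal illustration, not the class's actual code: the function name `affinity_from_memberships`, the `(n_samples, n_clusters)` layout of each binary matrix, and the normalization of the weights to sum to one are all assumptions made for the example.

```python
import numpy as np

def affinity_from_memberships(binary_memberships, total_omegas):
    """Weighted sum of B @ B.T over projections (hypothetical helper).

    binary_memberships: list of (n_samples, n_clusters) 0/1 arrays, one per projection.
    total_omegas: estimated total misclassification probability per projection.
    """
    inv = 1.0 / np.asarray(total_omegas, dtype=float)
    weights = inv / inv.sum()  # assumption: weights are normalized to sum to 1
    n_samples = binary_memberships[0].shape[0]
    affinity = np.zeros((n_samples, n_samples))
    for w, B in zip(weights, binary_memberships):
        # (B @ B.T)[i, j] is 1 when samples i and j share a cluster in this projection
        affinity += w * (B @ B.T)
    return affinity, weights

# Two made-up projections of four samples, two clusters each.
B1 = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
B2 = np.array([[1, 0], [0, 1], [0, 1], [0, 1]])
A, w = affinity_from_memberships([B1, B2], total_omegas=[0.1, 0.3])
```

In this toy input the first projection has the smaller estimated misclassification probability, so it receives the larger weight and dominates the pairwise affinities.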
- binary_membership_matrix()[source]
Construct the binary membership indicator matrices from the cluster membership matrices.
- Return type:
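One common way to turn a soft membership matrix into a binary indicator matrix is to mark the most probable cluster in each row. The sketch below assumes that convention and an `(n_samples, n_clusters)` layout; the helper name `to_binary_membership` is hypothetical, and the method itself may differ in detail.

```python
import numpy as np

def to_binary_membership(fuzzy):
    """One-hot encode the most probable cluster in each row (hypothetical helper)."""
    fuzzy = np.asarray(fuzzy, dtype=float)
    hard = fuzzy.argmax(axis=1)                 # index of the most probable cluster
    binary = np.zeros(fuzzy.shape, dtype=int)
    binary[np.arange(fuzzy.shape[0]), hard] = 1
    return binary

# Made-up posterior probabilities for three samples and two clusters.
fuzzy = np.array([[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]])
binary = to_binary_membership(fuzzy)
```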
- fit_gmms(projs_coeffs, n_jobs=-1, **kwargs)[source]
Fit univariate Gaussian mixture models to the projection coefficients.
- Parameters:
projs_coeffs (array-like of shape (n_projs, n_samples)) – Projection coefficients; one univariate GMM is fitted to each row.
kwargs – Keyword arguments passed to joblib's Parallel.
- fuzzy_membership_matrix()[source]
Construct the cluster membership matrices from GMM fits.
- Return type:
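For a single fitted univariate GMM, the cluster membership (posterior probability) of each point follows from Bayes' rule. The sketch below computes it directly from mixture weights, means, and variances; the function name `responsibilities` and its argument layout are assumptions for illustration, not the method's signature.

```python
import numpy as np

def responsibilities(x, weights, means, variances):
    """Posterior probability of each component for each 1-D point (hypothetical helper)."""
    x = np.asarray(x, dtype=float)[:, None]          # shape (n_samples, 1)
    weights = np.asarray(weights, dtype=float)       # shape (n_clusters,)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    dens = np.exp(-0.5 * (x - means) ** 2 / variances) / np.sqrt(2.0 * np.pi * variances)
    joint = weights * dens                           # weight times component density
    return joint / joint.sum(axis=1, keepdims=True)  # normalize each row to sum to 1

# Made-up symmetric two-component mixture evaluated at three points.
x = np.array([-2.0, 0.0, 2.0])
R = responsibilities(x, weights=[0.5, 0.5], means=[-1.0, 1.0], variances=[1.0, 1.0])
```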
- get_affinity_matrix(weighted_sum, precompute_gmms=None)[source]
Construct the affinity matrix from the binary membership matrices and the clustering weights.
- get_clustering(weighted_sum=True, precompute_gmms=None, **kwargs)[source]
Obtain the consensus clustering via spectral clustering of the affinity matrix.
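To illustrate what spectral clustering of a precomputed affinity matrix does, here is a deliberately simplified two-cluster stand-in using only NumPy: it splits samples by the sign of the second eigenvector of the normalized graph Laplacian. It is not the routine this method calls, and the helper name `two_way_spectral_partition` is hypothetical.

```python
import numpy as np

def two_way_spectral_partition(affinity):
    """Split samples into two groups by the sign of the Fiedler vector (simplified sketch)."""
    A = np.asarray(affinity, dtype=float)
    degree = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}
    laplacian = np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]                       # eigenvector of the second-smallest eigenvalue
    return (fiedler > 0).astype(int)

# Made-up affinity: two tight groups of three samples with weak cross-group affinity.
A = np.full((6, 6), 0.05)
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
labels = two_way_spectral_partition(A)
```

Which group receives label 0 and which receives label 1 depends on the arbitrary sign of the eigenvector, so only the grouping itself is meaningful.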
- get_clustering_weights(weighted_sum, precompute_gmms=None)[source]
Compute weights for base clusterings
- Return type:
- get_omega_map(weights, means, vars)[source]
Construct matrix of misclassification probabilities
- Return type:
- get_omega_prob(dist_a, dist_b)[source]
Construct misclassification probability omega_{b|a} for univariate GMMs
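A numerical sketch of one pairwise misclassification probability, assuming omega_{b|a} means the probability that a point drawn from component a has a larger weighted density under component b. The function below approximates it by integrating on a grid; the name `omega_b_given_a` and its arguments are hypothetical and do not mirror the `dist_a`, `dist_b` inputs of the method.

```python
import numpy as np

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def omega_b_given_a(w_a, mean_a, var_a, w_b, mean_b, var_b, n_grid=200001):
    """Approximate P(w_b * pdf_b(X) > w_a * pdf_a(X)) for X drawn from component a."""
    sd_a = np.sqrt(var_a)
    # Grid covering essentially all of component a's probability mass.
    x = np.linspace(mean_a - 10.0 * sd_a, mean_a + 10.0 * sd_a, n_grid)
    pdf_a = normal_pdf(x, mean_a, var_a)
    misassigned = w_b * normal_pdf(x, mean_b, var_b) > w_a * pdf_a
    dx = x[1] - x[0]
    # Riemann sum of component a's density over the region assigned to b.
    return float(np.sum(pdf_a * misassigned) * dx)

# Equal weights and unit variances, means 0 and 2: the decision boundary is x = 1,
# so the result should be close to 1 - Phi(1), about 0.1587.
omega = omega_b_given_a(0.5, 0.0, 1.0, 0.5, 2.0, 1.0)
```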
- get_total_omega(weights, means, vars, weighted_sum)[source]
Compute total misclassification probability for univariate GMM
- Return type:
- gmm_with_MoM_inits(data)[source]
Fit GMMs with initialization from method-of-moments estimation.
- Parameters:
data (ndarray)
- plot_gmms(ncols=4, fontsize=12, fig_kws={}, **kwargs)[source]
Visualize GMM fits
- Parameters:
ncols (int, optional) – Number of columns in the plot grid. Default is 4.
fontsize (int, optional) – Font size for axis labels. Default is 12.
fig_kws (dict, optional) – Additional keyword arguments for figure creation. Default is an empty dict.
kwargs (dict, optional) – Additional keyword arguments for seaborn’s histplot function. Default is an empty dict.