I've been out of the loop on stats for a while, but is there a viable approach for estimating the number of clusters ex ante when fitting a GMM? I can think of constructing ex post metrics, i.e., fitting over a grid of cluster counts and comparing goodness-of-fit measures, but that feels more like brute-forcing it.
There are Bayesian nonparametric methods that do this by putting a Dirichlet process prior on the parameters of the mixture components, so the number of occupied components is inferred from the data instead of being fixed in advance. Both the prior specification and the computation (MCMC) are tricky, though.
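
For a quick feel of the idea without writing a sampler, here is a minimal sketch using scikit-learn's `BayesianGaussianMixture`. Note that it swaps MCMC for variational inference on a truncated stick-breaking approximation of the Dirichlet process, so it is not the full nonparametric treatment. The helper name `fit_dp_gmm`, the truncation level of 10, the concentration value of 0.1, the 0.01 weight cutoff, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture


def fit_dp_gmm(X, max_components=10, weight_threshold=0.01, seed=0):
    """Fit a truncated Dirichlet-process GMM with variational inference.

    max_components is only an upper bound (the truncation level);
    components the data do not support end up with near-zero weight.
    """
    model = BayesianGaussianMixture(
        n_components=max_components,
        weight_concentration_prior_type="dirichlet_process",
        weight_concentration_prior=0.1,  # assumed value; smaller favors fewer clusters
        covariance_type="full",
        max_iter=1000,
        random_state=seed,
    )
    model.fit(X)
    # Count components whose posterior mixing weight is non-negligible.
    # The threshold is a heuristic, not part of the model.
    effective_k = int(np.sum(model.weights_ > weight_threshold))
    return model, effective_k


# Synthetic data: three well-separated 2-D blobs (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=(-5, 0), scale=0.5, size=(200, 2)),
    rng.normal(loc=(0, 5), scale=0.5, size=(200, 2)),
    rng.normal(loc=(5, 0), scale=0.5, size=(200, 2)),
])

model, effective_k = fit_dp_gmm(X)
print("effective number of clusters:", effective_k)
print("mixing weights:", np.round(model.weights_, 3))
```

The result still depends on the concentration prior and the weight cutoff, which is the "prior specification is tricky" part in practice: it is worth rerunning with a few values of `weight_concentration_prior` to see how stable the count is.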