It's really interesting for me to see AI voice cloning take off. A few years ago I worked on a paper called HuBERT (https://arxiv.org/abs/2106.07447), which has since become popular for exactly this kind of thing (it's the speech representation that SoViTs uses, for example). At the time the main research focus was low-resource ASR; it was sort of a fluke that it turned out to work so well for voice conversion. It makes me feel confident that open-source ML will drive more value over the next five years than closed-source ML, despite all the hype around big tech, simply because random people building weird, unexpected stuff on top of foundation models will make things that no one could have predicted.