As a musician, I'd be more interested in style transfer: you give it a snare (or a multitrack drum loop!), and you tell it to generate a matching hi-hat (or a complete kit!).
There is no shortage of samples and sample packs (millions), and most pros are picky: context is king, and style transfer is more contextual.
For instruments/synths/vox, "playability" is important, so the best approach IMO is cloning a sample into a playable MIDI instrument, like midi-ddsp, midi2params, or Mawf.
Have you tried out Logic’s drummers? They aren’t style transfer per se, but they are AI that drums in a particular style, and you can control how it works.
I've only seen it on YouTube. Yes, this is still generation, but I like the level of control. If Emergent Drums has this degree of parametric adjustment (I haven't tested it), it could be useful (to me).
It's unusual to use the term "royalty-free" for a music-creation tool. That term is usually used for purchases of sample libraries, where the seller retains the copyright. It's a given that you own the things you create with a tool, so saying they're royalty-free is superfluous.
I couldn't find a license or terms of service on the website, so I must ask: who owns the copyright on the generated samples?
I see where you're going, but I don't see the legal issue as relevant to my question. The answer I was hoping for was that the company makes no claim toward any rights that may exist in the work the tool generates. That way, regardless of how the law treats generated art, the company won't end up with rights in stuff that their customers made. Compare "royalty-free," which implies the company retains everything except a claim for royalties.
All of the portraits in this demo are computer-generated by a machine learning model called “StyleGAN”. While most of the recent excitement around StyleGAN centers around its amazing ability to generate infinite variation (e.g. thispersondoesnotexist.com <3), the emergent semantics encoded in the latent space are impressive as well.
We can find the latent representations of, say, smiling people. We can then average them and create a new semantic vector that, when added to pictures of non-smiling faces, makes them all smile.
Play with the sliders to see what I mean.
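The vector arithmetic described above can be sketched in a few lines. This is a minimal illustration of the idea, not the demo's actual code: it assumes you already have latent vectors for smiling and neutral faces (here stand-in random arrays; a real setup would recover latents for labeled images via GAN inversion).

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 512  # dimensionality of StyleGAN's latent space

# Placeholder latents; in practice these come from inverting
# labeled face images back into the latent space.
smiling = rng.normal(size=(100, DIM))
neutral = rng.normal(size=(100, DIM))

# The "smile" direction: difference of the per-class means.
smile_vec = smiling.mean(axis=0) - neutral.mean(axis=0)

def apply_smile(w, strength=1.0):
    """Nudge a latent vector along the smile direction.

    strength plays the role of the demo's slider: 0 leaves the
    face unchanged, larger values push it further toward smiling.
    """
    return w + strength * smile_vec

edited = apply_smile(neutral[0], strength=1.5)
print(edited.shape)  # same shape as the input latent
```

Feeding `edited` back through the generator is what produces the modified portrait; the slider in the demo is effectively scaling `strength`.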
Some possible applications: generation of assets for games; customizing ad photography by region/demographics; lifelike, custom avatars; compression; modeling longitudinal medical imagery; zero-shot inpainting; super-resolution; etc.
There's certainly latent demand for a platform that localizes ad photography, although those customers are sensitive to weird artifacts in the generated images. That likely means a non-trivial R&D investment.
The clearest immediate opportunity for GANs is generating content where artifacts might add value or are easily ignored (e.g. art). The problem there is that these businesses have very little tech moat, given how easy it is to train a GAN. It'd come down to having a valuable, private dataset.
Lots of other potential commercial applications - we list some more on the demo.