deeplearner1's comments

deeplearner1 · on June 13, 2022

Do you still plan to publish your research in a paper?

15ai · on June 13, 2022

Yes, I do. For the past three years, I have done nothing but work on this project nonstop. I've been working on massive improvements (that some have pointed out in this thread) that I've been stuck on for the past several months, but I'm getting close to finishing that up.

I don't feel comfortable publishing or releasing anything until I know for a fact that I can make no further improvements. It's not out of corporate greed or anything like that - I'm just really paranoid about getting out the best work possible.

maxander · on June 13, 2022

Respectfully, the perfect is the enemy of the good, and it’s entirely reasonable to publish what you have now. If later you make further improvements, you can simply publish again.

15ai · on June 13, 2022

You're completely correct, but I'm afraid this is more of a personal problem. I know I'll never be able to forgive myself if I figured out a solution to one of the more obvious problems with the model after I've already published it. I'd just be far more comfortable being happy with my own work before I release it to the wild. I know that this is selfish, and I apologize.

deeplearner1 · on June 13, 2022

This entire thread is honestly so disturbing, this comment especially. Not only is it rife with misinformation (using copyrighted material for training is totally legal and the whole project is paid out of pocket), but is it really that big of a deal to want credit for the work they’ve done? The developer has had their work stolen by companies, influencers, and grifters, and people here are getting pissy that they can’t wait 10 seconds to wait for a popup.

I don’t know why, but I honestly expected more from HN.

redredrobot · on June 13, 2022

You're right about the compute part being wrong. I never said it wasn't legal, just that they took someone else's work to train it. I would hope that voice synthesis is illegal without permission from the voice's owner, but I imagine it is untested so far.

But it's not just about the popup - it's more that when your work is fundamentally about reusing someone else's character, it feels pretty hypocritical to be so focused on making sure you get credit.

deeplearner1 · on June 13, 2022

Just curious. Do you feel the same way about DALL-E and Imagen?

redredrobot · on June 13, 2022

If they are used in a tool that lets you generate someone's likeness as part of user-specified new content, yes. But unlike 15.ai that isn't their core purpose and no such tool exists.

layer8 · on June 13, 2022

> wait 10 seconds to wait for a popup

The problem is that after having to wait for 10 seconds to reject their terms of service (which you should be able to reject right away) before even being able to see what the site is about, they are rickrolling you, effectively giving you the finger for not wanting to agree to their terms without context. That’s quite unprofessional, counterproductive and antagonistic.

bilekas · on June 13, 2022

I share this sentiment entirely. There seems to be a growing trend on HN that negativity is popular. A project like this, to me at least, would seem to be right up HN's street.

Shame to see the toxicity over a passion project, whose creator generously went out of his way to answer the questions and ridiculous comments.

redredrobot · on June 13, 2022

I think there are a bunch of people who consider this work unethical or at least deeply in the grey. The negativity isn't that surprising.

Kiro · on June 13, 2022

Just stop it. We need good vibes, not this toxic hate or we will drive the cool people away.

deeplearner1 · on June 13, 2022

Making things up out of thin air like “the creator used someone else’s compute” goes beyond negativity because someone thinks the project is in the grey. That is just straight up disinformation.

deeplearner1 · on June 13, 2022

MIT doesn’t own the model, where did you get that idea from? If you read through the website, it says that the developer alone owns everything related to the project, and the only funding he received from MIT was a small amount from the beginning.

It’s really strange reading these ignorant comments from HN…

deeplearner1 · on June 12, 2022

“Make money”? The creator loses several thousands of dollars a month hosting the site, and it’s done for free. The Patreon donations are all voluntary and only offer a pittance to the developer.

I highly suggest reading into the project first. The Wiki article I linked before (https://en.wikipedia.org/wiki/15.ai) answers all of your questions about copyright infringement.

forrestthewoods · on June 12, 2022

Feel free to replace "make money" with "collect revenue". This is currently a research project (with funding). However, its long-term goal is to achieve commercial-quality voice acting and dubbing. It could be given away for free, sold directly, sold downstream, sold indirectly, or otherwise generate commercial value.

In terms of copyright infringement, your wiki link answers nothing. A court ruled that Google could use copyrighted book text to train an algorithm to improve search results because the copying was highly transformative and did not serve as a market substitute to the original work.

Meanwhile 15.ai is using copyrighted voice recordings to train an algorithm to synthesize new voice recordings that sound like they came from the original speaker. This is radically different from the Google case. Just because one instance of using copyrighted material to train an algorithm qualifies as fair use does not mean that all use of copyrighted material to train any algorithm also qualifies as fair use.

There is absolutely nothing about this that is settled law. In the next 20 years there are going to be lots of lawsuits, lots of settlements, possibly a few rulings, and maybe even a few new laws. I find the whole topic very interesting. YMMV.

Closi · on June 13, 2022

Like you say, the law is not settled on this, but I assume if the author got a takedown request they would probably comply.

In many instances a policy of "ask for forgiveness rather than permission" can get you further, faster. While Nickelodeon are unlikely to grant you a license to the Spongebob voice because that has broader licensing and IP repercussions, they are likely to tolerate a research project using their characters (e.g. just as they have to date tolerated The SpongeBob SquarePants Movie Rehydrated, which was a fan re-creation of one of their actual movies).

noobermin · on June 13, 2022

I heavily doubt it's "several thousands of dollars"...

15ai · on June 13, 2022

It is indeed several thousands of dollars a month. I can show you AWS invoices, if you're skeptical. Just send me an email and I'd be happy to show proof.

adastra22 · on June 13, 2022

AWS is like 10x as expensive as the competition, most of the time. Have you considered switching hosting providers?

noobermin · on June 13, 2022

No, you don't need to send me pictures of your invoice...I'll just take your word for it then.

deeplearner1 · on June 12, 2022

If you want more information about 15.ai, I highly suggest reading their Wikipedia article! https://en.wikipedia.org/wiki/15.ai

The whole history behind the project is fascinating: 4chan had a huge role in its development, and the project's work was stolen by an NFT company that a famous voice actor endorsed not too long ago.

julianeon · on June 12, 2022

Ah, I was wondering why they were so concerned about attribution.

The truth is that, today, if I was going to use a tool to generate voices (say for YouTube), I wouldn't necessarily pick a small SaaS tool. I'd use Amazon Polly or some other GCP-style platform voice creation tool. There are already a few products in the space, and their costs are so low as to be almost negligible (example: Polly, 5 million characters free). For a commercial project, I could probably stay on a free tier for a whole year.

With DALL-E, it seems like the only option, and it's such a superior option that a website could abuse it for commercial profits. But for voice synthesis, it's already dirt cheap and commercially available without limitations.

rockemsockem · on June 13, 2022

Polly and GCP's voices still sound a tad robotic though unfortunately :/

15.ai seems to beat them on some sentences, but not all. Looking forward to the day when we can have real human-level quality of voices on-demand.

gwern · on June 13, 2022

15-kun has always been fanatical about attribution; the plagiarism just made him more so.