Furthermore, if people not only stop publishing but also take down already published works, it will create a moat around already existing language models.
And the more they DDoS small websites, instead of respectfully scraping once, the more realistic my conspiracy theory looks.
I'm not saying it's AI. But the founder built a platform around this issue and only then discovered that the issue is hard?
If it's not AI slop, it still shows sloppy reasoning.
He claims anyone can build what he built and, furthermore, that he doesn't know how to actually solve the outreach problem. So essentially he did nothing and achieved nothing, and instead of moving on, he decides to double down.
The site menu doesn't work and half of the pages are missing. I tried to learn what his plans are moving forward and couldn't find anything.
Even with AI, if you want to sell people on the idea that you personally can make a site, you should present a site that works.
If you claim that there is a solution, you should present evidence.
It's not enough to say "everyone can do it with AI". I wouldn't trust a site like that not to leak my info.
Even if it's not AI, it's still sloppy.
I'd understand if it were a blog rant, but it's not.
INMHO, to build something like that, people need a solid institution of reputation back, and (and!) they need to remember who the sellouts are and how to push them out.
It's not a technical problem, but a societal one. We stopped pushing out sellouts, because we knew that life was unaffordable and we wanted people we admire to be able to afford it.
Ten years have passed, and people on YouTube have become part of stock portfolios.
So we have to solve that issue: either we want people rewarded monetarily, and it will all go to shit, optimized for monetization, or we don't, and we will sometimes have to see people we like and admire struggle with poverty.
I am cautious about AI "discoveries" after the Mythos paper.
What was the process of writing the paper? Was the question asked by a mathematician? Was the paper right from the get-go, or was there someone who pointed out mistakes?
How many attempts were made before the solution was found?
I will eat my words if an AI one-shotted that one without any external help, but for now I am left wondering whether it's a new way to attribute discoveries to companies instead of the people who put the work in.
As per the report, the prompt used to solve the problem is AI-written and the solution was initially graded by an AI grading pipeline. They don't say this explicitly, but it seems like OpenAI has an automatic pipeline where they prompt models for solutions to famous math problems (which wouldn't be unexpected given how flashy a solution to a famous math problem looks).
> Was the paper right from the get-go, or was there someone who pointed out mistakes?
Also as per the report, the output of the model isn't really a "paper"; it's a very terse two-page solution which is apparently correct. The paper was later written based on this solution to make it more presentable.
> How many attempts were made before the solution was found?
Given that this appears to be from an automated pipeline, I would say that it had many attempts. But either way, the blog post says that with enough test-time compute, the model finds this same solution 50% of the time.
There is little evidence that Mythos is any better at finding bugs than any other system. Mythos appears to be impactful because people are, for the first time, using lots of resources (for free from Anthropic) to try and find security issues. The actual bugs found are mostly inconsequential. Any chart showing a giant leap in fixes that doesn’t consider whether they were even using any tooling before, and whether these are serious issues, is junk. If you read the partners’ summaries of Mythos so far, it is a damp squib. Maybe that’ll change, but at least for now there is no evidence Mythos is anything but marketing hype.
Yes, and no.
As someone who has studied and taught math, I really like peer review.
But peer review is a powerful tool.
Carefully choosing which lemmas to give for solving and reviewing the result is my favorite way to teach young minds. Yes, they do solve most problems themselves. But most of them likely wouldn't be able to do that unless someone dissected the problem beforehand and pointed out the weak spots in their explanations.
And that's why I question who prompted the model, how they prompted it, and how much their own ideas influenced the output.
I admit, I don't know enough to judge how much of the right solution was actually contained in the first reply.
I'm also wondering about the process. What was the prompt, what did they feed into the model, what was it trained on, etc. The article reads like a marketing post.
Nevertheless, new maths is exciting and might lead to what I find slightly more interesting: new physics.
If a for-profit (because... you know, OpenAI isn't at all what it initially was) huge corporation (again, not a cute startup trying to help humanity) publishes anything, it's a piece of marketing. Every single word a corporation says is marketing.
So... that's also that, a piece of marketing to sell more of whatever their potential client can buy. It's not a piece of research. It's an ad. That's it.
No reliably reachable subset is representative of humanity.
But I also want to argue against the range-of-understanding argument. Attention has a limit. Anyone who wants to develop a deep understanding of any topic would do themselves a disservice by trying to expand their range aimlessly.
We can't all know everything all at once, so we should just develop some common sense for the most important topics instead. Like "people are generally good and against violence". We had that once; we can rebuild it now.
I think it might break the game.
Most words sound similar enough to other words. "cat" and "get", "he simply" and "his simply", etc.
Add accents, and half the words would be indistinguishable from each other (note that the word "indistinguishable", ironically, would be quite distinguishable).
People parse things like that with so much context, based on their own understanding of the situation, their grasp of the speaker's accent or speech impairments, etc.
Add to that the fact that most native English speakers blur words together. The pause that some languages use to separate words is used in English to separate sentences. English as spoken doesn't natively separate words.
The text-to-speech before LLMs was meh. I think it's the ability to generate filler for uncertain words that makes it feel magic compared to before.
>> Teenagers having a blast on TikTok to the detriment of their academics isn’t the same as literate teenagers having a blast on mIRC.
A blast on mIRC isn't comparable to TikTok. Its direct comparison would be a Discord or Telegram chat.
There are a lot of Discord chats full of teens who spend their time talking about modifying software. It's mostly whimsical software like small games or desktop pets, not something of interest to most adults.
I, uh, might be biased, but I think not a lot of teens were in mIRC chats at the time. Mostly, teens bought magazines, watched TV and rented porn.
It seems to me like you are comparing outliers then to a baseline now. The baseline now is pretty good: drug addiction down, teen pregnancy down, less crime, etc.