What's the benefit for Meta? They are now the true open source AI providers (after OpenAI got closedAI), but I wonder why they keep releasing such models for free and kinda open source?
Mark Zuckerberg talks about this in their Q1 earnings call.
"I think that there's an important distinction between the products we offer and a lot of the technical infrastructure, especially the software that we -- that we write to support that. And historically, whether it's the Open Compute project that we've done or just open sourcing a lot of the infrastructure that we've built, we've historically open sourced a lot of that infrastructure, even though the products themselves are obviously were not -- we haven’t open sourced the code for our core products or anything like that.
And the reason why I think why we do this is that unlike some of the other companies in the space, we're not selling a cloud computing service where we try to keep the different software infrastructure that we're building proprietary. For us, it's way better if the industry standardizes on the basic tools that we're using and therefore we can benefit from the improvements that others make and others’ use of those tools can, in some cases like Open Compute, drive down the costs of those things which make our business more efficient too.
So I think to some degree we're just playing a different game on the infrastructure than companies like Google or Microsoft or Amazon, and that creates different incentives for us. So overall, I think that that's going to lead us to do more work in terms of open sourcing, some of the lower level models and tools.
But of course, a lot of the product work itself is going to be specific and integrated with the things that we do. So it's not that everything we do is going to be open. Obviously, a bunch of this needs to be developed in a way that creates unique value for our products, but I think in terms of the basic models, I would expect us to be pushing and helping to build out an open ecosystem here, which I think is something that's going to be important."
"we can benefit from the improvements that others make and others’ use of those tools can"
I have always had the impression that maintaining an open source project is way more work than you get back from "the community" of users. Is this not true? Are, for instance, the internal Facebook React users benefiting a huge amount from what outside contributors have built on top of React?
I think an unspoken dimension is that kneecapping the other big tech companies' entrenchments and denying them a market is always good for them, especially when, as they point out, it doesn't actually hurt any of their own business interests. Other FAANG companies are always a future threat. Hurting them is always a good business move.
Broad use helps uncover bugs and make the software more resilient and reliable. They don’t fix all the bugs, and they don’t build features the community wants for the sake of it, but having users of your tools is a benefit.
You get less input from the community than what you put in, but you also get different input than you would get from in-house devs who are all in the same bubble.
> I have always had the impression that maintaining an open source project is way more work than you get back from "the community" of users
You get a ton of valuable work back from quality contributors.
There is a vocal minority of people complaining that they feel burned out because of contributions, but attracting some high quality contributors can help a lot.
Pretraining new hires is valuable and the new hires will also train on the open source project docs.
MS put a cool 10B into OpenAI thinking they would have a massive tech moat. FB leaks LLaMA and now OpenAI only has its status as the bitcoin of LLMs (first, biggest, incumbent).
FB's plan is to F everyone else (MAAG) by making sure they can't make billions off tech that FB have sitting on the shelf, yet is extremely expensive for a true startup competitor to get in on.
The software got rewritten already and the model weights are probably not protectable.
Especially if you use the model weights to train your own model. Why would you be allowed to use copyrighted data to train your models but not other models?
Basically: "commoditise your complement" applied to Facebook means they want to commoditise the foundational tech like AI. And open source is the route to that.
"For us, it's way better if the industry standardizes on the basic tools that we're using and therefore we can benefit from the improvements that others make and others’ use of those tools can, in some cases like Open Compute, drive down the costs of those things which make our business more efficient too." -- isn't this the Web 2.0 mantra applied to software?
This is the OSS model that's been around for 30 years. Operating systems, web servers, countless other projects that help build the internet we know today. Now, AI tools from Meta.
In the highly interesting recent memo leaked from Google, the argument is made that open source will come out the winner in the AI battle and specifically that
"Paradoxically, the one clear winner in all of this is Meta. Because the leaked model was theirs, they have effectively garnered an entire planet's worth of free labor. Since most open source innovation is happening on top of their architecture, there is nothing stopping them from directly incorporating it into their products.
The value of owning the ecosystem cannot be overstated. Google itself has successfully used this paradigm in its open source offerings, like Chrome and Android. By owning the platform where innovation happens, Google cements itself as a thought leader and direction-setter, earning the ability to shape the narrative on ideas that are larger than itself."
This makes more sense than the moat argument. Open sourcing with a noncommercial licence means they get to incorporate effort back into their project, but others can't use it in their businesses. All the academic etc effort can be captured in this way.
The software part already got reimplemented, and models are being used by others as training data for new models. You could argue that if using images and other copyrighted material as training data is allowed, then you can also use a model to train your own model.
Precisely. The old way contains innovation to a platform (fb apps, play, appstore). The new way speaks for itself. Total domination of an ecosystem from the root up.
Except neither this model nor several of their recently-lauded “open” releases are open source; they are CC-BY-NC 4.0, aka, you are free to tinker and share, but not to use the work or derivatives for commercial purposes. Any community effort that Meta’s hobbyist-source license attracts is work that isn’t enabling commercial competition, unlike actual open source systems like Suno’s Bark (MIT) or even use-restricted-but-not-non-commercial shared source licenses like Stable Diffusion’s CreativeML Open RAIL-M.
> Any community effort that Meta’s hobbyist-source license attracts is work that isn’t enabling commercial competition
So what?
Sure, maybe the Googles of the world aren't building on top of Meta's products, but I can tell you that a lot of startups are.
Does it make these startups vulnerable to long-term future legal action? Sure, but nobody is thinking that far ahead. What people are thinking about is how to get users and show off flashy demos to investors.
Instead, people are just pushing out products, breaking Meta's licenses, and not telling people about it, while they attempt to get traction.
Strict licensing, without enforcement, is not worth the paper that the contract is written on.
So yes, it is still beneficial that the code is released, even with a bad license.
So, that's a reason that Meta might wish to release a non-open-source model with this particular license, and one that provides an alternative to “Meta is doing this because they stand to benefit from open source models taking off”, specifically, “Meta is doing this because it stands to benefit from drawing energy away from open source models into ones that cannot legally be used to commercially compete”.
> Does it make these startups vulnerable to long-term future legal action? Sure, but nobody is thinking that far ahead.
Well, the startups may not be, but Meta maybe is, and it's acquiring a zero-cost, upside-only investment in every startup doing that. “Unjust enrichment”.
This might be an unpopular opinion on HN, but the whole “ask for forgiveness not for permission” view some take to business feels in pretty bad taste to me.
But I am able to train my own LLM on the output of their LLM, right? Or are the big AI players going to argue that you cannot train an AI on data you don't have a license to? (See the catch-22 here?)
> But I am able to train my own LLM on the output of their LLM, right?
Sure. And, there's an argument that the license only applies to the code because model weights aren’t subject to copyright anyway. And available-under-any-license is a lot better than OpenAI’s current stance as far as enabling anyone else, since they’ve gone completely closed to the point where even their papers on their models are more PR than reproducible science. There's a continuum from secret sauce to “do what thou wilt”, and I am not a zealot arguing anything not Open Source must be rejected as not a positive step.
My guess is this isn't their competitive edge; network effects, products, data and distribution are.
In a way, it takes away their competitors' edge while racing to the bottom to compete with open source. At the same time, they establish themselves as experts and keep attracting great talent that wants to publish their work openly. And it benefits all of us, so good marketing amongst developers too.
This comment resonates with me and reminds me of T-Mobile making international roaming free; they didn’t really have a ton of business coming from that service, but knew how important it was for their competitors. They made theirs free and forced the industry down that path. (Have since added some fees back but the point is similar to your thoughts)
As Ben Thompson pointed out recently, unlike Google and OpenAI, Meta benefits from open source AI taking off because that makes everyone better content creators, which accrues further value to their social media platforms.
If Meta stands to benefit from open source AI taking off, why are its models CC-BY-NC 4.0 instead of open source?
EDIT: On reflection, you can probably extend the content creation argument to say that noncommercial tools enabling that without enabling commercial competition, to the extent that some of the models will be integrated into Meta products, is the best of all worlds for Meta, so the basic argument works even without open source in the strict sense.
Meta has one of the best, if not the best, open source track records. They likely do it because it does not interfere with their business model. If outsiders find ways to improve their tech it only helps them.
Facebook doesn't want the models to be the money making bit, because they aren't a licensing/subscription service. They are an ads and soon hardware-platform company. They want those bits to be what people pay for. Not the models.
All these models are licensed under a non-commercial license. So their competitors don't gain a real advantage.
Other than OpenAI (who are remarkably tight lipped), ML researchers are pretty chatty in both their papers and watercooler hangouts. So, the information is going to get out either way. Might as well get ahead of it, and look like the good guy in the process.
This model is vision-only so it can't be SOTA even if it's #1 performing in many of the original categories of benchmarks, which it is (it's a very very good model).
We've moved on from ImageNet-style tests ("Choose the most appropriate label for this image from 200 possible labels") to much more advanced "Reasoning" tests[0]. PaLI[1] is potentially the SoTA here, but BEiT-3[2] may be a better example for my thesis. Notice that BEiT-3 is trained not just on images, but also on natural language. It outperforms purely image-trained models on even pure-image tasks like Object Detection and Semantic Segmentation.
Take a look at the major benchmarks for Segmentation (ade20k) [4]: DINOv2, 11th place. BEiT-3, 4th place. Yes, BEiT-3 has 72% more parameters, but it's also basically an entire LLM. Even GPT-4 is a multi-modal model and actually accepts images as prompt inputs; OpenAI just doesn't expose that ability.
More importantly, the new multi-modal models can understand human questioning like "What type of flowers are in the blue buckets of this image?" and respond intelligently, in English/whatever.
DINOv2 was trained with techniques borrowed from LLM training methods, but is not trained for natural language.
Purely my speculation: OpenAI is hobbling their products trying to support all kinds of integrations, specifically Microsoft’s. GPT-4 is not performant enough for end user applications, so they’ve had to gimp a lot of its reasoning to make it speedier.
This opens up an opportunity for their competitors to eat into their moat because OpenAI is treading water/downgrading their product, chasing scale. Meta is leveraging this opening to flood the field with amazing open source tools, all of which compete with OpenAI offerings, knowing that the open source community will run with them and further erode OpenAI’s moat.