Your time-averaged power budget for things that run on phones is about 0.5W (batteries are about 10Wh and should last at least a day). That's about three orders of magnitude lower than the GPUs running in datacenters.
Even if battery technology improves you can't have a phone running hot, so there are strong physical limits on the total power budget.
More or less the same applies to laptops, although there you get maybe an additional order of magnitude.
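Roughly, taking ~700 W as an assumed ballpark for a single modern datacenter GPU (my figure, not from any particular source):

$$
P_{\text{phone}} \approx \frac{10\ \text{Wh}}{24\ \text{h}} \approx 0.4\ \text{W},
\qquad
\frac{P_{\text{GPU}}}{P_{\text{phone}}} \approx \frac{700\ \text{W}}{0.4\ \text{W}} \approx 1.7 \times 10^{3}.
$$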
China has now had flat CO2 emissions for two years, and experienced a decline in overall CO2 emissions during 2025[1]. Part of this is that they're deploying way more renewables than basically any other large economy [2].
They've also pivoted their industrial strategy so that basically the entire green energy sector depends on Chinese supply chains. This is significantly contributing to their economic growth [3].
I don't know to what extent taxation in Europe contributed to China's decision making here, but it presumably created a market for green energy and therefore helped solidify the economics.
This is of course not to say that there's nothing to criticize in China's environmental policies; there certainly is. But the trope of "why should we do anything because China won't" turns out to be spectacularly ill-informed. Indeed I think it makes more sense to ask the opposite: what are the likely consequences now that China has positioned itself as the global centre of green energy, and what should other countries be doing to ensure that they're not left behind?
And yet, the premise of the question assumes that it's possible in this case.
Historically, having produced a piece of software to accomplish some non-trivial task implied weeks, months, or more of developing expertise and painstakingly converting that expertise into a formulation of the problem precise enough to run on a computer.
One could reasonably assume that any reasonable-looking submission was in fact the result of someone putting in the time to refine their understanding of the problem, and express it in code. By discussing the project one could reasonably hope to learn more about their understanding of the problem domain, or about the choices they made when reifying that understanding into an artifact useful for computation.
Now that no longer appears to be the case.
Which isn't to say there's no longer any skill involved in producing well engineered software that continues to function over time. Or indeed that there aren't classes of software that require interesting novel approaches that AI tooling can't generate. But now anyone with an idea, some high-level understanding of the domain, and a few hundred dollars a month to spend can write out a plan and ask an AI provider to generate software that implements it. That software may or may not be good, but determining that requires a significant investment of time.
That fundamentally changes the dynamics of "Show HN" (and probably much else besides).
It's essentially the same problem that art forums had with AI-generated work. Except they have an advantage: people generally agree that there's some value to art being artisan; the skill and effort that went into producing it are — in most cases — part of the reason people enjoy consuming it. That makes it rather easy to at least develop a policy to exclude AI, even if it's hard to implement in practice.
But the most common position here is that the value of software is what it does. Whilst people might intellectually prefer 100 lines of elegant lisp to 10,000 lines of spaghetti PHP to solve a problem, the majority view here is that if the latter provides more economic value — e.g. as the basis of a successful business — then it's better.
So now the cost of verifying things for interestingness is higher than the cost of generating plausibly-interesting things, and you can't even have a blanket policy that tries to enforce a minimum level of effort on the submitter.
To engage with the original question: if one was serious about extracting the human understanding from the generated code, one would probably take a leaf from the standards world where the important artifact is a specification that allows multiple parties to generate unique, but functionally equivalent, implementations of an idea. In the LLM case, that would presumably be a plan detailed enough to reliably one-shot an implementation across several models.
However I can't see any incentive structure that might cause that to become a common practice.
>a plan detailed enough to reliably one-shot an implementation across several models.
What. Why should this be an output? Why, if I make a project, should I be responsible for also making this, an entirely different and much more difficult and potentially impossible project? If I come and show you a project that required thousands of sessions to make, do I also have to show you how to one-shot it in multiple models? Does that even make sense?
But the point of comparison is something like the HTML specification. That's supposed to be a document that is detailed enough about how to create an implementation that multiple different groups can produce compatible implementations without having any actual code in common.
In practice it still doesn't quite work: the specification has to be supplemented with testsuites that all implementations use, and even then there often needs to be a feedback loop where new implementations find new ambiguities or errors, and the specification needs to be updated. Plus implementors often "cheat" and examine each other's behaviour or even code, rather than just using the specification.
Nevertheless it's perhaps the closest thing I'm familiar with to an existing practice where the plan is considered canonical, and therefore worth thinking about as a model for what "code as implementation detail" would entail in other situations.
I think the looping part is what stops this from being a practical solution. If we imagine that the actual code required some iteration in order to put down, I don’t know that we could say there is a one shot equivalent without testing that. Sometimes there may not even be an equivalent.
It’s possible that the solution to code being implementation detail is to be less precious about it and not more. I don’t really have an answer here and I don’t think anyone does because it’s all very new and it is hard to manage.
There’s also a pretty normal way in which this is going to diverge, and perhaps already has. Developers are building local bespoke skills, just as they used to develop (and still do develop) local bespoke code to make their work more efficient. They may be able to do something that you or I cannot using the same models; there’s no way to homologize their output. It would be like asking someone to commit their dotfiles alongside the project output. Regardless of whether or not it was the right thing to do, no one would do it.
> But at a national level the data is compelling. I'm convinced by the Environmental Kuznets Curve.
Which data do you find compelling?
For people who don't know, the Environmental Kuznets Curve is basically the hypothesis that as economies grow past a certain point, they naturally start to cause less environmental damage.
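As far as I understand the empirical literature, the usual way to formalise this is a quadratic in income (a sketch rather than a canonical definition):

$$
E_{it} = \alpha_i + \beta_1\, y_{it} + \beta_2\, y_{it}^{2} + \varepsilon_{it},
\qquad \beta_1 > 0,\ \beta_2 < 0,
$$

where $E$ is some measure of environmental damage and $y$ is income per capita; damage rises with income up to a turning point at $y^{*} = -\beta_1 / (2\beta_2)$ and is supposed to fall thereafter.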
As far as I can tell the main empirical evidence in favour of this is the fact that some western countries have managed to maintain economic growth whilst reducing their carbon emissions. This has, of course, partially been driven by offshoring especially polluting industries, but also by technological developments like renewable energy and BEVs.
On the other hand, taking a global sample it's still rather clear that there's a strong correlation between wealth and carbon emissions, both at the individual scale and at the level of countries.
It's also clear that a lot of the gains that have been made in, say, Europe have been low-hanging fruit that won't be easy to repeat. For example migrating off coal power has a huge impact, but going from there to a fully clean grid is a larger challenge.
We also know that there are a bunch of behaviours that come with wealth which have a disproportionately negative effect on the environment. For example, rich people (globally) consume more meat, and take more flights. Those are both problems without clear solutions.
(FWIW I agree that solar power is somewhat regressive, but just for the normal "Vimes Boots Theory" reasons that anyone who is able to install solar will save money in the medium term. That requires the capital for the equipment — which is rapidly getting cheaper — but also the ability to own land or a house to install the equipment on. The latter favours the already well off. There are similar problems with electric cars having higher upfront costs but lower running costs. The correct solution is not to discourage people from using things, but to take the cost of being poor into account in other areas of public policy).
> The world will be greener in a high-CO2 environment. There’s no legitimate argument over that fact.
However it's important to remember that the world isn't a high school physics experiment, and you can't easily separate out CO2 concentration from the other impacts of increased CO2:
| Climate change can prolong the plant growing season and expand the areas suitable for crop planting, as well as promote crop photosynthesis thanks to increased atmospheric carbon dioxide concentrations. However, an excessive carbon dioxide concentration in the atmosphere may lead to unbalanced nutrient absorption in crops and hinder photosynthesis, respiration, and transpiration, thus affecting crop yields. Irregular precipitation patterns and extreme weather events such as droughts and floods can lead to hypoxia and nutrient loss in the plant roots. An increase in the frequency of extreme weather events directly damages plants and expands the range of diseases and pests. In addition, climate change will also affect soil moisture content, temperature, microbial activity, nutrient cycling, and quality, thus affecting plant growth.
> Certainly it’s more favorable for growth of plants that make food
That does not seem to be what agricultural researchers believe:
| In wheat a mean daily temperature of 35°C caused total failure of the plant, while exposure to short episodes (2–5 days) of HS (>24°C) at the reproductive stage (start of flowering) resulted in substantial damage to floret fertility leading to an estimated 6.0 ± 2.9% loss in global yield with each degree-Celsius (°C) increase in temperature
| Although it might be argued that the ‘fertilization effect’ of increasing CO2 concentration may benefit crop biomass thus raising the possibility of an increased food production, emerging evidence has demonstrated a reduction in crop yield if increased CO2 is combined with high temperature and/or water scarcity, making a net increase in crop productivity unlikely
| When the combination of drought and heatwave is considered, production losses considering cereals including wheat (−11.3%), barley (−12.1%) and maize (−12.5%), and for non-cereals: oil crops (−8.4%), olives (−6.2%), vegetables (−3.5%), roots and tubers (−4.5%), sugar beet (−8.8%), among others
> you can't easily separate out CO2 concentration from the other impacts of increased CO2
>> I never said you could?
I took the fact that you explicitly mentioned "high-CO2 environment" and claimed there was no room for argument over the "fact"s as an indication that you were trying to separate out the impact of CO2 from other factors caused by climate change such as heat stress and drought. If that wasn't the case then apologies for misunderstanding.
> That paper is talking about a net reduction in biomass due to projected losses in places with temperature increases exceeding 10 degrees C.
The abstract says:
| with great biomass reductions in regions where mean annual temperatures exceeded 10 °C
Unless the abstract is especially badly written that suggests that it's not 10°C _change_ but 2°C change leading to biomass loss in areas that are already at 10°C on average.
> IPCC report
Thanks, that's a useful reference! Do you have a link to the final report? That one seems to be a draft and I didn't find the right published version (but there are many so I'm sure I'm missing it).
I note the paragraph you quoted concludes:
> The increased greening is largely consistent with CO2 fertilization at the global scale, with other changes being noteworthy at the regional level (Piao et al., 2020); examples include agricultural intensification in China and India (Chen et al., 2019; Gao et al., 2019) and temperature increases in the northern high latitudes (Kong et al., 2017; Keenan and Riley, 2018) and in other areas such as the Loess Plateau in central China (Wang et al., 2018). Notably, some areas (such as parts of Amazonia, central Asia, and the Congo basin) have experienced browning (i.e., decreases in green leaf area and/or mass) (Anderson et al., 2019; Gottschalk et al., 2016; Hoogakker et al., 2015). Because rates of browning have exceeded rates of greening in some regions since the late 1990s, the increase in global greening has been somewhat slower in the last two decades
So it sounds like a combination of the CO2 increases up to about the year 2000, along with agricultural intensification and various other factors have indeed increased the amount of plant cover, but we are already seeing changes to that picture with further rises to CO2 levels.
> You spent a lot of words arguing with me about things I didn't say.
Well you started with
> The world will be greener in a high-CO2 environment. There’s no legitimate argument over that fact.
And my central point is that the model you're implying there is one in which there's a monotonic relationship between CO2 levels and plant growth. However in reality things are clearly more complex than that, and there is indeed legitimate argument over what factors are dominant in different scenarios.
Your claim that things will only change over long enough timescales that you don't have to worry about them also seems to lack evidence. In systems with significant feedback loops it seems dangerous to assume that changes will only happen slowly unless you're very confident that you fully understand all the system dynamics. With climate change it's clear that we don't fully understand the system, and some changes are happening faster than earlier models predicted. So _maybe_ we have a few centuries to figure out how to move global agriculture to northern latitudes, and to deal with more variable conditions, but from a risk-analysis point of view it seems like a rather poor strategy.
The conclusion is the same, though they've added a paragraph talking about browning in some areas "somewhat slowing" the rate of aggregate increase since the late 90s. Conclusion is unchanged, and in fact, they strengthened it versus the draft by directly attributing it to CO2:
"The increased greening is largely consistent with CO2 fertilization at the global scale, with other changes being noteworthy at the regional level (Piao et al., 2020)"
> So it sounds like a combination of the CO2 increases up to about the year 2000, along with agricultural intensification and various other factors have indeed increased the amount of plant cover, but we are already seeing changes to that picture with further rises to CO2 levels.
Not really. The observations are also made in uninhabited areas. See above.
> And my central point is that the model you're implying there is one in which there's a monotonic relationship between CO2 levels and plant growth.
I said nothing about a monotonic relationship. I said that the earth will have more plants (plant mass, really) with more CO2. This is inevitable. It could follow a monotonic relationship, or it could do something else as factors shift. For example, one big, unpredictable factor that likely swamps everything else, is the randomness of human behavior.
> However in reality things are clearly more complex than that, and there is indeed legitimate argument over what factors are dominant in different scenarios.
No. Greening is occurring, and has been for some time. We have multiple lines of evidence. The IPCC report confidence is high. The only debate is over what might happen in the future, which, again, is fortune telling -- involving not only the climate system, but the actions of people.
> In systems with significant feedback loops it seems dangerous to assume that changes will only happen slowly unless you're very confident that you fully understand all the system dynamics.
I grant you that one can imagine theoretical scenarios in which all sorts of doomy feedback loops happen. The problem with that kind of imaginative exercise is that you have to bring evidence of their existence. So far, with regard to global vegetation, no such evidence exists, and in fact, the opposite of the doom loop scenario is occurring.
Could this change? Maybe! But that's just storytelling right now.
You made a scale-free claim about increasing greenness with increasing CO2 concentration. That implies a monotonic relationship.
> The only debate is over what might happen in the future, which, again, is fortune telling
The idea that using models of physical systems to predict their future evolution is "fortune telling" will surprise many scientists. Indeed, you yourself have proposed a simple model and used it to make a prediction about the future ("the world will be greener in a high-CO2 environment"), and used linear extrapolation of the past to justify the adequacy of your model.
That's not necessarily a bad starting point, but when actual studies with more complex models show different behaviours you should consider there's a possibility you're over-confident in your predictions.
Anyway, I suspect this conversation has become rather pointless. It's always unclear online to what extent people are engaging in good faith, but if it was then I'm rather sure you've now mentally pigeonholed me as a "doomer" who can't be reasoned with.
Notice that it says "almost all programs" and not "almost all _C_ programs".
I think if you understand the meaning of "crash" to include any kind of unhandled state that causes the program to terminate execution then it includes things like unwrapping a None value in Rust or any kind of uncaught exception in Python.
That interpretation makes sense to me in terms of the point he's making: Fil-C replaces memory unsafety with program termination, which is strictly worse than, e.g., (safe) Rust, which replaces memory unsafety with a compile error. But it's also true that most programs (irrespective of language, and including Rust) have some codepaths in which the program can terminate when the assumed invariants aren't upheld, so in practice that's often an acceptable behaviour, as long as the defect rate is low enough.
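As a minimal, hypothetical sketch of that kind of failure mode in Rust (not something from the article):

```rust
fn main() {
    let args: Vec<String> = std::env::args().collect();
    // Memory safe, but the assumed invariant is "an argument is always supplied".
    // `get(1)` returns None when it isn't, and `unwrap` then panics,
    // terminating the program rather than corrupting memory.
    let path = args.get(1).unwrap();
    println!("pretending to open {path}");
}
```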
Of course there is also a class of programs for which that behaviour is not acceptable, and in those cases Fil-C (along with most other languages, including Rust absent significant additional tooling) isn't appropriate.
> Rust which replaces memory unsafety with a compile error
Rust uses panics for out-of-bounds access protection.
The benefit of dynamic safety checking is that it's more precise. There's a large class of valid programs that are not unsafe that will run fine in Fil-C but won't compile in Rust.
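For instance, something like this sketch is perfectly safe at runtime, since the two borrows refer to disjoint elements, but Rust rejects it statically unless you restructure it (e.g. via `split_at_mut`), whereas a dynamic checker would just let it run:

```rust
fn main() {
    let mut v = vec![1, 2, 3, 4];
    let a = &mut v[0];
    let b = &mut v[1]; // error[E0499]: cannot borrow `v` as mutable more than once at a time
    *a += *b;
}
```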
As someone who's been quite heavily involved with web-platform-tests, I'd caution against any use of the test pass rate as a metric for anything.
That's not to belittle the considerable achievements of Ladybird; their progress is really impressive, and if web-platform-tests are helping their engineering efforts I consider that a win. New implementations of the web platform, including Ladybird, Servo, and Flow, are exciting to see.
However, web-platform-tests specifically decided to optimise for being a useful engineering tool rather than being a good metric. That means there's no real attempt to balance the testsuite across the platform; for example a surprising fraction of the overall test count is encoding tests because they're easy to generate, not because it's an especially hard problem in browser development.
We've also consciously wanted to ensure that contributing tests is low friction, both technically and socially, in order that people don't feel inclined to withhold useful tests. Again that's not the tradeoff you make for a good metric, but is the right one for a good engineering resource.
The Interop Project is designed with different tradeoffs in mind, and overcomes some of these problems by selecting a subset of tests which are broadly agreed to represent a useful level of coverage of an important feature. But unfortunately the current setup is designed for engines that are already implementing enough features to be usable as general-purpose web browsers.
The tweet mentions that this is an arbitrary metric thrust upon them by Apple, so I don’t think they would necessarily disagree with you. During the monthly updates they do also show the passing number of tests without including the encoding tests because of how much they skew things.
Acid 2 bakes in the assumption that you will be displaying it on a desktop/laptop monitor with 100% scaling; it depends on pixel accuracy.
This was a reasonably universal assumption in 2005, but it became less and less valid over time: we now have high-DPI screens, and the whole idea of pixel accuracy has fallen out of favour (it was never a good idea, but it was 2005), as phone browsers are expected to rescale websites for better readability/usability.
The result is that Acid 2 fails on my phone, and on my laptop it will pass/fail depending on which screen the window is on.
Acid 3 was too forward-looking and rigid. While Acid 2 was (mostly) testing accepted standards (which IE6 implemented very poorly), Acid 3 tested a bunch of draft standards. It was very strict about many things that weren't well defined, and later versions of the standards took the opposite approach.
Basically, Acid 2 was very good at shaming Microsoft into fixing Internet Explorer; but in the long run the whole concept of popular cherry-picked torture tests proved to be of limited usefulness (and actually counterproductive) in promoting standards-compliant browsers.
They no longer reflect what the average user expects their browser to support. You can pass it and miss on several important things that are considered widespread features nowadays.
Everything you said sounds very reasonable, yet the "Browser-Specific Failures" graph on the main page of the wpt.fyi website explicitly misleads us into thinking otherwise.
PS I'm a big fan of the work and appreciate what you do. I check the interop page about once a week!
As someone who's been quite heavily involved with having a brain, I'd advocate for using the test pass rate as a metric for how many tests are passed.
"lecturing" is carrying a lot of needless weight here. Their comment doesn't read like that, they're just pointing out that the metric itself isn't what it seems to be.
The EU DMA says they have to allow third party browser engines access to the same resources (the JIT) that Safari has. It specifically allows them to place reasonable requirements on those third party alternatives:
> The gatekeeper shall not be prevented from taking, to the extent that they are strictly necessary and proportionate, measures to ensure that third-party software applications or software application stores do not endanger the integrity of the hardware or operating system provided by the gatekeeper, provided that such measures are duly justified by the gatekeeper.
Access to rwx memory is inherently dangerous, and it's completely reasonable to expect third parties to have proven that they are serious about producing a usable browser engine before putting such a risky product on the market for consumers to download. The law does not require them to allow any third party application to access the JIT, only a third party application that competes with Safari (a usable web browser).
Yes, but that doesn't require rendering performance or anything like that; it requires the absence of security problems.
You can't justify a requirement for a minimum level of performance or some capability. You can justify a requirement of a guaranteed absence of security bugs, provided that that's a standard you impose on yourself throughout the system.
In addition, the colleges have a lot of data about the people they interview and how well they do during the degree programme.
My understanding (based on a discussion with one Natural Sciences admissions tutor at one Cambridge college nearly 20 years ago, so strictly speaking this may not be true in general, but I'd be surprised if it wasn't common) is that during the admissions process, including interviews, applicants are scored so they can be stack-ranked, and the top N given offers. Then, for the students that are accepted, and get the required exam results, the college also records their marks at each stage of their degree. To verify the admissions process is fair, these marks are compared with the original interview ranking, expecting that interview performance is (on average) correlated with later degree performance.
I don't know if they go further and build models to suggest the correct offer to give different students based on interview performance, educational background, and other factors, but it seems at least plausible that one could try that kind of thing, and have the data to prove that it was working.
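To sketch what that kind of validation might look like (entirely hypothetical: the numbers below are invented and I have no idea what data or methods the colleges actually use):

```rust
// Hypothetical check: do interview scores predict later degree marks?
// Computes the Pearson correlation between the two series.
fn pearson(x: &[f64], y: &[f64]) -> f64 {
    let n = x.len() as f64;
    let mx = x.iter().sum::<f64>() / n;
    let my = y.iter().sum::<f64>() / n;
    let cov: f64 = x.iter().zip(y).map(|(a, b)| (a - mx) * (b - my)).sum();
    let sx = x.iter().map(|a| (a - mx).powi(2)).sum::<f64>().sqrt();
    let sy = y.iter().map(|b| (b - my).powi(2)).sum::<f64>().sqrt();
    cov / (sx * sy)
}

fn main() {
    // Invented numbers: interview score at admission vs final degree mark.
    let interview = [7.0, 9.0, 5.0, 8.0, 6.0, 4.0];
    let degree = [62.0, 71.0, 58.0, 68.0, 63.0, 55.0];
    println!("correlation = {:.2}", pearson(&interview, &degree));
}
```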
Anyway my guess is that of the population of people who would do well if they got in, but don't, the majority are those whose background makes them believe it's "not for the likes of me", and so never apply, rather than people who went to private schools, applied, and didn't get a place.
(also a Cambridge alumni from a state school, FWIW),
All these Cambridge alumni with this dodgy Latin, 'smh'! You're an alumnus, or identifying as an alumna! (Identifying as many alumni at a stretch, but then still not 'a Cambridge alumni'.)
(alumnus not of Cambridge, but from a state school, fwiw)
> The UK’s electricity market operates using a system known as “marginal pricing”. This means that all of the power plants running in each half-hour period are paid the same price, set by the final generator that has to switch on to meet demand, which is known as the “marginal” unit.
> While this is unfamiliar to many people, marginal pricing is far from unique to the UK’s electricity market. It is used in most electricity markets in Europe and around the world, as well as being widely used in commodity markets in general.
The thing that's unique about the UK is that the marginal price is almost always (98% of the time) set by the price of gas. That means when the gas price increases, the wholesale price of electricity, and hence consumer bills, increase in direct response.
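To make the mechanism concrete, here's a toy sketch of merit-order dispatch (the plants, capacities, and prices are invented for illustration): offers are stacked from cheapest to most expensive, and the price offered by the last unit needed to meet demand is paid to every accepted generator.

```rust
// Toy merit-order dispatch: every accepted generator is paid the price
// offered by the marginal (last-needed) unit.
fn clearing_price(mut offers: Vec<(&str, f64, f64)>, demand_mw: f64) -> Option<f64> {
    // Each offer is (name, capacity in MW, offer price in £/MWh).
    offers.sort_by(|a, b| a.2.partial_cmp(&b.2).unwrap());
    let mut supplied = 0.0;
    for (name, capacity, price) in offers {
        supplied += capacity;
        if supplied >= demand_mw {
            println!("marginal unit: {name} at £{price}/MWh");
            return Some(price);
        }
    }
    None // demand cannot be met
}

fn main() {
    let offers = vec![
        ("wind", 12_000.0, 5.0),
        ("solar", 3_000.0, 6.0),
        ("nuclear", 5_000.0, 20.0),
        ("gas", 15_000.0, 90.0),
    ];
    // With 25 GW of demand the gas plant has to run, so its £90/MWh offer
    // sets the price paid to every generator, including near-zero-cost wind.
    println!("{:?}", clearing_price(offers, 25_000.0));
}
```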
Of course the situation is also made worse by the fact that gas is used directly for heating and cooking in a high proportion of British homes.
Yes! It's a rare example of an app that instead of trying to capture your attention into a virtual environment, helps you to direct it outwards into the real world.
The Sound ID feature in particular is just an amazing way to really extend what's possible for most people, and it provides an on-ramp for people to identify more birds by ear alone (and in general to pay more attention to sound when in nature).
I might argue that Merlin — and especially eBird — lean a bit too heavily towards the competitive "high scores" view of birding; given the impact of climate change on bird populations, encouraging people to travel the world to see as many species as possible is clearly problematic.
But that's a minor quibble, and Merlin remains one of the few apps I'd unconditionally recommend to anyone with the faintest chance they'd use it.