Hacker News | new | past | comments | ask | show | jobs | submit | concurrentsquar's comments

It's (probably) XKCD #989 ("Cryogenics"): https://xkcd.com/989/


My team can't even evaluate Ball (as a tool for physical simulation of spherical cow-like objects): https://github.com/nate-parrott/ball/issues/9

Currently, we are using Unreal Engine 5 to do our hundreds of architectural physics simulations - the major issue is that UE5 is very slow on *the EC2 instance* (we only have one 2048 core EC2 instance shared between the entire office; we used to use Vercel and Cloudflare but we had to sell our homes to suddenly subscribe to Cloudflare Enterprise (the CF sales guy told us that we would not be allowed to run a CF Worker for more than 30 days without it, even though we had a CF worker run for 37 years, and many of our CF workers have been running before the creation of CF (nobody knows why)) and a giant spike in our Vercel Cuda Function Invocations (for GPGPU compute on the Edge, allowing architects to view the collapse of their buildings with only ~53 ms of latency (compared to ~53 ms without Next.js))). Ball seems much faster (it can run on a Macbook Air), potentially allowing us to save at least several tens of millions of dollars per year on AWS costs.


All common issues! Particularly the cost spikes without a reasonable explanation. Also, the lack of NT support shown in the issue you point to could be a problem. Ball runs fine on "M" Apple chips, however :)


It's embarrassing that we don't have NT support already. I have many users of NT on Alpha and MIPS who need Ball for critical services. Here's a convenient patch with a bunch of garbled test cases. Merge that in.

(I'm totally not a state actor looking to socially engineer you to hide an exploit, by the way.)


Of course. Were you a state actor, there's a Chinese CDN we can totally recommend to cut latency on this.

(Totally :)


Google Chrome has a built-in reading list (go open the 3-dotted menu at the top-right corner, then click on "Bookmarks and lists" -> "Reading list")


Nobody has mentioned space-based solar power yet (https://www.nasa.gov/wp-content/uploads/2024/01/otps-sbsp-re...: "Launch is the largest cost driver..."), which would be the cheapest (and currently only technologically feasible) route to turn humanity into a Kardashev Type 1 (or 2 if we construct a Dyson swarm) civilization (without really cheap fusion reactors).


Calculating the cost per kilogram for LEO with Starship gives me a new startup idea: small business (or even personal) interplanetary postal service.

In the near future it will only cost ~$150 per kg to send objects into space with Starship; so I could, for example, send a Raspberry Pi (47 grams) into LEO for ~7 dollars (as long as I also had 149 tons of other objects from other people to send). A more useful use case would be sending fully automated manufacturing facilities (probably either for semiconductors (https://www.nasa.gov/general/the-benefits-of-semiconductor-m...) or crystals (https://uofuhealth.utah.edu/newsroom/news/2017/07/proteinxl)).
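The back-of-the-envelope math here can be sketched in a few lines of Python (the ~$150/kg Starship price, the 47 g Pi mass, and the ~150 t manifest are the comment's own assumptions, not confirmed figures):

```python
def launch_cost_usd(mass_kg, price_per_kg=150.0):
    """Cost to co-manifest a payload at an assumed $/kg price to LEO."""
    return mass_kg * price_per_kg

# A 47 g Raspberry Pi at the assumed near-term Starship price:
pi_cost = launch_cost_usd(0.047)          # ~$7
# A fully booked ~150 t manifest at the same assumed price:
full_manifest = launch_cost_usd(150_000)  # ~$22.5M per launch
```

So the ~$7 figure only works if the remaining ~$22.5M of manifest is sold to other customers.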


Great visualization, though (ironically, as one of the first Chrome experiments) the music no longer works on Chrome by default (go to site settings > sound and set it to "Allow" to hear it), and it is somewhat outdated now (for example, it states that no exoplanets have been discovered orbiting Proxima Centauri (and that the 'proposed' JWST is required to find these planets)).


When I first went to Josemar bank as a kid there was a display talking about how exploits may theoretically exist. This would have been early 1990s.

I went a few years ago with my kids, about the same age I was, and they had a counter which was in the thousands.


Did you mean "exoplanets" not "exploits"? That's my best guess, but I have no idea what "Josemar bank" is (another typo? Did you mean to name some kind of science museum, maybe?) which makes it hard to tell.


I think they meant Jodrell Bank, a radio telescope with an astronomy-based visitors' centre in Cheshire in the UK:

https://www.jodrellbank.net/


Thanks! Figured it was something like that, but Google and Wikipedia were no help.

It's an interesting name for a telescope. What does "Bank" mean in this context? The Wikipedia article links to another telescope, the "Green Bank", but that one just appears to be named after the town it's in, which doesn't seem to be the case for Jodrell Bank.

Edit: Nevermind, the Wikipedia page does have the answer. "It is named from a nearby rise in the ground, Jodrell Bank, which was named after William Jauderell, an archer whose descendants lived at the mansion that is now Terra Nova School."


Yes, post-Eurovision posting from a phone leads to devastating typos


OpenAI could either hire private testers or use A/B testing on ChatGPT Plus users (for example, when using ChatGPT, I oftentimes have to select between 2 different responses to continue a conversation); both are probably much better (in many aspects: not leaking GPT-4.5/5 generations (or the existence of a GPT-4.5/5) to the public at scale, and avoiding bias* (because people probably rate GPT-4 generations better if they are told (either explicitly or implicitly (eg. socially)) it's from GPT-5), to say the least) than putting a model called 'GPT2' onto lmsys.

* While lmsys does hide the names of models until a person decides which model generated the best text, people can still figure out what language model generated a piece of text** (or have a good guess) without explicit knowledge, especially if that model is hyped up online as 'GPT-5;' even a subconscious "this text sounds like what I have seen 'GPT2-chatbot' generate online" may influence results inadvertently.

** ... though I will note that I just got a generation from 'gpt2-chatbot' that I thought was from Claude 3 (haiku/sonnet), and its competitor was LLaMa-3-70b (I thought it was 8b or Mixtral). I am obviously not good at LLM authorship attribution.


For the average person using lmsys, there is no benefit in choosing your favorite model. Even if you want to stick with your favorite model, choosing a competitor's better answer will still improve the dataset for your favorite model.

The only case where detecting a model makes any difference is for vendors who want to boost their own model by hiring people and paying them every time they select the vendor's model.


Is it something similar to beam search (https://huggingface.co/blog/how-to-generate#beam-search) or completely different (probably is not beam search if it's changing code in the middle of a block)?

(I can't try right now because of API rate limits)
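For anyone unfamiliar with the linked technique, a minimal beam-search sketch (with a toy scorer; not any actual chatbot's decoder) looks like:

```python
import math

def beam_search(step_fn, start, beam_width=3, max_len=5):
    """Keep the beam_width highest-scoring partial sequences at each step.
    step_fn(seq) returns a list of (token, log_prob) continuations."""
    beams = [(0.0, [start])]  # (cumulative log-prob, sequence)
    for _ in range(max_len):
        candidates = [
            (score + logp, seq + [tok])
            for score, seq in beams
            for tok, logp in step_fn(seq)
        ]
        if not candidates:
            break
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
    return beams

# Toy model: every state offers "a" (p=0.6) and "b" (p=0.4).
def toy_step(seq):
    return [("a", math.log(0.6)), ("b", math.log(0.4))]

best_score, best_seq = beam_search(toy_step, "<s>", beam_width=2, max_len=3)[0]
```

Unlike greedy decoding, this keeps several hypotheses alive, which is why a beam decoder could plausibly revise text mid-generation.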


> One distinguishing feature of "deluxe-chat": although it gives high quality answers, it is very slow, so slow that the arena displays a warning whenever it is chosen as one of the competitors

Beam search or weird attention/non-transformer architecture?


It has a room full of scientists typing out the answers by hand.


Reddit may have told OpenAI to pay (probably a lot of) money to legally use Reddit content for training, which is something Reddit is doing with other AI labs (https://www.cbsnews.com/news/google-reddit-60-million-deal-a... ); but GPTBot is not banned under the Reddit robots.txt (https://www.reddit.com/robots.txt).
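This kind of robots.txt claim can be checked offline with Python's stdlib urllib.robotparser (the rules below are a made-up example, not Reddit's actual file):

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.modified()  # mark rules as freshly read so can_fetch() trusts them
# Parse rules from strings instead of fetching over the network.
rp.parse([
    "User-agent: GPTBot",
    "Disallow: /private/",
])

allowed = rp.can_fetch("GPTBot", "https://example.com/r/foo")
blocked = rp.can_fetch("GPTBot", "https://example.com/private/x")
```

To check a live site you would instead call `rp.set_url(...)` and `rp.read()`, which fetches the file itself.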

This is assuming that lmsys' GPT-2 is a retrained GPT-4t or a new GPT-4.5/5, though; I doubt that (one obvious issue: why name it GPT-2 and not something like 'openhermes-llama-3-70b-oai-tokenizer-test' (for maximum discreetness) or even 'test language model (please ignore)' (which would work well for marketing)? GPT-2 (as a name) doesn't really work well for marketing or privacy (at least compared to the other options)).

Lmsys has tested models with weird names for testing before: https://news.ycombinator.com/item?id=40205935


Sam Altman was on the board of reddit until recently. I don't know how these things work in SV but I wouldn't think one would go from 'partly running a company' to 'being charged for something that is probably not enforceable'. It would maybe make sense if they did pay reddit for it, because it isn't Sam's money, anyway, but for reddit to demand payment and then OpenAI to just not use the text data from reddit -- one of the largest sources of good quality conversational training data available -- strikes me as odd. But nothing would surprise me when it comes to this market.


That said, it is pretty SV behavior to have one of your companies pay the other. A subtle wealth transfer from OpenAI/Microsoft to Reddit (and tbh other VC backed flailing companies) would totally make sense.

VC companies for years have been parroting “data is the new oil” while burning VC money like actual oil. Crazy to think that the latest VC backed companies with even more overhyped valuations suddenly need these older ones and the data they’ve hoarded.


> A subtle wealth transfer from OpenAI/Microsoft to Reddit (and tbh other VC backed flailing companies) would totally make sense.

That's the confusing part -- the person I responded to posited that they didn't pay reddit and thus couldn't use the data which is the only scenario that doesn't make sense to me.


I suppose a "data transfer" from Reddit to OAI would be valuable for SamA too? Still a transfer of value from one hand to the other, while others (eg. Google) have to pay.

That said, I wouldn't be surprised if they pay now. They can't get away with scraping as easily now that they are better-known and commercially incentivized.


Maybe training on whatever this is started before the licensing deal?


robots.txt doesn't really mean anything; I used to work for a company that scraped the web, and this was literally not a concern. That being said, using data for training LLMs is a new thing, and potential lawsuits going Reddit's way are a possibility; we can't really know.

One note: its name is not gpt-2, it is gpt2, which could indicate it's a "second version" of the previous gpt architecture (gpt-3 and gpt-4 being gpt1-3 and gpt1-4). I am just speculating and am not an expert whatsoever; this could be total bullshit.

