
Historically, the cycle has been requirements -> code -> test, but with coding becoming much faster, the bottlenecks have changed. That's one of the reasons I've been working on Spark Runner to help automate testing for web apps: https://github.com/simonarthur/spark-runner

I've recently found that my ability to add new features and squash bugs has outpaced my ability to do full end-to-end tests. To help with this, I created Spark Runner for automated website testing. It will create and execute a plan for tasks you give it in plain text, like "add an item to the shopping cart", or you can just point it at your front-end code and have Spark Runner create the tests for you. It also produces nice reports telling you what's working and what's not.

New project, so feedback is welcome.


TFA seems to be big on mathematical proof of correctness, but how do you ever know you're proving the right thing?

Lisp has been around for 65 years (not 50, as the author believes), and is one of the very first high-level programming languages. If it was as great as its advocates say, surely it would have taken over the world by now. But it hasn't, and advocates like PG and this article's author don't understand why, or take any lessons from that.


> If it was as great as its advocates say, surely it would have taken over the world by now.

That is a big assumption about the way popularity contests work.


If something is marginally better, it's not guaranteed to win out, because markets aren't perfectly rational. However, if something is 10x better than its competitors, it will almost always win.


Invert the logic.

The big assumption here is that a language can be so much superior and yet remain mostly ignored after half a century of existence.

I'm sure Lisp has its technical merits, but language adoption is multi-dimensional.

Thinking Lisp should be more popular while disregarding the many factors behind language popularity is truly the "programmer who lives in Flatland".


free market brain.


The sketch here would be that Lisps used to be exceptionally resource-intensive, allowing closer-to-the-metal languages to proliferate and become the default. But nowadays even Common Lisp is a simple and lightweight language next to, say, Python or C++. Still, it's hard to overcome the inertia of the past's massive investment in teaching less abstraction-friendly languages.


And a C compiler transforms the code into something very Lisp-like (SSA).
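To make the SSA comparison concrete, here's a minimal sketch (variable names are hypothetical) of the core renaming idea: every reassignment gets a fresh single-assignment name, much like successive bindings in a Lisp `let*`:

```python
# Original imperative code reassigns x:
#   x = a + b
#   x = x * 2
# SSA form renames each assignment so every name is bound exactly once,
# which is what makes the compiler's view of C code look Lisp-like:
a, b = 3, 4
x1 = a + b      # x_1 := a + b
x2 = x1 * 2     # x_2 := x_1 * 2
print(x2)       # 14
```

In real compilers the renaming is done on an intermediate representation (e.g. LLVM IR), with phi nodes to merge names at control-flow joins; the sketch above only shows the straight-line case.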


I see Lisp as more like artisanal work. It actually requires more skill and attention to use, but in good hands it can let someone really deliver a lot quickly.

That said, as with anything else, this kind of craftsmanship doesn't translate to the monetization and scale the market demands. What markets want is to lower the barrier to entry, templatize, cheapen things, and so on.

It's normal, then, that languages optimized for the lowest common denominator, with less expressive power and more hand-holding, have won in popularity in enterprise settings and the like, where making money is the goal, but that Lisp remains a strong and popular language for enthusiasts looking to level up their craft or just geek out.


You’re assuming that people choose languages based on merit and not based on how much money someone will give them for using them.


You're assuming something better on merit wouldn't make more money as a result, and so I'm questioning the actual merits.


The silent assumption in both of your perspectives is that the current monetary system is an even playing field in this context (corporations and their programmers).


This assumes that greatness is a single dimension, namely popularity.


Moravec strikes again.


For my benchmarking suite, it turns out that it's about 1/5 the price of Claude Sonnet 4.1, with roughly comparable results.


What use case?


If you're looking for free API access, Google offers access to Gemini for free, including for gemini-2.5-pro with thinking turned on. The limit is... quite high, as I'm running some benchmarking and haven't hit the limit yet.

Open weight models like DeepSeek R1 and GPT-OSS are also made available with free API access from various inference providers and hardware manufacturers.


Gemini 2.5 Pro's free limit is 100 requests per day.

https://ai.google.dev/gemini-api/docs/rate-limits


I'm getting consistently good results with Gemini CLI and the free 100 requests per day and 6 million tokens per day.

Note that you'll need to authorize either with a Google Account or with an API key from AI Studio; just be sure the API key is from an account where billing is disabled.

Also note that there are other rate limits for tokens per request and tokens per minute on the free plan that effectively prevent you from using the whole million token context window.

It's good to exit or /clear frequently so that every request doesn't resubmit your entire history as context; otherwise you'll use up the token limits long before you hit 100 requests in a day.
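As a rough illustration of why resubmitting history exhausts the budget early, here's a back-of-the-envelope sketch. The 100-request and 6M-token daily limits come from the comments above; the 4,000-tokens-per-turn figure is an assumption for illustration only:

```python
# Sketch: daily requests possible with vs. without clearing history.
# Assumed: ~4,000 tokens per new prompt+response turn (hypothetical),
# free-tier limits of 100 requests and 6,000,000 tokens per day.
TOKENS_PER_TURN = 4_000
DAILY_TOKEN_LIMIT = 6_000_000
DAILY_REQUEST_LIMIT = 100

def requests_until_cap(clear_history):
    """How many requests fit in a day. Without clearing, each request
    resubmits the whole accumulated history as context."""
    used = history = 0
    for req in range(1, DAILY_REQUEST_LIMIT + 1):
        cost = history + TOKENS_PER_TURN  # history rides along as context
        if used + cost > DAILY_TOKEN_LIMIT:
            return req - 1
        used += cost
        history = 0 if clear_history else history + TOKENS_PER_TURN
    return DAILY_REQUEST_LIMIT

print(requests_until_cap(clear_history=False))  # 54 under these assumptions
print(requests_until_cap(clear_history=True))   # 100
```

Under these made-up numbers, the quadratic growth of resubmitted context caps you at roughly half the daily requests, while clearing every turn lets you reach the full 100.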


Doesn't it swap to a lower power model after that?


Not automatically but you can switch to a lower power model and access more free requests. I think Gemini 2.5 Flash is 250 requests per day.


I'm assuming it isn't sensitive for your purposes, but note that Google will train on these interactions unless you pay.


I think it'll be hard to find an LLM that actually respects your privacy, regardless of whether you pay. Even the "privacy" enterprise Copilot from Microsoft, with all their promises of respecting your data, is still not deemed safe enough by legislation to be used in parts of the European energy sector. The way we view LLMs on any subscription is similar to how I imagine companies in the USA view DeepSeek: don't put anything into them you can't afford to share with the world. Of course, with agents, you've probably given them access to everything on your disk.

Though to be fair, it's kind of silly how much effort we go through to protect our mostly open-source software from AI agents while, at the same time, half our OT has built-in hardware backdoors.


I agree, Google is definitely the champion of respecting your privacy. They will definitely not train their model on your data if you pay them. I mean, you should definitely just film yourself and give them everything: access to your files, phone records, even bank accounts. Just make sure to pay them those measly $200 and they absolutely will not share that data with anybody.


You're thinking of Facebook. A lot of companies run on Gmail and Google Docs (easy to verify with `dig MX [bigco].com`), and they would not if Google shared that data with anybody.


It’s not really in either Meta or Google’s interests to share that data. What they do is to build super detailed profiles of you and what you’re likely to click on, so they can charge more money for ad impressions.


Meta certainly shares the data internally. https://www.techradar.com/computing/cyber-security/facebooks...


LLMs add a new threat model. If trained on your data, they might very well leak some of that information in some future chat.

Meta and Alphabet might not want that, but it is impossible to avoid completely with current architectures.


Honestly, there are plenty of more profitable things to do with such information. I think ad impressions being anybody's sole motivator is sorta two decades out of date.


Big companies can negotiate their own terms and enforce them with meaningful legal action.


I don't care. From what I understand of LLM training, there's basically 0 chance a key or password I might send it will ever be regurgitated. Do you have any examples of an LLM actually doing anything like this?


This has been available (the 20b version, I'm guessing) for the past couple of days as "Horizon Alpha" on OpenRouter. My benchmarking runs with TianshuBench for coding and fluid intelligence were rate-limited, but the initial results show worse results than DeepSeek R1 and Kimi K2.


Current AI systems aren't great at taking instructions or information about the state of the world and producing new output based upon it. Benchmarks that emphasize this ability help greatly in progress toward AGI.


Yes, it would be fantastic to have more languages to test against. I picked the base language I did (Mamba) because it was easy to modify and integrate into Python.

