Hacker News | thallavajhula's comments

I wasn't impressed by LLMs until January or so, when Claude Code swooped in. Until then, I felt like LLMs were slowing me down. I have been using them for a couple of years now for coding at work, but I never really thought they brought real value. Then in February I took a project with a 1-month-ish timeline and shrank it to 3 days. I didn't write a single line of code in that project and I went all in with Claude Code. That was it, _the moment_ of realization. I was thoroughly impressed. I went from nothing to a tool that served several teams. Now I'm starting to see the cracks in LLMs and I'm slowly getting back to picking which tasks to offload to AI and which ones to do myself.

Claude is great at coding. That's it. Outside of coding, it's just god-awful at pretty much everything. ChatGPT, OTOH, is good at coding, and at everything else I find it brilliant. Gemini never made me want to stick with it. It's good, but never great for my use cases.


Hi Boris! Thanks for Claude Code.

Is there an example of how y'all use Dynamic Workflows internally that you could share with the rest of us here so that we can mimic something similar?


Hey, yep. A few things I personally used dynamic workflows for over the last few weeks:

1. Autonomously landed 20+ optimizations to reduce Claude Code's token usage by ~15%

2. Ported tree-sitter, color-diff, yoga-layout, and a number of other WASM and Rust native modules to TypeScript, improving CPU and memory use by 2-10x in the process

3. Made our CI faster, and repeatedly found and fixed flaky tests (with /loop)

4. Migrated from regex-based bash static analysis to tree-sitter, reducing false positive permission prompts by 45%

5. Reduced Claude Agent SDK startup time by 61%, by repeatedly profiling and optimizing the startup path, putting up a number of PRs in the process

6. Shipped 69 code simplification PRs, deleting >10k lines of code


> Ported tree-sitter, color-diff, yoga-layout, and a number of other WASM and Rust native modules to TypeScript, improving CPU and memory use by 2-10x in the process

Curious to learn more on this (unless there’s a write-up in the works). I’m naive on this matter but:

1. Is this because it's higher cost when passing objects back and forth across the JS/native boundary?

2. Does this have anything more specific to do with use of Bun?

3. Is the stance for Claude Code then to keep all the deps in raw TypeScript?

4. How do you folks keep these ported deps up-to-date?


this feels more like a PR statement than a description of how you used the tool though

None of those are helpful examples we could mimic to figure out how to use the tools.

This reads like a CV, not trying to help or educate.


Very cool. What % of the CC team's engineering would you say goes into QoL (as opposed to new feature development)? Obviously some live in a grey area, while others are more clear like making CI faster.

You _reduced_ its _efficiency_? Why do you make CC more inefficient?

Maxxing everything is all the rage. Gotta cpumaxx or bossman isn't getting his money's worth

Typo! Edited

Is there not a reason to instead port claude code to rust? Do you have internal benchmarks that show that claude code is better at typescript than rust?

Boris, what are your thoughts on WASM as a technology and its practical implications for AI in the future?

Salvatore really wants to popularize the term Automatic Programming/Coding it seems. (https://antirez.com/news/159)


I keep finding myself trying to minimize the words used to describe the same thing as well, since we are finding ourselves doing "that" operation more and more over time.

maybe shortening the term to "auto-code" would help tho.


https://en.wikipedia.org/wiki/Automatic_programming It's an acknowledged term in computer science, describing any mechanism whatsoever of auto-generating code from a description at a higher level of abstraction. Of course LLMs are highly unusual in being non-deterministic and having a surprisingly broad scope, but this does not make the term inapplicable.


Do people really use Grok for anything outside of Twitter memes or understanding tweets? I'm asking out of genuine curiosity.


Yes, it is genuinely useful for some tasks. It doesn't nanny you as much as the other models. I do a lot of hunting for orphan copyright items that are decades out of print, but the primary models won't do it, chastising me for trying to find copyrighted items. Grok will do it [0].

[0] sometimes you need to lightly jailbreak it, or rerun the prompt, the non-deterministic nature means sometimes you will get a refusal


I haven't been nannied in a long time. It was definitely a problem 2 years ago but now it seems all the models are ok with just about everything I want.


Ohh sure, its users use it for all sorts of things

https://arstechnica.com/tech-policy/2026/03/elon-musks-xai-s...


Grok has the most useful voice mode (ChatGPT voice mode is very dumb; Grok seems to use the same model as the main chat), so if I want to use voice this is the AI I use.

Also I use it for all uncomplicated topics because it gives precise short answers without fluff. Very refreshing.


It's my go-to for searches, DIY, personal finance, and more general slice-of-life AI.

Once it is as good as Kimi K2.6 for coding, I will probably use Grok exclusively. It really is the best conversational AI I've used. It has helped me fix a broken fridge, and a broken electrical oven. Literally saved me at least $4k this year.

Edit: Also saved me $600 because I did my taxes with it. H&R Block is cooked.

Edit 2: Oh shit it is as smart as Kimi K2.6. Time to try it!


Did you do legal filings with it after doing your taxes? Oh my.


what do you mean?


It was a joke about people relying on AI and it doing absolutely terrible things.

Coding is an interesting area -- it can code, then compile to see if that part worked, then test to see if more worked.

With taxes, it sets things up and the review phase is the IRS fining you.


In this case my tax situation is so ridiculously simple that I could verify what it suggested step by step, and I performed the actions on one of those free tax usa websites. The IRS accepted my return and everything went fine. If you're in a simple tax situation, try it!


Oh, I’ve had the IRS accept a return and then fine me later, more than once.

But if you are in a common situation, any LLM should be quite helpful.


How do you save money on taxes?

The taxes you owe are a mathematical calculation that always comes out the same....


In America you need to pay a preparer for your taxes because we hate poor people. The user is saying they don't need to pay a preparer because they used Grok. I didn't do that this year but I'll probably do it next year with a frontier model. US taxes are a perfect use case for AI, tbh.


deductions

child credits

points per paycheck proper setup

and of course, avoiding paying an accountant to set up and run all this if you are a normal W-2 worker.


I wonder how much of that comes from Twitter training data. It is useful for memes and trends, but for other things it is super bad.


I tried it in Cursor and oh my. No thanks. I hid it after that.


Yes.


>I think there’s something quietly screwing up a lot of engineering teams. In interviews, in promotion packets, in design reviews: the engineer who overbuilds gets a compelling narrative, but the one who ships the simplest thing that works gets… nothing.

I got emotional reading this. This is way too real.


As a Master's student, I didn't have money to afford a MacBook. So, I begrudgingly bought a Dell Vostro 13" at the time. Pretty much all of my friends just got Dell/Sony/HP laptops, and it's not like those laptops were powerful either. They were pretty much entry level for a price tag of $600-$750. I got mine for $750. This was back in 2009. I had to remove the selection of a webcam. These companies would pull shit like this, making basic things like a webcam an add-on. I hated it. IDK what the price tag of a non-Apple laptop is nowadays and IDK if they still do what they did then, including everything as an add-on, but I'm so glad Apple released this. This'll be a blessing for students and generally folks who want a high quality laptop without bargaining over which basic add-on to pick, which seemed ridiculous then and feels the same even now.

2009 Me would've LOVED this! I'm so glad Apple released this.

Back in 2013/14 Guillermo Rauch (CEO Vercel) shared a brilliant insight -- develop software on a weak machine and optimize it to work well on it so that when it's used on a powerful machine, it's going to fly. This'll force macOS developers to consider these resource constraints.


The cheapest new laptop you can buy from HP right now has 4 GB of RAM, a 64GB eMMC drive, and an N100. It's $200-300 depending where you look. And somehow this pitiful thing is running Windows 11.

I know it's half the price of the Neo, but it feels like way less than half the computer. I don't like Apple, I won't be buying this, and I won't be recommending it to anyone. But damn, Apple.


According to the US CPI inflation calculator, that $750 would be $1,137.05 today. That's striking, but it's also incredible how much computers have progressed in the same time.

https://www.bls.gov/data/inflation_calculator.htm
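
For anyone who wants to see what those two figures imply, here is a rough sketch of the arithmetic. It uses only the dollar amounts quoted above; the 17-year span is an assumption (2009 to 2026, since "today" isn't pinned down), and the function names are made up for illustration. It does not reproduce the BLS calculator itself, which works from published CPI index values.

```python
def cumulative_inflation(then_dollars: float, now_dollars: float) -> float:
    """Total price-level increase implied by two equivalent dollar amounts."""
    return now_dollars / then_dollars - 1.0


def annualized_rate(then_dollars: float, now_dollars: float, years: int) -> float:
    """Constant yearly rate that compounds to the same total increase."""
    return (now_dollars / then_dollars) ** (1.0 / years) - 1.0


THEN = 750.00   # 2009 laptop price, from the comment above
NOW = 1137.05   # CPI-adjusted figure, from the comment above
YEARS = 17      # assumption: 2009 -> 2026; adjust to the actual span

total = cumulative_inflation(THEN, NOW)
yearly = annualized_rate(THEN, NOW, YEARS)
print(f"cumulative: {total:.1%}, annualized over {YEARS} years: {yearly:.2%}")
```

The cumulative figure comes out to roughly 52%, which follows directly from the two quoted amounts. The annualized number moves with whatever span you plug in.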


Just another silly uninformed take.


When Ghostty was publicly announced, I used it for a few months and gave up on it due to the lack of support for the CMD+F feature that I use in Terminal.app. This is a critical feature for me while tailing logs on my local machine. I tried the workaround of capturing the text into a text file and then searching it. It just didn't work for my workflow and I dropped it. Ghostty is great otherwise. But without CMD+F, it's of no use to me.


The tip releases have had search support for a few months:

https://github.com/ghostty-org/ghostty/issues/189

https://x.com/mitchellh/status/1993728538344906978

As Mitchell stated above:

> Ghostty 1.3 is around the corner, literally a week or two away, and will bring some critically important features like search (cmd+f), scrollbars, and dozens more. In addition to GUI features it ships some big improvements to VT functionality, as always.


Same. Lack of search and lack of scrollbars make me wonder why this project got so much attention in the first place. iTerm2 seems way more capable.

I suspect it is "just" the very nice-looking default theme in Ghostty. I updated my iTerm2 colors with colors I picked from Tailwind’s excellent color palette and iTerm2 now feels fresh and has all the features I want.


Mitchell’s attempts at more correctness and better speed, plus the no-nonsense UX. iTerm2 is confusing and overwhelming and bloated for those of us who just want a terminal that works.


Sounds like that’s coming in the next release per Mitchell’s comments above, fwiw.


I used to have a custom domain set up via Google Apps. Google decided to update it to something else (they changed the name several times and I have lost track of it now). I switched to iCloud+ Mail when iCloud introduced custom domain support a few years ago. I do have notification summaries on my iOS turned on, but that's just a guilty pleasure of mine. The summarization is so bad that it's funny. I literally have the summarization feature turned on to laugh at how bad it is every time I see a new summary. Anyway, I used to be an everything-Google guy. Now, I just spread my app usage across multiple services, which I think is a win for me in the long run instead of being locked in to an ecosystem.

I also moved away from most of the products in the Apple ecosystem. I'm a 1Password user because I didn't want to be part of the Google or Apple ecosystems.


This is great. I am hopeful that Gemini 3.1 Pro will be great. So far, I'm almost always pulled away from Gemini models by Claude. Having used Claude Opus High for a while now, I find Claude Opus fantastic at coding. Even Gemini's comparison chart says so. OpenAI's 5.3-codex is by far the weakest (of the 3) for my coding purposes. Claude Opus really shines at explanations and generating code.

Gemini is almost great. Claude Opus is great. I keep switching among these subscriptions every month to not miss out on any of the offerings for too long; ChatGPT Plus <-> Gemini Pro <-> Claude.


> I keep switching among these subscriptions every month to not miss out on any of the offerings for too long; ChatGPT Plus <-> Gemini Pro <-> Claude.

I wonder why many people seem to be doing this instead of just going for a Copilot subscription that has access to all those models? Anybody care to share pros and cons?


OpenAI and Anthropic give you a lot of usage/$ through their plans. For the Anthropic Max plans, this can be like a ~90% discount. Copilot does not benefit from this (their pricing model is also different though, it is request-based rather than token usage based, so it is hard to compare).

That's not to mention that the models generally work better in their own harnesses, which is perhaps unsurprising because the models have been trained with the specific harness in mind (and vice versa). That said, I think some 3rd-party harnesses do a lot of work to make different models work well in their harness.


I would suggest you also take a look at Cursor's Composer1.5. It's super fast, and performs better than Gemini3P in my use cases.


I've been trying composer-1.5 on and off and it doesn't come close to Claude's Opus High. The explainability of Claude is just something else.


Sure, my point was it's better than Gemini and it's really really fast, and it's missing from the parent comment.

