Opus 4(.1) is so expensive[1]. Even Sonnet[2] costs me roughly $5 per hour using OpenRouter + Codename Goose[3]. The crazy thing is that Sonnet 3.5 costs the same[4] right now. Gemini Flash is more reasonable[5], but always seems to make the wrong decisions in the end, spinning in circles. OpenAI is better, but still falls short of Claude's performance. Claude's API also returns 400s if you Ctrl-C mid-request, which is annoying.
Economics is important. Best bang for the buck seems to be OpenAI's GPT-4.1 mini[6]. Does a decent job, doesn't flood my context window with useless tokens like Claude does, and the API works every time. Gets me out of bad spots. Can get confused, but I've been able to muddle through with it.
Get a subscription and use Claude Code - that's how you get actual reasonable economics out of it. I use Claude Code all day on the Max subscription, and maybe twice in the last two weeks have I actually hit usage limits.
I find the token/credit restrictions on Opus to be near useless even when using Claude Code. I only ever switch to it to get another model's take on the issue. Five minutes of use and I've hit the limit.
We have the $200 plans for work and despite only using Opus, we rarely hit the limits. CCUsage suggests the same via API would have been ~$2000 over the last month (we work 5 hours a day, 4 days a week, almost always with Claude).
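For a sense of scale, here is the arithmetic behind that comparison. The $2000 figure is CCUsage's retail API estimate from the comment above; the monthly hours are an assumption derived from the stated 5 h/day, 4 days/week schedule:

```python
# Rough sketch of the subscription-vs-API comparison above.
api_cost_per_month = 2000    # CCUsage retail API estimate, USD
sub_cost_per_month = 200     # Max plan, USD
hours_per_month = 5 * 4 * 4  # 5 h/day, 4 days/week, ~4 weeks (assumed)

print(f"API: ${api_cost_per_month / hours_per_month:.2f}/hour")  # $25.00/hour
print(f"Sub: ${sub_cost_per_month / hours_per_month:.2f}/hour")  # $2.50/hour
print(f"Savings: {api_cost_per_month / sub_cost_per_month:.0f}x")  # 10x
```

Under those assumptions the flat subscription works out to roughly a tenth of pay-as-you-go retail pricing for the same usage.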
Is it considerably more cost effective than cline+sonnet api calls with caching and diff edits?
Same context length and throughput limits?
Anecdotally, I found GPT-4.1 (and mini) pretty good at those agentic programming tasks, but the lack of token caching made the costs blow up with long contexts.
I'm on the basic $20/mo sub and only ran into token-cap limitations in the first few days of using Claude Code (now 2-3 weeks in), before I started being more aggressive about clearing the context. Long contexts will eat up token caps quickly when you're having extended back-and-forth conversations with the model. Otherwise, it's been effectively "unlimited" for my own use.
YMMV. I'm using the $100/mo Max subscription, and I hit the limit during focused coding sessions where I'm giving it prompts non-stop.
Unfortunately there's no easy tool to inspect usage. I started a project to parse the Claude logs using Claude and generate a Chrome trace with it. It's promising but it was taking my tokens away from my core project.
That's neat. According to the tool, I'm consuming ~300M tokens per day coding, with a (retail?) cost of ~$125/day. The output of the model is definitely worth $100/mo to me.
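Those two figures imply a blended per-token rate, which is a quick sanity check on what kind of tokens dominate the usage:

```python
# Implied blended rate from the figures above: ~300M tokens/day, ~$125/day.
tokens_per_day = 300e6
retail_cost_per_day = 125.0

per_million = retail_cost_per_day / (tokens_per_day / 1e6)
print(f"~${per_million:.2f} per million tokens")  # ~$0.42
```

A blended rate well under $1 per million tokens is far below fresh-input list prices, which is consistent with the bulk of those tokens being discounted cache reads rather than fresh input or output.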
Is there any documentation on what the max sub usage limit is? A coworker tried it and was booted off Opus within just a couple hours due to "high usage". I haven't made the jump since I expect my $3k/mo on API would just instantly fly by a $200/mo sub and then I'd just be back on API again, but if it could carve out $1k-2k of costs for a little bit of time managing sub(s) it might be worth it.
It's not documented - that's the whole point. They can scale it back and forth opaquely, letting the high volume users get more usage whenever the low-volume users aren't using it much. If it's explicit and transparent, you don't get the benefit of that, since it would be gamed by unscrupulous power users.
Also, there's a CLI argument that lets you specify the model; try `claude --help`.
There are a lot of fraudsters out there who will happily create thousands of accounts with valid CCs that will fail on first actual charge.[0]
I wouldn't be surprised if asking for a phone number lowers the fraud rate enough to compensate for the added friction.
[0] Incidentally, this is also why many AI API providers ask for your money upfront (buy credits) unless you're big enough and/or have existing relationship with them.
In every price comparison I make, Claude (API) always comes out cheapest if you manage to keep most of your context cached. The 90% price reduction for cached input is crazy.
Well, it's expensive compared to other models. But it's often much cheaper than human labor.
E.g. if I need a self-contained script to do some data processing, Opus can often do that in one shot. A 500-line Python script would cost around $1, and as long as it's not tricky, it just works - no back-and-forth needed.
I don't think it's possible to employ any human to write a 500-line Python script for $1 (unless it's a free intern or a student), let alone do it in one minute.
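A back-of-envelope check on the "$1 for 500 lines" figure. The rates below are Opus-class assumptions ($15/M input, $75/M output), and the tokens-per-line and prompt-size numbers are rough guesses, so treat this as an order-of-magnitude sketch:

```python
# Assumed Opus-class rates, USD per token (check current pricing).
IN_RATE, OUT_RATE = 15 / 1e6, 75 / 1e6

lines = 500
tokens_per_line = 12   # rough average for Python code (assumption)
prompt_tokens = 2_000  # task description + instructions (assumption)

output_tokens = lines * tokens_per_line
cost = prompt_tokens * IN_RATE + output_tokens * OUT_RATE
print(f"~${cost:.2f} for one shot")  # ~$0.48
```

Even with generous slack on the assumptions, a one-shot 500-line script lands in the same ballpark as the ~$1 figure above, and output tokens dominate the cost.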
Of course, if you use LLM interactively, for many small tasks, Opus might be too expensive, and you probably want a faster model anyway. Really depends on how you use it.
(You can do quite a lot in file-at-once mode. E.g. Gemini 2.5 Flash could write 35 KB of code for a full ML experiment in Python - self-contained, with data loading, model setup, training, and evaluation all in one file - pretty much on the first try.)
My experience is that large models are capable of understanding large contexts much better. Of course they are more expensive and slower, too. But in terms of accuracy, large models are always better at querying the context.
1: https://openrouter.ai/anthropic/claude-opus-4.1
2: https://openrouter.ai/anthropic/claude-sonnet-4
3: https://block.github.io/goose/
4: https://openrouter.ai/anthropic/claude-3.5-sonnet
5: https://openrouter.ai/google/gemini-2.5-flash
6: https://openrouter.ai/openai/gpt-4.1-mini