I posted further down before seeing your comment. First, congrats on the launch!...

xena · on Feb 14, 2024

Part of it is for people that want to do GPU things on their fly.io networks. One of the big things I do personally is I made Arsène (https://arsene.fly.dev) a while back as an exploration of the "dead internet" theory. Every 12 hours it pokes two GPUs on Fly.io to generate article prose and key art with Mixtral (via Ollama) and an anime-tuned Stable Diffusion XL model named Kohaku-XL.

Frankly, I also see the other part of it as a way to ride the AI hype train to victory. Having powerful GPUs available to everyone makes it easy to experiment, which would open Fly.io as an option for more developers. I think "bring your own weights" is going to be a compelling story as things advance.

gooseyman · on Feb 14, 2024

https://en.m.wikipedia.org/wiki/Dead_Internet_theory

What have you learned from the exploration?

xena · on Feb 14, 2024

Enough that I'd probably need to write a blogpost about it and answer some questions that I have about it. The biggest one I want to do is a sentiment analysis of these horoscopes vs market results to see if they are "correct".

cosmojg · on Feb 14, 2024

Interesting setup! What's the monthly cost of running Arsène on fly.io?

xena · on Feb 14, 2024

Because I have secret magical powers that you probably don't, it's basically free for me. Here's the breakdown though:

The application server uses Deno and Fresh (https://fresh.deno.dev) and requires a shared-1x CPU at 512 MB of RAM. That's $3.19 per month as-is. It also uses a 2 GB disk volume, which would cost $0.30 per month.

As far as post generation goes: when I first set it up it used GPT-3.5 Turbo to generate prose. That cost me a rounding error per month (maybe like $0.05?). At some point I upgraded it to GPT-4 Turbo for free-because-I-got-OpenAI-credits-on-the-drama-day reasons. The improvement in prose quality wasn't significant.

With the GPU it has now, a cold load of the model and a prose generation run takes about 1.5 minutes. If I didn't have reasons to keep that machine pinned to a GPU (involving other ridiculous ventures), it would probably cost about 5 minutes per day of GPU time (I rounded the time up to make the math easier) plus a 40 GB volume (I now use Nous Hermes Mixtral at Q5_K_M precision, so about 32 GB of weights). That works out to something like $6 per month for the volume, plus 2.5 hours of GPU time per month, or about $6.25 per month on an L40S.

In total it's probably something like $15.75 per month. That's a fair bit on paper, but I have certain arrangements that make it significantly cheaper for me. I could re-architect Arsène to not have to be online 24/7, but it's frankly not worth it when the big cost is the GPU time and weights volume. I don't know of a way to make that better without sacrificing model quality more than I have to.
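If you want to check the arithmetic or plug in your own usage, here's a rough sketch. The per-unit rates are assumptions back-derived from the figures in this comment ($0.30 for 2 GB, $6.25 for 2.5 GPU hours), not quoted from a price list, and the function and constant names are made up for illustration.

```python
# Back-of-the-envelope monthly cost using the figures quoted in this comment.
# The unit rates below are ASSUMPTIONS inferred from those figures,
# not official pricing.

APP_SERVER_USD = 3.19      # shared-1x CPU, 512 MB RAM (figure from the comment)
VOLUME_USD_PER_GB = 0.15   # assumed: $0.30 / 2 GB
GPU_USD_PER_HOUR = 2.50    # assumed: $6.25 / 2.5 hours on an L40S
DAYS_PER_MONTH = 30        # rounded, like the rest of the estimate


def monthly_cost(gpu_minutes_per_day=5, app_volume_gb=2, weights_volume_gb=40):
    """Return a per-item breakdown of the estimated monthly cost in USD."""
    gpu_hours = gpu_minutes_per_day * DAYS_PER_MONTH / 60
    return {
        "app_server": APP_SERVER_USD,
        "app_volume": app_volume_gb * VOLUME_USD_PER_GB,
        "weights_volume": weights_volume_gb * VOLUME_USD_PER_GB,
        "gpu_time": gpu_hours * GPU_USD_PER_HOUR,
    }


costs = monthly_cost()
total = sum(costs.values())

for name, usd in costs.items():
    print(f"{name:>15}: ${usd:.2f}")
print(f"{'total':>15}: ${total:.2f}")
```

With the defaults that comes out to roughly $15.74, which is where the "something like $15.75" comes from. Bumping `gpu_minutes_per_day` shows how fast the GPU line starts to dominate.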

For a shitpost though, I think it'd totally be worth it to pay that much. It's kinda hilarious and I feel like it makes for a decent display of how bad things could get if we go full "AI replaces writers" like some people seem to want for some reason I can't even begin to understand.

I still think it's funny that I have to explicitly tell people to not take financial advice from it, because if I didn't then they would.

tptacek · on Feb 14, 2024

This isn't the target user, but the boy's been using it at the soil bacteria lab he works in to do basecalling for FAST5 data from a nanopore sequencer.

yard2010 · on Feb 14, 2024

Can you please elaborate?

tptacek · on Feb 14, 2024

I am nowhere within a million miles smart enough to elaborate on this one.

subarctic · on Feb 14, 2024

Commenters like this, for one thing: https://news.ycombinator.com/item?id=34242767