Would it be useful for you to have a SQL database that's like SQLite (a single file, though not actually compatible with the SQLite file format) but can do 100M writes/s instead?
I have been looking for a replacement for SQLite for years -- admittedly not very actively; embedded databases are just a hobby obsession and life did not allow me much leisure time in recent years -- and I still couldn't find one.
The Rust-written `sled` database is more of a key-value store; I had partial successes with it, but it's too much work turning a KV store into a relational database.
I have a PoC KV store (in Rust) that does > 150M writes/s on a single core (for 8 bytes of data per insert -- it becomes bandwidth-limited on disk quite quickly, even on the latest NVMe PCIe 5 drives). The plan is to have it support application-level RAID0 out of the box so that you could spread the data across multiple disks, though of course that's something you have to set up up-front when you create the DB.
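For illustration only, the application-level RAID0 idea can be sketched in a few lines: hash each key to pick a shard directory (one per physical disk) and append the record there. This is a hypothetical Python sketch of the striping scheme, not the actual Rust PoC -- all names here are made up:

```python
import hashlib
import os

def shard_for_key(key: bytes, n_shards: int) -> int:
    """Pick a shard by hashing the key, so writes spread evenly across disks."""
    digest = hashlib.blake2b(key, digest_size=8).digest()
    return int.from_bytes(digest, "big") % n_shards

class ShardedStore:
    """Append-only KV writer striped over several directories (one per disk)."""

    def __init__(self, shard_dirs):
        self.files = []
        for d in shard_dirs:
            os.makedirs(d, exist_ok=True)
            self.files.append(open(os.path.join(d, "segment.log"), "ab"))

    def put(self, key: bytes, value: bytes) -> None:
        f = self.files[shard_for_key(key, len(self.files))]
        # Length-prefixed record: key_len, val_len, key, value.
        f.write(len(key).to_bytes(4, "big")
                + len(value).to_bytes(4, "big")
                + key + value)

    def close(self) -> None:
        for f in self.files:
            f.close()
```

The shard count is baked into the hash, which is exactly why this has to be fixed when the DB is created -- changing it later would reshuffle every key.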
I would then add a SQL engine on top -- not sure how much SQL would slow things down, but hopefully not much. I haven't found anyone who's interested in anything like that, though.
And yes I realize this is several orders of magnitude more performance than any other DB out there.
Similar to @zeroq, I don't really need a K/V store. I need a full relational database that I can also use for analytics and time series, and I want it embedded. And as strict as possible -- PostgreSQL is doing this really well.
I don't mind that there are databases that can be forced into all that and work alright, but I admit I am not willing to put in the plumbing work if I can avoid it. If I can't avoid it at some point then, well, we'll cross that bridge when we get to it.
DuckDB and ClickHouse(-Local) are amazing candidates but I have never evaluated their normal OLTP performance. For now.
(EDIT: don't let that stop you, however. Please publish it and announce your thing on HN as well.)
KV doesn't cut it for my use case. After import I'm running several queries to remove unwanted data. I briefly contemplated filtering while importing, but (a) I couldn't really handle relationships -- hence two distinct steps, remove and remove orphans; (b) even if I could, the code is much cleaner with a single simple function to import a table and then clean one-liners to remove the weeds; and (c) I need SQL queries later on anyway.
Honestly I don't see much use for yet-another-sqlite.
The premise of having 100M writes/s instead of 500k/s sounds a bit unrealistic, but at the same time, while simply importing tuples and completely ignoring stuff like foreign keys, I'm only utilizing one core. I had an experiment on my todo list to run these imports in parallel into different databases and then merge them somehow, but I ran out of time. Again, a 10 GB SQLite database is quite large.
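For what it's worth, the merge half of that experiment is straightforward with SQLite's `ATTACH DATABASE`: import into N separate files in parallel, then bulk-copy each one into the main file. A minimal sketch using Python's built-in sqlite3 module (the `merge_into` helper and table name are mine, for illustration):

```python
import sqlite3

def merge_into(main_path: str, part_path: str, table: str) -> None:
    """Bulk-copy one partial database's table into the main database."""
    conn = sqlite3.connect(main_path)
    # ATTACH makes the second file visible under the alias "part".
    conn.execute("ATTACH DATABASE ? AS part", (part_path,))
    conn.execute(f"INSERT INTO {table} SELECT * FROM part.{table}")
    conn.commit()
    conn.execute("DETACH DATABASE part")
    conn.close()
```

This sidesteps the single-writer bottleneck during the import phase; the merge itself is still serial, but `INSERT ... SELECT` is about as fast as SQLite row copying gets.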
On the other hand, I think the adoption, and the fact that you can take your DB basically anywhere and it will run right out of the box, is something you can't ignore. I was briefly looking at pglite but I don't really see benefits apart from a niche use case where you really need that compatibility with the big brother.
And then SQLite has so many hidden gems, like the scenario where you can use an SQLite file hosted over HTTP like a remote database! I can put my 10 GB database on S3 and run count(*) on the main table, and it will only take around 40 KB of bandwidth.
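(The trick works because SQLite reads fixed-size pages, so a VFS backed by HTTP Range requests only fetches the pages a query actually touches. As a purely local illustration -- no network involved -- the first 100 bytes of any database file, the same slice a `Range: bytes=0-99` request would return, already identify the format, page size, and page count:)

```python
import os
import sqlite3
import struct
import tempfile

# Build a small database, then inspect only its first 100 bytes --
# the fixed-size header an HTTP-range VFS would fetch first.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO t (body) VALUES (?)", [("x" * 100,)] * 1000)
conn.commit()
conn.close()

with open(path, "rb") as f:
    header = f.read(100)  # a "Range: bytes=0-99" request would return the same

magic = header[:16]                                # b"SQLite format 3\x00"
page_size = struct.unpack(">H", header[16:18])[0]  # big-endian, offset 16
n_pages = struct.unpack(">I", header[28:32])[0]    # database size in pages

print(magic, page_size, n_pages)
```

From there, a range-capable client walks the B-tree one page at a time, which is why a count(*) over an index can cost kilobytes instead of gigabytes.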
> Honestly I don't see much use for yet-another-sqlite.
Agreed. I want something better than SQLite, something that learns from it and upgrades it further.
> On the other hand, I think the adoption and the fact that you can take your db basically anywhere and it will run right out of the box is something you can't ignore.
Absolutely. That's why I am close to finishing my Elixir -> Rust -> SQLite library, since I happen to believe most apps don't even need a dedicated DB server.
> I was briefly looking at pglite but I don't really see benefits apart from a niche use case when you really need that compatibility with big brother.
I would probably pay 1000 EUR next month, easily, if I could have SQLite with PostgreSQL's strict schema. That's the one weakness of SQLite that I hate with a passion. I know about strict mode; I am using it. Still not good enough. I want "type affinity" gone forever. It's obviously a legacy feature, and many people came to rely on it.
Hence I concluded that SQLite will never change and something newer should arrive at one point. Though how do you beat the (likely) millions of tests that SQLite has? You don't... but we have to start somewhere.
> And then sqlite has so many hidden gems, like the scenario where you can use a sqlite file hosted on http like a remote database! I can post my 10Gb database on S3 and run count(*) on main table and it will only take like 40kb of bandwidth.
Admittedly I never saw the value in that; to me it just seems like you have a remote database again, at which point why not just go for PostgreSQL, which is stricter and has far fewer surprises. But that's my bias towards strictness and catching bugs at the door rather than 10 km down the road.
I hear you. As someone who lost the rest of his hair over the last 10 years talking to frontend kids claiming that types are for grannies -- I'm on your side.
That said, SQLite has a niche application, and you can enforce types at the app layer, the same way you do it with web apps. Trust, but verify. Better -- don't trust at all. At the end of the day, the way I see it, it's just like protobufs et al.: you put some data into a black-box stream of bytes and it's your responsibility to ensure correctness on both ends.
@serverless
It's twofold. On one hand you have the ability to move faster. On the other you have fewer moving parts that need maintenance and can break. Plus, for me personally, it's the default mindset. Let me give you an example: in 2025, most online shops still have filters that trigger a full reload of the web page.
When I'm clicking on TVs in a shop I don't need to reload the webpage every time I click on something; the app could easily get the whole stock in a single JSON payload and filter results on the fly while I'm fiddling with the filters. Sure, it doesn't work for Amazon, but it works for 95% of shops online. Yet no one is doing it. Why?
My point - I'm looking for a way to simplify processes, and for some niche applications it's just more convenient.
RE: serverless, I think I understand the use case and I get the homogeneity argument; it's just that to me SQLite mostly wins by being embedded. The rest of its traits are a nice bonus, but tools like Litestream I don't view as super important. And I would change a good amount of its internals if it were up to me. But! It helps you, you are using it, you are happy with it -- cool!
RE: stricter types, oh, I am adding a ton of code to verify types, absolutely. My upcoming FFI library (Elixir -> Rust -> SQLite) will have a lot of "raw" FFI bridges to SQLite with some sensible default checks, but ultimately I'd leave it to the users (programmers) of the library to e.g. make sure that the field `price` in each result from the set is in fact a `REAL` (float / double) value.

That's going to be the next game, though; the "raw" FFI stuff will just make sure nothing ever crashes / panics (as far as I can guarantee it -- obviously I can't stop the OS killing the process, or running out of disk or memory) and return errors as detailed and as machine-readable as possible (a major selling point, at least for me, once I start dogfooding it).

Just today I started working on interruptible SQLite operations (via its progress handler mechanism); it's almost done and I'll release it via a PR. That will also make the library near-real-time friendly (I am aiming at 1-10 ms pauses between each check-in at most, even if you are fetching a million records). Etc.
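For the curious: the progress-handler mechanism being described is exposed by most SQLite bindings, not just this library. Here is a rough illustration in Python (the 5-call "deadline" is invented for the demo) of how a long-running statement gets aborted at a check-in:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10_000)])

calls = 0

def check_in():
    # Invoked roughly every N SQLite VM opcodes; a nonzero return value
    # aborts whatever statement is currently running.
    global calls
    calls += 1
    return 1 if calls >= 5 else 0  # pretend our deadline expired after 5 check-ins

conn.set_progress_handler(check_in, 1_000)  # call check_in every ~1000 opcodes

interrupted = False
try:
    # A deliberately expensive cross join that would otherwise run for ages.
    conn.execute("SELECT count(*) FROM t AS a, t AS b").fetchone()
except sqlite3.OperationalError:
    interrupted = True

print(interrupted, calls)
```

Tuning the opcode interval is exactly how you bound the pause between check-ins: a smaller interval means finer-grained interruptibility at the cost of a little callback overhead.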
So yeah, no trust and a lot of verification indeed. But I would still like to have some more safety around executing raw SQL (not talking injection here) where e.g. you are never allowed to insert a string into an integer column.
It's hard to complain, though. SQLite is one of the very best pieces of software ever made. If the price for using it is writing some more conservative validation code at the edges, then that's still a fantastic deal and I am happy to pay it.
I tested a couple of different approaches, including pglite, but Node finally shipped native SQLite with version 23 and it's fine for me.
I'm a huge fan of serverless solutions, and one of the absolute hidden gems about SQLite is that you can publish the database on an HTTP server and query it extremely efficiently from a client.
I even have a separate miniature benchmark project I thought I might publish, but then I decided it's not worth anyone's time. x]