
Read the presentation. The answer was what I expected: we had a unique problem, and because we make oil drums' worth of cash, dipping a bucket in and taking that cash to solve the problem was an easy justification.

These are really smart people solving problems they have, but many companies don't have buckets of cash to hire really smart people to solve those problems.

Also, the questions after the presentation pointed out that the data isn't always analyzed in their database, so it's more like a storage system than a database.

>Participant 1: What's the optimization happening on the pandas DataFrames, which we obviously know are not very good at scaling up to billions of rows? How are you doing that? On the pandas DataFrames, what kind of optimizations are you running under the hood? Are you doing some Spark?

>Munro: The general pattern we have internally and the users have, is that you're returning pandas DataFrames that are usable. They're fitting in memory. You're doing the querying, so it's like, limit your results to that. Then, once people have got their DataFrame back, they might choose another technology like Polars, DuckDB to do their analytics, depending on if they don't like pandas or they think it's too slow.
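
For anyone who wants that pattern spelled out, here is a rough sketch of how I read it. It is not their actual API: `read_window` and the in-memory `store` are hypothetical stand-ins for the storage layer. The point is only that the query narrows rows and columns first, so what comes back is a DataFrame small enough to fit in memory.

```python
import pandas as pd

# Hypothetical stand-in for the storage layer; not any real library's API.
def read_window(store: pd.DataFrame, start: str, end: str, columns: list[str]) -> pd.DataFrame:
    """Return only the requested date range and columns, keeping the result small."""
    return store.loc[start:end, columns]

# Toy "stored" time series: ten daily rows, two columns.
index = pd.date_range("2024-01-01", periods=10, freq="D")
store = pd.DataFrame({"price": range(10), "volume": range(100, 110)}, index=index)

# Limit the query first, then analyze whatever comes back.
df = read_window(store, "2024-01-03", "2024-01-05", ["price"])
mean_price = df["price"].mean()
```

From there the analytics are up to the user. Staying in pandas works for small results, and handing `df` to something like Polars or DuckDB is the optional second step he mentions.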



I skipped to the "why build a database" section and then skipped another two minutes of his tangential thoughts - seems like the answer is "because Moore's law"?


This comment is underrated comedy gold. You clearly have worked with big data.



