
Not to be a jerk, but 'hundreds of devs and dozens of MRs per day' is not 'huge repos'. Certain functionality only becomes relevant at scale, and what is easy on a repo worth hundreds of megabytes no longer works once you have terabytes of source code to deal with.


> terabytes of source code

You sure that exists?

Git repositories that contain terabytes of source code?

I could imagine a repo that is terabytes but has binaries committed or similar... But source code?


Google's monorepo is in fact terabytes with no binaries. It does stretch the definition of source code though - a lot of that is configuration files (at worst, text protos) which are automatically generated.


Google had 86 TB of source code data in Piper way back in 2016.


Dang, that's mind-boggling, especially if I keep in mind that a book series like Lord of the Rings is mere kilobytes if saved as plain text.

Having 86 TB of plain text/source code... I can't fathom the scale, honestly.

Are you absolutely sure there aren't binaries in there? (Honestly asking, the scale is just insane from my perspective. Even the largest book compilation, like Anna's, isn't approaching that number if you strip out images, and that's pretty much all books in circulation, with multiple versions per title.)


Each snapshot of the repo isn't that big, but all the snapshots together, plus all the commit metadata and such, are.


Git could never, but Piper at Google is way over that figure. Way, way over.


Microsoft has actually done a lot of work to scale Git to large repos.


It's why there's a special Microsoft Git VFS (a lot like the VFS at Google that is also referenced in the talk).

It was made to make working on Windows source code possible with Git.


Very sure, I work in one.



