
I think I independently came up with the same idea to solve the distributed gfs master problem: use a separate bigtable/gfs1 cluster to hold the master metadata for gfs2.

I'm glad that I'm not crazy :)



Also, does gfs1 still have a single master, so that gfs2 has multiple Bigtable tablets serving as its distributed masters? Is that the cause of "In fact, it just makes the bottleneck limitations of the system’s single-master design more apparent than would otherwise be the case.", as stated in the article?


gfs1 is still single-master, but the workload is much simpler in this case: it serves the gfs2 master bigtable cluster exclusively. Most of the documented gfs master failures are due to misbehaved map-reduce clients. Also, the gfs1 master can be down for an extended period of time without affecting master operations, due to the nature of the cluster (you're unlikely to create a million files per second, which would cause a lot of compactions and splits in the metadata tablets).

The quote you mentioned actually means that if you use Bigtable on top of gfs1, the single-master failure mode becomes more apparent because of the low-latency requirements of the applications that use the Bigtable.
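
For anyone trying to picture the layering, here's a rough, purely illustrative sketch in Python. All class and method names are hypothetical (this is not Google's actual internal API); it only shows the idea described above: gfs2 file metadata lives as rows in a Bigtable, that Bigtable persists its tablets as files on a small dedicated gfs1 cluster, and the gfs1 single master only matters when tablets get flushed or split.

    # Hypothetical sketch of the gfs2-metadata-on-Bigtable-on-gfs1 layering.

    class Gfs1Cluster:
        """Single-master GFS cluster used only to back the metadata Bigtable."""
        def __init__(self):
            self.files = {}  # path -> bytes, namespace managed by the single master

        def write(self, path, data):
            self.files[path] = data

        def read(self, path):
            return self.files.get(path)


    class MetadataBigtable:
        """Bigtable whose tablets are persisted as gfs1 files.
        Rows map gfs2 file paths to their chunk locations."""
        def __init__(self, backing_gfs1):
            self.gfs1 = backing_gfs1
            self.rows = {}  # in-memory memtable; written out to gfs1 on compaction

        def put(self, row_key, value):
            self.rows[row_key] = value

        def get(self, row_key):
            return self.rows.get(row_key)

        def compact(self):
            # Persist the memtable as an immutable tablet file on the gfs1 cluster.
            # Metadata mutations are comparatively rare, so compactions (and hence
            # the gfs1 master) are rarely on the critical path.
            tablet = repr(sorted(self.rows.items())).encode()
            self.gfs1.write("/tablets/metadata-000001.sst", tablet)


    class Gfs2Master:
        """'Distributed master': gfs2 file metadata is just rows in the Bigtable,
        so it can be served by many tablet servers instead of one process."""
        def __init__(self, metadata):
            self.metadata = metadata

        def create_file(self, path, chunk_locations):
            self.metadata.put(path, chunk_locations)

        def lookup(self, path):
            return self.metadata.get(path)


    if __name__ == "__main__":
        gfs1 = Gfs1Cluster()
        metadata = MetadataBigtable(gfs1)
        master = Gfs2Master(metadata)

        master.create_file("/logs/frontend-000042", ["chunkserver-3", "chunkserver-7"])
        print(master.lookup("/logs/frontend-000042"))
        metadata.compact()

The point of the toy example is just the dependency direction: client-visible gfs2 metadata operations hit Bigtable tablet servers, and only background tablet persistence touches the gfs1 single master.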


Is this vacaya related to the vacaya of hypertable? :-)


Aha, we had conceived of this same approach several months ago in China ...



