Hahaha I hear you. "Me too"s are frowned upon here, but I could not resist. Ther...

tumanian · on May 9, 2012

The Hadoop API is a pain. Well, it is fine for vanilla MapReduce, but once you step away from that, things get ugly real fast with the zoo of Jobs, Tasks, and Contexts, the two mapred/mapreduce APIs that do the same thing, plus Hadoop 0.23-specific calls. I haven't touched Hadoop 1.0 yet, though I doubt they have cleaned up the whole mess. I hope they streamline the API by 2.0.

throwaway1979 · on May 9, 2012

I haven't used Google's software. Hadoop is in Java. If Google's implementation is in C or C++, that could explain a bit of the performance overhead.

4-6 times is crazy though. Are you sure it wasn't just due to data access latencies?