I had the pleasure of helping to build and manage these facilities, both hardware and software, for 5 years. It's nice to see some of Google's real innovations reach the public eye. Some of the smartest folks I ever worked with at the company build absolutely mind-blowing tech that the outside world never has the opportunity to see or appreciate.
In fact, while much of the content in the article has been written about before, it's still probably 2-3 years or more behind where Google is actually at. I left in 2010 and didn't read about anything I had not experienced.
Reminds me of when MSN Search spent a billion dollars (or whatever was reported) saying "we have more pages than Google." Google simply updated the number of pages indexed after Microsoft was done huffing and puffing.
It was pretty funny at the time but the lesson wasn't lost on me: With competition, get ahead, stay ahead, and have things already done and implemented so you can announce big accomplishments when it's strategic for you.
Yes. You can get to the top by competing with others; you stay on top by competing with yourself.
There's a great graph in "Toyota Kata" that shows per-worker productivity of major car companies for the last several decades. They all rise together for the early part of the graph. In the 60s, the American car companies level off; Toyota keeps growing. They focused on continuous improvement, while American car companies floundered.
The really interesting part of this to me is that it's rooted in a philosophical difference. Toyota was started and run by engineers. The American car companies gave birth to the MBA approach to business. Engineers naturally seek improvement; MBAs seek profit.
Google is one of the few major companies with a philosophical background like Toyota's. It's run by nerds. Their goal isn't to increase shareholder value; it's to build great stuff and organize the world's information. Like Toyota, by following their vision, they have generated vast profits and dominated their industry.
One of the details which I find highly relevant to software and, particularly, operations is rejecting the mindset of dealing with failure by finding someone to blame and instead changing the system so that one person can't inadvertently cause a failure. I see this a lot with massive ops run books which require humans to repeatedly perform complex tasks without mistake rather than automating it and regularly testing your automation.
If you ever get the chance to talk to someone who works at a car insurance company, ask them about failure rates for cars. Japanese cars (Toyota notably) have among the lowest failure rates; that is, they are among the most reliable cars in the world by far.
I still wonder how they can manage to build affordable, reliable cars that last for years, while many expensive car makers have absurdly high failure rates.
One of the interesting reasons comes back to accounting.
Toyota's approach focuses on value from the customer perspective. So all defects are seen as waste, and are targeted for elimination.
The MBA approach just looks at P&L. Which is why Ford concealed the Pinto's tendency to explode; it was cheaper to pay the lawsuits than to fix people's tanks. Never mind that many more people would die without the recall; that wasn't relevant to increasing shareholder value.
Another good example comes at the beginning of Bob Lutz's "Car Guys vs Bean Counters". Lutz, a car lover and an automotive exec for decades, once fixed a problem with transmission manufacturing. The problem was causing a lot of people's cars to die right after the warranty expired. He got yelled at because it blew a hole in their revenue projections; they were looking forward to a lot of highly profitable transmission repairs.
Toyota can make those reliable cars because they see every worker not just as a meat robot, but as a brain that should be engaged in eliminating waste. MBA thinking looks at slow order periods as a time to cut labor costs. Toyota, cognizant of how much they have invested in their workers, looks at it as a time for training, plant improvement, and other value-creating activities.
Even if this is taken as fact, I don't see how it explains why the American car companies leveled off. It's not like potential profit is bounded but potential improvement isn't.
The long-term value of a company is based on the amount of value they create for customers. The short-term value of the company depends on profit.
So, for example, an MBA can increase profits by cutting R&D. Or by cutting costs in a way that harms product quality. The company will do well for a while because it takes a while for things like reputation and mindshare to decline. You can hide the declines for longer by investing more in promotion.
The engineer-style approach, in contrast, is to focus on cutting waste rather than cost. This is a high art in the Toyota Production System:
I don't know that the leveling off is a necessary consequence of the MBA approach. But it's certainly what I've seen, and it makes some intuitive sense. Given that some ways to increase profit improve productivity and some harm it, it's plausible that all the cheap ways to improve productivity would be exhausted early in the MBA approach.
I also think the MBA approach can lead you into a local maximum that's pretty screwed up:
Thinking further, another factor may be that MBA thinking tends to be focused on external competition, while engineers tend to optimize regardless of competition. So the American car companies could have leveled off because their major competitors were doing equally well. By the time Toyota was an obvious threat, they were too far behind to even understand how they were being beaten.
Google is another good example. They didn't seek out data center "best practices". They radically bested the competition, proceeding one step at a time, with careful attention to what they needed. In an MBA analysis, that would be seen as spending a lot of money on risky R&D with no obvious ROI. In fact, they'd want to re-task all those expensive engineer brains to something more directly related to revenue. And probably drop the quality of the ops staff as a money-saving measure.
It's only when the years of patient engineer-style optimization add up to an insurmountable lead that it looks good in a B-school spreadsheet.
> In fact, while much of the content in the article has been written about before, it's still probably 2-3 years or more behind where Google is actually at. I left in 2010 and didn't read about anything I had not experienced.
What if they just plateaued and didn't really go beyond what you had done when you were there, and this is totally accurate to today?
I can believe this in a heartbeat. I know that if Platforms and datacenter/cluster management innovation stopped, I'd see a mass exodus of my Googler friends (as well as a very noticeable change in Google's products).