Addressing the Trillion-Dollar Cloud Paradox
A recent article by Andreessen Horowitz (a16z) refers to the cost of cloud as a “trillion dollar paradox.” What’s the paradox? Companies need to embrace the efficiencies and optimizations that come with cloud—but they also have to shake themselves free of excess cloud costs.
The article points out that too many companies underestimate and under-budget their cloud spend and are then hit with an unnecessarily high bill for cloud infrastructure and services. For example, Dropbox, which recently repatriated its workloads, removing them from the public cloud, saved nearly $75M over two years while elevating gross margins from 33% to 67%. These are dramatic results, which Dropbox attained not by taking on CAPEX and returning to an on-premises model, but by moving to a colocation infrastructure model that promises greater cost efficiency than public cloud.
Essentially, the message is that while many companies manage to actualize the promise of the cloud—scalability and cost savings—early on in their journey, later on, as they focus on new features, those savings start to evaporate and “the pressure it puts on margins can start to outweigh the benefits.”
These are genuine concerns, and as the article mentions, fixing the problem by rewriting or restructuring for optimization, to get better performance and value out of the cloud, can take years and add considerably to ongoing and long-term costs. So much so that many companies simply give up.
So has the cloud bubble popped? Is repatriation the best option?
In truth, “repatriation” is a misnomer. It implies that you’re retaking control so you can rein in expenses. In practice, you’d be migrating to a private cloud hosted in a colocation facility, one that may outcompete the major public cloud players on price but that also opens up a wide range of unknowns in terms of service, quality, and even longer-term cost.
Think Beyond CAPEX
Nevertheless, the article makes a number of valid points, particularly in its conclusion. Every company migrating to and starting its journey with the cloud needs to think beyond short-term CAPEX cost savings to long-term optimization.
The authors at a16z recommend using cloud spend as a leading KPI in all departments, including engineering, right alongside performance and reliability metrics. This makes sense, since there’s no stage in the cloud lifecycle when costs don’t matter. And because, as the article points out, cost optimization becomes even more critical as your cloud infrastructure matures, it pays to put this KPI in place from a very early stage.
Further recommendations include incentivizing cloud-efficient behavior, such as paying a bonus to engineers who optimize or shut down unnecessary workloads, and assessing cloud spend as part of cost of goods sold (COGS), which ties it directly to revenue earned.
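To make the idea of tying cloud spend to revenue concrete, here is a minimal Python sketch of how a team might track it as a KPI. Everything in it is an assumption for illustration: the `PeriodFinancials` type, the function names, and the sample figures are hypothetical and do not come from the a16z article or from any company's actual numbers.

```python
from dataclasses import dataclass


@dataclass
class PeriodFinancials:
    """Hypothetical figures for one reporting period (same currency throughout)."""
    revenue: float            # revenue earned in the period
    cloud_spend: float        # cloud bill for the same period
    other_cogs: float = 0.0   # cost of goods sold other than cloud


def cloud_share_of_revenue(p: PeriodFinancials) -> float:
    """Fraction of revenue consumed by cloud spend."""
    if p.revenue <= 0:
        raise ValueError("revenue must be positive")
    return p.cloud_spend / p.revenue


def gross_margin(p: PeriodFinancials) -> float:
    """Gross margin when cloud spend is counted as part of COGS."""
    if p.revenue <= 0:
        raise ValueError("revenue must be positive")
    return (p.revenue - p.cloud_spend - p.other_cogs) / p.revenue


# Illustrative numbers only, not real data.
example = PeriodFinancials(revenue=1_000_000, cloud_spend=250_000, other_cogs=150_000)
print(f"Cloud share of revenue: {cloud_share_of_revenue(example):.1%}")
print(f"Gross margin: {gross_margin(example):.1%}")
```

Tracked period over period, a rising cloud share of revenue is the kind of early signal the article has in mind: it shows margin pressure building before it becomes a crisis.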
The implication, however, is that if cloud isn’t earning its keep—and the article strongly suggests that as your cloud infrastructure matures, it probably won’t—the best alternative is repatriation.
But before you weigh that, especially in light of the hidden costs of repatriation, it’s worth considering other solutions recommended by the article, including third-party optimization tools.
For example, Granulate’s cloud optimization solution uses AI to help reduce compute spending while improving key application metrics. And as our many customer success stories show, Granulate works fast, especially compared to the time and headaches of migrating your entire data center. When one online advertising company’s workloads scaled to encompass over 1,000 brands and advertisers and 500 premier publishers, they used Granulate to cut CPU utilization by 60%, response time by 10%, and costs by 52%, all within days and with zero R&D effort.
Granulate’s AI-driven real-time continuous optimization gives you a faster, less disruptive alternative to repatriation.
Our solution works right out of the box, automatically and with zero R&D effort, providing ongoing improvement to OS-level scheduling and prioritization. You’ll reduce the number of resources you need and downsize those that aren’t in use, so you get better quality of service from your cloud infrastructure along with dramatically lower compute costs.
Repatriation is only one option to address the cloud paradox, and it comes with costs and risks of its own. Before leaping to such a drastic conclusion, find out how you can start working more efficiently just using your current cloud infrastructure.