Dream11 Case Study

Dream11 Improves Kafka Workload Performance and Reduces AWS Costs by 40%

About

Dream11 is a popular fantasy sports platform at the forefront of the twin rise of Indian cricket and domestic fantasy league play. In its eighth year of exponential user growth, the company was also facing rising cloud infrastructure costs, as massive user traffic during pre-match periods caused ever-higher compute spikes on its AWS-hosted Kafka workload.

Granulate provided us with improved performance and reduced costs without any effort. We’re looking forward to continuing our partnership and to expanding our Granulate deployment across all our workloads.
Siddharth Terse, Senior Site Reliability Engineer

The Challenge

Known to amplify fans’ engagement with sports, Dream11 has experienced tremendous growth, from 1 million users in 2014 to over 80 million at the close of 2019. Based on real-world sporting events, users join fantasy contests that Dream11 automatically generates and promotes through the app. A contest can have anywhere from two participants to tens of millions, and users can join up until the real-world event begins.

With ever-growing user traffic spiking mostly in the hour leading up to a sporting event, the application’s architecture began encountering difficulty processing requests from the tens of thousands of users who wanted to join event-related contests, let alone those contests related to multiple simultaneous, overlapping events.

As a result of these huge spikes, some participants were unable to register or were dropped from the registration process before completing it. These issues catalyzed Dream11’s decision to pursue performance and cost optimization, but with some specific requirements.

Any optimization solution would need to enable Dream11 to scale with increasing demand, self-heal, and maintain throughput and uptime under extreme loads and spikes. It would also need to integrate seamlessly with AWS, adapt reliably to user growth estimated to double year over year, and improve the platform’s existing performance.

The Results

By implementing Granulate on a few Kafka consumer instances, Dream11 dramatically improved the service’s overall performance. A 50% reduction in CPU utilization and a 40% increase in throughput allowed it to cut infrastructure spend by 40%. With such results on just a small portion of its clusters, Dream11 got a first-hand demonstration of what Granulate could do on a system-wide scale.

Convinced of the potential cost reduction that could be achieved by expanding Granulate to additional clusters, Dream11 followed its implementation on the Kafka consumer workload by deploying Granulate in its Node.js and Java environments as well. This further improved performance, latency, and throughput, and helped Dream11 reduce costs while ensuring scalability and stability well into the future.

40% Reduced Costs
40% Increased Throughput
50% Reduced CPU Utilization

With 100 million loyal users who prioritize a low latency experience, Dream11 searched for optimization solutions that didn’t require downtime, code changes or R&D. Granulate fit the bill, and within days its continuous optimization solution reduced Dream11’s Kafka CPU utilization, boosted throughput, and reduced infrastructure costs.