Music Streaming
HQ: Paris, France
Employees: 600+
Deezer is one of the largest independent music streaming platforms in the world, with more than 120 million tracks available in 180 countries, providing access to lossless HiFi audio, innovative recommendation technology and industry-defining features.
Deezer brings artists and fans together on a scalable and global platform,
unlocking the full potential of music through technology.
Infrastructure: Spark on GCP Dataproc
Data is core to Deezer’s music streaming offering. The faster and more accurately they are able to process the needs and behaviors of users, the better service and product they can offer in return.
Deezer’s data engineering team is responsible for providing a robust data platform that lets users work efficiently in a cost-controlled environment. Their scope covers everything related to content, starting with reliable, scalable and resource-efficient tools that meet the needs of data scientists, data analysts and other data users, and extending to providing Deezer teams with trusted, high-quality data in a centralized platform.
Deezer faced the challenge of migrating their Spark data jobs from on-premises infrastructure to GCP Dataproc within a limited timeframe, in order to minimize the period during which both systems had to be maintained in parallel.
The migration itself proved resource-intensive, requiring automated monitoring and quality processes to ensure a successful transfer. It also involved relocating their data assets and adapting their security practices to the cloud paradigm.
With hundreds of instances and jobs, and thousands of cores used daily in the cloud, they began FinOps work to reduce costs. One such effort was to manually optimize their most costly jobs, which proved effective but time-consuming.
In search of a solution to further optimize their workloads with an automated approach, Deezer pursued strategies focused on improving performance. They sought optimization solutions that would not compromise user data privacy and security, while maintaining the reliability, stability, and availability of their applications.
Deezer initially deployed Granulate’s agent on a small number of production jobs using Spark on Dataproc, in order to measure the effectiveness of the optimization solution on their workloads.
They saw results immediately upon activation, accelerated the rollout, and were optimizing over 1,000 jobs within two months of the first deployment. Job completion time improved, which meant that fewer compute resources were being used.
Granulate is now deployed on over 1,300 job clusters, and Deezer has exceeded their performance goals, with Job Completion Time improving by an average of 15% across a diverse range of short-lived jobs. This reduction in Job Completion Time directly and automatically lowered the compute required per service.
These performance improvements led to shorter time to completion, an average cost reduction of 15% on optimized services, and a smoother infrastructure for gathering information and serving their users more efficiently and robustly.
Job completion time reduced from 3:18 to 2:13 minutes on an example Spark job
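To put the example job figures in the same terms as the 15% average, the short sketch below converts the two durations to seconds and computes the relative reduction. It assumes the figures are in minutes:seconds format; the helper name `to_seconds` is purely illustrative and not part of any tool mentioned here.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string into total seconds (assumed format)."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


before = to_seconds("3:18")  # job completion time before optimization
after = to_seconds("2:13")   # job completion time after optimization

saved = before - after
reduction_pct = saved / before * 100

print(f"{saved} s saved, {reduction_pct:.1f}% shorter")
# prints "65 s saved, 32.8% shorter"
```

Under that reading, this single example job improved by roughly a third, well above the 15% average reported across all optimized jobs.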