Apache Kafka Challenges with Data Streaming

Jacob Simkovich

Brand and Content Manager, Intel Granulate

Apache Kafka, the open source distributed event streaming platform, enables organizations to access and harness the streams of data that their operations rely on. With it, organizations can build real-time streaming data pipelines that reliably move data between applications or systems. They can also build real-time streaming applications that transform or react to those data streams. Organizations that want to move large volumes of data quickly, and do so at scale, should be using Apache Kafka.

However, there are some limitations. Kafka can be downloaded, modified, used and redistributed free of charge, but its users, including the more than 80 percent of the Fortune 100 that use Apache Kafka for data streaming, still have to guard against excessive costs related to operations, infrastructure and downtime.

While Kafka is easily scalable, application developers and others can run into trouble when resources have to meet dynamic demand or unexpected events in real time, such as when a mobile trading app sees a spike in user traffic far above what was anticipated. Elasticity is critical for Kafka because organizations need to scale by the right amount at the right time, avoiding both wasted and insufficient resources while still meeting the needs of their systems in real time. Without sufficient elasticity, organizations running Apache Kafka can expect to incur added costs for their streaming data infrastructure.

Kafka Elasticity Challenges and Data Streaming Costs

Employing Kafka systems without having the advantage of elasticity results in conditions that add to the cost of managing data streaming infrastructures:

Over-provisioning to support fluctuating demand. When systems receive more resources than needed to handle a current workload, organizations have to absorb the costs of the unused capacity.
Under-provisioning. In contrast, when fewer resources are on hand than the workload needs, the systems can be overloaded and the quality of the services they are supposed to provide will suffer.
CPU overloading. One cause of high CPU usage is the compression of messages sent by producers. Compression reduces network bandwidth usage and saves disk space on Kafka brokers, but compressing and decompressing messages costs CPU cycles. Encrypted channels also raise CPU usage. Sustained high CPU usage typically results in poor overall performance.
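
The CPU cost of message compression noted in the list above can be illustrated without a Kafka cluster. The following is a minimal sketch that uses Python's standard gzip module as a stand-in for a producer-side codec. The function name, the sample payload and the batch size are made up for illustration, and real compression ratios and CPU costs depend on the codec and the data.

```python
import gzip
import json
import time

def compression_tradeoff(messages, level=6):
    """Return (raw_bytes, compressed_bytes, cpu_seconds) for one batch."""
    # Serialize the batch the way a producer might before sending it.
    batch = "\n".join(json.dumps(m) for m in messages).encode("utf-8")
    start = time.process_time()
    compressed = gzip.compress(batch, compresslevel=level)
    restored = gzip.decompress(compressed)
    cpu_seconds = time.process_time() - start
    # Round trip must be lossless.
    assert restored == batch
    return len(batch), len(compressed), cpu_seconds

# Hypothetical registration events, not real data.
sample = [
    {"user_id": i, "event": "registration", "source": "promo"}
    for i in range(1000)
]
raw, packed, cpu = compression_tradeoff(sample)
print(f"raw={raw} B, compressed={packed} B, cpu={cpu:.4f} s")
```

The smaller compressed size is what saves bandwidth and disk space; the measured CPU time is the price paid for it on every batch.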

Consider AWS-hosted Kafka workloads that are responsible for new customer registrations on a retailing app. After a heavily advertised promotion, user traffic spikes far above what was expected.

A delay in scaling out, that is, adding brokers to the cluster to handle the spike, can result in outages or server overloads, and the app may struggle to process new user registrations. On the customer-facing side, prospective customers may be kicked out of the registration process. Other parts of the app may suffer as well, with existing customers unable to log in to their accounts or add items to their carts. Conversely, if user traffic is much lower than anticipated, a delay in removing unused brokers leaves servers idle, which wastes cloud spend.
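
The effect of a scaling delay in either direction can be put into rough numbers. The sketch below is a toy model, not a description of how Kafka or any autoscaler works: it assumes each broker handles a fixed number of messages per second, that the cluster is always sized for the demand seen a few steps earlier, and the traffic figures are invented.

```python
import math

def brokers_needed(demand, per_broker_capacity):
    """Smallest broker count that covers the demand (at least one)."""
    return max(1, math.ceil(demand / per_broker_capacity))

def simulate_scaling_lag(demand_series, per_broker_capacity, lag_steps):
    """Return (unserved, idle) capacity totals across the series."""
    unserved = 0
    idle = 0
    for t, demand in enumerate(demand_series):
        # The cluster is sized for the demand observed lag_steps ago.
        observed = demand_series[max(0, t - lag_steps)]
        capacity = brokers_needed(observed, per_broker_capacity) * per_broker_capacity
        unserved += max(0, demand - capacity)  # under-provisioned
        idle += max(0, capacity - demand)      # over-provisioned
    return unserved, idle

# Hypothetical traffic: a promotion spike followed by a return to normal.
traffic = [100, 100, 900, 900, 900, 100, 100]
print(simulate_scaling_lag(traffic, 100, 0))  # (0, 0)
print(simulate_scaling_lag(traffic, 100, 2))  # (1600, 1600)
```

With no lag the cluster tracks demand exactly. With a two-step lag the same traffic produces both unserved requests during the spike and idle capacity after it, which mirrors the failed registrations and wasted cloud spend described above.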

How to Address the Kafka Data Streaming Elasticity Problem

Keep in mind that elasticity is more than simply being able to scale faster, although that is certainly a need for modern organizations. It is the ability to add and remove data infrastructure resources when necessary, as close to real time as possible, in order to respond to sudden fluctuations in demand. This helps ensure that workloads are optimized and that your organization is not overpaying for that infrastructure.

There are various optimization solutions that data engineering teams can employ to configure and monitor their Kafka infrastructures. These solutions can lessen the burden of managing Kafka infrastructure and reduce the need for specialized in-house training.

There are also solutions like Granulate’s big data optimization solution that organizations can use to scale resources with demand and optimize Kafka workloads. Granulate enables organizations to improve the performance of their data streams by continuously optimizing application runtime and resource allocation.
