Intel Granulate blogs: Big Data
AWS EMR Clusters and Nodes
Amazon EMR lets you leverage a group of Amazon Elastic Compute Cloud (EC2) instances as a cluster for rapid processing and analysis.
Spark on AWS: Amazon EMR Features & Creating Your First Cluster
Apache Spark is an open-source, distributed data processing system for big data applications. It enables fast data analysis using in-memory...
What Is AWS EMR and 5 Critical Best Practices
AWS EMR processes data across a Hadoop cluster of virtual servers on Amazon Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2).
Running Hadoop on AWS: The Basics and 5 Tips for Success
You can run Apache Hadoop on AWS using Amazon EMR, a managed service for processing and analyzing large datasets.
Optimizing Kafka Performance
In this blog, we cover everything you need to know about optimizing Kafka performance to make sure latency remains low and throughput high.
Introduction to ETL pipelines
ETL, or extract, transform, and load, is the process of taking data from one source, transforming it, and then loading it into a destination.
Introduction To Apache Spark Performance
In this article, we first present Spark’s fundamentals, including its architecture, components, execution mode, and APIs.