Certhippo

CALL US
+1 302 956 2015 (USA)

571,823
Satisfied Learners

250,000+
Hours Classes

30,000+
Assignments

2,500+
Projects

Apache Kafka Certification Training

Kafka is an open-source stream processing platform that can be integrated with Spark, Storm and Hadoop. Learn about Kafka Architecture, set up a Kafka Cluster, understand Kafka Stream APIs, and implement Twitter Streaming with Kafka, Flume, Hadoop and Storm.

Why this course?

  • Kafka is used heavily in the Big Data space as a reliable way to ingest and move large amounts of data very quickly
  • LinkedIn, Yahoo, Twitter, Netflix, Uber, Goldman Sachs, PayPal, Airbnb & other Fortune 500 companies use Kafka
  • The average salary of a Software Engineer with Apache Kafka skills is $87,500 per year (Payscale.com salary data)
  • 15K+ satisfied learners

Instructor-led Sessions

30 Hours of Online Live Instructor-Led Classes. Weekend Class: 10 sessions of 3 hours each.

Real-life Case Studies

Live project based on any of the selected use cases, involving implementation of Kafka concepts.

Assignments

Each class has practical assignments, to be finished before the next class, which help you apply the concepts taught during the class.

Lifetime Access

You get lifetime access to the Learning Management System (LMS), where presentations, quizzes, installation guides & class recordings are available.

24 x 7 Expert Support

We have a 24x7 online support team to resolve all your technical queries through a ticket-based tracking system, for a lifetime.

Certification

Towards the end of the course, you will be working on a project. Our Expert certifies you as an Apache Kafka Expert based on the project.

Forum

We have a community forum for all our customers, wherein you can enrich your learning through peer interaction and knowledge sharing.

Apache Kafka Certification Training is designed to provide you with the knowledge and skills to become a successful Kafka Big Data Developer. The training encompasses the fundamental concepts of Kafka (such as Kafka Cluster and Kafka API) and covers advanced topics (such as Kafka Connect, Kafka Streams, and Kafka integration with Hadoop, Storm and Spark), thereby enabling you to gain expertise in Apache Kafka.

After completing the Real-Time Analytics with Apache Kafka course at Edureka, you should be able to:

  • Learn Kafka and its components
  • Set up an end-to-end Kafka cluster along with a Hadoop and YARN cluster
  • Integrate Kafka with real-time streaming systems like Spark & Storm
  • Describe the basic and advanced features involved in designing and developing a high-throughput messaging system
  • Use Kafka to produce and consume messages from various sources, including real-time streaming sources like Twitter
  • Gain insight into the Kafka API
  • Understand Kafka Stream APIs
  • Work on a real-life project, ‘Implementing Twitter Streaming with Kafka, Flume, Hadoop & Storm’

Kafka training helps you gain expertise in Kafka Architecture, Installation, Configuration, Performance Tuning, Kafka Client APIs like Producer, Consumer and Stream APIs, Kafka Administration, Kafka Connect API and Kafka Integration with Hadoop, Storm and Spark using Twitter Streaming use case.

This course is designed for professionals who want to learn Kafka techniques and apply them to Big Data. It is highly recommended for:

  • Developers who want to advance their careers as a "Kafka Big Data Developer"
  • Testing Professionals who are currently involved in Queuing and Messaging Systems
  • Big Data Architects who would like to include Kafka in their ecosystem
  • Project Managers who are working on projects related to Messaging Systems
  • Admins who want to advance their careers as an "Apache Kafka Administrator"

Fundamental knowledge of Java concepts is mandatory. Certhippo provides a complimentary course, "Java Essentials", to all participants who enroll for the Apache Kafka Certification Training.


  • Minimum RAM required: 4GB (Suggested: 8GB)
  • Minimum Free Disk Space: 25GB
  • Minimum Processor: i3 or above
  • Operating System: 64-bit
  • Participants’ machines must support a 64-bit VirtualBox guest image.

We will help you set up Certhippo's Virtual Machine on your system with local access. Detailed installation guides for setting up the environment are provided in the LMS. If you have any doubts, the 24x7 support team will promptly assist you. The Certhippo Virtual Machine can be installed on a Mac or Windows machine.

Case Study 1:
Stock Profit Ltd, India’s first discount broker, offers zero brokerage & unlimited online share trading in Equity Cash. Design a system to capture real-time stock data from the source (i.e. Yahoo.com) and calculate the profit and loss for customers who are subscribed to the tool. Finally, store the result in HDFS.

Case Study 2:
You are an SEO specialist in a company. You get an email from management asking for the Top Trending Keywords. You have to write a topology that can consume keywords from Kafka. You have been given a file containing various search keywords across multiple verticals.
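
The counting step of this case study can be sketched in a few lines. The following is a minimal illustration in plain Python, not the Storm topology the case study asks for: the function name top_trending and the sample keywords are hypothetical, and the list stands in for keywords that would be consumed from a Kafka topic.

```python
from collections import Counter


def top_trending(keywords, n=3):
    """Return the n most frequent keywords as (keyword, count) pairs."""
    # Normalise case and surrounding whitespace so "Kafka " and "kafka" count together.
    cleaned = (k.strip().lower() for k in keywords if k.strip())
    return Counter(cleaned).most_common(n)


# Hypothetical sample of search keywords, standing in for messages read from a topic.
sample = ["kafka", "storm", "Kafka", "spark", "kafka ", "storm"]
print(top_trending(sample, 2))  # [('kafka', 3), ('storm', 2)]
```

In a streaming setting the counts would be updated per incoming message rather than computed over a finished list.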

Case Study 3:
You have to build a system that is consistent in nature. For example, if you are getting product feeds through either a flat file or an event stream, you have to make sure you don’t lose any events related to the product, especially inventory and price.

Price and availability should always be consistent, because the product may have been sold, the seller may no longer want to sell it, or there may be some other reason. However, attributes like name and description cause less trouble if they are not updated on time.

Case Study 4:
John wants to build an e-commerce portal like Amazon, Flipkart or Paytm. He will ask sellers/local brands to upload all their products to the portal so that users/buyers can visit the portal online and purchase. John doesn’t have much knowledge about the system, and he has hired you to build a reliable and scalable solution where buyers and sellers can easily update their products.

Goal: In this module, you will understand where Kafka fits in the Big Data space and learn about Kafka Architecture. In addition, you will learn about the Kafka Cluster, its Components, and how to Configure a Cluster.

Skills:
• Kafka Concepts
• Kafka Installation
• Configuring Kafka Cluster

Objectives: At the end of this module, you should be able to:
• Explain what Big Data is
• Understand why Big Data Analytics is important
• Describe the need for Kafka
• Know the role of each Kafka Component
• Understand the role of ZooKeeper
• Install ZooKeeper and Kafka
• Classify different types of Kafka Clusters
• Work with Single Node-Single Broker Cluster

Topics:
• Introduction to Big Data
• Big Data Analytics
• Need for Kafka
• What is Kafka? 
• Kafka Features
• Kafka Concepts
• Kafka Architecture
• Kafka Components 
• ZooKeeper
• Where is Kafka Used?
• Kafka Installation
• Kafka Cluster 
• Types of Kafka Clusters
• Configuring Single Node Single Broker Cluster

Hands on:
• Kafka Installation
• Implementing Single Node-Single Broker Cluster

Goal: Kafka Producers send records to topics. The records are sometimes referred to as Messages. In this module, you will work with different Kafka Producer APIs.

Skills:
• Configure Kafka Producer
• Constructing Kafka Producer
• Kafka Producer APIs
• Handling Partitions

Objectives:
At the end of this module, you should be able to:
• Construct a Kafka Producer
• Send messages to Kafka
• Send messages Synchronously & Asynchronously
• Configure Producers
• Serialize Using Apache Avro
• Create & handle Partitions

Topics:
• Configuring Single Node Multi Broker Cluster
• Constructing a Kafka Producer
• Sending a Message to Kafka
• Producing Keyed and Non-Keyed Messages 
• Sending a Message Synchronously & Asynchronously
• Configuring Producers
• Serializers
• Serializing Using Apache Avro
• Partitions

Hands On:
• Working with Single Node Multi Broker Cluster
• Creating a Kafka Producer
• Configuring a Kafka Producer
• Sending a Message Synchronously & Asynchronously
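
To illustrate how keyed and non-keyed messages relate to partitions, here is a toy partition chooser in plain Python. It is a sketch under stated assumptions: SimplePartitioner is a hypothetical class, CRC32 stands in for whatever hash the real client uses, and plain round-robin for non-keyed records is a simplification of the real producer's behaviour. The property it demonstrates is that records with the same key always land in the same partition.

```python
import zlib
from itertools import count


class SimplePartitioner:
    """Toy partition chooser: hash keyed records, rotate non-keyed ones."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._next = count()  # round-robin counter for records without a key

    def partition(self, key=None):
        if key is None:
            # No key: spread records evenly across partitions.
            return next(self._next) % self.num_partitions
        # Keyed: the same key always maps to the same partition.
        # crc32 is a stand-in here, not the hash a real Kafka client uses.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


p = SimplePartitioner(3)
print([p.partition() for _ in range(4)])                 # [0, 1, 2, 0]
print(p.partition("user-42") == p.partition("user-42"))  # True
```

Because ordering in Kafka is only guaranteed within a partition, choosing a key such as a user or product ID keeps all records for that entity in order.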

Goal: Applications that need to read data from Kafka use a Kafka Consumer to subscribe to Kafka topics and receive messages from these topics. In this module, you will learn to construct a Kafka Consumer, process messages from Kafka with a Consumer, run a Kafka Consumer and subscribe to Topics.

Skills:
• Configure Kafka Consumer
• Kafka Consumer API
• Constructing Kafka Consumer

Objectives: At the end of this module, you should be able to:
• Perform Operations on Kafka
• Define Kafka Consumer and Consumer Groups
• Explain how Partition Rebalance occurs 
• Describe how Partitions are assigned to Kafka Brokers
• Configure Kafka Consumer
• Create a Kafka consumer and subscribe to Topics
• Describe & implement different Types of Commit
• Deserialize the received messages

Topics:
• Consumers and Consumer Groups
• Standalone Consumer
• Consumer Groups and Partition Rebalance
• Creating a Kafka Consumer
• Subscribing to Topics
• The Poll Loop
• Configuring Consumers
• Commits and Offsets
• Rebalance Listeners
• Consuming Records with Specific Offsets
• Deserializers

Hands On:
• Creating a Kafka Consumer
• Configuring a Kafka Consumer
• Working with Offsets
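
Consumer groups and partition rebalancing can be pictured with a small simulation. The sketch below, in plain Python, assumes a simple round-robin assignment; the function assign and the consumer names are hypothetical, and Kafka's actual assignors and rebalance protocol are more involved. It shows that each partition is owned by exactly one consumer in the group, and that ownership is recomputed when group membership changes.

```python
def assign(partitions, consumers):
    """Spread partitions over consumers round-robin; returns {consumer: [partitions]}."""
    members = sorted(consumers)
    if not members:
        return {}
    assignment = {c: [] for c in members}
    for i, p in enumerate(sorted(partitions)):
        assignment[members[i % len(members)]].append(p)
    return assignment


partitions = range(6)
print(assign(partitions, ["c1", "c2", "c3"]))
# {'c1': [0, 3], 'c2': [1, 4], 'c3': [2, 5]}

# c2 leaves the group: its partitions are redistributed among the remaining members.
print(assign(partitions, ["c1", "c3"]))
# {'c1': [0, 2, 4], 'c3': [1, 3, 5]}
```

If there are more consumers than partitions, the extra consumers receive an empty list, which mirrors the fact that surplus consumers in a group sit idle.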

Goal: Apache Kafka provides a unified, high-throughput, low-latency platform for handling real-time data feeds. Learn more about tuning Kafka to meet your high performance needs.

Skills:
• Kafka APIs
• Kafka Storage 
• Configure Broker

Objectives: 
At the end of this module, you should be able to:
• Understand Kafka Internals
• Explain how Replication works in Kafka
• Differentiate between In-sync and Out-of-sync Replicas
• Understand the Partition Allocation
• Classify and Describe Requests in Kafka
• Configure Broker, Producer, and Consumer for a Reliable System
• Validate System Reliability
• Configure Kafka for Performance Tuning
 
Topics:
• Cluster Membership
• The Controller
• Replication
• Request Processing
• Physical Storage
• Reliability 
• Broker Configuration
• Using Producers in a Reliable System
• Using Consumers in a Reliable System
• Validating System Reliability
• Performance Tuning in Kafka

Hands On:
• Create topic with partition & replication factor 3 and execute it on multi-broker cluster
• Show fault tolerance by shutting down 1 Broker and serving its partition from another broker
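
The fault-tolerance exercise can be reasoned about with a small simulation before running it on a real cluster. The sketch below is plain Python under simplifying assumptions: elect_leaders is a hypothetical function, the replica layout is made up, and every replica on a live broker is treated as in-sync, whereas a real cluster tracks the in-sync replica set explicitly.

```python
def elect_leaders(replicas, live_brokers):
    """For each partition, pick the first replica that sits on a live broker.

    replicas: {partition: [broker ids, preferred leader first]}
    Returns {partition: leader broker id, or None if no replica is alive}.
    """
    leaders = {}
    for partition, brokers in replicas.items():
        alive = [b for b in brokers if b in live_brokers]
        leaders[partition] = alive[0] if alive else None
    return leaders


# Hypothetical topic: 3 partitions, replication factor 3, brokers 0 to 2.
replicas = {0: [0, 1, 2], 1: [1, 2, 0], 2: [2, 0, 1]}
print(elect_leaders(replicas, {0, 1, 2}))  # {0: 0, 1: 1, 2: 2}
print(elect_leaders(replicas, {0, 2}))     # broker 1 down: {0: 0, 1: 2, 2: 2}
```

With a replication factor of 3, every partition still has a leader after one broker is shut down, which is the behaviour the hands-on exercise sets out to show.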

Goal: A Kafka Cluster typically consists of multiple brokers to maintain load balance. ZooKeeper is used for managing and coordinating Kafka brokers. Learn about Kafka Multi-Cluster Architectures, Kafka Brokers, Topics, Partitions, Consumer Groups, Mirroring, and ZooKeeper Coordination in this module.

Skills: 
• Administer Kafka

Objectives:
At the end of this module, you should be able to:
• Understand Use Cases of Cross-Cluster Mirroring
• Learn Multi-cluster Architectures
• Explain Apache Kafka’s MirrorMaker
• Perform Topic Operations
• Understand Consumer Groups
• Describe Dynamic Configuration Changes
• Learn Partition Management
• Understand Consuming and Producing
• Explain Unsafe Operations

Topics:
• Use Cases - Cross-Cluster Mirroring
• Multi-Cluster Architectures
• Apache Kafka’s MirrorMaker
• Other Cross-Cluster Mirroring Solutions
• Topic Operations
• Consumer Groups
• Dynamic Configuration Changes
• Partition Management
• Consuming and Producing
• Unsafe Operations

Hands on:
• Topic Operations
• Consumer Group Operations
• Partition Operations
• Consumer and Producer Operations

Goal: Learn about the Kafka Connect API and Kafka Monitoring. Kafka Connect is a scalable tool for reliably streaming data between Apache Kafka and other systems.

Skills: 
• Kafka Connect
• Metrics Concepts
• Monitoring Kafka

Objectives:
At the end of this module, you should be able to:
• Explain the Metrics of Kafka Monitoring
• Understand Kafka Connect
• Build Data pipelines using Kafka Connect
• Understand when to use Kafka Connect vs Producer/Consumer API 
• Perform File source and sink using Kafka Connect

Topics:
• Considerations When Building Data Pipelines
• Metric Basics
• Kafka Broker Metrics
• Client Monitoring
• Lag Monitoring
• End-to-End Monitoring
• Kafka Connect
• When to Use Kafka Connect?
• Kafka Connect Properties

Hands on:
• Kafka Connect
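
Lag Monitoring, listed in the topics of this module, comes down to simple arithmetic: for each partition, lag is the log end offset minus the offset the consumer group has committed. The sketch below is plain Python with hypothetical names and made-up offsets; treating a partition with no commit as starting from offset 0 is an assumption of this example, since in practice that depends on the consumer's offset reset policy.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Per-partition lag = log end offset - committed offset (never below 0)."""
    return {
        p: max(end - committed_offsets.get(p, 0), 0)
        for p, end in log_end_offsets.items()
    }


# Hypothetical offsets for a 3-partition topic.
end = {0: 120, 1: 95, 2: 200}
committed = {0: 120, 1: 80}      # partition 2 has no commit yet
lag = consumer_lag(end, committed)
print(lag, sum(lag.values()))    # {0: 0, 1: 15, 2: 200} 215
```

A lag that keeps growing over time suggests the consumers cannot keep up with the producers.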

Goal: Learn about the Kafka Streams API in this module. Kafka Streams is a client library for building mission-critical real-time applications and microservices, where the input and/or output data is stored in Kafka Clusters.

Skills: 
• Stream Processing using Kafka

Objectives:
At the end of this module, you should be able to:
• Describe What Stream Processing Is
• Learn Different Types of Programming Paradigms
• Describe Stream Processing Design Patterns
• Explain Kafka Streams & Kafka Streams API

Topics:
• Stream Processing
• Stream-Processing Concepts
• Stream-Processing Design Patterns
• Kafka Streams by Example
• Kafka Streams: Architecture Overview

Hands on:
• Kafka Streams
• Word Count Stream Processing
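
The idea behind the Word Count exercise can be shown without a cluster. The sketch below is plain Python rather than the Kafka Streams API: word_count_stream is a hypothetical generator that emits an updated count every time a word arrives, loosely mimicking how a streaming word count publishes a stream of count updates instead of one final table.

```python
from collections import Counter


def word_count_stream(lines):
    """Yield (word, running_count) updates as each line arrives."""
    counts = Counter()
    for line in lines:
        for word in line.lower().split():
            counts[word] += 1
            yield word, counts[word]


updates = list(word_count_stream(["hello kafka", "hello streams"]))
print(updates)
# [('hello', 1), ('kafka', 1), ('hello', 2), ('streams', 1)]
```

The latest update for each word is its current count, so a downstream reader only needs to keep the most recent value per key.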

Goal: In this module, you will learn about Apache Hadoop, Hadoop Architecture, Apache Storm, Storm Configuration, and Spark Ecosystem. In addition, you will configure Spark Cluster, Integrate Kafka with Hadoop, Storm, and Spark.

Skills: 
• Kafka Integration with Hadoop
• Kafka Integration with Storm
• Kafka Integration with Spark

Objectives:
At the end of this module, you will be able to:
• Understand What Hadoop Is
• Explain Hadoop 2.x Core Components
• Integrate Kafka with Hadoop
• Understand What Apache Storm Is
• Explain Storm Components
• Integrate Kafka with Storm
• Understand What Spark Is
• Describe RDDs
• Explain Spark Components
• Integrate Kafka with Spark
 
Topics:
• Apache Hadoop Basics
• Hadoop Configuration
• Kafka Integration with Hadoop
• Apache Storm Basics
• Configuration of Storm 
• Integration of Kafka with Storm
• Apache Spark Basics
• Spark Configuration
• Kafka Integration with Spark

Hands On:
• Kafka integration with Hadoop
• Kafka integration with Storm
• Kafka integration with Spark

Goal: Learn how to integrate Kafka with Flume, Cassandra and Talend.

Skills:
• Kafka Integration with Flume
• Kafka Integration with Cassandra
• Kafka Integration with Talend
 
Objectives:
At the end of this module, you should be able to:
• Understand Flume
• Explain Flume Architecture and its Components
• Set up a Flume Agent
• Integrate Kafka with Flume
• Understand Cassandra
• Learn Cassandra Database Elements
• Create a Keyspace in Cassandra
• Integrate Kafka with Cassandra
• Understand Talend
• Create Talend Jobs
• Integrate Kafka with Talend

Topics:
• Flume Basics
• Integration of Kafka with Flume
• Cassandra Basics such as Keyspace and Table Creation
• Integration of Kafka with Cassandra
• Talend Basics
• Integration of Kafka with Talend

Hands On:
• Kafka demo with Flume
• Kafka demo with Cassandra
• Kafka demo with Talend

Goal: In this module, you will work on a project that gathers messages from multiple sources.

Scenario:
In the e-commerce industry, catalogs change frequently. The most serious problem these businesses face is “How do we keep inventory and price consistent?”

Prices appear in various places on Amazon, Flipkart or Snapdeal: the Search page, the Product Description page, or ads on Facebook/Google. You will often find mismatches in price and availability between them. From the user's point of view this is very disappointing, because they spend more time finding better products and may end up not purchasing because of the inconsistency.

Here you have to build a system that is consistent in nature. For example, if you are getting product feeds through either a flat file or an event stream, you have to make sure you don’t lose any events related to the product, especially inventory and price.

Price and availability should always be consistent, because the product may have been sold, the seller may no longer want to sell it, or there may be some other reason. However, attributes like name and description cause less trouble if they are not updated on time.

Problem Statement:
You have been given a set of sample products. You have to consume the products and push them to Cassandra/MySQL once they arrive in the consumer. You have to save the fields listed below in Cassandra.
1. PogId
2. Supc
3. Brand
4. Description
5. Size
6. Category
7. Sub Category
8. Country
9. Seller Code

In MySQL, you have to store:
1. PogId
2. Supc
3. Price
4. Quantity
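
One way to approach the field split described above is to project each consumed product record onto the two field lists. The sketch below is plain Python and deliberately stops short of any database call: split_product is a hypothetical helper, the sample values are made up, and rejecting records without PogId or Supc is an assumption of this example rather than a stated requirement.

```python
CASSANDRA_FIELDS = ("PogId", "Supc", "Brand", "Description", "Size",
                    "Category", "Sub Category", "Country", "Seller Code")
MYSQL_FIELDS = ("PogId", "Supc", "Price", "Quantity")


def split_product(product):
    """Split one consumed product record into (cassandra_row, mysql_row)."""
    # Both stores are keyed on PogId and Supc, so refuse records that lack them.
    missing = [f for f in ("PogId", "Supc") if f not in product]
    if missing:
        raise ValueError(f"product is missing key fields: {missing}")
    cassandra_row = {f: product.get(f) for f in CASSANDRA_FIELDS}
    mysql_row = {f: product.get(f) for f in MYSQL_FIELDS}
    return cassandra_row, mysql_row


# Hypothetical product record; all values are made up for illustration.
product = {"PogId": "P1", "Supc": "S1", "Brand": "Acme", "Price": 499, "Quantity": 3}
catalog, pricing = split_product(product)
print(pricing)  # {'PogId': 'P1', 'Supc': 'S1', 'Price': 499, 'Quantity': 3}
```

Keeping the fast-changing fields (price and quantity) in their own row matches the scenario's point that these are the values that must never go stale.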

This Project enables you to gain Hands-On experience on the concepts that you have learned as part of this Course. 

You can email the solution to our Support team within 2 weeks from the Course Completion Date. Edureka will evaluate the solution and award a Certificate with a Performance-based Grading.

Problem Statement:
You are working for a website, techreview.com, that provides reviews of different technologies. The company has decided to add a new feature to the website that will allow users to compare the popularity or trend of multiple technologies based on Twitter feeds. They want this comparison to happen in real time. So, as a big data developer at the company, you have been tasked with implementing the following:

• Near Real Time Streaming of the data from Twitter for displaying last minute's count of people tweeting about a particular technology.

• Store the Twitter count data in Cassandra.
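
The "last minute's count" requirement is a sliding-window count. The sketch below shows one possible shape for it in plain Python, independent of Kafka, Twitter and Cassandra: LastMinuteCounter is a hypothetical class, timestamps are plain seconds, and tweets are assumed to arrive in non-decreasing time order.

```python
from collections import deque


class LastMinuteCounter:
    """Count mentions of one technology over a sliding 60-second window."""

    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self.timestamps = deque()

    def add(self, ts):
        """Record a tweet seen at time ts (seconds); timestamps must not decrease."""
        self.timestamps.append(ts)

    def count(self, now):
        """Number of recorded tweets with now - window < ts <= now."""
        # Drop timestamps that have fallen out of the window.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        return len(self.timestamps)


c = LastMinuteCounter()
for ts in (0, 10, 50, 65):
    c.add(ts)
print(c.count(now=70))  # 2
```

One counter per technology would be kept, and the value returned by count is what would be written to the store at each reporting interval.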

"You will never lose any lecture. You can choose either of the two options:

  • View the recorded session of the class available in your LMS.
  • You can attend the missed session, in any other live batch."

Certhippo is committed to providing you an awesome learning experience through world-class content and best-in-class instructors. We will create an ecosystem through this training that will enable you to convert opportunities into job offers by presenting your skills at the time of an interview. We can assist you with resume building and also share important interview questions once you are done with the training. However, please understand that we are not into job placements.

We limit the number of participants in a live session to maintain quality standards, so unfortunately participation in a live class without enrolment is not possible. However, you can go through the sample class recording, which will give you a clear insight into how the classes are conducted, the quality of the instructors and the level of interaction in the class.

All the instructors at Certhippo are industry practitioners with a minimum of 10-12 years of relevant IT experience. They are subject matter experts and are trained by Certhippo to provide an awesome learning experience.

    • Once you have successfully completed the project (reviewed by a Certhippo expert), you will be awarded Certhippo’s Apache Kafka Professional certificate.
    • Certhippo certification has industry recognition, and we are the preferred training partner for many MNCs, e.g. Cisco, Ford, Mphasis, Nokia, Wipro, Accenture, IBM, Philips, Citi, Mindtree, BNYMellon etc.