Apache Kafka Training | BigData training in Chennai

Apache Kafka
A Distributed Streaming Platform

About Kafka

Apache Kafka is an open-source stream-processing platform originally developed by LinkedIn and donated to the Apache Software Foundation. It is written in Scala and Java.

Kafka® is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.

  • PUBLISH & SUBSCRIBE
  • PROCESS
  • STORE

Kafka Topics

Kafka is commonly used in the following areas.

Messaging

Kafka works well as a replacement for a more traditional message broker. Message brokers are used for a variety of reasons (to decouple processing from data producers, to buffer unprocessed messages, etc.).
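
To make the decoupling and buffering concrete, here is a minimal in-memory sketch in Python. It does not use any Kafka client: `Topic`, `publish`, and `poll` are hypothetical names invented for this illustration. The producer appends without knowing who will read, and each consumer group tracks its own offset, so unprocessed messages simply wait in the log.

```python
from collections import defaultdict


class Topic:
    """Toy append-only log standing in for a broker topic (not a Kafka API)."""

    def __init__(self):
        self._log = []                      # messages in arrival order
        self._offsets = defaultdict(int)    # next offset to read, per group

    def publish(self, message):
        """Producer side: append and return the offset assigned to the message."""
        self._log.append(message)
        return len(self._log) - 1

    def poll(self, group, max_records=10):
        """Consumer side: return unread messages for this group and advance its offset."""
        start = self._offsets[group]
        batch = self._log[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


topic = Topic()
for i in range(3):
    topic.publish(f"order-{i}")             # producer runs before any consumer exists

billing = topic.poll("billing")             # reads the buffered backlog
audit_first = topic.poll("audit", max_records=2)
audit_rest = topic.poll("audit")
print(billing, audit_first, audit_rest)
```

Because each group keeps its own position, a slow or late consumer does not hold back the producer or the other groups.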

Website Activity Tracking

The original use case for Kafka was to be able to rebuild a user activity tracking pipeline as a set of real-time publish-subscribe feeds.

Metrics

Kafka is often used for operational monitoring data. This involves aggregating statistics from distributed applications to produce centralized feeds of operational data.
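
As a rough illustration of that aggregation step, the sketch below folds metric records from several hosts into one summary. It is plain Python with no Kafka dependency; the record layout `(host, metric, value)`, the host names, and the numbers are all made up for the example.

```python
from collections import defaultdict

# Hypothetical metric records as they might arrive on a shared feed:
# (source host, metric name, value). The names and numbers are made up.
records = [
    ("web-1", "requests", 120),
    ("web-2", "requests", 80),
    ("web-1", "errors", 3),
    ("web-2", "errors", 1),
    ("web-1", "requests", 100),
]


def aggregate(feed):
    """Fold a feed of metric records into per-metric count, sum, and mean."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0})
    for _host, metric, value in feed:
        totals[metric]["count"] += 1
        totals[metric]["sum"] += value
    return {
        metric: {**stats, "mean": stats["sum"] / stats["count"]}
        for metric, stats in totals.items()
    }


summary = aggregate(records)
print(summary)
```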

Log Aggregation

Many people use Kafka as a replacement for a log aggregation solution. Log aggregation typically collects physical log files off servers and puts them in a central place (a file server or HDFS perhaps) for processing.

Stream Processing

Many users of Kafka process data in processing pipelines consisting of multiple stages, where raw input data is consumed from Kafka topics and then aggregated, enriched, or otherwise transformed into new topics for further consumption or follow-up processing.
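
A two-stage pipeline of this kind can be sketched as follows. Topics are modelled as named Python lists rather than real broker topics, and the topic names, record format, and stage functions are assumptions chosen for the example.

```python
from collections import Counter

# Topics modelled as named lists; a real deployment would use broker topics.
topics = {
    "raw-clicks": ["home,alice", "cart,bob", "home,carol", "bad-record"],
    "parsed-clicks": [],
    "page-counts": [],
}


def parse_stage(source, sink):
    """Stage 1: parse raw lines, drop malformed ones, write structured records."""
    for line in topics[source]:
        parts = line.split(",")
        if len(parts) == 2:
            page, user = parts
            topics[sink].append({"page": page, "user": user})


def count_stage(source, sink):
    """Stage 2: aggregate parsed records into per-page counts."""
    counts = Counter(record["page"] for record in topics[source])
    for page, n in sorted(counts.items()):
        topics[sink].append((page, n))


parse_stage("raw-clicks", "parsed-clicks")
count_stage("parsed-clicks", "page-counts")
print(topics["page-counts"])
```

Each stage reads only from its input topic and writes only to its output topic, so stages can be developed and scaled independently.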

Event Sourcing

Event sourcing is a style of application design where state changes are logged as a time-ordered sequence of records. Kafka's support for very large stored log data makes it an excellent backend for an application built in this style.
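
The idea can be shown with a small sketch: current state is never stored directly but is rebuilt by replaying the event log. The account events and the `apply` and `replay` helpers below are hypothetical and exist only for illustration.

```python
# Hypothetical account events, oldest first. The event names are illustrative.
events = [
    {"type": "deposited", "amount": 100},
    {"type": "withdrew", "amount": 30},
    {"type": "deposited", "amount": 5},
]


def apply(balance, event):
    """Pure transition function: current state + one event -> next state."""
    if event["type"] == "deposited":
        return balance + event["amount"]
    if event["type"] == "withdrew":
        return balance - event["amount"]
    raise ValueError(f"unknown event type: {event['type']}")


def replay(log, upto=None):
    """Rebuild state by folding over the log, optionally stopping early."""
    balance = 0
    for event in log[:upto]:
        balance = apply(balance, event)
    return balance


print(replay(events))          # state after all events
print(replay(events, upto=2))  # state as of the second event
```

Replaying a prefix of the log reconstructs the state at any earlier point, which is why a durable, ordered log suits this design.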

Course Contents

The following are the course contents offered for Kafka:

              • Understanding the principles of messaging systems
              • Understanding messaging systems
              • Peeking into a point-to-point messaging system
              • Publish-subscribe messaging system
              • Advanced Message Queuing Protocol (AMQP)
              • Using messaging systems in big data streaming applications
              • Kafka origins
              • Kafka's architecture
              • Message topics
              • Message partitions
              • Replication and replicated logs
              • Message producers
              • Message consumers
              • Role of Zookeeper
              • Kafka producer internals
              • Kafka Producer APIs
              • Producer object and ProducerRecord object
              • Custom partition
              • Additional producer configuration
              • Introduction
              • Use Cases
              • Architecture
              • Components of Kafka: Broker, Producer, Consumer, Topic, Partition
              • Ecosystem
              • Kafka vs Flume
              • First Things First
              • Installing a Kafka Broker
              • Broker Configuration
              • General Broker
              • Topic Defaults
              • num.partitions
              • log.retention.ms
              • log.retention.bytes
              • log.segment.bytes
              • log.segment.ms
              • message.max.bytes
              • Hardware Selection
              • Kafka in the Cloud
              • Kafka Clusters
              • How Many Brokers
              • Broker Configuration
              • Operating System Tuning
              • Virtual Memory
              • Disk
              • Networking
              • Production Concerns
              • Garbage Collector Options
              • Datacenter Layout
              • Colocating Applications on Zookeeper
              • Getting Started With Clients
              • Zookeeper
              • Single-node Kafka
              • Hands-On - Setting Up
              • Multi-node Kafka
              • Hands-On - Multi-Node Setup
              • Console Producer & Console Consumer
              • Hands-On - Producer & Consumer
              • High Availability & Performance
              • Producer overview
              • Constructing a Kafka Producer
              • Sending a Message to Kafka
              • Serializers
              • Custom Serializers
              • Serializing using Apache Avro
              • Using Avro records with Kafka
              • Partitions
              • Configuring Producers
              • acks
              • buffer.memory
              • compression.type
              • retries
              • batch.size
              • linger.ms
              • client.id
              • max.in.flight.requests.per.connection
              • timeout.ms and metadata.fetch.timeout.ms
              • Old Producer APIs
              • Performance tuning
              • Serialization
              • Message Delivery Semantics
              • Replication
              • Log Compaction
              • Quotas
              • Hands-On
              • KafkaConsumer Concepts
              • Consumers and Consumer Groups
              • Consumer Groups - Partition Rebalance
              • Creating a Kafka Consumer
              • Subscribing to Topics
              • The Poll Loop
              • Commits and Offsets
              • Automatic Commit
              • Commit Current Offset
              • Asynchronous Commit
              • Combining Synchronous and Asynchronous commits
              • Commit Specified Offset
              • Rebalance Listeners
              • Seek and Exactly Once Processing
              • But How Do We Exit?
              • Deserializers
              • Configuring Consumers
              • fetch.min.bytes
              • fetch.max.wait.ms
              • max.partition.fetch.bytes
              • session.timeout.ms
              • auto.offset.reset
              • enable.auto.commit
              • partition.assignment.strategy
              • client.id
              • Stand Alone Consumer - Why and How to Use a Consumer without a Group
              • Older consumer APIs
              • Cluster Membership
              • Replication
              • Request Processing
              • Produce Requests
              • Fetch Requests
              • Other Requests
              • Physical Storage
              • Partition Allocation
              • File Management
              • File Format
              • Indexes
              • Compaction
              • How Compaction Works
              • Deleted Events
              • When Are Topics Compacted
              • Broker Configs
              • Hands-On
              • Producer Configs
              • Consumer Configs
              • Consumer groups
              • Hands-On
              • API Design
              • Producer and Consumer APIs (Java)
              • Hands-On Producer & Consumer API
              • Message format
              • Log
              • Hands-On
              • Managing Topics
              • Decommissioning nodes
              • Data mirroring
              • Data centers and Racks
              • Monitoring
              • Security
              • Authorization and ACL
              • REST API
              • Hands-On
              • Overview
              • Confluent Platform vs Apache Kafka
              • Kafka Streams
              • Kafka Connectors
              • Confluent Platform Hands-On Use Cases
              • Millions of Messages per Second
              • How to Handle Them with Kafka?
              • IoT Hands-On Use Case
              • Kafka with Spark
              • Hands-On
              • Kafka with Flume (for Hadoop/HBase/Hive)
              • Hands-On
              • IoT Realtime Streaming Data via Kafka
              • Using Kafka in Big Data Applications
              • Managing high volumes in Kafka
              • Appropriate hardware choices
              • Producer write and consumer read choices
              • Kafka message delivery semantics
              • At least once delivery
              • At most once delivery
              • Exactly once delivery
              • Big data and Kafka common usage patterns
              • Kafka and data governance
              • Alerting and monitoring
              • Useful Kafka metrics
              • Producer metrics
              • Broker metrics
              • Consumer metrics
              • An overview of securing Kafka
              • Wire encryption using SSL
              • Steps to enable SSL in Kafka
              • Configuring SSL for Kafka Broker
              • Configuring SSL for Kafka clients
              • Kerberos SASL for authentication
              • Steps to enable SASL/GSSAPI in Kafka
              • Configuring SASL for Kafka broker
              • Configuring SASL for Kafka client - producer and consumer
              • Understanding ACL and authorization
              • Common ACL operations
              • List ACLs
              • Understanding Zookeeper authentication
              • Apache Ranger for authorization
              • Adding Kafka Service to Ranger
              • Adding policies
              • Best practices
              • Latency and throughput
              • Data and state persistence
              • Data sources
              • External data lookups
              • Data formats
              • Data serialization
              • Level of parallelism
              • Out-of-order events
              • Message processing semantics
              • Integrating Kafka with Streaming Applications
              • Introduction to Kafka Streams
              • Using Kafka in Stream processing
              • Kafka Streams: a lightweight stream processing library
              • Kafka Streams architecture
              • Integrated framework advantages
              • Understanding tables and Streams together
              • Maven dependency
              • Kafka Streams word count
              • KTable
              • Use case example of Kafka Streams
              • Securing Kafka
              • The Confluent Platform
              • Introduction
              • Installing the Confluent Platform
              • Using Kafka operations
              • Using the Schema Registry
              • Using the Kafka REST Proxy
              • Using Kafka Connect
              • Using Kafka with Confluent Platform
              • Introduction to Confluent Platform
              • Deep diving into Confluent architecture
              • Understanding Kafka Connect and Kafka Streams
              • Kafka Streams
              • Moving Kafka data to HDFS
              • Gobblin architecture
              • Kafka Connect
              • Flume
              • Apache Kafka Connect API
              • Kafka JDBC Connector
              • Kafka Elasticsearch Connector
              • Spark Streaming with Kafka IoT Use Case Demo
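
The delivery semantics listed in the outline are often explained by the order in which a consumer processes a message and commits its offset. The sketch below is a plain-Python simulation under that assumption, not Kafka client code: `consume`, `commit_first`, and `crash_at` are hypothetical names, and the consumer is made to fail once between the two steps. Exactly-once delivery is not modelled here.

```python
def consume(log, commit_first, crash_at):
    """Simulate a consumer that crashes once, between its two steps, at `crash_at`.

    Returns the list of messages that were actually processed.
    """
    committed = 0            # offset that survives a restart
    processed = []
    crashed = False
    while committed < len(log):
        offset = committed
        # First step: either commit the offset or process the message.
        if commit_first:
            committed = offset + 1
        else:
            processed.append(log[offset])
        # Simulated failure between the two steps; restart from `committed`.
        if offset == crash_at and not crashed:
            crashed = True
            continue
        # Second step: whichever action was not done first.
        if commit_first:
            processed.append(log[offset])
        else:
            committed = offset + 1
    return processed


log = ["a", "b", "c"]
at_most_once = consume(log, commit_first=True, crash_at=1)
at_least_once = consume(log, commit_first=False, crash_at=1)
print(at_most_once)    # message "b" is lost
print(at_least_once)   # message "b" is processed twice
```

Committing first risks losing a message on failure (at most once), while processing first risks handling it twice (at least once).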
