Getting Started with Kafka: Your First Event Streaming Project

Introduction

In today’s world, real-time systems are everywhere — ride-sharing, stock trading, and live dashboards. These systems rely on processing streams of data instantly and reliably. That’s where Apache Kafka shines.

In this post, I’ll walk you through how I built a real-time event streaming system using Kafka — simulating a basic live event stream. This project is simple and beginner-friendly, ideal for those who want to take their first step into the Kafka ecosystem by building something fun and practical.

The Use Case: Simulated Real-Time Bidding

Imagine a scenario where:

  • Users place bids in real time.
  • Bids need to be validated.
  • The highest bid is tracked live.
  • Users get instant notifications.
  • All bids are stored for future reference.

Kafka will act as the central event hub for all of this, using a topic called live-bids.

Architecture

[ Bidder App ] → [ Kafka Topic: live-bids ]
                          │
      ┌───────────────────┼───────────────────┬──────────────────┐
      ↓                   ↓                   ↓                  ↓
[ Bid Validator ]  [ Highest Bid ]  [ Notification ]  [ Persist to DB ]

Kafka decouples the producer from multiple consumers, enabling independent scaling and fault tolerance.

Step 1: Kafka Setup with Docker

Let’s keep setup simple using Docker. Create a docker-compose.yml:

version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

Start Kafka:

docker-compose up -d

Once running, Kafka will be ready to accept messages (bids).
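
The Confluent image auto-creates topics on first use by default, but you can also create live-bids explicitly. A sketch, assuming the kafka service name from the compose file above and the kafka-topics CLI bundled with recent Kafka images:

```shell
docker-compose exec kafka kafka-topics \
  --bootstrap-server localhost:9092 \
  --create --topic live-bids \
  --partitions 1 --replication-factor 1
```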

Step 2: Simulating a Bidder (Producer)

Let’s simulate a bidder that sends random bids.

producer_bidder.py

from kafka import KafkaProducer
import json, random, time

producer = KafkaProducer(bootstrap_servers='localhost:9092',
    value_serializer=lambda v: json.dumps(v).encode('utf-8'))

auction_ids = ['A1', 'A2']
users = [101, 102, 103]

while True:
    bid = {
        "auction_id": random.choice(auction_ids),
        "user_id": random.choice(users),
        "amount": random.randint(100, 1000),
        "timestamp": time.time()
    }
    producer.send('live-bids', bid)
    print("Sent:", bid)
    time.sleep(1)
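
The value_serializer above is just a lambda that JSON-encodes each bid dict to UTF-8 bytes before it goes on the wire. A quick standalone check of that encode/decode round trip (no broker needed):

```python
import json

# Same serializer/deserializer pair used by the producer and consumers
serialize = lambda v: json.dumps(v).encode('utf-8')
deserialize = lambda x: json.loads(x.decode('utf-8'))

bid = {"auction_id": "A1", "user_id": 101, "amount": 250, "timestamp": 0.0}
raw = serialize(bid)

print(type(raw).__name__)       # bytes on the wire
print(deserialize(raw) == bid)  # round trip preserves the bid
```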

Step 3: Bid Validator Consumer

consumer_validator.py

from kafka import KafkaConsumer, KafkaProducer
import json

consumer = KafkaConsumer('live-bids',
    bootstrap_servers='localhost:9092',
    group_id='validator',
    auto_offset_reset='earliest',
    value_deserializer=lambda x: json.loads(x.decode('utf-8')))

producer = KafkaProducer(bootstrap_servers='localhost:9092',
    value_serializer=lambda v: json.dumps(v).encode('utf-8'))

for msg in consumer:
    bid = msg.value
    if bid['amount'] > 0:
        print("Valid:", bid)
        producer.send('valid-bids', bid)
    else:
        print("Invalid:", bid)
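
The validation rule itself is a plain predicate, which makes it easy to unit-test without Kafka. A sketch (is_valid is a name I'm introducing for illustration):

```python
def is_valid(bid):
    # Mirrors the consumer's check: only positive amounts pass
    return bid.get("amount", 0) > 0

bids = [
    {"auction_id": "A1", "user_id": 101, "amount": 250},
    {"auction_id": "A2", "user_id": 102, "amount": -5},
    {"auction_id": "A1", "user_id": 103, "amount": 0},
]

valid = [b for b in bids if is_valid(b)]
print(len(valid))  # 1
```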

Step 4: Track Highest Bid

consumer_highest_bid.py

from kafka import KafkaConsumer
import json

consumer = KafkaConsumer('valid-bids',
    bootstrap_servers='localhost:9092',
    group_id='highest-bid',
    auto_offset_reset='earliest',
    value_deserializer=lambda x: json.loads(x.decode('utf-8')))

highest_bids = {}

for msg in consumer:
    bid = msg.value
    auction = bid['auction_id']
    current = highest_bids.get(auction, {"amount": 0})

    if bid['amount'] > current['amount']:
        highest_bids[auction] = bid
        print(f"New high for {auction}: {bid['amount']} by User {bid['user_id']}")
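
The tracking logic is a simple per-auction running maximum. Replayed on a fixed list of bids, so you can check it without a broker:

```python
def track_highest(bids):
    # Keep the highest bid seen so far, keyed by auction_id
    highest = {}
    for bid in bids:
        current = highest.get(bid["auction_id"], {"amount": 0})
        if bid["amount"] > current["amount"]:
            highest[bid["auction_id"]] = bid
    return highest

sample = [
    {"auction_id": "A1", "user_id": 101, "amount": 300},
    {"auction_id": "A1", "user_id": 102, "amount": 250},  # lower, ignored
    {"auction_id": "A2", "user_id": 103, "amount": 900},
    {"auction_id": "A1", "user_id": 103, "amount": 450},  # new high for A1
]

result = track_highest(sample)
print(result["A1"]["amount"], result["A2"]["amount"])  # 450 900
```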

Step 5: Real-Time Notifications

consumer_notifier.py

from kafka import KafkaConsumer
import json

consumer = KafkaConsumer('valid-bids',
    bootstrap_servers='localhost:9092',
    group_id='notifier',
    auto_offset_reset='earliest',
    value_deserializer=lambda x: json.loads(x.decode('utf-8')))

for msg in consumer:
    bid = msg.value
    print(f"Notification: User {bid['user_id']} bid {bid['amount']} on {bid['auction_id']}")

Step 6: Persist Bids

consumer_persist.py

from kafka import KafkaConsumer
import json

consumer = KafkaConsumer('valid-bids',
    bootstrap_servers='localhost:9092',
    group_id='persistence',
    auto_offset_reset='earliest',
    value_deserializer=lambda x: json.loads(x.decode('utf-8')))

with open("bids_log.txt", "a") as f:
    for msg in consumer:
        f.write(json.dumps(msg.value) + "\n")
        f.flush()  # flush each bid so the log survives an abrupt stop
        print("Saved:", msg.value)
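
The consumer writes one JSON object per line (the JSON Lines format), which is trivial to read back. A standalone sketch of that round trip using a temp file:

```python
import json, os, tempfile

bids = [
    {"auction_id": "A1", "user_id": 101, "amount": 300},
    {"auction_id": "A2", "user_id": 102, "amount": 500},
]

path = os.path.join(tempfile.mkdtemp(), "bids_log.txt")

# Append one JSON document per line, as consumer_persist.py does
with open(path, "a") as f:
    for bid in bids:
        f.write(json.dumps(bid) + "\n")

# Reading back is a line-by-line json.loads
with open(path) as f:
    restored = [json.loads(line) for line in f]

print(restored == bids)  # True
```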

Commands to Run the System

  1. Start Kafka:
docker-compose up -d
  2. Run the producer:
python producer_bidder.py
  3. Run the consumers in separate terminals:
python consumer_validator.py
python consumer_highest_bid.py
python consumer_notifier.py
python consumer_persist.py

What You’ll Learn

  • Kafka’s publish-subscribe model makes it easy to build decoupled systems.
  • You can plug in new consumers (like analytics or fraud detection) without touching the producer.
  • Kafka topics serve as a single source of truth for streaming data.

Possible Enhancements

  • Use Kafka Streams or Apache Flink for stateful bid tracking.
  • Store bids in a real database like PostgreSQL or MongoDB.
  • Integrate WebSockets to stream live bids to users.
  • Host Kafka on Confluent Cloud or AWS MSK for production-grade reliability.

Conclusion

This project is a great starting point for anyone curious about Kafka. It shows how Kafka can serve as the real-time backbone of an event-driven application.

This is not a production-grade auction system, but rather a hands-on example to demonstrate how Kafka works. You can use this same architecture for many real-world problems: payment processing, live leaderboards, and notification systems.

If you’re new to Kafka, give this project a shot. It’s simple, practical, and gets you up and running with real-time data pipelines.
