Load Testing Real-Time WebSocket Systems with k6

Problem Statement

How do you load test a real-time system built on WebSockets (Socket.IO) instead of traditional HTTP APIs?

Standard load-testing approaches often fail because:

  • Interactions happen over persistent WebSocket connections
  • Users must authenticate, join channels/rooms, and receive real-time events
  • Rate limits / cooldown rules restrict how frequently actions can occur
  • Correct behavior depends on multiple users interacting with the same resources
  • Success is defined by coordination and timing, not just request throughput

We needed to simulate dozens of concurrent users performing thousands of real-time actions in a controlled and realistic way.


What We Tried

Attempt 1: All Users Act at the Same Time

Approach:
Every virtual user (VU) triggered the same action on the same resource simultaneously.

Problem:
Most real-time systems enforce cooldowns or rate limits to prevent spam. When many users act on the same resource within a short window, most actions are rejected.

Result:
High rejection rate and misleading test results.
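
To make the failure mode concrete, here is a toy model of a per-resource cooldown. The rule itself (an action is accepted only when the previous accepted action on that resource is at least `cooldownMs` old) is an assumption for illustration, and `simulate` is a hypothetical helper; the real server logic may differ.

```javascript
// Toy model of a per-resource cooldown (an assumed rule, not the real server logic):
// an action is accepted only if the resource's last accepted action is at least
// `cooldownMs` old.
function simulate(actions, cooldownMs) {
  const lastAccepted = new Map(); // resourceId -> timestamp of last accepted action
  let accepted = 0;
  let rejected = 0;
  for (const { resourceId, timeMs } of actions) {
    const last = lastAccepted.get(resourceId);
    if (last === undefined || timeMs - last >= cooldownMs) {
      lastAccepted.set(resourceId, timeMs);
      accepted += 1;
    } else {
      rejected += 1;
    }
  }
  return { accepted, rejected };
}

// Ten virtual users hit the same resource at the same instant.
const simultaneous = Array.from({ length: 10 }, () => ({ resourceId: 'r0', timeMs: 0 }));
console.log(simulate(simultaneous, 1000)); // { accepted: 1, rejected: 9 }
```

Under this assumed rule only the first action gets through, which matches the high rejection rate we saw.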


Attempt 2: Assign Each User a Dedicated Resource

Approach:
Each user interacted with only one unique resource.

Problem:
No real contention. Many systems optimize repeated actions from the same user by updating internal state rather than creating new events.

Result:
Far fewer events than expected; the test didn’t reflect real-world behavior.


Attempt 3: Use sleep() for Delays

Approach:
Insert sleep() calls between actions to respect cooldowns.

Problem:
In k6, sleep() suspends the VU and blocks the JavaScript event loop. When called inside WebSocket callbacks, it prevents incoming messages from being processed until it returns, which leads to dropped connections.

Result:
Connection closures and zero meaningful activity.
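
A non-blocking alternative is to let the socket's own timers pace the actions. The sketch below assumes `socket` exposes the k6/ws Socket methods `setInterval`, `setTimeout`, `send` and `close`; `scheduleActions`, `buildMessage` and `graceMs` are hypothetical names introduced for illustration, not part of k6.

```javascript
// Hypothetical helper. `socket` is expected to expose the k6/ws Socket methods
// setInterval, setTimeout, send and close.
function scheduleActions(socket, { totalActions, cooldownMs, graceMs, buildMessage }) {
  let sent = 0;

  // Fire one action per cooldown window without blocking the event loop.
  socket.setInterval(() => {
    if (sent >= totalActions) return; // all actions sent, wait for close
    socket.send(buildMessage(sent));
    sent += 1;
  }, cooldownMs);

  // Close once every action has had its slot, plus a grace period for replies.
  socket.setTimeout(() => socket.close(), totalActions * cooldownMs + graceMs);
}

// Inside a k6 script this would be wired up roughly as:
//   import ws from 'k6/ws';
//   export default function () {
//     ws.connect(url, {}, (socket) => {
//       socket.on('open', () => scheduleActions(socket, { /* options */ }));
//     });
//   }
```

Because the callbacks return immediately, incoming messages keep being handled between actions.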


What Worked

Distributed Round-Robin Interaction Pattern

Key insight:
At any moment, each user should interact with a different resource, and over time, all users rotate across all resources.

This achieves:

  • No rate-limit collisions (different resources at the same time)
  • Real competition (same resource receives actions from different users over time)
  • Predictable timing and clean metrics

Algorithm

// At step N, user i interacts with resource (i + N) % totalResources
const resourceIndex = (userIndex + stepNumber) % totalResources;

Example: 3 Users × 3 Resources

Step   User 0       User 1       User 2
0      Resource 0   Resource 1   Resource 2
1      Resource 1   Resource 2   Resource 0
2      Resource 2   Resource 0   Resource 1

Each resource receives actions from different users, spaced correctly in time.
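
The rotation can be checked offline before running a test. This is a minimal sketch; `resourceFor`, `buildSchedule` and `isCollisionFree` are hypothetical helper names built around the formula above.

```javascript
// Which resource user `userIndex` targets at `stepNumber`.
function resourceFor(userIndex, stepNumber, totalResources) {
  return (userIndex + stepNumber) % totalResources;
}

// Build the full schedule: schedule[step][user] = resource index.
function buildSchedule(totalUsers, totalResources, totalSteps) {
  const schedule = [];
  for (let step = 0; step < totalSteps; step++) {
    const row = [];
    for (let user = 0; user < totalUsers; user++) {
      row.push(resourceFor(user, step, totalResources));
    }
    schedule.push(row);
  }
  return schedule;
}

// True when no two users target the same resource within any single step.
function isCollisionFree(schedule) {
  return schedule.every((row) => new Set(row).size === row.length);
}

const schedule = buildSchedule(3, 3, 3);
console.log(schedule);                  // [ [ 0, 1, 2 ], [ 1, 2, 0 ], [ 2, 0, 1 ] ]
console.log(isCollisionFree(schedule)); // true
```

Note that a step can only be collision-free when there are at least as many resources as users; with more users than resources, some users necessarily share a resource in the same step.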


Common Pitfalls & Solutions

Pitfall                      Solution
High rejection rates         Distribute actions across resources
No meaningful contention     Rotate users across shared resources
WebSocket disconnects        Avoid sleep() in callbacks
Server overload at startup   Stagger connections
Early socket closure         Compute keep-alive duration correctly
Missing local reports        Use hybrid execution
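
For the staggering and keep-alive rows, the arithmetic can be sketched as follows. The formulas and every name here (`connectDelayMs`, `keepAliveMs`, `minScenarioDurationMs`, `staggerMs`, `bufferMs`) are assumptions for illustration, not the exact values or code we used.

```javascript
// Hypothetical helpers; the formulas are assumptions for illustration.

// Delay before user `userIndex` opens its socket, so connections arrive one by one.
function connectDelayMs(userIndex, staggerMs) {
  return userIndex * staggerMs;
}

// How long a socket must stay open to finish every step, plus a safety buffer.
function keepAliveMs(totalSteps, stepIntervalMs, bufferMs) {
  return totalSteps * stepIntervalMs + bufferMs;
}

// Lower bound for the scenario duration: the last user to connect must still
// have time to complete its full keep-alive window.
function minScenarioDurationMs(totalUsers, staggerMs, totalSteps, stepIntervalMs, bufferMs) {
  return (
    connectDelayMs(totalUsers - 1, staggerMs) +
    keepAliveMs(totalSteps, stepIntervalMs, bufferMs)
  );
}

// 30 users, 200 ms apart, 100 steps of 1 s each, 5 s buffer.
console.log(minScenarioDurationMs(30, 200, 100, 1000, 5000)); // 110800
```

In a k6 script, `__VU` is 1-based, so `__VU - 1` could serve as `userIndex`. If the socket is closed earlier than the keep-alive window, the remaining steps never run.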

Pipeline Integration Tips

  • Make load testing optional via pipeline flags
  • Run it in parallel with other test stages
  • Publish only the generated reports as pipeline artifacts
  • Store credentials in secure variables

Results

Using this approach, we were able to simulate:

  • Dozens of concurrent WebSocket users
  • Thousands of real-time interactions
  • Sustained load over extended durations
  • Consistently high success rates

Key Learnings

  • WebSocket testing requires event-driven thinking
  • Domain constraints shape your test strategy
  • Staggering avoids the thundering herd problem
  • Round-robin patterns create realistic contention
  • k6 supports powerful real-time testing when used correctly

Reusable For

This pattern applies to any real-time, multi-user system, such as:

  • Live collaboration tools
  • Multiplayer or matchmaking systems
  • Real-time dashboards
  • Messaging platforms with rate limits
  • Event-driven backends

Tags:
#LoadTesting #k6 #WebSocket #PerformanceTesting #RealTime #SocketIO
