Have you ever noticed that some APIs don't receive traffic evenly?
Instead of steady, predictable calls, certain endpoints get sudden bursts - a few calls at once, followed by quiet periods, and then another burst.
When multiple users trigger these bursts at the same time, the API can become overloaded. Response times increase, CPU usage spikes, and users start experiencing delays. This is where a more advanced rate limiting technique comes into play - the Sliding Window Rate Limiter.
This blog focuses on how the Sliding Window limiter works in ASP.NET Core and how it controls unpredictable traffic spikes.
Rate Limiting: What Is It?
Rate limiting is a technique that controls how many requests a client can send to an API in a given time period so that the system stays stable, secure, and fair for all users. It becomes essential when:
Traffic suddenly increases
Scripts or scheduled jobs repeatedly make calls
Brute-force attacks or accidental request floods occur
With rate limiting in place, the API stays fair and consistent for all clients.
For a general introduction to ASP.NET Core rate limiting, refer to the first blog in this series:
What Is a Sliding Window Rate Limiter?
The Sliding Window Rate Limiter is designed to handle bursty, uneven traffic patterns.
Unlike the Fixed Window limiter (which resets after every full window), the Sliding Window takes a more dynamic approach. It divides the total time window into smaller segments and tracks the number of requests within each segment. As time moves forward, the window slides, and the limiter continuously recalculates the request count based on the recent activity.
This results in a much smoother, fairer distribution of requests over time.

Think of the Sliding Window as a rolling time period.
If the limit is 5 requests per minute, it always counts requests in the last 60 seconds, not only between minute boundaries.
The window is divided into smaller segments (e.g., 3 segments of 20 seconds each).
When a segment expires, its count is dropped from the rolling total.
This prevents a user from making 5 requests at the end of one window and another 5 at the beginning of the next - a common issue in the Fixed Window algorithm.
The Sliding Window limiter gives a more realistic limit based on actual traffic patterns.
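The segment bookkeeping described above can be sketched in a few lines. This is an illustrative simplification, not the framework's internal implementation: a 60-second window split into 3 segments of 20 seconds, where the oldest segment's count falls away as the window slides.

```csharp
using System;
using System.Linq;

// Simplified sketch of sliding-window counting (single client, no locking).
class SlidingWindowSketch
{
    const int PermitLimit = 5;
    const int Segments = 3;                       // 3 x 20s = 60s window
    readonly int[] counts = new int[Segments];
    int currentSegment = 0;

    // Called every 20 seconds: the oldest segment expires,
    // so its count no longer contributes to the rolling total.
    public void AdvanceSegment()
    {
        currentSegment = (currentSegment + 1) % Segments;
        counts[currentSegment] = 0;               // recycle the expired slot
    }

    // Returns true if the request is allowed under the rolling limit.
    public bool TryAcquire()
    {
        if (counts.Sum() >= PermitLimit) return false;
        counts[currentSegment]++;
        return true;
    }
}
```

Because earlier segments keep contributing until they expire, 5 requests made just before a minute boundary still count against requests made just after it, which is exactly the boundary-burst gap a Fixed Window limiter leaves open.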
Example Scenario
Consider a real-world production scenario where an API is responsible for generating an Inventory Report.
This Inventory report is heavy, involves multiple database calls, and takes time to process.
Under normal conditions, the API received 1–2 requests per minute, which the system handled easily. During busy periods, or when partner systems triggered reports at slightly different times, requests arrived in short bursts: for example, 5 requests within 30 seconds, followed by a brief pause, and then another burst of 5.
Even though the total number of requests within 1 minute was not very high, the uneven timing of these requests created sudden pressure on the system.
This bursty traffic overloaded the database, slowed down report generation, and affected other APIs running on the same resources.
What happened without a rate limiter?
CPU usage spiked
Database queries slowed down
Report generation time increased
Other APIs sharing the same database became slower
Occasional timeouts occurred during peak usage
The issue wasn’t the overall volume - it was the uneven timing of the requests.
The system needed a way to smooth out these bursts and prevent too many report requests from arriving too close to each other.
What happened after adding a Sliding Window limiter?
A Sliding Window limiter solved this problem effectively.
Instead of looking at requests only inside a strict 1-minute window, the Sliding Window algorithm:
Divided the 1-minute window into 3 smaller segments of 20 seconds each
Counted requests across the actual last 1 minute
Smoothed out inconsistent spikes
Prevented bursts from piling up too quickly
Ensured fair request distribution over time
By smoothing out the bursty traffic, the Sliding Window limiter kept the Inventory Report API reliable and prevented sudden overload.
Below is a sample implementation:

// No additional package installation is required. Rate limiting is built into ASP.NET Core starting from .NET 7.
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();
builder.Services.AddEndpointsApiExplorer();

builder.Services.AddRateLimiter(options =>
{
    options.AddSlidingWindowLimiter("InventoryReportLimit", limiterOptions =>
    {
        limiterOptions.PermitLimit = 5;                   // allow 5 total requests
        limiterOptions.Window = TimeSpan.FromSeconds(60); // sliding 60-second window
        limiterOptions.SegmentsPerWindow = 3;             // split into 20-second sub-windows
        limiterOptions.QueueLimit = 0;                    // no queuing
    });

    options.OnRejected = async (context, cancellationToken) =>
    {
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        await context.HttpContext.Response
            .WriteAsync("Too many report requests. Try again later.", cancellationToken);
    };
});

var app = builder.Build();

app.UseRouting();
app.UseRateLimiter();

// The policy must be attached to an endpoint or it has no effect;
// here it is applied to all mapped controllers.
app.MapControllers().RequireRateLimiting("InventoryReportLimit");

app.Run();
Setting | Meaning (Sliding Window limiter)
PermitLimit = 5 | Maximum number of allowed requests in the rolling window for one client.
Window = 60 seconds | Total length of the sliding time window used to calculate the request rate.
SegmentsPerWindow = 3 | Number of sub-windows the 60-second window is divided into (20 seconds each here).
QueueLimit = 0 | Maximum number of extra requests that can wait in a queue when the limit is reached (0 = none).
This configuration allows 5 requests in any rolling 60-second period, not just per fixed minute block.
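Registering a named policy does not limit anything by itself; it must be attached to an endpoint, either with `RequireRateLimiting` on a route or with the `[EnableRateLimiting]` attribute on a controller. A minimal controller sketch follows; the controller name and route are assumptions for illustration:

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.RateLimiting;

[ApiController]
[Route("api/inventory-report")]
[EnableRateLimiting("InventoryReportLimit")] // attach the policy defined in Program.cs
public class InventoryReportController : ControllerBase
{
    [HttpGet]
    public IActionResult GetReport()
    {
        // Heavy report generation would happen here.
        return Ok("Inventory report generated");
    }
}
```

The attribute approach is convenient when only some controllers need the limit, while `RequireRateLimiting` on `MapControllers()` applies it everywhere.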
Advantages of Sliding Window rate limiter
Smooths out uneven, bursty traffic
Prevents double-burst issues at window boundaries
More accurate and fair than Fixed Window
Ideal for real-time user actions (typing, scrolling, clicking)
Reduces sudden CPU spikes
Keeps APIs responsive during peak usage
Works well for external integrations and mobile apps
When should you use a Sliding Window rate limiter?
Use this limiter when:
Traffic is bursty and unpredictable
Users perform rapid actions (typing, searching, tapping)
Partner systems send requests in uneven batches
You want a fair limit based on the actual last time period
Fixed Window is too rigid for your workload
When should you avoid using a Sliding Window rate limiter?
Avoid this limiter when:
Traffic is consistent and predictable → Fixed Window rate limiter option works better in this case
You want to allow intentional short bursts → Token Bucket rate limiter works better in this case
You want to limit concurrent requests instead of rate → Concurrency rate limiter works better in this case
Benchmark: Sliding Window Rate Limiter (Local Runtime Test)
To understand the runtime impact of the Sliding Window Rate Limiter, a local benchmark was performed using Visual Studio Diagnostic Tools.
Test Environment
Environment: Local development machine
Framework: ASP.NET Core
Monitoring: Visual Studio Diagnostic Tools (Debug mode)
Endpoint: Inventory Report API
Test Scenarios
The same load pattern was applied in both scenarios to ensure a fair comparison.
Concurrent users: 50
Traffic pattern: Continuous requests for 2 minutes
Rate limit configuration: Sliding Window – 5 requests per 1 minute
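The load pattern above can be reproduced with a simple client-side generator. This is a hypothetical sketch (the base URL and endpoint path are assumptions), not the exact tool used for the benchmark:

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// 50 concurrent clients sending requests continuously for 2 minutes,
// counting successes vs. 429 rejections.
class LoadTest
{
    static async Task Main()
    {
        var client = new HttpClient { BaseAddress = new Uri("https://localhost:5001") };
        var end = DateTime.UtcNow.AddMinutes(2);
        int ok = 0, rejected = 0;

        var workers = new Task[50];
        for (int i = 0; i < workers.Length; i++)
        {
            workers[i] = Task.Run(async () =>
            {
                while (DateTime.UtcNow < end)
                {
                    var response = await client.GetAsync("/api/inventory-report");
                    if ((int)response.StatusCode == 429)
                        Interlocked.Increment(ref rejected);
                    else
                        Interlocked.Increment(ref ok);
                }
            });
        }

        await Task.WhenAll(workers);
        Console.WriteLine($"Succeeded: {ok}, Rejected (429): {rejected}");
    }
}
```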
Metric Category | Metric | Without Rate Limiter | Sliding Window Rate Limiter
CPU Usage | Peak CPU Usage | 72% | 54%
CPU Usage | Average CPU Usage | 35–40% | 15–25%
Thread Pool | Peak Threads | 45 | 25
Thread Pool | Thread Starvation | Yes | No
Requests | Successful Requests | 100% (slow) | 12%
Requests | Rejected (429) | 0 | 82%
Response Time | P95 | 9.8s | 130ms
Note: These benchmarks were captured on a local development machine using Visual Studio Diagnostic Tools. Results may vary based on hardware, workload, and runtime configuration.
Summary
The Sliding Window Rate Limiter is a great choice for APIs that receive uneven, bursty traffic.
By splitting the time window into smaller segments and sliding the window continuously, it keeps your API fast, fair, and stable - even when users or systems send multiple requests in a short time.
If you want to explore other rate limiting strategies, check out the blogs on Fixed Window, Token Bucket, and Concurrency limiters.
FAQs
Q1: Can Sliding Window rate limiting work across multiple servers?
A: No, not by default.
The built-in Sliding Window limiter keeps its counters in memory on each server, so instances do not synchronize with one another.
To make it work across multiple servers, a distributed store such as Redis is required so that all instances share the same counters. ASP.NET Core does not ship a Redis-backed limiter out of the box; community packages (for example, the open-source RedisRateLimiting package) fill this gap. We’ll explore distributed rate limiting in an upcoming blog post.
Q2: How do I test if my rate limiting is working correctly?
A: Rate limiting can be verified using a unit test that sends multiple requests in quick succession and checks that the API returns 429 Too Many Requests once the configured limit is exceeded.
// _client is assumed to come from WebApplicationFactory<Program>
// (Microsoft.AspNetCore.Mvc.Testing), which hosts the API in-memory for tests.
[Fact]
public async Task SlidingWindowLimiter_Returns_429_On_6th_Request()
{
    // Arrange
    var responses = new List<HttpResponseMessage>();

    // Act: send 6 requests back-to-back, one more than the PermitLimit of 5
    for (int i = 0; i < 6; i++)
    {
        responses.Add(await _client.GetAsync("/api/inventory-report"));
    }

    // Assert: the first 5 succeed, the 6th is rejected
    Assert.All(
        responses.Take(5),
        r => Assert.Equal(HttpStatusCode.OK, r.StatusCode));
    Assert.Equal(
        HttpStatusCode.TooManyRequests,
        responses.Last().StatusCode);
}
Q3: Can I use multiple rate limiters on the same endpoint?
A: Yes.
ASP.NET Core allows you to combine multiple rate limiters on a single endpoint.
For example, you can use:
A Sliding Window limiter to control bursty traffic
An IP-based limiter to prevent abuse from a single client
This approach provides better protection for public or high-risk APIs.
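One way to combine both is to register the Sliding Window policy for the endpoint and a global per-IP limiter via `PartitionedRateLimiter`. A sketch, with illustrative limits (the per-IP numbers are assumptions, not recommendations):

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // Endpoint policy: sliding window for bursty report traffic
    options.AddSlidingWindowLimiter("InventoryReportLimit", o =>
    {
        o.PermitLimit = 5;
        o.Window = TimeSpan.FromSeconds(60);
        o.SegmentsPerWindow = 3;
    });

    // Global limiter: a fixed-window cap per client IP, checked on every request
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1)
            }));
});
```

A request must pass the global per-IP limiter first and then the endpoint's Sliding Window policy, so either one can reject it with 429.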
Q4: What happens to requests in the queue (QueueLimit)?
A: When the limit is reached, requests can optionally be queued instead of rejected.
Queued requests wait until permits become available. If the queue is full, new requests are rejected with HTTP 429.
For most production APIs - especially long-running ones - it is recommended to set QueueLimit = 0 and reject excess requests immediately.
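If queuing does make sense for your workload (for example, short, cheap requests), it is enabled by raising `QueueLimit` and choosing a processing order. A sketch with illustrative values:

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.AddSlidingWindowLimiter("QueuedReportLimit", o =>
    {
        o.PermitLimit = 5;
        o.Window = TimeSpan.FromSeconds(60);
        o.SegmentsPerWindow = 3;
        o.QueueLimit = 2;                                        // up to 2 requests wait for a permit
        o.QueueProcessingOrder = QueueProcessingOrder.OldestFirst; // serve waiters FIFO
    });
});
```

Queued requests hold server resources (a connection and a waiting task) while they wait, which is why rejecting immediately is usually the safer default for long-running endpoints like report generation.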
