Microsoft Releases New .NET Rate Limiter in .NET 7 - And It’s Amazing!

Jul 20, 2022 | .NET

It’s official: Microsoft has released the new .NET Rate Limiter in .NET 7! It is built in, and it addresses many of the problems you may have had with rate limiting in the past. Read on to learn more about it!

On performance, the new .NET rate limiter is designed to handle many concurrent requests with little overhead, which makes it a good fit for high-traffic sites.

Before we talk about the features that Microsoft’s rate limiting brings to .NET 7, we need to understand what rate limiting is, what it is for, and how it works.

What is Rate Limiting?

Before we build a project (a .NET API, a Blazor application, or anything else), we must make sure it has no security problems, and throttling has been part of that for a long time. That is why the concept of rate limiting is not new. Rate limiting, as its name suggests, means restricting access to certain resources within a specific time window by setting a maximum access rate.

To explain this concept in a more practical way, imagine that you have an application that connects to a database through an API. Let’s assume that this database and API are not hosted on a quantum super server, and we know that they can handle about 100,000 requests daily. At this point many developers will be wondering things like:

  • Will the API and database be able to handle more than 100,000 requests daily?
  • What would happen if it exceeded that number of requests?
  • If it can’t handle any more requests, will it go down?

All these questions raise doubts as to whether the database and API have the capacity to handle more requests.

This is why the concept of rate limiting has been created. A rate limiter (using different types of algorithms) would accept, in this case, 100,000 requests each day and deny or block any request that exceeds that number. In this way we could ensure that the service is never saturated.
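
To make the idea concrete, here is a minimal sketch of the daily quota described above, written without any library. The names DailyLimit and TryAccept are made up for this illustration, and the counter is not thread-safe, so treat it as a picture of the concept rather than something to deploy.

```csharp
using System;

// Hypothetical daily quota: at most 100,000 requests per UTC day.
const int DailyLimit = 100_000;
DateOnly currentDay = DateOnly.MinValue;
int servedToday = 0;

// Returns true if the request may proceed, false if today's quota is used up.
// Not thread-safe: a real service would need a lock or an atomic counter.
bool TryAccept(DateTime utcNow)
{
    DateOnly day = DateOnly.FromDateTime(utcNow);
    if (day != currentDay)
    {
        currentDay = day;   // a new day has started: reset the counter
        servedToday = 0;
    }
    if (servedToday >= DailyLimit)
    {
        return false;       // quota exhausted: deny the request
    }
    servedToday++;
    return true;
}

var now = new DateTime(2022, 7, 20, 12, 0, 0, DateTimeKind.Utc);
int accepted = 0;
for (int i = 0; i < DailyLimit; i++)
{
    if (TryAccept(now)) accepted++;
}
bool extraToday = TryAccept(now);               // request number 100,001 on the same day
bool firstTomorrow = TryAccept(now.AddDays(1)); // first request of the following day

Console.WriteLine($"accepted={accepted}, extraToday={extraToday}, firstTomorrow={firstTomorrow}");
```

This prints accepted=100000, extraToday=False, firstTomorrow=True: the request over the limit is denied, and the quota is available again the next day.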

If we do not implement rate limiting, we leave the door open to any attacker who wants to harm our application or service, and even well-meaning clients can saturate it unintentionally with excessive requests.

Implementing rate limiting in .NET services and capping the number of requests prevents the API and the database from being saturated, but this is not the only problem it solves.

Why do we need and use Rate Limiting?

As I said before, a service is not only saturated by accident. Saturation can also be planned and carried out by an attacker as a DoS (Denial of Service) attack.

Take the case of a public API that many clients can access at the same time. Even if most of them are legitimate users, simply exposing the API or service publicly opens the possibility of attacks from the outside.

How throttling is applied depends on the requirements of the API: it may cover a specific endpoint or all endpoints.

An API rate limit also helps protect us from DoS attacks launched by human attackers, and from bot attacks carried out through botnets, which are groups of infected computers controlled by the attacker.

These controls can limit by IP address (commonly), by user or by any identifier.

Another (more commercial) use of rate limiting is to charge for a service or API according to usage, through a subscription or a pay-as-you-go plan. In this way, many IaaS companies and cloud providers that offer public APIs ensure that their users and customers do not use the service more than they have paid for.

Now that the concept of Rate Limiting is clear, let’s see and explain the new Rate Limiter that Microsoft has released in .NET 7.

.NET Rate Limiting

First of all, let’s remember that there is already the AspNetCoreRateLimit NuGet package by Stefan Prodan (also available on GitHub) for MVC apps and web APIs. The new rate limiter is integrated in .NET 7 and, according to Microsoft, it helps keep the traffic of our application at a safe level so that the application is not overwhelmed:

“Rate limiting provides a way to protect a resource in order to avoid overwhelming your app and keep traffic at a safe level.”

There are many algorithms, and combinations of them, for controlling the flow of requests to a .NET application. Microsoft has chosen to provide four main ones in .NET 7:

Concurrency limit

The concurrency limiter is the “simplest” solution for rate limiting. It limits the maximum number of requests that can be in progress at the same time. Once the specified limit is reached, the limiter denies any further request until one of the running requests completes.

Imagine you set the limit to 50 concurrent requests, which would mean that only 50 requests would be allowed at a time.

If for any reason a 51st request is generated, it would be denied for exceeding the specified limit.
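
The 50 versus 51 scenario can be sketched with the ConcurrencyLimiter class. This assumes the released System.Threading.RateLimiting NuGet package is referenced, where, to my knowledge, the synchronous call is named AttemptAcquire. QueueLimit = 0 is my choice here so that the extra request is denied instead of queued.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.RateLimiting;

// At most 50 requests in flight; QueueLimit = 0 means extra requests are denied, not queued.
using var limiter = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
{
    PermitLimit = 50,
    QueueLimit = 0,
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst
});

// Simulate 50 requests that have started and not yet finished.
var inFlight = new List<RateLimitLease>();
for (int i = 0; i < 50; i++)
{
    inFlight.Add(limiter.AttemptAcquire());
}
bool allFiftyAccepted = inFlight.TrueForAll(lease => lease.IsAcquired);

// The 51st request arrives while the other 50 are still running.
using RateLimitLease request51 = limiter.AttemptAcquire();
bool request51Accepted = request51.IsAcquired;

Console.WriteLine($"first 50 accepted: {allFiftyAccepted}, 51st accepted: {request51Accepted}");

// Finishing the requests gives their permits back to the limiter.
foreach (RateLimitLease lease in inFlight)
{
    lease.Dispose();
}
```

This prints first 50 accepted: True, 51st accepted: False.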

Every accepted request holds a RateLimitLease obtained from a RateLimiter. When the request completes and its lease is disposed, the permit goes back to the limiter and another request can be admitted. This is the outline of the RateLimiter class:

// API outline only: method bodies are omitted.
public abstract class RateLimiter : IAsyncDisposable, IDisposable
{
    public abstract int GetAvailablePermits();
    public abstract TimeSpan? IdleDuration { get; }

    public RateLimitLease Acquire(int permitCount = 1);
    public ValueTask<RateLimitLease> WaitAsync(int permitCount = 1, CancellationToken cancellationToken = default);

    public void Dispose();
    public ValueTask DisposeAsync();
}

RateLimitLease and RateLimiter are part of System.Threading.RateLimiting, the new NuGet package released alongside .NET 7 that provides the built-in “primitives” and algorithms for creating and configuring rate limiters.
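
Here is a small sketch of how a lease hands its permit back. It assumes the released System.Threading.RateLimiting package, where, to my knowledge, the methods shown above as Acquire and WaitAsync are named AttemptAcquire and AcquireAsync. The limits of 1 permit and 1 queued request are made up to keep the example short.

```csharp
using System;
using System.Threading.RateLimiting;
using System.Threading.Tasks;

// One permit and room for one waiting request (illustrative values).
using var limiter = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
{
    PermitLimit = 1,
    QueueLimit = 1,
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst
});

RateLimitLease first = limiter.AttemptAcquire();             // takes the only permit
bool firstAcquired = first.IsAcquired;

RateLimitLease denied = limiter.AttemptAcquire();            // synchronous attempt: fails right away
bool deniedAcquired = denied.IsAcquired;

ValueTask<RateLimitLease> waiting = limiter.AcquireAsync();  // asynchronous attempt: waits in the queue
bool queuedWhileBusy = !waiting.IsCompleted;

first.Dispose();                                             // the first request completes and returns its permit
RateLimitLease second = await waiting;                       // the queued request now receives that permit
bool secondAcquired = second.IsAcquired;

Console.WriteLine($"{firstAcquired} {deniedAcquired} {queuedWhileBusy} {secondAcquired}");

second.Dispose();
denied.Dispose();
```

This prints True False True True. The synchronous call never waits, while the asynchronous one is queued until a permit is released.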

Token bucket limit

The token bucket is the second algorithm released by Microsoft. It hands out tokens from a bucket with a fixed capacity: each request consumes a token, and tokens are added back at a steady rate. As Microsoft says, the name describes how it works, if a bit abstractly, so let’s understand it with a simple example.

Suppose we have an application and this application has an imaginary bucket. This bucket has a limit of tokens (requests to the application) and only 50 tokens can fit in it, for example.

If a user comes and makes a request (1 token), this will be consumed from the bucket and 49 tokens will remain.

Now let’s imagine that an attacker comes with a botnet and sends 100 requests, which would need 100 tokens. As only 49 tokens were left, just the first 49 of the attacker’s 100 requests will be processed and the remaining 51 will be denied.

In this scenario, no further request will be processed until new tokens are added to the bucket. Tokens are added back at a fixed, configurable interval (every minute in this example).
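
The bucket example can be reproduced with TokenBucketRateLimiter. This is a sketch that assumes the released System.Threading.RateLimiting package and its AttemptAcquire method. TokensPerPeriod = 10 is a value I picked for illustration, and AutoReplenishment is switched off so that no refill happens while the example runs.

```csharp
using System;
using System.Threading.RateLimiting;

using var bucket = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions
{
    TokenLimit = 50,                                 // the bucket holds 50 tokens
    TokensPerPeriod = 10,                            // tokens added per period (assumed value)
    ReplenishmentPeriod = TimeSpan.FromMinutes(1),   // "every minute", as in the text
    QueueLimit = 0,                                  // deny instead of queueing
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
    AutoReplenishment = false                        // no background refill during this demo
});

// A regular user makes one request: 1 token is consumed, 49 remain.
bool userAccepted = bucket.AttemptAcquire().IsAcquired;

// The botnet sends 100 requests before any refill happens.
int accepted = 0;
int denied = 0;
for (int i = 0; i < 100; i++)
{
    using RateLimitLease lease = bucket.AttemptAcquire();
    if (lease.IsAcquired) accepted++;
    else denied++;
}

Console.WriteLine($"user accepted: {userAccepted}, botnet accepted: {accepted}, botnet denied: {denied}");
```

This prints user accepted: True, botnet accepted: 49, botnet denied: 51, matching the numbers in the example.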

Fixed window limit

The third algorithm released by Microsoft in the new .NET rate limiter is the fixed window limit. It resembles the token bucket limit in some respects, but it does have its differences.

The fixed window limit allows a fixed number of requests in each time window of a fixed length. When a window ends, a new one starts and the full number of requests is available again.

To explain it in a more practical way I will reuse Microsoft’s example of the cinema.

Let’s imagine a cinema with 50 seats (a maximum capacity of 50) that is showing Fast and Furious 9, which lasts 2h 15m.

50 people enter and watch the movie. Meanwhile, another 50 people can queue up to watch the next session.

Once the 2h 15m of the first session are over, the 50 people who were watching the movie leave, freeing the 50 seats for the next 50 people, and so on.

In other words, in this example requests are admitted in batches of 50, one batch per window.
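
As a sketch, the cinema can be modelled with FixedWindowRateLimiter. It assumes the released System.Threading.RateLimiting package with its AttemptAcquire and AcquireAsync methods. The variable names are mine, and AutoReplenishment is turned off so that the window does not roll over while the example runs.

```csharp
using System;
using System.Threading.RateLimiting;
using System.Threading.Tasks;

using var cinema = new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
{
    PermitLimit = 50,                          // 50 seats
    Window = TimeSpan.FromMinutes(135),        // one session lasts 2h 15m
    QueueLimit = 50,                           // up to 50 people may wait for the next session
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
    AutoReplenishment = false                  // the window is not advanced during this demo
});

// 50 people take their seats.
int seated = 0;
for (int i = 0; i < 50; i++)
{
    if (cinema.AttemptAcquire().IsAcquired) seated++;
}

// Someone who is not willing to wait is turned away.
bool lateWalkIn = cinema.AttemptAcquire().IsAcquired;

// Someone else joins the queue for the next session.
ValueTask<RateLimitLease> inLine = cinema.AcquireAsync();
bool stillWaiting = !inLine.IsCompleted;

Console.WriteLine($"seated: {seated}, late walk-in admitted: {lateWalkIn}, queued person still waiting: {stillWaiting}");
```

This prints seated: 50, late walk-in admitted: False, queued person still waiting: True. With AutoReplenishment left on, the queued request would be served when the next window starts.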

Sliding window limit

The fourth algorithm for rate limiting in .NET is the sliding window. It is similar to the fixed window limit, but the window is divided into smaller parts (called segments) and it advances one segment at a time.

Let’s understand it based on the example provided by Microsoft again.

Let’s imagine a window of 2 hours, divided into 2 segments of 1 hour each, that accepts a maximum of 40 requests per window. We also have an index that points to the current segment of the window (the most recent one).

Now in the first hour we receive 20 requests. These 20 requests will go directly to the current segment pointed to by the index.

When the first hour has passed, the window index moves to the next segment. The window still has capacity for 20 requests, since the other 20 were used in the previous segment.

In the second hour 10 new requests come in. They go into the segment indicated by the index, and the number of available requests drops to 10.

When the second hour has passed, the window index moves to the next position again. The 20 requests that came in during the first hour now fall outside the window, so their capacity is recovered, leaving a total of 30 available requests.
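
The arithmetic of this walkthrough can be checked with a tiny simulation. This is a hand-written sketch of the idea, not the library’s implementation, and the names TryAcquire and Slide are made up for the illustration.

```csharp
using System;

const int PermitLimit = 40;      // requests allowed per 2-hour window
int[] segments = new int[2];     // requests recorded in each 1-hour segment
int current = 0;                 // index of the current (most recent) segment
int available = PermitLimit;     // requests that can still be accepted

// Records a batch of requests in the current segment if there is room for it.
bool TryAcquire(int count)
{
    if (count > available) return false;
    available -= count;
    segments[current] += count;
    return true;
}

// Called when one segment (one hour) has elapsed.
void Slide()
{
    current = (current + 1) % segments.Length;  // this slot held the oldest segment
    available += segments[current];             // its requests leave the window and are recovered
    segments[current] = 0;                      // the slot becomes the new current segment
}

TryAcquire(20);                       // first hour: 20 requests
int afterHour1Requests = available;
Slide();                              // first hour has passed
int afterFirstSlide = available;
TryAcquire(10);                       // second hour: 10 requests
int afterHour2Requests = available;
Slide();                              // second hour has passed
int afterSecondSlide = available;

Console.WriteLine($"{afterHour1Requests} {afterFirstSlide} {afterHour2Requests} {afterSecondSlide}");
```

This prints 20 20 10 30, the same numbers as in the walkthrough. With the real SlidingWindowRateLimiter, the equivalent settings would, to my knowledge, be PermitLimit = 40, Window = 2 hours and SegmentsPerWindow = 2.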

Having explained the algorithms, there are two more pieces worth knowing: PartitionedRateLimiter, from the System.Threading.RateLimiting NuGet package, and the rate limiting middleware for ASP.NET Core. Let’s go through both so that we can keep our .NET applications and APIs from being overwhelmed.

PartitionedRateLimiter

As I said a moment ago, PartitionedRateLimiter belongs to the System.Threading.RateLimiting NuGet package. It is similar in some aspects to the RateLimiter that I explained above. The main difference is that its methods take an additional TResource argument, which identifies the resource being limited and decides which partition (and therefore which limiter) handles the request.

As Microsoft explains, Acquire becomes:

Acquire(TResource resourceID, int permitCount = 1)

Microsoft provides the following example:

enum MyPolicyEnum
{
    One,
    Two,
    Admin,
    Default
}

// Each resource name is mapped to a partition with its own limiter.
PartitionedRateLimiter<string> limiter = PartitionedRateLimiter.Create<string, MyPolicyEnum>(resource =>
{
    if (resource == "Policy1")
    {
        return RateLimitPartition.Create(MyPolicyEnum.One, key => new MyCustomLimiter());
    }
    else if (resource == "Policy2")
    {
        return RateLimitPartition.CreateConcurrencyLimiter(MyPolicyEnum.Two, key =>
            new ConcurrencyLimiterOptions(permitLimit: 2, queueProcessingOrder: QueueProcessingOrder.OldestFirst, queueLimit: 2));
    }
    else if (resource == "Admin")
    {
        return RateLimitPartition.CreateNoLimiter(MyPolicyEnum.Admin);
    }
    else
    {
        return RateLimitPartition.CreateTokenBucketLimiter(MyPolicyEnum.Default, key =>
            new TokenBucketRateLimiterOptions(tokenLimit: 5, queueProcessingOrder: QueueProcessingOrder.OldestFirst,
                queueLimit: 1, replenishmentPeriod: TimeSpan.FromSeconds(5), tokensPerPeriod: 1, autoReplenishment: true));
    }
});

RateLimitLease lease = limiter.Acquire(resourceID: "Policy1", permitCount: 1);

// ...

RateLimitLease lease = limiter.Acquire(resourceID: "Policy2", permitCount: 1);

// ...

RateLimitLease lease = limiter.Acquire(resourceID: "Admin", permitCount: 12345678);

// ...

RateLimitLease lease = limiter.Acquire(resourceID: "other value", permitCount: 1);
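
The same idea covers limiting by IP address, user or any other identifier: the identifier becomes the partition key. The sketch below assumes the released System.Threading.RateLimiting package, where, to my knowledge, the factory methods are named Get... (for example GetFixedWindowLimiter) instead of the Create... names of the preview sample above, and Acquire is named AttemptAcquire. The IP addresses and the limit of 2 requests per minute are invented for the example.

```csharp
using System;
using System.Threading.RateLimiting;

// One fixed-window limiter per client IP: 2 requests per minute each (hypothetical numbers).
using PartitionedRateLimiter<string> perIp = PartitionedRateLimiter.Create<string, string>(ip =>
    RateLimitPartition.GetFixedWindowLimiter(ip, _ => new FixedWindowRateLimiterOptions
    {
        PermitLimit = 2,
        Window = TimeSpan.FromMinutes(1),
        QueueLimit = 0,
        QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
        AutoReplenishment = false
    }));

// Client A uses up its own allowance...
bool a1 = perIp.AttemptAcquire("10.0.0.1").IsAcquired;
bool a2 = perIp.AttemptAcquire("10.0.0.1").IsAcquired;
bool a3 = perIp.AttemptAcquire("10.0.0.1").IsAcquired;

// ...while client B is counted separately.
bool b1 = perIp.AttemptAcquire("10.0.0.2").IsAcquired;

Console.WriteLine($"A: {a1} {a2} {a3}, B: {b1}");
```

This prints A: True True False, B: True. One noisy client runs into its limit without affecting the others.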

RateLimiting middleware

The rate limiting middleware builds on System.Threading.RateLimiting and ships with ASP.NET Core (Microsoft.AspNetCore.RateLimiting). Its main function is to let us configure custom rate limiting policies and attach them to endpoints.

Going back to the Microsoft examples, in this case we can find that:

Func<HttpContext, RateLimitPartition<TPartitionKey>>

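serves the same purpose as the function passed to PartitionedRateLimiter.Create: it assigns each request to a partition.

The middleware is configured through RateLimiterOptions, which includes RejectionStatusCode, the status code returned for rejected requests (503 by default). For more control, the OnRejected callback can set the response itself:

    OnRejected = (context, cancellationToken) =>
    {
        // The status code is set on the response of the rejected request.
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        return new ValueTask();
    }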
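
To show where these options end up, here is a minimal sketch of an ASP.NET Core app that uses the middleware. It assumes the API as released in .NET 7 (AddRateLimiter, UseRateLimiter, AddFixedWindowLimiter and RequireRateLimiting). The policy name "fixed" and the limit of 10 requests per 10 seconds are made up for the example.

```csharp
using System;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // Answer rejected requests with 429 instead of the default 503.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // A named policy: 10 requests per 10-second window, no queueing (hypothetical values).
    options.AddFixedWindowLimiter("fixed", limiterOptions =>
    {
        limiterOptions.PermitLimit = 10;
        limiterOptions.Window = TimeSpan.FromSeconds(10);
        limiterOptions.QueueLimit = 0;
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    });
});

var app = builder.Build();

app.UseRateLimiter();

// Only endpoints that opt in to the policy are limited.
app.MapGet("/", () => "Hello").RequireRateLimiting("fixed");

app.Run();
```

Setting RejectionStatusCode is enough when only the status code needs to change. OnRejected is the place for anything more, such as adding headers or writing a response body.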

I hope these explanations have cleared up many doubts about .NET rate limiting. I would like to know whether you are going to implement, or have already implemented, the .NET rate limiter to limit requests to your APIs.
