The 5 Pillars of a Scalable ASP.NET Core API
Interview Question: 'How do you design a .NET API to be scalable?'
'Scalability' is the ability of your API to handle a growing amount of load, whether that's 10 requests per second or 10,000. It's not one single feature, but an architectural approach.
An interviewer wants to see if you understand the bottlenecks of a web application (CPU, I/O, Database) and how to mitigate them.
Here are the 5 non-negotiable pillars for a scalable API:
1. Embrace Asynchronous Programming (async/await)
The Goal: To not block the request thread. This is the single most important factor for I/O-bound applications (like APIs). We'll dive deep into why this is the key to scalability in this cluster.
2. Implement Smart Caching
The Goal: To reduce database load. Your database is almost always your slowest bottleneck. A well-placed cache can serve 99% of your requests without ever touching the database, making your API blazing fast.
3. Use API Pagination
The Goal: To not send huge data payloads. An endpoint that returns 1 million rows from the database will exhaust your server's memory, saturate the network, and freeze the client's browser. Pagination is a must-have.
4. Design for Decoupling (DI and SOLID)
The Goal: To have a maintainable and flexible system. Using Dependency Injection (which we covered in the SOLID cluster) allows you to build loosely-coupled components that are easy to test, maintain, and swap out.
5. Monitor, Profile, and Add Health Checks
The Goal: You can't optimize what you can't measure. A scalable API must be observable. This means using monitoring tools to find bottlenecks and health checks to tell the load balancer if the API is healthy.
This cluster will provide a deep, practical dive into the most critical of these pillars.
Async/Await: The #1 Key to API Scalability
Interview Question: 'Why does async/await make an API more scalable?'
This is a critical concept. A common mistake is to say 'it makes the code run faster'. This is wrong.
Answer: 'Async/await doesn't make a single request faster. It makes the server able to handle thousands more concurrent requests. It's about scalability and throughput, not speed.'
The 'Thread Pool' Problem
Your web server (e.g., Kestrel) has a limited number of 'worker threads' in its thread pool (say, 200 threads). Every incoming request 'borrows' one thread to do its work.
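You can actually watch this pool under load. Here is a minimal diagnostic sketch (the route name is hypothetical; ThreadPool.ThreadCount and ThreadPool.PendingWorkItemCount exist on .NET Core 3.0 and later):
// A small diagnostics endpoint that reports thread pool pressure.
// (Requires 'using System.Threading;')
[HttpGet("diagnostics/threads")]
public IActionResult GetThreadPoolStats() {
    return Ok(new {
        // Worker threads currently alive in the pool
        ThreadCount = ThreadPool.ThreadCount,
        // Work items queued but not yet started; a steadily growing
        // number under load is the classic sign of thread starvation
        PendingWorkItems = ThreadPool.PendingWorkItemCount
    });
}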
'Before': Synchronous (BAD)
A thread is blocked, waiting for I/O (like a database query).
// This is a SYNC controller action
[HttpGet("bad/{id}")]
public IActionResult GetProduct(int id) {
// 1. Request 1 borrows Thread 1.
// 2. Thread 1 calls the database. The database takes 3 seconds.
// During this time, Thread 1 is BLOCKED. It sits and does
// nothing but wait. It cannot be used for anything else.
var product = _db.Products.Find(id); // Synchronous call
// 3. After 3 seconds, Thread 1 gets the result and returns.
return Ok(product);
}
The Scalability Collapse: If 200 users make this request at the same time, all 200 threads are borrowed and blocked, waiting for the database. When User 201 arrives, their request is rejected or queued. Your server is full. This is 'thread pool starvation'.
'After': Asynchronous (GOOD)
A thread is released while waiting for I/O.
// This is an ASYNC controller action
[HttpGet("good/{id}")]
public async Task<IActionResult> GetProductAsync(int id) {
// 1. Request 1 borrows Thread 1.
// 2. Thread 1 calls the database.
// 3. The 'await' keyword tells Thread 1: 'My work is done for now.
// Go back to the thread pool.'
// Thread 1 is now FREE to handle a new request from User 201!
var product = await _db.Products.FindAsync(id); // Asynchronous call
// 4. After 3 seconds, the database is done. A callback is queued.
// 5. A free thread from the pool (maybe Thread 74) picks up the
// work, finishes the method, and returns the response.
return Ok(product);
}
Conclusion: With async/await, your 200 threads can easily handle thousands of requests. While 500 requests are 'awaiting' the database, the 200 threads are busy handling new incoming requests. This is the #1 key to API scalability.
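One related trap worth naming in an interview: calling .Result or .Wait() on a task ('sync-over-async') silently re-introduces the blocking. A sketch of what to avoid (the route name is hypothetical):
// ANTI-PATTERN: 'sync-over-async'. This blocks the request thread
// just like the synchronous version, and in some contexts can deadlock.
[HttpGet("ugly/{id}")]
public IActionResult GetProductBlocking(int id) {
    // .Result blocks Thread 1 until the database call completes,
    // throwing away all of the benefits described above.
    var product = _db.Products.FindAsync(id).Result;
    return Ok(product);
}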
Caching Strategies: In-Memory vs. Distributed (Redis)
Interview Question: 'How would you implement caching in a .NET API?'
Answer: 'Caching is essential for scalability because it reduces load on the database. I would use the built-in IMemoryCache for simple, single-server applications, but for a truly scalable, multi-server environment, I would use a Distributed Cache like Redis.'
Strategy 1: In-Memory Cache (IMemoryCache)
What it is: A cache stored inside your API's own process memory. It's the simplest to set up.
Setup (Program.cs):
builder.Services.AddMemoryCache();
Usage:
public class ProductController : ControllerBase {
private readonly IMemoryCache _cache;
private readonly MyDbContext _db;
public ProductController(IMemoryCache cache, MyDbContext db) {
_cache = cache;
_db = db;
}
[HttpGet("{id}")]
public async Task<IActionResult> GetProduct(int id) {
string cacheKey = $"product-{id}";
// 'GetOrCreateAsync' is the magic method
var product = await _cache.GetOrCreateAsync(cacheKey,
async (entry) => {
// This code only runs if the item is NOT in the cache
entry.SetAbsoluteExpiration(TimeSpan.FromMinutes(5));
return await _db.Products.FindAsync(id);
});
if (product == null) return NotFound();
return Ok(product);
}
}
Scalability Trap: If you run your API on 5 servers (load-balanced), each server has its own private cache. User A (hits Server 1) caches the data. User B (hits Server 2) gets a cache miss and hits the DB. This is inefficient, and the caches can become out-of-sync.
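On a single server you can at least invalidate the entry whenever the data changes, so the cache never serves stale reads for long. A minimal sketch (the update endpoint is hypothetical):
// Evict the cached entry when the product changes. Note: on a
// 5-server farm this only clears THIS server's private cache,
// which is exactly why the distributed cache below exists.
// (Requires 'using Microsoft.EntityFrameworkCore;')
[HttpPut("{id}")]
public async Task<IActionResult> UpdateProduct(int id, Product updated) {
    _db.Entry(updated).State = EntityState.Modified;
    await _db.SaveChangesAsync();
    _cache.Remove($"product-{id}"); // IMemoryCache.Remove
    return NoContent();
}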
Strategy 2: Distributed Cache (IDistributedCache / Redis)
What it is: A shared, external cache (like a Redis server) that all your API servers talk to. This is the truly scalable solution.
Setup (Program.cs):
// This requires the 'Microsoft.Extensions.Caching.StackExchangeRedis' package
builder.Services.AddStackExchangeRedisCache(options => {
options.Configuration = builder.Configuration.GetConnectionString("Redis");
options.InstanceName = "MyApi_";
});
Usage:
Distributed caches only store strings or byte arrays, so you must serialize/deserialize your objects (e.g., as JSON with System.Text.Json).
public class ProductController : ControllerBase {
private readonly IDistributedCache _cache;
// ...
[HttpGet("redis/{id}")]
public async Task<IActionResult> GetProductRedis(int id) {
string cacheKey = $"product-{id}";
Product product;
// 1. Try to get from cache
string jsonProduct = await _cache.GetStringAsync(cacheKey);
if (jsonProduct != null) {
// 2. Found in cache! Deserialize.
product = JsonSerializer.Deserialize<Product>(jsonProduct);
} else {
// 3. Not found. Get from DB.
product = await _db.Products.FindAsync(id);
if (product == null) return NotFound();
// 4. Serialize and save to cache
jsonProduct = JsonSerializer.Serialize(product);
var options = new DistributedCacheEntryOptions()
.SetSlidingExpiration(TimeSpan.FromMinutes(2))
.SetAbsoluteExpiration(TimeSpan.FromHours(1));
await _cache.SetStringAsync(cacheKey, jsonProduct, options);
}
return Ok(product);
}
}
Stop Sending 1 Million Rows: Implementing API Pagination
Interview Question: 'Your GET /api/products endpoint is slow. Why?'
Answer: 'My first guess would be that it's returning too much data. An endpoint that tries to return 1 million products in a single JSON payload will time out or crash the server. The solution is pagination.'
Pagination means only returning a 'page' of data at a time (e.g., 20 items) and letting the client ask for the next page.
How to Implement Pagination
The client must be able to ask for what it wants using query parameters.
Client Request: GET /api/products?pageNumber=2&pageSize=25
Backend Implementation (LINQ):
The most important part is using Skip() and Take() on the IQueryable before calling ToListAsync(). This ensures the database does the work, and you don't pull 1 million rows into your API's memory.
[HttpGet("products")]
public async Task<IActionResult> GetProducts(
[FromQuery] int pageNumber = 1,
[FromQuery] int pageSize = 20)
{
// 1. Get the IQueryable. This does not hit the database yet.
var query = _context.Products
.Where(p => p.IsActive) // Example filter
.OrderBy(p => p.Name);
// 2. Get the total count of all products before pagination
// This is for the client to know how many pages there are.
var totalCount = await query.CountAsync();
// 3. Apply pagination to the IQueryable
// This modifies the SQL query!
var products = await query
.Skip((pageNumber - 1) * pageSize)
.Take(pageSize)
.ToListAsync();
// 4. The generated SQL will be efficient (here, for pageNumber=2, pageSize=20):
// '... ORDER BY Name
//  OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY'
// 5. Return the paged data + the total count (often in a header)
Response.Headers.Add("X-Total-Count", totalCount.ToString());
Response.Headers.Add("X-Page-Number", pageNumber.ToString());
Response.Headers.Add("X-Page-Size", pageSize.ToString());
return Ok(products);
}
Conclusion: Always paginate endpoints that can return large sets of data. Let the database do the work using Skip() and Take().
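Instead of (or alongside) custom headers, many teams wrap the page in a response envelope so clients get the metadata in the body. A minimal sketch; 'PagedResult' is a hypothetical name, not a framework type:
// A simple envelope for paged responses.
public class PagedResult<T> {
    public int PageNumber { get; set; }
    public int PageSize { get; set; }
    public int TotalCount { get; set; }
    // Total pages, rounded up
    public int TotalPages => (int)Math.Ceiling((double)TotalCount / PageSize);
    public List<T> Items { get; set; } = new();
}
// In the action: return Ok(new PagedResult<Product> {
//     PageNumber = pageNumber, PageSize = pageSize,
//     TotalCount = totalCount, Items = products });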
Monitoring & Health Checks: Knowing When You're Slow
Interview Question: 'Your API is in production and users say 'it's slow'. What do you do?'
Answer: 'You can't fix what you can't measure. The first step is to use an APM (Application Performance Monitoring) tool to find the bottleneck. I would also implement Health Checks to ensure the system is resilient and can self-report when it's unhealthy.'
1. APM (e.g., Application Insights)
What it is: A tool like Azure Application Insights, Datadog, or New Relic. You add their SDK to your project, and they automatically trace every single request.
How it helps: It gives you a dashboard that answers questions like:
- 'What is my slowest endpoint?' (e.g., GET /api/reports)
- 'Where is it slow?' (e.g., 'It spends 95% of its time on this one SQL query.')
- 'What's my error rate?'
Without an APM, you are just guessing. With an APM, you know exactly where the bottleneck is.
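For Application Insights specifically, the basic wiring is a single registration (assuming the 'Microsoft.ApplicationInsights.AspNetCore' package and a connection string in configuration):
// Program.cs
// Automatically collects requests, dependencies (SQL, HTTP) and
// exceptions for every request, with no per-endpoint code.
builder.Services.AddApplicationInsightsTelemetry();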
2. Health Checks
What it is: A built-in ASP.NET Core feature. It's a special endpoint (e.g., /health) that reports the status of your API and its dependencies (database, cache, etc.).
How it helps: A load balancer (like Nginx) or an orchestrator (like Kubernetes) will 'ping' this /health endpoint every few seconds.
- If it returns 200 OK, the load balancer knows this server is healthy and sends it traffic.
- If your database connection dies, the health check will fail and return 503 Service Unavailable. The load balancer will see this and immediately stop sending traffic to this 'sick' server, re-routing users to a healthy one.
This provides automatic self-healing and is critical for a scalable, resilient system.
Setup (Program.cs):
// 1. Register the health checks
builder.Services.AddHealthChecks()
    // Add a check for our database (needs the
    // 'Microsoft.Extensions.Diagnostics.HealthChecks.EntityFrameworkCore' package)
    .AddDbContextCheck<MyDbContext>("Database")
    // Add a check for our Redis cache (needs the 'AspNetCore.HealthChecks.Redis' package)
    .AddRedis(builder.Configuration.GetConnectionString("Redis"), "Redis");
var app = builder.Build();
// 2. Map the endpoint for the load balancer to hit
app.MapHealthChecks("/health");
// ...
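For a dependency without a ready-made package, you can write a custom check by implementing IHealthCheck. A minimal sketch; the payment API and class name are hypothetical:
// Checks an external dependency (hypothetical payment API).
public class PaymentApiHealthCheck : IHealthCheck {
    private readonly HttpClient _http;
    // Assumes a typed HttpClient registered via
    // builder.Services.AddHttpClient<PaymentApiHealthCheck>(...)
    public PaymentApiHealthCheck(HttpClient http) => _http = http;

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default) {
        try {
            var response = await _http.GetAsync("/ping", cancellationToken);
            return response.IsSuccessStatusCode
                ? HealthCheckResult.Healthy("Payment API reachable")
                : HealthCheckResult.Unhealthy($"Payment API returned {(int)response.StatusCode}");
        } catch (Exception ex) {
            // DNS failures, timeouts, and refused connections all land here
            return HealthCheckResult.Unhealthy("Payment API unreachable", ex);
        }
    }
}
// Register it alongside the other checks:
// builder.Services.AddHealthChecks().AddCheck<PaymentApiHealthCheck>("PaymentApi");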