How I Reduced My API Response Time (and What Actually Worked)


If you’ve ever stared at your network tab wondering why your API feels like it’s taking a nap before sending a response, you’re not alone. I’ve been there many times. In my last few projects, I spent weeks testing and tuning API performance to squeeze out every millisecond possible. In this post, I’ll share what actually worked for me, what didn’t, and the real-world numbers I got along the way.

Step 1: Measure Before You Optimize

I can’t stress this enough: don’t optimize without measuring first. Before touching a single line of code, I used tools like:

  • Postman for manual API testing.
  • Apache Benchmark (ab) and k6 for load testing.
  • New Relic and Datadog for server-side profiling.

These tools gave me metrics such as response time, throughput, and time-to-first-byte (TTFB). For one API I tested (Node.js with Express), the average response time was around 420ms before any optimization. Not terrible, but not great either.
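Averages like that 420ms figure can hide tail latency, so it helps to look at percentiles too when comparing before/after numbers. Here is a minimal sketch of a percentile summary for latency samples; it is a sanity-check helper, not a replacement for k6's built-in statistics, and the sample numbers are made up for illustration.

```javascript
// Summarize latency samples (ms): mean, p50, p95.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  // Nearest-rank method: smallest value covering p% of samples.
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

function summarize(samples) {
  const mean = samples.reduce((sum, x) => sum + x, 0) / samples.length;
  return {
    mean: Math.round(mean),
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
  };
}

// Hypothetical latencies collected from ten requests.
const latencies = [380, 400, 410, 415, 420, 425, 430, 440, 450, 900];
console.log(summarize(latencies)); // { mean: 467, p50: 420, p95: 900 }
```

Note how one slow outlier drags the p95 far above the median; that is exactly the kind of signal a plain average buries.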

Step 2: Optimize Database Queries

My biggest bottleneck turned out to be the database queries.

Here’s what I did:

  • Added proper indexing on frequently queried columns. A simple CREATE INDEX on a foreign key dropped query time by nearly 60%.
  • Reduced N+1 queries by switching to eager loading in Sequelize. Instead of fetching related data in loops, I fetched everything in one go.
  • Cached common queries using Redis. This was a game changer. I set a TTL (time-to-live) of 60 seconds for high-traffic endpoints, and the average response time dropped from 420ms to 160ms.

Lesson: Databases are often the silent killers of performance. Profiling your queries is essential.
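The Redis layer described above follows the cache-aside pattern: check the cache, fall back to the database on a miss, and store the result with a TTL. Here is a runnable sketch of that pattern; a Map stands in for Redis so it runs standalone (a real Redis client is asynchronous and uses GET/SET with EX), and `loadTopProducts` is a placeholder for a real query.

```javascript
// Cache-aside with a TTL: the pattern behind the Redis layer above.
const cache = new Map(); // key -> { value, expiresAt }

function cachedQuery(key, ttlMs, loader) {
  const hit = cache.get(key);
  if (hit && hit.expiresAt > Date.now()) {
    return hit.value; // cache hit: no database round-trip
  }
  const value = loader(); // cache miss: run the real query
  cache.set(key, { value, expiresAt: Date.now() + ttlMs });
  return value;
}

// Hypothetical loader standing in for a real database query.
let dbCalls = 0;
function loadTopProducts() {
  dbCalls += 1;
  return ['widget', 'gadget'];
}

cachedQuery('top-products', 60_000, loadTopProducts); // miss: loader runs
cachedQuery('top-products', 60_000, loadTopProducts); // hit: served from cache
console.log(dbCalls); // 1
```

The short TTL is the safety valve: stale data lives at most 60 seconds, which is usually acceptable for high-traffic read endpoints.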

Step 3: Enable Compression

Next up was compression. Enabling Gzip or Brotli on the server reduced response time, not from faster processing but from smaller transfers over the wire.

import compression from 'compression';
// Gzip-compress response bodies before they go over the wire
app.use(compression());

This reduced payload size by about 70% on JSON responses. The response time went from 160ms to 130ms on average.

Step 4: Cache Strategically

After database caching, I also added HTTP-level caching for static or infrequently changing data.

Example: I used ETag and Cache-Control headers like this:

res.setHeader('Cache-Control', 'public, max-age=300');

This allowed browsers to reuse cached data for five minutes, reducing load on both the API and the client. For APIs that serve things like categories, settings, or user roles, this works extremely well.

Step 5: Reduce Payload Size

Sometimes the data we send is simply too large. I realized one of my APIs was returning full user objects (including profile settings, permissions, and preferences) when all the frontend needed was a name and avatar.

By selecting only required fields (for example, with Mongoose: .select('name avatar')), I reduced the payload by 80%. That change alone brought the response time from 130ms to around 90ms.
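The `.select` call does this trimming at the query level; the same idea can also be applied just before serialization when the query layer is not under your control. A minimal sketch, with the `user` record and its fields invented for illustration:

```javascript
// Keep only the fields the client actually needs.
function pick(obj, fields) {
  const out = {};
  for (const field of fields) {
    if (field in obj) out[field] = obj[field];
  }
  return out;
}

// Hypothetical full user record, as the database returns it.
const user = {
  name: 'Ada',
  avatar: '/img/ada.png',
  permissions: ['read', 'write'],
  preferences: { theme: 'dark' },
  settings: { emailOptIn: true },
};

const slim = pick(user, ['name', 'avatar']);
console.log(JSON.stringify(slim)); // {"name":"Ada","avatar":"/img/ada.png"}
```

Trimming at the query level is still preferable when possible, since it also spares the database from reading and shipping the unused columns.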

Step 6: Use Asynchronous and Parallel Processing

Wherever possible, I moved blocking tasks to background jobs. Sending emails, writing logs, and generating reports do not need to block the API response.

I used message queues (BullMQ with Redis) to handle these tasks asynchronously. The perceived response time dropped significantly because the API returned an acknowledgment instead of waiting for every operation to finish.
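The respond-then-process pattern behind that BullMQ setup can be sketched with a tiny in-process queue standing in for the Redis-backed workers (real queued jobs survive restarts and run in separate worker processes; this simplified version does not). The job name and payload here are invented for illustration.

```javascript
// Fire-and-acknowledge: enqueue is cheap, so the handler returns at once.
const jobs = [];
const completed = [];

function enqueue(name, payload) {
  jobs.push({ name, payload }); // O(1): no waiting on slow work
  return { status: 'accepted' }; // acknowledgment sent to the client
}

// A worker drains the queue later, on its own schedule.
function runWorker() {
  while (jobs.length > 0) {
    const job = jobs.shift();
    completed.push(job.name); // a real worker would send the email, etc.
  }
}

// The API handler acknowledges immediately instead of sending the email.
const ack = enqueue('send-welcome-email', { to: 'ada@example.com' });
console.log(ack.status, completed.length); // accepted 0
runWorker(); // happens after the response has already gone out
console.log(completed); // [ 'send-welcome-email' ]
```

The client's perceived latency is now just the enqueue cost, while the slow work happens off the request path.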

Step 7: CDN and Edge Caching (Bonus)

For APIs serving global users, physical distance affects latency. I integrated Cloudflare CDN to cache public GET responses. It reduced latency for users outside the main data center region, especially those in Asia, by nearly 40%.

Step 8: Upgrade Hosting and Runtime

When everything else has been optimized, sometimes the bottleneck is simply the server environment. I moved from a shared VPS to AWS Fargate with auto-scaling and an optimized Node.js runtime. The improvement was not massive (around 15–20%), but it made performance consistent under heavy load.

Final Results

Here’s how my API improved through each optimization step:

Optimization Stage                       Average Response Time
Before optimization                      420ms
After database tuning                    160ms
After compression and caching            130ms
After payload and async tweaks           90ms
After CDN and infrastructure tuning      70ms
A total reduction of 83%, and it felt instantaneous on the frontend.

What Didn’t Work

  • Premature microservices: Splitting too early added latency from service-to-service communication.
  • Over-caching: I once cached user sessions too aggressively and ended up with stale data issues.
  • Blind library upgrades: Some popular libraries actually increased latency because of heavier internal parsing.

My Takeaway

Optimizing API response time is both a science and an art. You don’t need to guess; just measure, make changes, and measure again. Start with the slowest layer (usually the database), then move outward to compression, caching, and delivery.

If your APIs feel slow, tackle one optimization at a time. Each improvement compounds, and before long, your API will be running fast and efficiently.

Your Turn: Have you tried any of these strategies or discovered something else that made a huge difference? I’d love to hear what worked for you.
