When designing or implementing a new system, one of the most critical areas of non-functional requirements (NFRs) is performance. A system may have all the required business features, but if it responds slowly, cannot handle scale, or struggles under load, users will quickly lose confidence in it.
Performance engineering typically focuses on three fundamental concepts:
- Latency
- Throughput
- Bandwidth
Although these terms are often used interchangeably, they represent very different characteristics of a system. Understanding the distinction between them is essential for architects, engineers, project managers, and infrastructure teams.
Latency — How Fast a Request Completes

Latency refers to the amount of time it takes for data or a request to travel from source to destination and complete processing.
In simpler terms:
Latency measures delay.
Examples include:
- API response time
- Database query execution time
- Network round-trip time
- Time taken to load a webpage
Latency is commonly measured in:
- milliseconds (ms)
- microseconds (µs)
Real-World Example
A customer clicks the “Checkout” button on an e-commerce site.
If the payment confirmation appears after 150 milliseconds, then the transaction latency is 150 ms.
Lower latency results in a faster and smoother user experience.
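As a minimal sketch of how such a delay can be measured, the snippet below times a single call with Python's monotonic `time.perf_counter` clock. The `fake_checkout` function is a hypothetical stand-in for the real checkout request (it only sleeps for about 20 ms), so the number printed illustrates the technique rather than any real system.

```python
import time

def measure_latency_ms(operation):
    """Return how long one call to `operation` takes, in milliseconds."""
    start = time.perf_counter()
    operation()
    return (time.perf_counter() - start) * 1000.0

def fake_checkout():
    # Hypothetical stand-in for a checkout request: just waits ~20 ms.
    time.sleep(0.02)

latency_ms = measure_latency_ms(fake_checkout)
print(f"checkout latency: {latency_ms:.1f} ms")
```

A single sample is noisy. In practice many samples are collected and summarized, for example with percentiles.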
Throughput — How Much Work the System Can Handle

Throughput measures the amount of work completed within a specific period of time.
In simpler terms:
Throughput measures system productivity.
Examples include:
- Requests processed per second
- Transactions completed per minute
- Data transferred per hour
Common throughput metrics:
- Requests per second (RPS)
- Transactions per second (TPS)
- MB/s or GB/s
Real-World Example
An API gateway successfully processes 25,000 requests per second.
That number is the throughput of the gateway: the work it actually completes per unit of time.
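Throughput is simply completed work divided by elapsed time. The sketch below shows that arithmetic; the measurement window (1,500,000 requests completed in 60 seconds) is an assumed figure chosen to match the 25,000 requests per second in the example.

```python
def throughput_per_second(completed, elapsed_seconds):
    """Units of work completed per second over a measurement window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completed / elapsed_seconds

# Assumed window: 1,500,000 requests completed in 60 seconds.
rps = throughput_per_second(1_500_000, 60)
print(rps)  # 25000.0
```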
Bandwidth — The Maximum Capacity Available

Bandwidth refers to the maximum data transfer capacity of a network or communication channel.
In simpler terms:
Bandwidth measures the size of the pipe.
Bandwidth is commonly measured in:
- Mbps (megabits per second)
- Gbps (gigabits per second)
Real-World Example
A network connection supports a maximum transfer rate of 10 Gbps.
This means the network can theoretically transfer up to 10 gigabits of data per second.
However, actual transfer speeds are often lower due to:
- congestion,
- packet loss,
- protocol overhead,
- latency,
- and hardware limitations.
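Because bandwidth is quoted in bits per second while file sizes are usually quoted in bytes, a quick conversion helps when estimating the best-case transfer time. The sketch below assumes decimal units (1 Gbps = 1,000,000,000 bits per second) and a hypothetical 5 GB file; it ignores every one of the constraints listed above, so the result is a lower bound, not a prediction.

```python
def gbps_to_bytes_per_second(gbps):
    """Convert gigabits per second to bytes per second (decimal units)."""
    return gbps * 1_000_000_000 / 8

def ideal_transfer_seconds(size_bytes, gbps):
    """Best-case transfer time if the full bandwidth were usable."""
    return size_bytes / gbps_to_bytes_per_second(gbps)

link_gbps = 10
print(gbps_to_bytes_per_second(link_gbps))  # 1250000000.0 (1.25 GB/s)

file_size_bytes = 5 * 1_000_000_000  # hypothetical 5 GB file
print(ideal_transfer_seconds(file_size_bytes, link_gbps))  # 4.0
```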
The Difference Between Throughput and Bandwidth
One of the most common misconceptions is assuming throughput and bandwidth mean the same thing.
They do not.
- Bandwidth: theoretical maximum capacity.
- Throughput: actual achieved performance.
For example:
| Metric | Value |
|---|---|
| Available Bandwidth | 1 Gbps |
| Actual Throughput | 450 Mbps |
Even though the network supports 1 Gbps, the system only achieves 450 Mbps due to real-world constraints.
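One way to express the gap is as link utilization, the ratio of achieved throughput to available bandwidth. The short sketch below uses the figures from the table; the function name is illustrative.

```python
def utilization(throughput_mbps, bandwidth_mbps):
    """Fraction of the available bandwidth that is actually achieved."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    return throughput_mbps / bandwidth_mbps

bandwidth_mbps = 1000   # 1 Gbps expressed in Mbps
throughput_mbps = 450

ratio = utilization(throughput_mbps, bandwidth_mbps)
print(f"{ratio:.0%}")  # 45%
```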
Why These Metrics Matter in System Design
Performance-related NFRs help teams:
- estimate infrastructure needs,
- design scalable architectures,
- plan capacity,
- and define acceptable service levels.
Examples of performance requirements include:
| Requirement Type | Example |
|---|---|
| Latency | 95% of API responses must complete within 200 ms |
| Throughput | System must support 100,000 requests per second |
| Bandwidth | Data center uplink must support 40 Gbps traffic |
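A percentile-based latency requirement like the one in the table can be checked against measured samples. The sketch below uses the nearest-rank method for the 95th percentile. The sample latencies are made up for illustration, and other percentile definitions (such as interpolation) may give slightly different values.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Made-up latency samples in milliseconds (20 requests).
latencies_ms = [120, 95, 180, 150, 110, 130, 210, 140, 160, 105,
                125, 135, 145, 155, 165, 175, 115, 100, 190, 450]

p95 = percentile(latencies_ms, 95)
meets_requirement = p95 <= 200

print(p95)                # 210
print(meets_requirement)  # False
```

In this made-up data set most requests finish well under 200 ms, yet the requirement is still missed because of two slow requests in the tail. That is why latency targets are usually stated as percentiles rather than averages.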
TechE2E