99.9% Uptime
Understand what 99.9% uptime really means in minutes of downtime per year. Includes SLA tier calculations, JavaScript formulas, and monitoring setup.
Your hosting provider guarantees 99.9% uptime. Your status page shows 99.9% uptime for the last 30 days. Your sales team tells prospects you have 99.9% uptime. But none of you can answer the simplest follow-up question: how many minutes of downtime is that? The answer is 8 hours, 45 minutes, and 36 seconds per year. Not per decade — per year. That means your service can be completely unreachable for an entire workday and still technically meet a 99.9% SLA. This article breaks down what uptime percentages actually mean in real time, how to calculate them correctly, the difference between each SLA tier, and how to monitor and measure uptime in a way that reflects reality instead of marketing.
The Uptime Percentage Illusion
Uptime percentages feel high because people have poor intuition for percentages above 99%. The difference between 99% and 99.9% sounds trivial: it is less than one percentage point. But in terms of allowed downtime, the gap is enormous: 99% allows about 3.65 days of downtime per year, while 99.9% allows 8 hours and 45 minutes.
Going from 99% to 99.9% means reducing your allowed downtime by a factor of 10. Going from 99.9% to 99.99% means reducing it by another factor of 10. Each additional nine is exponentially harder and more expensive to achieve.
The Complete Downtime Table
Here is the full breakdown across all common SLA tiers, calculated for yearly, monthly, weekly, and daily windows:
| SLA Level | Downtime/Year | Downtime/Month | Downtime/Week | Downtime/Day |
|-----------|---------------|----------------|---------------|--------------|
| 99% (two nines) | 3d 15h 36m | 7h 18m 17s | 1h 40m 48s | 14m 24s |
| 99.5% | 1d 19h 48m | 3h 39m 8s | 50m 24s | 7m 12s |
| 99.9% (three nines) | 8h 45m 36s | 43m 49s | 10m 4s | 1m 26s |
| 99.95% | 4h 22m 48s | 21m 54s | 5m 2s | 43s |
| 99.99% (four nines) | 52m 34s | 4m 23s | 1m 0s | 8.6s |
| 99.999% (five nines) | 5m 15s | 26.3s | 6.0s | 0.86s |

Yearly figures assume a 365-day year; monthly figures assume an average month of about 30.44 days.
Let those numbers sink in. A 99.9% SLA allows nearly 44 minutes of downtime per month. That is enough time for a failed deployment to take down your service, your team to notice via an alert, investigate the root cause, roll back, and verify recovery — if everything goes smoothly.
A 99.99% SLA allows 4 minutes and 23 seconds per month. At that level, a human cannot be in the incident response loop. You need automated failover, health checks running every few seconds, and zero-downtime deployment strategies.
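The arithmetic behind these budgets is a single multiplication. The sketch below is illustrative only: `allowedDowntimeSeconds` and the constants are names introduced here, and the yearly figure assumes a 365-day year.

```javascript
const DAY_SECONDS = 24 * 3600;
const YEAR_SECONDS = 365 * DAY_SECONDS; // assumption: 365-day year

// Allowed downtime in seconds for an SLA target (e.g. 99.9) over a window.
function allowedDowntimeSeconds(slaPercent, windowSeconds) {
  return windowSeconds * (1 - slaPercent / 100);
}

// Each extra nine divides the yearly budget by ten.
for (const sla of [99, 99.9, 99.99, 99.999]) {
  const minutes = allowedDowntimeSeconds(sla, YEAR_SECONDS) / 60;
  console.log(`${sla}% allows about ${minutes.toFixed(1)} minutes per year`);
}
```

Passing a different window (a day, a week, a billing month) gives the other columns; the exact monthly value depends on which month length you assume.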
Calculating Uptime: The JavaScript Formula
The formula is straightforward:
Uptime % = ((Total Time - Downtime) / Total Time) * 100

Here is a complete JavaScript implementation that calculates uptime from incident data:
/**
 * Calculate SLA uptime percentage from incident records.
 *
 * @param {Date} periodStart - Start of the measurement period
 * @param {Date} periodEnd - End of the measurement period
 * @param {Array<{start: Date, end: Date, partial: boolean}>} incidents
 * @param {number} [partialWeight=0.5] - Fraction of a partial outage counted as downtime
 * @returns {Object} Uptime metrics
 */
function calculateUptime(periodStart, periodEnd, incidents, partialWeight = 0.5) {
  const totalMs = periodEnd.getTime() - periodStart.getTime();

  // Calculate total downtime in milliseconds.
  // Note: incidents are summed independently, so overlapping incidents are double-counted.
  let downtimeMs = 0;
  for (const incident of incidents) {
    // Clamp incident to the measurement period
    const incidentStart = Math.max(
      incident.start.getTime(),
      periodStart.getTime()
    );
    const incidentEnd = Math.min(
      incident.end.getTime(),
      periodEnd.getTime()
    );
    if (incidentEnd > incidentStart) {
      if (incident.partial) {
        // Partial outage counts as a fraction of downtime (50% by default)
        downtimeMs += (incidentEnd - incidentStart) * partialWeight;
      } else {
        // Full outage counts as 100% downtime
        downtimeMs += incidentEnd - incidentStart;
      }
    }
  }

  const uptimeMs = totalMs - downtimeMs;
  const uptimePercent = (uptimeMs / totalMs) * 100;
  return {
    totalMinutes: totalMs / 60000,
    downtimeMinutes: downtimeMs / 60000,
    uptimeMinutes: uptimeMs / 60000,
    uptimePercent: uptimePercent,
    uptimeFormatted: uptimePercent.toFixed(4) + "%",
    slaMet: checkSlaCompliance(uptimePercent),
    downtimeFormatted: formatDuration(downtimeMs),
  };
}
/**
 * Check SLA compliance against common tier thresholds.
 */
function checkSlaCompliance(uptimePercent) {
  return {
    "99%": uptimePercent >= 99,
    "99.5%": uptimePercent >= 99.5,
    "99.9%": uptimePercent >= 99.9,
    "99.95%": uptimePercent >= 99.95,
    "99.99%": uptimePercent >= 99.99,
    "99.999%": uptimePercent >= 99.999,
  };
}
/**
 * Format milliseconds into a human-readable duration.
 */
function formatDuration(ms) {
  const seconds = Math.floor(ms / 1000);
  const minutes = Math.floor(seconds / 60);
  const hours = Math.floor(minutes / 60);
  const days = Math.floor(hours / 24);
  // Template literals need backticks; without them these lines are a syntax error.
  if (days > 0) return `${days}d ${hours % 24}h ${minutes % 60}m`;
  if (hours > 0) return `${hours}h ${minutes % 60}m ${seconds % 60}s`;
  if (minutes > 0) return `${minutes}m ${seconds % 60}s`;
  return `${seconds}s`;
}
// ── Example Usage ──────────────────────────────────────────
const periodStart = new Date("2026-06-01T00:00:00Z");
const periodEnd = new Date("2026-07-01T00:00:00Z");
const incidents = [
  {
    // Full outage: 45 minutes during a deployment
    start: new Date("2026-06-12T14:00:00Z"),
    end: new Date("2026-06-12T14:45:00Z"),
    partial: false,
  },
  {
    // Partial degradation: slow responses for 2 hours
    start: new Date("2026-06-20T08:00:00Z"),
    end: new Date("2026-06-20T10:00:00Z"),
    partial: true,
  },
];
const result = calculateUptime(periodStart, periodEnd, incidents);
console.log(result);
// {
//   totalMinutes: 43200,
//   downtimeMinutes: 105,
//   uptimeMinutes: 43095,
//   uptimePercent: 99.7569...,
//   uptimeFormatted: "99.7569%",
//   slaMet: {
//     "99%": true,
//     "99.5%": true,
//     "99.9%": false,   <-- 99.9% SLA breached
//     "99.95%": false,
//     "99.99%": false,
//     "99.999%": false
//   },
//   downtimeFormatted: "1h 45m 0s"
// }

This implementation handles several real-world complexities: it clamps each incident to the measurement period, counts partial outages at a reduced weight, and checks the result against every common SLA tier in one pass.
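One case `calculateUptime` does not cover is overlapping incidents: because each record is summed independently, two alerts for the same outage would be counted twice. A minimal sketch of a pre-processing step follows. `mergeIncidents` is a hypothetical helper, and treating a merged span as a full outage whenever any member was full is a conservative assumption that can overstate downtime.

```javascript
// Merge overlapping or touching incidents so shared time is counted once.
// Assumption: a merged span is "partial" only if every incident in it was partial.
function mergeIncidents(incidents) {
  const sorted = [...incidents].sort((a, b) => a.start - b.start);
  const merged = [];
  for (const incident of sorted) {
    const last = merged[merged.length - 1];
    if (last && incident.start <= last.end) {
      // Overlap: extend the previous span instead of adding a new one
      last.end = new Date(Math.max(last.end, incident.end));
      last.partial = last.partial && incident.partial;
    } else {
      merged.push({ ...incident });
    }
  }
  return merged;
}
```

The result has the same shape as the input, so it can be passed straight to `calculateUptime` in place of the raw incident list.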
What Counts as Downtime?
This is where SLA definitions get contentious. Different providers define downtime differently:
Total Outage Only
Some providers only count complete, total outages. If your API returns 500 errors for 80% of requests but 200 for the other 20%, they consider the service "up." This is the most provider-friendly definition and the least useful for customers.
Error Rate Threshold
A better definition: downtime begins when the error rate exceeds a threshold (e.g., 5% of requests return errors). This captures partial outages that meaningfully impact users.
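As a rough sketch of how an error-rate definition could be applied, the function below counts "down" minutes from per-minute request counts. The bucket shape `{ total, errors }` and the function name are assumptions made for illustration, and the 5% default simply mirrors the example threshold above.

```javascript
// Count downtime minutes from per-minute request buckets.
// Each bucket is assumed to look like { total: number, errors: number }.
function downtimeMinutesByErrorRate(buckets, threshold = 0.05) {
  let downMinutes = 0;
  for (const { total, errors } of buckets) {
    // Minutes with no traffic are treated as "up" here; that is a policy choice.
    if (total > 0 && errors / total > threshold) {
      downMinutes += 1;
    }
  }
  return downMinutes;
}

const sampleBuckets = [
  { total: 1000, errors: 10 },  // 1% errors: up
  { total: 1000, errors: 51 },  // 5.1% errors: down
  { total: 200, errors: 160 },  // 80% errors: down
];
console.log(downtimeMinutesByErrorRate(sampleBuckets)); // 2
```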
Response Time Threshold
The strictest definition: downtime includes periods where response times exceed an acceptable threshold (e.g., p95 latency > 2 seconds). Slow is the new down for many applications.
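A latency-based definition needs a percentile calculation. The sketch below uses the nearest-rank method, which is one of several common percentile definitions; `percentile` and `isSlowWindow` are illustrative names, and the 2000 ms default mirrors the example threshold above.

```javascript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples, p) {
  if (samples.length === 0) return NaN;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// A window counts as "slow" when its p95 latency exceeds the threshold.
// An empty window yields NaN, which compares as not slow.
function isSlowWindow(latenciesMs, thresholdMs = 2000) {
  return percentile(latenciesMs, 95) > thresholdMs;
}
```

Windows flagged by `isSlowWindow` could then be recorded as partial incidents rather than full outages, depending on how your SLA words it.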
Scheduled Maintenance
Most SLAs exclude scheduled maintenance from downtime calculations. This is reasonable as long as maintenance windows are communicated in advance and kept within agreed limits (e.g., no more than 4 hours per month).
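One possible way to apply a maintenance exclusion is sketched below. The convention is an assumption, not a standard: maintenance up to the agreed cap is removed from the measured period, and any overrun beyond the cap is counted as downtime. The function name and the 240-minute default (the 4-hour example above) are illustrative.

```javascript
// Uptime percentage with scheduled maintenance excluded up to an agreed cap.
// Assumption: maintenance beyond the cap is treated as ordinary downtime.
function uptimeExcludingMaintenance(totalMin, unplannedDownMin, maintenanceMin, maintenanceCapMin = 240) {
  const excludedMin = Math.min(maintenanceMin, maintenanceCapMin);
  const overrunMin = maintenanceMin - excludedMin;
  const measuredMin = totalMin - excludedMin;
  const downMin = unplannedDownMin + overrunMin;
  return ((measuredMin - downMin) / measuredMin) * 100;
}
```

Real contracts spell out their own rules here, so check the wording before reusing this formula.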
A good SLA clearly defines what constitutes downtime. A bad SLA uses vague language that lets the provider redefine downtime after the fact.
The Cost of Each Nine
Adding a nine to your uptime is not just a technical challenge — it is an economic one:
| Level | Downtime/Year | What It Requires | Approximate Cost Premium |
|-------|---------------|------------------|--------------------------|
| 99% | 3.65 days | Single server, basic monitoring | Baseline |
| 99.9% | 8h 45m | Redundancy, health checks, alerting | 2-5x |
| 99.99% | 52m | Multi-AZ, auto-failover, zero-downtime deploys | 10-20x |
| 99.999% | 5m 15s | Multi-region, active-active, sub-second failover | 50-100x |
The jump from 99.9% to 99.99% typically requires multi-AZ deployment, automatic failover, and zero-downtime deploys.
Each of these adds operational complexity and infrastructure cost. The question is not "can we achieve 99.99%?" but "does the business value justify the engineering investment?"
Measuring Uptime Correctly
You cannot claim an uptime number without measuring it. And how you measure determines how accurate the number is.
Synthetic Monitoring
Synthetic monitors send requests to your service at regular intervals from multiple geographic locations. This is the industry standard for uptime measurement.
// PingCheck synthetic monitoring configuration
const createUptimeMonitor = async () => {
  const response = await fetch("https://api.luxkern.com/v1/pingcheck/monitors", {
    method: "POST",
    headers: {
      "Authorization": "Bearer YOUR_API_KEY",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      name: "Production API",
      url: "https://api.yourproduct.com/health",
      method: "GET",
      interval: 30, // Check every 30 seconds
      timeout: 10000, // 10 second timeout
      expectedStatus: 200,
      expectedBody: '{"status":"ok"}', // Optional response body check
      regions: [
        "us-east-1",
        "eu-west-1",
        "ap-southeast-1",
      ],
      alertAfterFailures: 2, // Alert after 2 consecutive failures
      alertChannels: [
        { type: "slack", webhookUrl: "https://hooks.slack.com/..." },
        { type: "email", address: "oncall@company.com" },
      ],
      sla: {
        target: 99.9, // Track against 99.9% SLA
        period: "monthly",
        notifyOnBreach: true,
      },
    }),
  });
  return response.json();
};

Check Interval Matters
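The frequency of your uptime checks directly affects the accuracy of your measurement. An outage shorter than the check interval can go undetected entirely, and the start and end of any detected outage are known only to within one interval.
If you are claiming 99.99% uptime but only checking every 5 minutes, your number is unreliable: the entire monthly budget at that tier is about 4 minutes and 23 seconds, less than a single check interval. The measurement granularity must match the SLA precision.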
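A quick way to sanity-check an interval is to compare it with the downtime budget it is supposed to measure. The sketch below assumes a 30-day period by default; `intervalVsBudget` and its field names are introduced here for illustration.

```javascript
// Compare a check interval with the downtime budget for an SLA target.
// Assumption: a 30-day measurement period unless another is passed in.
function intervalVsBudget(slaPercent, intervalSeconds, periodSeconds = 30 * 24 * 3600) {
  const budgetSeconds = periodSeconds * (1 - slaPercent / 100);
  return {
    budgetSeconds,
    // Fraction of the whole budget that one check interval represents
    intervalShareOfBudget: intervalSeconds / budgetSeconds,
  };
}

const fiveMinuteChecks = intervalVsBudget(99.99, 300);
console.log(fiveMinuteChecks.intervalShareOfBudget > 1); // true: one interval exceeds the budget
```

For comparison, a 30-second interval against a 99.9% target is a little over 1% of the monthly budget, which leaves far more room for measurement error.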
Multi-Region Checking
A single monitoring location gives you a single perspective. Your service might be up in Virginia but down in Frankfurt because of a DNS issue, a CDN misconfiguration, or a regional cloud provider outage. Always monitor from at least three regions.
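How results from several regions are combined is a design choice. The sketch below shows one simple rule, assumed here rather than taken from any particular tool: all regions passing means "up", all failing means "down", and a mix means "degraded". The input shape (region name mapped to a boolean) is also an assumption.

```javascript
// Classify one round of checks from per-region results.
// `results` is assumed to map region name -> boolean (true = check passed).
function classifyRound(results) {
  const regions = Object.keys(results);
  const failed = regions.filter((region) => !results[region]);
  if (failed.length === 0) return { status: "up", failed };
  if (failed.length === regions.length) return { status: "down", failed };
  return { status: "degraded", failed };
}

const round = classifyRound({
  "us-east-1": true,
  "eu-west-1": false,
  "ap-southeast-1": true,
});
console.log(round.status, round.failed); // degraded [ 'eu-west-1' ]
```

Keeping the list of failed regions makes it easier to tell a regional DNS or CDN problem apart from a full outage.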
SLA Credits and Financial Implications
Most SLAs include a credit mechanism: if the provider fails to meet the uptime guarantee, the customer receives a credit against their bill. Here is what typical credit structures look like:
| Monthly Uptime | Credit (% of monthly bill) |
|----------------|----------------------------|
| 99.0% - 99.9% | 10% |
| 95.0% - 99.0% | 25% |
| < 95.0% | 50% |
Notice the asymmetry: the provider risks a 10-50% credit, but the customer bears the full cost of the outage — lost revenue, damaged reputation, broken integrations, and engineering time investigating the impact. SLA credits never make you whole. They are a signal of commitment, not compensation.
This is why you monitor your providers independently rather than trusting their self-reported numbers.
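The example credit table above can be turned into a small lookup. This is a sketch under two assumptions: the guarantee is 99.9% (so anything at or above it earns no credit), and each band includes its lower bound. Real contracts define their own bands and boundaries.

```javascript
// Credit percentage for a month, following the example table above.
// Assumptions: 99.9% target, lower bound of each band inclusive.
function slaCreditPercent(monthlyUptimePercent) {
  if (monthlyUptimePercent >= 99.9) return 0; // SLA met, no credit
  if (monthlyUptimePercent >= 99.0) return 10;
  if (monthlyUptimePercent >= 95.0) return 25;
  return 50;
}

// Credit amount in the same currency as the bill.
function slaCreditAmount(monthlyBill, monthlyUptimePercent) {
  return (monthlyBill * slaCreditPercent(monthlyUptimePercent)) / 100;
}

console.log(slaCreditAmount(2000, 99.7569)); // 200
```

A month at 99.7569% uptime on a 2,000 bill yields a credit of 200, which illustrates the asymmetry: the credit is fixed by the bill, while the cost of the outage is not.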
The SLA Calculator: A Free Tool
If you do not want to do the math manually, use the Luxkern SLA Calculator, a free tool for working out these numbers.
Bookmark it. You will use it every time you negotiate a vendor contract or review your own SLA.
Building an Internal SLA Dashboard
For teams that want real-time SLA tracking, here is a pattern using the PingCheck API:
/**
 * Fetch SLA compliance data for all monitors
 * and generate a dashboard summary.
 */
async function generateSlaDashboard() {
  const response = await fetch(
    "https://api.luxkern.com/v1/pingcheck/monitors?include=sla",
    {
      headers: { "Authorization": "Bearer YOUR_API_KEY" },
    }
  );
  const { monitors } = await response.json();
  const dashboard = monitors.map((monitor) => ({
    name: monitor.name,
    url: monitor.url,
    currentStatus: monitor.status, // "up" | "down" | "degraded"
    sla: {
      target: monitor.sla.target,
      currentMonth: monitor.sla.currentMonthUptime,
      previousMonth: monitor.sla.previousMonthUptime,
      trailing90Days: monitor.sla.trailing90DaysUptime,
      compliance: monitor.sla.currentMonthUptime >= monitor.sla.target
        ? "COMPLIANT"
        : "BREACHED",
      remainingBudget: calculateRemainingBudget(
        monitor.sla.target,
        monitor.sla.currentMonthUptime,
        monitor.sla.currentMonthTotalMinutes,
        monitor.sla.currentMonthDowntimeMinutes
      ),
    },
    incidents: {
      thisMonth: monitor.incidents.currentMonthCount,
      mttr: monitor.incidents.meanTimeToRecover, // in minutes
    },
  }));
  return dashboard;
}

function calculateRemainingBudget(target, current, totalMin, downtimeMin) {
  const allowedDowntimeMin = totalMin * (1 - target / 100);
  const remainingMin = allowedDowntimeMin - downtimeMin;
  return {
    allowedMinutes: allowedDowntimeMin.toFixed(1),
    usedMinutes: downtimeMin.toFixed(1),
    remainingMinutes: Math.max(0, remainingMin).toFixed(1),
    percentUsed: ((downtimeMin / allowedDowntimeMin) * 100).toFixed(1) + "%",
  };
}

This gives you a real-time view of your error budget: how much downtime you can still "spend" before breaching your SLA this month. When the remaining budget drops below 25%, it is time to freeze risky deployments and focus on stability.
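The 25% rule can be made explicit as a small policy function. This is a sketch only: `deploymentPolicy`, its return values, and the `freezeBelow` parameter are names introduced here, and the threshold is whatever your team agrees on.

```javascript
// Decide a deployment policy from the error budget.
// freezeBelow is the remaining share of budget under which risky deploys stop.
function deploymentPolicy(allowedMinutes, usedMinutes, freezeBelow = 0.25) {
  const remainingMinutes = Math.max(0, allowedMinutes - usedMinutes);
  const remainingShare = allowedMinutes > 0 ? remainingMinutes / allowedMinutes : 0;
  let action = "normal";
  if (usedMinutes > allowedMinutes) {
    action = "breached";
  } else if (remainingShare < freezeBelow) {
    action = "freeze";
  }
  return { remainingShare, action };
}

console.log(deploymentPolicy(43.2, 35).action); // "freeze"
```

`calculateRemainingBudget` returns its minute values as strings (from `toFixed`), so convert them with `Number()` before passing them in.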
Practical Recommendations by SLA Tier
Targeting 99.9% (Most SaaS Products)
Targeting 99.99% (High-Value Enterprise)
Targeting 99.999% (Financial, Healthcare)
The Uptime Number Is Not the Goal
Uptime is a proxy metric. The real goal is user experience. A service with 99.99% uptime but 5-second response times is worse than a service with 99.95% uptime and 100ms responses. When you optimize for uptime alone, you miss the forest for the trees.
Track uptime alongside latency, error rate, and throughput. These overlap with the "four golden signals" from Google's SRE book (latency, traffic, errors, and saturation), and together they give you a far more complete picture of service health than uptime alone.
For the math behind uptime calculations, read our detailed tutorial on how to calculate SLA uptime and downtime. For a broader introduction to monitoring, see our guide on what is uptime monitoring.