How to Write an Incident Communication That Doesn't Make Things Worse
Bad incident communications can cause more damage than the outage itself. Here's how to write updates that inform, reassure, and actually help your customers.
Your site is down. You need to tell your customers. And the message you write in the next few minutes will shape their perception of your company for months.
Get it right, and customers will praise your transparency. Get it wrong, and you'll turn a technical incident into a PR crisis.
The Golden Rules
1. Be Fast
The first update should go out within 5 minutes of confirming an incident. A brief message now beats a perfect message 30 minutes later, because customers spend those 30 minutes assuming the worst.
2. Be Honest
Never downplay the severity. If everything is broken, don't say "some users may experience intermittent issues." Customers can tell when you're minimizing, and it destroys trust.
3. Be Specific
Name the affected service. Describe the impact in terms customers understand. "Our API is returning errors on all endpoints" is better than "We're experiencing some difficulties."
4. Show Progress
Update every 10-15 minutes during active incidents. Even if the update is "We're still investigating," it shows you're working on it.
5. Skip the Jargon
Your customers don't need to know about Kubernetes pod evictions or PostgreSQL vacuum issues. Tell them what's broken and when you expect it to be fixed.
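The timing rules above (a first update within 5 minutes, then one every 10-15 minutes) are easy to lose track of mid-incident. Below is a minimal Python sketch of a deadline check that an incident bot or checklist script could run. The function names and constants are hypothetical, and the 15-minute gap is an assumption taken from the upper end of the range above.

```python
from datetime import datetime, timedelta

# Assumed thresholds based on the rules above; tune them for your team.
FIRST_UPDATE_DEADLINE = timedelta(minutes=5)
MAX_UPDATE_GAP = timedelta(minutes=15)


def next_update_due(confirmed_at, updates):
    """Return the time by which the next customer-facing update should go out.

    confirmed_at: datetime when the incident was confirmed.
    updates: datetimes of updates already posted (may be empty).
    """
    if not updates:
        # No update yet: the first acknowledgment is due shortly after confirmation.
        return confirmed_at + FIRST_UPDATE_DEADLINE
    # Otherwise the clock restarts from the most recent update.
    return max(updates) + MAX_UPDATE_GAP


def is_overdue(confirmed_at, updates, now):
    """True if the deadline for the next update has already passed."""
    return now > next_update_due(confirmed_at, updates)
```

For an incident confirmed at 12:00 with no updates posted, `next_update_due` returns 12:05. After an update at 12:04, the next one is due by 12:19.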
Templates for Each Phase
First Acknowledgment (Within 5 Minutes)
[Service Name] — Investigating Issues
We're aware that [specific service/feature] is currently [unavailable/experiencing errors/slower than normal]. Our engineering team is investigating.
We'll provide an update within 15 minutes.
Investigation Update (Every 10-15 Minutes)
[Service Name] — Update
We've identified the cause: [brief, non-technical description]. We're working on [what you're doing to fix it].
Estimated resolution: [time if known, or "We'll update in 15 minutes"]
Impact: [What customers are experiencing]
Mitigation Applied
[Service Name] — Fix Being Applied
We've implemented a fix for the [service/feature] issues. Service is being restored and we're monitoring closely.
Some customers may still experience [residual effects] for the next [timeframe].
Resolution
[Service Name] — Resolved
The issue affecting [service/feature] has been fully resolved as of [time]. All systems are operating normally.
What happened: [1-2 sentence summary]
Duration: [start time] to [end time] ([X] minutes)
What we're doing to prevent this: [Brief description]
We apologize for the disruption and appreciate your patience.
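If you keep these templates in a runbook, filling them in programmatically avoids posting a message with a leftover placeholder. The following is a rough sketch of rendering the Resolution template with Python's standard `string.Template`. The function name, parameter names, and the assumption that all timestamps are in UTC are illustrative, not part of any particular status page tool.

```python
from datetime import datetime
from string import Template

# The Resolution template from above, with $placeholders instead of [brackets].
RESOLVED_TEMPLATE = Template(
    "$service - Resolved\n"
    "\n"
    "The issue affecting $feature has been fully resolved as of $resolved_time. "
    "All systems are operating normally.\n"
    "\n"
    "What happened: $summary\n"
    "Duration: $start to $end ($minutes minutes)\n"
    "What we're doing to prevent this: $prevention\n"
    "\n"
    "We apologize for the disruption and appreciate your patience."
)


def render_resolved(service, feature, started_at, ended_at, summary, prevention):
    """Fill in the Resolution template; timestamps are assumed to be UTC."""
    if ended_at < started_at:
        raise ValueError("ended_at must not be earlier than started_at")
    # Whole minutes between start and end of the incident.
    minutes = int((ended_at - started_at).total_seconds() // 60)
    time_format = "%H:%M UTC"
    # substitute() raises KeyError if any placeholder is left unfilled.
    return RESOLVED_TEMPLATE.substitute(
        service=service,
        feature=feature,
        resolved_time=ended_at.strftime(time_format),
        summary=summary,
        start=started_at.strftime(time_format),
        end=ended_at.strftime(time_format),
        minutes=minutes,
        prevention=prevention,
    )
```

Using `substitute` rather than `safe_substitute` is a deliberate choice here: a missing field fails loudly before anything is posted, instead of reaching customers as a raw placeholder.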
Anti-Patterns to Avoid
The Non-Apology
❌ "We apologize for any inconvenience this may have caused."
✅ "We're sorry this happened. Your [orders/data/access] matter to us."
The Blame Shift
❌ "Due to an issue with our third-party provider..."
✅ "Our payment processing is currently unavailable. We're working with our providers to restore service."
The Vanishing Act
❌ Posting an initial update, then going silent for an hour. This is worse than not posting at all: customers know you're aware of the problem, yet you appear to have stopped caring.
The Premature All-Clear
❌ Declaring "resolved" before thoroughly verifying. Having to reopen an incident 20 minutes later erodes confidence.
✅ Monitor for at least 15 minutes after the fix before declaring resolved.
The Blame-the-User
❌ "Please clear your cache and try again."
✅ Only suggest user actions when you've genuinely confirmed it's a client-side issue.
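The 15-minute rule from The Premature All-Clear can be enforced with a small guard before anyone posts "Resolved." This is a minimal Python sketch under stated assumptions: the function name is hypothetical, and `check_results` stands in for whatever your monitoring produces, represented here as a list of `(timestamp, healthy)` pairs.

```python
from datetime import datetime, timedelta

# Assumed minimum observation window, taken from the guidance above.
MIN_STABLE_WINDOW = timedelta(minutes=15)


def can_declare_resolved(fix_applied_at, check_results, now):
    """Return True only if the fix has held for the full observation window.

    fix_applied_at: datetime when the fix was applied.
    check_results: list of (timestamp, healthy) tuples from monitoring.
    now: current datetime.
    """
    # Too early: the observation window has not elapsed yet.
    if now - fix_applied_at < MIN_STABLE_WINDOW:
        return False
    # Only checks taken after the fix say anything about the fix.
    since_fix = [healthy for ts, healthy in check_results if ts >= fix_applied_at]
    # Require at least one check, and every check since the fix must be healthy.
    return bool(since_fix) and all(since_fix)
```

Requiring at least one check since the fix means that an absence of monitoring data is treated as "not verified" rather than "healthy."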
Internal vs External Communication
External (Status Page, Email, Social)
- Keep it simple and non-technical
- Focus on customer impact and resolution
- Provide estimated timelines when possible
- Show empathy
Internal (Slack, Incident Channel)
- Be detailed and technical
- Include debugging information
- Track who's doing what
- Document decisions and their rationale
Preparing in Advance
Don't write incident communications under pressure. Prepare now:
- Create templates for each phase (saved in your runbook)
- Define who communicates — one person owns external comms during incidents
- Pre-authorize the communications person to post without management approval
- Set up distribution channels — status page, email lists, social accounts
- Practice — run through a simulated incident including the communication steps
The best incident communication feels honest, human, and helpful. It turns a negative experience into a moment that strengthens the customer relationship.
Written by
UptimeGuard Team