Incidents

An incident in Scaling represents a service disruption or degradation that requires your team’s attention. Each incident has a title, a severity level, and a status that moves forward through a fixed lifecycle — from the moment it’s opened until it’s fully resolved. Incidents can be assigned directly to a user, or handed to the current on-call responder for a schedule — which also attaches the matching escalation policy. Every transition is recorded in a permanent audit trail.

Severity levels

Severity indicates the impact of an incident. Set it when you create the incident and update it if conditions change.

Severity	When to use
`critical`	Complete outage or data loss affecting all users. Requires immediate action.
`high`	Major feature broken or significant performance degradation. Urgent response needed.
`medium`	Partial or intermittent impact. A workaround may exist.
`low`	Minor issue with minimal user impact. Can be addressed in normal working hours.

Status lifecycle

Every incident starts at investigating and moves forward through four statuses. Transitions are one-way — you cannot skip a step or go back to a previous status.

investigating → identified → monitoring → resolved

Status	Meaning
`investigating`	Your team is looking into the cause. Impact is not yet confirmed.
`identified`	The root cause is known. A fix is in progress.
`monitoring`	A fix has been applied. You are watching to confirm it holds.
`resolved`	The incident is over. No further action is required.

Status transitions are enforced. You cannot move an incident from `monitoring` back to `identified`, and you cannot skip directly from `investigating` to `resolved`. Progress through each step in order.

`resolved` is terminal. Once an incident is resolved, no further updates can be posted on it. The only mutation that survives the terminal status is redaction of an existing update. If you need a post-incident write-up, post it as a final public update before transitioning to `resolved`.
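The ordering rule above can be mirrored client-side to catch an invalid transition before calling the API. This is a minimal sketch, not the server's implementation: `can_transition` is a hypothetical helper, and treating a same-status "transition" as invalid is an assumption.

```python
# Lifecycle order as documented: each status may only advance to the next one.
STATUSES = ["investigating", "identified", "monitoring", "resolved"]


def can_transition(current: str, target: str) -> bool:
    """Return True only when target is the status immediately after current."""
    if current not in STATUSES or target not in STATUSES:
        return False
    # No skipping and no going back: the target index must be exactly current + 1.
    return STATUSES.index(target) == STATUSES.index(current) + 1
```

Because `resolved` is last in the list, nothing can follow it, which matches its terminal role.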

Incident Updates

An Incident Update is the single unit of timeline activity on an incident. Every update carries:

Field	Meaning
`body`	Free-text message (up to 10,000 characters). Optional for internal updates, required for public.
`statusChange`	Optional lifecycle transition. When set, the incident’s `status` advances in the same write.
`visibility`	Either `internal` (staff-only) or `public` (rendered on your status page). Default is internal.
`postedBy`	The user (or API key) who posted the update.
`postedAt`	Server-set ISO timestamp.

At least one of `body` or `statusChange` must be present. A public update always requires a non-empty `body`.

This single concept replaces what used to be two separate things: free-form internal notes, and the implicit “row written every time you transitioned status.” Today, both flow through the same shape; the difference is whether `body`, `statusChange`, or both are set.
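The field rules above can be checked before an update is submitted. The sketch below is illustrative only: `validate_update` and its error strings are hypothetical, and treating a whitespace-only body as empty is an assumption the source does not confirm.

```python
from typing import Optional

MAX_BODY_LENGTH = 10_000  # documented limit for `body`


def validate_update(
    body: Optional[str],
    status_change: Optional[str],
    visibility: str = "internal",  # documented default
) -> list[str]:
    """Return a list of rule violations; an empty list means the update looks valid."""
    errors = []
    # Assumption: whitespace-only text counts as an empty body.
    has_body = bool(body and body.strip())
    if visibility not in ("internal", "public"):
        errors.append("visibility must be 'internal' or 'public'")
    if not has_body and status_change is None:
        errors.append("at least one of body or statusChange must be present")
    if visibility == "public" and not has_body:
        errors.append("a public update requires a non-empty body")
    if body is not None and len(body) > MAX_BODY_LENGTH:
        errors.append("body exceeds 10,000 characters")
    return errors
```

An internal update that carries only a `statusChange` passes, while the same update marked `public` fails for lack of a body.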

Visibility — internal vs public

Visibility	Where it renders
`internal`	Staff incident timeline only. If `statusChange` is set, the transition still appears as a bare row on the public page, but the body never does.
`public`	Staff timeline and the public status page — the body is the message customers read.

Every input surface (web, Slack, MCP, public API) defaults to internal. Publishing is always an explicit, deliberate action — you cannot publish by accident.
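To make the visibility table concrete, here is a rough sketch of how a public timeline could be derived from a list of updates. `public_rows` and the row shape it returns are hypothetical; only the field names `visibility`, `body`, `statusChange`, and `postedAt` come from this page.

```python
def public_rows(updates: list[dict]) -> list[dict]:
    """Project updates onto what a public status page would show."""
    rows = []
    for update in updates:
        # A missing visibility is treated as the documented default, internal.
        visibility = update.get("visibility", "internal")
        if visibility == "public":
            # Public updates show their body (and any transition they carry).
            rows.append({
                "postedAt": update["postedAt"],
                "statusChange": update.get("statusChange"),
                "body": update["body"],
            })
        elif update.get("statusChange"):
            # Internal update with a transition: bare row, body never shown.
            rows.append({
                "postedAt": update["postedAt"],
                "statusChange": update["statusChange"],
                "body": None,
            })
        # Internal updates without a transition stay staff-only.
    return rows
```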

Publishing requires a covering Status Page

Posting a public update is rejected with NO_PUBLIC_SURFACE (400) unless your org has at least one published Status Page whose selected components overlap with the incident’s affected components. The check is enforced server-side before any write, which prevents the silent-failure case of publishing into the void: writing a public message that no surface actually renders.

Configure your Status Page to include the affected components before publishing. See Status pages for component selection.
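The coverage rule amounts to a set-overlap test. The following sketch shows the idea under assumed data shapes: `has_public_surface` is hypothetical, as are the `published` and `componentIds` keys on a Status Page record.

```python
def has_public_surface(incident_component_ids: list[str], status_pages: list[dict]) -> bool:
    """True if any published page selects at least one of the incident's components."""
    affected = set(incident_component_ids)
    return any(
        # Assumed page shape: {"published": bool, "componentIds": [...]}
        page["published"] and bool(affected & set(page["componentIds"]))
        for page in status_pages
    )
```

Under this reading, an incident with no affected components can never be covered, and an unpublished page does not count even if its components match.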

Redacting a published mistake

Updates are append-only. To correct or remove a previous statement, redact it and post a new one — the system never silently edits a customer-visible record.

Caller	Can redact when
The original author	Within 5 minutes of `postedAt`.
An organization admin	At any time.

Redaction wipes the body, sets redactedAt and redactedBy, and preserves any statusChange the update carried — the system does not lie about lifecycle state. On the public status page, the slot remains visible at its original timestamp, rendered as “This update has been removed.” The original wording is gone, but the fact that something existed and was pulled back is visible. Redaction is permitted even on resolved incidents — it is the only mutation that survives the terminal status.
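The permission table and the effect of redaction can be sketched as two small functions. These are hypothetical helpers, not the product's code. Two details are assumptions: the 5-minute boundary is treated as inclusive, and a wiped body is represented as `None`.

```python
from datetime import datetime, timedelta, timezone

AUTHOR_REDACTION_WINDOW = timedelta(minutes=5)


def can_redact(update: dict, caller_id: str, caller_is_admin: bool, now: datetime) -> bool:
    """Admins may redact at any time; authors only within 5 minutes of postedAt."""
    if caller_is_admin:
        return True
    posted_at = datetime.fromisoformat(update["postedAt"])
    # Assumption: exactly 5 minutes after postedAt is still inside the window.
    return caller_id == update["postedBy"] and now - posted_at <= AUTHOR_REDACTION_WINDOW


def redact(update: dict, caller_id: str, now: datetime) -> dict:
    """Return a redacted copy: body wiped, statusChange preserved."""
    redacted = dict(update)
    redacted["body"] = None  # assumption: "wiped" is modelled as None
    redacted["redactedAt"] = now.isoformat()
    redacted["redactedBy"] = caller_id
    return redacted
```

Note that neither function looks at the incident's status, which reflects the rule that redaction is still permitted after `resolved`.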

Status history and audit trail

The incident detail view shows the full ordered timeline of updates: internal notes, public messages, and status transitions interleaved at their actual post times. Each entry records who posted it, when, and (for redacted updates) who redacted it and when. The legacy statusHistory field on the Get Incident response remains populated for backwards compatibility — it surfaces just the status transitions. For the full timeline (notes + transitions + public updates), call List Incident Updates.
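The relationship between the two views can be pictured as a filter: the transitions-only history is the subset of timeline entries that carry a `statusChange`. The sketch below is illustrative; `status_transitions` and its output shape are hypothetical and are not the actual `statusHistory` schema. It assumes `postedAt` values share one ISO format and timezone, so they sort correctly as strings.

```python
def status_transitions(updates: list[dict]) -> list[dict]:
    """Keep only the updates that changed status, in posting order."""
    # Assumption: uniform ISO timestamps sort chronologically as plain strings.
    ordered = sorted(updates, key=lambda u: u["postedAt"])
    return [
        {"status": u["statusChange"], "at": u["postedAt"], "by": u["postedBy"]}
        for u in ordered
        if u.get("statusChange")
    ]
```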

Creating an incident

When you create an incident, the following fields are available:

Field	Required	Notes
`title`	Yes	1–100 characters.
`description`	No	Additional context. Up to 10,000 characters.
`severity`	Yes	`critical`, `high`, `medium`, or `low`.
`ownerId`	Either	User ID to assign directly as the incident owner.
`ownerScheduleId`	Either	On-call schedule whose current responder becomes the owner. See below.
`componentIds`	No	UUIDs of affected components, up to 50.

At least one of ownerId or ownerScheduleId is required so the paging path always has a target.
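The field constraints in the table can be checked before sending a create request. This is a minimal sketch: `validate_new_incident` and its error messages are hypothetical, and it does not verify that `componentIds` are well-formed UUIDs.

```python
SEVERITIES = {"critical", "high", "medium", "low"}


def validate_new_incident(payload: dict) -> list[str]:
    """Return a list of violated constraints; empty means the payload looks valid."""
    errors = []
    title = payload.get("title") or ""
    if not 1 <= len(title) <= 100:
        errors.append("title must be 1-100 characters")
    if len(payload.get("description") or "") > 10_000:
        errors.append("description exceeds 10,000 characters")
    if payload.get("severity") not in SEVERITIES:
        errors.append("severity must be critical, high, medium, or low")
    # At least one owner field is required so the paging path has a target.
    if not payload.get("ownerId") and not payload.get("ownerScheduleId"):
        errors.append("one of ownerId or ownerScheduleId is required")
    if len(payload.get("componentIds") or []) > 50:
        errors.append("componentIds accepts at most 50 entries")
    return errors
```

For example, a payload with a title, a severity of `high`, and only `ownerScheduleId` set passes every check here.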

Assigning an on-call owner

Pass ownerScheduleId at creation time to hand the incident to the team that is currently on-call. The server resolves ownership and escalation in one step:

1. It looks up the current on-call responder for the schedule (including active overrides) and sets them as the incident owner.
2. It searches your escalation policies for one whose layers target that schedule, and attaches the match.

If you also supply ownerId, that user wins as the incident owner — the schedule is still used to find a matching escalation policy. If no one is currently on-call and no ownerId was supplied, the incident is still created without an owner; you can assign one later from the incident detail page.
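The precedence rules above can be summarised in a few lines. This sketch only illustrates the documented behaviour; `resolve_owner`, the policy and layer shapes, and the choice to take the first matching policy are all assumptions.

```python
from typing import Optional


def resolve_owner(
    owner_id: Optional[str],
    on_call_user_id: Optional[str],  # current responder for the schedule, if any
    schedule_id: Optional[str],
    escalation_policies: list[dict],
) -> tuple[Optional[str], Optional[str]]:
    """Return (owner, escalation policy id); either may be None."""
    # An explicit ownerId wins; otherwise fall back to whoever is on-call.
    owner = owner_id or on_call_user_id
    policy_id = None
    if schedule_id is not None:
        for policy in escalation_policies:
            # Assumed shape: {"id": ..., "layers": [{"scheduleId": ...}, ...]}
            if any(layer.get("scheduleId") == schedule_id for layer in policy["layers"]):
                policy_id = policy["id"]  # assumption: first match is attached
                break
    return owner, policy_id
```

When nobody is on-call and no `ownerId` is given, the owner comes back as `None`, matching the case where the incident is created without an owner.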

For critical and high severity incidents, pass ownerScheduleId so the right responders are paged automatically through the matching escalation policy. See Escalation Policies for how policies and schedules connect.
