Generate Incident Postmortems Automatically

Datadog · Splunk · GitHub · Slack

When an incident occurs in Datadog, this runbook compiles logs from Splunk and code context from GitHub into a draft postmortem. The summary is posted to Slack with a link to the generated document.

TL;DR

This runbook streamlines post-incident reviews by gathering alert data, relevant logs, and code diffs to generate a first-draft postmortem document and post it in Slack.

Who is this for?

SREs, platform engineers, and tech leads who want faster, more consistent postmortem documentation following production incidents.

What problem does this solve?

After an incident, engineers often forget to capture key context, or they spend hours gathering it from multiple tools. This automation saves that time and improves accountability by giving each critical incident a written first draft.

Solves:

  • Inconsistent postmortems
  • Manual digging through logs and commits
  • Poor knowledge sharing after incidents

What this workflow accomplishes

  • Listens for Datadog incidents with severity “critical”
  • Queries Splunk logs from the incident time window
  • Fetches GitHub commits and PRs that reference the incident ID or the affected service
  • Generates a Google Doc or Markdown summary
  • Posts a link and summary to the #incidents Slack channel

Integrations

This runbook uses the following integrations:

  • Datadog Agent: Detects new incidents and extracts tags and timestamps.
  • Splunk Agent: Pulls recent logs tied to the incident scope.
  • GitHub Agent: Queries PRs and commits matching the incident tag or affected service.
  • Slack Agent: Notifies the team with the generated report and preview.

Setup

  • Datadog:
    • Valid API and App Keys
    • Incident integration enabled
    • Datadog Agent installed
  • Splunk:
    • API token with access to logs
    • Logs tagged with service, env, etc.
    • Splunk Agent installed
  • GitHub:
    • OAuth or PAT with repo scope
    • Commit messages or PRs reference incident IDs (e.g. INC-123)
    • GitHub Agent installed
  • Slack:
    • Bot token with chat:write permissions
    • Slack Agent installed
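
The GitHub requirement above only works if incident IDs are written in a predictable form. As a rough sketch, assuming IDs always look like INC- followed by digits (the only format shown in this post), a small helper can check whether a commit message or PR title carries a usable reference. The function name and the sample message are illustrative, not part of the runbook.

```python
import re

# Assumption: incident IDs follow the "INC-<digits>" pattern shown above.
INCIDENT_ID = re.compile(r"\bINC-\d+\b")


def incident_ids(text: str) -> list[str]:
    """Return the distinct incident IDs mentioned in text, in first-seen order."""
    return list(dict.fromkeys(INCIDENT_ID.findall(text)))


# Made-up commit message, for illustration only.
message = "Fix timeout (INC-123), follow-up to INC-123 and INC-7"
print(incident_ids(message))
```

A check like this could run in CI or a commit hook so that references stay searchable; if your team uses a different ID scheme, the pattern needs to change accordingly.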

Runbook Template

📚 runbook.mdx
Runbook

Objective: Automatically generate an incident postmortem by compiling logs and code context, then post the report to Slack.

Steps:

(1) Use the Datadog Agent with the list_incidents tool.

  • Filter for incidents where severity = critical and status = active
  • Extract: incidentId, title, service, tags, created_at
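
To make the filter in step (1) concrete, here is a minimal sketch of the selection logic applied to a list of incident records. It assumes each record is a flat dictionary with severity and status keys alongside the fields named above; the actual shape returned by list_incidents is not documented here and may differ. The sample records are made up.

```python
from typing import Any

# Fields the runbook asks to extract in step (1).
FIELDS = ("incidentId", "title", "service", "tags", "created_at")


def select_critical_active(incidents: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep critical, active incidents and project them onto the runbook fields."""
    selected = []
    for incident in incidents:
        if incident.get("severity") == "critical" and incident.get("status") == "active":
            selected.append({field: incident.get(field) for field in FIELDS})
    return selected


# Made-up records; the flat record shape is an assumption.
sample = [
    {"incidentId": "INC-123", "title": "Checkout latency", "service": "checkout",
     "tags": ["env:prod"], "created_at": "2025-07-21T10:00:00+00:00",
     "severity": "critical", "status": "active"},
    {"incidentId": "INC-124", "title": "Slow dashboard", "service": "web",
     "tags": [], "created_at": "2025-07-21T10:05:00+00:00",
     "severity": "low", "status": "active"},
    {"incidentId": "INC-125", "title": "Queue backlog", "service": "worker",
     "tags": [], "created_at": "2025-07-20T08:00:00+00:00",
     "severity": "critical", "status": "resolved"},
]
selected = select_critical_active(sample)
print([incident["incidentId"] for incident in selected])
```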

(2) Use the Splunk Agent with the search_splunk tool.

  • search_query: logs for the service tag or incident ID
  • earliest_time: 1 hour before created_at
  • latest_time: now
  • Limit: 100 lines
  • Extract sample logs for triage
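
The time window in step (2) is simple date arithmetic. The sketch below shows one way to derive the search parameters, assuming created_at is an ISO 8601 timestamp and that logs expose a service field. The parameter names search_query, earliest_time, and latest_time come from the step above; the limit key and the exact query wording are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


def build_search_params(service: str, incident_id: str, created_at: str,
                        now: Optional[datetime] = None, limit: int = 100) -> dict[str, Any]:
    """Build search parameters covering one hour before the incident until now."""
    created = datetime.fromisoformat(created_at)
    if created.tzinfo is None:
        # Assumption: timestamps without an offset are UTC.
        created = created.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return {
        "search_query": f'service="{service}" OR "{incident_id}"',
        "earliest_time": (created - timedelta(hours=1)).isoformat(),
        "latest_time": now.isoformat(),
        "limit": limit,  # hypothetical key for the 100-line cap
    }


params = build_search_params(
    "checkout", "INC-123", "2025-07-21T10:00:00+00:00",
    now=datetime(2025, 7, 21, 10, 30, tzinfo=timezone.utc),
)
print(params["earliest_time"], params["latest_time"])
```

Passing now explicitly keeps the function easy to test; in a live run it defaults to the current UTC time.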

(3) Use the GitHub Agent with the search_code tool.

  • Query for INCIDENT_ID or related service name across commits and PRs
  • Extract commit messages, PR titles, timestamps, authors
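
Once step (3) has returned candidate commits and PRs, they still need to be narrowed down and flattened into report lines. The following sketch assumes each result has already been normalized into a dictionary with kind, text, timestamp, and author keys; that shape, the function name, and the sample data are all hypothetical.

```python
from typing import Any


def format_github_refs(items: list[dict[str, Any]], incident_id: str, service: str) -> str:
    """Return one report line per commit or PR that mentions the incident or service."""
    needles = (incident_id.lower(), service.lower())
    lines = []
    for item in items:
        text = item["text"]
        # Case-insensitive substring match on the commit message or PR title.
        if any(needle in text.lower() for needle in needles):
            lines.append(f'- [{item["kind"]}] {item["timestamp"]} {item["author"]}: {text}')
    return "\n".join(lines)


# Made-up results, for illustration only.
items = [
    {"kind": "commit", "text": "Fix retry loop (INC-123)",
     "timestamp": "2025-07-21T09:40:00Z", "author": "alice"},
    {"kind": "pr", "text": "Bump checkout client timeout",
     "timestamp": "2025-07-20T16:05:00Z", "author": "bob"},
    {"kind": "commit", "text": "Update README",
     "timestamp": "2025-07-19T11:00:00Z", "author": "carol"},
]
github_refs = format_github_refs(items, "INC-123", "checkout")
print(github_refs)
```

Substring matching on a short service name can pull in unrelated changes, so a stricter match may be worth it for services with generic names.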

(4) Assemble a postmortem report with:

🛠️ Incident Postmortem Draft

Incident: {{title}}

ID: {{incidentId}}

Service: {{service}}

Time: {{created_at}}

Top Logs:

{{splunk_logs}}

Relevant Commits / PRs:

{{github_refs}}

Please update the draft report here: [Google Doc or Markdown link]
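
The double-brace placeholders in the draft above can be filled with a few lines of code. This is a minimal sketch of that substitution, not necessarily how the runbook engine renders templates. Placeholders without a matching value are left untouched so that gaps stay visible in the draft. The sample values are made up.

```python
import re

TEMPLATE = """Incident: {{title}}
ID: {{incidentId}}
Service: {{service}}
Time: {{created_at}}
Top Logs:
{{splunk_logs}}
Relevant Commits / PRs:
{{github_refs}}"""

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, values: dict[str, object]) -> str:
    """Replace {{name}} placeholders with values; unknown names stay as written."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        return str(values[key]) if key in values else match.group(0)
    return PLACEHOLDER.sub(substitute, template)


# Made-up values, for illustration only.
report = render(TEMPLATE, {
    "title": "Checkout latency",
    "incidentId": "INC-123",
    "service": "checkout",
    "created_at": "2025-07-21T10:00:00+00:00",
    "splunk_logs": "ERROR upstream timeout",
})
print(report)
```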

(5) Use the Slack Agent to post the above message to #incidents.

  • Tag the @incident-review group
  • Include link to editable doc
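
For readers wiring step (5) up by hand, here is a rough sketch of the message payload. The channel and text keys match the arguments of Slack's chat.postMessage method, and user groups are mentioned in message text with the <!subteam^ID> form rather than a plain @name. The group ID and document URL below are placeholders, and the Slack Agent may build its message differently.

```python
def build_slack_message(report: str, doc_url: str, group_id: str,
                        channel: str = "#incidents") -> dict[str, str]:
    """Build a payload that mentions the review group and links the editable draft."""
    mention = f"<!subteam^{group_id}>"  # Slack's mention syntax for user groups
    text = f"{mention}\n{report}\n\nEditable draft: {doc_url}"
    return {"channel": channel, "text": text}


# Placeholder values: look up the real ID of the incident-review group in Slack.
payload = build_slack_message(
    report="Incident: Checkout latency (INC-123)",
    doc_url="https://example.com/postmortem-draft",
    group_id="S0000000",
)
print(payload["text"])
```

Depending on how the bot is installed, the channel may need to be given as a channel ID instead of a name.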

Alexis Warner

Marketing

Jul 21, 2025

5 min read

Categories

    engineering

    incident-response

    postmortem

    datadog

    github

    splunk

    slack

