Spike.sh

Automating your Playbooks

Learn how to set up automated Playbooks so they run without any intervention from users.

Automating Playbooks helps teams handle incidents more effectively by running predefined actions automatically when certain conditions are met. It streamlines the incident response process and keeps incident handling consistent, regardless of the time of day or the availability of personnel. Example use cases appear later on this page.

By leveraging automation, teams can respond to incidents with the speed and precision required to minimize impact and maintain high levels of service availability and performance.

Setting Up Automation for Playbooks

Follow these steps to configure your Playbooks for automation in Spike.sh, making your incident response process more efficient and reliable.

Step 1: Choose Your Trigger Event

Incident Event: Currently, the primary trigger for automating a Playbook is an incident event, such as the creation of a new incident. Future updates will introduce more trigger options, like on-call shift changes.

Step 2: Apply to Integrations or Services

Selection: Decide whether the automation should apply to all integrations and services on Spike or only specific ones. This choice tailors the Playbook's execution to relevant areas of your operations.
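
The scope choice can be pictured as a simple membership check. This is a minimal sketch, not Spike's implementation: the function name `in_scope` and the convention that `None` stands for "all integrations and services" are assumptions made for illustration.

```python
from typing import FrozenSet, Optional


def in_scope(source: str, selected: Optional[FrozenSet[str]]) -> bool:
    """Return True if an incident's source falls inside the automation's scope.

    `selected` is None when the automation applies to every integration and
    service (an assumed convention), otherwise the set of chosen names.
    """
    return selected is None or source in selected


# Applies everywhere.
print(in_scope("Datadog", None))
# Applies only to two hypothetical integrations.
print(in_scope("Datadog", frozenset({"Sentry", "Grafana"})))
```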

Step 3: Define Conditions (Optional)

  • Incident Title: Include keywords or use regular expressions to identify incidents by their titles.

  • Incident Details: Specify keywords or regex patterns in incident details, focusing on payload keys and values.

  • Incident Occurrences: Trigger the Playbook based on the frequency of an incident within a given timeframe.

  • Incident Priority or Severity: Set conditions based on the incident's priority (P1 to P5) or severity (SEV1 to SEV3).

Conditions can be combined using AND/OR logic for precise control, e.g., "Run Playbook if incident is P1 priority OR contains 'Down' in its title."
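
To make the AND/OR combination concrete, the conditions above can be modeled as small predicate functions. This is a sketch under stated assumptions: the `Incident` class, its field names, and all helper names are hypothetical and do not reflect Spike's data model, and the case-insensitive matching is a choice made here rather than documented behavior.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Callable, Dict, List, Optional


@dataclass
class Incident:
    # Hypothetical shape for illustration only; not Spike's data model.
    title: str
    details: Dict[str, str] = field(default_factory=dict)
    priority: Optional[str] = None  # "P1" to "P5"
    severity: Optional[str] = None  # "SEV1" to "SEV3"
    occurrences: List[datetime] = field(default_factory=list)


Condition = Callable[[Incident], bool]


def title_matches(pattern: str) -> Condition:
    """Match a keyword or regex against the incident title."""
    regex = re.compile(pattern, re.IGNORECASE)  # case-insensitivity is assumed
    return lambda incident: regex.search(incident.title) is not None


def detail_matches(key: str, pattern: str) -> Condition:
    """Match a keyword or regex against one payload value."""
    regex = re.compile(pattern, re.IGNORECASE)
    return lambda incident: regex.search(incident.details.get(key, "")) is not None


def priority_is(priority: str) -> Condition:
    return lambda incident: incident.priority == priority


def occurred_at_least(count: int, window: timedelta) -> Condition:
    """True if `count` or more occurrences fall within `window` of the latest one."""
    def check(incident: Incident) -> bool:
        if not incident.occurrences:
            return False
        latest = max(incident.occurrences)
        recent = [t for t in incident.occurrences if latest - t <= window]
        return len(recent) >= count
    return check


def any_of(*conditions: Condition) -> Condition:
    """OR: at least one condition holds."""
    return lambda incident: any(c(incident) for c in conditions)


def all_of(*conditions: Condition) -> Condition:
    """AND: every condition holds."""
    return lambda incident: all(c(incident) for c in conditions)


# "Run Playbook if incident is P1 priority OR contains 'Down' in its title."
should_run = any_of(priority_is("P1"), title_matches(r"\bdown\b"))
```

Because each condition is just a function, `any_of` and `all_of` can be nested to express rules such as "P1 AND (title contains 'Down' OR three occurrences within ten minutes)".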

Step 4: Select Incident Status Change Condition

Run on Any Incident: Choose to run the Playbook for any incident status change—triggered, acknowledged, or resolved. This setting ensures the Playbook runs for any incident that matches the conditions from Step 3.

Run When All Incidents: Opt to run the Playbook when all incidents in a set reach a certain status, like all being acknowledged or resolved. This approach is useful for coordinated actions across multiple incidents.
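
The difference between the two options can be sketched as two small checks. The function names and status strings below are assumptions for illustration. The sketch also assumes that "all" means every incident has exactly the target status and that an empty set never fires; Spike's actual behavior in those edge cases is not described here.

```python
from typing import Iterable, Set


def run_on_any(new_status: str, watched: Set[str]) -> bool:
    """'Run on Any Incident': fire as soon as one matching incident
    changes to a watched status."""
    return new_status in watched


def run_when_all(statuses: Iterable[str], target: str) -> bool:
    """'Run When All Incidents': fire only once every incident in the
    set has reached the target status."""
    statuses = list(statuses)
    return bool(statuses) and all(s == target for s in statuses)


# One incident was just acknowledged: the "any" mode fires.
print(run_on_any("acknowledged", {"acknowledged", "resolved"}))
# One incident in the set is still only acknowledged: the "all" mode waits.
print(run_when_all(["resolved", "acknowledged"], "resolved"))
```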

Once you've configured the trigger, applied the Playbook to the desired integrations or services, and set your conditions, your Playbook is ready to automate. Remember, every Playbook can also be triggered manually, offering flexibility in how you respond to incidents.

Example use cases for automations

There are two types of use cases here: those that run when an incident is triggered, and those that run when one or all incidents are acknowledged or resolved.

When Any incident is Triggered:

  1. Automatic service recovery: Detect a common incident and trigger external scripts via outbound webhook to restart servers, log diagnostics, run GitHub workflows, scale resources, back up your databases, etc.

  2. Keyword "Down" Incident: Automatically detect severity based on the keyword, alert the teams, and post system updates on your status page. Optionally, create a support ticket on Freshdesk, Zendesk, etc., or create a new issue on Linear, JIRA, etc.

  3. Critical Integration Incident: For SEV1 incidents on a selected integration, set priority P1, launch a War room (video conference), and run external scripts via outbound webhook.

  4. Customer-Service Outage: Automatically acknowledge the incident, create a Zendesk ticket, and prepare customer support for communication.

  5. Service Self-Recovery Attempts: Automatically acknowledge or resolve incidents raised by service restart attempts.

  6. Performance Degrade Alert: Trigger diagnostics via outbound webhooks and alert relevant teams if performance metrics fall below thresholds.
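
Several of the use cases above rely on an outbound webhook calling a script you host. The receiving side might look like the sketch below. Everything in it is hypothetical: the `action` and `service` payload keys, the action names, and the handler functions are assumptions for illustration, not the payload format Spike sends. Check the outbound webhook configuration in your dashboard for the real fields.

```python
import json
from typing import Callable, Dict


# Hypothetical recovery actions. Real ones would restart a server,
# collect logs, start a GitHub workflow, and so on.
def restart_service(service: str) -> str:
    return f"restart requested for {service}"


def collect_diagnostics(service: str) -> str:
    return f"diagnostics requested for {service}"


ACTIONS: Dict[str, Callable[[str], str]] = {
    "restart": restart_service,
    "diagnostics": collect_diagnostics,
}


def handle_webhook(body: bytes) -> str:
    """Decode a JSON webhook body and run the action it names.

    The "action" and "service" keys are assumed, not taken from Spike's docs.
    Unknown actions are ignored rather than raising, so a misconfigured
    Playbook cannot crash the receiver.
    """
    payload = json.loads(body)
    action = ACTIONS.get(payload.get("action", ""))
    if action is None:
        return "ignored: unknown action"
    return action(payload.get("service", "unknown"))


print(handle_webhook(b'{"action": "restart", "service": "api"}'))
```

The function takes raw bytes so it can be called from whichever HTTP framework receives the request.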

When one or all incidents are Acknowledged or Resolved:

  1. Post-Incident Review Initiation: Trigger an outbound webhook to schedule a review and compile reports once all deployment-related incidents are resolved.

  2. Automated System Health Rechecks: After connectivity issues are acknowledged, run an automated health check via outbound webhooks to confirm stability.

  3. Customer Communication Update: Automatically update the status page and email resolution summaries to customers when service incidents are resolved.

  4. Resource Scaling Down: Initiate resource optimization and scaling down after high-usage incidents are resolved.

  5. Infrastructure Restoration and Backup Verification: Automate service restoration and backup checks following database instability resolutions.

With automated Playbooks, you can streamline many parts of your incident management process.


Last updated 1 year ago
