> adobe-incident-runbook

Execute Adobe incident response procedures with triage, mitigation, and postmortem for Firefly Services, PDF Services, and I/O Events outages. Use when responding to Adobe-related incidents, investigating API failures, or running post-incident reviews. Trigger with phrases like "adobe incident", "adobe outage", "adobe down", "adobe on-call", "adobe emergency".

Adobe Incident Runbook

Overview

Rapid incident response procedures for Adobe API-related outages, covering IMS authentication failures, Firefly/Photoshop API downtime, PDF Services quota exhaustion, and I/O Events delivery failures.

Prerequisites

  • Access to Adobe Developer Console and Admin Console
  • Access to application monitoring (Grafana, Datadog, etc.)
  • kubectl access to production cluster (if applicable)
  • Communication channels (Slack, PagerDuty)

Severity Matrix

| Level | Definition | Response Time | Example |
|-------|------------|---------------|---------|
| P1 | Complete Adobe integration failure | < 15 min | IMS auth broken, all APIs down |
| P2 | Single API degraded | < 1 hour | Firefly 429s, Photoshop timeouts |
| P3 | Minor impact | < 4 hours | Webhook delays, slow PDF extraction |
| P4 | No user impact | Next business day | Monitoring gap, metric anomaly |

Quick Triage (Run These First)

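# 1. Is Adobe itself down?
#    (The HTTP code only confirms the status page is reachable; open the page to read reported incidents.)
curl -s -o /dev/null -w "Adobe Status: %{http_code}\n" https://status.adobe.com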

# 2. Can we generate an access token?
curl -s -o /dev/null -w "IMS Auth: %{http_code}\n" -X POST \
  'https://ims-na1.adobelogin.com/ims/token/v3' \
  -d "client_id=${ADOBE_CLIENT_ID}&client_secret=${ADOBE_CLIENT_SECRET}&grant_type=client_credentials&scope=${ADOBE_SCOPES}"

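# 3. Can we reach each API endpoint?
#    (Any HTTP code, even 401/403/404, means the host answered.)
for endpoint in firefly-api.adobe.io image.adobe.io pdf-services.adobe.io; do
  # On connection failure curl still prints "000", so set the fallback outside the substitution
  CODE=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 "https://$endpoint" 2>/dev/null) || CODE="UNREACHABLE"
  echo "$endpoint: $CODE"
done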

# 4. Check our app health
curl -sf https://your-app.com/health | python3 -m json.tool

# 5. Recent errors in our logs (last 5 min), including the 403 and 503 codes used elsewhere in this runbook
kubectl logs -l app=adobe-service --since=5m 2>/dev/null | grep -iE "error|failed|401|403|429|500|503" | tail -20
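
To see which failure class dominates before walking the decision tree, the log lines from step 5 can be tallied by status code. The following is a minimal sketch, not part of the original runbook: `tally_status_codes` is a hypothetical helper, and it assumes each log line carries the HTTP status as a standalone three-digit token. Other standalone numbers such as millisecond fields may be counted by mistake, so treat the result as a rough signal.

```shell
# Hypothetical helper: read log lines on stdin and print "<code>: <count>"
# for each of the status codes this runbook cares about.
# Assumption: the status code appears as a whole word in the log line.
tally_status_codes() {
  grep -owE '401|403|429|500|503' | sort | uniq -c | awk '{ print $2 ": " $1 }'
}
```

Usage: `kubectl logs -l app=adobe-service --since=5m | tally_status_codes`.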

Decision Tree

Adobe APIs returning errors?
├── YES: Is status.adobe.com reporting an incident?
│   ├── YES → Adobe-side outage. Enable fallback mode. Monitor status page.
│   └── NO → Check our credentials and config.
│       ├── 401 errors → Credentials expired/rotated. See "Auth Recovery" below.
│       ├── 429 errors → Rate limited. See "Rate Limit Recovery" below.
│       └── 500/503 errors → Adobe server issue (unreported). Open support ticket.
└── NO: Is our application healthy?
    ├── YES → Likely resolved or intermittent. Continue monitoring.
    └── NO → Our infrastructure issue. Check pods, memory, network.
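
The tree above can also be expressed as a small shell function, which may help when wiring triage output into a script. This is an illustrative sketch only: `next_action` and its four arguments are hypothetical names, the 403 branch is included because the Auth Recovery heading covers 401/403, and the catch-all branch for other codes is an added assumption.

```shell
# Hypothetical helper mirroring the decision tree.
# $1: "yes" if Adobe APIs are returning errors, otherwise "no"
# $2: "yes" if status.adobe.com reports an incident, otherwise "no"
# $3: dominant HTTP status code seen in our logs (e.g. 401, 429, 503)
# $4: "yes" if our own health check passes, otherwise "no"
next_action() {
  api_errors=$1
  adobe_incident=$2
  http_code=$3
  app_healthy=$4

  if [ "$api_errors" = "yes" ]; then
    if [ "$adobe_incident" = "yes" ]; then
      echo "Adobe-side outage: enable fallback mode and monitor the status page"
      return 0
    fi
    case "$http_code" in
      401|403) echo "Credentials expired or rotated: run Auth Recovery" ;;
      429)     echo "Rate limited: run Rate Limit Recovery" ;;
      500|503) echo "Unreported Adobe server issue: open a support ticket" ;;
      # Assumed catch-all, not in the original tree
      *)       echo "Unclassified error: check our credentials and config" ;;
    esac
  elif [ "$app_healthy" = "yes" ]; then
    echo "Likely resolved or intermittent: continue monitoring"
  else
    echo "Our infrastructure issue: check pods, memory, network"
  fi
}
```

For example, `next_action yes no 429 yes` points at the rate limit procedure.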

Recovery Procedures

Auth Recovery (401/403)

# 1. Verify credentials are still valid in Developer Console
#    https://developer.adobe.com/console → Your Project → Credentials

# 2. Test credential directly
curl -v -X POST 'https://ims-na1.adobelogin.com/ims/token/v3' \
  -d "client_id=${ADOBE_CLIENT_ID}&client_secret=${ADOBE_CLIENT_SECRET}&grant_type=client_credentials&scope=${ADOBE_SCOPES}" 2>&1 | grep -E "HTTP|error"

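# 3. If credentials were rotated, update in secret manager
#    (NEW_ADOBE_CLIENT_SECRET is a placeholder for the rotated secret; printf avoids
#    the trailing newline that a here-string would store as part of the secret)
printf '%s' "${NEW_ADOBE_CLIENT_SECRET}" | gcloud secrets versions add adobe-client-secret --data-file=-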
# OR
aws secretsmanager update-secret --secret-id adobe/production/credentials \
  --secret-string '{"client_id":"...","client_secret":"new_secret"}'

# 4. Restart application to clear cached token, then wait for the rollout to finish
kubectl rollout restart deployment/adobe-service
kubectl rollout status deployment/adobe-service

# 5. Verify recovery
curl -sf https://your-app.com/health | jq '.services.adobe'

Rate Limit Recovery (429)

# 1. Check if rate limiting is transient or sustained
# Look at 429 error rate over last 30 min

# 2. Reduce throughput immediately
# Option A: Scale down workers
kubectl scale deployment/adobe-batch-worker --replicas=1

# Option B: Enable rate limit queue mode
kubectl set env deployment/adobe-service ADOBE_RATE_LIMIT_MODE=queue

# 3. For sustained rate limiting, contact Adobe for limit increase
# Include: client_id, typical request volume, business justification
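
Step 1 does not say how to judge transient versus sustained limiting. One possible approach, sketched below under stated assumptions, is to compute the share of 429 responses in the last 30 minutes of traffic. Both function names are hypothetical, the input is assumed to be one HTTP status code per line, and the 5% cutoff is an assumed placeholder rather than an Adobe-documented value.

```shell
# Hypothetical helper: read one HTTP status code per line from stdin and
# print the percentage of 429 responses, rounded down to a whole number.
rate_429_percent() {
  awk '{ total++ }
       $1 == 429 { limited++ }
       END { if (total == 0) print 0; else print int(limited * 100 / total) }'
}

# Hypothetical classification: a 429 share of 5% or more counts as sustained.
# The cutoff is an assumption; tune it to your own traffic.
classify_rate_limiting() {
  if [ "$1" -ge 5 ]; then
    echo "sustained"
  else
    echo "transient"
  fi
}
```

Feed it the status codes extracted from your access logs or metrics export, then pass the result to `classify_rate_limiting` to decide whether step 3 applies.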

Fallback Mode

# Enable fallback mode (app continues working without Adobe)
kubectl set env deployment/adobe-service ADOBE_FALLBACK_MODE=true

# Verify fallback is working
curl -sf https://your-app.com/health | jq '.services.adobe'
# Should return { "status": "degraded", "mode": "fallback" }

Communication Templates

Internal (Slack)

P[1-4] INCIDENT: Adobe [API Name] Integration
Status: INVESTIGATING / IDENTIFIED / MONITORING / RESOLVED
Impact: [User-facing description]
Root cause: [Adobe outage / credential issue / rate limit / our bug]
Current action: [What you're doing right now]
Next update: [Time]
Commander: @[name]
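
Filling the template by hand during an incident is error-prone, so a small helper can render it from arguments. This is a sketch only: `render_incident_update` is a hypothetical name, and how the text reaches Slack is left to whatever tooling your team already uses.

```shell
# Hypothetical helper: print the internal update template with fields filled in.
# Args: severity(1-4) api_name status impact root_cause current_action next_update commander
render_incident_update() {
  printf 'P%s INCIDENT: Adobe %s Integration\n' "$1" "$2"
  printf 'Status: %s\n' "$3"
  printf 'Impact: %s\n' "$4"
  printf 'Root cause: %s\n' "$5"
  printf 'Current action: %s\n' "$6"
  printf 'Next update: %s\n' "$7"
  printf 'Commander: @%s\n' "$8"
}
```

Example call: `render_incident_update 2 Firefly INVESTIGATING "Image generation failing" "rate limit" "Scaling down workers" "14:30 UTC" alice`.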

Postmortem Template

## Incident: Adobe [API] [Error Type]
**Date:** YYYY-MM-DD
**Duration:** X hours Y minutes
**Severity:** P[1-4]

### Summary
[1-2 sentence description of what happened]

### Timeline
- HH:MM UTC — Alert fired: adobe_api_errors_total spike
- HH:MM UTC — On-call acknowledged, began triage
- HH:MM UTC — Root cause identified: [description]
- HH:MM UTC — Mitigation applied: [action taken]
- HH:MM UTC — Full recovery confirmed

### Root Cause
[Technical explanation — was it Adobe-side, credential issue, our bug?]

### Impact
- Users affected: N
- API calls failed: N
- Revenue impact: $X (if applicable)

### Action Items
- [ ] [Preventive measure] — Owner — Due date
- [ ] [Monitoring improvement] — Owner — Due date
- [ ] [Documentation update] — Owner — Due date
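
The Duration field in the template is simple arithmetic over the first and last timeline entries. As a minimal sketch, assuming both are available as Unix timestamps in seconds, a hypothetical `format_duration` helper could produce the "X hours Y minutes" string:

```shell
# Hypothetical helper: format the gap between two Unix timestamps (seconds)
# as "X hours Y minutes". Leftover seconds are dropped, not rounded.
# $1: incident start, $2: full recovery confirmed
format_duration() {
  start=$1
  end=$2
  elapsed=$(( end - start ))
  echo "$(( elapsed / 3600 )) hours $(( (elapsed % 3600) / 60 )) minutes"
}
```

For example, a start of 1700000000 and an end of 1700008100 are 8100 seconds apart, which the helper prints as `2 hours 15 minutes`.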

Output

  • Incident severity classified
  • Root cause identified via decision tree
  • Recovery procedure executed
  • Stakeholders notified with template
  • Evidence collected for postmortem

Error Handling

| Issue | Cause | Solution |
|-------|-------|----------|
| Can't reach status.adobe.com | Network issue | Use mobile data or check @AdobeCare on Twitter |
| kubectl auth expired | Token timeout | Re-authenticate with cloud provider |
| Secret manager access denied | IAM policy | Use break-glass admin account |
| Fallback mode not implemented | Missing code path | Return cached/default data |

Next Steps

For data handling, see adobe-data-handling.
