> adobe-incident-runbook
Execute Adobe incident response procedures with triage, mitigation, and postmortem for Firefly Services, PDF Services, and I/O Events outages. Use when responding to Adobe-related incidents, investigating API failures, or running post-incident reviews. Trigger with phrases like "adobe incident", "adobe outage", "adobe down", "adobe on-call", "adobe emergency".
Adobe Incident Runbook
Overview
Rapid incident response procedures for Adobe API-related outages, covering IMS authentication failures, Firefly/Photoshop API downtime, PDF Services quota exhaustion, and I/O Events delivery failures.
Prerequisites
- Access to Adobe Developer Console and Admin Console
- Access to application monitoring (Grafana, Datadog, etc.)
- kubectl access to production cluster (if applicable)
- Communication channels (Slack, PagerDuty)
Severity Matrix
| Level | Definition | Response Time | Example |
|---|---|---|---|
| P1 | Complete Adobe integration failure | < 15 min | IMS auth broken, all APIs down |
| P2 | Single API degraded | < 1 hour | Firefly 429s, Photoshop timeouts |
| P3 | Minor impact | < 4 hours | Webhook delays, slow PDF extraction |
| P4 | No user impact | Next business day | Monitoring gap, metric anomaly |
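For paging scripts or chat bots, the matrix above can be encoded as a small lookup. This is a minimal sketch: the function name `response_target` is hypothetical, and the values are copied from the table.

```shell
# Map a severity level from the matrix to its response-time target.
# response_target is a hypothetical helper name, not part of any Adobe tooling.
response_target() {
  case "$1" in
    P1) echo "< 15 min" ;;
    P2) echo "< 1 hour" ;;
    P3) echo "< 4 hours" ;;
    P4) echo "Next business day" ;;
    *)  echo "unknown severity: $1" >&2; return 1 ;;
  esac
}
```

For example, `response_target P2` prints `< 1 hour`, and an unrecognized level returns a non-zero exit code so a calling script can fail loudly.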
Quick Triage (Run These First)
# 1. Is Adobe itself down? (A 200 here only proves the status page loads; open it to check for active incidents.)
curl -s -o /dev/null -w "Adobe Status: %{http_code}\n" https://status.adobe.com
# 2. Can we generate an access token?
curl -s -o /dev/null -w "IMS Auth: %{http_code}\n" -X POST \
'https://ims-na1.adobelogin.com/ims/token/v3' \
-d "client_id=${ADOBE_CLIENT_ID}&client_secret=${ADOBE_CLIENT_SECRET}&grant_type=client_credentials&scope=${ADOBE_SCOPES}"
# 3. Can we reach each API endpoint?
for endpoint in firefly-api.adobe.io image.adobe.io pdf-services.adobe.io; do
  # Any HTTP code means the host is reachable; curl prints 000 and exits non-zero when the connection fails
  CODE=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 "https://$endpoint" 2>/dev/null) || CODE="UNREACHABLE"
  echo "$endpoint: $CODE"
done
# 4. Check our app health
curl -sf https://your-app.com/health | python3 -m json.tool
# 5. Recent errors in our logs (last 5 min)
kubectl logs -l app=adobe-service --since=5m 2>/dev/null | grep -i "error\|failed\|429\|401\|500" | tail -20
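A raw `tail -20` shows which errors occurred but not how they are distributed. The sketch below counts the status codes the decision tree cares about. It assumes the code appears in each log line as a standalone number (for example `status=429` or `429 Too Many Requests`), and `summarize_status_codes` is a hypothetical helper name.

```shell
# Count occurrences of the status codes of interest in log lines read from stdin.
# Assumption: each code appears as its own token, delimited by non-alphanumeric characters.
summarize_status_codes() {
  awk -F'[^0-9A-Za-z]+' '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^(401|403|429|500|503)$/) count[$i]++
  }
  END { for (code in count) print code, count[code] }' | sort
}
```

Usage: `kubectl logs -l app=adobe-service --since=5m | summarize_status_codes`. A result dominated by one code points straight at the matching branch of the decision tree.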
Decision Tree
Adobe APIs returning errors?
├── YES: Is status.adobe.com reporting an incident?
│   ├── YES → Adobe-side outage. Enable fallback mode. Monitor status page.
│   └── NO → Check our credentials and config.
│       ├── 401 errors → Credentials expired/rotated. See "Auth Recovery" below.
│       ├── 429 errors → Rate limited. See "Rate Limit Recovery" below.
│       └── 500/503 errors → Adobe server issue (unreported). Open support ticket.
└── NO: Is our application healthy?
    ├── YES → Likely resolved or intermittent. Continue monitoring.
    └── NO → Our infrastructure issue. Check pods, memory, network.
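The tree above can also be written as a function, which is handy for an on-call chat bot or for rehearsing the logic. This is a sketch only: the function name, argument order, and output labels are hypothetical, and 403 is routed to auth recovery because that procedure's heading covers 401/403.

```shell
# Usage: next_action <api_errors yes|no> <status_page_incident yes|no> <http_code> <app_healthy yes|no>
# Prints a short label for the branch of the decision tree that applies.
next_action() {
  if [ "$1" = "yes" ]; then
    if [ "$2" = "yes" ]; then
      echo "adobe-outage: enable fallback mode, monitor status page"
      return 0
    fi
    case "$3" in
      401|403) echo "auth-recovery" ;;
      429)     echo "rate-limit-recovery" ;;
      500|503) echo "open-support-ticket" ;;
      *)       echo "check-credentials-and-config" ;;
    esac
  elif [ "$4" = "yes" ]; then
    echo "continue-monitoring"
  else
    echo "check-own-infrastructure"
  fi
}
```

For instance, `next_action yes no 429 yes` prints `rate-limit-recovery`.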
Recovery Procedures
Auth Recovery (401/403)
# 1. Verify credentials are still valid in Developer Console
# https://developer.adobe.com/console → Your Project → Credentials
# 2. Test credential directly
curl -v -X POST 'https://ims-na1.adobelogin.com/ims/token/v3' \
-d "client_id=${ADOBE_CLIENT_ID}&client_secret=${ADOBE_CLIENT_SECRET}&grant_type=client_credentials&scope=${ADOBE_SCOPES}" 2>&1 | grep -E "HTTP|error"
# 3. If credentials were rotated, update in secret manager
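# (NEW_ADOBE_CLIENT_SECRET is a placeholder variable; printf avoids the trailing newline a here-string would append)
printf '%s' "$NEW_ADOBE_CLIENT_SECRET" | gcloud secrets versions add adobe-client-secret --data-file=-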
# OR
aws secretsmanager update-secret --secret-id adobe/production/credentials \
--secret-string '{"client_id":"...","client_secret":"new_secret"}'
# 4. Restart application to clear cached token
kubectl rollout restart deployment/adobe-service
# 5. Verify recovery
curl -sf https://your-app.com/health | jq '.services.adobe'
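A rollout restart does not finish instantly, so a health check run immediately after step 4 may fail even though recovery is underway. One way to handle this is to poll until the check passes. The helper below is a generic sketch; `wait_until` is a hypothetical name, and the attempt count and interval are assumptions to tune.

```shell
# Retry a command until it succeeds or the attempt budget runs out.
# Usage: wait_until <max_attempts> <sleep_seconds> <command> [args...]
wait_until() {
  max_attempts=$1
  sleep_seconds=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    # Only sleep if another attempt remains
    if [ "$attempt" -le "$max_attempts" ]; then
      sleep "$sleep_seconds"
    fi
  done
  return 1
}
```

Usage: `wait_until 12 5 curl -sf https://your-app.com/health` retries the health check for up to about a minute and exits non-zero if the service never comes back.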
Rate Limit Recovery (429)
# 1. Check if rate limiting is transient or sustained
# Look at 429 error rate over last 30 min
# 2. Reduce throughput immediately
# Option A: Scale down workers
kubectl scale deployment/adobe-batch-worker --replicas=1
# Option B: Enable rate limit queue mode
kubectl set env deployment/adobe-service ADOBE_RATE_LIMIT_MODE=queue
# 3. For sustained rate limiting, contact Adobe for limit increase
# Include: client_id, typical request volume, business justification
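Scaling down reduces pressure, but clients that retry immediately after a 429 can keep the limit tripped. A common complement is exponential backoff that honors the server's `Retry-After` value when one is present. The sketch below assumes the 429 response may carry such a header (verify this for the API you call); the function name, the 1-second base, and the 60-second cap are all assumptions.

```shell
# Print how many seconds to wait before retry number <attempt> (0-based).
# Usage: backoff_delay <attempt> [retry_after_seconds]
backoff_delay() {
  attempt=$1
  retry_after=${2:-}
  base=1   # assumed starting delay in seconds
  cap=60   # assumed maximum delay in seconds
  # Prefer the server-provided wait time when available
  if [ -n "$retry_after" ]; then
    echo "$retry_after"
    return 0
  fi
  delay=$base
  i=0
  # Double the delay once per prior attempt, stopping once the cap is reached
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$cap" ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  if [ "$delay" -gt "$cap" ]; then
    delay=$cap
  fi
  echo "$delay"
}
```

A retry loop would call `sleep "$(backoff_delay "$attempt" "$retry_after")"` between requests. Adding random jitter is a common refinement when many workers retry at once.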
Fallback Mode
# Enable fallback mode (app continues working without Adobe)
kubectl set env deployment/adobe-service ADOBE_FALLBACK_MODE=true
# Verify fallback is working
curl -sf https://your-app.com/health | jq '.services.adobe'
# Should return { "status": "degraded", "mode": "fallback" }
Communication Templates
Internal (Slack)
P[1-4] INCIDENT: Adobe [API Name] Integration
Status: INVESTIGATING / IDENTIFIED / MONITORING / RESOLVED
Impact: [User-facing description]
Root cause: [Adobe outage / credential issue / rate limit / our bug]
Current action: [What you're doing right now]
Next update: [Time]
Commander: @[name]
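To keep updates consistent under pressure, the template above can be filled in by a small function and piped to whatever posts to your channel. This is a sketch; `incident_update` and its argument order are hypothetical.

```shell
# Render the internal incident update from positional arguments.
# Usage: incident_update <severity> <api> <status> <impact> <root_cause> <action> <next_update> <commander>
incident_update() {
  printf '%s INCIDENT: Adobe %s Integration\n' "$1" "$2"
  printf 'Status: %s\n' "$3"
  printf 'Impact: %s\n' "$4"
  printf 'Root cause: %s\n' "$5"
  printf 'Current action: %s\n' "$6"
  printf 'Next update: %s\n' "$7"
  printf 'Commander: @%s\n' "$8"
}
```

Example: `incident_update P2 Firefly IDENTIFIED "Image generation failing" "rate limit" "Scaled workers to 1" "14:30 UTC" alice`.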
Postmortem Template
## Incident: Adobe [API] [Error Type]
**Date:** YYYY-MM-DD
**Duration:** X hours Y minutes
**Severity:** P[1-4]
### Summary
[1-2 sentence description of what happened]
### Timeline
- HH:MM UTC — Alert fired: adobe_api_errors_total spike
- HH:MM UTC — On-call acknowledged, began triage
- HH:MM UTC — Root cause identified: [description]
- HH:MM UTC — Mitigation applied: [action taken]
- HH:MM UTC — Full recovery confirmed
### Root Cause
[Technical explanation — was it Adobe-side, credential issue, our bug?]
### Impact
- Users affected: N
- API calls failed: N
- Revenue impact: $X (if applicable)
### Action Items
- [ ] [Preventive measure] — Owner — Due date
- [ ] [Monitoring improvement] — Owner — Due date
- [ ] [Documentation update] — Owner — Due date
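The Duration field is easy to get wrong when computed by hand from the timeline. A minimal sketch that derives it from two Unix epoch timestamps follows; the function name is hypothetical, and the output follows the template's "X hours Y minutes" wording.

```shell
# Print the incident duration in the postmortem template's format.
# Usage: format_duration <start_epoch_seconds> <end_epoch_seconds>
format_duration() {
  if [ "$2" -lt "$1" ]; then
    echo "end time precedes start time" >&2
    return 1
  fi
  total=$(( $2 - $1 ))
  hours=$(( total / 3600 ))
  minutes=$(( (total % 3600) / 60 ))
  echo "$hours hours $minutes minutes"
}
```

For example, `format_duration 1700000000 1700005400` prints `1 hours 30 minutes`.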
Output
- Incident severity classified
- Root cause identified via decision tree
- Recovery procedure executed
- Stakeholders notified with template
- Evidence collected for postmortem
Error Handling
| Issue | Cause | Solution |
|---|---|---|
| Can't reach status.adobe.com | Network issue | Use mobile data or check @AdobeCare on X (formerly Twitter) |
| kubectl auth expired | Token timeout | Re-authenticate with cloud provider |
| Secret manager access denied | IAM policy | Use break-glass admin account |
| Fallback mode not implemented | Missing code path | Return cached/default data |
Next Steps
For data handling, see adobe-data-handling.