> coreweave-prod-checklist

Production readiness checklist for CoreWeave GPU workloads. Use when launching inference services, preparing GPU training for production, or validating deployment configurations. Trigger with phrases like "coreweave production", "coreweave go-live", "coreweave checklist", "coreweave launch".

fetch
$curl "https://skillshub.wtf/jeremylongshore/claude-code-plugins-plus-skills/coreweave-prod-checklist?format=md"
SKILL.mdcoreweave-prod-checklist

CoreWeave Production Checklist

Inference Services

  • GPU type and count validated for model size
  • Autoscaling configured (KServe or HPA)
  • Health and readiness probes set
  • Resource requests AND limits specified
  • Node affinity targeting correct GPU class
  • minReplicas >= 1 for production (no cold starts)

Storage

  • Model weights in PVC (not downloaded at startup)
  • Checkpoints saved to persistent storage
  • Storage class appropriate (SSD for inference, HDD for archival)

Security

  • Secrets for model tokens and registry access
  • Network policies applied
  • Container images from trusted registries

Monitoring

  • GPU utilization metrics collected
  • Inference latency and throughput tracked
  • Alert on pod restarts and OOM events
  • Log aggregation configured

Rollback

kubectl rollout undo deployment/my-inference
kubectl rollout status deployment/my-inference

Resources

Next Steps

For upgrades, see coreweave-upgrade-migration.

┌ stats

installs/wk0
░░░░░░░░░░
github stars1.7K
██████████
first seenMar 23, 2026
└────────────

┌ repo

jeremylongshore/claude-code-plugins-plus-skills
by jeremylongshore
└────────────