# Benchmark — Performance Baseline & Regression Detection
## When to Use
- Before and after a PR to measure performance impact
- Setting up performance baselines for a project
- When users report "it feels slow"
- Before a launch — ensure you meet performance targets
- Comparing your stack against alternatives
## How It Works
### Mode 1: Page Performance
Measures real browser metrics via browser MCP:
```
1. Navigate to each target URL
2. Measure Core Web Vitals:
   - LCP (Largest Contentful Paint) — target < 2.5s
   - CLS (Cumulative Layout Shift) — target < 0.1
   - INP (Interaction to Next Paint) — target < 200ms
   - FCP (First Contentful Paint) — target < 1.8s
   - TTFB (Time to First Byte) — target < 800ms
3. Measure resource sizes:
   - Total page weight (target < 1MB)
   - JS bundle size (target < 200KB gzipped)
   - CSS size
   - Image weight
   - Third-party script weight
4. Count network requests
5. Check for render-blocking resources
```
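Once the raw numbers are collected from the browser, the Core Web Vitals targets above can be checked mechanically. A minimal sketch of that verdict step — the `check_vitals` helper is a hypothetical illustration, not part of the skill's interface; only the metric names and thresholds come from the targets listed above:

```python
# Targets mirror the list above. Units: seconds for LCP/FCP,
# unitless for CLS, milliseconds for INP/TTFB.
VITALS_TARGETS = {
    "LCP": 2.5,    # seconds
    "CLS": 0.1,    # unitless score
    "INP": 200,    # milliseconds
    "FCP": 1.8,    # seconds
    "TTFB": 800,   # milliseconds
}

def check_vitals(measured: dict[str, float]) -> dict[str, str]:
    """Return a PASS/WARN verdict per metric (lower is better for all five)."""
    return {
        name: "PASS" if measured[name] <= target else "WARN"
        for name, target in VITALS_TARGETS.items()
        if name in measured
    }
```

Metrics you did not measure are simply omitted from the report rather than failed.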
### Mode 2: API Performance
Benchmarks API endpoints:
1. Hit each endpoint 100 times
2. Measure: p50, p95, p99 latency
3. Track: response size, status codes
4. Test under load: 10 concurrent requests
5. Compare against SLA targets
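The p50/p95/p99 numbers in step 2 are order statistics over the collected latencies. A sketch of the measurement loop and the percentile math, assuming only the Python standard library — the URL and run count are illustrative, and a real run would also want warm-up requests and error handling:

```python
import time
import urllib.request

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile over a list of latency samples."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

def bench_endpoint(url: str, runs: int = 100) -> dict[int, float]:
    """Hit the endpoint `runs` times; return p50/p95/p99 latency in ms."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        urllib.request.urlopen(url).read()
        latencies.append((time.perf_counter() - start) * 1000)  # ms
    return {p: percentile(latencies, p) for p in (50, 95, 99)}
```

The concurrent-load step (10 parallel requests) would layer `concurrent.futures.ThreadPoolExecutor` on top of the same loop.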
### Mode 3: Build Performance
Measures development feedback loop:
1. Cold build time
2. Hot reload time (HMR)
3. Test suite duration
4. TypeScript check time
5. Lint time
6. Docker build time
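Each of these timings reduces to running a command and measuring wall-clock duration. A minimal sketch — the commented example commands are illustrative, not prescribed by this skill; substitute your project's actual build, test, and lint commands:

```python
import subprocess
import time

def time_command(cmd: list[str]) -> float:
    """Run a command to completion and return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.perf_counter() - start

# Example usage (hypothetical commands):
# cold_build = time_command(["npm", "run", "build"])
# type_check = time_command(["npx", "tsc", "--noEmit"])
```

HMR time is the one item this does not cover, since hot reload is event-driven rather than a command that exits.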
### Mode 4: Before/After Comparison
Run before and after a change to measure impact:
```
/benchmark baseline   # saves current metrics
# ... make changes ...
/benchmark compare    # compares against baseline
```
Output:
| Metric | Before | After | Delta | Verdict |
|--------|--------|-------|-------|---------|
| LCP | 1.2s | 1.4s | +200ms | ⚠ WARN |
| Bundle | 180KB | 175KB | -5KB | ✓ BETTER |
| Build | 12s | 14s | +2s | ⚠ WARN |
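The Delta/Verdict columns can be produced by a simple threshold rule over baseline vs. current values. A sketch assuming a 5% regression tolerance — the tolerance and the metric names in the usage note are illustrative assumptions, not part of the skill's contract:

```python
def verdict(before: float, after: float, tolerance: float = 0.05) -> str:
    """Lower is better for every benchmarked metric (latency, size, time)."""
    if after <= before:
        return "BETTER" if after < before else "SAME"
    return "WARN" if (after - before) / before > tolerance else "OK"

def compare(baseline: dict[str, float], current: dict[str, float]) -> dict[str, tuple]:
    """Per metric: (before, after, delta, verdict); skips metrics missing a sample."""
    return {
        name: (before, current[name], current[name] - before,
               verdict(before, current[name]))
        for name, before in baseline.items()
        if name in current
    }
```

For example, `compare({"LCP_ms": 1200}, {"LCP_ms": 1400})` flags the +200ms as a WARN, matching the table above.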
## Output
Baselines are stored in `.ecc/benchmarks/` as JSON and git-tracked, so the whole team shares the same baselines.
## Integration
- CI: run `/benchmark compare` on every PR
- Pair with `/canary-watch` for post-deploy monitoring
- Pair with `/browser-qa` for the full pre-ship checklist
## Related Skills

- `skill-comply` — visualize whether skills, rules, and agent definitions are actually followed; auto-generates scenarios at 3 prompt strictness levels, runs agents, classifies behavioral sequences, and reports compliance rates with full tool call timelines
- `santa-method` — multi-agent adversarial verification with a convergence loop; two independent review agents must both pass before output ships
- `safety-guard` — prevent destructive operations on production systems and during autonomous (full-auto) runs
- `product-lens` — think before you build; validate the "why" before starting any feature