> monitor-experiment
Monitor running experiments, check progress, collect results. Use when user says "check results", "is it done", "monitor", or wants experiment output.
curl "https://skillshub.wtf/wanshuiyin/Auto-claude-code-research-in-sleep/monitor-experiment?format=md"Monitor Experiment Results
Monitor: $ARGUMENTS
Workflow
Step 1: Check What's Running
ssh <server> "screen -ls"
Step 2: Collect Output from Each Screen
For each screen session, capture the last N lines:
ssh <server> "screen -S <name> -X hardcopy /tmp/screen_<name>.txt && tail -50 /tmp/screen_<name>.txt"
If hardcopy fails, check for log files or tee output.
Step 3: Check for JSON Result Files
ssh <server> "ls -lt <results_dir>/*.json 2>/dev/null | head -20"
If JSON results exist, fetch and parse them:
ssh <server> "cat <results_dir>/<latest>.json"
Step 4: Summarize Results
Present results in a comparison table:
| Experiment | Metric | Delta vs Baseline | Status |
|-----------|--------|-------------------|--------|
| Baseline | X.XX | — | done |
| Method A | X.XX | +Y.Y | done |
Step 5: Interpret
- Compare against known baselines
- Flag unexpected results (negative delta, NaN, divergence)
- Suggest next steps based on findings
Step 6: Feishu Notification (if configured)
After results are collected, check ~/.codex/feishu.json:
- Send
experiment_donenotification: results summary table, delta vs baseline - If config absent or mode
"off": skip entirely (no-op)
Key Rules
- Always show raw numbers before interpretation
- Compare against the correct baseline (same config)
- Note if experiments are still running (check progress bars, iteration counts)
- If results look wrong, check training logs for errors before concluding
> related_skills --same-repo
> training-check
Periodically check WandB metrics during training to catch problems early (NaN, loss divergence, idle GPUs). Avoids wasting GPU hours on broken runs. Use when training is running and you want automated health checks.
> result-to-claim
Use when experiments complete to judge what claims the results support, what they don't, and what evidence is still missing. Codex MCP evaluates results against intended claims and routes to next action (pivot, supplement, or confirm). Use after experiments finish — before writing the paper or running ablations.
> paper-slides
Generate conference presentation slides (beamer LaTeX → PDF + editable PPTX) from a compiled paper, with speaker notes and full talk script. Use when user says "做PPT", "做幻灯片", "make slides", "conference talk", "presentation slides", "生成slides", "写演讲稿", or wants beamer slides for a conference talk.
> paper-poster
Generate a conference poster (article + tcbposter LaTeX → A0/A1 PDF + editable PPTX + SVG) from a compiled paper. Use when user says "做海报", "制作海报", "conference poster", "make poster", "生成poster", "poster session", or wants to create a poster for a conference presentation.