> ice-crawler-harvester
Run ICE-Crawler’s Frost→Glacier→Crystal pipeline to ingest repositories safely, emit bounded artifact bundles, and hand off sealed fossils for downstream agents.
ICE-Crawler Harvester
Use this skill to run the ICE-Crawler pipeline wherever you have the project cloned. Set an environment variable such as ICE_CRAWLER_ROOT that points to your local clone of the Ice-Crawler repo and run the commands from there. The instructions below assume PowerShell, but any shell works.
Reference: references/ice-crawler-workflow.md
Prerequisites
- Python 3.10+ with Tkinter (for the UI) and Git on PATH.
- ICE-Crawler repository checked out locally (path referenced via ICE_CRAWLER_ROOT).
- Optional: agentics/hooks if you need partitioned follow-up tasks.
Workflow 1 — Full UI Run (interactive)
- Change to the project root: cd $env:ICE_CRAWLER_ROOT
- Launch the UI: python icecrawler.py
- Paste any cloneable Git URL (browse/blob URLs are normalized automatically) and press the glowing PRESS TO SUBMIT TO ICE CRAWLER button.
- Watch the phase ladder (Frost → Glacier → Crystal → Residue) and log panels update in real time.
- When the run completes, open state/runs/run_<timestamp>/ to inspect the fossilized artifact bundle.
UI Controls
- Ctrl+B toggle left ladder; Ctrl+Shift+B toggle right logs; Ctrl+J toggle terminal.
- Drag PanedWindow sashes to resize panels.
- UI never touches git; it mirrors ui_events.jsonl written by the orchestrator.
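Because the UI only mirrors ui_events.jsonl, automation can consume the same log directly. A minimal reader sketch follows; the per-line JSON schema (field names such as "phase") is an assumption for illustration — inspect a real run's log for the actual fields.

```python
import json
from pathlib import Path

def read_ui_events(run_dir):
    """Return parsed events from a run's ui_events.jsonl (one JSON object per line).

    Field names inside each event (e.g. "phase") are hypothetical here;
    check an actual run's log for the real schema.
    """
    events = []
    log = Path(run_dir) / "ui_events.jsonl"
    if not log.exists():
        return events
    for line in log.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line:  # skip blank lines defensively
            events.append(json.loads(line))
    return events
```

Tail this file (or re-read it periodically) to drive dashboards or notifications without touching the orchestrator.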
Workflow 2 — Headless CLI Run
cd $env:ICE_CRAWLER_ROOT
$run = "state/runs/run_$(Get-Date -Format 'yyyyMMdd_HHmmss')"
$temp = "state/_temp_repo"
New-Item -ItemType Directory -Force -Path $run | Out-Null
python -m engine.orchestrator "https://github.com/openclaw/openclaw.git" $run 80 256 $temp
Arguments: <repo_url> <state_run_dir> <max_files> <max_kb> <temp_dir>
- max_files controls the Glacier selection cap.
- max_kb enforces a per-file size ceiling when copying into artifact/.
- temp_dir is purged automatically; failure to delete triggers a residue violation.
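The headless invocation can be wrapped for scripting. The module path and argument order below come straight from the command above; the helper names (build_orchestrator_cmd, harvest) are illustrative, not part of ICE-Crawler.

```python
import subprocess
import sys
from datetime import datetime
from pathlib import Path

def build_orchestrator_cmd(repo_url, run_dir, max_files=80, max_kb=256,
                           temp_dir="state/_temp_repo"):
    # Argument order matches: <repo_url> <state_run_dir> <max_files> <max_kb> <temp_dir>
    return [sys.executable, "-m", "engine.orchestrator",
            repo_url, str(run_dir), str(max_files), str(max_kb), str(temp_dir)]

def harvest(repo_url, root="."):
    """Create a timestamped run dir and launch one headless run from `root`."""
    run_dir = Path(root) / "state" / "runs" / f"run_{datetime.now():%Y%m%d_%H%M%S}"
    run_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_orchestrator_cmd(repo_url, run_dir), cwd=root, check=True)
    return run_dir
```

Run it from $env:ICE_CRAWLER_ROOT (or pass that path as `root`) so the relative state/ paths resolve.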
Outputs & Follow-up
- state/runs/<run>/artifact/ — crystallized file tree (repo-relative paths preserved).
- artifact_manifest.json + artifact_hashes.json — integrity anchors for downstream tools.
- ai_handoff/manifest_compact.json + root_seal.txt — sealed bundle for agent prompts.
- ui_events.jsonl, run_cmds.jsonl — truth logs for UI or automation.
- residue_truth.json — teardown attestation; treat violations as failures.
- Extraction registry — append a row to skills/ice-crawler-harvester/extractions/index.jsonl and drop notes under extractions/<repo-slug>/ so future skills can mine algorithms/tools (see extractions/README.md).
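Appending the registry row can be automated after each run. A sketch follows; the row fields here are hypothetical — align them with whatever schema extractions/README.md actually prescribes.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def record_extraction(index_path, repo_slug, run_dir, notes=""):
    """Append one registry row to extractions/index.jsonl.

    The field names in `row` are illustrative assumptions, not the
    official schema -- see extractions/README.md for the real one.
    """
    row = {
        "repo": repo_slug,
        "run_dir": str(run_dir),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "notes": notes,
    }
    index = Path(index_path)
    index.parent.mkdir(parents=True, exist_ok=True)
    with index.open("a", encoding="utf-8") as f:
        f.write(json.dumps(row) + "\n")  # JSONL: one object per line
    return row
```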
Extending / Integrating
- Call the orchestrator from scripts or scheduled jobs to keep repo fossils fresh.
- Parse artifact_manifest.json to feed other skills (e.g., code summarizers, diff analyzers).
- Hook agentics/ when you need automatic partitioning of Frost metadata or Crystal artifacts into bounded tasks.
- Adjust max_files/max_kb per run to dial ingest size; keep limits conservative for safety.
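Before feeding artifacts downstream, integrations can re-check the integrity anchors. The sketch below assumes artifact_hashes.json maps repo-relative paths to SHA-256 hex digests — an illustrative guess, so confirm the format against a real run before relying on it.

```python
import hashlib
import json
from pathlib import Path

def verify_artifact(run_dir):
    """Re-hash files under artifact/ against artifact_hashes.json.

    Assumes a {relative_path: sha256_hexdigest} mapping -- a hypothetical
    schema; inspect an actual run's hashes file to confirm.
    """
    run = Path(run_dir)
    expected = json.loads((run / "artifact_hashes.json").read_text(encoding="utf-8"))
    mismatches = []
    for rel_path, digest in expected.items():
        data = (run / "artifact" / rel_path).read_bytes()
        if hashlib.sha256(data).hexdigest() != digest:
            mismatches.append(rel_path)
    return mismatches  # empty list means every file matched its recorded hash
```

Treat a non-empty result like a residue violation: stop and investigate before handing the bundle to another agent.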
Troubleshooting
- Missing Tkinter on Linux → sudo apt install python3-tk (or distro equivalent).
- Git credential prompts bubble through the orchestrator; ensure SSH keys or tokens are configured.
- Residue violation (state/_temp_repo not deleted) aborts the run; rerun after manual cleanup if needed.
Follow this skill to get deterministic repo fossils with ICE-Crawler’s provenance guarantees.
> related_skills --same-repo
> harvest-proposal-pipeline
Automate ICE-Crawler ingestion → registry update → proposal stub creation. Use when you want a full harvest loop that ends with a ready-to-review skill idea.
> extraction-proposer
Scan ICE-Crawler extraction logs, pick promising algorithms/tools, and emit skill creation proposals (name, scope, source files, next steps).