> scrapling-skill
Install, troubleshoot, and use Scrapling CLI to extract HTML, Markdown, or text from webpages. Use this skill whenever the user mentions Scrapling, `uv tool install scrapling`, `scrapling extract`, WeChat/mp.weixin articles, browser-backed page fetching, or needs help deciding between static and dynamic extraction.
Scrapling Skill
Overview
Use Scrapling through its CLI as the default path. Start with the smallest working command, validate the saved output, and only escalate to browser-backed fetching when the static fetch does not contain the real page content.
Do not assume the user's Scrapling install is healthy. Verify it first.
Default Workflow
Copy this checklist and keep it updated while working:
Scrapling Progress:
- [ ] Step 1: Diagnose the local Scrapling install
- [ ] Step 2: Fix CLI extras or browser runtime if needed
- [ ] Step 3: Choose static or dynamic fetch
- [ ] Step 4: Save output to a file
- [ ] Step 5: Validate file size and extracted content
- [ ] Step 6: Escalate only if the previous path failed
Step 1: Diagnose the Install
Run the bundled diagnostic script first:
python3 scripts/diagnose_scrapling.py
Use the result as the source of truth for the next step.
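The internals of the bundled script are not shown here. As a rough illustration of the kind of pre-check it might start with, here is a minimal Python sketch. It assumes only that the CLI is invoked as `scrapling` and that `scrapling --help` exits non-zero when the install is broken; the function name `check_cli` and the shape of the returned dictionary are hypothetical.

```python
import shutil
import subprocess


def check_cli(executable: str = "scrapling") -> dict:
    """Report whether the CLI is on PATH and whether `--help` runs cleanly."""
    path = shutil.which(executable)
    if path is None:
        return {"found": False, "help_ok": False, "detail": "not on PATH"}
    # Run `<executable> --help` and capture output instead of printing it.
    proc = subprocess.run(
        [path, "--help"], capture_output=True, text=True, timeout=30
    )
    # Keep a short excerpt so a missing-extras message stays visible.
    detail = (proc.stderr or proc.stdout).strip()[:200]
    return {"found": True, "help_ok": proc.returncode == 0, "detail": detail}
```

Calling `check_cli()` and finding `found` true but `help_ok` false would point toward the missing-extras case; this sketch does not check the browser runtime, which the bundled script is described as covering.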
Step 2: Fix the Install
If the CLI was installed without extras
If scrapling --help fails with missing click or a message about installing Scrapling with extras, reinstall it with the CLI extra:
uv tool uninstall scrapling
uv tool install 'scrapling[shell]'
Do not default to scrapling[all] unless the user explicitly needs the broader feature set.
If browser-backed fetchers are needed
Install the Playwright runtime:
scrapling install
If the install looks slow or opaque, read references/troubleshooting.md before guessing. Do not claim success until either:
- `scrapling install` reports that dependencies are already installed, or
- the diagnostic script confirms both Chromium and Chrome Headless Shell are present.
Step 3: Choose the Fetcher
Use this decision rule:
- Start with `extract get` for normal pages, article pages, and most WeChat public articles.
- Use `extract fetch` when the static HTML does not contain the real content or the page depends on JavaScript rendering.
- Use `extract stealthy-fetch` only after `fetch` still fails because of anti-bot or challenge behavior. Do not make it the default.
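The decision rule above can be read as a fixed escalation ladder. A minimal Python sketch of that ordering follows; the names `FETCHERS` and `next_fetcher` are hypothetical helpers for illustration, not part of Scrapling.

```python
from typing import Optional

# Escalation order taken from the decision rule: static first, stealth last.
FETCHERS = ["get", "fetch", "stealthy-fetch"]


def next_fetcher(current: Optional[str] = None) -> Optional[str]:
    """Return the next `scrapling extract` subcommand to try, or None when exhausted."""
    if current is None:
        return FETCHERS[0]
    idx = FETCHERS.index(current)
    return FETCHERS[idx + 1] if idx + 1 < len(FETCHERS) else None
```

The point of the helper is that each step is reached only after the previous one has been tried and its output validated, so `stealthy-fetch` is never the starting point.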
Step 4: Run the Smallest Useful Command
Always quote URLs in shell commands. This is mandatory in zsh when the URL contains ?, &, or other special characters.
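When the command is assembled by a script rather than typed by hand, the quoting can be delegated to Python's standard `shlex` module. The sketch below is one way to do that; `extract_command` is a hypothetical helper and the URL is a placeholder.

```python
import shlex
from typing import List, Optional


def extract_command(
    subcommand: str, url: str, output: str, selector: Optional[str] = None
) -> List[str]:
    """Build the argv for `scrapling extract <subcommand> <url> <output>`."""
    argv = ["scrapling", "extract", subcommand, url, output]
    if selector:
        argv += ["-s", selector]
    return argv


argv = extract_command("get", "https://example.com/page?id=1&lang=en", "page.html")
# shlex.join quotes only the arguments that contain shell-special characters.
print(shlex.join(argv))
# prints: scrapling extract get 'https://example.com/page?id=1&lang=en' page.html
```

Passing the list form directly to `subprocess.run` avoids the shell entirely; `shlex.join` is only needed when the command has to be shown to the user or pasted into a terminal.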
Full page to HTML
scrapling extract get 'https://example.com' page.html
Main content to Markdown
scrapling extract get 'https://example.com' article.md -s 'main'
JS-rendered page with browser automation
scrapling extract fetch 'https://example.com' page.html --timeout 20000
WeChat public article body
Use #js_content first. This is the default selector for article body extraction on mp.weixin.qq.com pages.
scrapling extract get 'https://mp.weixin.qq.com/s/ARTICLE_ID?scene=1' article.md -s '#js_content'
Step 5: Validate the Output
After every extraction, verify the file instead of assuming success:
wc -c article.md
sed -n '1,40p' article.md
For HTML output, check that the expected title, container, or selector target is actually present:
rg -n '<title>|js_content|rich_media_title|main' page.html
If the file is tiny, empty, or missing the expected container, the extraction did not succeed. Go back to Step 3 and switch fetchers or selectors.
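The same checks can be scripted. The following Python sketch flags a file that is missing, suspiciously small, or lacking expected markers. The 500-byte threshold is an arbitrary assumption chosen for illustration, and `validate_output` is a hypothetical helper.

```python
from pathlib import Path
from typing import List, Sequence


def validate_output(
    path: str, markers: Sequence[str] = (), min_bytes: int = 500
) -> List[str]:
    """Return a list of problems; an empty list means the file looks plausible."""
    p = Path(path)
    if not p.is_file():
        return [f"{path}: file is missing"]
    data = p.read_bytes()
    problems = []
    # Size check, comparable to `wc -c`; the threshold is an assumption.
    if len(data) < min_bytes:
        problems.append(f"{path}: only {len(data)} bytes (threshold {min_bytes})")
    # Marker check, comparable to searching for the title or container.
    text = data.decode("utf-8", errors="replace")
    missing = [m for m in markers if m not in text]
    if missing:
        problems.append(f"{path}: missing expected markers {missing}")
    return problems
```

For example, `validate_output("page.html", markers=("<title>", "js_content"))` returns an empty list only when the file exists, exceeds the size threshold, and contains both strings. An empty list is still not proof of a good extraction, so reading the first lines of the file remains worthwhile.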
Step 6: Handle Known Failure Modes
Local TLS trust store problem
If extract get fails with curl: (60) SSL certificate problem, treat it as a local trust-store problem first, not a Scrapling content failure.
Retry the same command with:
--no-verify
Only do this after confirming the failure matches the local certificate verification error pattern. Do not silently disable verification by default.
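A guarded retry along these lines can be sketched in Python. It assumes the error text quoted above appears in the captured output of the failed command; both function names are hypothetical.

```python
from typing import List


def is_local_cert_failure(output_text: str) -> bool:
    """True only when the output matches the curl (60) certificate error pattern."""
    return "curl: (60)" in output_text and "SSL certificate problem" in output_text


def with_no_verify(argv: List[str], output_text: str) -> List[str]:
    """Append --no-verify only when the failure is the known certificate error."""
    if is_local_cert_failure(output_text) and "--no-verify" not in argv:
        return [*argv, "--no-verify"]
    # Any other failure leaves verification enabled.
    return list(argv)
```

Keeping the check explicit means verification is never disabled for unrelated failures such as timeouts or HTTP errors.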
WeChat article pages
For mp.weixin.qq.com:
- Try `extract get` before `extract fetch`
- Use `-s '#js_content'` for the article body
- Validate the saved Markdown or HTML immediately
Browser-backed fetch failures
If extract fetch fails:
- Re-check the install with `python3 scripts/diagnose_scrapling.py`
- Confirm Chromium and Chrome Headless Shell are present
- Retry with a slightly longer timeout
- Escalate to `stealthy-fetch` only if the site behavior justifies it
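The "slightly longer timeout" step can be made concrete with a small schedule. The Python sketch below starts from the 20000 ms value used in the Step 4 example; the 10000 ms increment and the limit of three attempts are assumptions for illustration, and `fetch_attempts` is a hypothetical helper.

```python
from typing import Iterator, List


def fetch_attempts(
    url: str, output: str, start_ms: int = 20000, step_ms: int = 10000, tries: int = 3
) -> Iterator[List[str]]:
    """Yield `scrapling extract fetch` argv lists with a growing --timeout."""
    for i in range(tries):
        timeout = start_ms + i * step_ms
        yield [
            "scrapling", "extract", "fetch", url, output,
            "--timeout", str(timeout),
        ]
```

Each yielded command would be run and its output validated before the next is tried; if the whole schedule is exhausted, the last item in the checklist above applies.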
Command Patterns
Diagnose and smoke test a URL
python3 scripts/diagnose_scrapling.py --url 'https://example.com'
Diagnose and smoke test a WeChat article body
python3 scripts/diagnose_scrapling.py \
--url 'https://mp.weixin.qq.com/s/ARTICLE_ID?scene=1' \
--selector '#js_content' \
--no-verify
Diagnose and smoke test a browser-backed fetch
python3 scripts/diagnose_scrapling.py \
--url 'https://example.com' \
--dynamic
Guardrails
- Do not tell the user to reinstall blindly. Verify first.
- Do not default to the Python library API when the user is clearly asking about the CLI.
- Do not jump to browser-backed fetching unless the static result is missing the real content.
- Do not claim success from exit code alone. Inspect the saved file.
- Do not hardcode user-specific absolute paths into outputs or docs.
Resources
- Installation and smoke test helper: `scripts/diagnose_scrapling.py`
- Verified failure modes and recovery paths: `references/troubleshooting.md`