> computer-vision-expert

Design, implement, debug, and review computer vision systems in Python, including image processing, detection, segmentation, classification, tracking, OCR, camera pipelines, and dataset-driven evaluation. Use when working with OpenCV, PyTorch vision models, video/image analysis, model-selection tradeoffs, annotation strategy, failure analysis, or CV performance and robustness problems.

fetch

$curl "https://skillshub.wtf/vosslab/vosslab-skills/computer-vision-expert?format=md"

SKILL.md•computer-vision-expert

Computer Vision Expert

Overview

Use this skill to turn vague "make the model see better" requests into explicit computer-vision workflows with measurable inputs, outputs, and failure modes. Prefer simple, testable pipelines and evidence-driven evaluation over fashionable models or premature complexity.

Workflow

Define the exact vision task.

Determine whether the task is classification, detection, segmentation, keypoints, tracking, OCR, retrieval, restoration, or measurement.
Identify the input domain: still image, video, live camera, document scan, microscopy, satellite, industrial, or another domain.
Define success in measurable terms such as accuracy, recall, latency, FPS, false positives, localization error, or downstream business impact.
Read references/task_selection.md when the request is underspecified or multiple CV framings are possible.

Choose the simplest viable pipeline.

Start with a baseline that can be inspected and benchmarked.
Separate data ingestion, preprocessing, inference, postprocessing, and evaluation.
Prefer classical CV when the task is geometric, threshold-driven, template-based, or small-data.
Prefer learned models when invariance, semantic understanding, or scale make heuristic pipelines brittle.
Read references/pipeline_design.md when choosing between classical CV, hybrid, and model-heavy approaches.
Read the local OpenCV references when they match the task: references/Learning_OpenCV.txt for broad OpenCV techniques and feature/matching workflows; references/OpenCV_Cookbook.txt for practical implementation patterns; references/Video_Object_Tracking.txt for tracking-specific tasks, datasets, and methods.

Make data quality explicit.

Check label quality, class balance, resolution, compression artifacts, lighting, occlusion, and domain shift.
Treat poor annotations and mismatched evaluation data as first-order causes of failure.
Do not blame the model first when the dataset is weak or misaligned.

Evaluate before optimizing.

Establish a baseline metric and representative validation set.
Inspect failures by category rather than averaging everything into one score.
Measure speed, memory, and deployment constraints alongside accuracy.
Read references/debugging_and_failure_analysis.md when performance is unstable, errors cluster in strange ways, or the model seems to fail "randomly."

Improve iteratively.

Change one major factor at a time: data, preprocessing, architecture, thresholding, postprocessing, or evaluation.
Prefer targeted fixes tied to a known failure mode.
Keep intermediate visualizations so behavior is explainable.

Implementation defaults

Use OpenCV for image I/O, geometry, filtering, thresholding, contour work, calibration, and fast visual debugging.
Use PyTorch-based models when the task needs modern learned vision methods and the environment supports them.
Save representative outputs with overlays, masks, boxes, or heatmaps so predictions can be inspected visually.
For video, define frame sampling, buffering, temporal smoothing, and throughput requirements up front.
For OCR, treat document cleanup, orientation, crop quality, and layout structure as part of the pipeline, not preprocessing trivia.
If the work is OpenCV-heavy, check the local books before reinventing standard pipelines or utility patterns.

Quality bar

Favor measurable improvement over architecture churn.
Favor robust pipelines over benchmark theater.
Avoid changing data, model, thresholds, and evaluation all at once.
Avoid shipping a CV system that has never been reviewed on hard negatives and edge cases.
State what the model cannot do, not only what it can do.

Output expectations

When using this skill, aim to produce:

A clearly framed CV task with explicit inputs, outputs, and success metrics.
A pipeline design that can be implemented and debugged in stages.
Visual inspection artifacts for representative successes and failures.
A short explanation of the main failure modes and the next best improvement step.

> related_skills --same-repo

> pdf

Use when tasks involve reading, creating, or reviewing PDF files where rendering and layout matter; prefer visual checks by rendering pages (Poppler) and use Python tools such as `reportlab`, `pdfplumber`, and `pypdf` for generation and extraction.

> webwork-writer

Create, edit, and lint WeBWorK PG/PGML questions following docs/webwork guidance, HTML whitelist constraints, and renderer-based lint checks. Use for tasks like authoring new PGML problems, adjusting randomization or grading, fixing PGML rendering issues, and running renderer API linting.

> unit-test-starter

Generate thorough Python 3 pytest unit tests across a repo by scanning every *.py file and each function, writing one test module per source file while skipping IO/network behavior and documenting gaps.

> skill-writing-guide

Guide for authoring Agent Skills (SKILL.md). Covers the open standard format, required frontmatter, directory layout, progressive disclosure, description writing, and best practices. Use when creating a new skill, improving an existing skill, or learning how skills work.

┌ stats

installs/wk0

░░░░░░░░░░

github stars2

░░░░░░░░░░

first seenMar 18, 2026

└────────────

┌ repo

vosslab/vosslab-skills

by vosslab

└────────────

┌ tags

#ml #python

└────────────