# axiom-ios-vision: iOS Computer Vision Router

Use when implementing ANY computer vision feature: image analysis, object detection, pose detection, person segmentation, subject lifting, and hand/body pose tracking.
You MUST use this skill for ANY computer vision work using the Vision framework.
## When to Use
Use this router when:
- Analyzing images or video
- Detecting objects, faces, or people
- Tracking hand or body pose
- Segmenting people or subjects
- Lifting subjects from backgrounds
- Recognizing text in images (OCR)
- Detecting barcodes or QR codes
- Scanning documents
- Using VisionKit or DataScannerViewController
- Integrating with Visual Intelligence (iOS 26+ system camera feature)
## Routing Logic

### Vision Work
Implementation patterns → /skill axiom-vision
- Subject segmentation (VisionKit)
- Hand pose detection (21 landmarks)
- Body pose detection (2D/3D)
- Person segmentation
- Face detection
- Isolating objects while excluding hands
- Text recognition (VNRecognizeTextRequest; sketch after this list)
- Barcode/QR detection (VNDetectBarcodesRequest)
- Document scanning (VNDocumentCameraViewController)
- Live scanning (DataScannerViewController)
- Structured document extraction (RecognizeDocumentsRequest, iOS 26+)
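As an example of the text-recognition pattern above, here is a minimal sketch of a VNRecognizeTextRequest against a still image; `.fast` is the usual trade-off for live video. Treat it as a starting point, not the full pattern the skill covers:

```swift
import Vision
import UIKit

// Minimal sketch: recognize text in a still image.
func recognizeText(in image: UIImage) throws {
    guard let cgImage = image.cgImage else { return }
    let request = VNRecognizeTextRequest { request, _ in
        let observations = request.results as? [VNRecognizedTextObservation] ?? []
        for observation in observations {
            // Each observation carries ranked candidate strings with confidence.
            if let best = observation.topCandidates(1).first {
                print(best.string, best.confidence)
            }
        }
    }
    request.recognitionLevel = .accurate      // .fast trades accuracy for speed
    request.recognitionLanguages = ["en-US"]  // language choice affects accuracy
    try VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])
}
```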
API reference → /skill axiom-vision-ref
- Complete Vision framework API
- VNDetectHumanHandPoseRequest (sketch after this list)
- VNDetectHumanBodyPoseRequest
- VNGenerateForegroundInstanceMaskRequest
- VNRecognizeTextRequest (fast/accurate modes)
- VNDetectBarcodesRequest (symbologies)
- DataScannerViewController delegates
- RecognizeDocumentsRequest (iOS 26+)
- Coordinate conversion patterns
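For orientation, a minimal sketch of the hand-pose API named above: `recognizedPoints(.all)` returns up to 21 joints keyed by joint name, each in Vision's normalized coordinate space, so confidence filtering and coordinate conversion still apply.

```swift
import Vision

// Sketch: detect hand pose and read one of the 21 landmarks.
func detectHandPose(in cgImage: CGImage) throws {
    let request = VNDetectHumanHandPoseRequest()
    request.maximumHandCount = 2
    try VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])
    for hand in request.results ?? [] {
        let joints = try hand.recognizedPoints(.all)  // up to 21 landmarks
        // Low-confidence points are unreliable; filter before use.
        if let indexTip = joints[.indexTip], indexTip.confidence > 0.3 {
            print("Index fingertip (normalized):", indexTip.location)
        }
    }
}
```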
Visual Intelligence integration → /skill axiom-vision-ref (see Visual Intelligence Integration section)
- Making app content discoverable to the Visual Intelligence camera
- IntentValueQuery and SemanticContentDescriptor (sketch below)
- Deep linking from Visual Intelligence results
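A rough sketch of the shape this integration takes, based on the iOS 26 App Intents APIs. The entity type here is a hypothetical placeholder, and the `pixelBuffer`/`labels` properties are assumed; confirm exact signatures against vision-ref before relying on them:

```swift
import AppIntents

// Hypothetical entity your app exposes to Visual Intelligence.
struct LandmarkEntity: AppEntity {
    static let typeDisplayRepresentation: TypeDisplayRepresentation = "Landmark"
    static let defaultQuery = LandmarkEntityQuery()
    var id: String
    var name: String
    var displayRepresentation: DisplayRepresentation {
        DisplayRepresentation(title: "\(name)")
    }
}

struct LandmarkEntityQuery: EntityQuery {
    func entities(for identifiers: [String]) async throws -> [LandmarkEntity] { [] }
}

// IntentValueQuery lets the system camera ask the app for content
// matching what it sees (assumed shape, per the iOS 26 APIs).
struct LandmarkIntentValueQuery: IntentValueQuery {
    func values(for input: SemanticContentDescriptor) async throws -> [LandmarkEntity] {
        // input.pixelBuffer carries the camera frame; input.labels carries
        // the system's coarse classification (assumed properties).
        guard input.pixelBuffer != nil else { return [] }
        // Run your own visual search over the frame and return matches.
        return []
    }
}
```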
Diagnostics → /skill axiom-vision-diag
- Subject not detected
- Hand pose missing landmarks
- Low confidence observations
- Performance issues
- Coordinate conversion bugs (see the conversion sketch after this list)
- Text not recognized or wrong characters
- Barcodes not detected
- DataScanner showing blank or no items
- Document edges not detected
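Coordinate bugs in particular are worth illustrating: Vision reports normalized rects with a bottom-left origin, while UIKit uses points with a top-left origin. A minimal sketch, assuming the image exactly fills the target view (aspect-fit letterboxing needs extra math):

```swift
import Vision
import UIKit

// Convert a Vision bounding box (normalized, bottom-left origin)
// into a UIKit rect (points, top-left origin).
func convert(_ boundingBox: CGRect, toViewOfSize viewSize: CGSize) -> CGRect {
    let imageRect = VNImageRectForNormalizedRect(
        boundingBox, Int(viewSize.width), Int(viewSize.height))
    // Flip the y-axis for UIKit's top-left origin.
    return CGRect(x: imageRect.minX,
                  y: viewSize.height - imageRect.maxY,
                  width: imageRect.width,
                  height: imageRect.height)
}
```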
## Decision Tree
- Implementing (pose, segmentation, OCR, barcodes, documents, live scanning)? → vision
- Visual Intelligence system integration (camera feature, iOS 26+)? → vision-ref (Visual Intelligence section)
- Need API reference / code examples? → vision-ref
- Debugging issues (detection failures, confidence, coordinates)? → vision-diag
## Anti-Rationalization
| Thought | Reality |
|---|---|
| "Vision framework is just a request/handler pattern" | Vision has coordinate conversion, confidence thresholds, and performance gotchas. vision covers them. |
| "I'll handle text recognition without the skill" | VNRecognizeTextRequest has fast/accurate modes and language-specific settings. vision has the patterns. |
| "Subject segmentation is straightforward" | Instance masks have HDR compositing and hand-exclusion patterns. vision covers complex scenarios. |
| "Visual Intelligence is just the camera API" | Visual Intelligence is a system-level feature requiring IntentValueQuery and SemanticContentDescriptor. vision-ref has the integration section. |
## Critical Patterns
**vision:**
- Subject segmentation with VisionKit
- Hand pose detection (21 landmarks)
- Body pose detection (2D/3D, up to 4 people)
- Isolating objects while excluding hands
- CoreImage HDR compositing
- Text recognition (fast vs accurate modes)
- Barcode detection (symbology selection; example after this list)
- Document scanning with perspective correction
- Live scanning with DataScannerViewController
- Structured document extraction (iOS 26+)
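To illustrate the symbology-selection point above, a short sketch; restricting `symbologies` to the codes you actually expect is the usual first step, for both speed and fewer false positives:

```swift
import Vision

// Sketch: detect QR codes only, rather than scanning all symbologies.
func detectQRCodes(in cgImage: CGImage) throws -> [String] {
    let request = VNDetectBarcodesRequest()
    request.symbologies = [.qr]  // narrow the search to expected codes
    try VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])
    return (request.results ?? []).compactMap { $0.payloadStringValue }
}
```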
**vision-diag:**
- Subject detection failures
- Landmark tracking issues
- Performance optimization
- Observation confidence thresholds
- Text recognition failures (language, contrast)
- Barcode detection issues (symbology, distance)
- DataScanner troubleshooting (availability sketch after this list)
- Document edge detection problems
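For the DataScanner case, the most common "blank scanner" cause is skipping the availability checks. A minimal sketch of the guard-then-configure flow:

```swift
import VisionKit

// Sketch: a blank DataScanner usually means one of these checks failed.
// isSupported covers hardware; isAvailable covers camera access/restrictions.
@MainActor
func makeScanner() -> DataScannerViewController? {
    guard DataScannerViewController.isSupported,
          DataScannerViewController.isAvailable else { return nil }
    return DataScannerViewController(
        recognizedDataTypes: [.barcode(symbologies: [.qr]), .text()],
        qualityLevel: .balanced,
        isHighlightingEnabled: true)
}
```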
## Example Invocations
User: "How do I detect hand pose in an image?"
→ Invoke: /skill axiom-vision
User: "Isolate a subject but exclude the user's hands"
→ Invoke: /skill axiom-vision
User: "How do I read text from an image?"
→ Invoke: /skill axiom-vision
User: "Scan QR codes with the camera"
→ Invoke: /skill axiom-vision
User: "How do I implement document scanning?"
→ Invoke: /skill axiom-vision
User: "Use DataScannerViewController for live text"
→ Invoke: /skill axiom-vision
User: "Subject detection isn't working"
→ Invoke: /skill axiom-vision-diag
User: "Text recognition returns wrong characters"
→ Invoke: /skill axiom-vision-diag
User: "Barcode not being detected"
→ Invoke: /skill axiom-vision-diag
User: "Show me VNDetectHumanBodyPoseRequest examples"
→ Invoke: /skill axiom-vision-ref
User: "What symbologies does VNDetectBarcodesRequest support?"
→ Invoke: /skill axiom-vision-ref
User: "RecognizeDocumentsRequest API reference"
→ Invoke: /skill axiom-vision-ref
User: "How do I make my app work with Visual Intelligence?"
→ Invoke: /skill axiom-vision-ref
User: "How do users discover my app content through the camera?"
→ Invoke: /skill axiom-vision-ref
## Related Skills (same repo)

- **axiom-xctrace-ref**: Use when automating Instruments profiling, running headless performance analysis, or integrating profiling into CI/CD. Comprehensive xctrace CLI reference with record/export patterns.
- **axiom-xctest-automation**: Use when writing, running, or debugging XCUITests. Covers element queries, waiting strategies, accessibility identifiers, test plans, and CI/CD test execution patterns.
- **axiom-xcode-mcp**: Use when connecting to Xcode via MCP, using xcrun mcpbridge, or working with ANY Xcode MCP tool (XcodeRead, BuildProject, RunTests, RenderPreview). Covers setup, tool reference, workflow patterns, troubleshooting.
- **axiom-xcode-mcp-tools**: Xcode MCP workflow patterns: BuildFix loop, TestFix loop, preview verification, window targeting, tool gotchas.