found 859 skills in registry
Run AI agent code safely in isolated sandboxes with resource limits, audit trails, and kill switches. Use when someone asks to "sandbox my agent", "run agent code safely", "add guardrails to AI agent", "isolate agent execution", "audit agent actions", "prevent agent from deleting files", "restrict agent permissions", or "add safety controls to AI coding agent". Covers Docker isolation, filesystem restrictions, network policies, resource locking, and comprehensive audit logging.
Run AI agent and LLM evaluations in CI/CD pipelines — automated quality gates that fail the build when AI output quality drops. Use when someone asks to "test my AI agent", "add evals to CI", "catch prompt regressions", "compare models", "evaluate LLM output quality", "set up AI quality gates", or "benchmark my agent before deploying". Covers eval frameworks (Cobalt, Promptfoo, Braintrust), LLM-as-judge scoring, threshold-based assertions, and GitHub Actions integration.
Add persistent memory to AI coding agents — file-based, vector, and semantic search memory systems that survive between sessions. Use when a user asks to "remember this", "add memory to my agent", "persist context between sessions", "build a knowledge base for my agent", "set up agent memory", or "make my AI remember things". Covers file-based memory (MEMORY.md), SQLite with embeddings, vector databases (ChromaDB, Pinecone), semantic search, memory consolidation, and automatic context injection.
Audit and refresh repo documentation sets against docs/REPO_STYLE.md, create missing docs only when supported by evidence, keep ALL CAPS doc naming under docs, and report what was created, updated, and flagged, along with known gaps.
Call 100+ LLM APIs with one interface using LiteLLM — unified API proxy for OpenAI, Anthropic, Google, Mistral, Cohere, and self-hosted models. Use when someone asks to "switch between LLM providers", "LiteLLM", "unified LLM API", "LLM proxy", "call Claude and GPT with the same code", "LLM load balancing", or "multi-model AI gateway". Covers provider routing, fallbacks, rate limiting, spend tracking, and self-hosted proxy.
Use when the user wants to create or update their product marketing context document. Also use when the user mentions 'product context,' 'marketing context,' 'set up context,' 'positioning,' or wants to avoid repeating foundational information across marketing tasks. Creates `.claude/product-marketing-context.md` that other marketing skills reference.
You are an expert in Traceloop and its OpenLLMetry SDK, the open-source observability framework that extends OpenTelemetry for LLM applications. You help developers instrument AI pipelines with automatic tracing for OpenAI, Anthropic, Cohere, LangChain, LlamaIndex, vector databases, and frameworks — exporting to any OpenTelemetry-compatible backend (Grafana Tempo, Jaeger, Datadog, Honeycomb, Traceloop Cloud).
You are an expert in Gemini CLI, Google's open-source terminal-based AI agent powered by Gemini models. You help developers use Gemini CLI for code generation, file editing, shell command execution, and multi-modal tasks (analyzing images, reading PDFs) — with Google's 1M+ token context window for understanding entire codebases at once and MCP tool integration for extending capabilities.
Expert guidance for DeepEval, the open-source framework for unit testing LLM applications. Helps developers write test cases, define custom metrics, and integrate LLM quality checks into CI/CD pipelines using a pytest-like interface.
Applies Anthropic's official brand colors and typography to any sort of artifact that may benefit from having Anthropic's look-and-feel. Use it when brand colors or style guidelines, visual formatting, or company design standards apply.
Force LLMs to return typed, validated JSON — not free-text. Use when someone asks to "get structured data from LLM", "parse LLM response as JSON", "make AI return typed output", "validate LLM output", "extract structured data with AI", "use instructor with OpenAI", or "get reliable JSON from Claude/GPT". Covers OpenAI structured outputs, Anthropic tool_use for structured data, Instructor library, Zod schemas, Pydantic models, and retry strategies for malformed responses.
Assists with building, evaluating, and deploying machine learning models using scikit-learn. Use when performing data preprocessing, feature engineering, model selection, hyperparameter tuning, cross-validation, or building pipelines for classification, regression, and clustering tasks. Trigger words: sklearn, scikit-learn, machine learning, classification, regression, pipeline, cross-validation.
Build LLM-powered applications with LangChain. Use when a user asks to create AI chains, build RAG pipelines, implement agents with tools, set up document loaders, create vector stores, build conversational AI, implement prompt templates, chain LLM calls, add memory to chatbots, or orchestrate language model workflows. Covers LangChain v0.3+ with LCEL (LangChain Expression Language), structured output, tool calling, retrieval, and production deployment patterns.
Run autonomous AI-driven penetration tests on web applications using tools like Shannon, PentAGI, and similar frameworks. Use when tasks involve setting up automated penetration testing pipelines, combining AI agents with security tools (nmap, subfinder, nuclei, sqlmap), building autonomous exploit chains, generating pentest reports with proof-of-concept exploits, or integrating AI pentesting into CI/CD pipelines. Covers the full pentest lifecycle from reconnaissance to reporting using AI orches
Serverless GPU compute platform for running Python functions in the cloud. Deploy ML models, run training jobs, and serve inference endpoints without managing infrastructure. Supports A100/H100 GPUs, custom container images, and scales to zero automatically.
You are an expert in DSPy, the Stanford framework that replaces prompt engineering with programming. You help developers define LLM tasks as typed signatures, compose them into modules, and automatically optimize prompts/few-shot examples using teleprompters — so instead of manually crafting prompts, you write Python code and DSPy finds the best prompts for your task.
Deploy and manage OpenClaw, a self-hosted gateway bridging messaging platforms to AI coding agents. Use when a user asks to set up OpenClaw, connect WhatsApp or Telegram or Discord to an AI agent, configure multi-agent routing, schedule cron jobs in OpenClaw, set up webhooks, manage OpenClaw channels, pair a messaging account, configure heartbeats, spawn sub-agents, or troubleshoot OpenClaw gateway issues. Covers installation, channel setup, agent configuration, cron scheduling, webhooks, and su
Expert guidance for Amazon Q Developer (formerly Fig), the terminal tool that provides IDE-style autocomplete, AI chat, and CLI builder capabilities. Helps developers create custom completion specs, build CLI tools with autocomplete, and configure terminal productivity features.
Build voice-enabled AI applications with the OpenAI Realtime API. Use when a user asks to implement real-time voice conversations, stream audio with WebSockets, build voice assistants, or integrate OpenAI audio capabilities.
Expert guidance for Streamlit, the Python framework for building interactive data applications and dashboards. Helps developers create web apps for data exploration, ML model demos, and internal tools using pure Python — no frontend skills required.