found 287 skills in registry
Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.
Use when implementing any feature or bugfix, before writing implementation code
Get best practices for NUnit unit testing, including data-driven tests
Add instrumentation, build golden datasets, write eval-based tests, run them, root-cause failures, and iterate to ensure your Python LLM application works correctly. Use this skill whenever a user is developing, testing, QA-ing, evaluating, or benchmarking a Python project that calls an LLM, for example when catching regressions after prompt changes, fixing unexpected behavior, or validating output quality before shipping.
Get best practices for MSTest 3.x/4.x unit testing, including modern assertion APIs and data-driven tests
Get best practices for JUnit 5 unit testing, including data-driven tests
Get best practices for TUnit unit testing, including data-driven tests
Website exploration for testing using Playwright MCP
Creates an integration testing plan for .NET data access artifacts during Oracle-to-PostgreSQL database migrations. Analyzes a single project to identify repositories, DAOs, and service layers that interact with the database, then produces a structured testing plan. Use when planning integration test coverage for a migrated project, identifying which data access methods need tests, or preparing for Oracle-to-PostgreSQL migration validation.
Get best practices for xUnit unit testing, including data-driven tests
X-ray any AI model's behavioral patterns — refusal boundaries, hallucination tendencies, reasoning style, formatting defaults. No API key needed.
Provide comprehensive techniques for testing REST, SOAP, and GraphQL APIs during bug bounty hunting and penetration testing engagements. Covers vulnerability discovery, authentication bypass, IDOR exploitation, and API-specific attack vectors.
Automated end-to-end UI testing and verification on an Android Emulator using ADB.
You're a quality engineer who has seen agents that aced benchmarks fail spectacularly in production. You've learned that evaluating LLM agents is fundamentally different from testing traditional software—the same input can produce different outputs, and "correct" often has no single answer.
API security testing workflow for REST and GraphQL APIs covering authentication, authorization, rate limiting, input validation, and security best practices.
Guides developers through detecting and preventing side-channel attacks, timing leaks, and constant-time violations in cryptographic implementations. Covers techniques for identifying timing side channels in crypto code, including cache-timing attacks, secret-dependent branching, and microarchitectural leakage. Applies formal verification, statistical analysis (dudect), and dynamic tracing (timecop) to audit crypto primitives for timing vulnerabilities and ensure constant-time execution.
Application security testing toolkit from the Trail of Bits Testing Handbook. Helps the agent set up fuzzing campaigns, write fuzz harnesses, run coverage-guided fuzzers (libFuzzer, AFL++, cargo-fuzz, Atheris, Ruzzy), and triage crashes. Covers memory-safety sanitizers (AddressSanitizer, UBSan, MSan), static analysis with Semgrep and CodeQL, cryptographic validation using Wycheproof test vectors, and constant-time verification. Use when testing C, C++, Rust, Python, or Ruby code for vulnerabilit
Guides agents in using Wycheproof test vectors to validate cryptographic implementations against known attacks, edge cases, and vulnerability patterns. Covers integrating test vectors for AES-GCM, ECDSA, ECDH, EdDSA, RSA, and ChaCha20-Poly1305 into testing workflows. Helps when writing crypto tests, checking for signature malleability, invalid curve attacks, padding oracle issues, DER encoding bugs, or setting up CI for cryptographic libraries. Applies to verifying encryption, decryption, signin
cargo-fuzz is the primary fuzzing tool for Rust projects built with Cargo. It enables developers to set up fuzz testing, write fuzz harnesses, and run fuzzing campaigns using the libFuzzer backend. Covers installation, harness writing, structure-aware fuzzing with the arbitrary crate, sanitizer integration including AddressSanitizer, coverage analysis, seed corpus management, and troubleshooting common issues. Useful when a developer needs to fuzz Rust code, find memory bugs, test parsers, or im
Structured hypothesis formulation from observations. Use when you have experimental observations or data and need to formulate testable hypotheses with predictions, propose mechanisms, and design experiments to test them. Follows scientific method framework. For open-ended ideation use scientific-brainstorming; for automated LLM-driven hypothesis testing on datasets use hypogenic.