> clade-model-inference
Stream Claude responses, use system prompts, handle multi-turn conversations, and process structured output with the Messages API. Use when working with model-inference patterns. Trigger with "anthropic streaming", "claude messages api", "claude inference", "stream claude response".
curl "https://skillshub.wtf/jeremylongshore/claude-code-plugins-plus-skills/clade-model-inference?format=md"

Anthropic Messages API — Streaming & Advanced Patterns
Overview
The Messages API is the only inference endpoint. Every Claude interaction goes through client.messages.create(). This skill covers streaming, system prompts, vision, and structured output.
Prerequisites
- Completed clade-install-auth
- Familiarity with clade-hello-world
Instructions
Step 1: Streaming Responses
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const stream = client.messages.stream({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a haiku about TypeScript.' }],
});

for await (const event of stream) {
  if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
    process.stdout.write(event.delta.text);
  }
}

const finalMessage = await stream.finalMessage();
console.log('\n\nTokens:', finalMessage.usage);
```
Step 2: Vision — Sending Images
```typescript
import fs from 'node:fs';

const message = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{
    role: 'user',
    content: [
      {
        type: 'image',
        source: {
          type: 'base64',
          media_type: 'image/png',
          data: fs.readFileSync('screenshot.png').toString('base64'),
        },
      },
      { type: 'text', text: 'Describe what you see in this image.' },
    ],
  }],
});
```
Step 3: JSON / Structured Output
```typescript
const message = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  system: `Respond with valid JSON only. Schema: { "summary": string, "sentiment": "positive"|"negative"|"neutral", "confidence": number }`,
  messages: [{ role: 'user', content: 'Analyze: "This product exceeded my expectations!"' }],
});

const result = JSON.parse(message.content[0].text);
// { summary: "Very positive review", sentiment: "positive", confidence: 0.95 }
```
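Models occasionally wrap JSON in markdown code fences even when told not to, which breaks a bare `JSON.parse`. A defensive parse can be sketched as follows (`parse_model_json` is a hypothetical helper, not an SDK function):

```python
import json
import re

def parse_model_json(text: str) -> dict:
    """Parse JSON from a model reply, tolerating an optional markdown fence."""
    text = text.strip()
    # Strip a ```json ... ``` (or bare ```) wrapper if the model added one.
    match = re.match(r"^```(?:json)?\s*(.*?)\s*```$", text, re.DOTALL)
    if match:
        text = match.group(1)
    return json.loads(text)
```

`json.loads` still raises on genuinely malformed output, so callers can catch `json.JSONDecodeError` and re-prompt if needed.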
Python Streaming

```python
import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a haiku about Python."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

print(f"\nTokens: {stream.get_final_message().usage}")
```
Output
- Non-streaming: full Message object with content, usage, and stop_reason
- Streaming events:
  - message_start — message metadata
  - content_block_start — new content block beginning
  - content_block_delta — incremental text (text_delta) or tool input (input_json_delta)
  - message_delta — final stop_reason and usage
  - message_stop — stream complete
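The delta events above can be folded back into the full response text on the client side. A minimal sketch over plain dicts (the real SDK yields typed event objects, but the shape is the same):

```python
def accumulate_text(events: list) -> str:
    """Assemble the streamed text from content_block_delta events."""
    parts = []
    for event in events:
        if event["type"] == "content_block_delta" and event["delta"]["type"] == "text_delta":
            parts.append(event["delta"]["text"])
    return "".join(parts)
```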
Error Handling
| Error | Cause | Solution |
|---|---|---|
| overloaded_error (529) | Anthropic API temporarily overloaded | Retry with exponential backoff; client.messages.create has built-in retries |
| rate_limit_error (429) | Exceeded RPM or TPM | Check the retry-after header. See clade-rate-limits |
| invalid_request_error | Image too large or bad format | Max 20 images per request. Supported: PNG, JPEG, GIF, WebP. Max 5MB each |
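The backoff strategy for 429/529 responses can be sketched as below. `with_backoff` and `ApiError` are hypothetical names for illustration; note the official SDKs already retry these errors by default, so a hand-rolled loop is only needed around raw HTTP calls:

```python
import random
import time

class ApiError(Exception):
    """Stand-in for the SDK's status error; carries an HTTP status code."""
    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code

def with_backoff(call, max_retries=5, base_delay=1.0, retryable=(429, 529)):
    """Retry `call` on retryable status codes with jittered exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except ApiError as err:
            if err.status_code not in retryable or attempt == max_retries:
                raise
            # 1s, 2s, 4s, ... plus jitter to avoid synchronized retries
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))
```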
Key Parameters
| Parameter | Type | Description |
|---|---|---|
| model | string | Required. Model ID (e.g. claude-sonnet-4-20250514) |
| max_tokens | int | Required. Maximum output tokens (1–8192 typical) |
| messages | array | Required. Alternating user/assistant messages |
| system | string | Optional. System prompt for behavior/persona |
| temperature | float | Optional. 0.0–1.0, default 1.0 |
| top_p | float | Optional. Nucleus sampling threshold |
| stop_sequences | string[] | Optional. Custom stop strings |
| stream | boolean | Optional. Enable SSE streaming |
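The required/optional split above can be checked before a request leaves the client. A sketch under the assumption that the first message must have the user role (`build_request` is a hypothetical helper, not an SDK function):

```python
def build_request(model: str, messages: list, max_tokens: int, **optional) -> dict:
    """Assemble a Messages API request body, validating required fields."""
    if not model:
        raise ValueError("model is required")
    if max_tokens < 1:
        raise ValueError("max_tokens must be >= 1")
    if not messages or messages[0]["role"] != "user":
        raise ValueError("messages must start with a user turn")
    allowed = {"system", "temperature", "top_p", "stop_sequences", "stream"}
    unknown = set(optional) - allowed
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {"model": model, "messages": messages, "max_tokens": max_tokens, **optional}
```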
Examples
See Step 1 (streaming), Step 2 (vision with base64 images), and Step 3 (structured JSON output) above. Python streaming example included.
Next Steps
See clade-embeddings-search for tool use and function calling patterns.