> cohere-reference-architecture

Implement Cohere reference architecture with layered project layout for RAG and agents. Use when designing new Cohere integrations, reviewing project structure, or establishing architecture standards for Cohere API v2 applications. Trigger with phrases like "cohere architecture", "cohere project structure", "cohere layout", "organize cohere app", "cohere design pattern".

fetch

$curl "https://skillshub.wtf/jeremylongshore/claude-code-plugins-plus-skills/cohere-reference-architecture?format=md"

SKILL.md•cohere-reference-architecture

Cohere Reference Architecture

Overview

Production-ready architecture for Cohere API v2 applications covering RAG pipelines, tool-use agents, and multi-model orchestration.

Prerequisites

Understanding of layered architecture
cohere-ai SDK v7+
TypeScript project with vitest

Project Structure

my-cohere-app/
├── src/
│   ├── cohere/
│   │   ├── client.ts           # CohereClientV2 singleton
│   │   ├── models.ts           # Model selection logic
│   │   ├── types.ts            # Cohere-specific types
│   │   └── errors.ts           # Error classification
│   ├── services/
│   │   ├── chat.ts             # Chat completions + streaming
│   │   ├── rag.ts              # RAG pipeline (embed → rerank → chat)
│   │   ├── agents.ts           # Tool-use agent loops
│   │   ├── embed.ts            # Batch embedding + caching
│   │   ├── rerank.ts           # Document reranking
│   │   └── classify.ts         # Few-shot classification
│   ├── tools/                  # Tool definitions for agents
│   │   ├── registry.ts         # Tool name → executor mapping
│   │   ├── search.ts
│   │   └── calculator.ts
│   ├── api/
│   │   ├── chat.ts             # POST /api/chat (streaming)
│   │   ├── embed.ts            # POST /api/embed
│   │   └── health.ts           # GET /api/health
│   └── cache/
│       └── embeddings.ts       # LRU cache for embeddings
├── tests/
│   ├── unit/
│   │   ├── chat.test.ts
│   │   ├── rag.test.ts
│   │   └── fixtures/           # Mocked API responses
│   └── integration/
│       └── cohere.test.ts      # Real API tests (gated)
├── config/
│   ├── models.json             # Model selection per environment
│   └── tools.json              # Tool definitions
└── package.json

Layer Architecture

┌─────────────────────────────────────────┐
│             API Layer                    │
│   (Express/Next.js routes, SSE stream)  │
├─────────────────────────────────────────┤
│           Service Layer                  │
│  (RAG pipeline, agent loop, classify)   │
├─────────────────────────────────────────┤
│          Cohere Layer                    │
│   (CohereClientV2, retry, model select) │
├─────────────────────────────────────────┤
│         Infrastructure Layer             │
│    (Embed cache, tool registry, queue)   │
└─────────────────────────────────────────┘

Core Components

Client Layer

// src/cohere/client.ts
import { CohereClientV2, CohereError, CohereTimeoutError } from 'cohere-ai';

let instance: CohereClientV2 | null = null;

export function getCohere(): CohereClientV2 {
  if (!instance) {
    instance = new CohereClientV2({
      token: process.env.CO_API_KEY,
    });
  }
  return instance;
}

// src/cohere/models.ts
export const MODELS = {
  chat: {
    premium: 'command-a-03-2025',
    standard: 'command-r-08-2024',
    fast: 'command-r7b-12-2024',
  },
  embed: {
    latest: 'embed-v4.0',
    english: 'embed-english-v3.0',
    multilingual: 'embed-multilingual-v3.0',
  },
  rerank: {
    latest: 'rerank-v3.5',
  },
} as const;

RAG Service

// src/services/rag.ts
import { getCohere } from '../cohere/client';
import { MODELS } from '../cohere/models';

interface RAGResult {
  answer: string;
  citations: Array<{ start: number; end: number; text: string; sources: string[] }>;
  model: string;
}

export async function ragQuery(
  query: string,
  documents: Array<{ id: string; text: string }>,
  options?: { model?: string; topN?: number }
): Promise<RAGResult> {
  const cohere = getCohere();
  const model = options?.model ?? MODELS.chat.standard;

  // Step 1: Rerank documents
  const reranked = await cohere.rerank({
    model: MODELS.rerank.latest,
    query,
    documents: documents.map(d => d.text),
    topN: options?.topN ?? 5,
  });

  // Step 2: Chat with top documents
  const topDocs = reranked.results.map(r => ({
    id: documents[r.index].id,
    data: { text: documents[r.index].text },
  }));

  const response = await cohere.chat({
    model,
    messages: [{ role: 'user', content: query }],
    documents: topDocs,
  });

  return {
    answer: response.message?.content?.[0]?.text ?? '',
    citations: (response.message?.citations ?? []).map(c => ({
      start: c.start,
      end: c.end,
      text: c.text,
      sources: c.sources?.map((s: any) => s.id) ?? [],
    })),
    model,
  };
}

Agent Service

// src/services/agents.ts
import { getCohere } from '../cohere/client';
import { MODELS } from '../cohere/models';
import { toolRegistry } from '../tools/registry';

export async function runAgent(
  userMessage: string,
  maxSteps = 5
): Promise<string> {
  const cohere = getCohere();
  const messages: any[] = [{ role: 'user', content: userMessage }];
  const tools = toolRegistry.getToolDefinitions();

  for (let step = 0; step < maxSteps; step++) {
    const response = await cohere.chat({
      model: MODELS.chat.premium,
      messages,
      tools,
    });

    if (response.finishReason !== 'TOOL_CALL') {
      return response.message?.content?.[0]?.text ?? '';
    }

    const toolCalls = response.message?.toolCalls ?? [];
    messages.push({ role: 'assistant', toolCalls });

    for (const tc of toolCalls) {
      const result = await toolRegistry.execute(
        tc.function.name,
        JSON.parse(tc.function.arguments)
      );
      messages.push({ role: 'tool', toolCallId: tc.id, content: result });
    }
  }

  return 'Agent reached max steps.';
}

Tool Registry

// src/tools/registry.ts
interface ToolDefinition {
  type: 'function';
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>;
  };
}

class ToolRegistry {
  private tools: Map<string, {
    definition: ToolDefinition;
    executor: (args: any) => Promise<string>;
  }> = new Map();

  register(
    name: string,
    description: string,
    parameters: Record<string, unknown>,
    executor: (args: any) => Promise<string>
  ) {
    this.tools.set(name, {
      definition: {
        type: 'function',
        function: { name, description, parameters },
      },
      executor,
    });
  }

  getToolDefinitions(): ToolDefinition[] {
    return Array.from(this.tools.values()).map(t => t.definition);
  }

  async execute(name: string, args: any): Promise<string> {
    const tool = this.tools.get(name);
    if (!tool) return JSON.stringify({ error: `Unknown tool: ${name}` });
    try {
      return await tool.executor(args);
    } catch (err) {
      return JSON.stringify({ error: String(err) });
    }
  }
}

export const toolRegistry = new ToolRegistry();

Error Classification

// src/cohere/errors.ts
import { CohereError, CohereTimeoutError } from 'cohere-ai';

export type ErrorCategory = 'auth' | 'rate_limit' | 'bad_request' | 'server' | 'timeout' | 'unknown';

export function classifyError(err: unknown): {
  category: ErrorCategory;
  retryable: boolean;
  message: string;
} {
  if (err instanceof CohereTimeoutError) {
    return { category: 'timeout', retryable: true, message: 'Request timed out' };
  }
  if (err instanceof CohereError) {
    switch (err.statusCode) {
      case 401: return { category: 'auth', retryable: false, message: 'Invalid API key' };
      case 429: return { category: 'rate_limit', retryable: true, message: 'Rate limited' };
      case 400: return { category: 'bad_request', retryable: false, message: err.message };
      default:
        if (err.statusCode && err.statusCode >= 500) {
          return { category: 'server', retryable: true, message: err.message };
        }
    }
  }
  return { category: 'unknown', retryable: false, message: String(err) };
}

Data Flow

User Query
     │
     ▼
┌─────────────┐
│   API Route  │  POST /api/chat
└──────┬──────┘
       │
       ▼
┌─────────────┐    ┌─────────────┐
│  RAG Service │───▶│  Rerank     │  rerank-v3.5
│  or Agent   │    │  Service    │
└──────┬──────┘    └─────────────┘
       │
       ▼
┌─────────────┐    ┌─────────────┐
│ Chat/Stream  │───▶│  Embed      │  embed-v4.0
│  Service    │    │  Cache      │  (cached)
└──────┬──────┘    └─────────────┘
       │
       ▼
┌─────────────┐
│ CohereClient │  command-a-03-2025
│     V2      │
└─────────────┘

Output

Layered architecture separating API, service, and client concerns
RAG pipeline with rerank pre-filtering and grounded citations
Agent loop with pluggable tool registry
Error classification for retry/alert decisions
Model selection per environment and use case

Error Handling

Issue	Cause	Solution
Circular imports	Wrong layering	Services depend on client, not vice versa
Tool not found	Missing registration	Register tools at startup
Model mismatch	Env config wrong	Validate model IDs at startup
Cache miss storm	TTL expired	Stale-while-revalidate pattern

Resources

Next Steps

For multi-environment setup, see cohere-multi-env-setup.

> related_skills --same-repo

> fathom-cost-tuning

Optimize Fathom API usage and plan selection. Trigger with phrases like "fathom cost", "fathom pricing", "fathom plan".

> fathom-core-workflow-b

Sync Fathom meeting data to CRM and build automated follow-up workflows. Use when integrating Fathom with Salesforce, HubSpot, or custom CRMs, or creating automated post-meeting email summaries. Trigger with phrases like "fathom crm sync", "fathom salesforce", "fathom follow-up", "fathom post-meeting workflow".

> fathom-core-workflow-a

Build a meeting analytics pipeline with Fathom transcripts and summaries. Use when extracting insights from meetings, building CRM sync, or creating automated meeting follow-up workflows. Trigger with phrases like "fathom analytics", "fathom meeting pipeline", "fathom transcript analysis", "fathom action items sync".

> fathom-common-errors

Diagnose and fix Fathom API errors including auth failures and missing data. Use when API calls fail, transcripts are empty, or webhooks are not firing. Trigger with phrases like "fathom error", "fathom not working", "fathom api failure", "fix fathom".

┌ stats

installs/wk0

░░░░░░░░░░

github stars2.1K

██████████

first seenMar 23, 2026

└────────────

┌ repo

jeremylongshore/claude-code-plugins-plus-skills

by jeremylongshore

└────────────