openai-responses

Name: openai-responses
Author: JezWeb

OpenAIAI AgentAPIstateful AIconversational AIreasoningmulti-turndeveloper tools

⭐ 860📄 MIT🕒 2026-06-11Source ↗

Install this skill

npx skills add jezweb/claude-skills

Works across Claude Code, Cursor, Codex, Copilot & Antigravity

The OpenAI Responses API, introduced in March 2025, shifts away from stateless chat interactions toward persistent, stateful agentic workflows. Unlike standard Chat Completions that purge reasoning data between turns, the Responses API retains the model's internal reasoning notebook, yielding notable performance gains on tasks like TAUBench. This endpoint handles conversation state automatically via dedicated IDs, eliminating the overhead of manual message history management. It integrates server-side tools—such as Code Interpreter, web search, and Model Context Protocol (MCP)—directly into the request flow. This approach optimizes cache utilization by 40-80% compared to legacy endpoints and provides polymorphic output types. It serves as a dedicated interface for agents that require multi-turn reasoning and background execution capabilities, rather than simple, transient prompt-response cycles.

When to Use This Skill

•Building autonomous agents that perform multi-step research with web access
•Developing data analysis pipelines that execute sandboxed Python code
•Creating complex workflows that require connection to private APIs via MCP
•Maintaining long-running conversational contexts without manual message array tracking

How to Invoke This Skill

Example prompts that trigger this skill in Claude Code, Cursor, or Antigravity:

“Analyze this file using code interpreter
“Find recent articles on this topic and summarize them
“Execute this Python script to generate a chart
“Connect to my Stripe MCP server to check current balance
“Continue our previous conversation with the preserved reasoning state

Pro Tips

💡Leverage the preserved reasoning state to break down complex tasks into smaller, sequential steps, allowing the agent to 'think' through problems more effectively over multiple turns.
💡Design your agent to explicitly manage conversation IDs for seamless continuation, especially when integrating with external services or user interfaces.
💡Experiment with the polymorphic output types to create richer, more interactive agent responses beyond simple text, enhancing user experience and clarity.

What this skill does

•Automatic conversation state persistence using unique session IDs
•Server-side execution for Code Interpreter, File Search, and Web Search
•Native support for Model Context Protocol (MCP) server integration
•Background processing with standard 60-second or extended 10-minute timeouts
•Preservation of model reasoning state across multi-turn exchanges

When not to use it

✕Simple, single-turn query tasks where statelessness is preferred
✕Legacy applications relying on strict Chat Completions schema compatibility

Example workflow

Initialize the OpenAI client with your API credentials
Create a new conversation object to generate a session ID
Submit a request to the /v1/responses endpoint including the conversation ID
Include tool definitions for external access like MCP or web search
Handle approval states if the tool requires explicit user permission
Extract results from response.output_text

Prerequisites

–OpenAI SDK version 6.16.0 or higher
–Valid OpenAI API key
–Configured conversation session

Pitfalls & limitations

!TypeScript definitions for certain tool properties like web_search require type assertions
!MCP connections require explicit user approval unless pre-configured in the dashboard
!Conversation sessions expire after 90 days of inactivity
!Standard tool timeouts are limited to 30 seconds without background processing flags

FAQ

How does this differ from the standard Chat Completions endpoint?

The Responses API manages history automatically through conversation IDs and preserves the model's internal reasoning, whereas Chat Completions requires manual history management and discards reasoning between requests.

Can I use this for real-time web research?

Yes, you can enable the web_search tool, though you may need to use a type assertion in TypeScript to access the external_web_access parameter.

What happens if I try to use an MCP server?

The system checks for approval; if the server is not pre-approved in your dashboard, the response will return a requires_approval status, necessitating a subsequent authorized call.

How it compares

While Chat Completions requires the developer to manage context window state and token overhead manually, the Responses API offloads state tracking and reasoning persistence to the server.

Source & trust

⭐ 860 stars📄 MIT🕒 Updated 2026-06-11

View original skill on GitHub →

📄 Full skill instructions — original source: jezweb/claude-skills

# OpenAI Responses API

**Status**: Production Ready
**Last Updated**: 2026-01-21
**API Launch**: March 2025
**Dependencies**: [email protected] (Node.js) or fetch API (Cloudflare Workers)

---

## What Is the Responses API?

OpenAI's unified interface for agentic applications, launched **March 2025**. Provides **stateful conversations** with **preserved reasoning state** across turns.

**Key Innovation:** Unlike Chat Completions (reasoning discarded between turns), Responses **preserves the model's reasoning notebook**, improving performance by **5% on TAUBench** and enabling better multi-turn interactions.

**vs Chat Completions:**

| Feature | Chat Completions | Responses API |
|---------|-----------------|---------------|
| State | Manual history tracking | Automatic (conversation IDs) |
| Reasoning | Dropped between turns | Preserved across turns (+5% TAUBench) |
| Tools | Client-side round trips | Server-side hosted |
| Output | Single message | Polymorphic (8 types) |
| Cache | Baseline | **40-80% better utilization** |
| MCP | Manual | Built-in |

---

## Quick Start

npm install [email protected]

import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'What are the 5 Ds of dodgeball?',
});

console.log(response.output_text);

**Key differences from Chat Completions:**
- Endpoint: /v1/responses (not /v1/chat/completions)
- Parameter: input (not messages)
- Role: developer (not system)
- Output: response.output_text (not choices[0].message.content)

---

## When to Use Responses vs Chat Completions

**Use Responses:**
- Agentic applications (reasoning + actions)
- Multi-turn conversations (preserved reasoning = +5% TAUBench)
- Built-in tools (Code Interpreter, File Search, Web Search, MCP)
- Background processing (60s standard, 10min extended timeout)

**Use Chat Completions:**
- Simple one-off generation
- Fully stateless interactions
- Legacy integrations

---

## Stateful Conversations

**Automatic State Management** using conversation IDs:

// Create conversation
const conv = await openai.conversations.create({
  metadata: { user_id: 'user_123' },
});

// First turn
const response1 = await openai.responses.create({
  model: 'gpt-5',
  conversation: conv.id,
  input: 'What are the 5 Ds of dodgeball?',
});

// Second turn - model remembers context + reasoning
const response2 = await openai.responses.create({
  model: 'gpt-5',
  conversation: conv.id,
  input: 'Tell me more about the first one',
});

**Benefits:** No manual history tracking, reasoning preserved, 40-80% better cache utilization

**Conversation Limits:** 90-day expiration

---

## Built-in Tools (Server-Side)

**Server-side hosted tools** eliminate backend round trips:

| Tool | Purpose | Notes |
|------|---------|-------|
| code_interpreter | Execute Python code | Sandboxed, 30s timeout (use background: true for longer) |
| file_search | RAG without vector stores | Max 512MB per file, supports PDF/Word/Markdown/HTML/code |
| web_search | Real-time web information | Automatic source citations |
| image_generation | DALL-E integration | DALL-E 3 default |
| mcp | Connect external tools | OAuth supported, tokens NOT stored |

**Usage:**

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Calculate mean of: 10, 20, 30, 40, 50',
  tools: [{ type: 'code_interpreter' }],
});

### Web Search TypeScript Note

**TypeScript Limitation**: The web_search tool's external_web_access option is missing from SDK types (as of v6.16.0).

**Workaround**:

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Search for recent news',
  tools: [{
    type: 'web_search',
    external_web_access: true,
  } as any],  // ✅ Type assertion to suppress error
});

**Source**: [GitHub Issue #1716](https://github.com/openai/openai-node/issues/1716)

---

## MCP Server Integration

Built-in support for **Model Context Protocol (MCP)** servers to connect external tools (Stripe, databases, custom APIs).

### User Approval Requirement

**By default, explicit user approval is required** before any data is shared with a remote MCP server (security feature).

**Handling Approval**:

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Get my Stripe balance',
  tools: [{
    type: 'mcp',
    server_label: 'stripe',
    server_url: 'https://mcp.stripe.com',
    authorization: process.env.STRIPE_TOKEN,
  }],
});

if (response.status === 'requires_approval') {
  // Show user: "This action requires sharing data with Stripe. Approve?"
  // After user approves, retry with approval token
}

**Alternative**: Pre-approve MCP servers in OpenAI dashboard (users configure trusted servers via settings)

**Source**: [Official MCP Guide](https://platform.openai.com/docs/guides/tools-connectors-mcp)

### Basic MCP Usage

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Roll 2d6 dice',
  tools: [{
    type: 'mcp',
    server_label: 'dice',
    server_url: 'https://example.com/mcp',
    authorization: process.env.TOKEN, // ⚠️ NOT stored, required each request
  }],
});

**MCP Output Types:**
- mcp_list_tools - Tools discovered on server
- mcp_call - Tool invocation + result
- message - Final response

---

## Reasoning Preservation

**Key Innovation:** Model's internal reasoning state survives across turns (unlike Chat Completions which discards it).

**Visual Analogy:**
- Chat Completions: Model tears out scratchpad page before responding
- Responses API: Scratchpad stays open for next turn

**Performance:** +5% on TAUBench (GPT-5) purely from preserved reasoning

**Reasoning Summaries** (free):

response.output.forEach(item => {
  if (item.type === 'reasoning') console.log(item.summary[0].text);
  if (item.type === 'message') console.log(item.content[0].text);
});

### Important: Reasoning Traces Privacy

**What You Get**: Reasoning summaries (not full internal traces)
**What OpenAI Keeps**: Full chain-of-thought reasoning (proprietary, for security/privacy)

For GPT-5-Thinking models:
- OpenAI preserves reasoning **internally** in their backend
- This preserved reasoning improves multi-turn performance (+5% TAUBench)
- But developers only receive **summaries**, not the actual chain-of-thought
- Full reasoning traces are not exposed (OpenAI's IP protection)

**Source**: [Sean Goedecke Analysis](https://www.seangoedecke.com/responses-api/)

---

## Background Mode

For long-running tasks, use background: true:

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Analyze 500-page document',
  background: true,
  tools: [{ type: 'file_search', file_ids: [fileId] }],
});

// Poll for completion (check every 5s)
const result = await openai.responses.retrieve(response.id);
if (result.status === 'completed') console.log(result.output_text);

**Timeout Limits:**
- Standard: 60 seconds
- Background: 10 minutes

### Performance Considerations

**Time-to-First-Token (TTFT) Latency:**
Background mode currently has higher TTFT compared to synchronous responses. OpenAI is working to reduce this gap.

**Recommendation:**
- For user-facing real-time responses, use sync mode (lower latency)
- For long-running async tasks, use background mode (latency acceptable)

**Source**: [OpenAI Background Mode Docs](https://platform.openai.com/docs/guides/background)

---

## Data Retention and Privacy

**Default Retention**: 30 days when store: true (default)
**Zero Data Retention (ZDR)**: Organizations with ZDR automatically enforce store: false
**Background Mode**: NOT ZDR compatible (stores data ~10 minutes for polling)

**Timeline**:
- September 26, 2025: OpenAI court-ordered retention ended
- Current: 30-day default retention with store: true

**Control Storage**:

// Disable storage (no retention)
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Hello!',
  store: false,  // ✅ No retention
});

// ZDR organizations: store always treated as false
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Hello!',
  store: true,  // ⚠️ Ignored by OpenAI for ZDR orgs, treated as false
});

**ZDR Compliance**:
- Avoid background mode (requires temporary storage)
- Explicitly set store: false for clarity
- Note: 60s timeout applies in sync mode

**Source**: [OpenAI Data Controls](https://platform.openai.com/docs/guides/your-data)

---

## Polymorphic Outputs

Returns **8 output types** instead of single message:

| Type | Example |
|------|---------|
| message | Final answer, explanation |
| reasoning | Step-by-step thought process (free!) |
| code_interpreter_call | Python code + results |
| mcp_call | Tool name, args, output |
| mcp_list_tools | Tool definitions from MCP server |
| file_search_call | Matched chunks, citations |
| web_search_call | URLs, snippets |
| image_generation_call | Image URL |

**Processing:**

response.output.forEach(item => {
  if (item.type === 'reasoning') console.log(item.summary[0].text);
  if (item.type === 'web_search_call') console.log(item.results);
  if (item.type === 'message') console.log(item.content[0].text);
});

// Or use helper for text-only
console.log(response.output_text);

---

## Migration from Chat Completions

**Breaking Changes:**

| Feature | Chat Completions | Responses API |
|---------|-----------------|---------------|
| Endpoint | /v1/chat/completions | /v1/responses |
| Parameter | messages | input |
| Role | system | developer |
| Output | choices[0].message.content | output_text |
| State | Manual array | Automatic (conversation ID) |
| Streaming | data: {"choices":[...]} | SSE with 8 item types |

**Example:**

// Before
const response = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
});
console.log(response.choices[0].message.content);

// After
const response = await openai.responses.create({
  model: 'gpt-5',
  input: [
    { role: 'developer', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
});
console.log(response.output_text);

---

## Migration from Assistants API

**CRITICAL: Assistants API Sunset Timeline**

- **August 26, 2025**: Assistants API officially deprecated
- **2025-2026**: OpenAI providing migration utilities
- **August 26, 2026**: Assistants API sunset (stops working)

**Migrate before August 26, 2026** to avoid breaking changes.

**Source**: [Assistants API Sunset Announcement](https://community.openai.com/t/assistants-api-beta-deprecation-august-26-2026-sunset/1354666)

**Key Breaking Changes:**

| Assistants API | Responses API |
|----------------|---------------|
| Assistants (created via API) | Prompts (created in dashboard) |
| Threads | Conversations (store items, not just messages) |
| Runs (server-side lifecycle) | Responses (stateless calls) |
| Run-Steps | Items (polymorphic outputs) |

**Migration Example:**

// Before (Assistants API - deprecated)
const assistant = await openai.beta.assistants.create({
  model: 'gpt-4',
  instructions: 'You are helpful.',
});

const thread = await openai.beta.threads.create();

const run = await openai.beta.threads.runs.create(thread.id, {
  assistant_id: assistant.id,
});

// After (Responses API - current)
const conversation = await openai.conversations.create({
  metadata: { purpose: 'customer_support' },
});

const response = await openai.responses.create({
  model: 'gpt-5',
  conversation: conversation.id,
  input: [
    { role: 'developer', content: 'You are helpful.' },
    { role: 'user', content: 'Hello!' },
  ],
});

**Migration Guide**: [Official Assistants Migration Docs](https://platform.openai.com/docs/guides/migrate-to-responses)

---

## Known Issues Prevention

This skill prevents **11** documented errors:

**1. Session State Not Persisting**
- Cause: Not using conversation IDs or using different IDs per turn
- Fix: Create conversation once (const conv = await openai.conversations.create()), reuse conv.id for all turns

**2. MCP Server Connection Failed** (mcp_connection_error)
- Causes: Invalid URL, missing/expired auth token, server down
- Fix: Verify URL is correct, test manually with fetch(), check token expiration

**3. Code Interpreter Timeout** (code_interpreter_timeout)
- Cause: Code runs longer than 30 seconds
- Fix: Use background: true for extended timeout (up to 10 min)

**4. Image Generation Rate Limit** (rate_limit_error)
- Cause: Too many DALL-E requests
- Fix: Implement exponential backoff retry (1s, 2s, 3s delays)

**5. File Search Relevance Issues**
- Cause: Vague queries return irrelevant results
- Fix: Use specific queries ("pricing in Q4 2024" not "find pricing"), filter by chunk.score > 0.7

**6. Cost Tracking Confusion**
- Cause: Responses bills for input + output + tools + stored conversations (vs Chat Completions: input + output only)
- Fix: Set store: false if not needed, monitor response.usage.tool_tokens

**7. Conversation Not Found** (invalid_request_error)
- Causes: ID typo, conversation deleted, or expired (90-day limit)
- Fix: Verify exists with openai.conversations.list() before using

**8. Tool Output Parsing Failed**
- Cause: Accessing wrong output structure
- Fix: Use response.output_text helper or iterate response.output.forEach(item => ...) checking item.type

**9. Zod v4 Incompatibility with Structured Outputs**
- **Error**: Invalid schema for response_format 'name': schema must be a JSON Schema of 'type: "object"', got 'type: "string"'.
- **Source**: [GitHub Issue #1597](https://github.com/openai/openai-node/issues/1597)
- **Why It Happens**: SDK's vendored zod-to-json-schema library doesn't support Zod v4 (missing ZodFirstPartyTypeKind export)
- **Prevention**: Pin to Zod v3 ("zod": "^3.23.8") or use custom zodTextFormat with z.toJSONSchema({ target: "draft-7" })

// Workaround: Pin to Zod v3 (recommended)
{
  "dependencies": {
    "openai": "^6.16.0",
    "zod": "^3.23.8"  // DO NOT upgrade to v4 yet
  }
}

**10. Background Mode Web Search Missing Sources**
- **Error**: web_search_call output items contain query but no sources/results
- **Source**: [GitHub Issue #1676](https://github.com/openai/openai-node/issues/1676)
- **Why It Happens**: When using background: true + web_search tool, OpenAI doesn't return sources in the response
- **Prevention**: Use synchronous mode (background: false) when web search sources are needed

// ✅ Sources available in sync mode
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Latest AI news?',
  background: false,  // Required for sources
  tools: [{ type: 'web_search' }],
});

**11. Streaming Mode Missing output_text Helper**
- **Error**: finalResponse().output_text is undefined in streaming mode
- **Source**: [GitHub Issue #1662](https://github.com/openai/openai-node/issues/1662)
- **Why It Happens**: stream.finalResponse() doesn't include output_text convenience field (only available in non-streaming responses)
- **Prevention**: Listen for output_text.done event or manually extract from output items

// Workaround: Listen for event
const stream = openai.responses.stream({ model: 'gpt-5', input: 'Hello!' });
let outputText = '';
for await (const event of stream) {
  if (event.type === 'output_text.done') {
    outputText = event.output_text;  // ✅ Available in event
  }
}

---

## Critical Patterns

**✅ Always:**
- Use conversation IDs for multi-turn (40-80% better cache)
- Handle all 8 output types in polymorphic responses
- Use background: true for tasks >30s
- Provide MCP authorization tokens (NOT stored, required each request)
- Monitor response.usage.total_tokens for cost control

**❌ Never:**
- Expose API keys in client-side code
- Assume single message output (use response.output_text helper)
- Reuse conversation IDs across users (security risk)
- Ignore error types (handle rate_limit_error, mcp_connection_error specifically)
- Poll faster than 1s for background tasks (use 5s intervals)

---

## References

**Official Docs:**
- Responses API Guide: https://platform.openai.com/docs/guides/responses
- API Reference: https://platform.openai.com/docs/api-reference/responses
- MCP Integration: https://platform.openai.com/docs/guides/tools-connectors-mcp
- Blog Post: https://developers.openai.com/blog/responses-api/
- Starter App: https://github.com/openai/openai-responses-starter-app

**Skill Resources:** templates/, references/responses-vs-chat-completions.md, references/mcp-integration-guide.md, references/built-in-tools-guide.md, references/migration-guide.md, references/top-errors.md

---

**Last verified**: 2026-01-21 | **Skill version**: 2.1.0 | **Changes**: Added 3 TIER 1 issues (Zod v4, background web search, streaming output_text), 2 TIER 2 findings (MCP approval, reasoning privacy), Data Retention & ZDR section, Assistants API sunset timeline, background mode TTFT note, web search TypeScript limitation. Updated SDK version to 6.16.0.

---

---
paths: "**/*.ts", "**/*.tsx", "**/*openai*.ts"
---

# OpenAI Responses API Corrections

Claude's training may reference Chat Completions. This project uses **Responses API**.

## Key Differences from Chat Completions

/* ❌ Chat Completions pattern */
const response = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [{ role: 'system', content: '...' }, { role: 'user', content: '...' }],
})
const text = response.choices[0].message.content

/* ✅ Responses API pattern */
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Hello',
  instructions: '...', // replaces system message
})
const text = response.output_text // helper property

## System Role → Developer Role

/* ❌ Chat Completions system role */
messages: [{ role: 'system', content: 'You are helpful' }]

/* ✅ Responses API uses 'developer' or instructions */
instructions: 'You are helpful'
// OR in input array:
input: [{ role: 'developer', content: 'You are helpful' }]

## Stateful Conversations

/* ❌ Managing history manually */
const messages = [...history, newMessage]

/* ✅ Use conversation_id for automatic state */
// First message
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Hello',
})
const conversationId = response.conversation_id

// Subsequent messages - history managed automatically
const response2 = await openai.responses.create({
  model: 'gpt-5',
  input: 'Follow up question',
  conversation_id: conversationId, // Preserves context
})

## Conversation Expiration

/* ⚠️ Conversations expire after 90 days */
// Verify conversation exists before using
try {
  const response = await openai.responses.create({
    model: 'gpt-5',
    input: 'Hello',
    conversation_id: oldConversationId,
  })
} catch (error) {
  // Conversation expired - start new one
  const response = await openai.responses.create({
    model: 'gpt-5',
    input: 'Hello',
  })
}

## Cost Tracking

/* ⚠️ Responses API has additional costs */
// Cost = input + output + tools + stored conversations

/* Disable storage if not needed */
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Hello',
  store: false, // Don't persist conversation
})

## Quick Fixes

| Chat Completions | Responses API |
|-----------------|---------------|
| /v1/chat/completions | /v1/responses |
| messages | input |
| role: 'system' | role: 'developer' or instructions |
| choices[0].message.content | output_text |
| Manual history | conversation_id |

By JezWeb

How to Use This Skill Unit

Option A: Project-Specific (Recommended)

Click "Download" above
In your project, create the directory: .agent/skills/openai-responses/
Save the file as SKILL.md
The agent will automatically discover the skill based on its description.

Option B: Global Installation (All Agents)

Save the file to these locations to make it available across all projects:

Claude Code: ~/.claude/skills/jezweb/claude-skills/openai-responses/SKILL.md
Cursor: ~/.cursor/skills/jezweb/claude-skills/openai-responses/SKILL.md
Antigravity: ~/.gemini/antigravity/skills/jezweb/claude-skills/openai-responses/SKILL.md

🚀 Install with CLI:
npx skills add jezweb/claude-skills

Read the Master Guide: Mastering Agent Skills →

Recommended Rules

View more rules →

Recommended Workflows

View more workflows →

Secure API from CSRF

SecurityCSRFAPI

--- description: Prevent CSRF attacks --- 1. **Use SameSite Cookies**: ```ts response.headers.set('Set-Cookie', 'token=abc; SameSite=Strict; Ht...

Generate TypeScript Types from API

TypeScriptAPICodegen

--- description: Auto-generate type-safe API client from OpenAPI/Swagger spec --- 1. **Get Your API Schema**: - Most APIs expose OpenAPI spec at `...

Debug API Issues with Network Tab

APIDebuggingDevTools

--- description: Master Chrome DevTools Network tab for API debugging --- 1. **Open Network Tab**: - Press F12 → Network tab. - Reload the page...

Recommended MCP Servers

View more MCP servers →

AgentOps

Official

Provide observability and tracing for debugging AI agents with [AgentOps](https://www.agentops.ai/) API.

Alibaba Cloud DataWorks

Official

A Model Context Protocol (MCP) server that provides tools for AI, allowing it to interact with the [DataWorks](https://www.alibabacloud.com/help/en/dataworks/) Open API through a standardized interface. This implementation is based on the Alibaba Cloud Open API and enables AI agents to perform cloud resources operations seamlessly.