
openai-assistants

Tags: OpenAI, Assistants API, legacy code, API integration, AI agent, deprecation, migration, coding assistant
860 stars · MIT license · Updated 2026-06-11

Install this skill

npx skills add jezweb/claude-skills

Works across Claude Code, Cursor, Codex, Copilot & Antigravity

The OpenAI Assistants API is a framework for managing long-running AI conversations through stateful objects. It stores conversation threads on OpenAI's side, so you do not have to resend the full message history with each request. Developers work with four primitives: Assistants, which define behavior and tool access; Threads, which hold the persistent history; Messages, which represent individual turns; and Runs, which make the model process a thread against an assistant's instructions. The API still works for existing projects, but it is deprecated and scheduled for sunset on August 26, 2026, in favor of the newer Responses API, so current implementations should limit work to maintenance and migration. Using this skill requires familiarity with asynchronous polling and event-based streaming, the two ways to monitor a run while it executes.

When to Use This Skill

  • Building data analysis agents that execute Python code on user-provided CSVs
  • Constructing RAG-based document chatbots for internal technical documentation
  • Automating complex, multi-step workflows that require memory across several turns
  • Maintaining legacy systems that depend on persisted thread identifiers

How to Invoke This Skill

Example prompts that trigger this skill in Claude Code, Cursor, or Antigravity:

  • Initialize an OpenAI assistant with file search capabilities
  • Create a persistent thread for my AI tutor agent
  • Run the assistant on this thread and poll for completion
  • Stream the response from the current thread run
  • Attach a file to an existing message in my thread

Pro Tips

  • Write a clear migration plan to the OpenAI Responses API, and use this skill for phased transitions rather than new development.
  • Check OpenAI's official documentation regularly for updated migration guides so the transition is finished before the August 26, 2026 sunset.
  • Isolate Assistants API v2 code in its own module so the refactoring scope stays small when you migrate to the Responses API.

What this skill does

  • Stateful management of conversation history and threads
  • Code interpreter sandbox for executing Python analysis
  • Vector-based retrieval through automated file search
  • Tool-use orchestration via function calling and polling
  • Event streaming to display token-by-token output

When not to use it

  • Starting any new production projects or greenfield development
  • Applications requiring ultra-low latency without async polling overhead
  • Systems that can be easily implemented with standard chat completion requests

Example workflow

  1. Define an assistant object with custom instructions and selected tools
  2. Initialize a thread ID to track the conversation context
  3. Append user messages to the thread, including any file attachments
  4. Execute a run to trigger processing, receiving a unique run ID
  5. Poll the run status repeatedly until it reaches the completed state
  6. Fetch the final thread message list to retrieve the generated reply

Prerequisites

  • Valid OpenAI API key
  • Node.js environment with openai SDK version 6.16.0
  • Configured file storage for RAG or sandbox resources

Pitfalls & limitations

  • Runs will automatically expire if not completed within 10 minutes
  • File search index updates may experience slight latency after upload
  • Poll-based status checking increases system complexity compared to stateless calls
  • Migration is mandatory before the August 2026 sunset date

FAQ

Is the Assistants API deprecated?
Yes, it is deprecated and scheduled for sunset in August 2026. New projects should avoid this API and adopt the Responses API instead.
How does the assistant handle token limits?
The API manages conversation history internally. You can store up to 100k messages per thread, though prompts are still subject to the model's specific context window constraints.
Can I use this for real-time applications?
Real-time performance is limited because the API requires asynchronous polling or streaming to wait for the model to finish its processing run.
What happens if a run requires action?
The run status changes to 'requires_action', signaling that the agent needs you to execute a function call and submit the tool outputs back to the thread.

How it compares

Unlike standard chat completions that require you to manage the full context window manually, this API automates conversation state and history management on OpenAI's infrastructure.

Source & trust

📄 Full skill instructions — original source: jezweb/claude-skills
# OpenAI Assistants API v2

**Status**: Production Ready (⚠️ Deprecated - Sunset August 26, 2026)
**Package**: openai@6.16.0
**Last Updated**: 2026-01-21
**v1 Deprecated**: December 18, 2024
**v2 Sunset**: August 26, 2026 (migrate to Responses API)

---

## ⚠️ Deprecation Notice

**OpenAI is deprecating Assistants API in favor of [Responses API](../openai-responses/SKILL.md).**

**Timeline**: v1 deprecated Dec 18, 2024 | v2 sunset August 26, 2026

**Use this skill if**: Maintaining legacy apps or migrating existing code (12-18 month window)
**Don't use if**: Starting new projects (use openai-responses skill instead)

**Migration**: See references/migration-to-responses.md

---

## Quick Start

npm install openai@6.16.0


import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// 1. Create assistant
const assistant = await openai.beta.assistants.create({
  name: "Math Tutor",
  instructions: "You are a math tutor. Use code interpreter for calculations.",
  tools: [{ type: "code_interpreter" }],
  model: "gpt-5",
});

// 2. Create thread
const thread = await openai.beta.threads.create();

// 3. Add message
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Solve: 3x + 11 = 14",
});

// 4. Run assistant
const run = await openai.beta.threads.runs.create(thread.id, {
  assistant_id: assistant.id,
});

// 5. Poll while the run is still queued or in progress
let status = await openai.beta.threads.runs.retrieve(thread.id, run.id);
while (['queued', 'in_progress'].includes(status.status)) {
  await new Promise(r => setTimeout(r, 1000));
  status = await openai.beta.threads.runs.retrieve(thread.id, run.id);
}
// Stop on failed, expired, cancelled, etc. instead of looping forever
if (status.status !== 'completed') {
  throw new Error(`Run ended with status: ${status.status}`);
}

// 6. Get response (the list is returned newest first)
const messages = await openai.beta.threads.messages.list(thread.id);
console.log(messages.data[0].content[0].text.value);
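
Step 6 above reads `content[0]` directly, which assumes the first content part is text. A more defensive sketch follows, assuming only the message shape used in that snippet. The `MessageLike` type and the `extractText` helper are illustrative names, not SDK exports.

```typescript
// Illustrative shapes only; the real SDK message types carry more fields.
type ContentPart =
  | { type: "text"; text: { value: string } }
  | { type: "image_file"; image_file: { file_id: string } };

interface MessageLike {
  role: "user" | "assistant";
  content: ContentPart[];
}

// Concatenate every text part of a message and skip non-text parts
// such as images generated by the code interpreter.
function extractText(message: MessageLike): string {
  return message.content
    .filter(
      (part): part is Extract<ContentPart, { type: "text" }> =>
        part.type === "text",
    )
    .map((part) => part.text.value)
    .join("\n");
}
```

With this helper, a reply whose first part is an image still yields its text instead of throwing on an undefined `text` property.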


---

## Core Concepts

**Four Main Objects:**

1. **Assistants**: Configured AI with instructions (max 256k chars in v2, was 32k in v1), model, tools, metadata
2. **Threads**: Conversation containers with persistent message history (max 100k messages)
3. **Messages**: User/assistant messages with optional file attachments
4. **Runs**: Async execution with states (queued, in_progress, requires_action, cancelling, cancelled, completed, incomplete, failed, expired)

---

## Key API Patterns

### Assistants

const assistant = await openai.beta.assistants.create({
  model: "gpt-5",
  instructions: "System prompt (max 256k chars in v2)",
  tools: [{ type: "code_interpreter" }, { type: "file_search" }],
  tool_resources: { file_search: { vector_store_ids: ["vs_123"] } },
});


**Key Limits**: 256k instruction chars (v2), 128 tools max, 16 metadata pairs
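
The limits above can be checked locally before calling the API. This is a minimal sketch, assuming "256k" means 256,000 characters. `AssistantDraft` and `validateAssistantDraft` are hypothetical names introduced here for illustration.

```typescript
// Limits as quoted in this document for Assistants API v2.
// Assumption: "256k" is taken to mean 256,000 characters.
const LIMITS = { instructionChars: 256000, tools: 128, metadataPairs: 16 };

// Hypothetical subset of the assistant creation parameters.
interface AssistantDraft {
  instructions?: string;
  tools?: unknown[];
  metadata?: Record<string, string>;
}

// Return a list of human-readable problems; an empty list means the
// draft is within the documented limits.
function validateAssistantDraft(draft: AssistantDraft): string[] {
  const problems: string[] = [];

  // String.length counts UTF-16 code units, so this is an approximation
  // of however the API counts characters.
  const chars = draft.instructions?.length ?? 0;
  if (chars > LIMITS.instructionChars) {
    problems.push(`instructions: ${chars} chars exceeds ${LIMITS.instructionChars}`);
  }

  const toolCount = draft.tools?.length ?? 0;
  if (toolCount > LIMITS.tools) {
    problems.push(`tools: ${toolCount} exceeds ${LIMITS.tools}`);
  }

  const pairs = Object.keys(draft.metadata ?? {}).length;
  if (pairs > LIMITS.metadataPairs) {
    problems.push(`metadata: ${pairs} pairs exceeds ${LIMITS.metadataPairs}`);
  }

  return problems;
}
```

Returning a list rather than throwing on the first violation lets a caller report every problem at once.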

### Threads & Messages

// Create thread with messages
const thread = await openai.beta.threads.create({
  messages: [{ role: "user", content: "Hello" }],
});

// Add message with attachments
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Analyze this",
  attachments: [{ file_id: "file_123", tools: [{ type: "code_interpreter" }] }],
});

// List messages
const msgs = await openai.beta.threads.messages.list(thread.id);


**Key Limits**: 100k messages per thread

---

### Runs

// Create run with optional overrides
const run = await openai.beta.threads.runs.create(thread.id, {
  assistant_id: "asst_123",
  additional_messages: [{ role: "user", content: "Question" }],
  max_prompt_tokens: 1000,
  max_completion_tokens: 500,
});

// Poll until the run leaves the queued/in_progress states
let status = await openai.beta.threads.runs.retrieve(thread.id, run.id);
while (['queued', 'in_progress'].includes(status.status)) {
  await new Promise(r => setTimeout(r, 1000));
  status = await openai.beta.threads.runs.retrieve(thread.id, run.id);
}


**Run States**: queued → in_progress → requires_action (function calling) / completed / failed / cancelled / expired (10 min max)
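
The polling loops in this document repeat the same pattern, so it can be factored into one helper. The sketch below assumes the run states listed above and takes the retrieval call as a callback, so it makes no assumption about the SDK signature. `pollRun` and `RunStatus` are illustrative names, not SDK exports.

```typescript
type RunStatus =
  | "queued" | "in_progress" | "requires_action" | "cancelling"
  | "cancelled" | "completed" | "incomplete" | "failed" | "expired";

// States in which the run is still moving and another poll is worthwhile.
const WAITING = new Set<RunStatus>(["queued", "in_progress", "cancelling"]);

// Poll until the run reaches a state the caller has to handle
// (completed, requires_action, failed, ...) or the local timeout passes.
async function pollRun<T extends { status: RunStatus }>(
  retrieve: () => Promise<T>,
  { intervalMs = 1000, timeoutMs = 10 * 60 * 1000 } = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const run = await retrieve();
    if (!WAITING.has(run.status)) return run;
    if (Date.now() >= deadline) {
      throw new Error(`Timed out waiting for run (last status: ${run.status})`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A caller would pass the same retrieval call used elsewhere in this document, for example `pollRun(() => openai.beta.threads.runs.retrieve(thread.id, run.id))`, and then branch on the returned `status`. The default timeout mirrors the 10-minute expiry mentioned above, which is a design choice of this sketch rather than an API requirement.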

---

### Streaming

const stream = await openai.beta.threads.runs.stream(thread.id, { assistant_id });

for await (const event of stream) {
  if (event.event === 'thread.message.delta') {
    process.stdout.write(event.data.delta.content?.[0]?.text?.value || '');
  }
}


**Key Events**: thread.run.created, thread.message.delta (streaming content), thread.run.step.delta (tool progress), thread.run.completed, thread.run.requires_action (function calling)
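
One way to handle the key events listed above is a small reducer that folds each event into a running state. This is a sketch over simplified event shapes: `StreamEvent`, `StreamState`, and `reduceEvent` are illustrative names, and the real SDK events carry many more fields.

```typescript
// Simplified event shapes covering only the events named above.
interface DeltaEvent {
  event: "thread.message.delta";
  data: { delta: { content?: { text?: { value?: string } }[] } };
}
interface LifecycleEvent {
  event:
    | "thread.run.created"
    | "thread.run.step.delta"
    | "thread.run.completed"
    | "thread.run.requires_action";
  data: unknown;
}
type StreamEvent = DeltaEvent | LifecycleEvent;

interface StreamState {
  text: string;         // text accumulated from message deltas
  done: boolean;        // true once thread.run.completed is seen
  needsAction: boolean; // true once thread.run.requires_action is seen
}

const INITIAL_STATE: StreamState = { text: "", done: false, needsAction: false };

// Pure function: returns the next state without mutating the previous one.
function reduceEvent(state: StreamState, ev: StreamEvent): StreamState {
  switch (ev.event) {
    case "thread.message.delta": {
      const chunk = (ev.data.delta.content ?? [])
        .map((part) => part.text?.value ?? "")
        .join("");
      return { ...state, text: state.text + chunk };
    }
    case "thread.run.requires_action":
      return { ...state, needsAction: true };
    case "thread.run.completed":
      return { ...state, done: true };
    default:
      return state; // created and step deltas do not change this state
  }
}
```

Inside the `for await` loop, the state would be updated with `state = reduceEvent(state, event)`, after narrowing or casting the SDK event to the simplified shape. Keeping the reducer pure makes it testable without a network call.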

---

## Tools

### Code Interpreter

Runs Python code in sandbox. Generates charts, processes files (CSV, JSON, PDF, images). Max 512MB per file.

// Attach file to message
attachments: [{ file_id: "file_123", tools: [{ type: "code_interpreter" }] }]

// Access generated files
for (const content of message.content) {
  if (content.type === 'image_file') {
    const fileContent = await openai.files.content(content.image_file.file_id);
  }
}


### File Search (RAG)

Semantic search with vector stores. **10,000 files max** (v2, was 20 in v1). **Pricing**: $0.10/GB/day (1GB free).

// Create vector store (vector stores live at openai.vectorStores, as in the later examples)
const vs = await openai.vectorStores.create({ name: "Docs" });
await openai.vectorStores.files.create(vs.id, { file_id: "file_123" });

// Wait for indexing
let store = await openai.vectorStores.retrieve(vs.id);
while (store.status === 'in_progress') {
  await new Promise(r => setTimeout(r, 2000));
  store = await openai.vectorStores.retrieve(vs.id);
}

// Use in assistant
tool_resources: { file_search: { vector_store_ids: [vs.id] } }


**⚠️ Wait for status: 'completed' before using**
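
The storage pricing quoted for File Search ($0.10/GB/day with 1GB free) can be turned into a quick estimate. The sketch below assumes the free gigabyte applies once to the total stored size and that billing is linear per day. Both are assumptions of this example, and `estimateVectorStoreCost` is a hypothetical helper.

```typescript
// Figures as quoted in this document.
const PRICE_PER_GB_DAY = 0.1; // USD
const FREE_GB = 1;

// Estimate storage cost in USD for a total size held for a number of days.
// Assumption: the free allowance is subtracted once from the total size.
function estimateVectorStoreCost(totalGb: number, days: number): number {
  const billableGb = Math.max(0, totalGb - FREE_GB);
  return billableGb * PRICE_PER_GB_DAY * days;
}
```

Under these assumptions, 5 GB held for 30 days comes to about $12 (4 billable GB × $0.10 × 30), and anything at or below 1 GB costs nothing.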

### Function Calling

Submit tool outputs when run.status === 'requires_action':

if (run.status === 'requires_action') {
  const toolCalls = run.required_action.submit_tool_outputs.tool_calls;
  const outputs = toolCalls.map(tc => ({
    tool_call_id: tc.id,
    output: JSON.stringify(yourFunction(JSON.parse(tc.function.arguments))),
  }));

  run = await openai.beta.threads.runs.submitToolOutputs(thread.id, run.id, {
    tool_outputs: outputs,
  });
}
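
The snippet above calls a single placeholder function and will throw if the arguments are not valid JSON. A slightly more defensive sketch dispatches by function name and reports failures back to the model as an error payload. `ToolCall`, `Handler`, and `buildToolOutputs` are illustrative names, and returning `{ error: ... }` as the output is one possible convention, not something the API prescribes.

```typescript
// Minimal shapes matching the fields used in the snippet above.
interface ToolCall {
  id: string;
  function: { name: string; arguments: string };
}
interface ToolOutput {
  tool_call_id: string;
  output: string;
}
type Handler = (args: Record<string, unknown>) => unknown;

// Build one output per tool call. Unknown functions, malformed JSON, and
// handler exceptions become an error payload instead of crashing the loop.
function buildToolOutputs(
  toolCalls: ToolCall[],
  handlers: Record<string, Handler>,
): ToolOutput[] {
  return toolCalls.map((tc) => {
    try {
      const handler = handlers[tc.function.name];
      if (!handler) throw new Error(`Unknown function: ${tc.function.name}`);
      const args = JSON.parse(tc.function.arguments || "{}");
      // JSON.stringify(undefined) is not a string, so fall back to null.
      return { tool_call_id: tc.id, output: JSON.stringify(handler(args) ?? null) };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { tool_call_id: tc.id, output: JSON.stringify({ error: message }) };
    }
  });
}
```

The returned array has the same shape as the `tool_outputs` value passed to `submitToolOutputs` above, so every tool call gets an answer and the run is not left waiting until it expires.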


## File Formats

**Code Interpreter**: .c, .cpp, .csv, .docx, .html, .java, .json, .md, .pdf, .php, .pptx, .py, .rb, .tex, .txt, .css, .jpeg, .jpg, .js, .gif, .png, .tar, .ts, .xlsx, .xml, .zip (512MB max)

**File Search**: .c, .cpp, .docx, .html, .java, .json, .md, .pdf, .php, .pptx, .py, .rb, .tex, .txt, .css, .js, .ts, .go (512MB max)
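
Because the two tools accept different extensions, a pre-upload check can catch mismatches early. This sketch copies the two lists above verbatim. `isSupportedFile` is a hypothetical helper, and it checks only the file name, not the file contents or the 512MB size limit.

```typescript
type Tool = "code_interpreter" | "file_search";

// Extension lists copied from the File Formats section of this document.
const SUPPORTED: Record<Tool, Set<string>> = {
  code_interpreter: new Set([
    ".c", ".cpp", ".csv", ".docx", ".html", ".java", ".json", ".md", ".pdf",
    ".php", ".pptx", ".py", ".rb", ".tex", ".txt", ".css", ".jpeg", ".jpg",
    ".js", ".gif", ".png", ".tar", ".ts", ".xlsx", ".xml", ".zip",
  ]),
  file_search: new Set([
    ".c", ".cpp", ".docx", ".html", ".java", ".json", ".md", ".pdf", ".php",
    ".pptx", ".py", ".rb", ".tex", ".txt", ".css", ".js", ".ts", ".go",
  ]),
};

// True when the file name has an extension on the list for the given tool.
function isSupportedFile(filename: string, tool: Tool): boolean {
  const dot = filename.lastIndexOf(".");
  if (dot <= 0) return false; // no extension, or a dotfile such as ".env"
  const ext = filename.slice(dot).toLowerCase();
  return SUPPORTED[tool].has(ext);
}
```

For example, by these lists a `.csv` file is accepted for the code interpreter but not for file search, while a `.go` file is the reverse.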

---

## Known Issues

**1. Thread Already Has Active Run**
Error: 400 Can't add messages to thread_xxx while a run run_xxx is active.

**Fix**: Cancel active run first: await openai.beta.threads.runs.cancel(threadId, runId)

**2. Run Polling Timeout / Incomplete Status**
Error: OpenAIError: Final run has not been received

**Why It Happens**: Long-running tasks may exceed polling windows or finish with incomplete status
**Prevention**: Handle incomplete runs gracefully
let runId; // captured from the stream so the run can be inspected after an error
try {
  const stream = await openai.beta.threads.runs.stream(thread.id, { assistant_id });
  for await (const event of stream) {
    if (event.event === 'thread.run.created') {
      runId = event.data.id;
    }
    if (event.event === 'thread.message.delta') {
      process.stdout.write(event.data.delta.content?.[0]?.text?.value || '');
    }
  }
} catch (error) {
  if (runId && error.message?.includes('Final run has not been received')) {
    // Run ended with 'incomplete' status - thread can continue
    const run = await openai.beta.threads.runs.retrieve(thread.id, runId);
    if (run.status === 'incomplete') {
      // Handle: prompt user to continue, reduce max_completion_tokens, etc.
    }
  } else {
    throw error; // do not swallow unrelated errors
  }
}

**Source**: [GitHub Issues #945](https://github.com/openai/openai-node/issues/945), [#1306](https://github.com/openai/openai-node/issues/1306), [#1439](https://github.com/openai/openai-node/issues/1439)

**3. Vector Store Not Ready**
Using vector store before indexing completes.
**Fix**: Poll vectorStores.retrieve() until status === 'completed' (see File Search section)

**4. File Upload Format Issues**
Unsupported file formats cause silent failures.
**Fix**: Validate file extensions before upload (see File Formats section)

**5. Vector Store Upload Documentation Incorrect**
Error: No 'files' provided to process

**Why It Happens**: Official documentation shows incorrect usage of uploadAndPoll
**Prevention**: Wrap file streams in { files: [...] } object
// ✅ Correct
await openai.vectorStores.fileBatches.uploadAndPoll(vectorStoreId, {
  files: fileStreams
});

// ❌ Wrong (shown in official docs)
await openai.vectorStores.fileBatches.uploadAndPoll(vectorStoreId, fileStreams);

**Source**: [GitHub Issue #1337](https://github.com/openai/openai-node/issues/1337)

**6. Reasoning Models Reject Temperature Parameter**
Error: Unsupported parameter: 'temperature' is not supported with this model

**Why It Happens**: When updating assistant to o3-mini/o1-preview/o1-mini, old temperature settings persist
**Prevention**: Explicitly set temperature to null
await openai.beta.assistants.update(assistantId, {
  model: 'o3-mini',
  reasoning_effort: 'medium',
  temperature: null, // ✅ Must explicitly clear
  top_p: null
});

**Source**: [GitHub Issue #1318](https://github.com/openai/openai-node/issues/1318)

**7. uploadAndPoll Returns Vector Store ID Instead of Batch ID**
Error: Invalid 'batch_id': 'vs_...'. Expected an ID that begins with 'vsfb_'.

**Why It Happens**: uploadAndPoll returns vector store object instead of batch object
**Prevention**: Use alternative methods to get batch ID
// Option 1: Use createAndPoll after separate upload
const batch = await openai.vectorStores.fileBatches.createAndPoll(
  vectorStoreId,
  { file_ids: uploadedFileIds }
);

// Option 2: List batches to find correct ID
const batches = await openai.vectorStores.fileBatches.list(vectorStoreId);
const batchId = batches.data[0].id; // starts with 'vsfb_'

**Source**: [GitHub Issue #1700](https://github.com/openai/openai-node/issues/1700)

**8. Vector Store File Delete Affects All Stores**
**Warning**: Deleting a file from one vector store removes it from ALL vector stores
// ❌ This deletes file from VS_A, VS_B, AND VS_C
await openai.vectorStores.files.delete('VS_A', 'file-xxx');

**Why It Happens**: SDK or API bug - delete operation has global effect
**Prevention**: Avoid sharing files across multiple vector stores if selective deletion is needed
**Source**: [GitHub Issue #1710](https://github.com/openai/openai-node/issues/1710)
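
Before deleting a file, it can help to know whether other vector stores reference it. The sketch below works on a plain map from vector store ID to file IDs, which you would assemble yourself. `StoreFiles` and `findSharedFiles` are hypothetical names, and no SDK call is made here.

```typescript
// Hypothetical input: vector store ID -> IDs of the files attached to it.
type StoreFiles = Record<string, string[]>;

// Return every file ID that appears in more than one store, together with
// the stores that reference it.
function findSharedFiles(stores: StoreFiles): Map<string, string[]> {
  const owners = new Map<string, string[]>();
  Object.entries(stores).forEach(([storeId, fileIds]) => {
    // De-duplicate within a store so a repeated ID is not counted twice.
    Array.from(new Set(fileIds)).forEach((fileId) => {
      owners.set(fileId, [...(owners.get(fileId) ?? []), storeId]);
    });
  });

  const shared = new Map<string, string[]>();
  owners.forEach((storeIds, fileId) => {
    if (storeIds.length > 1) shared.set(fileId, storeIds);
  });
  return shared;
}
```

If the file you are about to delete appears in the result, the warning above applies: removing it from one store may remove it from the others as well.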

**9. Memory Leak in Large File Uploads (Community-sourced)**
**Source**: [GitHub Issue #1052](https://github.com/openai/openai-node/issues/1052) | **Status**: OPEN
**Impact**: ~44MB leaked per 22MB file upload in long-running servers
**Why It Happens**: When uploading large files from streams (S3, etc.) using vectorStores.fileBatches.uploadAndPoll, memory may not be released after upload completes
**Verified**: Maintainer acknowledged, reduced in v4.58.1 but not eliminated
**Workaround**: Monitor memory usage in long-lived servers; restart periodically or use separate worker processes

**10. Thread Already Has Active Run - Race Condition (Community-sourced)**
**Enhancement to Issue #1**: When canceling an active run, race conditions may occur if the run completes before cancellation
async function createRunSafely(threadId: string, assistantId: string) {
  // Check the most recent run for an active state first
  const runs = await openai.beta.threads.runs.list(threadId, { limit: 1 });
  const activeRun = runs.data.find(r =>
    ['queued', 'in_progress', 'requires_action'].includes(r.status)
  );

  if (activeRun) {
    try {
      await openai.beta.threads.runs.cancel(threadId, activeRun.id);

      // Wait for cancellation to complete
      let run = await openai.beta.threads.runs.retrieve(threadId, activeRun.id);
      while (run.status === 'cancelling') {
        await new Promise(r => setTimeout(r, 500));
        run = await openai.beta.threads.runs.retrieve(threadId, activeRun.id);
      }
    } catch (error) {
      // Ignore "already completed" errors - run finished naturally
      const finished = error instanceof Error && error.message.includes('completed');
      if (!finished) throw error;
    }
  }

  return openai.beta.threads.runs.create(threadId, { assistant_id: assistantId });
}

**Source**: [OpenAI Community Forum](https://community.openai.com/t/error-running-thread-already-has-an-active-run/782118)

See references/top-errors.md for complete catalog.

## Relationship to Other Skills

**openai-api** (Chat Completions): Stateless, manual history, direct responses. Use for simple generation.

**openai-responses** (Responses API): ✅ **Recommended for new projects**. Better reasoning, modern MCP integration, active development.

**openai-assistants**: ⚠️ **Deprecated, sunset August 26, 2026**. Use for legacy apps only. Migration: references/migration-to-responses.md

---

## v1 to v2 Migration

**v1 deprecated**: Dec 18, 2024

**Key Changes**: retrieval → file_search, vector stores (10k files vs 20), 256k instructions (vs 32k), message-level file attachments

See references/migration-from-v1.md

---

**Templates**: templates/basic-assistant.ts, code-interpreter-assistant.ts, file-search-assistant.ts, function-calling-assistant.ts, streaming-assistant.ts

**References**: references/top-errors.md, thread-lifecycle.md, vector-stores.md, migration-to-responses.md, migration-from-v1.md

**Related Skills**: openai-responses (recommended), openai-api

---

**Last Updated**: 2026-01-21
**Package**: openai@6.16.0
**Status**: Production Ready (⚠️ Deprecated - Sunset August 26, 2026)
**Changes**: Added 6 new known issues (vector store upload bugs, o3-mini temperature, memory leak), enhanced streaming error handling


---

---
paths: "**/*assistant*.ts", "**/*.ts"
---

# OpenAI Assistants API - DEPRECATED

**⚠️ SUNSET August 26, 2026** - Use openai-responses skill for new projects.

## Migration Required

/* ❌ Assistants API (deprecated) */
const assistant = await openai.beta.assistants.create({...})
const thread = await openai.beta.threads.create()
const run = await openai.beta.threads.runs.create(thread.id, {...})

/* ✅ Use Responses API instead */
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Hello',
  conversation: existingConversationId, // Optional: for stateful
})


## If Maintaining Existing Code

### Thread Already Has Active Run

/* ❌ Will fail if run already active */
await openai.beta.threads.runs.create(threadId, { assistant_id })

/* ✅ Cancel existing run first */
const runs = await openai.beta.threads.runs.list(threadId)
const activeRun = runs.data.find(r =>
  ['queued', 'in_progress', 'requires_action'].includes(r.status)
)
if (activeRun) {
  await openai.beta.threads.runs.cancel(threadId, activeRun.id)
}
await openai.beta.threads.runs.create(threadId, { assistant_id })


### Vector Store Must Be Ready

/* ❌ Using vector store before ready */
const vectorStore = await openai.vectorStores.create({...})
// Immediately using...

/* ✅ Poll while indexing is in progress, then confirm it completed */
let vs = await openai.vectorStores.create({...})
while (vs.status === 'in_progress') {
  await new Promise(r => setTimeout(r, 1000))
  vs = await openai.vectorStores.retrieve(vs.id)
}
if (vs.status !== 'completed') {
  throw new Error(`Vector store ended with status: ${vs.status}`)
}


## v1 → v2 Changes

| v1 | v2 |
|----|-----|
| Max 20 files for file search | Max 10,000 files |
| 32k char instructions | 256k char instructions |
| retrieval tool | file_search tool |

## Quick Fixes

| If Claude suggests... | Use instead... |
|----------------------|----------------|
| New Assistants project | Responses API (openai-responses skill) |
| retrieval tool | file_search tool (v2) |
| No run cancellation | Cancel active runs before creating new |
| Immediate vector store use | Poll until status: 'completed' |

How to Use This Skill Unit

Option A: Project-Specific (Recommended)

  1. Click "Download" above
  2. In your project, create the directory: .agent/skills/openai-assistants/
  3. Save the file as SKILL.md
  4. The agent will automatically discover the skill based on its description.

Option B: Global Installation (All Agents)

Save the file to these locations to make it available across all projects:

  • Claude Code: ~/.claude/skills/jezweb/claude-skills/openai-assistants/SKILL.md
  • Cursor: ~/.cursor/skills/jezweb/claude-skills/openai-assistants/SKILL.md
  • Antigravity: ~/.gemini/antigravity/skills/jezweb/claude-skills/openai-assistants/SKILL.md



Source & attribution

This skill is categorized under AI Tools & Agents and is published by JezWeb, maintained in jezweb/claude-skills.
