chunking-strategy

Name: chunking-strategy
Author: Giuseppe Trisciuoglio

Tags: RAG, chunking, vector database, LLM, AI, document processing, embeddings, retrieval

⭐ 282 stars · 📄 MIT · 🕒 Updated 2026-06-15

Install this skill

npx skills add giuseppe-trisciuoglio/developer-kit

Works across Claude Code, Cursor, Codex, Copilot & Antigravity

The chunking-strategy skill manages how documents are fragmented before ingestion into vector databases. Efficient RAG systems depend on how well these segments maintain topical integrity and context. This skill moves beyond naive splitting by offering five levels of document segmentation ranging from simple fixed-length character buffers to sophisticated semantic boundary detection. It provides methods to handle codebases, technical documentation, and unstructured data, ensuring the retrieved context remains relevant to user queries. By aligning segmentation logic with the specific nature of your data—such as preserving function boundaries in source code or identifying thematic shifts in prose—it improves both retrieval precision and the overall relevance of generated model outputs. The approach emphasizes iterative testing and metrics-based tuning to balance chunk size against model context limits.

When to Use This Skill

•Segmenting legal contracts where individual clauses must remain intact
•Indexing large code repositories while keeping function and class structures together
•Optimizing RAG retrieval for thematic analysis of long-form reports
•Processing diverse datasets with mixed formatting like Markdown and HTML

How to Invoke This Skill

Example prompts that trigger this skill in Claude Code, Cursor, or Antigravity:

“How should I split these documents for my RAG system?”
“My vector search results are too vague, how do I improve chunking?”
“What is the best way to handle code files in a vector database?”
“Show me how to implement semantic chunking on my dataset.”
“How do I determine the right chunk size and overlap for my project?”

Pro Tips

💡Always test different chunk sizes and overlaps on a representative dataset to find the optimal balance between context preservation and retrieval efficiency for your specific use case.
💡Consider the downstream task: for factoid Q&A, smaller chunks might be better, while for summarization or analytical queries, larger, more contextual chunks are often preferred.
💡Combine structural chunking with metadata enrichment to provide more robust retrieval cues, allowing your RAG system to not just find relevant text but also understand its context within the original document.
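
The metadata tip can be sketched as follows. This is a minimal illustration, not part of the skill itself: the `Chunk` record, the `enrich` helper, and the metadata field names are all hypothetical, chosen only to show how structural context can travel with each chunk.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical chunk record: text plus metadata used as retrieval cues."""
    text: str
    metadata: dict = field(default_factory=dict)


def enrich(chunk_texts, source, heading):
    """Attach the source, section heading, and position to each chunk text."""
    total = len(chunk_texts)
    return [
        Chunk(text=text, metadata={"source": source, "heading": heading,
                                   "index": i, "total": total})
        for i, text in enumerate(chunk_texts)
    ]


records = enrich(["Clause 1 text.", "Clause 2 text."],
                 source="contract.md", heading="Termination")
print(records[1].metadata)
```

A vector store that supports metadata filtering could then restrict a search to one source or section before ranking by similarity.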

What this skill does

•Applies recursive character splitting based on structural delimiters
•Preserves semantic integrity using embedding-based boundary detection
•Implements structure-aware segmentation for code blocks and tables
•Configures custom overlap ratios to prevent loss of information at fragment edges
•Supports Late Chunking strategies for high-context embedding models

When not to use it

✕When documents are short enough to fit entirely within the embedding model context window
✕When raw text retrieval or exact keyword matching is prioritized over semantic search

Example workflow

1. Analyze the document source type to determine structural hierarchy
2. Choose a primary strategy level ranging from fixed-size to semantic splitting
3. Define parameters including target token size and overlap percentage
4. Run a test batch to generate embeddings for a sample set of chunks
5. Evaluate retrieval precision using a gold-standard set of queries
6. Adjust thresholds or chunk sizes based on performance metrics

Prerequisites

–Cleaned source documents
–Access to an embedding model
–Defined vector store architecture

Pitfalls & limitations

!Excessive chunk overlap can lead to redundant results and higher storage costs
!Semantic chunking requires extra computation time for initial embedding generation
!Small fixed-size chunks often cause loss of necessary surrounding context

FAQ

Why is overlap necessary between chunks?

Overlap ensures that information split across a boundary is preserved in both adjacent chunks, preventing critical context from being orphaned.
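
A small sketch of the effect, assuming whitespace-separated words stand in for tokens and a hypothetical `sliding_chunks` helper. Without overlap, the date "31 March" is cut in half at the chunk boundary. With a two-token overlap, it survives intact in one chunk.

```python
def sliding_chunks(tokens, size, overlap):
    """Fixed-size windows; each window repeats the last `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the text
    return chunks


def contains_phrase(chunks, phrase):
    """True if the phrase appears whole inside at least one chunk."""
    return any(phrase in " ".join(chunk) for chunk in chunks)


tokens = "the contract ends on 31 March unless renewed in writing".split()
no_overlap = sliding_chunks(tokens, size=5, overlap=0)
with_overlap = sliding_chunks(tokens, size=5, overlap=2)

print(contains_phrase(no_overlap, "31 March"))
print(contains_phrase(with_overlap, "31 March"))
```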

Which strategy should I start with?

Start with recursive character chunking if your documents have clear structural markers like paragraphs, or use fixed-size chunking for unstructured data.
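
For intuition, recursive character chunking can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation used by any particular library: the `recursive_split` name, the separator order, and the character-based limit are all choices made for the example.

```python
def recursive_split(text, max_chars, separators=("\n\n", ". ", " ")):
    """Split on the coarsest separator first; recurse into oversized pieces."""
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: fall back to a hard character cut.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = f"{current}{sep}{piece}" if current else piece
        if len(candidate) <= max_chars:
            current = candidate  # keep merging small neighbouring pieces
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= max_chars:
            current = piece
        else:
            # Piece is still too large: retry with the finer separators.
            chunks.extend(recursive_split(piece, max_chars, finer))
    if current.strip():
        chunks.append(current)
    return chunks


text = ("Intro paragraph.\n\n"
        "First sentence here. Second sentence here. Third one.")
chunks = recursive_split(text, max_chars=30)
print(chunks)
```

Note that this sketch drops the separator at each cut point, so a sentence that ends a chunk loses its trailing period. A production splitter would usually keep it.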

How does semantic chunking differ from fixed-size?

Fixed-size splits based on character or token count regardless of meaning, whereas semantic chunking splits when the thematic content of the text significantly changes.

How it compares

Generic prompts often result in uniform splitting that destroys meaning, whereas this strategy uses structural and semantic awareness to ensure chunks act as logical, independent units.


📄 Full skill instructions — original source: giuseppe-trisciuoglio/developer-kit

# Chunking Strategy for RAG Systems

## Overview

Implement optimal chunking strategies for Retrieval-Augmented Generation (RAG) systems and document processing pipelines. This skill provides a comprehensive framework for breaking large documents into smaller, semantically meaningful segments that preserve context while enabling efficient retrieval and search.

## When to Use

Use this skill when building RAG systems, optimizing vector search performance, implementing document processing pipelines, handling multi-modal content, or performance-tuning existing RAG systems with poor retrieval quality.

## Instructions

### Choose Chunking Strategy

Select an appropriate chunking strategy based on document type and use case:

1. **Fixed-Size Chunking** (Level 1)
- Use for simple documents without clear structure
- Start with 512 tokens and 10-20% overlap
- Adjust size based on query type: 256 for factoid, 1024 for analytical

2. **Recursive Character Chunking** (Level 2)
- Use for documents with clear structural boundaries
- Implement hierarchical separators: paragraphs → sentences → words
- Customize separators for document types (HTML, Markdown)

3. **Structure-Aware Chunking** (Level 3)
- Use for structured documents (Markdown, code, tables, PDFs)
- Preserve semantic units: functions, sections, table blocks
- Validate structure preservation post-splitting

4. **Semantic Chunking** (Level 4)
- Use for complex documents with thematic shifts
- Implement embedding-based boundary detection
- Configure similarity threshold (0.8) and buffer size (3-5 sentences)

5. **Advanced Methods** (Level 5)
- Use Late Chunking for long-context embedding models
- Apply Contextual Retrieval for high-precision requirements
- Monitor computational costs vs. retrieval improvements

Reference detailed strategy implementations in [references/strategies.md](references/strategies.md).

### Implement Chunking Pipeline

Follow these steps to implement effective chunking:

1. **Pre-process documents**
- Analyze document structure and content types
- Identify multi-modal content (tables, images, code)
- Assess information density and complexity

2. **Select strategy parameters**
- Choose chunk size based on embedding model context window
- Set overlap percentage (10-20% for most cases)
- Configure strategy-specific parameters

3. **Process and validate**
- Apply chosen chunking strategy
- Validate semantic coherence of chunks
- Test with representative documents

4. **Evaluate and iterate**
- Measure retrieval precision and recall
- Monitor processing latency and resource usage
- Optimize based on specific use case requirements

Reference detailed implementation guidelines in [references/implementation.md](references/implementation.md).
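
Step 2 can be sketched as a small helper. The starting sizes (256, 512, 1024 tokens) and the 10-20% overlap range come from the guidance above. The `ChunkParams` type, the `select_params` function, and the 0.15 default are hypothetical names and values used only for illustration.

```python
from dataclasses import dataclass

# Starting sizes from the guidance above, in tokens per chunk.
SIZE_BY_QUERY_TYPE = {"factoid": 256, "default": 512, "analytical": 1024}


@dataclass(frozen=True)
class ChunkParams:
    chunk_size: int     # tokens per chunk
    chunk_overlap: int  # tokens shared between neighbouring chunks


def select_params(query_type, model_max_tokens, overlap_ratio=0.15):
    """Pick a starting size, cap it at the model limit, derive the overlap."""
    if not 0.0 <= overlap_ratio < 1.0:
        raise ValueError("overlap_ratio must be in [0, 1)")
    size = SIZE_BY_QUERY_TYPE.get(query_type, SIZE_BY_QUERY_TYPE["default"])
    size = min(size, model_max_tokens)  # never exceed the embedding window
    return ChunkParams(chunk_size=size, chunk_overlap=int(size * overlap_ratio))


print(select_params("factoid", model_max_tokens=8192))
print(select_params("analytical", model_max_tokens=512))
```

These values are only a starting point. The evaluation step is what should drive the final numbers.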

### Evaluate Performance

Use these metrics to evaluate chunking effectiveness:

- **Retrieval Precision**: Fraction of retrieved chunks that are relevant
- **Retrieval Recall**: Fraction of relevant chunks that are retrieved
- **End-to-End Accuracy**: Quality of final RAG responses
- **Processing Time**: Latency impact on overall system
- **Resource Usage**: Memory and computational costs

Reference detailed evaluation framework in [references/evaluation.md](references/evaluation.md).
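
The first two metrics can be computed per query from chunk identifiers. The sketch below assumes a gold-standard set of relevant chunk IDs for each query is available. The function name and the sample IDs are illustrative.

```python
def precision_recall(retrieved_ids, relevant_ids):
    """Return (precision, recall) for one query's retrieved chunk IDs."""
    retrieved, relevant = set(retrieved_ids), set(relevant_ids)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Four chunks retrieved; two of the three relevant chunks are among them.
precision, recall = precision_recall(
    retrieved_ids=["c1", "c2", "c3", "c4"],
    relevant_ids=["c2", "c4", "c9"],
)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Averaging both values over the full query set gives a single pair of numbers to compare across chunking configurations.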

## Examples

### Basic Fixed-Size Chunking

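# Requires the langchain-text-splitters package
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Configure for factoid queries.
# length_function=len counts characters, so both sizes below are in
# characters, not tokens.
splitter = RecursiveCharacterTextSplitter(
    chunk_size=256,
    chunk_overlap=25,
    length_function=len,
)

# `documents` is a list of LangChain Document objects loaded elsewhere
chunks = splitter.split_documents(documents)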

### Structure-Aware Code Chunking

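import ast


def chunk_python_code(code):
    """Split Python source into one chunk per top-level function or class."""
    tree = ast.parse(code)
    chunks = []

    # Iterate over top-level statements only. ast.walk would also visit
    # methods and nested functions, so they would be emitted twice: once
    # inside the enclosing class or function and once on their own.
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # ast.get_source_segment requires Python 3.8+
            chunks.append(ast.get_source_segment(code, node))

    return chunks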
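
The same structure-aware idea applies to Markdown, where sections rather than functions are the units to keep together. The following is a minimal sketch, assuming ATX-style `#` headings. The `chunk_markdown_sections` name is hypothetical, and real documents may need a full Markdown parser.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")
FENCE = "`" * 3  # code-fence marker, built this way to keep the example tidy


def chunk_markdown_sections(markdown):
    """Return (heading, body) pairs, one per Markdown section."""
    sections, heading, body = [], None, []
    in_fence = False

    def flush():
        # Emit the section collected so far, skipping an empty preamble.
        if heading is not None or any(line.strip() for line in body):
            sections.append((heading, "\n".join(body).strip()))

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # '#' inside a code fence is not a heading
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            heading, body = match.group(2).strip(), []
        else:
            body.append(line)
    flush()
    return sections


doc = "# Title\nIntro.\n## Setup\nRun it.\n## Usage\nCall it."
sections = chunk_markdown_sections(doc)
print(sections)
```

Each pair keeps the heading alongside its body, so the heading can be stored as metadata or prepended to the chunk text before embedding.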

### Semantic Chunking with Embeddings

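def semantic_chunk(text, similarity_threshold=0.8):
    """Chunk text at points where adjacent sentences diverge in meaning.

    split_into_sentences, generate_embeddings, and cosine_similarity are
    placeholders to be supplied by your own NLP and embedding stack.
    """
    sentences = split_into_sentences(text)
    if not sentences:
        return []  # nothing to chunk; avoids an IndexError on sentences[0]
    embeddings = generate_embeddings(sentences)

    chunks = []
    current_chunk = [sentences[0]]

    for i in range(1, len(sentences)):
        # Compare each sentence with its immediate predecessor
        similarity = cosine_similarity(embeddings[i - 1], embeddings[i])

        if similarity < similarity_threshold:
            # Similarity dropped below the threshold: close the current chunk
            chunks.append(" ".join(current_chunk))
            current_chunk = [sentences[i]]
        else:
            current_chunk.append(sentences[i])

    # Flush the final chunk
    chunks.append(" ".join(current_chunk))
    return chunks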
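
The example above calls three helpers it does not define. One possible set of stand-ins is sketched below so the function can be run end to end. All three are assumptions for illustration: the sentence splitter is a naive regex, and the "embedding" is a bag-of-words count vector standing in for a real embedding model.

```python
import math
import re
from collections import Counter


def split_into_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def generate_embeddings(sentences):
    """Toy stand-in for an embedding model: bag-of-words term counts."""
    return [Counter(re.findall(r"[a-z0-9]+", s.lower())) for s in sentences]


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


sentences = split_into_sentences("Cats purr. Cats sleep a lot! Stocks fell today.")
vectors = generate_embeddings(sentences)
related = cosine_similarity(vectors[0], vectors[1])    # shares the word "cats"
unrelated = cosine_similarity(vectors[1], vectors[2])  # no shared words
print(sentences, related, unrelated)
```

With count-based vectors like these, similarity scores tend to run much lower than with dense embeddings, so a threshold well below 0.8 is likely needed. The threshold should be tuned for whichever embedding model is actually used.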

## Best Practices

### Core Principles
- Balance context preservation with retrieval precision
- Maintain semantic coherence within chunks
- Optimize for embedding model constraints
- Preserve document structure when beneficial

### Implementation Guidelines
- Start simple with fixed-size chunking (512 tokens, 10-20% overlap)
- Test thoroughly with representative documents
- Monitor both accuracy metrics and computational costs
- Iterate based on specific document characteristics

### Common Pitfalls to Avoid
- Over-chunking: Creating too many small, context-poor chunks
- Under-chunking: Creating oversized chunks that bury relevant information in unrelated text
- Ignoring document structure and semantic boundaries
- Using one-size-fits-all approach for diverse content types
- Neglecting overlap for boundary-crossing information

## Constraints

### Resource Considerations
- Semantic and contextual methods require significant computational resources
- Late chunking needs long-context embedding models
- Complex strategies increase processing latency
- Monitor memory usage for large document processing
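
The reason late chunking depends on a long-context model is that, as the technique is commonly described, the whole document is embedded at token level first and chunk vectors are then pooled from those token embeddings. The sketch below shows only the pooling step. It assumes the per-token vectors and the chunk spans have already been produced elsewhere, and `late_chunk_embeddings` is a hypothetical name.

```python
def late_chunk_embeddings(token_embeddings, spans):
    """Mean-pool token vectors over each (start, end) token span."""
    pooled = []
    for start, end in spans:
        window = token_embeddings[start:end]
        if not window:
            raise ValueError(f"empty span: ({start}, {end})")
        dim = len(window[0])
        pooled.append([sum(vec[d] for vec in window) / len(window)
                       for d in range(dim)])
    return pooled


# Hypothetical 2-dimensional token embeddings for a 4-token document.
token_embeddings = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 2.0]]
chunk_vectors = late_chunk_embeddings(token_embeddings, [(0, 2), (2, 4)])
print(chunk_vectors)
```

Because every token vector was computed with the full document in view, each pooled chunk vector carries context from outside its own span, which is the benefit weighed against the extra compute.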

### Quality Requirements
- Validate chunk semantic coherence post-processing
- Test with domain-specific documents before deployment
- Ensure chunks maintain standalone meaning where possible
- Implement proper error handling for edge cases

## References

Reference detailed documentation in the [references/](references/) folder:
- [strategies.md](references/strategies.md) - Detailed strategy implementations
- [implementation.md](references/implementation.md) - Complete implementation guidelines
- [evaluation.md](references/evaluation.md) - Performance evaluation framework
- [tools.md](references/tools.md) - Recommended libraries and frameworks
- [research.md](references/research.md) - Key research papers and findings
- [advanced-strategies.md](references/advanced-strategies.md) - 11 comprehensive chunking methods
- [semantic-methods.md](references/semantic-methods.md) - Semantic and contextual approaches
- [visualization-tools.md](references/visualization-tools.md) - Evaluation and visualization tools


How to Use This Skill Unit

Option A: Project-Specific (Recommended)

1. Click "Download" above
2. In your project, create the directory: .agent/skills/chunking-strategy/
3. Save the file as SKILL.md
4. The agent will automatically discover the skill based on its description.

Option B: Global Installation (All Agents)

Save the file to these locations to make it available across all projects:

Claude Code: ~/.claude/skills/giuseppe-trisciuoglio/developer-kit/chunking-strategy/SKILL.md
Cursor: ~/.cursor/skills/giuseppe-trisciuoglio/developer-kit/chunking-strategy/SKILL.md
Antigravity: ~/.gemini/antigravity/skills/giuseppe-trisciuoglio/developer-kit/chunking-strategy/SKILL.md

