langchain4j-testing-strategies
Install this skill
npx skills add giuseppe-trisciuoglio/developer-kit
Works across Claude Code, Cursor, Codex, Copilot & Antigravity
LangChain4J testing strategies provide a structured framework for validating AI-integrated Java applications. This approach balances rapid feedback through unit tests with high-fidelity validation using Testcontainers for real-world interactions. Developers can isolate business logic by mocking chat and embedding models, ensuring that service configurations and AI pipelines function reliably without incurring API costs or latency during builds. The strategy emphasizes a testing pyramid, prioritizing fast execution while maintaining oversight of complex flows like RAG systems, tool invocation, and streaming response patterns. By configuring specific test dependencies and maintaining strict isolation between test cases, teams ensure that prompt changes, dependency updates, and internal logic refinements do not break existing LLM-powered workflows. This methodology specifically addresses the unique non-deterministic nature of AI outputs within standard Java enterprise testing environments.
When to Use This Skill
- Verifying that an AiService correctly maps a natural language query to a specific Java tool function
- Ensuring RAG retrieval logic correctly filters and embeds context from local document stores
- Validating exception handling when an LLM provider returns malformed JSON or times out
- Regression testing prompt changes to ensure output structure remains consistent for downstream processing
How to Invoke This Skill
Example prompts that trigger this skill in Claude Code, Cursor, or Antigravity:
- "how to mock chat models in LangChain4J"
- "setup Testcontainers for Ollama in Java"
- "write unit tests for LangChain4J AiServices"
- "best practices for testing RAG pipelines in Java"
- "how to verify tool execution in LangChain4J tests"
Pro Tips
- Combine mock models for rapid unit testing of business logic with Testcontainers for realistic, isolated integration testing of your full AI stack.
- Prioritize testing guardrails and critical tool execution paths, as these are often where subtle failures can lead to significant issues.
- Implement snapshot testing for LLM outputs in key flows to detect unexpected changes or regressions after model updates or prompt modifications.
What this skill does
- Isolation of LLM logic via Mockito-based model stubs
- Containerized integration testing for local model execution using Ollama or similar engines
- Validation of Retrieval-Augmented Generation (RAG) pipelines and retrieval accuracy
- Assertion testing for tool-calling and function invocation logic
- Performance measurement for streaming responses and latency thresholds
When not to use it
- Testing raw model accuracy or training data quality of a commercial foundation model
- Performance benchmarking of third-party cloud-based LLM APIs that do not support test keys
- Evaluating non-Java AI workflows that do not integrate with the LangChain4J library
Example workflow
- Configure project dependencies to include langchain4j-test and Testcontainers
- Implement unit tests using Mockito to verify basic AiService routing and logic
- Create containerized integration tests using Testcontainers for local model interactions
- Define test data scenarios including edge cases for empty inputs or malformed prompts
- Execute the test suite during CI/CD to monitor for regressions in AI pipeline behavior
Prerequisites
- LangChain4J core library installation
- Docker installed for Testcontainers usage
- JUnit 5 testing framework
- Knowledge of Mockito library
Pitfalls & limitations
- Over-relying on real LLM calls in integration tests leads to slow builds and high API expenses
- Failing to reset state in global stores between tests results in non-deterministic flakiness
- Underestimating the latency of spinning up containers, which can inflate feedback loops if not managed properly
How it compares
Unlike manual ad-hoc testing, this framework provides a formal, repeatable path that treats AI behavior as testable code, reducing the ambiguity typically associated with generative outputs.
Full skill instructions (original source: giuseppe-trisciuoglio/developer-kit)
## When to Use This Skill
Use this skill when:
- Building AI-powered applications with LangChain4J
- Writing unit tests for AI services and guardrails
- Setting up integration tests with real LLM models
- Creating mock-based tests for faster test execution
- Using Testcontainers for isolated testing environments
- Testing RAG (Retrieval-Augmented Generation) systems
- Validating tool execution and function calling
- Testing streaming responses and async operations
- Setting up end-to-end tests for AI workflows
- Implementing performance and load testing
## Instructions
To test LangChain4J applications effectively, follow these key strategies:
### 1. Start with Unit Testing
Use mock models for fast, isolated testing of business logic. See references/unit-testing.md for detailed examples.

```java
// Example: mock ChatModel for unit tests.
// Assumes the LangChain4J 1.x API, where AiServices calls ChatModel.chat(ChatRequest).
ChatModel mockModel = mock(ChatModel.class);
when(mockModel.chat(any(ChatRequest.class)))
    .thenReturn(ChatResponse.builder()
        .aiMessage(AiMessage.from("Mocked response"))
        .build());

// AiService stands for your own service interface, not a library class
var service = AiServices.builder(AiService.class)
    .chatModel(mockModel)
    .build();
```

### 2. Configure Testing Dependencies
Set up the Maven/Gradle dependencies needed for testing. See references/testing-dependencies.md for complete configuration.

**Key dependencies**:
- `langchain4j-test` - Testing utilities and guardrail assertions
- `testcontainers` - Integration testing with containerized services
- `mockito` - Mock external dependencies
- `assertj` - Better assertions

### 3. Implement Integration Tests
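Test with real services using Testcontainers. See references/integration-testing.md for container setup examples.

```java
@Testcontainers
class OllamaIntegrationTest {

    @Container
    static GenericContainer<?> ollama = new GenericContainer<>(
            DockerImageName.parse("ollama/ollama:latest"))
        .withExposedPorts(11434);

    @Test
    void shouldGenerateResponse() {
        // GenericContainer has no getEndpoint(); build the URL from host and mapped port
        String baseUrl = "http://" + ollama.getHost() + ":" + ollama.getMappedPort(11434);

        ChatModel model = OllamaChatModel.builder()
            .baseUrl(baseUrl)
            .modelName("tinyllama") // assumption: this model was pulled into the container beforehand
            .build();

        String response = model.chat("Test query"); // chat(String) in LangChain4J 1.x
        assertNotNull(response);
    }
}
```

### 4. Test Advanced Features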
For streaming responses, memory management, and complex workflows, refer to references/advanced-testing.md.

### 5. Apply Testing Workflows
Follow testing pyramid patterns and best practices from references/workflow-patterns.md.

- **70% Unit Tests**: Fast, isolated business logic testing
- **20% Integration Tests**: Real service interactions
- **10% End-to-End Tests**: Complete user workflows
## Examples
### Basic Unit Test
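```java
@Test
void shouldProcessQueryWithMock() {
    // Assumes LangChain4J 1.x, where AiServices calls ChatModel.chat(ChatRequest)
    ChatModel mockModel = mock(ChatModel.class);
    when(mockModel.chat(any(ChatRequest.class)))
        .thenReturn(ChatResponse.builder()
            .aiMessage(AiMessage.from("Test response"))
            .build());

    // AiService stands for your own service interface with a String chat(String) method
    var service = AiServices.builder(AiService.class)
        .chatModel(mockModel)
        .build();

    String result = service.chat("What is Java?");
    assertEquals("Test response", result);
}
```

### Integration Test with Testcontainers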
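```java
@Testcontainers
class RAGIntegrationTest {

    // Assumptions: both models must be pulled into the container before the test runs
    static final String CHAT_MODEL = "tinyllama";
    static final String EMBEDDING_MODEL = "all-minilm";

    @Container
    static GenericContainer<?> ollama = new GenericContainer<>(
            DockerImageName.parse("ollama/ollama:latest"))
        .withExposedPorts(11434);

    @Test
    void shouldCompleteRAGWorkflow() {
        // GenericContainer has no getEndpoint(); build the URL from host and mapped port
        String baseUrl = "http://" + ollama.getHost() + ":" + ollama.getMappedPort(11434);

        // Setup models and stores
        var chatModel = OllamaChatModel.builder()
            .baseUrl(baseUrl)
            .modelName(CHAT_MODEL)
            .build();
        var embeddingModel = OllamaEmbeddingModel.builder()
            .baseUrl(baseUrl)
            .modelName(EMBEDDING_MODEL)
            .build();
        var store = new InMemoryEmbeddingStore<TextSegment>();

        // Ingest one document so the retriever has something to return
        var segment = TextSegment.from("Spring Boot is a framework for building Java applications.");
        store.add(embeddingModel.embed(segment).content(), segment);

        // The retriever needs only the store and the embedding model
        var retriever = EmbeddingStoreContentRetriever.builder()
            .embeddingStore(store)
            .embeddingModel(embeddingModel)
            .build();

        // Test complete workflow
        var assistant = AiServices.builder(RagAssistant.class)
            .chatModel(chatModel)
            .contentRetriever(retriever)
            .build();

        String response = assistant.chat("What is Spring Boot?");
        assertNotNull(response);
        assertTrue(response.contains("Spring"));
    }
}
```

## Best Practices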
### Test Isolation
- Each test must be independent
- Use `@BeforeEach` and `@AfterEach` for setup/teardown
- Avoid sharing state between tests
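
The isolation rules above can be sketched without any test framework. In this minimal example, `InMemoryDocStore` is a hypothetical stand-in for an embedding store or chat memory (it is not a LangChain4J class), and `freshStore()` plays the role a `@BeforeEach` method would play in JUnit 5.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a store that tests write to (not a LangChain4J class).
class InMemoryDocStore {
    private final List<String> docs = new ArrayList<>();

    void add(String doc) { docs.add(doc); }

    int size() { return docs.size(); }
}

class IsolationSketch {
    // Plays the role of a @BeforeEach method: every test gets its own store.
    static InMemoryDocStore freshStore() {
        return new InMemoryDocStore();
    }

    // "Test" 1 adds two documents to its private store.
    static int testAddsTwoDocs() {
        InMemoryDocStore store = freshStore();
        store.add("Spring Boot guide");
        store.add("LangChain4J notes");
        return store.size();
    }

    // "Test" 2 starts from an empty store regardless of what test 1 did.
    static int testStartsEmpty() {
        InMemoryDocStore store = freshStore();
        return store.size();
    }
}
```

Because neither method touches a shared static store, the two checks give the same result in any execution order, which is the property the bullets above ask for.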
### Mock External Dependencies
- Never call real APIs in unit tests
- Use mocks for ChatModel, EmbeddingModel, and external services
- Test error handling scenarios
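
As a rough illustration of these points, the sketch below uses hand-written stubs instead of Mockito so it runs on the JDK alone. `SimpleChatModel`, `AnswerService`, and `StubModels` are hypothetical names invented for this example; in a real project the stubbed type would be LangChain4J's model interface and the stubs would usually be Mockito mocks.

```java
// Hypothetical minimal model interface; a real project would mock LangChain4J's model type instead.
interface SimpleChatModel {
    String chat(String prompt);
}

// Hypothetical service under test: wraps the model and falls back on failure.
class AnswerService {
    static final String FALLBACK = "Sorry, the assistant is unavailable.";
    private final SimpleChatModel model;

    AnswerService(SimpleChatModel model) {
        this.model = model;
    }

    String answer(String question) {
        try {
            return model.chat(question).trim();
        } catch (RuntimeException e) {
            // In this sketch, provider timeouts and similar failures surface as runtime exceptions.
            return FALLBACK;
        }
    }
}

class StubModels {
    // Stub that always returns a canned reply, so no network call is made.
    static SimpleChatModel canned(String reply) {
        return prompt -> reply;
    }

    // Stub that simulates a provider failure.
    static SimpleChatModel failing() {
        return prompt -> { throw new IllegalStateException("simulated timeout"); };
    }
}
```

A unit test would then assert that `new AnswerService(StubModels.canned(" hi ")).answer("q")` returns the trimmed reply, and that the failing stub yields `AnswerService.FALLBACK` rather than an uncaught exception.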
### Performance Considerations
- Unit tests should run in < 50ms
- Integration tests should use container reuse
- Include timeout assertions for slow operations
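
A timeout assertion can be approximated with the JDK alone, as in the sketch below. `TimeoutSketch` and `slowCall()` are hypothetical names used only for illustration; in a JUnit 5 suite the same idea is normally expressed with `@Timeout` or `assertTimeoutPreemptively`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

class TimeoutSketch {
    // Runs the call on another thread and reports whether it finished within the limit.
    static boolean completesWithin(Supplier<String> call, long limitMillis) {
        CompletableFuture<String> future = CompletableFuture.supplyAsync(call);
        try {
            future.get(limitMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // stop waiting; the result is no longer needed
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            // The call itself failed, so it did not complete successfully in time.
            return false;
        }
    }

    // Hypothetical slow model call used only for the demonstration.
    static String slowCall() {
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "late";
    }
}
```

A test would assert that a fast stub passes a generous limit and that `slowCall()` fails a 50 ms limit, keeping the thresholds far enough apart to avoid timing flakiness.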
### Quality Assertions
- Test both success and error scenarios
- Validate response coherence and relevance
- Include edge case testing (empty inputs, large payloads)
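
Edge-case checks of this kind might look like the following sketch. `PromptGuard` and its 4,000-character limit are assumptions made for the example, not part of LangChain4J; the point is only that empty and oversized inputs get their own explicit assertions.

```java
class PromptGuard {
    static final int MAX_CHARS = 4_000; // assumed limit for this sketch

    // Returns the trimmed prompt, or throws for inputs the service should not forward.
    static String validate(String prompt) {
        if (prompt == null || prompt.isBlank()) {
            throw new IllegalArgumentException("prompt must not be empty");
        }
        String trimmed = prompt.trim();
        if (trimmed.length() > MAX_CHARS) {
            throw new IllegalArgumentException("prompt exceeds " + MAX_CHARS + " characters");
        }
        return trimmed;
    }

    // Small helper so edge-case checks read like assertions.
    static boolean rejects(String prompt) {
        try {
            validate(prompt);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }
}
```

Typical cases to cover are `null`, the empty string, whitespace only, a payload exactly at the limit (accepted), and a payload one character over it (rejected).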
## Reference Documentation
For comprehensive testing guides and API references, see the included reference documents:
- **[Testing Dependencies](references/testing-dependencies.md)** - Maven/Gradle configuration and setup
- **[Unit Testing](references/unit-testing.md)** - Mock models, guardrails, and individual components
- **[Integration Testing](references/integration-testing.md)** - Testcontainers and real service testing
- **[Advanced Testing](references/advanced-testing.md)** - Streaming, memory, and error handling
- **[Workflow Patterns](references/workflow-patterns.md)** - Test pyramid and best practices
## Common Patterns
### Mock Strategy
```java
// For fast unit tests of code that calls ChatModel.chat(String) directly
// (assumes the LangChain4J 1.x API; AiServices goes through chat(ChatRequest) instead)
ChatModel mockModel = mock(ChatModel.class);
when(mockModel.chat(anyString())).thenReturn("Mocked");

// For specific responses
when(mockModel.chat(eq("Hello"))).thenReturn("Hi");
when(mockModel.chat(contains("Java"))).thenReturn("Java response");
```

### Test Configuration
```java
// Use test-specific profiles
@TestPropertySource(properties = {
    "langchain4j.ollama.base-url=http://localhost:11434"
})
class TestConfig {
    // Test with isolated configuration
}
```

### Assertion Helpers
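```java
// Custom assertions for AI responses (AssertJ)
assertThat(response).isNotNull().isNotEmpty();
// For a String, use contains(...) with the keyword collection; containsAll is for collections
assertThat(response).contains(expectedKeywords);
assertThat(response).doesNotContain("error");
```

## Performance Requirements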
- **Unit Tests**: < 50ms per test
- **Integration Tests**: Use container reuse for faster startup
- **Timeout Tests**: Include `@Timeout` for external service calls
- **Memory Management**: Test conversation window limits and cleanup
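
To make the memory-management item concrete, here is a minimal sketch of what a window-limit test exercises. `WindowMemory` is a hypothetical sliding-window memory written for this example; LangChain4J ships its own chat memory types, which a real test would target instead.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sliding-window memory, used here only to show what to assert.
class WindowMemory {
    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    WindowMemory(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be >= 1");
        }
        this.maxMessages = maxMessages;
    }

    // Appends a message and evicts the oldest ones once the window is full.
    void add(String message) {
        messages.addLast(message);
        while (messages.size() > maxMessages) {
            messages.removeFirst();
        }
    }

    // Returns a copy so callers cannot mutate the internal state.
    List<String> messages() {
        return new ArrayList<>(messages);
    }

    // Cleanup hook, e.g. for an @AfterEach method.
    void clear() {
        messages.clear();
    }
}
```

The two behaviours worth asserting are eviction (after adding three messages to a window of two, only the last two remain) and cleanup (after `clear()`, the memory is empty).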
## Security Considerations
- Never use real API keys in tests
- Mock external API calls completely
- Test prompt injection detection
- Validate output sanitization
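
As one possible shape for an output-sanitization check, the sketch below redacts key-like strings from model output. `OutputSanitizer` and the `sk-` pattern are assumptions chosen for illustration only; real key formats and sanitization rules depend on the provider and the application.

```java
import java.util.regex.Pattern;

class OutputSanitizer {
    // Assumed key format for illustration only: "sk-" followed by 20 or more alphanumerics.
    private static final Pattern KEY_LIKE = Pattern.compile("sk-[A-Za-z0-9]{20,}");

    // Replaces every key-like token in the model output with a fixed marker.
    static String redact(String modelOutput) {
        return KEY_LIKE.matcher(modelOutput).replaceAll("[REDACTED]");
    }
}
```

A test would feed a fake key through `redact` and assert that the marker replaces it, while ordinary text passes through unchanged. The fake key keeps the test consistent with the rule above about never using real API keys.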
## Testing Pyramid Implementation
```
70% Unit Tests
├─ Business logic validation
├─ Guardrail testing
├─ Mock tool execution
└─ Edge case handling
20% Integration Tests
├─ Testcontainers with Ollama
├─ Vector store testing
├─ RAG workflow validation
└─ Performance benchmarking
10% End-to-End Tests
├─ Complete user journeys
├─ Real model interactions
└─ Performance under load
```

## Related Skills
- `spring-boot-test-patterns`
- `unit-test-service-layer`
- `unit-test-boundary-conditions`

## References
- [Testing Dependencies](references/testing-dependencies.md)
- [Unit Testing](references/unit-testing.md)
- [Integration Testing](references/integration-testing.md)
- [Advanced Testing](references/advanced-testing.md)
- [Workflow Patterns](references/workflow-patterns.md)
How to Use This Skill
Option A: Project-Specific (Recommended)
- Download the skill file
- In your project, create the directory: .agent/skills/langchain4j-testing-strategies/
- Save the file as SKILL.md
- The agent will automatically discover the skill based on its description.
Option B: Global Installation (All Agents)
Save the file to these locations to make it available across all projects:
- Claude Code: ~/.claude/skills/giuseppe-trisciuoglio/developer-kit/langchain4j-testing-strategies/SKILL.md
- Cursor: ~/.cursor/skills/giuseppe-trisciuoglio/developer-kit/langchain4j-testing-strategies/SKILL.md
- Antigravity: ~/.gemini/antigravity/skills/giuseppe-trisciuoglio/developer-kit/langchain4j-testing-strategies/SKILL.md