aws-sdk-java-v2-bedrock

Name: aws-sdk-java-v2-bedrock
Author: Giuseppe Trisciuoglio

aws · java · bedrock · generative-ai · foundation-models · sdk · spring-boot · ai-models

⭐ 282 stars · 📄 MIT · 🕒 Updated 2026-06-15

Install this skill

npx skills add giuseppe-trisciuoglio/developer-kit

Works across Claude Code, Cursor, Codex, Copilot & Antigravity

The AWS SDK for Java 2.x - Amazon Bedrock skill provides direct programmatic access to generative AI services. Developers use this interface to interact with foundation models including Claude, Llama, and Titan. The skill simplifies the construction of JSON-based payloads required by the Bedrock Runtime, manages client-side authentication, and handles the complexities of streaming responses and asynchronous communication. By utilizing native Java SDK components, it allows for the integration of text generation, image creation, and vector embedding capabilities directly into enterprise JVM applications. It removes the need for manual HTTP request handling, providing typed response objects and established error handling patterns for model interaction. This is the primary method for Java-based agents to perform inference tasks across different AI providers within the AWS ecosystem.

When to Use This Skill

• Building automated customer support chatbots in Spring Boot
• Implementing data enrichment pipelines using vector embeddings
• Developing real-time content generation tools with streaming output
• Creating multi-modal agents that perform image and text synthesis

How to Invoke This Skill

Example prompts that trigger this skill in Claude Code, Cursor, or Antigravity:

“Connect to Amazon Bedrock with Java”
“Invoke Claude using Java SDK”
“Stream model response in Java”
“Generate text embeddings using Titan”
“List available foundation models on AWS”

Pro Tips

💡Always manage AWS credentials securely, ideally using IAM roles for production deployments to follow the principle of least privilege.
💡For optimal performance and cost-efficiency, experiment with different foundation models and fine-tune their parameters to best suit your specific use case.
💡Utilize asynchronous programming with Java's CompletableFuture for non-blocking Bedrock API calls, especially when handling streaming responses to enhance application responsiveness.
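
The last tip can be sketched with plain `CompletableFuture` composition. This is a minimal illustration, not the skill's own code: `ParallelPrompts` and the `invoke` function are hypothetical stand-ins for whatever blocking model call the application already has, and the SDK's `BedrockRuntimeAsyncClient` returns `CompletableFuture` directly if wrapping a blocking call is not wanted.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper: fans several prompts out to a blocking "invoke" function
final class ParallelPrompts {

    private ParallelPrompts() {}

    /** Starts one task per prompt on the executor and completes with the results in prompt order. */
    static CompletableFuture<List<String>> invokeAll(List<String> prompts,
                                                     Function<String, String> invoke,
                                                     Executor executor) {
        // Launch every call first so they run concurrently
        List<CompletableFuture<String>> futures = prompts.stream()
            .map(prompt -> CompletableFuture.supplyAsync(() -> invoke.apply(prompt), executor))
            .collect(Collectors.toList());

        // allOf completes when every task has finished; join() below therefore does not block
        return CompletableFuture.allOf(futures.toArray(new CompletableFuture<?>[0]))
            .thenApply(done -> futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList()));
    }
}
```

A caller would pass something like `prompt -> invokeModel(client, modelId, prompt)` as the `invoke` argument, together with a bounded executor so that concurrency stays within the account's Bedrock quotas.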

What this skill does

• Execute inference requests against Bedrock-hosted foundation models
• Parse and handle real-time streaming data from LLMs
• Generate text and image assets using unified API patterns
• Produce vector embeddings for semantic search or RAG workflows
• Inventory available foundation models through metadata inspection

When not to use it

✕ When model training or fine-tuning is required
✕ When latency-sensitive tasks require local model inference
✕ When direct, non-AWS hosted API endpoints are necessary

Example workflow

1. Initialize BedrockRuntimeClient using local AWS credentials
2. Format the prompt into the specific JSON schema required by the chosen model
3. Call the invokeModel method with the model ID and serialized payload
4. Parse the binary response bytes back into a UTF-8 string
5. Extract the generated content from the model-specific JSON structure

Prerequisites

– Active AWS account with Bedrock model access permissions
– Java 8 or higher environment
– AWS credentials configured via environment variables or the shared config file

Pitfalls & limitations

! Payload schemas vary significantly between model families like Titan and Claude
! High-volume workloads hit model-specific rate limits, so throttling errors must be handled explicitly
! Manual JSON string construction is error-prone without structured request builders

FAQ

Why do I need both Bedrock and BedrockRuntime clients?

The Bedrock client manages infrastructure-level tasks like listing available models, while the BedrockRuntime client is for executing inferences.

Does this support streaming responses?

Yes. Build an InvokeModelWithResponseStreamRequest and pass it to invokeModelWithResponseStream on BedrockRuntimeAsyncClient; the response handler receives chunks as the model generates them.

Which Java version is required?

The AWS SDK for Java 2.x requires Java 8 or later.

How are different model inputs handled?

You must define custom JSON payloads for each model provider, as their input structures differ significantly.

How it compares

Manual REST API calls require you to implement AWS Signature Version 4 authentication yourself. The SDK handles request signing and retry logic internally, which reduces boilerplate code.


📄 Full skill instructions — original source: giuseppe-trisciuoglio/developer-kit

# AWS SDK for Java 2.x - Amazon Bedrock

## When to Use

Use this skill when:
- Listing and inspecting foundation models on Amazon Bedrock
- Invoking foundation models for text generation (Claude, Llama, Titan)
- Generating images with AI models (Stable Diffusion)
- Creating text embeddings for RAG applications
- Implementing streaming responses for real-time generation
- Working with multiple AI providers through a unified API
- Integrating generative AI into Spring Boot applications
- Building AI-powered chatbots and assistants

## Overview

Amazon Bedrock provides access to foundation models from leading AI providers through a unified API. This skill covers patterns for working with various models, including Claude, Llama, Titan, and Stable Diffusion, using the AWS SDK for Java 2.x.

## Quick Start

### Dependencies

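<!-- The two AWS SDK artifacts below omit a version: it is assumed to be managed by importing software.amazon.awssdk:bom in dependencyManagement -->

<!-- Bedrock (model management) -->
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>bedrock</artifactId>
</dependency>

<!-- Bedrock Runtime (model invocation) -->
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>bedrockruntime</artifactId>
</dependency>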

<!-- For JSON processing -->
<dependency>
    <groupId>org.json</groupId>
    <artifactId>json</artifactId>
    <version>20231013</version>
</dependency>

### Basic Client Setup

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.bedrock.BedrockClient;
import software.amazon.awssdk.services.bedrockruntime.BedrockRuntimeClient;

// Model management client
BedrockClient bedrockClient = BedrockClient.builder()
    .region(Region.US_EAST_1)
    .build();

// Model invocation client
BedrockRuntimeClient bedrockRuntimeClient = BedrockRuntimeClient.builder()
    .region(Region.US_EAST_1)
    .build();

## Core Patterns

### Model Discovery

import software.amazon.awssdk.services.bedrock.model.*;
import java.util.List;

public List<FoundationModelSummary> listFoundationModels(BedrockClient bedrockClient) {
    return bedrockClient.listFoundationModels().modelSummaries();
}

### Model Invocation

import software.amazon.awssdk.core.SdkBytes;
import software.amazon.awssdk.services.bedrockruntime.model.*;
import org.json.JSONObject;

public String invokeModel(BedrockRuntimeClient client, String modelId, String prompt) {
    JSONObject payload = createPayload(modelId, prompt);

    InvokeModelResponse response = client.invokeModel(request -> request
        .modelId(modelId)
        .body(SdkBytes.fromUtf8String(payload.toString())));

    return extractTextFromResponse(modelId, response.body().asUtf8String());
}

private JSONObject createPayload(String modelId, String prompt) {
    if (modelId.startsWith("anthropic.claude")) {
        return new JSONObject()
            .put("anthropic_version", "bedrock-2023-05-31")
            .put("max_tokens", 1000)
            .put("messages", new JSONObject[]{
                new JSONObject().put("role", "user").put("content", prompt)
            });
    } else if (modelId.startsWith("amazon.titan")) {
        return new JSONObject()
            .put("inputText", prompt)
            .put("textGenerationConfig", new JSONObject()
                .put("maxTokenCount", 512)
                .put("temperature", 0.7));
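    } else if (modelId.startsWith("meta.llama")) {
        // Llama 3.x instruct prompt template; the Llama 2 style [INST] tags are not part of it
        return new JSONObject()
            .put("prompt", "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
                + prompt + "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n")
            .put("max_gen_len", 512)
            .put("temperature", 0.7);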
    }
    throw new IllegalArgumentException("Unsupported model: " + modelId);
}
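
The prefix checks in `createPayload`, and the `extractTextFromResponse` helper that the snippet calls but does not show, both depend on the model family. One way to keep that knowledge in a single place is sketched below. `ModelFamily` is a hypothetical enum, not an SDK type. The JSON Pointer strings reflect the response shapes of the Anthropic Messages, Titan Text, and Llama formats as I understand them, so verify them against the model reference before relying on them. The Titan prefix is deliberately narrowed to `amazon.titan-text` so that embedding models are not matched.

```java
// Hypothetical enum: maps a Bedrock model ID to its family and to the
// JSON Pointer (RFC 6901) of the generated text inside the response body
enum ModelFamily {
    CLAUDE("anthropic.claude", "/content/0/text"),
    TITAN_TEXT("amazon.titan-text", "/results/0/outputText"),
    LLAMA("meta.llama", "/generation");

    private final String idPrefix;
    private final String responseTextPointer;

    ModelFamily(String idPrefix, String responseTextPointer) {
        this.idPrefix = idPrefix;
        this.responseTextPointer = responseTextPointer;
    }

    /** Where the generated text is assumed to live in this family's response JSON. */
    String responseTextPointer() {
        return responseTextPointer;
    }

    /** Resolves the family from a model ID, mirroring the startsWith checks in createPayload. */
    static ModelFamily fromModelId(String modelId) {
        for (ModelFamily family : values()) {
            if (modelId.startsWith(family.idPrefix)) {
                return family;
            }
        }
        throw new IllegalArgumentException("Unsupported model: " + modelId);
    }
}
```

With a JSON library that supports pointers, extraction becomes a one-liner per response. With Jackson, for example, `new ObjectMapper().readTree(body).at(family.responseTextPointer()).asText()` would do it.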

### Streaming Responses

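import software.amazon.awssdk.services.bedrockruntime.BedrockRuntimeAsyncClient;

// Streaming operations are exposed on the async client, not on the synchronous BedrockRuntimeClient
public void streamResponse(BedrockRuntimeAsyncClient client, String modelId, String prompt) {
    JSONObject payload = createPayload(modelId, prompt);

    InvokeModelWithResponseStreamRequest streamRequest =
        InvokeModelWithResponseStreamRequest.builder()
            .modelId(modelId)
            .body(SdkBytes.fromUtf8String(payload.toString()))
            .build();

    InvokeModelWithResponseStreamResponseHandler handler =
        InvokeModelWithResponseStreamResponseHandler.builder()
            .subscriber(InvokeModelWithResponseStreamResponseHandler.Visitor.builder()
                // Each chunk is a PayloadPart carrying one model-specific JSON fragment
                .onChunk(part -> processChunk(modelId, part.bytes().asUtf8String()))
                .build())
            .build();

    // Blocks until the stream ends; return the future instead to stay non-blocking
    client.invokeModelWithResponseStream(streamRequest, handler).join();
}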

### Text Embeddings

import org.json.JSONArray;
import org.json.JSONObject;

public double[] createEmbeddings(BedrockRuntimeClient client, String text) {
    String modelId = "amazon.titan-embed-text-v1";

    JSONObject payload = new JSONObject().put("inputText", text);

    InvokeModelResponse response = client.invokeModel(request -> request
        .modelId(modelId)
        .body(SdkBytes.fromUtf8String(payload.toString())));

    JSONObject responseBody = new JSONObject(response.body().asUtf8String());
    JSONArray embeddingArray = responseBody.getJSONArray("embedding");

    double[] embeddings = new double[embeddingArray.length()];
    for (int i = 0; i < embeddingArray.length(); i++) {
        embeddings[i] = embeddingArray.getDouble(i);
    }

    return embeddings;
}
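
For the RAG and semantic search use cases mentioned earlier, vectors like the one returned by `createEmbeddings` are usually compared with cosine similarity. The helper below is a small sketch of that comparison. `EmbeddingMath` is a hypothetical utility class, not part of the AWS SDK.

```java
// Hypothetical utility: compares two embedding vectors of the same dimension
final class EmbeddingMath {

    private EmbeddingMath() {}

    /** Cosine similarity in [-1, 1]; returns 0.0 when either vector has zero magnitude. */
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "Vectors must have the same length: " + a.length + " vs " + b.length);
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // avoid dividing by zero for an all-zero vector
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Ranking stored documents against a query then amounts to embedding the query once and sorting the documents by `cosineSimilarity(queryVector, documentVector)` in descending order.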

### Spring Boot Integration

@Configuration
public class BedrockConfiguration {

    @Bean
    public BedrockClient bedrockClient() {
        return BedrockClient.builder()
            .region(Region.US_EAST_1)
            .build();
    }

    @Bean
    public BedrockRuntimeClient bedrockRuntimeClient() {
        return BedrockRuntimeClient.builder()
            .region(Region.US_EAST_1)
            .build();
    }
}

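import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.Map;

@Service
public class BedrockAIService {

    private final BedrockRuntimeClient bedrockRuntimeClient;
    // ObjectMapper is thread-safe once configured, so one shared instance is enough
    private final ObjectMapper objectMapper = new ObjectMapper();

    @Value("${bedrock.default-model-id:anthropic.claude-sonnet-4-5-20250929-v1:0}")
    private String defaultModelId;

    public BedrockAIService(BedrockRuntimeClient bedrockRuntimeClient) {
        this.bedrockRuntimeClient = bedrockRuntimeClient;
    }

    public String generateText(String prompt) {
        return generateText(prompt, defaultModelId);
    }

    public String generateText(String prompt, String modelId) {
        // createPayload (returning a Map here) and extractTextFromResponse are model-specific
        // helpers this class is assumed to define, along the lines of the Model Invocation pattern
        Map<String, Object> payload = createPayload(modelId, prompt);
        String payloadJson;
        try {
            payloadJson = objectMapper.writeValueAsString(payload);
        } catch (JsonProcessingException e) {
            // writeValueAsString declares a checked exception, so it has to be handled
            throw new IllegalStateException("Could not serialize Bedrock request payload", e);
        }

        InvokeModelResponse response = bedrockRuntimeClient.invokeModel(
            request -> request
                .modelId(modelId)
                .body(SdkBytes.fromUtf8String(payloadJson)));

        return extractTextFromResponse(modelId, response.body().asUtf8String());
    }
}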

## Basic Usage Example

BedrockRuntimeClient client = BedrockRuntimeClient.builder()
    .region(Region.US_EAST_1)
    .build();

String prompt = "Explain quantum computing in simple terms";
String response = invokeModel(client, "anthropic.claude-sonnet-4-5-20250929-v1:0", prompt);
System.out.println(response);

## Best Practices

### Model Selection
- **Claude Sonnet 4.5**: Best for complex reasoning, analysis, and creative tasks
- **Claude Haiku 4.5**: Fast and affordable for real-time applications
- **Claude 3.7 Sonnet**: Earlier generation that still offers strong reasoning
- **Llama 3.1**: Open-weight alternative, good for general tasks (newer Llama releases are listed under Common Model IDs)
- **Titan**: AWS native, cost-effective for simple text generation

### Performance Optimization
- Reuse client instances (don't create new clients for each request)
- Use async clients for I/O operations
- Implement streaming for long responses
- Cache foundation model lists

### Security
- Never log sensitive prompt data
- Prefer IAM roles for authentication over long-lived access keys, and never hard-code keys
- Implement rate limiting for public applications
- Sanitize user inputs to prevent prompt injection
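
The rate-limiting advice above could be implemented in many ways. A token bucket is one common choice, sketched here under stated assumptions: `TokenBucket` is a hypothetical in-process class, it limits a single JVM only, and a deployment with several instances would need a shared limiter instead.

```java
// Hypothetical in-process limiter: allows bursts up to "capacity" and refills at a steady rate
final class TokenBucket {

    private final long capacity;
    private final double refillPerSecond;
    private double tokens;
    private long lastRefillNanos;

    /** Starts full, using System.nanoTime() as the clock. */
    TokenBucket(long capacity, double refillPerSecond) {
        this(capacity, refillPerSecond, System.nanoTime());
    }

    /** Same as above with an explicit start time, which keeps the class testable. */
    TokenBucket(long capacity, double refillPerSecond, long startNanos) {
        if (capacity < 1 || refillPerSecond <= 0) {
            throw new IllegalArgumentException("capacity must be >= 1 and refillPerSecond > 0");
        }
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;
        this.lastRefillNanos = startNanos;
    }

    /** Takes one token if available; returns false when the caller should be rejected or delayed. */
    boolean tryAcquire() {
        return tryAcquire(System.nanoTime());
    }

    synchronized boolean tryAcquire(long nowNanos) {
        long elapsedNanos = nowNanos - lastRefillNanos;
        if (elapsedNanos > 0) {
            // Add the tokens earned since the last call, never exceeding capacity
            tokens = Math.min(capacity, tokens + elapsedNanos * refillPerSecond / 1_000_000_000.0);
            lastRefillNanos = nowNanos;
        }
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

A public endpoint might keep one bucket per API key or user and answer with HTTP 429 whenever `tryAcquire()` returns false, before any Bedrock call is made.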

### Error Handling
- Implement retry logic for throttling (exponential backoff)
- Handle model-specific validation errors
- Validate responses before processing
- Use proper exception handling for different error types
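
The SDK already retries some failures on its own, so application-level retries are mainly useful for throttling that outlasts the built-in attempts. The helper below is a minimal sketch of exponential backoff with full jitter. `BackoffRetry` is a hypothetical class, and the decision about which exceptions are retryable is left to a caller-supplied predicate.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Predicate;

// Hypothetical helper: retries a call with exponential backoff and full jitter
final class BackoffRetry {

    private BackoffRetry() {}

    /**
     * Runs the call up to maxAttempts times. Only exceptions accepted by isRetryable
     * are retried; anything else, or the final failure, is rethrown to the caller.
     */
    static <T> T call(Callable<T> call, Predicate<Exception> isRetryable,
                      int maxAttempts, long baseDelayMillis) throws Exception {
        if (maxAttempts < 1 || baseDelayMillis < 0) {
            throw new IllegalArgumentException("maxAttempts must be >= 1 and baseDelayMillis >= 0");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts || !isRetryable.test(e)) {
                    throw e;
                }
                // Ceiling doubles per attempt: base, 2*base, 4*base, ... (shift capped to avoid overflow)
                long ceilingMillis = baseDelayMillis << Math.min(attempt - 1, 20);
                // Full jitter: sleep a random duration between 0 and the ceiling, inclusive
                Thread.sleep(ThreadLocalRandom.current().nextLong(ceilingMillis + 1));
            }
        }
    }
}
```

For Bedrock, the predicate would typically test for the runtime's throttling exception, for example `e -> e instanceof ThrottlingException`, while validation errors are better surfaced immediately because retrying them cannot succeed.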

### Cost Optimization
- Use appropriate max_tokens limits
- Choose cost-effective models for simple tasks
- Cache embeddings when possible
- Monitor usage and set budget alerts
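
Caching embeddings, as suggested above, can be as simple as memoizing by input text. The sketch below assumes identical text always yields the same vector for a given model. `CachingEmbedder` is a hypothetical wrapper, and its delegate stands in for a call such as `text -> createEmbeddings(client, text)`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical wrapper: remembers embeddings so repeated text does not trigger a new model call
final class CachingEmbedder {

    private final Map<String, double[]> cache = new ConcurrentHashMap<>();
    private final Function<String, double[]> delegate;

    CachingEmbedder(Function<String, double[]> delegate) {
        this.delegate = delegate;
    }

    /** Returns the vector for this text, calling the delegate only on a cache miss. */
    double[] embed(String text) {
        // clone() hands out a copy so callers cannot mutate the cached array
        return cache.computeIfAbsent(text, delegate).clone();
    }

    /** Number of distinct texts currently cached. */
    int size() {
        return cache.size();
    }
}
```

This version never evicts entries, so it only suits a bounded set of inputs. A long-running service would want a size or time limit, or an external cache.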

## Common Model IDs

// Claude Models
public static final String CLAUDE_SONNET_4_5 = "anthropic.claude-sonnet-4-5-20250929-v1:0";
public static final String CLAUDE_HAIKU_4_5 = "anthropic.claude-haiku-4-5-20251001-v1:0";
public static final String CLAUDE_OPUS_4_1 = "anthropic.claude-opus-4-1-20250805-v1:0";
public static final String CLAUDE_3_7_SONNET = "anthropic.claude-3-7-sonnet-20250219-v1:0";
public static final String CLAUDE_OPUS_4 = "anthropic.claude-opus-4-20250514-v1:0";
public static final String CLAUDE_SONNET_4 = "anthropic.claude-sonnet-4-20250514-v1:0";
public static final String CLAUDE_3_5_SONNET_V2 = "anthropic.claude-3-5-sonnet-20241022-v2:0";
public static final String CLAUDE_3_5_HAIKU = "anthropic.claude-3-5-haiku-20241022-v1:0";
public static final String CLAUDE_3_OPUS = "anthropic.claude-3-opus-20240229-v1:0";

// Llama Models
public static final String LLAMA_3_3_70B = "meta.llama3-3-70b-instruct-v1:0";
public static final String LLAMA_3_2_90B = "meta.llama3-2-90b-instruct-v1:0";
public static final String LLAMA_3_2_11B = "meta.llama3-2-11b-instruct-v1:0";
public static final String LLAMA_3_2_3B = "meta.llama3-2-3b-instruct-v1:0";
public static final String LLAMA_3_2_1B = "meta.llama3-2-1b-instruct-v1:0";
public static final String LLAMA_4_MAV_17B = "meta.llama4-maverick-17b-instruct-v1:0";
public static final String LLAMA_4_SCOUT_17B = "meta.llama4-scout-17b-instruct-v1:0";
public static final String LLAMA_3_1_405B = "meta.llama3-1-405b-instruct-v1:0";
public static final String LLAMA_3_1_70B = "meta.llama3-1-70b-instruct-v1:0";
public static final String LLAMA_3_1_8B = "meta.llama3-1-8b-instruct-v1:0";
public static final String LLAMA_3_70B = "meta.llama3-70b-instruct-v1:0";
public static final String LLAMA_3_8B = "meta.llama3-8b-instruct-v1:0";

// Amazon Titan Models
public static final String TITAN_TEXT_EXPRESS = "amazon.titan-text-express-v1";
public static final String TITAN_TEXT_LITE = "amazon.titan-text-lite-v1";
public static final String TITAN_EMBEDDINGS = "amazon.titan-embed-text-v1";
public static final String TITAN_IMAGE_GENERATOR = "amazon.titan-image-generator-v1";

// Stable Diffusion
public static final String STABLE_DIFFUSION_XL = "stability.stable-diffusion-xl-v1";

// Mistral AI Models
public static final String MISTRAL_LARGE_2407 = "mistral.mistral-large-2407-v1:0";
public static final String MISTRAL_LARGE_2402 = "mistral.mistral-large-2402-v1:0";
public static final String MISTRAL_SMALL_2402 = "mistral.mistral-small-2402-v1:0";
public static final String MISTRAL_PIXTRAL_2502 = "mistral.pixtral-large-2502-v1:0";
public static final String MISTRAL_MIXTRAL_8X7B = "mistral.mixtral-8x7b-instruct-v0:1";
public static final String MISTRAL_7B = "mistral.mistral-7b-instruct-v0:2";

// Amazon Nova Models
public static final String NOVA_PREMIER = "amazon.nova-premier-v1:0";
public static final String NOVA_PRO = "amazon.nova-pro-v1:0";
public static final String NOVA_LITE = "amazon.nova-lite-v1:0";
public static final String NOVA_MICRO = "amazon.nova-micro-v1:0";
public static final String NOVA_CANVAS = "amazon.nova-canvas-v1:0";
public static final String NOVA_REEL = "amazon.nova-reel-v1:1";

// Other Models
public static final String COHERE_COMMAND = "cohere.command-text-v14";
public static final String DEEPSEEK_R1 = "deepseek.r1-v1:0";
public static final String DEEPSEEK_V3_1 = "deepseek.v3-v1:0";

## Examples

See the [examples directory](examples/) for comprehensive usage patterns.

## Advanced Topics

See the [Advanced Topics](references/advanced-topics.md) for:
- Multi-model service patterns
- Advanced error handling with retries
- Batch processing strategies
- Performance optimization techniques
- Custom response parsing

## Model Reference

See the [Model Reference](references/model-reference.md) for:
- Detailed model specifications
- Payload/response formats for each provider
- Performance characteristics
- Model selection guidelines
- Configuration templates

## Testing Strategies

See the [Testing Strategies](references/testing-strategies.md) for:
- Unit testing with mocked clients
- Integration testing with LocalStack
- Performance testing
- Streaming response testing
- Test data management

## Related Skills

- aws-sdk-java-v2-core - Core AWS SDK patterns
- langchain4j-ai-services-patterns - LangChain4j integration
- spring-boot-dependency-injection - Spring DI patterns
- spring-boot-test-patterns - Spring testing patterns

## References

- [AWS Bedrock User Guide](references/aws-bedrock-user-guide.md)
- [AWS SDK for Java 2.x Documentation](references/aws-sdk-java-bedrock-api.md)
- [Bedrock API Reference](references/aws-bedrock-api-reference.md)
- [AWS SDK Examples](references/aws-sdk-examples.md)
- [Official AWS Examples](bedrock_code_examples.md)
- [Supported Models](bedrock_models_supported.md)
- [Runtime Examples](bedrock_runtime_code_examples.md)

By Giuseppe Trisciuoglio

How to Use This Skill Unit

Option A: Project-Specific (Recommended)

Click "Download" above
In your project, create the directory: .agent/skills/aws-sdk-java-v2-bedrock/
Save the file as SKILL.md
The agent will automatically discover the skill based on its description.

Option B: Global Installation (All Agents)

Save the file to these locations to make it available across all projects:

Claude Code: ~/.claude/skills/giuseppe-trisciuoglio/developer-kit/aws-sdk-java-v2-bedrock/SKILL.md
Cursor: ~/.cursor/skills/giuseppe-trisciuoglio/developer-kit/aws-sdk-java-v2-bedrock/SKILL.md
Antigravity: ~/.gemini/antigravity/skills/giuseppe-trisciuoglio/developer-kit/aws-sdk-java-v2-bedrock/SKILL.md
