Install this skill
npx skills add softaworks/agent-toolkit
Works across Claude Code, Cursor, Codex, Copilot & Antigravity
datadog-cli is a command-line interface that lets AI agents query Datadog observability data directly. It retrieves and analyzes logs, metrics, and trace information through the Datadog API, without manual navigation of the web interface. It supports search queries, real-time log streaming, and error aggregation, which speeds up incident response. Authentication requires a standard Datadog API key and application key, and multi-region configurations are supported. With it, agents can identify bottlenecks, troubleshoot service failures, and compare system performance across defined time windows. Output is structured, so it is ready for further programmatic analysis or remediation steps within an automated deployment pipeline.
When to Use This Skill
- Automatically investigating spikes in error rates for a specific microservice
- Comparing system health metrics before and after a recent deployment
- Tracing the lifecycle of a specific request through correlated distributed logs
- Generating an automated summary of service errors to attach to a bug report
How to Invoke This Skill
Example prompts that trigger this skill in Claude Code, Cursor, or Antigravity:
- "Check the last hour of error logs for the API service"
- "Compare current error counts with the previous 60 minutes"
- "Show me the log patterns for the auth-service failures"
- "Query the average CPU usage for the production cluster over the last day"
- "Fetch logs associated with trace ID xyz123"
Pro Tips
- Always specify a precise time range (`--from`, `--to`) when searching logs or querying metrics to narrow results and improve efficiency.
- Use `logs trace` with a trace ID to quickly isolate all logs related to a specific distributed transaction.
- Use `logs patterns` after an initial broad search to identify common error messages or trends within high-volume logs.
- Set the `--site` flag for non-US Datadog instances to ensure proper connectivity and data retrieval.
What this skill does
- Execute granular log searches and real-time streaming
- Query timeseries metrics with support for custom intervals
- Identify log patterns to reduce noise from repetitive errors
- Retrieve contextual logs surrounding specific event timestamps
- Manage dashboard configurations and resource lists programmatically
- Perform comparative analysis of log volumes across two distinct time periods
When not to use it
- Modifying complex Datadog infrastructure or account-level billing settings
- Performing high-volume visual dashboard monitoring that requires real-time human intuition
- Executing bulk configuration changes that are better handled via Terraform
Example workflow
- Run an error summary command to identify failing services
- Use the compare command to determine if the error count is an anomaly
- Execute a patterns command to group the specific error types
- Search specific logs for the dominant pattern identified
- Perform a trace lookup using a log-derived ID to see the full request path
Prerequisites
- Datadog API Key
- Datadog Application Key
- Node.js environment (for npx usage)
Pitfalls & limitations
- Standardized output may omit metadata if not properly scoped by query filters
- Large query windows can result in long execution times or data truncation
- Non-US Datadog accounts require an explicit `--site` flag to avoid connection errors
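One way to avoid forgetting the `--site` flag is a thin shell wrapper that appends it whenever a site is configured. This is a minimal sketch, not part of the CLI: the function name `ddsite` and the variables `DD_SITE` and `DD_CLI` are names chosen for this example, and the CLI itself is only known to accept `--site` on the command line.

```shell
# Hypothetical wrapper (not part of datadog-cli).
# DD_SITE: site to pass via --site, e.g. datadoghq.eu (a name chosen for this sketch).
# DD_CLI:  optional override of the command, e.g. DD_CLI=echo for a dry run.
ddsite() {
  if [ -n "${DD_SITE:-}" ]; then
    # Unquoted expansion is intentional so the default splits into "npx" and the package name.
    ${DD_CLI:-npx @leoflores/datadog-cli} "$@" --site "$DD_SITE"
  else
    ${DD_CLI:-npx @leoflores/datadog-cli} "$@"
  fi
}
```

With `DD_SITE=datadoghq.eu` exported, `ddsite logs search --query "status:error" --from 1h` would run the same search with `--site datadoghq.eu` appended.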
FAQ
How it compares
Unlike manual web browser exploration, this CLI provides machine-parsable, automated data access that integrates directly into an agent's execution loop, reducing context-switching latency.
Full skill instructions (original source: softaworks/agent-toolkit)
A CLI tool for AI agents to debug and triage using Datadog logs and metrics.
## Required Reading
**You MUST read the relevant reference docs before using any command:**
- [Log Commands](references/logs-commands.md)
- [Metrics](references/metrics.md)
- [Query Syntax](references/query-syntax.md)
- [Workflows](references/workflows.md)
- [Dashboards](references/dashboards.md)
## Setup
### Environment Variables (Required)
    export DD_API_KEY="your-api-key"
    export DD_APP_KEY="your-app-key"

Get keys from: https://app.datadoghq.com/organization-settings/api-keys
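Because both variables are required, a small guard can fail fast before any command is run. This is a minimal sketch under that assumption; the function name `require_dd_keys` is hypothetical and not something the CLI provides.

```shell
# Hypothetical helper (not part of datadog-cli): report every missing key,
# then return non-zero if at least one is unset or empty.
require_dd_keys() {
  missing=0
  if [ -z "${DD_API_KEY:-}" ]; then
    echo "error: DD_API_KEY is not set" >&2
    missing=1
  fi
  if [ -z "${DD_APP_KEY:-}" ]; then
    echo "error: DD_APP_KEY is not set" >&2
    missing=1
  fi
  return "$missing"
}
```

It can be chained in front of any command, for example `require_dd_keys && npx @leoflores/datadog-cli errors --from 1h --pretty`.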
### Running the CLI
    npx @leoflores/datadog-cli <command>

For non-US Datadog sites, use the `--site` flag:

    npx @leoflores/datadog-cli logs search --query "*" --site datadoghq.eu

## Commands Overview
| Command | Description |
|---------|-------------|
| `logs search` | Search logs with filters |
| `logs tail` | Stream logs in real-time |
| `logs trace` | Find logs for a distributed trace |
| `logs context` | Get logs before/after a timestamp |
| `logs patterns` | Group similar log messages |
| `logs compare` | Compare log counts between periods |
| `logs multi` | Run multiple queries in parallel |
| `logs agg` | Aggregate logs by facet |
| `metrics query` | Query timeseries metrics |
| `errors` | Quick error summary by service/type |
| `services` | List services with log activity |
| `dashboards` | Manage dashboards (CRUD) |
| `dashboard-lists` | Manage dashboard lists |

## Quick Examples
### Search Errors

    npx @leoflores/datadog-cli logs search --query "status:error" --from 1h --pretty

### Tail Logs (Real-time)

    npx @leoflores/datadog-cli logs tail --query "service:api status:error" --pretty

### Error Summary

    npx @leoflores/datadog-cli errors --from 1h --pretty

### Trace Correlation

    npx @leoflores/datadog-cli logs trace --id "abc123def456" --pretty

### Query Metrics

    npx @leoflores/datadog-cli metrics query --query "avg:system.cpu.user{*}" --from 1h --pretty

### Compare Periods

    npx @leoflores/datadog-cli logs compare --query "status:error" --period 1h --pretty

## Global Flags
| Flag | Description |
|------|-------------|
| `--pretty` | Human-readable output with colors |
| `--output <file>` | Export results to JSON file |
| `--site <site>` | Datadog site (e.g., datadoghq.eu) |

## Time Formats
- **Relative**: `30m`, `1h`, `6h`, `24h`, `7d`
- **ISO 8601**: `2024-01-15T10:30:00Z`

## Incident Triage Workflow
    # 1. Quick error overview
    npx @leoflores/datadog-cli errors --from 1h --pretty

    # 2. Is this new? Compare to previous period
    npx @leoflores/datadog-cli logs compare --query "status:error" --period 1h --pretty

    # 3. Find error patterns
    npx @leoflores/datadog-cli logs patterns --query "status:error" --from 1h --pretty

    # 4. Narrow down by service
    npx @leoflores/datadog-cli logs search --query "status:error service:api" --from 1h --pretty

    # 5. Get context around a timestamp
    npx @leoflores/datadog-cli logs context --timestamp "2024-01-15T10:30:00Z" --service api --pretty

    # 6. Follow the distributed trace
    npx @leoflores/datadog-cli logs trace --id "TRACE_ID" --pretty

See [workflows.md](references/workflows.md) for more debugging workflows.
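Steps 1 to 3 of the triage workflow need no incident-specific input, so they can be scripted to capture a snapshot for later analysis. The following is a sketch, not a documented feature: the function names `ddcli` and `triage_snapshot`, the `DD_CLI` override variable, and the output file names are all hypothetical. Only the subcommands and the `--from`, `--period`, `--query`, and `--output` flags are taken from the tables and examples in this document.

```shell
# Hypothetical wrapper: set DD_CLI=echo to print the commands instead of running them.
ddcli() {
  # Unquoted expansion is intentional so the default splits into "npx" and the package name.
  ${DD_CLI:-npx @leoflores/datadog-cli} "$@"
}

# Run triage steps 1-3 for one time window and export each result as JSON.
# $1 = relative window such as 1h (default 1h), $2 = output directory (default: current dir).
triage_snapshot() {
  window="${1:-1h}"
  outdir="${2:-.}"
  mkdir -p "$outdir" || return 1
  # Step 1: quick error overview
  ddcli errors --from "$window" --output "$outdir/errors.json" || return 1
  # Step 2: compare against the previous period of the same length
  ddcli logs compare --query "status:error" --period "$window" --output "$outdir/compare.json" || return 1
  # Step 3: group similar error messages
  ddcli logs patterns --query "status:error" --from "$window" --output "$outdir/patterns.json"
}
```

Calling `triage_snapshot 1h ./incident` would leave three JSON files in `./incident`. Steps 4 to 6 depend on what those results show (a service name, a timestamp, a trace ID), so they are left to be run by hand.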
How to Use This Skill Unit
Option A: Project-Specific (Recommended)
- Click "Download" above
- In your project, create the directory: `.agent/skills/datadog-cli/`
- Save the file as `SKILL.md`
- The agent will automatically discover the skill based on its description.
Option B: Global Installation (All Agents)
Save the file to these locations to make it available across all projects:
- Claude Code: `~/.claude/skills/softaworks/agent-toolkit/datadog-cli/SKILL.md`
- Cursor: `~/.cursor/skills/softaworks/agent-toolkit/datadog-cli/SKILL.md`
- Antigravity: `~/.gemini/antigravity/skills/softaworks/agent-toolkit/datadog-cli/SKILL.md`
Install with CLI: `npx skills add softaworks/agent-toolkit`