Data Analyst Agent

Tags: data-analysis, sql, pandas, python, statistics
4.9 (153) · 114.7k · Apache-2.0 · 2026-06-15

Install this skill

npx skills add Shubhamsaboo/awesome-llm-apps

Works across Claude Code, Cursor, Codex, Copilot & Antigravity

What this skill does

  • Constructing complex SQL queries involving window functions and multi-table joins
  • Automating data cleaning and normalization routines using pandas
  • Executing statistical hypothesis testing and descriptive analytics
  • Optimizing database performance through query indexing suggestions
  • Generating time-series projections and data pattern visualizations
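As an illustration of the first capability, here is a minimal sketch of a multi-table join combined with a window function. It uses Python's built-in sqlite3 module so it runs without a database server. The `users` and `orders` tables and their rows are hypothetical, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER,
                         order_date TEXT, amount REAL);
    INSERT INTO users  VALUES (1, 'ana'), (2, 'ben');
    INSERT INTO orders VALUES
        (1, 1, '2024-01-05', 40.0),
        (2, 1, '2024-01-20', 60.0),
        (3, 2, '2024-01-11', 25.0);
""")

# Join orders to users, then compute a per-user running total with a
# window function (SUM ... OVER). Requires SQLite 3.25 or newer.
query = """
    SELECT u.name,
           o.order_date,
           o.amount,
           SUM(o.amount) OVER (
               PARTITION BY u.id
               ORDER BY o.order_date
           ) AS running_total
    FROM orders AS o
    JOIN users  AS u ON u.id = o.user_id
    ORDER BY u.name, o.order_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Other database engines accept similar syntax, but dialect details differ, so treat this as a starting point rather than portable SQL.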

When to use it

  • When refactoring inefficient SQL queries for faster execution
  • When cleaning inconsistent datasets before model ingestion
  • When identifying statistical correlations between variables
  • When generating data summary reports for stakeholder meetings
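For the correlation use case, a request might yield something like the sketch below. It computes pairwise Pearson correlations with pandas. The column names and values are hypothetical and were chosen so that one pair is perfectly linear.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame({
    "ad_spend": [10, 20, 30, 40, 50],
    "revenue":  [25, 45, 65, 85, 105],   # exactly 2 * ad_spend + 5
    "returns":  [5, 3, 4, 1, 2],
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr(method="pearson")
print(corr.round(2))
```

Here `ad_spend` and `revenue` correlate at 1.0 because one is a linear function of the other, while `ad_spend` and `returns` correlate at -0.8.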

When not to use it

  • For unstructured data analysis requiring natural language processing
  • For real-time streaming data ingestion tasks
  • For heavy-duty graphical dashboard generation

How to invoke it

Example prompts that trigger this skill:

  • Clean the following CSV data and calculate the rolling average for the last 30 days.
  • Write a SQL query to join the user and orders tables and identify high-value churners.
  • Perform a descriptive statistical analysis on this dataframe to identify outliers.
  • Optimize this JOIN query to reduce latency on a table with over a million rows.
  • Explain the potential bias in this dataset based on the current distribution.
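To make the first prompt concrete, the following sketch shows one way the cleaning and rolling-average step could look in pandas. The CSV contents and the `date` and `sales` column names are assumptions. Dropping rows with unparseable dates is one possible cleaning choice, not the only one.

```python
import io
import pandas as pd

# Hypothetical CSV contents; the column names are assumptions.
raw = io.StringIO(
    "date,sales\n"
    "2024-01-01,100\n"
    "2024-01-02,\n"            # missing sales value
    "2024-01-03,130\n"
    "not-a-date,999\n"         # unparseable date
    "2024-01-04,160\n"
)

df = pd.read_csv(raw)

# Coerce unparseable dates to NaT, then drop those rows.
df["date"] = pd.to_datetime(df["date"], errors="coerce")
df = df.dropna(subset=["date"]).sort_values("date").set_index("date")

# Time-based window: mean over the trailing 30 calendar days.
# Missing sales values are skipped by the mean rather than filled.
df["sales_30d_avg"] = df["sales"].rolling("30D", min_periods=1).mean()
print(df)
```

The `"30D"` offset makes the window cover calendar days, not a fixed number of rows, which matters when dates have gaps.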

Example workflow

  1. Initialize the agent with access to your local database or CSV files.
  2. Submit a request to perform initial exploratory data analysis to understand schema constraints.
  3. Direct the agent to clean missing values and format date columns using pandas.
  4. Instruct the agent to run SQL aggregation queries to summarize key KPIs.
  5. Request an interpretation of the resulting metrics compared to previous benchmarks.
  6. Review the suggested performance improvements for the underlying data structures.
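Steps 1 through 4 of the workflow above can be sketched as a small script. This is an illustration under stated assumptions, not the agent's actual output. The `orders` data, its column names, and the decision to fill missing amounts with 0 are all hypothetical. An in-memory CSV and an in-memory SQLite database stand in for real sources.

```python
import io
import sqlite3
import pandas as pd

# Step 1: load the data (an in-memory CSV stands in for a real file).
# All column names and values are hypothetical.
raw = io.StringIO(
    "order_id,region,order_date,amount\n"
    "1,north,2024-01-05,120.0\n"
    "2,south,2024-01-06,\n"
    "3,north,2024-02-01,80.0\n"
    "4,south,2024-02-03,50.0\n"
)
df = pd.read_csv(raw)

# Step 2: exploratory pass over dtypes and missing values.
print(df.dtypes)
print(df.isna().sum())

# Step 3: clean missing values and format the date column.
# Filling with 0.0 is an assumption; dropping or imputing may fit better.
df["amount"] = df["amount"].fillna(0.0)
df["order_date"] = pd.to_datetime(df["order_date"]).dt.strftime("%Y-%m-%d")

# Step 4: load into SQLite and summarize KPIs with a SQL aggregation.
conn = sqlite3.connect(":memory:")
df.to_sql("orders", conn, index=False)
kpis = pd.read_sql_query(
    """
    SELECT region,
           COUNT(*)    AS order_count,
           SUM(amount) AS total_amount
    FROM orders
    GROUP BY region
    ORDER BY region
    """,
    conn,
)
print(kpis)
```

Steps 5 and 6 are interpretive, so they are left to the conversation with the agent.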

Prerequisites

  • Python 3.9+
  • Pandas library
  • Database credentials for target sources

Pitfalls & limitations

  • Can generate non-optimized code if schema indexes are not clearly defined in the prompt
  • May struggle with extremely large datasets that exceed local memory limits
  • Often requires explicit instructions to handle database-specific dialect differences
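For the memory limitation, one common workaround is to process a file in chunks rather than loading it whole. The sketch below assumes a CSV with hypothetical `region` and `amount` columns and uses a tiny chunk size so the behavior is visible. A real file would use a far larger value.

```python
import io
import pandas as pd

# Hypothetical data; a real use would pass a file path instead.
raw = io.StringIO(
    "region,amount\n"
    "north,10\nsouth,20\nnorth,30\nsouth,40\nnorth,50\n"
)

totals = {}
# With chunksize set, read_csv yields DataFrames of at most 2 rows each,
# so only one chunk is held in memory at a time.
for chunk in pd.read_csv(raw, chunksize=2):
    partial = chunk.groupby("region")["amount"].sum()
    for region, amount in partial.items():
        totals[region] = totals.get(region, 0) + int(amount)

print(totals)
```

This pattern works for aggregations that can be combined across chunks, such as sums and counts. Statistics like medians need a different approach.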

FAQ

Does this agent handle non-SQL databases?
The agent is primarily optimized for SQL and pandas-ready data structures; NoSQL support depends on the underlying driver capabilities.
Can it visualize data directly?
It focuses on the computational aspects of data analysis and code generation, but can produce logic for plotting libraries like matplotlib.
Is the generated SQL safe for production environments?
While it follows best practices, you should always review generated queries for performance and injection risks before running them against production databases.
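On the injection point, one thing to check in generated code is that user-supplied values are passed as query parameters and not concatenated into the SQL string. A minimal sketch with Python's built-in sqlite3 module follows. The `users` table and the `find_user` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ana",), ("ben",)])

def find_user(conn, name):
    """Look up a user by exact name using a bound parameter."""
    # The ? placeholder sends `name` as data, never as SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()

print(find_user(conn, "ana"))

# A classic injection string is treated as a literal name and matches nothing.
malicious = "ana' OR '1'='1"
print(find_user(conn, malicious))
```

Placeholder syntax varies by driver (for example `?` in sqlite3 and `%s` in some others), which is one of the dialect differences worth stating in the prompt.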

How it compares

Unlike generic LLM prompts that may provide one-off code snippets, this agent maintains context of your dataset's specific schema and applies iterative refinement based on statistical findings.

Source & trust

115k stars · Apache-2.0 · Updated 2026-06-15 · no risky patterns found

View the full SKILL.md source
# Data Analyst

You are an expert data analyst with expertise in SQL, Python (pandas), and statistical analysis.

## When to Apply

Use this skill when:
- Writing SQL queries for data extraction
- Analyzing datasets with pandas
- Performing statistical analysis
- Creating data transformations
- Identifying data patterns and insights
- Data cleaning and preparation

## Core Competencies

### SQL
- Complex queries with JOINs, subqueries, CTEs
- Window functions and aggregations
- Query optimization
- Database design understanding

### pandas
- Data manipulation and transformation
- Grouping, filtering, pivoting
- Time series analysis
- Handling missing data

### Statistics
- Descriptive statistics
- Hypothesis testing
- Correlation analysis
- Basic predictive modeling

## Output Format

Provide SQL queries and pandas code with:
- Clear comments
- Example results
- Performance considerations
- Interpretation of findings

---

*Created for data analysis and SQL/pandas workflows*

Quoted from Shubhamsaboo/awesome-llm-apps for reference — see the original for the authoritative, latest version.

📄 Full skill instructions — original source: Shubhamsaboo/awesome-llm-apps
The Data Analyst agent supports technical data exploration and quantitative validation. It is aimed at developers and analysts who want to move from raw datasets to usable findings without hand-writing every query. It builds SQL queries and pandas transformations for aggregation, data cleaning, and time-series analysis, and it pairs the extraction or transformation code with an interpretation of the results. The skill asks for clear comments and performance considerations in generated code, which makes it useful for cleaning messy data, testing hypotheses, or extracting business insights from large structured datasets.

How to Use This Skill Unit

Option A: Project-Specific (Recommended)

  1. Click "Download" above
  2. In your project, create the directory: .agent/skills/data-analyst/
  3. Save the file as SKILL.md
  4. The agent will automatically discover the skill based on its description.

Option B: Global Installation (All Agents)

Save the file to these locations to make it available across all projects:

  • Claude Code: ~/.claude/skills/Shubhamsaboo/awesome-llm-apps/data-analyst/SKILL.md
  • Cursor: ~/.cursor/skills/Shubhamsaboo/awesome-llm-apps/data-analyst/SKILL.md
  • Antigravity: ~/.gemini/antigravity/skills/Shubhamsaboo/awesome-llm-apps/data-analyst/SKILL.md

🚀 Install with CLI:
npx skills add Shubhamsaboo/awesome-llm-apps

How to use this Skill in Claude Code & Cursor

For Claude Code (CLI)

To use this skill in Claude Code, copy the skill content into your project's custom instructions or follow our Add-Skill CLI guide, so that Claude applies these instructions when it generates code.

For Cursor & Windsurf

For Cursor or Windsurf, individual skills are best used in the "Rules for AI" section. This skill helps the agent avoid common Python development issues in data analysis code.

Why the skill format matters: the standardized Agent Skills format lets your AI agent load detailed instructions only when they are relevant, keeping your prompt clean while improving results.

Source & attribution

This skill is categorized under Python Development and is published by Shubhamsaboo, maintained in Shubhamsaboo/awesome-llm-apps.
