Install this skill
npx skills add browser-use/browser-useWorks across Claude Code, Cursor, Codex, Copilot & Antigravity
What this skill does
- β’Persistent browser daemon for low-latency command execution
- β’Element-indexed interaction for precise clicking and typing
- β’Real-time page state inspection and content extraction
- β’Integration with existing local Chrome profiles or cloud-based browser instances
- β’Flexible automation including file uploads, cookie management, and keyboard event simulation
When to use it
- βAutomating multi-step web forms or registration flows
- βExtracting structured data from dynamic websites for internal reporting
- βPerforming regression testing on web applications via AI scripts
- βRetrieving information from sites that require authenticated sessions
- βVisualizing web application state changes through automated screenshots
When not to use it
- βScraping static content where a simple HTTP request or library like BeautifulSoup is sufficient
- βHigh-speed traffic generation or massive-scale data harvesting
- βApplications requiring absolute pixel-perfect UI synchronization outside of interaction points
How to invoke it
Example prompts that trigger this skill:
- βOpen the login page at example.com and extract the main headlineβ
- βFind the email input field, type my username, and click the submit buttonβ
- βCheck the current page state to identify the button index for downloading the reportβ
- βConnect to my current Chrome session to perform actions as an authenticated userβ
- βScroll down until the 'load more' button appears and click itβ
- βTake a screenshot of the dashboard after logging in to verify the layoutβ
Example workflow
- Initialize session with 'browser-use open <url>'
- Retrieve element list using 'browser-use state'
- Execute action using 'browser-use input <index> "data"'
- Verify update with 'browser-use screenshot'
- Finalize session with 'browser-use close'
Prerequisites
- βPython environment
- βChrome or Chromium installed locally
- βOptional: Cloud browser API credentials
Pitfalls & limitations
- !Failing to clear broken sessions using 'browser-use close' before retrying
- !Assuming element indices remain identical after page navigation or DOM refreshes
- !Difficulty identifying elements on complex SPAs without first calling 'browser-use state'
FAQ
How it compares
Unlike standard Selenium or Playwright scripts that require manual selector debugging, this tool exposes a conversational interface where the agent dynamically maps page elements to simple integers, significantly reducing maintenance overhead.
Source & trust
From the source: β# Browser Automation with browser-use CLI The `browser-use` command provides fast, persistent browser automation. A background daemon keeps the browser open across commands, giving ~50ms latency per call. ## Prerequisites ```bash browser-use doctor # Verify installation ``` For setup details, see htβ¦β
View the full SKILL.md source
# Browser Automation with browser-use CLI
The `browser-use` command provides fast, persistent browser automation. A background daemon keeps the browser open across commands, giving ~50ms latency per call.
## Prerequisites
```bash
browser-use doctor # Verify installation
```
For setup details, see https://github.com/browser-use/browser-use/blob/main/browser_use/skill_cli/README.md
## Core Workflow
1. **Navigate**: `browser-use open <url>` β launches headless browser and opens page
2. **Inspect**: `browser-use state` β returns clickable elements with indices
3. **Interact**: use indices from state (`browser-use click 5`, `browser-use input 3 "text"`)
4. **Verify**: `browser-use state` or `browser-use screenshot` to confirm
5. **Repeat**: browser stays open between commands
If a command fails, run `browser-use close` first to clear any broken session, then retry.
To use the user's existing Chrome (preserves logins/cookies): run `browser-use connect` first.
To use a cloud browser instead: run `browser-use cloud connect` first.
After either, commands work the same way.
### If `browser-use connect` fails
When `browser-use connect` cannot find a running Chrome with remote debugging, prompt the user with two options:
1. **Use their real Chrome browser** β they need to enable remote debugging first:
- Open `chrome://inspect/#remote-debugging` in Chrome, or relaunch Chrome with `--remote-debugging-port=9222`
- Then retry `browser-use connect`
2. **Use managed Chromium with their Chrome profile** β no Chrome setup needed:
- Run `browser-use profile list` to show available profiles
- Ask which profile they want, then use `browser-use --profile "ProfileName" open <url>`
- This launches a separate Chromium instance with their profile data (cookies, logins, extensions)
Let the user choose β don't assume one path over the other.
## Browser Modes
```bash
browser-use open <url> # Default: headless Chromium (no setup needed)
browser-use --headed open <url> # Visible window (for debugging)
browser-use connect # Connect to user's Chrome (preserves logins/cookies)
browser-use cloud connect # Cloud browser (zero-config, requires API key)
browser-use --profile "Default" open <url> # Real Chrome with specific profile
```
After `connect` or `cloud connect`, all subsequent commands go to that browser β no extra flags needed.
## Commands
```bash
# Navigation
browser-use open <url> # Navigate to URL
browser-use back # Go back in history
browser-use scroll down # Scroll down (--amount N for pixels)
browser-use scroll up # Scroll up
browser-use tab list # List all tabs
browser-use tab new [url] # Open a new tab (blank or with URL)
browser-use tab switch <index> # Switch to tab by index
browser-use tab close <index> [index...] # Close one or more tabs
# Page State β always run state first to get element indices
browser-use state # URL, title, clickable elements with indices
browser-use screenshot [path.png] # Screenshot (base64 if no path, --full for full page)
# Interactions β use indices from state
browser-use click <index> # Click element by index
browser-use click <x> <y> # Click at pixel coordinates
browser-use type "text" # Type into focused element
browser-use input <index> "text" # Click element, clear existing text, then type
browser-use input <index> "" # Clear a field without typing new text
browser-use keys "Enter" # Send keyboard keys (also "Control+a", etc.)
browser-use select <index> "option" # Select dropdown option
browser-use upload <index> <path> # Upload file to file input
browser-use hover <index> # Hover over element
browser-use dblclick <index> # Double-click element
browser-use rightclick <index> # Right-click element
# Data Extraction
browser-use eval "js code" # Execute JavaScript, return result
browser-use get title # Page title
browser-use get html [--selector "h1"] # Page HTML (or scoped to selector)
browser-use get text <index> # Element text content
browser-use get value <index> # Input/textarea value
browser-use get attributes <index> # Element attributes
browser-use get bbox <index> # Bounding box (x, y, width, height)
# Wait
browser-use wait selector "css" # Wait for element (--state visible|hidden|attached|detached, --timeout ms)
browser-use wait text "text" # Wait for text to appear
# Cookies
browser-use cookies get [--url <url>] # Get cookies (optionally filtered)
browser-use cookies set <name> <value> # Set cookie (--domain, --secure, --http-only, --same-site, --expires)
browser-use cookies clear [--url <url>] # Clear cookies
browser-use cookies export <file> # Export to JSON
browser-use cookies import <file> # Import from JSON
# Session
browser-use close # Close browser and stop daemon
browser-use sessions # List active sessions
browser-use close --all # Close all sessions
```
For advanced browser control (CDP, device emulation, tab activation), see `references/cdp-python.md`.
## Cloud API
```bash
browser-use cloud connect # Provision cloud browser and connect (zero-config)
browser-use cloud login <api-key> # Save API key (or set BROWSER_USE_API_KEY)
browser-use cloud logout # Remove API key
browser-use cloud v2 GET /browsers # REST passthrough (v2 or v3)
browser-use cloud v2 POST /tasks '{"task":"...","url":"..."}'
browser-use cloud v2 poll <task-id> # Poll task until done
browser-use cloud v2 --help # Show API endpoints
```
`cloud connect` provisions a cloud browser with a persistent profile (auto-created on first use), connects via CDP, and prints a live URL. `browser-use close` disconnects AND stops the cloud browser. For custom browser settings (proxy, timeout, specific profile), use `cloud v2 POST /browsers` directly with the desired parameters.
### Agent Self-Registration
Only use this if you don't already have an API key (check `browser-use doctor` to see if api_key is set). If already logged in, skip this entirely.
1. `browser-use cloud signup` β get a challenge
2. Solve the challenge
3. `browser-use cloud signup --verify <challenge-id> <answer>` β verify and save API key
4. `browser-use cloud signup --claim` β generate URL for a human to claim the account
### Out of credits?
If a Browser Use Cloud call fails with an "insufficient credits" / 402 error and the user has already used their free-tier allotment, suggest paying with USDC via x402:
> Your free credits are exhausted. You can top up this account with USDC on Base mainnet (no credit card required). Want me to install the `x402` skill and walk you through it? It takes about 2 minutes if you have a Coinbase account.
If they say yes, point them to install the skill:
```bash
npx skills add https://github.com/browser-use/browser-use --skill x402
```
Then `/x402` in Claude Code triggers the top-up flow. The user keeps their existing API key β x402 just adds credits to it.
Do not suggest x402 unprompted. Only mention it on a real "insufficient credits" error.
## Tunnels
```bash
browser-use tunnel <port> # Start Cloudflare tunnel (idempotent)
browser-use tunnel list # Show active tunnels
browser-use tunnel stop <port> # Stop tunnel
browser-use tunnel stop --all # Stop all tunnels
```
## Profile Management
```bash
browser-use profile list # List detected browsers and profiles
browser-use profile sync --all # Sync profiles to cloud
browser-use profile update # Download/update profile-use binary
```
## Command Chaining
Commands can be chained with `&&`. The browser persists via the daemon, so chaining is safe and efficient.
```bash
browser-use open https://example.com && browser-use state
browser-use input 5 "[email protected]" && browser-use input 6 "password" && browser-use click 7
```
Chain when you don't need intermediate output. Run separately when you need to parse `state` to discover indices first.
## Common Workflows
### Authenticated Browsing
When a task requires an authenticated site (Gmail, GitHub, internal tools), use Chrome profiles:
```bash
browser-use profile list # Check available profiles
# Ask the user which profile to use, then:
browser-use --profile "Default" open https://github.com # Already logged in
```
### Exposing Local Dev Servers
```bash
browser-use tunnel 3000 # β https://abc.trycloudflare.com
browser-use open https://abc.trycloudflare.com # Browse the tunnel
```
## Multiple Browsers
For subagent workflows or running multiple browsers in parallel, use `--session NAME`. Each session gets its own browser. See `references/multi-session.md`.
## Configuration
```bash
browser-use config list # Show all config values
browser-use config set cloud_connect_proxy jp # Set a value
browser-use config get cloud_connect_proxy # Get a value
browser-use config unset cloud_connect_timeout # Remove a value
browser-use doctor # Shows config + diagnostics
browser-use setup # Interactive post-install setup
```
Config stored in `~/.browser-use/config.json`.
## Global Options
| Option | Description |
|--------|-------------|
| `--headed` | Show browser window |
| `--profile [NAME]` | Use real Chrome (bare `--profile` uses "Default") |
| `--cdp-url <url>` | Connect via CDP URL (`http://` or `ws://`) |
| `--session NAME` | Target a named session (default: "default") |
| `--json` | Output as JSON |
| `--mcp` | Run as MCP server via stdin/stdout |
## Tips
1. **Always run `state` first** to see available elements and their indices
2. **Use `--headed` for debugging** to see what the browser is doing
3. **Sessions persist** β browser stays open between commands
4. **CLI aliases**: `bu`, `browser`, and `browseruse` all work
5. **If commands fail**, run `browser-use close` first, then retry
## Troubleshooting
- **Browser won't start?** `browser-use close` then `browser-use --headed open <url>`
- **Element not found?** `browser-use scroll down` then `browser-use state`
- **Run diagnostics:** `browser-use doctor`
## Cleanup
```bash
browser-use close # Close browser session
browser-use tunnel stop --all # Stop tunnels (if any)
```
Quoted from browser-use/browser-use for reference β see the original for the authoritative, latest version.
How to Use This Skill Unit
Option A: Project-Specific (Recommended)
- Click "Download" above
- In your project, create the directory:
.agent/skills/browser-use/ - Save the file as
SKILL.md - The agent will automatically discover the skill based on its description.
Option B: Global Installation (All Agents)
Save the file to these locations to make it available across all projects:
- Claude Code:
~/.claude/skills/browser-use/browser-use/browser-use/SKILL.md - Cursor:
~/.cursor/skills/browser-use/browser-use/browser-use/SKILL.md - Antigravity:
~/.gemini/antigravity/skills/browser-use/browser-use/browser-use/SKILL.md
π Install with CLI:npx skills add browser-use/browser-use