
Skyvern Browser Automation
Low Risk · by @sickn33 · Verified Source
About
AI-powered browser automation — navigate sites, fill forms, extract structured data, log in with stored credentials, and build reusable workflows.
name: skyvern-browser-automation
description: "AI-powered browser automation -- navigate sites, fill forms, extract structured data, log in with stored credentials, and build reusable workflows."
category: browser-automation
risk: safe
source: community
source_repo: Skyvern-AI/skyvern
source_type: official
date_added: "2026-04-23"
author: mark1ian
tags: [browser-automation, mcp, web-scraping, form-filling, ai-agents, workflow-automation]
tools: [claude, cursor, gemini, codex]
license: "AGPL-3.0"
license_source: "https://github.com/Skyvern-AI/skyvern/blob/main/LICENSE"
Skyvern Browser Automation -- CLI Judgment Procedure
Skyvern uses AI to navigate and interact with websites. Every command below is a runnable skyvern <command> invocation.
When to Use This Skill
- Use when you need AI-assisted browser automation for navigation, extraction, form filling, login flows, or reusable website workflows.
- Use when deterministic selectors are unavailable and Skyvern's visual/a11y reasoning can identify page controls.
- Use when a one-off browser task should become a repeatable workflow with run history and verification.
Step 1: Classify Your Task (ALWAYS do this first)
| Classification | Signal | CLI Command | Cost | What Happens |
|---|---|---|---|---|
| Quick check (yes/no) | "is the user logged in?" | skyvern browser validate | 1 LLM + screenshots | Lightweight validation (2 steps max), returns boolean. Cheapest AI option. |
| Quick inspection | "what does the page show?" | skyvern browser extract | 1 LLM + screenshots | Dedicated extraction LLM + schema validation + caching. |
| Single action (known target) | "click #submit" | skyvern browser click/type | 0 LLM | Deterministic Playwright. No AI. Fastest. |
| Single action (unknown target) | "click the submit button" | skyvern browser act | 2-3 LLM, no screenshots | No screenshots in reasoning. Economy a11y tree. For visual targets, use hybrid mode (selector + intent). |
| Same-page multi-step | "fill the form and submit" | skyvern browser act or primitive chain | 2-3 LLM or 0 LLM | Use act when labels are clear. Use click/type/select directly when you know selectors. |
| Throwaway autonomous trial | "try this once", "see if this works" | skyvern browser run-task | Higher | One-off autonomous agent for exploration. Do not use for recurring or multi-page production automations. |
| Multi-page or reusable automation | "navigate a multi-page wizard", "set this up", "automate this weekly" | skyvern workflow create + run | N LLM + screenshots | Build a workflow with one block per step. Each block gets visual reasoning, verification, and reusable run history. |
MCP note: if you are using the Skyvern MCP instead of the CLI, prefer observe + execute for same-page multi-step UI work. The CLI does not expose that pair directly.
Step 2: Apply These Decision Rules
- If the prompt includes a selector, id, XPath, or exact field target, use browser primitives -- not `act`.
- If you only need a yes/no answer, use `validate` -- not `extract` or `act`.
- If the work stays on one page and labels are clear, use `act` or a primitive chain.
- If the user says `try this once`, `see if this works`, or clearly wants a one-off exploratory trial, use `run-task`.
- If the task spans multiple pages and is meant to be reusable, scheduled, repeatable, or explicitly `set up` as automation, use `workflow create`.
- Never type passwords. Always use stored credentials with `skyvern browser login`.
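The rules above amount to a lookup from task signal to command. The sketch below is one way to encode that lookup in shell so an agent or script can check its choice before running anything. The function name `pick_command` and the signal labels (`selector`, `yes_no`, and so on) are hypothetical names chosen for illustration and are not part of the Skyvern CLI. The function only prints the recommended command family and never invokes it.

```shell
# Map a task signal to the command family recommended by the decision rules.
# pick_command and the signal labels are illustrative assumptions, not Skyvern flags.
pick_command() {
  case "$1" in
    selector)    echo "skyvern browser click/type" ;;  # exact target known: primitives, not act
    yes_no)      echo "skyvern browser validate" ;;    # boolean answer only
    same_page)   echo "skyvern browser act" ;;         # one page, clear labels
    one_off)     echo "skyvern browser run-task" ;;    # exploratory trial
    reusable)    echo "skyvern workflow create" ;;     # multi-page or recurring
    credentials) echo "skyvern browser login" ;;       # never type passwords
    *)
      echo "unclassified: classify the task first" >&2
      return 1
      ;;
  esac
}
```

For example, `pick_command yes_no` prints `skyvern browser validate`, and an unrecognized label fails with a non-zero status, so a task is never run unclassified.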
Step 3: Create a Session
Every browser command needs a session. Create one first:
# Cloud session (default -- works for public URLs)
skyvern browser session create --timeout 30
# Local session (for localhost URLs or self-hosted mode)
skyvern browser session create --local --timeout 30
# Connect to existing browser via CDP
skyvern browser session connect --cdp "ws://localhost:9222"
Session state persists between commands. After session create, subsequent commands auto-attach.
Override with --session pbs_.... Close when done: skyvern browser session close.
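Because a session stays open until it is closed, it can help to wrap the create/close pair around the work so the session is closed even when a command fails. The following is a minimal sketch of that pattern. `with_session` is a hypothetical helper name, and the only Skyvern invocations it uses are the `session create` and `session close` commands shown above. It assumes the wrapped command auto-attaches to the newly created session, as described.

```shell
# Run one command inside a fresh cloud session and always close the session afterwards.
# with_session is a hypothetical helper, not part of the Skyvern CLI.
with_session() {
  # Abort early if the session cannot be created.
  skyvern browser session create --timeout 30 || return 1

  # Run the wrapped command; it is expected to auto-attach to the new session.
  "$@"
  _ws_status=$?

  # Close the session whether or not the wrapped command succeeded.
  skyvern browser session close

  # Report the wrapped command's exit status to the caller.
  return "$_ws_status"
}
```

A call such as `with_session skyvern browser validate --prompt "Is the user logged in?"` would then create a session, run the check, and close the session in one step.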
Step 4: Execute by Classification
Quick check (yes/no)
skyvern browser validate --prompt "Is the user logged in? Look for a dashboard or avatar."
Returns true/false. Cheapest AI option -- prefer over extract or act for boolean checks.
Quick inspection
skyvern browser extract \
--prompt "Extract all product names and prices" \
--schema '{"type":"object","properties":{"items":{"type":"array","items":{"type":"object","properties":{"name":{"type":"string"},"price":{"type":"string"}}}}}}'
Uses screenshots plus a dedicated extraction LLM. This is better than taking a screenshot and reading it yourself, because Skyvern's LLM interprets the page and validates the result against the schema.
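A one-line JSON schema is hard to review and easy to break when editing. As a sketch, the same schema can be kept in a quoted heredoc and passed in with command substitution. `product_schema` is a hypothetical helper name. The JSON is the schema from the example above, only reindented. Whether the CLI accepts a multi-line schema string is an assumption here, although whitespace is insignificant in JSON.

```shell
# Print the product schema from the extract example, laid out for readability.
# product_schema is a hypothetical helper name; the JSON content matches the example above.
product_schema() {
  cat <<'JSON'
{
  "type": "object",
  "properties": {
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name":  {"type": "string"},
          "price": {"type": "string"}
        }
      }
    }
  }
}
JSON
}
```

It would be used as `skyvern browser extract --prompt "Extract all product names and prices" --schema "$(product_schema)"`.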
Single action (known target)
skyvern browser click --selector "#submit-btn"
skyvern browser type --text "user@co.com" --selector "#email"
skyvern browser select --value "US" --intent "the country dropdown"
Determi