Computer Use API v1

Give your code eyes and hands.

Send a screenshot. Get structured mouse and keyboard actions back. One REST endpoint — for automation, browser testing, and AI agents that interact with any GUI.

3.5s median step latency
10 action primitives
99.9% uptime SLA

One call. Four lines.

Built for any stack.

Pure REST. No SDK lock-in, no extra servers, no browser drivers.

POST /api/v1/cua/predict
import requests, base64

img = base64.b64encode(open("screen.png", "rb").read()).decode()

r = requests.post(
    "https://coasty.ai/api/v1/cua/predict",
    headers={"X-API-Key": "cua_sk_..."},
    json={
        "screenshot": img,
        "instruction": "Click the search bar and type 'hello'",
    },
)

for a in r.json()["actions"]:
    print(a["action_type"], a["params"])
Returns a stream of typed actions — coordinates, keystrokes, and confidence.

Screenshot in. Actions out.

No selectors. No DOM parsing. No brittle XPath. Just vision.

01 Send screenshot: Base64 PNG/JPEG + plain-language intent

02 AI reasons visually: Vision model identifies the target UI element

03 Execute actions: Typed primitives: click, type, scroll, press…

Vision-First

Works on any UI — web, desktop, mobile, VNC. No DOM access, no selectors, no agents.

Stateful Sessions

Multi-step trajectories. The model remembers what it tried, what worked, and what's next.

Two Engines

V3 for speed (3.5s/step, multi-action). V1 for precision (reflection, single-action).

Any Screen

Browser tabs, desktop apps, mobile emulators, VNC feeds — anything you can capture visually.

10 Action Types

click, type_text, key_press, key_combo, scroll, drag, move, wait, done, fail.

Any Language

Plain REST + JSON. Python, Node, Go, Ruby, PHP, Java, C#, or cURL from your terminal.

Per-request pricing. No subscription.

Deducted from your shared credit balance. Management endpoints always free.

Endpoint                          Cost
POST /predict                     5 cr
POST /sessions                    10 cr
POST /sessions/{id}/predict       4 cr
POST /ground                      3 cr
POST /ocr                         3 cr
POST /parse                       Free
GET /models, /usage, /sessions    Free

Surcharges

Trajectory screenshot      +2 cr each
HD image >1280×720         +1 cr/image
V1 engine                  +3 cr/request
Custom system prompt       +1 cr

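As a worked example of the pricing above, the sketch below estimates the credits for one request. It assumes surcharges are simply added to the base price, which the tables suggest but do not state outright. The function name, its parameters, and the endpoint keys are illustrative and not part of the API.

```python
# Base prices and surcharges copied from the pricing tables above.
BASE_COST = {
    "predict": 5,
    "session_create": 10,
    "session_predict": 4,
    "ground": 3,
    "ocr": 3,
    "parse": 0,
}


def estimate_credits(endpoint, trajectory_screenshots=0, hd_images=0,
                     use_v1=False, custom_prompt=False):
    """Estimate credits for one request, assuming surcharges are additive."""
    cost = BASE_COST[endpoint]
    cost += 2 * trajectory_screenshots  # +2 cr per trajectory screenshot
    cost += 1 * hd_images               # +1 cr per image above 1280x720
    if use_v1:
        cost += 3                       # V1 engine surcharge
    if custom_prompt:
        cost += 1                       # custom system prompt surcharge
    return cost


print(estimate_credits("predict"))  # 5
# Session step with two trajectory screenshots, one HD image, V1 engine:
# 4 + 2*2 + 1 + 3 = 12
print(estimate_credits("session_predict", trajectory_screenshots=2,
                       hd_images=1, use_v1=True))  # 12
```

The `usage.credits_charged` field in each response remains the authoritative figure; an estimate like this is only useful for budgeting before a call.
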
Computer Use API

Send a screenshot, get actions back

The CUA API gives your code the ability to see and interact with any screen. Send a screenshot and a natural language instruction — receive structured mouse clicks, keyboard inputs, and scroll commands with exact coordinates.

Authentication

Every request needs an X-API-Key header. Sign up to create API keys. Credits are deducted per request from your shared balance.

header
X-API-Key: cua_sk_your_key_here

How it Works

1. Capture a screenshot of the target screen
2. Send it with a natural language instruction
3. Receive structured actions (click, type, scroll...)
4. Execute the actions in your environment

Quick Start

The predict endpoint is the core of the API; everything else builds on it. The examples below use Python.

install
pip install requests
predict — single screenshot
import requests, base64

API_KEY = "cua_sk_..."
img = base64.b64encode(open("screen.png", "rb").read()).decode()

r = requests.post(
    "https://coasty.ai/api/v1/cua/predict",
    headers={"X-API-Key": API_KEY},
    json={
        "screenshot": img,
        "instruction": "Click the search bar and type 'hello'",
    },
)

for action in r.json()["actions"]:
    print(action["action_type"], action["params"])
sessions — multi-step tasks
# Reuses `requests` and API_KEY from the predict example above.
# Create a session for multi-step tasks
s = requests.post(
    "https://coasty.ai/api/v1/cua/sessions",
    headers={"X-API-Key": API_KEY},
    json={"cua_version": "v3", "screen_width": 1920, "screen_height": 1080},
).json()

session_id = s["session_id"]

# Send screenshots in a loop
while True:
    screenshot = capture_screenshot()  # your function; returns a base64-encoded image
    r = requests.post(
        f"https://coasty.ai/api/v1/cua/sessions/{session_id}/predict",
        headers={"X-API-Key": API_KEY},
        json={"screenshot": screenshot, "instruction": "Complete the form"},
    ).json()

    for action in r["actions"]:
        execute_action(action)  # your action executor

    if r["status"] in ("done", "fail"):
        break
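The loop above runs until the API reports `done` or `fail`. A variant with a step cap may be worth considering so that a stuck task cannot spend credits indefinitely. The sketch below is one way to do that; `run_session`, `predict_step`, and `max_steps` are hypothetical names, and `predict_step` stands in for the POST to `/sessions/{id}/predict` so the example runs offline.

```python
def run_session(predict_step, capture_screenshot, execute_action,
                instruction, max_steps=20):
    """Drive a session until status is "done"/"fail" or max_steps is reached.

    predict_step(screenshot, instruction) must return the parsed JSON response.
    Returns the final status, or "max_steps" if the cap was hit first.
    """
    for _ in range(max_steps):
        response = predict_step(capture_screenshot(), instruction)
        for action in response["actions"]:
            execute_action(action)
        if response["status"] in ("done", "fail"):
            return response["status"]
    return "max_steps"


# Offline demonstration: a fake endpoint that finishes on the third step.
replies = iter([
    {"actions": [{"action_type": "click", "params": {"x": 1, "y": 2}}],
     "status": "continue"},
    {"actions": [{"action_type": "type_text", "params": {"text": "hi"}}],
     "status": "continue"},
    {"actions": [], "status": "done"},
])
executed = []
final = run_session(
    predict_step=lambda shot, instr: next(replies),
    capture_screenshot=lambda: "<base64 image>",
    execute_action=executed.append,
    instruction="Complete the form",
)
print(final, len(executed))  # done 2
```
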

Response Format

Every prediction returns structured actions with exact coordinates, a status signal, and token usage.

response
{
  "request_id": "req_abc123",
  "actions": [
    {
      "action_type": "click",
      "params": { "x": 512, "y": 340, "button": "left", "clicks": 1 }
    },
    {
      "action_type": "type_text",
      "params": { "text": "hello world" }
    }
  ],
  "reasoning": "I see a search bar at (512, 340)...",
  "status": "continue",
  "usage": {
    "input_tokens": 1523,
    "output_tokens": 245,
    "credits_charged": 5
  }
}
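
To work with this response in typed form, a small parsing layer can help. The sketch below maps the fields shown above onto dataclasses; the class and function names are illustrative, and it assumes every response carries the keys present in the sample.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Action:
    action_type: str
    params: dict[str, Any]


@dataclass
class Prediction:
    request_id: str
    actions: list[Action]
    status: str
    credits_charged: int


def parse_prediction(body: dict) -> Prediction:
    """Convert the JSON body of a predict call into a Prediction."""
    return Prediction(
        request_id=body["request_id"],
        actions=[Action(a["action_type"], a.get("params", {}))
                 for a in body["actions"]],
        status=body["status"],
        credits_charged=body["usage"]["credits_charged"],
    )


# The sample response from above, trimmed to the fields the parser reads.
sample = {
    "request_id": "req_abc123",
    "actions": [
        {"action_type": "click",
         "params": {"x": 512, "y": 340, "button": "left", "clicks": 1}},
        {"action_type": "type_text", "params": {"text": "hello world"}},
    ],
    "status": "continue",
    "usage": {"input_tokens": 1523, "output_tokens": 245, "credits_charged": 5},
}
prediction = parse_prediction(sample)
print(prediction.status, [a.action_type for a in prediction.actions])
```
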

Action Types

click        Mouse click at (x, y)
type_text    Type a string
key_press    Press a key (enter, tab...)
key_combo    Combo (ctrl+c, cmd+v...)
scroll       Scroll at a position
drag         Drag between two points
move         Move cursor
wait         Pause execution
done         Task completed
fail         Task impossible
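
An action executor can be sketched as a dispatch on `action_type`. The example below handles only `click` and `type_text`, whose parameters appear in the sample response; parameter names for the other action types are not documented here, so they are left unimplemented rather than guessed. `RecordingBackend` is a hypothetical stand-in for a real input driver and only records calls.

```python
class RecordingBackend:
    """Stand-in for a real mouse/keyboard driver; it only records calls."""

    def __init__(self):
        self.calls = []

    def click(self, x, y, button="left", clicks=1):
        self.calls.append(("click", x, y, button, clicks))

    def type_text(self, text):
        self.calls.append(("type_text", text))


def execute_action(action, backend):
    """Run one action. Returns False when the action ends the task."""
    kind = action["action_type"]
    params = action.get("params", {})
    if kind in ("done", "fail"):
        return False
    handlers = {"click": backend.click, "type_text": backend.type_text}
    if kind not in handlers:
        # scroll, drag, move, etc. need parameter names from the API reference.
        raise NotImplementedError(f"no handler for action type {kind!r}")
    handlers[kind](**params)
    return True


backend = RecordingBackend()
execute_action({"action_type": "click",
                "params": {"x": 512, "y": 340, "button": "left", "clicks": 1}},
               backend)
execute_action({"action_type": "type_text",
                "params": {"text": "hello world"}}, backend)
print(backend.calls)
```

Swapping `RecordingBackend` for a class with the same method names that drives a real desktop or browser would leave `execute_action` unchanged.
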

Request Options

Only screenshot and instruction are required.

screenshot       string         required
instruction      string         required
cua_version      "v3" | "v1"
screen_width     int
screen_height    int
max_actions      int (1-10)
trajectory       array
system_prompt    string
tools            string[]
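
A request body can be assembled and checked on the client before it is sent. The helper below is a sketch with a hypothetical name; it validates only the constraints listed above (the two required fields, the two engine values, and the 1-10 range for `max_actions`) and omits `trajectory` and `tools` because their element formats are not described here.

```python
def build_predict_payload(screenshot, instruction, cua_version=None,
                          screen_width=None, screen_height=None,
                          max_actions=None, system_prompt=None):
    """Build a JSON-serializable body for a predict call."""
    if not screenshot or not instruction:
        raise ValueError("screenshot and instruction are required")
    if cua_version is not None and cua_version not in ("v3", "v1"):
        raise ValueError('cua_version must be "v3" or "v1"')
    if max_actions is not None and not 1 <= max_actions <= 10:
        raise ValueError("max_actions must be between 1 and 10")

    payload = {"screenshot": screenshot, "instruction": instruction}
    optional = {
        "cua_version": cua_version,
        "screen_width": screen_width,
        "screen_height": screen_height,
        "max_actions": max_actions,
        "system_prompt": system_prompt,
    }
    # Send only the options the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_predict_payload("<base64 image>", "Click OK",
                                cua_version="v3", max_actions=3)
print(sorted(payload))
```
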

All Endpoints

All endpoints require the X-API-Key header. Credits are deducted from your shared balance.

Prediction
POST      /api/v1/cua/predict                  5 cr
POST      /api/v1/cua/sessions                 10 cr
POST      /api/v1/cua/sessions/{id}/predict    4 cr
POST      /api/v1/cua/sessions/{id}/reset      Free
DELETE    /api/v1/cua/sessions/{id}            Free
Utilities
POST      /api/v1/cua/ground                   3 cr
POST      /api/v1/cua/ocr                      3 cr
POST      /api/v1/cua/parse                    Free
Management
GET       /api/v1/cua/models                   Free
GET       /api/v1/cua/usage                    Free
GET       /api/v1/cua/sessions                 Free

Error Handling

All errors return a JSON body with error.code and error.message fields.

401    INVALID_API_KEY         Missing or invalid X-API-Key
402    INSUFFICIENT_CREDITS    Not enough credits for this request
403    INSUFFICIENT_SCOPE      API key lacks the required scope
429    RATE_LIMIT_EXCEEDED     Too many requests; check the Retry-After header
400    INVALID_SCREENSHOT      Bad base64 or unsupported image format
404    SESSION_NOT_FOUND       Session expired or does not exist
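
One way to act on these errors is to treat 429 as retryable and everything else as fatal. The sketch below assumes the error body is nested as `{"error": {"code": ..., "message": ...}}` and that `Retry-After` holds a number of seconds; both are assumptions, and `CuaApiError` and `check_response` are hypothetical names.

```python
class CuaApiError(Exception):
    """Raised for non-retryable API errors."""

    def __init__(self, status_code, code, message):
        super().__init__(f"{status_code} {code}: {message}")
        self.status_code = status_code
        self.code = code


def check_response(status_code, body, headers, default_delay=1.0):
    """Return None on success, or seconds to wait on 429; raise otherwise."""
    if status_code < 400:
        return None
    if status_code == 429:
        try:
            return float(headers.get("Retry-After", default_delay))
        except ValueError:
            # Retry-After may also be an HTTP date; fall back to the default.
            return default_delay
    error = body.get("error", {})
    raise CuaApiError(status_code, error.get("code", "UNKNOWN"),
                      error.get("message", ""))


rate_limited = {"error": {"code": "RATE_LIMIT_EXCEEDED", "message": "slow down"}}
print(check_response(429, rate_limited, {"Retry-After": "2"}))  # 2.0
```

With `requests`, a call would look like `check_response(r.status_code, r.json(), r.headers)`, followed by a sleep and retry when a delay is returned.
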

Ship your first click in minutes.

Free account, free keys, free credits to start. No card required.
