Phone Bridge API

When running the 0x01 Mobile application, the native OS environment exposes a local HTTP server on 127.0.0.1:9092.

This local server acts as the "Phone Bridge". It translates standard REST requests from the agent's LLM brain (or remote controlling agents) into Android system commands.

Architecture

  1. The ZeroClaw agent generates a plan that requires device context.
  2. The agent issues an HTTP request to http://127.0.0.1:9092.
  3. The request is authenticated via an x-bridge-token stored in Android Keystore.
  4. The native Kotlin backend executes the requested operation and returns a JSON payload.

When the OpenAPI specification for the bridge on port 9092 is supplied to the agent's LLM as a tool-call array, the agent gains programmable access to device hardware and personal data.

Authentication

Every request must include the bridge token header:

x-bridge-token: <token>

The token is generated on application install and stored in Android Keystore. It is handed only to the local node process at startup and is never transmitted off-device. Any request without a valid token returns HTTP 401.
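As a minimal sketch of attaching the header from a client, the snippet below builds (but does not send) a request with Python's standard library. The helper name bridge_request, the placeholder token, and the JSON Content-Type header are illustrative assumptions, not part of the bridge itself.

```python
import json
import urllib.request

BRIDGE_URL = "http://127.0.0.1:9092"  # address and port given on this page


def bridge_request(path, token, body=None, method=None):
    """Build an authenticated Phone Bridge request (hypothetical helper)."""
    headers = {"x-bridge-token": token}
    data = None
    if body is not None:
        # Assumption: request bodies are sent as UTF-8 encoded JSON.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if method is None:
        method = "POST" if data is not None else "GET"
    return urllib.request.Request(
        BRIDGE_URL + path, data=data, headers=headers, method=method
    )


req = bridge_request("/phone/battery", token="<token>")
# To send it: urllib.request.urlopen(req). urlopen raises HTTPError for
# non-2xx responses such as the 401 described above.
```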

Capability Gating

Each endpoint is gated by a named capability that can be toggled individually in Settings → Phone Bridge. If a capability is disabled, the endpoint returns HTTP 403 with {"error": "capability disabled"}. The capabilities are:

Capability             Controls
notifications_read     Read active and historical notifications from all apps
notifications_reply    Reply to notifications inline (WhatsApp, Messages, etc.)
notifications_dismiss  Dismiss notifications
sms_read               Read SMS inbox
sms_send               Send SMS messages
contacts               Read/write address book
location               GPS location
calendar               Read/write calendar events
media                  Photo library and documents
motion                 IMU accelerometer/gyroscope data
camera                 Headless camera capture
microphone             Audio recording
calls                  Call log, pending calls, call screening
health                 Health Connect data (steps, heart rate, sleep, calories)
wearables              BLE GATT wearable scan and read
screen_read_tree       Read UI view hierarchy of foreground app
screen_capture         Screenshot
screen_act             Inject touch/click/scroll/type into other apps
screen_global_nav      Trigger Back/Home/Recents
screen_vision          Send screenshot to Vision-Language Model
screen_autonomy        Multi-step autonomous screen sequences
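A client can tell the two failure modes apart by status code and body. The following is a sketch only: explain_bridge_failure is a hypothetical helper, and any status other than the 401 and 403 described on this page is simply reported as unexpected.

```python
import json


def explain_bridge_failure(status, body_text):
    """Map a bridge error response to a short reason (hypothetical helper)."""
    if status == 401:
        return "missing or invalid x-bridge-token"
    if status == 403:
        try:
            error = json.loads(body_text).get("error")
        except (ValueError, TypeError, AttributeError):
            error = None  # body was not the documented JSON object
        if error == "capability disabled":
            return "capability disabled; enable it in Settings > Phone Bridge"
        return "forbidden"
    return "unexpected status %d" % status
```

An agent loop might surface the 403 reason to the user instead of retrying, since the toggle can only be changed in Settings.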

Endpoints

Contacts

GET /phone/contacts

Returns the device address book. Optional ?query= for name search.

{
"contacts": [
{ "id": "12", "name": "Alice", "phone": "+15551234567" },
{ "id": "34", "name": "Bob", "phone": "+15559876543" }
]
}

POST /phone/contacts

Creates a new contact.

Request:

{ "name": "Alice", "phone": "+15551234567" }

Response:

{ "raw_contact_id": 42 }

PUT /phone/contacts/{id}

Updates an existing contact by ID.

Request:

{ "phone": "+15550001111" }

Response:

{ "rows_updated": 1 }

Messaging

SMS endpoints are gated by sms_read (read) and sms_send (send) separately.

GET /phone/sms

Returns recent SMS messages. Requires sms_read. Optional ?box=inbox|sent|all (default inbox) and ?limit=.

{
"messages": [
{ "from": "+15551234567", "body": "On my way", "timestamp": "2026-03-21T13:10:00Z" },
{ "from": "+15559876543", "body": "Call me back", "timestamp": "2026-03-21T12:45:00Z" }
]
}
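The query parameters above can be assembled with the standard library. This is a sketch: sms_url is a hypothetical helper, and rejecting unknown box values on the client side is a design choice here, not documented bridge behavior.

```python
from urllib.parse import urlencode

BRIDGE_URL = "http://127.0.0.1:9092"
SMS_BOXES = ("inbox", "sent", "all")  # values listed for ?box=


def sms_url(box="inbox", limit=None):
    """Build the GET /phone/sms URL with optional box and limit parameters."""
    if box not in SMS_BOXES:
        raise ValueError("box must be one of %s" % ", ".join(SMS_BOXES))
    params = {"box": box}
    if limit is not None:
        params["limit"] = int(limit)
    return BRIDGE_URL + "/phone/sms?" + urlencode(params)


url = sms_url("sent", limit=20)
# url == "http://127.0.0.1:9092/phone/sms?box=sent&limit=20"
```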

POST /phone/sms/send

Sends a text message. Requires sms_send.

Request:

{ "to": "+15551234567", "message": "Hello from the agent" }

Response:

{ "success": true }

Calls

GET /phone/call_log

Returns recent call history. Optional ?limit=.

{
"calls": [
{ "number": "+15551234567", "type": "incoming", "duration_seconds": 142, "timestamp": "2026-03-21T11:00:00Z" },
{ "number": "+15559876543", "type": "missed", "duration_seconds": 0, "timestamp": "2026-03-21T10:30:00Z" }
]
}

type values: "incoming", "outgoing", "missed".

GET /phone/calls/pending

Returns the current incoming call waiting for screening (if any).

GET /phone/calls/history

Returns call screening history log.

POST /phone/calls/respond

Responds to a pending screened call.

Request:

{ "action": "answer" }

action values: "answer", "reject", "silence".


Location

GET /phone/location

Returns the current GPS fix.

{
"lat": 40.7128,
"lng": -74.0060,
"accuracy_meters": 12.5,
"timestamp": "2026-03-21T14:22:10Z"
}

Calendar

GET /phone/calendar

Returns calendar events. Optional ?start_ms=, ?end_ms=, ?limit=.

{
"events": [
{ "id": "1", "title": "Team sync", "start": "2026-03-22T10:00:00Z", "end": "2026-03-22T11:00:00Z", "location": "Zoom" }
]
}

POST /phone/calendar

Creates a new calendar event.

Request:

{ "title": "Meeting", "start_ms": 1742640000000, "end_ms": 1742643600000, "description": "Optional notes" }

Response:

{ "event_id": "42" }
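The start_ms and end_ms fields are Unix epoch milliseconds. A small sketch of deriving them from timezone-aware datetimes follows; to_epoch_ms and calendar_event are hypothetical helper names, and integer arithmetic is used to avoid floating-point rounding.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt):
    """Convert a timezone-aware datetime to Unix epoch milliseconds."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    return (dt - EPOCH) // timedelta(milliseconds=1)


def calendar_event(title, start, duration, description=None):
    """Build a POST /phone/calendar body from a start time and a duration."""
    body = {
        "title": title,
        "start_ms": to_epoch_ms(start),
        "end_ms": to_epoch_ms(start + duration),
    }
    if description is not None:
        body["description"] = description
    return body


event = calendar_event(
    "Meeting",
    datetime(2025, 3, 22, 10, 40, tzinfo=timezone.utc),
    timedelta(hours=1),
)
# event["start_ms"] == 1742640000000 and event["end_ms"] == 1742643600000,
# the same values as in the request example above.
```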

PUT /phone/calendar/{id}

Updates an existing event.


Camera

POST /phone/camera/capture

Triggers a headless camera capture.

Request:

{ "lens": "back" }

lens values: "front", "back".

Response:

{
"image_b64": "<base64-encoded JPEG>",
"width": 1920,
"height": 1080,
"format": "jpeg"
}
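The image arrives base64-encoded inside the JSON payload, so a client has to decode it before saving or displaying it. A minimal sketch, assuming the response shape shown above; decode_capture is a hypothetical helper, and the check for the JPEG start-of-image marker (bytes FF D8) is an extra sanity check rather than something the bridge requires.

```python
import base64


def decode_capture(response):
    """Return the raw JPEG bytes from a /phone/camera/capture response."""
    if response.get("format") != "jpeg":
        raise ValueError("unexpected image format: %r" % response.get("format"))
    raw = base64.b64decode(response["image_b64"])
    if not raw.startswith(b"\xff\xd8"):
        raise ValueError("payload does not start with the JPEG SOI marker")
    return raw


# Usage, with `response` being the parsed JSON body:
#   with open("capture.jpg", "wb") as fh:
#       fh.write(decode_capture(response))
```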

Audio

POST /phone/audio/record

Records audio for a fixed duration.

Request:

{ "duration_seconds": 5 }

Response:

{
"audio_b64": "<base64-encoded WAV>",
"format": "wav",
"duration_seconds": 5
}

GET /phone/audio/profile

Returns the current audio environment profile (detected ambient noise level, active audio mode).

POST /phone/audio/profile

Sets audio mode preferences.


Media

GET /phone/media/images

Lists photos from the device gallery, sorted newest first. Optional ?limit= (max 50) and ?offset=.

[
{
"id": 1234,
"uri": "content://media/external/images/media/1234",
"name": "IMG_20260321.jpg",
"date_taken": 1742600000000,
"size_bytes": 3145728,
"width": 4032,
"height": 3024
}
]

GET /phone/documents

Lists accessible documents from device storage. Optional ?mime_type=, ?limit=.


Notifications

GET /phone/notifications

Returns currently visible notifications.

{
"notifications": [
{ "id": "1", "package": "com.whatsapp", "title": "Alice", "text": "On my way", "timestamp": 1742600000000 }
]
}

GET /phone/notifications/history

Returns the recent notification history ring buffer (last 200 entries).

POST /phone/notifications/reply

Sends a reply via a notification's inline reply action.

Request:

{ "notification_key": "<key>", "reply_text": "Be there in 10" }

POST /phone/notifications/dismiss

Dismisses a notification.

Request:

{ "notification_key": "<key>" }

Health & Wearables

GET /phone/health

Returns health data from Android Health Connect. Optional ?types=steps,heart_rate,sleep,calories and ?days=7 (max 90).

{
"steps": { "total": 8432, "days": [{ "date": "2026-03-21", "count": 8432 }] },
"heart_rate": { "average_bpm": 72, "resting_bpm": 62 },
"sleep": { "total_hours": 7.5 },
"calories": { "total_kcal": 2100 }
}

GET /phone/recovery

Returns a computed recovery score (0–100) based on recent sleep, HRV, and resting heart rate.

{ "score": 82, "label": "ready", "components": { "sleep": 85, "hrv": 78, "resting_hr": 83 } }

GET /phone/wearables/scan

Scans for BLE GATT wearables. Optional ?duration_ms= (2000–15000, default 8000).

{
"devices": [
{ "address": "AA:BB:CC:DD:EE:FF", "name": "Polar H10", "rssi": -62 }
]
}
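A client may want to keep duration_ms inside the documented range and then pick the nearest device from the result. The sketch below assumes the response shape shown above; both helper names are hypothetical, and a higher (less negative) RSSI is used as a rough proxy for proximity.

```python
def clamp_scan_duration(duration_ms=8000):
    """Clamp a requested scan duration to the documented 2000-15000 ms range."""
    return max(2000, min(15000, int(duration_ms)))


def strongest_device(scan):
    """Return the scanned device with the highest RSSI, or None if none found."""
    devices = scan.get("devices", [])
    return max(devices, key=lambda d: d["rssi"], default=None)
```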

GET /phone/wearables/read

Reads a GATT characteristic from a paired wearable.

Query params: ?device=<MAC>, ?service=heart_rate|battery|body_composition|running_speed_cadence|glucose|cgm


Device & System

GET /phone/device

Returns device hardware info.

{
"manufacturer": "Google",
"model": "Pixel 9",
"brand": "google",
"android_version": "15",
"sdk_int": 35,
"screen_width": 1080,
"screen_height": 2400,
"locale": "en_US"
}

GET /phone/battery

Returns current battery state.

{ "percent": 78, "status": "discharging", "source": "unplugged" }

status values: "charging", "discharging", "full", "not_charging". source values: "ac", "usb", "wireless", "unplugged".

GET /phone/network

Returns active network state.

{ "connected": true, "type": "wifi", "internet": true, "validated": true }

GET /phone/wifi

Returns detailed Wi-Fi info (requires ACCESS_FINE_LOCATION for SSID).

{ "enabled": true, "ssid": "MyNetwork", "ip": "192.168.1.42", "rssi": -55, "link_speed": 433, "frequency": 5180 }

GET /phone/carrier

Returns cellular carrier and call state.

{ "operator_name": "T-Mobile", "country_iso": "us", "roaming": false, "call_state": "idle" }

call_state values: "idle", "ringing", "offhook".

GET /phone/bluetooth

Returns paired Bluetooth devices.

{
"enabled": true,
"devices": [
{ "address": "AA:BB:CC:DD:EE:FF", "name": "Pixel Buds", "type": "le" }
]
}

GET /phone/timezone

Returns device timezone.

{ "id": "America/New_York", "display_name": "EST", "offset_ms": -18000000, "dst_active": false }
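Several endpoints return epoch-millisecond timestamps, and offset_ms can be used to render them in device-local time. This is a sketch under the assumption that offset_ms is the device's current total UTC offset; device_local_time is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def device_local_time(epoch_ms, tz_info):
    """Convert an epoch-ms timestamp to the device's local time.

    tz_info is the parsed /phone/timezone response.
    """
    tz = timezone(timedelta(milliseconds=tz_info["offset_ms"]))
    return datetime.fromtimestamp(epoch_ms / 1000, tz=tz)


local = device_local_time(1742600000000, {"offset_ms": -18000000})
# local.isoformat() == "2025-03-21T18:33:20-05:00"
```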

GET /phone/activity

Returns the current physical activity state (requires ACTIVITY_RECOGNITION permission).

{ "activity": "walking", "confidence": 90 }

GET /phone/app_usage

Returns foreground app usage stats. Optional ?hours=24 (max 168).

{
"usage": [
{ "package": "com.twitter.android", "total_ms": 3600000, "last_used_ms": 1742600000000 }
]
}

GET /phone/permissions

Returns the current grant status of all bridge-relevant permissions.

{
"READ_CONTACTS": true,
"WRITE_CONTACTS": false,
"READ_SMS": true,
"SEND_SMS": false,
"ACCESS_FINE_LOCATION": true,
"CAMERA": true,
"RECORD_AUDIO": false
}

Motion (IMU)

GET /phone/imu

Takes a single accelerometer + gyroscope snapshot.

{
"timestamp_ms": 1742600000000,
"accelerometer": { "x": 0.12, "y": 9.78, "z": 0.04, "unit": "m/s²" },
"gyroscope": { "x": 0.01, "y": -0.02, "z": 0.00, "unit": "rad/s" }
}

POST /phone/imu/record

Records a time-series IMU stream.

Request:

{ "duration_ms": 5000, "rate_hz": 50 }

duration_ms: 500–30000. rate_hz: 10–200 (default 50).

Response:

{
"duration_ms": 5000,
"rate_hz": 50,
"sample_count": 250,
"has_gyroscope": true,
"accelerometer": [{ "t_ms": 0, "x": 0.12, "y": 9.78, "z": 0.04 }, "..."],
"gyroscope": [{ "t_ms": 0, "x": 0.01, "y": -0.02, "z": 0.00 }, "..."]
}
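The example response is consistent with a nominal sample count of duration_ms × rate_hz / 1000 (5000 ms at 50 Hz gives 250). The sketch below checks a request against the documented ranges and computes the acceleration magnitude of one sample. Both helper names are hypothetical, and whether the bridge itself rejects or clamps out-of-range values is not stated here.

```python
import math


def expected_sample_count(duration_ms, rate_hz=50):
    """Nominal number of samples for an IMU recording request."""
    if not 500 <= duration_ms <= 30000:
        raise ValueError("duration_ms must be within 500-30000")
    if not 10 <= rate_hz <= 200:
        raise ValueError("rate_hz must be within 10-200")
    return duration_ms * rate_hz // 1000


def magnitude(sample):
    """Euclidean norm of an x/y/z sample (m/s^2 for accelerometer data)."""
    return math.sqrt(sample["x"] ** 2 + sample["y"] ** 2 + sample["z"] ** 2)
```

For a device at rest, the accelerometer magnitude should be close to standard gravity (about 9.81 m/s²); the sample values above give roughly 9.78.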

System Interaction

POST /phone/notify

Posts a notification to the system tray.

Request:

{ "title": "Agent Update", "body": "Your flight price dropped to $312." }

Response:

{ "success": true }

POST /phone/vibrate

Vibrates the device.

Request:

{ "duration_ms": 200, "amplitude": 128 }

amplitude: 1–255, or omit for device default.

Response:

{ "vibrating": true, "duration_ms": 200 }

POST /phone/clipboard

Writes text to the clipboard.

Request:

{ "text": "text to copy" }

Response:

{ "success": true }

POST /phone/alarm

Sets a one-shot alarm.

Request:

{ "trigger_ms": 1742643600000, "label": "Call Alice" }

Response:

{ "success": true }

Accessibility (Screen Control)

All /phone/a11y/* endpoints require the Accessibility Service to be enabled and the relevant screen_* capability to be switched on. They are available in the full APK only. In ASSISTED mode, these endpoints trigger a user-confirmation bottom sheet before executing.

GET /phone/a11y/status

Returns whether the accessibility service is currently connected.

{ "connected": true }

GET /phone/a11y/tree

Returns the full UI view hierarchy of the foreground app. Requires screen_read_tree.

{
"nodes": [
{
"className": "android.widget.Button",
"text": "Submit",
"viewId": "com.example:id/btn_submit",
"bounds": { "left": 120, "top": 540, "right": 360, "bottom": 600 },
"clickable": true,
"editable": false,
"enabled": true
}
]
}

GET /phone/a11y/screenshot

Captures a JPEG screenshot. Requires screen_capture. Requires Android 11+.

{ "format": "jpeg", "data_base64": "<base64 JPEG>" }

POST /phone/a11y/action

Performs an accessibility action on a specific view node by ID. Requires screen_act.

Request:

{ "viewId": "com.example:id/btn_submit", "action": "click" }

action values: "click", "long_click", "focus", "clear_focus", "scroll_forward", "scroll_backward", "set_text" (pass "text" field for set_text).

Response:

{ "success": true }

POST /phone/a11y/click

Clicks at absolute screen coordinates. Requires screen_act.

Request:

{ "x": 240, "y": 570 }

Response:

{ "success": true }
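When a node has no usable viewId, one option is to click the centre of its bounds. The sketch below assumes the bounds returned by /phone/a11y/tree use the same absolute pixel coordinates as this endpoint; bounds_center is a hypothetical helper.

```python
def bounds_center(bounds):
    """Return a /phone/a11y/click body targeting the centre of a node's bounds."""
    return {
        "x": (bounds["left"] + bounds["right"]) // 2,
        "y": (bounds["top"] + bounds["bottom"]) // 2,
    }


click_body = bounds_center({"left": 120, "top": 540, "right": 360, "bottom": 600})
# click_body == {"x": 240, "y": 570}
```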

POST /phone/a11y/global

Triggers a global navigation action. Requires screen_global_nav.

Request:

{ "action": "back" }

action values: "back", "home", "recents", "notifications", "quick_settings".

Response:

{ "success": true }

GET /phone/a11y/autonomy

Returns the current accessibility tree (same as GET /phone/a11y/tree). Intended as a first step before beginning a multi-step autonomous sequence. Requires screen_autonomy.

POST /phone/a11y/wait_for

Waits up to timeout_ms for an element matching the selector to appear. Optionally taps it. Requires screen_act.

Request:

{
"view_id": "com.example:id/btn_ok",
"text": "Submit",
"content_desc": "close button",
"class_name": "android.widget.Button",
"exact_text": false,
"timeout_ms": 5000,
"tap": false
}

Each selector field is optional on its own, but at least one of view_id, text, content_desc, or class_name must be present.

Response:

{ "found": true, "tapped": false, "node": { "text": "Submit", "viewId": "...", "bounds": {...} } }
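The selector rule can be enforced before the request is sent. A minimal sketch: wait_for_body is a hypothetical helper, and its default values are taken from the example request rather than from documented defaults.

```python
SELECTOR_FIELDS = ("view_id", "text", "content_desc", "class_name")


def wait_for_body(timeout_ms=5000, tap=False, exact_text=False, **selectors):
    """Build a /phone/a11y/wait_for body with at least one selector field."""
    unknown = set(selectors) - set(SELECTOR_FIELDS)
    if unknown:
        raise ValueError("unknown selector fields: %s" % sorted(unknown))
    body = {k: v for k, v in selectors.items() if v is not None}
    if not body:
        raise ValueError("at least one of %s is required" % ", ".join(SELECTOR_FIELDS))
    body.update(exact_text=exact_text, timeout_ms=timeout_ms, tap=tap)
    return body
```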

POST /phone/a11y/scroll_find

Scrolls a container until the target element appears, then optionally taps it. Requires screen_act.

Request:

{
"text": "Settings",
"direction": "down",
"max_scrolls": 10,
"container_view_id": "com.example:id/list",
"wait_after_ms": 400,
"tap": true
}

direction values: "up", "down", "left", "right" (default "down").

Response:

{ "found": true, "tapped": true }

POST /phone/a11y/tap_text

Finds the first element whose text matches and clicks it. Requires screen_act.

Request:

{ "text": "Submit", "exact": false, "timeout_ms": 3000 }

Response:

{ "success": true }

POST /phone/a11y/type

Types text into a field. Without view_id, targets the focused or first editable field. Requires screen_act.

Request:

{ "text": "hello world", "view_id": "com.example:id/input" }

Response:

{ "success": true }

POST /phone/a11y/swipe

Performs a swipe gesture between two screen coordinates. Requires screen_act.

Request:

{ "x1": 300, "y1": 800, "x2": 300, "y2": 200, "duration_ms": 300 }

Response:

{ "success": true }

GET /phone/a11y/tree_interactive

Returns a filtered UI tree containing only interactive elements (clickable, long-clickable, or editable). Lighter payload than /phone/a11y/tree. Requires screen_read_tree. Optional ?class_filter=Button,EditText to further narrow by class name.

{
"nodes": [
{ "className": "android.widget.Button", "text": "Submit", "viewId": "...", "bounds": {...}, "clickable": true }
]
}

POST /phone/a11y/execute_plan

Executes a multi-step UI automation plan in a single request. Each step is an action applied to an element identified by view_id, text, or screen coordinates. Requires screen_act and screen_autonomy.

Request:

{
"steps": [
{ "action": "tap_text", "text": "Search" },
{ "action": "type", "view_id": "com.example:id/search_input", "text": "0x01 agent" },
{ "action": "click", "x": 900, "y": 120 },
{ "action": "wait_for", "text": "Results", "timeout_ms": 4000 }
]
}

Response:

{ "success": true, "steps_executed": 4 }

Step action values: "tap_text", "type", "click", "long_click", "scroll_find", "wait_for", "global", "swipe".
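Because a plan runs as a single request, it can help to validate step actions locally first. The sketch below checks only the action names listed above; build_plan is a hypothetical helper and does not validate the per-action fields.

```python
PLAN_ACTIONS = {
    "tap_text", "type", "click", "long_click",
    "scroll_find", "wait_for", "global", "swipe",
}


def build_plan(steps):
    """Return an execute_plan body after checking each step's action name."""
    for index, step in enumerate(steps):
        action = step.get("action")
        if action not in PLAN_ACTIONS:
            raise ValueError("step %d: unsupported action %r" % (index, action))
    return {"steps": list(steps)}


plan = build_plan([
    {"action": "tap_text", "text": "Search"},
    {"action": "wait_for", "text": "Results", "timeout_ms": 4000},
])
```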


Apps

POST /phone/app/launch

Launches an installed app by package name. Requires screen_act.

Request:

{ "package": "com.twitter.android" }

Response:

{ "success": true }

GET /phone/app/list

Lists installed non-system apps. Returns package name and label.

[
{ "package": "com.twitter.android", "label": "X" },
{ "package": "com.whatsapp", "label": "WhatsApp" }
]

Notifications (continued)

GET /phone/notifications/triage

Returns all currently visible notifications grouped by priority (urgent, normal, silent). Requires notifications_read.

POST /phone/notifications/triage

Posts a triage decision for a set of notifications — mark as read, snooze, or dismiss. Requires notifications_dismiss.


Device Context

GET /phone/context

Returns an aggregated snapshot of device state — battery, network, current foreground app, active notifications count, pending calls, and next calendar event. Useful as a single call to orient the agent before it acts.

{
"battery": { "percent": 78, "status": "discharging" },
"network": { "connected": true, "type": "wifi" },
"foreground_app": "com.twitter.android",
"notifications_count": 3,
"pending_calls": 0,
"next_event": { "title": "Team sync", "start": "2026-03-22T10:00:00Z" }
}

POST /phone/a11y/vision

Captures a screenshot, optionally includes the UI tree, and sends both to a Vision-Language Model for analysis. Returns structured action suggestions. Requires screen_vision. Rate limited to 1 call per 3 seconds.

The VLM used is determined by the agent brain's configured provider (Gemini, Anthropic, or any OpenAI-compatible endpoint). No separate vision API key is needed — the same key from the Agent Brain config is used.

Request:

{ "prompt": "What buttons are visible on screen?", "include_tree": false }

Response:

{
"description": "The screen shows the Uber app with a pickup location field and a 'Request' button.",
"actions": [
{ "type": "click", "target": "Request button", "bounds": { "left": 120, "top": 540, "right": 360, "bottom": 600 } }
]
}
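To stay under the limit of one call per 3 seconds, a client can space its own calls. The following is a client-side sketch only: VisionThrottle is a hypothetical class, and how the bridge responds when the limit is exceeded is not described on this page.

```python
import time


class VisionThrottle:
    """Ensure at least min_interval_s elapses between successive calls."""

    def __init__(self, min_interval_s=3.0, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until a call is allowed; return the seconds slept."""
        waited = 0.0
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
                now = self._clock()
        self._last = now
        return waited


throttle = VisionThrottle()
# Call throttle.wait() immediately before each POST /phone/a11y/vision.
```

The clock and sleep parameters are injectable so the spacing logic can be exercised without real delays.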

Diagnostics

GET /phone/activity_log

Returns the bridge activity ring buffer (last 200 entries). Shows what actions the agent has taken recently. Optional ?limit=.

{
"entries": [
{ "ts": 1742600000000, "endpoint": "POST /phone/a11y/tap_text", "cap": "screen_act", "status": 200 }
]
}
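The log can be summarized per capability to see which permissions the agent is actually exercising. A minimal sketch over the response shape shown above; calls_per_capability is a hypothetical helper.

```python
from collections import Counter


def calls_per_capability(log):
    """Count activity-log entries by the capability ("cap") they used."""
    return Counter(entry.get("cap", "unknown") for entry in log.get("entries", []))
```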

Security Model

  • The Phone Bridge binds only to the loopback address 127.0.0.1, so it is not reachable from other devices or from the public internet.
  • Every request requires a valid x-bridge-token (HTTP 401 otherwise).
  • Each endpoint checks the corresponding capability toggle (HTTP 403 if disabled).
  • In ASSISTED mode, screen-control endpoints present a user confirmation bottom-sheet before executing; the request blocks until approved or denied.
  • The bridge activity log at /phone/activity_log lets the user review the agent's recent bridge calls (the last 200 entries).