Phone Bridge API
When running the 0x01 Mobile application, the native OS environment exposes a local HTTP server on 127.0.0.1:9092.
This local server acts as the "Phone Bridge". It translates standard REST requests from the agent's LLM brain (or remote controlling agents) into raw Android system commands.
Architecture
- The ZeroClaw agent running on the node generates a plan that requires device context.
- The agent issues an HTTP request to http://127.0.0.1:9092.
- The request is authenticated via an x-bridge-token stored in the Android Keystore.
- The native Android (Kotlin) backend securely executes the requested operation and returns a JSON payload.
By supplying the OpenAPI specification for port 9092 to the agent's LLM as a tool-call array, the agent gains programmable control over device hardware.
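As a rough illustration of the tool-call idea, the sketch below maps a few endpoints from this page to tool descriptors and resolves a tool call to an HTTP method, path, and body. The descriptor layout, the tool names, and `resolve_tool_call` are hypothetical; the actual OpenAPI-derived format handed to the LLM is not shown in this document.

```python
# Hypothetical tool table: the names and descriptor layout are assumptions.
# Only the HTTP methods and paths come from the endpoint list on this page.
BRIDGE_TOOLS = [
    {"name": "get_contacts", "method": "GET", "path": "/phone/contacts"},
    {"name": "send_sms", "method": "POST", "path": "/phone/sms/send"},
    {"name": "get_location", "method": "GET", "path": "/phone/location"},
]


def resolve_tool_call(name, arguments=None):
    """Translate a tool call into (method, path, body) for the bridge."""
    for tool in BRIDGE_TOOLS:
        if tool["name"] == name:
            # GET endpoints take no body; POST endpoints forward the arguments.
            body = arguments if tool["method"] == "POST" else None
            return tool["method"], tool["path"], body
    raise KeyError(f"unknown tool: {name}")
```

Keeping the table as plain data means the same list could be serialized into whatever tool schema the model expects.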
Authentication
Every request must include the bridge token header:
x-bridge-token: <token>
The token is generated on application install and stored in Android Keystore. It is handed only to the local node process at startup and is never transmitted off-device.
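A minimal client sketch for attaching the header, using only the Python standard library. The helper names `build_bridge_request` and `call_bridge` are hypothetical, and sending a `Content-Type: application/json` header on POST requests is an assumption, since this page does not state the expected content type.

```python
import json
import urllib.request

BRIDGE_URL = "http://127.0.0.1:9092"  # address given in this document


def build_bridge_request(path, token, body=None):
    """Build (but do not send) an authenticated Phone Bridge request."""
    headers = {"x-bridge-token": token}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        # Assumption: the bridge expects JSON request bodies.
        headers["Content-Type"] = "application/json"
    method = "GET" if body is None else "POST"
    return urllib.request.Request(
        BRIDGE_URL + path, data=data, headers=headers, method=method
    )


def call_bridge(path, token, body=None, timeout=10):
    """Send the request and decode the JSON payload."""
    request = build_bridge_request(path, token, body)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `call_bridge("/phone/contacts", token)` would fetch the address book when run on the device itself, where the bridge is listening.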
Endpoints
Communication
GET /phone/contacts
Returns the device address book.
{
  "contacts": [
    { "name": "Alice", "phone": "+15551234567", "email": "alice@example.com" },
    { "name": "Bob", "phone": "+15559876543" }
  ]
}
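Note that the second entry in the sample has no `email` field, so a consumer should probably treat it as optional. A small sketch of indexing the response by phone number follows; the function names are hypothetical.

```python
def index_contacts_by_phone(payload):
    """Map phone number -> contact dict for quick lookup."""
    return {c["phone"]: c for c in payload.get("contacts", []) if "phone" in c}


def contact_email(contact):
    # "email" is missing for some entries in the sample, so default to None.
    return contact.get("email")
```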
GET /phone/sms
Returns recent SMS messages.
{
  "messages": [
    { "from": "+15551234567", "body": "On my way", "timestamp": "2025-03-05T13:10:00Z" },
    { "from": "+15559876543", "body": "Call me back", "timestamp": "2025-03-05T12:45:00Z" }
  ]
}
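The timestamps in the sample look like ISO 8601 UTC strings. Assuming that holds for all messages, a sketch of filtering for messages newer than a cutoff could look like this; `parse_timestamp` and `messages_since` are hypothetical helpers.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    # Swap the trailing "Z" for an explicit offset so this also works on
    # Python versions older than 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def messages_since(payload, cutoff):
    """Return messages newer than `cutoff`, oldest first."""
    recent = [
        m for m in payload.get("messages", [])
        if parse_timestamp(m["timestamp"]) > cutoff
    ]
    return sorted(recent, key=lambda m: parse_timestamp(m["timestamp"]))
```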
POST /phone/sms/send
Sends a text message.
Request:
{ "to": "+15551234567", "message": "Hello from the agent" }
Response:
{ "success": true }
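Because this endpoint has a real-world side effect, a caller may want to validate the body before sending it. The sketch below assumes numbers follow the `+<digits>` form used in the examples on this page; the pattern and both helper names are hypothetical, and the bridge itself may accept other formats.

```python
import re

# Assumption: numbers use the "+<digits>" form shown in the examples.
PHONE_PATTERN = re.compile(r"^\+[1-9]\d{6,14}$")


def build_sms_body(to, message):
    """Return the request body for POST /phone/sms/send, or raise ValueError."""
    if not PHONE_PATTERN.match(to):
        raise ValueError(f"unexpected phone number format: {to!r}")
    if not message:
        raise ValueError("message must not be empty")
    return {"to": to, "message": message}


def sms_sent(response):
    # Treat a missing "success" field as a failure.
    return response.get("success") is True
```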
GET /phone/call-log
Returns recent call history.
{
  "calls": [
    { "number": "+15551234567", "type": "incoming", "duration_seconds": 142, "timestamp": "2025-03-05T11:00:00Z" },
    { "number": "+15559876543", "type": "missed", "duration_seconds": 0, "timestamp": "2025-03-05T10:30:00Z" }
  ]
}
type values: "incoming", "outgoing", "missed".
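As an example of consuming this response, the sketch below counts calls per `type` and totals the talk time. `summarize_calls` is a hypothetical helper, not part of the bridge.

```python
from collections import Counter


def summarize_calls(payload):
    """Count calls per type and total the seconds spent on answered calls."""
    calls = payload.get("calls", [])
    counts = Counter(c["type"] for c in calls)
    talk_seconds = sum(
        c["duration_seconds"] for c in calls if c["type"] != "missed"
    )
    return {"counts": dict(counts), "talk_seconds": talk_seconds}
```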
Hardware & Sensors
POST /phone/camera/capture
Triggers a headless camera capture.
Request:
{ "lens": "back" }
lens values: "front", "back".
Response:
{
  "image_b64": "<base64-encoded JPEG>",
  "width": 1920,
  "height": 1080,
  "format": "jpeg"
}
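The image arrives base64-encoded inside the JSON payload, so the caller has to decode it before use. A minimal sketch, with `decode_capture` as a hypothetical helper that also checks the standard JPEG start-of-image marker:

```python
import base64

JPEG_MAGIC = b"\xff\xd8"  # JPEG data starts with the SOI marker FF D8


def decode_capture(payload):
    """Decode the image bytes from a /phone/camera/capture response."""
    if payload.get("format") != "jpeg":
        raise ValueError(f"unexpected format: {payload.get('format')!r}")
    data = base64.b64decode(payload["image_b64"])
    if not data.startswith(JPEG_MAGIC):
        raise ValueError("decoded data does not look like a JPEG")
    return data
```

The returned bytes can then be written to disk with `open(path, "wb")` or passed to an image library.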
POST /phone/microphone/record
Records audio for a fixed duration.
Request:
{ "duration_seconds": 5 }
Response:
{
  "audio_b64": "<base64-encoded WAV>",
  "format": "wav",
  "duration_seconds": 5
}
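Assuming the decoded bytes form a standard PCM WAV file, Python's `wave` module can read the recording straight from memory. `inspect_recording` below is a hypothetical helper, and the sample rate and channel count it reports depend on what the device actually records.

```python
import base64
import io
import wave


def inspect_recording(payload):
    """Decode a /phone/microphone/record response and report its properties."""
    raw = base64.b64decode(payload["audio_b64"])
    with wave.open(io.BytesIO(raw), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "seconds": frames / rate,
        }
```

Comparing the computed `seconds` with the `duration_seconds` field is a cheap way to detect a truncated recording.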
GET /phone/location
Returns the current GPS fix.
{
  "lat": 40.7128,
  "lng": -74.0060,
  "accuracy_meters": 12.5,
  "timestamp": "2025-03-05T14:22:10Z"
}
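One common use of a fix is checking proximity to a known point. The sketch below uses the haversine formula with a mean Earth radius and treats `accuracy_meters` conservatively: the fix only counts as inside the radius when the whole accuracy circle fits. Both function names are hypothetical, and the conservative rule is one possible design choice.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def distance_meters(fix, lat, lng):
    """Great-circle distance from a location fix to (lat, lng)."""
    lat1, lng1 = math.radians(fix["lat"]), math.radians(fix["lng"])
    lat2, lng2 = math.radians(lat), math.radians(lng)
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_within(fix, lat, lng, radius_m):
    # Add the reported accuracy so an imprecise fix cannot produce a false positive.
    return distance_meters(fix, lat, lng) + fix["accuracy_meters"] <= radius_m
```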
System Interaction
GET /phone/clipboard
Reads the current clipboard contents.
{ "text": "copied text here" }
POST /phone/clipboard
Writes to the clipboard.
Request:
{ "text": "text to copy" }
Response:
{ "success": true }
POST /phone/notify
Posts a notification to the system tray.
Request:
{ "title": "Agent Update", "body": "Your flight price dropped to $312." }
Response:
{ "success": true }
POST /phone/vibrate
Vibrates the device with a custom pattern.
Request:
{ "pattern_ms": [100, 200, 100, 400] }
The array alternates between vibrate and pause durations in milliseconds.
Response:
{ "success": true }
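To make the pattern format concrete, the sketch below splits a pattern into vibrate and pause time. It assumes the first element is a vibrate duration, which is how the description above reads but is not confirmed here; `describe_pattern` is a hypothetical helper.

```python
def describe_pattern(pattern_ms):
    """Summarize a vibration pattern given as alternating durations in ms."""
    if not pattern_ms or any(
        not isinstance(v, int) or v < 0 for v in pattern_ms
    ):
        raise ValueError("pattern_ms must be a non-empty list of non-negative ints")
    # Assumption: even positions vibrate, odd positions pause.
    vibrate_ms = sum(pattern_ms[0::2])
    pause_ms = sum(pattern_ms[1::2])
    return {
        "vibrate_ms": vibrate_ms,
        "pause_ms": pause_ms,
        "total_ms": vibrate_ms + pause_ms,
    }
```

Under that assumption, the example pattern `[100, 200, 100, 400]` vibrates for 200 ms in total, pauses for 600 ms, and lasts 800 ms overall.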
Experimental / A11y
POST /phone/a11y/vision
Captures a screenshot of the current UI and passes it to a Vision-Language Model for structured analysis. Requires the Android Accessibility Service to be enabled for the app.
Request:
{ "prompt": "What buttons are visible on screen?" }
prompt is optional. If omitted, the VLM returns a general description of the current screen state.
Response:
{
  "description": "The screen shows the Uber app with a pickup location field and a 'Request' button.",
  "actions": [
    { "type": "click", "target": "Request button", "bounds": [120, 540, 360, 600] }
  ]
}
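The layout of `bounds` is not documented on this page. Assuming it is `[left, top, right, bottom]` in screen pixels, a caller could derive a tap point for each suggested click as sketched below; `tap_point` and `click_targets` are hypothetical helpers.

```python
def tap_point(action):
    """Center of an action's bounds, assuming [left, top, right, bottom]."""
    left, top, right, bottom = action["bounds"]
    return ((left + right) // 2, (top + bottom) // 2)


def click_targets(payload):
    """List (target, tap point) pairs for every click action in the response."""
    return [
        (a["target"], tap_point(a))
        for a in payload.get("actions", [])
        if a.get("type") == "click"
    ]
```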
Security Model
The Phone Bridge binds strictly to 127.0.0.1, so it cannot be reached from other devices or from the public internet. Any request without a valid x-bridge-token returns HTTP 401.
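The two rules above can be illustrated with a short sketch. This is not the Kotlin backend's code, which is not shown here; it is a Python illustration of a loopback check and a token check, and the constant-time comparison is a common precaution that is assumed rather than documented. The function names and the lowercase-keyed `headers` dict are hypothetical.

```python
import hmac
import ipaddress


def is_loopback(host):
    """True when `host` is a loopback address such as 127.0.0.1."""
    return ipaddress.ip_address(host).is_loopback


def authorize(headers, expected_token):
    """Return the HTTP status a request would get: 200 or 401."""
    # Assumption: header names have already been lowercased by the server.
    supplied = headers.get("x-bridge-token", "")
    # compare_digest avoids leaking how many leading characters matched.
    matches = hmac.compare_digest(
        supplied.encode("utf-8"), expected_token.encode("utf-8")
    )
    return 200 if matches else 401
```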