---
title: Responses API | Lightcone
description: OpenAI-compatible computer-use agent interface for building your own agent loops.
---

The Responses API is an OpenAI-compatible interface for building **computer-use agent (CUA)** loops. A CUA loop is a pattern where a vision model looks at a screenshot of a computer screen, decides the next action (click, type, scroll), and you execute it — repeating until the task is complete. Unlike [Agent Tasks](/guides/agent-tasks/index.md), where the agent runs autonomously, the Responses API gives you control at every step.

## How it works

1. **Send input** — provide a text instruction and optionally a screenshot
2. **Get output** — the model returns either a text `message` or a `computer_call` action
3. **Execute the action** — perform the click, type, or scroll on a computer session
4. **Feed back the result** — send a screenshot of the new state as `computer_call_output`
5. **Repeat** until the model returns a `message` (it’s done) or you decide to stop

## Input format

The `input` field accepts either a **string** or an **array of message objects**:

- **String** — simplest form, for text-only instructions
- **Array** — when you need to include images (screenshots) or structured multi-turn input

Both formats work identically for text-only requests. Use the array format when you need to attach a screenshot with `input_image`.
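The five steps above can be sketched as a single loop. This is a minimal sketch, not a complete implementation: `take_screenshot` and `execute_action` stand in for whatever session helpers you use to capture the screen and perform actions, and error handling is omitted.

```python
def run_cua_loop(client, task, take_screenshot, execute_action, max_steps=30):
    """Drive a CUA loop: ask the model for an action, execute it,
    and feed back a screenshot until the model replies with a message."""
    tools = [{"type": "computer_use", "environment": "browser"}]
    # Step 1: send the initial instruction.
    response = client.responses.create(
        model="tzafon.northstar-cua-fast", input=task, tools=tools
    )
    for _ in range(max_steps):
        # Step 2: look for a computer_call in the output.
        call = next((i for i in response.output if i.type == "computer_call"), None)
        if call is None:
            # No action requested: the model answered with a message, so stop.
            return response
        # Step 3: perform the click, type, or scroll.
        execute_action(call.action)
        # Steps 4-5: feed back a screenshot of the new state and repeat.
        response = client.responses.create(
            model="tzafon.northstar-cua-fast",
            previous_response_id=response.id,
            input=[{
                "type": "computer_call_output",
                "call_id": call.call_id,
                "output": {"type": "input_image", "image_url": take_screenshot()},
            }],
            tools=tools,
        )
    return response
```

The `max_steps` cap implements the "or you decide to stop" half of step 5, so a confused model cannot loop forever.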
## Create a response

The simplest form — pass a string as `input`:

```python
from tzafon import Lightcone

client = Lightcone()

response = client.responses.create(
    model="tzafon.northstar-cua-fast",
    input="Go to wikipedia.org and search for 'Alan Turing'",
    tools=[
        {
            "type": "computer_use",
            "display_width": 1280,
            "display_height": 720,
            "environment": "browser",
        },
    ],
)
```

```typescript
import Lightcone from "@tzafon/lightcone";

const client = new Lightcone();

const response = await client.responses.create({
  model: "tzafon.northstar-cua-fast",
  input: "Go to wikipedia.org and search for 'Alan Turing'",
  tools: [
    {
      type: "computer_use",
      display_width: 1280,
      display_height: 720,
      environment: "browser",
    },
  ],
});
```

When you need to include a screenshot (common in CUA loops), use the array format:

```python
response = client.responses.create(
    model="tzafon.northstar-cua-fast",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "Click the search button"},
                {"type": "input_image", "image_url": screenshot_url},
            ],
        },
    ],
    tools=[
        {
            "type": "computer_use",
            "display_width": 1280,
            "display_height": 720,
            "environment": "browser",
        },
    ],
)
```

```typescript
const response = await client.responses.create({
  model: "tzafon.northstar-cua-fast",
  input: [
    {
      role: "user",
      content: [
        { type: "input_text", text: "Click the search button" },
        { type: "input_image", image_url: screenshotUrl },
      ],
    },
  ],
  tools: [
    {
      type: "computer_use",
      display_width: 1280,
      display_height: 720,
      environment: "browser",
    },
  ],
});
```

`display_width` and `display_height` are optional. If omitted and you include a screenshot in the input, the server infers the viewport dimensions from the image. If no image is provided, it defaults to 1024×768.

## Process the output

The response `output` is an array of items.
Each item has a `type`:

| Type            | Meaning                                                              |
| --------------- | -------------------------------------------------------------------- |
| `computer_call` | The model wants to perform an action (click, type, scroll, and more) |
| `message`       | The model is responding with text (task may be done)                 |
| `reasoning`     | Internal reasoning (if available)                                    |

When you get a `computer_call`, the `action` field tells you what to do:

```python
for item in response.output:
    if item.type == "computer_call":
        action = item.action
        print(f"Action: {action.type}")  # e.g., "click", "type", "navigate"
        print(f"Coordinates: ({action.x}, {action.y})")
        print(f"Text: {action.text}")
    elif item.type == "message":
        for block in item.content:
            print(block.text)
```

```typescript
for (const item of response.output ?? []) {
  if (item.type === "computer_call") {
    console.log(`Action: ${item.action?.type}`);
    console.log(`Coordinates: (${item.action?.x}, ${item.action?.y})`);
    console.log(`Text: ${item.action?.text}`);
  } else if (item.type === "message") {
    for (const block of item.content ?? []) {
      console.log(block.text);
    }
  }
}
```

## Action types

The model returns actions with coordinates already scaled to match your `display_width` and `display_height`. You can pass `action.x` and `action.y` directly to `computer.click()` or other session methods without any conversion.
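In practice that means executing a `computer_call` is a straight dispatch on `action.type`, with the coordinates passed through unchanged. A minimal dispatcher covering a few action types might look like this; `computer.click()` is named on this page, while the other method names are assumptions modeled on it — substitute whatever your session object exposes.

```python
def execute_action(computer, action):
    """Dispatch a computer_call action to session methods.

    Coordinates arrive pre-scaled to the declared display size,
    so they pass straight through without conversion.
    """
    if action.type == "click":
        computer.click(action.x, action.y)
    elif action.type == "type":
        computer.type(action.text)
    elif action.type == "scroll":
        computer.scroll(action.x, action.y, action.scroll_y)
    elif action.type == "navigate":
        computer.navigate(action.url)
    else:
        raise ValueError(f"unhandled action type: {action.type}")
```

Raising on unhandled types is a deliberate choice for a sketch like this: it surfaces any action your executor does not yet cover instead of silently dropping it.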
The model can request these actions:

| Action                | Fields                     | Description                                             |
| --------------------- | -------------------------- | ------------------------------------------------------- |
| `click`               | `x`, `y`, `button`         | Click at coordinates                                    |
| `double_click`        | `x`, `y`                   | Double-click                                            |
| `triple_click`        | `x`, `y`                   | Triple-click (select a line)                            |
| `right_click`         | `x`, `y`                   | Right-click                                             |
| `type`                | `text`                     | Type text                                               |
| `key` / `keypress`    | `keys`                     | Press key combination                                   |
| `key_down` / `key_up` | `keys`                     | Hold / release key                                      |
| `scroll`              | `x`, `y`, `scroll_y`       | Scroll vertically                                       |
| `hscroll`             | `x`, `y`, `scroll_x`       | Scroll horizontally                                     |
| `navigate`            | `url`                      | Go to a URL (browser only)                              |
| `drag`                | `x`, `y`, `end_x`, `end_y` | Drag between two points                                 |
| `wait`                | —                          | Wait for the page to settle                             |
| `terminate`           | `status`, `result`         | Task is complete (`status`: `"success"` or `"failure"`) |
| `answer`              | `result`                   | Answer a question with findings                         |
| `done`                | `text`                     | Task is complete (alias)                                |

## Multi-turn chaining

Use `previous_response_id` to chain conversations without resending the full history:

```python
# First turn — string input is fine for text-only
response = client.responses.create(
    model="tzafon.northstar-cua-fast",
    input="Navigate to example.com",
    tools=[{"type": "computer_use", "display_width": 1280, "display_height": 720, "environment": "browser"}],
)

# Execute the action, take a screenshot, then continue
followup = client.responses.create(
    model="tzafon.northstar-cua-fast",
    previous_response_id=response.id,
    input=[
        {
            "type": "computer_call_output",
            "call_id": response.output[0].call_id,
            "output": {"type": "input_image", "image_url": screenshot_url},
        },
    ],
    tools=[{"type": "computer_use", "display_width": 1280, "display_height": 720, "environment": "browser"}],
)
```

```typescript
// First turn — string input is fine for text-only
const response = await client.responses.create({
  model: "tzafon.northstar-cua-fast",
  input: "Navigate to example.com",
  tools: [{ type: "computer_use", display_width: 1280, display_height: 720, environment: "browser" }],
});

// Execute the action, take a screenshot, then continue
const followup = await client.responses.create({
  model: "tzafon.northstar-cua-fast",
  previous_response_id: response.id!,
  input: [
    {
      type: "computer_call_output",
      call_id: response.output![0].call_id!,
      output: { type: "input_image", image_url: screenshotUrl },
    },
  ],
  tools: [{ type: "computer_use", display_width: 1280, display_height: 720, environment: "browser" }],
});
```

## Manage responses

```python
# Retrieve a response
response = client.responses.retrieve("resp_abc123")

# Cancel an in-progress response
client.responses.cancel("resp_abc123")

# Delete a response
client.responses.delete("resp_abc123")
```

```typescript
const response = await client.responses.retrieve("resp_abc123");
await client.responses.cancel("resp_abc123");
await client.responses.delete("resp_abc123");
```

## Models

| Model                       | Best for                               |
| --------------------------- | -------------------------------------- |
| `tzafon.northstar-cua-fast` | Computer-use tasks (optimized for CUA) |
| `tzafon.sm-1`               | General text tasks                     |

For a fully managed agent experience where you don’t need to control the loop, use [Agent Tasks](/guides/agent-tasks/index.md) instead. The Responses API is for when you need full control over every step.

## See also

- [**CUA protocol guide**](/guides/cua-protocol/index.md) — full implementation of a CUA loop
- [**Agent Tasks**](/guides/agent-tasks/index.md) — fully managed alternative where the agent runs autonomously
- [**Computers**](/guides/computers/index.md) — session lifecycle and all available actions
- [**How Lightcone works**](/guides/how-lightcone-works/index.md) — how the three API layers relate