Skip to content
NorthstarPlatformPricingLogin
Environments

Operate a Computer

End-to-end guide to spinning up a cloud desktop, interacting with it, and capturing results.

This guide walks through a complete desktop workflow: launching an application, interacting with it, and capturing the results.

Prerequisites: Complete the Quickstart and have TZAFON_API_KEY set in your environment.

operate.py
from tzafon import Lightcone
client = Lightcone()
with client.computer.create(kind="desktop") as computer:
# Open a browser on the desktop (exec is on the low-level client, not the session wrapper)
client.computers.exec.sync(computer.id, command="firefox https://wikipedia.org &")
computer.wait(3)
# Click the search box and type a query
computer.click(640, 360)
computer.type("Ada Lovelace")
computer.hotkey("Enter")
computer.wait(2)
# Scroll down the page
computer.scroll(0, 500, 640, 400)
computer.wait(1)
# Capture a screenshot
result = computer.screenshot()
print(f"Screenshot: {computer.get_screenshot_url(result)}")
operate.ts
import Lightcone from "@tzafon/lightcone";
const client = new Lightcone();
const computer = await client.computers.create({ kind: "desktop" });
const id = computer.id!;
try {
// Open a browser on the desktop
await client.computers.exec.sync(id, { command: "firefox https://wikipedia.org &" });
await new Promise((r) => setTimeout(r, 3000));
// Click the search box and type a query
await client.computers.click(id, { x: 640, y: 360 });
await client.computers.type(id, { text: "Ada Lovelace" });
await client.computers.hotkey(id, { keys: ["Enter"] });
await new Promise((r) => setTimeout(r, 2000));
// Scroll down
await client.computers.scroll(id, { dx: 0, dy: 500, x: 640, y: 400 });
await new Promise((r) => setTimeout(r, 1000));
// Capture a screenshot
const result = await client.computers.screenshot(id);
console.log("Screenshot:", result.result?.screenshot_url);
} finally {
await client.computers.delete(id);
}

Coordinates are screenshot pixel positions. To find where to click:

  1. Take a screenshot first: computer.screenshot()
  2. Open the screenshot URL in your browser
  3. Note the pixel coordinates of the element you want to interact with
  4. Pass those exact coordinates to click(), double_click(), or type()

Every action can return page context with viewport state. Use the low-level execute() method to get it:

from tzafon import Lightcone
client = Lightcone()
with client.computer.create(kind="browser") as computer:
result = client.computers.execute(computer.id, action={
"type": "screenshot",
"include_context": True,
})
ctx = result.page_context
print(f"URL: {ctx.url}")
print(f"Title: {ctx.title}")
print(f"Viewport: {ctx.viewport_width}x{ctx.viewport_height}")
print(f"Page size: {ctx.page_width}x{ctx.page_height}")
print(f"Scroll position: ({ctx.scroll_x}, {ctx.scroll_y})")
const result = await client.computers.execute(id, {
action: {
type: "screenshot",
include_context: true,
},
});
const ctx = result.page_context;
console.log(`URL: ${ctx?.url}`);
console.log(`Title: ${ctx?.title}`);
console.log(`Viewport: ${ctx?.viewport_width}x${ctx?.viewport_height}`);

There’s no automatic wait-for-page-load. Use explicit waits:

client.computers.exec.sync(computer.id, command="firefox https://example.com &")
computer.wait(3) # Wait for the app to launch and page to load
computer.click(100, 200)
computer.wait(1) # Wait for any animations or network requests
await client.computers.exec.sync(id, { command: "firefox https://example.com &" });
await new Promise((r) => setTimeout(r, 3000));
await client.computers.click(id, { x: 100, y: 200 });
await new Promise((r) => setTimeout(r, 1000));
import tzafon
from tzafon import Lightcone
client = Lightcone()
try:
with client.computer.create(kind="desktop") as computer:
computer.click(100, 200)
result = computer.screenshot()
if result.error_message:
print(f"Action failed: {result.error_message}")
except tzafon.AuthenticationError:
print("Invalid API key")
except tzafon.RateLimitError:
print("Rate limited — slow down")
except tzafon.APIStatusError as e:
print(f"API error {e.status_code}: {e.message}")
try {
const result = await client.computers.click(id, { x: 100, y: 200 });
if (result.error_message) {
console.error(`Action failed: ${result.error_message}`);
}
} catch (err) {
if (err instanceof Lightcone.AuthenticationError) {
console.error("Invalid API key");
} else if (err instanceof Lightcone.RateLimitError) {
console.error("Rate limited");
} else if (err instanceof Lightcone.APIError) {
console.error(`API error ${err.status}: ${err.message}`);
}
}
  • Start with a screenshot to understand the desktop layout before clicking
  • Use wait() after launching apps and after clicks to let things settle
  • Check error_message on every action result for graceful error handling
  • Use shell commands to launch applications — client.computers.exec.sync(id, command="app &")
  • Use persistent: true to keep computer state across runs
  • Computers — full reference for computer lifecycle and actions
  • Shell commands — run terminal commands on your computer
  • Run a task — let Northstar operate the computer instead of hardcoding coordinates
  • Playwright integration — use CSS selectors for browser-mode computers
  • Coordinates — how Northstar’s coordinate system works and how to scale coordinates
  • Troubleshooting — common issues with clicks, waits, and more
  • desktop.py — full desktop loop example with screenshot, think, act