Automate a Browser

End-to-end guide to creating a browser session, navigating pages, interacting with elements, and capturing results.

This guide walks through a complete browser automation workflow: searching Wikipedia, reading the page HTML, scrolling, and capturing a screenshot.

Prerequisites: Complete the Quickstart and have TZAFON_API_KEY set in your environment.

The full example

from tzafon import Lightcone

client = Lightcone()

with client.computer.create(kind="browser") as computer:
    # Navigate to Wikipedia
    computer.navigate("https://wikipedia.org")
    computer.wait(1)

    # Type a search query
    computer.click(640, 360)  # Click the search box
    computer.type("Ada Lovelace")
    computer.hotkey("enter")
    computer.wait(2)

    # Get page content
    html_result = computer.html()
    html_content = computer.get_html_content(html_result)
    print(f"Page length: {len(html_content)} chars")

    # Scroll down
    computer.scroll(0, 500, 640, 400)
    computer.wait(1)

    # Capture a screenshot
    result = computer.screenshot()
    print(f"Screenshot: {computer.get_screenshot_url(result)}")

import Lightcone from "@tzafon/lightcone";

const client = new Lightcone();
const computer = await client.computers.create({ kind: "browser" });
const id = computer.id!;

try {
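  // Navigate to Wikipedia, then pause: page loads are not awaited automatically
  await client.computers.navigate(id, { url: "https://wikipedia.org" });
  await new Promise((r) => setTimeout(r, 1000));

  // Type a search query
  await client.computers.click(id, { x: 640, y: 360 }); // Click the search box
  await client.computers.type(id, { text: "Ada Lovelace" });
  await client.computers.hotkey(id, { keys: ["Enter"] });
  await new Promise((r) => setTimeout(r, 2000)); // Wait for the results page

  // Get page HTML
  const htmlResult = await client.computers.html(id);
  console.log("HTML retrieved");

  // Scroll down
  await client.computers.scroll(id, { dx: 0, dy: 500, x: 640, y: 400 });
  await new Promise((r) => setTimeout(r, 1000));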

  // Capture a screenshot
  const result = await client.computers.screenshot(id);
  console.log("Screenshot:", result.result?.screenshot_url);
} finally {
  await client.computers.delete(id);
}

Finding coordinates

Coordinates are screenshot pixel positions. To find where to click:

1. Take a screenshot: computer.screenshot()
2. Open the screenshot URL in your browser.
3. Note the pixel coordinates of the element you want to interact with.
4. Pass those coordinates to click() or double_click(). To enter text, click the field first and then call type(), as in the full example.
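Clicking the middle of an element is usually more forgiving than picking a single pixel by eye. The following is a minimal sketch, assuming you have read the element's left, top, right, and bottom edges off the screenshot. center_of is a hypothetical helper for illustration and is not part of the SDK.

```python
def center_of(left: int, top: int, right: int, bottom: int) -> tuple[int, int]:
    """Return the integer center of a box given in screenshot pixels."""
    if right < left or bottom < top:
        raise ValueError("box edges are out of order")
    return (left + right) // 2, (top + bottom) // 2


# Example: a search box spanning x 440..840 and y 340..380 in the screenshot
x, y = center_of(440, 340, 840, 380)
print(x, y)  # 640 360
```

The resulting pair can then be passed to a click call such as computer.click(x, y).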

Working with page context

Every action can return page context with viewport state. Use the low-level execute() method to get it:

# `session` is a previously created browser session; its id identifies it in low-level calls
result = client.computers.execute(session.id, action={
    "type": "screenshot",
    "include_context": True,
})

ctx = result.page_context
print(f"URL: {ctx.url}")
print(f"Title: {ctx.title}")
print(f"Viewport: {ctx.viewport_width}x{ctx.viewport_height}")
print(f"Page size: {ctx.page_width}x{ctx.page_height}")
print(f"Scroll position: ({ctx.scroll_x}, {ctx.scroll_y})")

const result = await client.computers.execute(id, {
  action: {
    type: "screenshot",
    include_context: true,
  },
});

const ctx = result.page_context;
console.log(`URL: ${ctx?.url}`);
console.log(`Title: ${ctx?.title}`);
console.log(`Viewport: ${ctx?.viewport_width}x${ctx?.viewport_height}`);
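One use for these fields is deciding whether there is anything left to scroll. The sketch below assumes that page_height, viewport_height, and scroll_y are all expressed in the same pixel units. PageContext here is a local stand-in that reuses the field names shown above so the example runs without the SDK, and remaining_scroll is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class PageContext:
    """Stand-in with the same field names as the page_context shown above."""
    viewport_height: int
    page_height: int
    scroll_y: int


def remaining_scroll(ctx) -> int:
    """Pixels of page still below the bottom edge of the viewport."""
    visible_bottom = ctx.scroll_y + ctx.viewport_height
    return max(0, ctx.page_height - visible_bottom)


ctx = PageContext(viewport_height=720, page_height=3000, scroll_y=500)
print(remaining_scroll(ctx))  # 3000 - (500 + 720) = 1780
```

A result of 0 would suggest the viewport already shows the bottom of the page, so further scroll calls are unlikely to reveal new content.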

Handling waits

There’s no automatic wait-for-page-load. Use explicit waits:

computer.navigate("https://example.com")
computer.wait(2)  # Wait 2 seconds for the page to load

computer.click(100, 200)
computer.wait(1)  # Wait for any animations or network requests

await client.computers.navigate(id, { url: "https://example.com" });
// Use a simple delay
await new Promise((r) => setTimeout(r, 2000));

await client.computers.click(id, { x: 100, y: 200 });
await new Promise((r) => setTimeout(r, 1000));
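Fixed delays can be too short on a slow page and wasteful on a fast one. An alternative is to poll for a condition with a timeout. The following is a minimal sketch of that pattern. wait_until is a hypothetical helper, not an SDK method, and the page_ready function is a stand-in condition so the example runs on its own.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Call check() repeatedly until it returns True or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in condition that becomes true on the third poll
polls = {"count": 0}


def page_ready() -> bool:
    polls["count"] += 1
    return polls["count"] >= 3


print(wait_until(page_ready, timeout=2.0, interval=0.01))  # True
```

With a live session, the condition could be built from the calls in the full example, for instance checking whether "Ada Lovelace" appears in computer.get_html_content(computer.html()).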

Error handling

import tzafon

try:
    with client.computer.create(kind="browser") as computer:
        computer.navigate("https://example.com")
        result = computer.click(100, 200)
        if result.error_message:
            print(f"Action failed: {result.error_message}")
except tzafon.AuthenticationError:
    print("Invalid API key")
except tzafon.RateLimitError:
    print("Rate limited — slow down")
except tzafon.APIError as e:
    print(f"API error {e.status_code}: {e.message}")

try {
  const result = await client.computers.click(id, { x: 100, y: 200 });
  if (result.error_message) {
    console.error(`Action failed: ${result.error_message}`);
  }
} catch (err) {
  if (err instanceof Lightcone.AuthenticationError) {
    console.error("Invalid API key");
  } else if (err instanceof Lightcone.RateLimitError) {
    console.error("Rate limited");
  } else if (err instanceof Lightcone.APIError) {
    console.error(`API error ${err.status}: ${err.message}`);
  } else {
    throw err; // Not an SDK error: let it propagate
  }
}
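Rate-limit errors are often transient, so one option is to retry with an increasing delay instead of failing immediately. The sketch below shows a generic exponential backoff wrapper. with_retries is a hypothetical helper, and FakeRateLimitError stands in for tzafon.RateLimitError so the example runs without the SDK or network access.

```python
import time


class FakeRateLimitError(Exception):
    """Stand-in for tzafon.RateLimitError so the sketch runs without the SDK."""


def with_retries(action, retry_on, attempts=3, base_delay=1.0):
    """Run action(); on retry_on, sleep base_delay * 2**n and try again."""
    for attempt in range(attempts):
        try:
            return action()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error
            time.sleep(base_delay * 2 ** attempt)


calls = []


def flaky():
    """Fail twice with the stand-in error, then succeed."""
    calls.append(1)
    if len(calls) < 3:
        raise FakeRateLimitError("slow down")
    return "ok"


print(with_retries(flaky, FakeRateLimitError, attempts=3, base_delay=0.01))  # ok
```

In real use, the action would be a callable wrapping an SDK call and retry_on would be the SDK's rate-limit exception class. The delay and attempt count here are illustrative values, not documented limits.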

Tips

Start with a screenshot to understand the page layout before clicking
Use wait() after navigation and clicks to let pages settle
Check error_message on every action result for graceful error handling
Set use_advanced_proxy: true when automating sites with bot detection
Use persistent: true to keep login sessions across runs
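The tip about checking error_message on every result can be wrapped in a small helper so that a failed action stops the script instead of being silently ignored. This is a sketch under the assumption that every action result exposes an error_message attribute, as in the error handling example. ensure_ok and ActionFailed are hypothetical names, and the results below are stand-ins for real SDK results.

```python
from types import SimpleNamespace


class ActionFailed(RuntimeError):
    """Hypothetical exception raised when a result carries an error_message."""


def ensure_ok(result):
    """Return result unchanged, or raise if its error_message is set."""
    message = getattr(result, "error_message", None)
    if message:
        raise ActionFailed(message)
    return result


# Stand-in results; real ones would come from calls such as computer.click(...)
ok = SimpleNamespace(error_message=None)
failed = SimpleNamespace(error_message="element not found")

ensure_ok(ok)
try:
    ensure_ok(failed)
except ActionFailed as exc:
    print(f"Action failed: {exc}")
```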
