Software Testing

Northstar tests your application by using it like a real person — no selectors, no scripts to maintain.

Traditional UI testing is fragile. Selenium scripts break when a button moves, a class name changes, or a page is redesigned. Northstar tests your application the way a human would, by looking at the screen and interacting with what it sees, so there are no selectors to maintain and no scripts to update when the UI changes.

Test a login flow

Give Northstar a test scenario in plain language:

from tzafon import Lightcone

client = Lightcone()

for event in client.agent.tasks.start_stream(
    instruction=(
        "Go to https://app.example.com/login. "
        "Enter username 'testuser@example.com' and password 'test123'. "
        "Click the login button. "
        "Verify that the dashboard loads and shows a welcome message. "
        "If login fails, report the error message you see."
    ),
    kind="desktop",
    max_steps=15,
):
    print(event)

import Lightcone from "@tzafon/lightcone";

const client = new Lightcone();

const stream = await client.agent.tasks.startStream({
  instruction:
    "Go to https://app.example.com/login. " +
    "Enter username 'testuser@example.com' and password 'test123'. " +
    "Click the login button. " +
    "Verify that the dashboard loads and shows a welcome message. " +
    "If login fails, report the error message you see.",
  kind: "desktop",
  max_steps: 15,
});

for await (const event of stream) {
  console.log(event);
}

Northstar navigates to the login page, finds the fields, enters the credentials, submits, and reports whether it worked — all from a natural language description.
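
Because the scenario is a single plain-language string, longer tests can be assembled from a list of steps. The sketch below shows one way to do that. `build_instruction` is a hypothetical helper, not part of the Lightcone SDK; it only joins steps into the string form used in the examples above.

```python
def build_instruction(steps):
    """Join plain-language steps into one instruction string.

    Hypothetical helper for illustration; not part of the Lightcone SDK.
    """
    cleaned = []
    for step in steps:
        step = step.strip()
        if not step:
            continue  # skip empty entries
        if step[-1] not in ".!?":
            step += "."  # end every step with sentence punctuation
        cleaned.append(step)
    return " ".join(cleaned)


login_steps = [
    "Go to https://app.example.com/login",
    "Enter username 'testuser@example.com' and password 'test123'",
    "Click the login button",
    "Verify that the dashboard loads and shows a welcome message",
    "If login fails, report the error message you see",
]
instruction = build_instruction(login_steps)
print(instruction)
```

The resulting string can be passed as the `instruction` argument shown above. Keeping steps in a list makes it easier to reuse a shared login prefix across several scenarios.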

Test a multi-step workflow

for event in client.agent.tasks.start_stream(
    instruction=(
        "Go to https://app.example.com. Log in with 'admin@example.com' / 'admin123'. "
        "Navigate to Settings > Billing. "
        "Verify that the current plan shows 'Pro'. "
        "Click 'Update Payment Method'. "
        "Verify that the payment form loads with credit card fields visible. "
        "Do NOT submit the form — just confirm the form is present and functional."
    ),
    kind="desktop",
    max_steps=30,
):
    print(event)

const stream = await client.agent.tasks.startStream({
  instruction:
    "Go to https://app.example.com. Log in with 'admin@example.com' / 'admin123'. " +
    "Navigate to Settings > Billing. " +
    "Verify that the current plan shows 'Pro'. " +
    "Click 'Update Payment Method'. " +
    "Verify that the payment form loads with credit card fields visible. " +
    "Do NOT submit the form — just confirm the form is present and functional.",
  kind: "desktop",
  max_steps: 30,
});

for await (const event of stream) {
  console.log(event);
}
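
In a CI run it can be more useful to keep the streamed events than to print them, so they can be attached to a test report. The following is a minimal sketch under stated assumptions: `collect_events` is a hypothetical helper, it accepts any iterable (such as the stream returned by `start_stream`), and it treats events as opaque because their fields are not described on this page. A stand-in stream is used so the example runs without the SDK.

```python
def collect_events(stream, max_events=1000):
    """Drain an event stream into a list, stopping at max_events.

    `stream` is any iterable of events. Events are stored as-is;
    no assumption is made about their structure.
    """
    events = []
    for event in stream:
        events.append(event)
        if len(events) >= max_events:
            break  # safety cap so a runaway stream cannot grow unbounded
    return events


# Stand-in stream so the sketch runs without the SDK or network access.
fake_stream = iter(["step 1", "step 2", "done"])
events = collect_events(fake_stream)
print(f"{len(events)} events, last: {events[-1]!r}")
```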

Visual verification with the Responses API

For more control, use the Responses API to check specific visual states:

with client.computer.create(kind="desktop") as computer:
    client.computers.exec.sync(computer.id, command="firefox https://app.example.com &")
    computer.wait(5)

    screenshot_url = computer.get_screenshot_url(computer.screenshot())

    # Ask Northstar to evaluate what it sees — no tools, so it responds with text
    response = client.responses.create(
        model="tzafon.northstar-cua-fast",
        input=[{
            "role": "user",
            "content": [
                {"type": "input_text", "text": "Does this page look correct? Check for: 1) Logo is visible 2) Navigation bar has Home, Products, About links 3) No error messages or broken images. Report any issues."},
                {"type": "input_image", "image_url": screenshot_url, "detail": "auto"},
            ],
        }],
        # No tools — forces a text assessment
    )

    for item in response.output or []:
        if item.type == "message":
            for block in item.content or []:
                print(block.text)

const computer = await client.computers.create({ kind: "desktop" });
const id = computer.id!;

try {
  await client.computers.exec.sync(id, { command: "firefox https://app.example.com &" });
  await new Promise((r) => setTimeout(r, 5000));

  const screenshot = await client.computers.screenshot(id);
  const screenshotUrl = screenshot.result?.screenshot_url as string;

  const response = await client.responses.create({
    model: "tzafon.northstar-cua-fast",
    input: [{
      role: "user",
      content: [
        { type: "input_text", text: "Does this page look correct? Check for: 1) Logo is visible 2) Navigation bar has Home, Products, About links 3) No error messages or broken images. Report any issues." },
        { type: "input_image", image_url: screenshotUrl, detail: "auto" },
      ],
    }],
  });

  for (const item of response.output ?? []) {
    if (item.type === "message") {
      for (const block of item.content ?? []) {
        console.log(block.text);
      }
    }
  }
} finally {
  await client.computers.delete(id);
}
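
A free-text assessment is hard to assert on in an automated suite. One option, which is an assumption here rather than documented Northstar behavior, is to add a sentence to the prompt such as "End your answer with a line 'VERDICT: PASS' or 'VERDICT: FAIL'" and parse that line from the returned text. The model may not always comply, so the hypothetical `parse_verdict` helper below returns `None` when no verdict line is found.

```python
import re

# Matches a standalone line such as "VERDICT: PASS" (case-insensitive).
VERDICT_RE = re.compile(
    r"^\s*VERDICT:\s*(PASS|FAIL)\s*$", re.IGNORECASE | re.MULTILINE
)


def parse_verdict(text):
    """Return True for PASS, False for FAIL, None if no verdict line exists.

    Hypothetical helper; the verdict convention comes from the prompt,
    not from the API.
    """
    matches = VERDICT_RE.findall(text)
    if not matches:
        return None
    # Use the last verdict line in case the model restates it.
    return matches[-1].upper() == "PASS"


sample = "Logo is visible. Navigation bar is complete.\nVERDICT: PASS"
print(parse_verdict(sample))
```

In a test, treating `None` as a failure is the safer default, since it means the assessment could not be interpreted.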

Why Northstar for testing

	Traditional UI tests	Northstar
Selectors	CSS/XPath — break on UI changes	None — Northstar sees the screen
Maintenance	Update scripts every redesign	Instructions stay the same
Cross-app	Separate frameworks per app	One approach for any application
Desktop apps	Requires platform-specific tools	Works on any GUI
Verification	Assert on DOM state	"Does this look right?" in plain language
