Software Testing
Northstar tests your application by using it like a real person — no selectors, no scripts to maintain.
Traditional UI testing is fragile. Selenium scripts break when a button moves, a class name changes, or a page is redesigned. Northstar tests your application the way a human would: by looking at the screen and interacting with what it sees. There are no selectors to maintain and no scripts to update when the UI changes.
Test a login flow
Give Northstar a test scenario in plain language:

```python
from tzafon import Lightcone

client = Lightcone()

for event in client.agent.tasks.start_stream(
    instruction=(
        "Go to https://app.example.com/login. "
        "Enter username 'testuser@example.com' and password 'test123'. "
        "Click the login button. "
        "Verify that the dashboard loads and shows a welcome message. "
        "If login fails, report the error message you see."
    ),
    kind="desktop",
    max_steps=15,
):
    print(event)
```

```typescript
import Lightcone from "@tzafon/lightcone";

const client = new Lightcone();

const stream = await client.agent.tasks.startStream({
  instruction:
    "Go to https://app.example.com/login. " +
    "Enter username 'testuser@example.com' and password 'test123'. " +
    "Click the login button. " +
    "Verify that the dashboard loads and shows a welcome message. " +
    "If login fails, report the error message you see.",
  kind: "desktop",
  max_steps: 15,
});

for await (const event of stream) {
  console.log(event);
}
```

Northstar navigates to the login page, finds the fields, enters the credentials, submits, and reports whether it worked, all from a natural-language description.
Test a multi-step workflow
```python
for event in client.agent.tasks.start_stream(
    instruction=(
        "Go to https://app.example.com. Log in with 'admin@example.com' / 'admin123'. "
        "Navigate to Settings > Billing. "
        "Verify that the current plan shows 'Pro'. "
        "Click 'Update Payment Method'. "
        "Verify that the payment form loads with credit card fields visible. "
        "Do NOT submit the form; just confirm the form is present and functional."
    ),
    kind="desktop",
    max_steps=30,
):
    print(event)
```

```typescript
const stream = await client.agent.tasks.startStream({
  instruction:
    "Go to https://app.example.com. Log in with 'admin@example.com' / 'admin123'. " +
    "Navigate to Settings > Billing. " +
    "Verify that the current plan shows 'Pro'. " +
    "Click 'Update Payment Method'. " +
    "Verify that the payment form loads with credit card fields visible. " +
    "Do NOT submit the form; just confirm the form is present and functional.",
  kind: "desktop",
  max_steps: 30,
});

for await (const event of stream) {
  console.log(event);
}
```

Visual verification with the Responses API
For more control, use the Responses API to check specific visual states:
```python
with client.computer.create(kind="desktop") as computer:
    client.computers.exec.sync(computer.id, command="firefox https://app.example.com &")
    computer.wait(5)

    screenshot_url = computer.get_screenshot_url(computer.screenshot())

    # Ask Northstar to evaluate what it sees. With no tools, it responds with text.
    response = client.responses.create(
        model="tzafon.northstar-cua-fast",
        input=[{
            "role": "user",
            "content": [
                {
                    "type": "input_text",
                    "text": (
                        "Does this page look correct? Check for: "
                        "1) Logo is visible "
                        "2) Navigation bar has Home, Products, About links "
                        "3) No error messages or broken images. "
                        "Report any issues."
                    ),
                },
                {"type": "input_image", "image_url": screenshot_url, "detail": "auto"},
            ],
        }],
        # No tools: this forces a text assessment
    )

    for item in response.output or []:
        if item.type == "message":
            for block in item.content or []:
                print(block.text)
```

```typescript
const computer = await client.computers.create({ kind: "desktop" });
const id = computer.id!;

try {
  await client.computers.exec.sync(id, { command: "firefox https://app.example.com &" });
  await new Promise((r) => setTimeout(r, 5000));

  const screenshot = await client.computers.screenshot(id);
  const screenshotUrl = screenshot.result?.screenshot_url as string;

  const response = await client.responses.create({
    model: "tzafon.northstar-cua-fast",
    input: [{
      role: "user",
      content: [
        {
          type: "input_text",
          text:
            "Does this page look correct? Check for: " +
            "1) Logo is visible " +
            "2) Navigation bar has Home, Products, About links " +
            "3) No error messages or broken images. " +
            "Report any issues.",
        },
        { type: "input_image", image_url: screenshotUrl, detail: "auto" },
      ],
    }],
  });

  for (const item of response.output ?? []) {
    if (item.type === "message") {
      for (const block of item.content ?? []) {
        console.log(block.text);
      }
    }
  }
} finally {
  await client.computers.delete(id);
}
```

Why Northstar for testing
| | Traditional UI tests | Northstar |
|---|---|---|
| Selectors | CSS/XPath; break on UI changes | None; Northstar sees the screen |
| Maintenance | Update scripts every redesign | Instructions stay the same |
| Cross-app | Separate frameworks per app | One approach for any application |
| Desktop apps | Requires platform-specific tools | Works on any GUI |
| Verification | Assert on DOM state | “Does this look right?” in plain language |
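A plain-language answer to "Does this look right?" still has to become a pass or fail result before a test runner can act on it. One possible approach, sketched below, is to ask the model to finish with a fixed verdict line and parse that line. The `VERDICT: PASS` / `VERDICT: FAIL` convention, `VERDICT_SUFFIX`, and `parse_verdict` are assumptions made for this example, not part of the Responses API, and the model is not guaranteed to follow the convention.

```python
import re

# Hypothetical suffix to append to the "input_text" prompt.
VERDICT_SUFFIX = (
    " End your answer with a final line that reads exactly "
    "'VERDICT: PASS' or 'VERDICT: FAIL'."
)

# Matches a line consisting only of the verdict, ignoring case and padding.
_VERDICT_RE = re.compile(
    r"^\s*VERDICT:\s*(PASS|FAIL)\s*$", re.IGNORECASE | re.MULTILINE
)


def parse_verdict(text):
    """Return True for PASS, False for FAIL, or None if no verdict line exists.

    If the text contains several verdict lines, the last one wins.
    """
    matches = _VERDICT_RE.findall(text)
    if not matches:
        return None
    return matches[-1].upper() == "PASS"


# Example with hand-written text standing in for a model answer.
sample = "Logo is visible. Navigation bar is complete.\nVERDICT: PASS"
print(parse_verdict(sample))
```

In a test, `parse_verdict` would be called on the text of each message block returned by `client.responses.create(...)`. Treating a `None` result as a failure, or as inconclusive, keeps a missing verdict from passing silently.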
See also
- Tasks: run test scenarios with natural language instructions
- Responses API: visual verification with screenshots
- Computers: environment setup and configuration
- visualize.py: save annotated screenshots of every step for visual test reports
- qa.py: visual + structural testing with parallel analysis