Tutorial: Automate a Form with AI

Build an AI agent that fills and submits a web form using natural language instructions.

In this tutorial, you’ll build a script that uses an AI agent to fill out a web form — no hardcoded coordinates or CSS selectors needed. The agent sees the page through screenshots and figures out where to click and what to type. You’ll learn how to write effective agent instructions, stream progress in real time, and verify results.

Prerequisites: Complete the Quickstart and have TZAFON_API_KEY set in your environment.

Time: About 10 minutes.

You’ll build a script that:

  1. Starts an AI agent with form-filling instructions
  2. Streams the agent’s progress so you can watch it work
  3. Verifies the form was submitted correctly
  4. Handles failure by adding more detail to the instructions

We’ll use httpbin.org/forms/post, a safe practice form.

Step 1: Start an agent with form instructions

The simplest approach — give the agent clear instructions and let it figure out the form:

form_agent.py
from tzafon import Lightcone

client = Lightcone()

for event in client.agent.tasks.start_stream(
    instruction=(
        "Go to https://httpbin.org/forms/post. "
        "Fill in the form with these values: "
        "Customer name: Jane Doe, "
        "Telephone: 555-0123, "
        "E-mail: jane@example.com, "
        "Size: Large, "
        "Topping: Bacon, "
        "Topping: Cheese. "
        "Submit the form. "
        "You're done when you see the JSON response showing the submitted data."
    ),
    kind="browser",
    max_steps=15,
):
    print(event)

form_agent.ts
import Lightcone from "@tzafon/lightcone";

const client = new Lightcone();

const stream = await client.agent.tasks.startStream({
  instruction:
    "Go to https://httpbin.org/forms/post. " +
    "Fill in the form with these values: " +
    "Customer name: Jane Doe, " +
    "Telephone: 555-0123, " +
    "E-mail: jane@example.com, " +
    "Size: Large, " +
    "Topping: Bacon, " +
    "Topping: Cheese. " +
    "Submit the form. " +
    "You're done when you see the JSON response showing the submitted data.",
  kind: "browser",
  max_steps: 15,
});

for await (const event of stream) {
  console.log(event);
}

Run this and watch the agent work. You’ll see streaming events as it navigates, clicks form fields, types values, selects radio buttons and checkboxes, and submits.

Step 2: Use fire-and-poll for background execution

If you don’t need real-time streaming, start the task in the background and check the result later:

import time

task = client.agent.tasks.start(
    instruction=(
        "Go to https://httpbin.org/forms/post. "
        "Fill in: Customer name: Jane Doe, Telephone: 555-0123. "
        "Submit the form."
    ),
    kind="browser",
    max_steps=15,
)
print(f"Task started: {task.task_id}")

# Wait for completion
while True:
    status = client.agent.tasks.retrieve_status(task.task_id)
    if status.status in ("completed", "failed"):
        print(f"Done! Status: {status.status}, exit code: {status.exit_code}")
        break
    time.sleep(3)

const task = await client.agent.tasks.start({
  instruction:
    "Go to https://httpbin.org/forms/post. " +
    "Fill in: Customer name: Jane Doe, Telephone: 555-0123. " +
    "Submit the form.",
  kind: "browser",
  max_steps: 15,
});
console.log(`Task started: ${task.task_id}`);

// Wait for completion
while (true) {
  const status = await client.agent.tasks.retrieveStatus(task.task_id!);
  if (status.status === "completed" || status.status === "failed") {
    console.log(`Done! Status: ${status.status}, exit code: ${status.exit_code}`);
    break;
  }
  await new Promise((r) => setTimeout(r, 3000));
}

An exit code of 0 means success. Non-zero means the agent encountered an error or couldn’t complete the task.
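The polling loop above runs until the task reaches a terminal status, so a task that never finishes would block the script forever. Here is a minimal sketch of the same loop with a deadline. The helper name `wait_for_terminal` and its parameters are hypothetical and not part of the SDK; the status fetcher is passed in as a callable, so the sketch assumes nothing about the client beyond the two status strings used above.

```python
import time
from typing import Callable

# The two terminal statuses the tutorial's polling loop checks for.
TERMINAL_STATUSES = ("completed", "failed")


def wait_for_terminal(
    fetch_status: Callable[[], str],
    timeout_s: float = 120.0,
    interval_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll fetch_status() until it returns a terminal status or the deadline passes.

    Hypothetical helper: fetch_status is any zero-argument callable that
    returns the current status string.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"task still {status!r} after {timeout_s} seconds")
        sleep(interval_s)
```

With the client from this tutorial, the fetcher could be `lambda: client.agent.tasks.retrieve_status(task.task_id).status`. Injecting `sleep` and `clock` keeps the helper easy to exercise without real waiting.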

Step 3: Inspect results with a persistent session

After the agent finishes, you can create a new session to verify the result, or use a persistent session to inspect where the agent left off:

# Start with persistence so we can inspect afterward
task = client.agent.tasks.start(
    instruction=(
        "Go to https://httpbin.org/forms/post. "
        "Fill in: Customer name: Jane Doe, Telephone: 555-0123. "
        "Submit the form."
    ),
    kind="browser",
    max_steps=15,
    persistent=True,
)

# ... wait for completion ...

# Inspect the final state
status = client.agent.tasks.retrieve_status(task.task_id)
print(f"Status: {status.status}")

const task = await client.agent.tasks.start({
  instruction:
    "Go to https://httpbin.org/forms/post. " +
    "Fill in: Customer name: Jane Doe, Telephone: 555-0123. " +
    "Submit the form.",
  kind: "browser",
  max_steps: 15,
  persistent: true,
});

// ... wait for completion ...

const status = await client.agent.tasks.retrieveStatus(task.task_id!);
console.log(`Status: ${status.status}`);
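The task status tells you the agent finished, not that the right values were submitted. Once you have the JSON that httpbin shows after submission, a small comparison can confirm the values. The sketch below makes several assumptions: that httpbin echoes submitted fields under a `form` key, that the practice form's field names are `custname`, `custtel`, `custemail`, `size`, and `topping`, and that multiple checked toppings come back as a list. The helper `form_mismatches` is hypothetical, and how you retrieve the JSON from the session is not covered here.

```python
from typing import Any, Mapping


def form_mismatches(response: Mapping[str, Any], expected: Mapping[str, Any]) -> dict:
    """Return {field: (expected, actual)} for every field that differs.

    Hypothetical helper. Assumes the echoed fields live under response["form"].
    """
    form = response.get("form", {})
    mismatches = {}
    for field, want in expected.items():
        got = form.get(field)
        if isinstance(want, (list, tuple, set)):
            # Multi-valued fields (checkboxes): compare as sets, ignoring order.
            if isinstance(got, list):
                got_values = got
            elif got is None:
                got_values = []
            else:
                got_values = [got]
            if set(got_values) != set(want):
                mismatches[field] = (want, got)
        elif got != want:
            mismatches[field] = (want, got)
    return mismatches


# Field names below are assumptions about the practice form's HTML.
expected = {
    "custname": "Jane Doe",
    "custtel": "555-0123",
    "custemail": "jane@example.com",
    "size": "large",
    "topping": ["bacon", "cheese"],
}

# Hand-written sample in the assumed response shape, not real output.
sample_response = {
    "form": {
        "custname": "Jane Doe",
        "custtel": "555-0123",
        "custemail": "jane@example.com",
        "size": "large",
        "topping": ["cheese", "bacon"],
    }
}

print(form_mismatches(sample_response, expected))  # prints {}
```

An empty result means every expected field matched; any entry names the field, the value you wanted, and the value that was actually submitted.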

Step 4: Steer a stuck agent with message injection

If the agent gets confused, you can redirect it mid-task with inject_message:

import time

task = client.agent.tasks.start(
    instruction="Go to https://httpbin.org/forms/post and fill in the form.",
    kind="browser",
    max_steps=30,
)

# Give it a few seconds to start
time.sleep(8)

# Check if it needs help
status = client.agent.tasks.retrieve_status(task.task_id)
if status.status == "running":
    client.agent.tasks.inject_message(
        task.task_id,
        message=(
            "For the customer name, type 'Jane Doe'. "
            "For telephone, type '555-0123'. "
            "Then click the Submit button."
        ),
    )
    print("Sent clarifying instructions")

const task = await client.agent.tasks.start({
  instruction: "Go to https://httpbin.org/forms/post and fill in the form.",
  kind: "browser",
  max_steps: 30,
});

// Give it a few seconds
await new Promise((r) => setTimeout(r, 8000));

const status = await client.agent.tasks.retrieveStatus(task.task_id!);
if (status.status === "running") {
  await client.agent.tasks.injectMessage(task.task_id!, {
    message:
      "For the customer name, type 'Jane Doe'. " +
      "For telephone, type '555-0123'. " +
      "Then click the Submit button.",
  });
  console.log("Sent clarifying instructions");
}
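Message injection helps while a task is still running. If a task has already ended with a non-zero exit code, one option is to rerun it with a more detailed instruction. The sketch below shows that retry pattern under stated assumptions: `run_with_escalation` is a hypothetical helper, and `run_task` is a callable you would write yourself (for example, one that starts a task, polls it to completion, and returns its exit code).

```python
from typing import Callable, Optional, Sequence


def run_with_escalation(
    run_task: Callable[[str], int],
    instructions: Sequence[str],
) -> Optional[str]:
    """Try each instruction in order and return the first that exits with 0.

    Hypothetical helper: run_task takes an instruction string and returns the
    task's exit code. Returns None if every attempt fails.
    """
    for instruction in instructions:
        if run_task(instruction) == 0:
            return instruction
    return None


# Ordered from least to most detailed.
attempts = [
    "Go to https://httpbin.org/forms/post and fill in the form.",
    (
        "Go to https://httpbin.org/forms/post. "
        "Fill in: Customer name: Jane Doe, Telephone: 555-0123. "
        "Submit the form. "
        "You're done when you see the JSON response showing the submitted data."
    ),
]
```

Ordering the instructions from terse to detailed keeps the common case short while still giving the agent field-by-field guidance when the first attempt fails.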
Tips for writing form instructions

| Principle | Example |
| --- | --- |
| Name every field | “Customer name: Jane Doe” not “fill in the first field” |
| Specify selections | “Size: Large” not “pick a size” |
| Define done | “You’re done when you see the confirmation page” |
| Be explicit about checkboxes | “Check the Bacon and Cheese toppings” |
| Order matters | List fields in the order they appear on the form |
| Set max_steps | Simple forms: 10-15, multi-step wizards: 20-30 |
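Several of these principles can be applied mechanically when the field values come from data rather than a hand-written string. The following is a minimal sketch of one way to do that. The helper `build_form_instruction` and its parameters are hypothetical and not part of the SDK; it names every field, keeps fields in the order given, spells out checkbox selections, and ends with a done condition.

```python
from typing import Mapping, Sequence, Union

FieldValue = Union[str, Sequence[str]]


def build_form_instruction(
    url: str,
    fields: Mapping[str, FieldValue],
    done_when: str,
) -> str:
    """Build an agent instruction that names each field in form order.

    Hypothetical helper. A string value is typed or selected as-is; a list
    of strings is treated as checkboxes to check.
    """
    entries = []
    for label, value in fields.items():  # dicts preserve insertion order
        if isinstance(value, str):
            entries.append(f"{label}: {value}")
        else:
            entries.append(f"{label}: check " + " and ".join(value))
    parts = [
        f"Go to {url}.",
        "Fill in the form with these values:",
        ", ".join(entries) + ".",
        "Submit the form.",
        f"You're done when {done_when}.",
    ]
    return " ".join(parts)


instruction = build_form_instruction(
    "https://httpbin.org/forms/post",
    {
        "Customer name": "Jane Doe",
        "Telephone": "555-0123",
        "E-mail": "jane@example.com",
        "Size": "Large",
        "Topping": ["Bacon", "Cheese"],
    },
    done_when="you see the JSON response showing the submitted data",
)
print(instruction)
```

The resulting string can be passed as the `instruction` argument in any of the examples above.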

In this tutorial, you:

  1. Used an AI agent to fill a form without writing coordinate-based code
  2. Streamed agent progress to watch it work in real time
  3. Used fire-and-poll for background form submission
  4. Inspected results with persistent sessions
  5. Steered a stuck agent with mid-task message injection

Related guides:

  • Form automation: coordinate-based form filling for when you need precise control
  • Run an agent: more agent patterns including pause/resume
  • Agent Tasks: full configuration reference for agent tasks
  • Best practices: writing effective agent instructions