Tutorial: Automate a Form with AI
Build an AI agent that fills and submits a web form using natural language instructions.
In this tutorial, you’ll build a script that uses an AI agent to fill out a web form — no hardcoded coordinates or CSS selectors needed. The agent sees the page through screenshots and figures out where to click and what to type. You’ll learn how to write effective agent instructions, stream progress in real time, and verify results.
Prerequisites: Complete the Quickstart and have TZAFON_API_KEY set in your environment.
Time: About 10 minutes.
What you’ll build
A script that:
- Starts an AI agent with form-filling instructions
- Streams the agent’s progress so you can watch it work
- Verifies the form was submitted correctly
- Handles failure by adding more detail to the instructions
We’ll use httpbin.org/forms/post, a safe practice form.
Step 1: Start an agent with form instructions
The simplest approach is to give the agent clear instructions and let it figure out the form:
Python:

    from tzafon import Lightcone

    client = Lightcone()

    for event in client.agent.tasks.start_stream(
        instruction=(
            "Go to https://httpbin.org/forms/post. "
            "Fill in the form with these values: "
            "Customer name: Jane Doe, "
            "Telephone: 555-0123, "
            "E-mail: jane@example.com, "
            "Size: Large, "
            "Topping: Bacon, "
            "Topping: Cheese. "
            "Submit the form. "
            "You're done when you see the JSON response showing the submitted data."
        ),
        kind="browser",
        max_steps=15,
    ):
        print(event)

TypeScript:

    import Lightcone from "@tzafon/lightcone";

    const client = new Lightcone();

    const stream = await client.agent.tasks.startStream({
      instruction:
        "Go to https://httpbin.org/forms/post. " +
        "Fill in the form with these values: " +
        "Customer name: Jane Doe, " +
        "Telephone: 555-0123, " +
        "E-mail: jane@example.com, " +
        "Size: Large, " +
        "Topping: Bacon, " +
        "Topping: Cheese. " +
        "Submit the form. " +
        "You're done when you see the JSON response showing the submitted data.",
      kind: "browser",
      max_steps: 15,
    });

    for await (const event of stream) {
      console.log(event);
    }

Run this and watch the agent work. You’ll see streaming events as it navigates, clicks form fields, types values, selects radio buttons and checkboxes, and submits.
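Raw printed events can be hard to follow on a longer run. The sketch below numbers and timestamps each event as it arrives and keeps a log for later inspection. It treats every event as an opaque object, because the event fields are not described in this tutorial; `watch_stream` is a hypothetical helper, not part of the SDK, and it is demonstrated here on a plain list of strings standing in for the stream.

```python
import time
from typing import Any, Callable, Iterable, List, Optional, Tuple

LogEntry = Tuple[int, float, Any]  # (step number, seconds since start, event)


def watch_stream(
    events: Iterable[Any],
    on_event: Optional[Callable[[int, float, Any], None]] = None,
) -> List[LogEntry]:
    """Number and time each event, report it, and return the full log.

    Hypothetical helper: events are treated as opaque objects because
    their fields are not documented in this tutorial.
    """
    started = time.monotonic()
    log: List[LogEntry] = []
    for step, event in enumerate(events, start=1):
        elapsed = time.monotonic() - started
        log.append((step, elapsed, event))
        if on_event is not None:
            on_event(step, elapsed, event)
        else:
            print(f"[{step:02d} +{elapsed:5.1f}s] {event}")
    return log


# Stand-in events; a real run would pass the stream from Step 1 instead.
demo_log = watch_stream(["navigate", "click", "type"])
print(f"{len(demo_log)} events received")
```

With the SDK, the same helper would presumably wrap the call from Step 1, for example `watch_stream(client.agent.tasks.start_stream(...))`, since that call is iterated the same way in the example above.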
Step 2: Use fire-and-poll for background execution
If you don’t need real-time streaming, start the task in the background and check the result later:
Python:

    import time

    task = client.agent.tasks.start(
        instruction=(
            "Go to https://httpbin.org/forms/post. "
            "Fill in: Customer name: Jane Doe, Telephone: 555-0123. "
            "Submit the form."
        ),
        kind="browser",
        max_steps=15,
    )
    print(f"Task started: {task.task_id}")

    # Wait for completion
    while True:
        status = client.agent.tasks.retrieve_status(task.task_id)
        if status.status in ("completed", "failed"):
            print(f"Done! Status: {status.status}, exit code: {status.exit_code}")
            break
        time.sleep(3)

TypeScript:

    const task = await client.agent.tasks.start({
      instruction:
        "Go to https://httpbin.org/forms/post. " +
        "Fill in: Customer name: Jane Doe, Telephone: 555-0123. " +
        "Submit the form.",
      kind: "browser",
      max_steps: 15,
    });
    console.log(`Task started: ${task.task_id}`);

    // Wait for completion
    while (true) {
      const status = await client.agent.tasks.retrieveStatus(task.task_id!);
      if (status.status === "completed" || status.status === "failed") {
        console.log(`Done! Status: ${status.status}, exit code: ${status.exit_code}`);
        break;
      }
      await new Promise((r) => setTimeout(r, 3000));
    }

An exit code of 0 means success. Non-zero means the agent encountered an error or couldn’t complete the task.
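The polling loop above runs until the task reports a terminal status, so it can wait indefinitely if that never happens. One way to bound the wait is sketched below. `wait_for_task`, its `timeout_s` and `interval_s` parameters, and the injected `get_status` callable are all hypothetical names introduced for illustration; the only things taken from this tutorial are the `status` and `exit_code` attributes and the two terminal values `"completed"` and `"failed"`. The demo uses fake status objects so it runs without the SDK.

```python
import time
from types import SimpleNamespace
from typing import Any, Callable

# The two terminal states the tutorial's polling loop checks for.
TERMINAL_STATES = ("completed", "failed")


def wait_for_task(
    get_status: Callable[[], Any],
    timeout_s: float = 120.0,
    interval_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll get_status() until a terminal state is seen or timeout_s elapses.

    Hypothetical helper: get_status is any zero-argument callable that
    returns an object with a .status attribute.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status.status in TERMINAL_STATES:
            return status
        # Give up if the next sleep would carry us past the deadline.
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError(f"task still {status.status!r} after {timeout_s}s")
        sleep(interval_s)


# Fake statuses standing in for retrieve_status() responses.
fake = iter([
    SimpleNamespace(status="running", exit_code=None),
    SimpleNamespace(status="completed", exit_code=0),
])
final = wait_for_task(lambda: next(fake), timeout_s=5, interval_s=0)
print(final.status, final.exit_code)
```

Against the real client, the callable would be something like `lambda: client.agent.tasks.retrieve_status(task.task_id)`, and the caller would check `exit_code == 0` on the returned status.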
Step 3: Verify with a manual check
After the agent finishes, create a new session to verify the result, or use a persistent session to inspect where the agent left off:
Python:

    # Start with persistence so we can inspect afterward
    task = client.agent.tasks.start(
        instruction=(
            "Go to https://httpbin.org/forms/post. "
            "Fill in: Customer name: Jane Doe, Telephone: 555-0123. "
            "Submit the form."
        ),
        kind="browser",
        max_steps=15,
        persistent=True,
    )

    # ... wait for completion ...

    # Inspect the final state
    status = client.agent.tasks.retrieve_status(task.task_id)
    print(f"Status: {status.status}")

TypeScript:

    // Start with persistence so we can inspect afterward
    const task = await client.agent.tasks.start({
      instruction:
        "Go to https://httpbin.org/forms/post. " +
        "Fill in: Customer name: Jane Doe, Telephone: 555-0123. " +
        "Submit the form.",
      kind: "browser",
      max_steps: 15,
      persistent: true,
    });

    // ... wait for completion ...

    // Inspect the final state
    const status = await client.agent.tasks.retrieveStatus(task.task_id!);
    console.log(`Status: ${status.status}`);

Step 4: Steer a stuck agent
If the agent gets confused, you can redirect it mid-task with inject_message:
Python:

    import time

    task = client.agent.tasks.start(
        instruction="Go to https://httpbin.org/forms/post and fill in the form.",
        kind="browser",
        max_steps=30,
    )

    # Give it a few seconds to start
    time.sleep(8)

    # Check if it needs help
    status = client.agent.tasks.retrieve_status(task.task_id)
    if status.status == "running":
        client.agent.tasks.inject_message(
            task.task_id,
            message=(
                "For the customer name, type 'Jane Doe'. "
                "For telephone, type '555-0123'. "
                "Then click the Submit button."
            ),
        )
        print("Sent clarifying instructions")

TypeScript:

    const task = await client.agent.tasks.start({
      instruction: "Go to https://httpbin.org/forms/post and fill in the form.",
      kind: "browser",
      max_steps: 30,
    });

    // Give it a few seconds to start
    await new Promise((r) => setTimeout(r, 8000));

    // Check if it needs help
    const status = await client.agent.tasks.retrieveStatus(task.task_id!);
    if (status.status === "running") {
      await client.agent.tasks.injectMessage(task.task_id!, {
        message:
          "For the customer name, type 'Jane Doe'. " +
          "For telephone, type '555-0123'. " +
          "Then click the Submit button.",
      });
      console.log("Sent clarifying instructions");
    }

Tips for writing form instructions
| Principle | Example |
|---|---|
| Name every field | “Customer name: Jane Doe” not “fill in the first field” |
| Specify selections | “Size: Large” not “pick a size” |
| Define done | “You’re done when you see the confirmation page” |
| Be explicit about checkboxes | “Check the Bacon and Cheese toppings” |
| Order matters | List fields in the order they appear on the form |
| Set max_steps | Simple forms: 10-15, multi-step wizards: 20-30 |
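The principles in the table can be applied mechanically when the field values come from data rather than a hand-written string. The sketch below assembles an instruction that names every field in the order given, lists checkboxes explicitly, and ends with a done condition. `build_form_instruction` and its parameters are hypothetical names for illustration, and the exact wording it produces is one reasonable phrasing, not a format the agent requires.

```python
from typing import Iterable, Sequence, Tuple


def build_form_instruction(
    url: str,
    fields: Sequence[Tuple[str, str]],
    checked: Iterable[str] = (),
    done_when: str = "you see the confirmation page",
) -> str:
    """Build an agent instruction from (label, value) pairs.

    Hypothetical helper. Fields are emitted in the order given, so pass
    them in the order they appear on the form.
    """
    if not fields:
        raise ValueError("name at least one field")
    sentences = [f"Go to {url}."]
    named = ", ".join(f"{label}: {value}" for label, value in fields)
    sentences.append(f"Fill in the form with these values: {named}.")
    boxes = list(checked)
    if boxes:
        # Spell out each checkbox instead of leaving the choice to the agent.
        sentences.append(f"Check these boxes: {', '.join(boxes)}.")
    sentences.append("Submit the form.")
    sentences.append(f"You're done when {done_when}.")
    return " ".join(sentences)


instruction = build_form_instruction(
    "https://httpbin.org/forms/post",
    [("Customer name", "Jane Doe"), ("Telephone", "555-0123"), ("Size", "Large")],
    checked=["Bacon", "Cheese"],
    done_when="you see the JSON response showing the submitted data",
)
print(instruction)
```

The resulting string can be passed as the `instruction` argument in any of the steps above.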
What you learned
In this tutorial, you:
- Used an AI agent to fill a form without writing coordinate-based code
- Streamed agent progress to watch it work in real time
- Used fire-and-poll for background form submission
- Inspected results with persistent sessions
- Steered a stuck agent with mid-task message injection
Next steps
- Form automation: coordinate-based form filling for when you need precise control
- Run an agent: more agent patterns including pause/resume
- Agent Tasks: full configuration reference for agent tasks
- Best practices: writing effective agent instructions