Tutorial: Automate a Form with AI

Build an AI agent that fills and submits a web form using natural language instructions.

In this tutorial, you’ll build a script that uses an AI agent to fill out a web form — no hardcoded coordinates or CSS selectors needed. The agent sees the page through screenshots and figures out where to click and what to type. You’ll learn how to write effective agent instructions, stream progress in real time, and verify results.

Prerequisites: Complete the Quickstart and have TZAFON_API_KEY set in your environment.

Time: About 10 minutes.

What you’ll build

A script that:

Starts an AI agent with form-filling instructions
Streams the agent’s progress so you can watch it work
Verifies the form was submitted correctly
Recovers from a stuck agent by sending more detailed instructions mid-task

We’ll use httpbin.org/forms/post, a safe practice form.

Step 1: Start an agent with form instructions

The simplest approach — give the agent clear instructions and let it figure out the form:

from tzafon import Lightcone

client = Lightcone()

for event in client.agent.tasks.start_stream(
    instruction=(
        "Go to https://httpbin.org/forms/post. "
        "Fill in the form with these values: "
        "Customer name: Jane Doe, "
        "Telephone: 555-0123, "
        "E-mail: jane@example.com, "
        "Size: Large, "
        "Topping: Bacon, "
        "Topping: Cheese. "
        "Submit the form. "
        "You're done when you see the JSON response showing the submitted data."
    ),
    kind="browser",
    max_steps=15,
):
    print(event)

import Lightcone from "@tzafon/lightcone";

const client = new Lightcone();

const stream = await client.agent.tasks.startStream({
  instruction:
    "Go to https://httpbin.org/forms/post. " +
    "Fill in the form with these values: " +
    "Customer name: Jane Doe, " +
    "Telephone: 555-0123, " +
    "E-mail: jane@example.com, " +
    "Size: Large, " +
    "Topping: Bacon, " +
    "Topping: Cheese. " +
    "Submit the form. " +
    "You're done when you see the JSON response showing the submitted data.",
  kind: "browser",
  max_steps: 15,
});

for await (const event of stream) {
  console.log(event);
}

Run this and watch the agent work. You’ll see streaming events as it navigates, clicks form fields, types values, selects radio buttons and checkboxes, and submits.

Step 2: Use fire-and-poll for background execution

If you don’t need real-time streaming, start the task in the background and check the result later:

import time

task = client.agent.tasks.start(
    instruction=(
        "Go to https://httpbin.org/forms/post. "
        "Fill in: Customer name: Jane Doe, Telephone: 555-0123. "
        "Submit the form."
    ),
    kind="browser",
    max_steps=15,
)
print(f"Task started: {task.task_id}")

# Wait for completion
while True:
    status = client.agent.tasks.retrieve_status(task.task_id)
    if status.status in ("completed", "failed"):
        print(f"Done! Status: {status.status}, exit code: {status.exit_code}")
        break
    time.sleep(3)

const task = await client.agent.tasks.start({
  instruction:
    "Go to https://httpbin.org/forms/post. " +
    "Fill in: Customer name: Jane Doe, Telephone: 555-0123. " +
    "Submit the form.",
  kind: "browser",
  max_steps: 15,
});
console.log(`Task started: ${task.task_id}`);

// Wait for completion
while (true) {
  const status = await client.agent.tasks.retrieveStatus(task.task_id!);
  if (status.status === "completed" || status.status === "failed") {
    console.log(`Done! Status: ${status.status}, exit code: ${status.exit_code}`);
    break;
  }
  await new Promise((r) => setTimeout(r, 3000));
}

An exit code of 0 means success. Non-zero means the agent encountered an error or couldn’t complete the task.
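
The polling loop above runs until the task reaches a terminal state, so a task that never finishes would keep it spinning. A minimal sketch of a bounded wait follows. The helper name `wait_for_task`, its parameters, and the injectable `sleep` and `clock` arguments are illustrative and not part of the SDK; the only assumption carried over from the tutorial is that the status object exposes a `status` attribute with the values "completed" or "failed".

```python
import time
from typing import Callable, Optional

TERMINAL_STATES = ("completed", "failed")


def wait_for_task(
    get_status: Callable[[], object],
    timeout_s: float = 120.0,
    interval_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[object]:
    """Poll get_status() until a terminal state or the deadline passes.

    Returns the final status object, or None if the deadline passed first.
    get_status is any zero-argument callable (hypothetical helper input).
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if getattr(status, "status", None) in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            return None  # caller decides whether to retry or give up
        sleep(interval_s)
```

With the client from Step 1, this could be called as `wait_for_task(lambda: client.agent.tasks.retrieve_status(task.task_id))`. A return value of `None` means the deadline passed while the task was still running.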

Step 3: Verify with a manual check

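A status of "completed" tells you the agent stopped, not that the form contains the right values. To check, either open a new session after the agent finishes, or start the task with persistent=True (persistent: true in TypeScript) so you can inspect where the agent left off: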

# Start with persistence so we can inspect afterward
task = client.agent.tasks.start(
    instruction=(
        "Go to https://httpbin.org/forms/post. "
        "Fill in: Customer name: Jane Doe, Telephone: 555-0123. "
        "Submit the form."
    ),
    kind="browser",
    max_steps=15,
    persistent=True,
)

# ... wait for completion ...

# Inspect the final state
status = client.agent.tasks.retrieve_status(task.task_id)
print(f"Status: {status.status}")

const task = await client.agent.tasks.start({
  instruction:
    "Go to https://httpbin.org/forms/post. " +
    "Fill in: Customer name: Jane Doe, Telephone: 555-0123. " +
    "Submit the form.",
  kind: "browser",
  max_steps: 15,
  persistent: true,
});

// ... wait for completion ...

const status = await client.agent.tasks.retrieveStatus(task.task_id!);
console.log(`Status: ${status.status}`);
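
Once you have the text of the response page, the check itself can be automated. The sketch below assumes that httpbin echoes submitted fields under a top-level "form" key and returns repeated fields such as checkboxes as a list. It also assumes the form's field names are `custname`, `custtel`, `size`, and `topping`, which you should confirm against the page source. How you retrieve the page text from the session is outside this sketch, and `find_mismatches` is a hypothetical helper, not an SDK function.

```python
import json
from typing import Dict, List, Union

Expected = Dict[str, Union[str, List[str]]]


def find_mismatches(response_text: str, expected: Expected) -> Dict[str, tuple]:
    """Compare the echoed form data with the values we asked for.

    Returns {field: (expected, actual)} for every field that differs.
    """
    form = json.loads(response_text).get("form", {})
    mismatches = {}
    for field, want in expected.items():
        got = form.get(field)
        if isinstance(want, list):
            # Multi-valued fields (checkboxes) are compared without regard to order.
            got_list = got if isinstance(got, list) else ([] if got is None else [got])
            if sorted(got_list) != sorted(want):
                mismatches[field] = (want, got)
        elif got != want:
            mismatches[field] = (want, got)
    return mismatches
```

An empty result means every expected field matched. For example, `find_mismatches(page_text, {"custname": "Jane Doe", "custtel": "555-0123"})` would flag a mistyped phone number even when the task reports exit code 0.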

Step 4: Steer a stuck agent

If the agent gets confused, you can redirect it mid-task with inject_message:

import time

task = client.agent.tasks.start(
    instruction="Go to https://httpbin.org/forms/post and fill in the form.",
    kind="browser",
    max_steps=30,
)

# Give it a few seconds to start
time.sleep(8)

# Check if it needs help
status = client.agent.tasks.retrieve_status(task.task_id)
if status.status == "running":
    client.agent.tasks.inject_message(
        task.task_id,
        message=(
            "For the customer name, type 'Jane Doe'. "
            "For telephone, type '555-0123'. "
            "Then click the Submit button."
        ),
    )
    print("Sent clarifying instructions")

const task = await client.agent.tasks.start({
  instruction: "Go to https://httpbin.org/forms/post and fill in the form.",
  kind: "browser",
  max_steps: 30,
});

// Give it a few seconds
await new Promise((r) => setTimeout(r, 8000));

const status = await client.agent.tasks.retrieveStatus(task.task_id!);
if (status.status === "running") {
  await client.agent.tasks.injectMessage(task.task_id!, {
    message:
      "For the customer name, type 'Jane Doe'. " +
      "For telephone, type '555-0123'. " +
      "Then click the Submit button.",
  });
  console.log("Sent clarifying instructions");
}
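
The fixed eight-second sleep above can be replaced with a short polling window, so a task that finishes early is not kept waiting. This is a rough sketch under stated assumptions: `steer_if_still_running` and its parameters are hypothetical, `get_state` is any callable returning the task's status string, and `send_hint` is any callable that performs the injection.

```python
import time
from typing import Callable


def steer_if_still_running(
    get_state: Callable[[], str],
    send_hint: Callable[[], None],
    grace_s: float = 8.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll during the grace period; send one hint if the task is still running.

    Returns True if the hint was sent, False if the task finished first.
    """
    waited = 0.0
    while waited < grace_s:
        if get_state() != "running":
            return False  # finished or failed on its own; nothing to steer
        sleep(interval_s)
        waited += interval_s
    if get_state() == "running":
        send_hint()
        return True
    return False
```

With the tutorial's client, `get_state` could be `lambda: client.agent.tasks.retrieve_status(task.task_id).status` and `send_hint` could wrap the `inject_message` call shown above.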

Tips for writing form instructions

Principle	Example
Name every field	“Customer name: Jane Doe” not “fill in the first field”
Specify selections	“Size: Large” not “pick a size”
Define done	“You’re done when you see the confirmation page”
Be explicit about checkboxes	“Check the Bacon and Cheese toppings”
Order matters	List fields in the order they appear on the form
Set max_steps	Simple forms: 10-15, multi-step wizards: 20-30
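
Several of these tips can be applied mechanically by generating the instruction from structured data. The sketch below is one possible approach. `build_form_instruction` is a hypothetical helper, not an SDK function, and its wording follows the Step 1 instruction. It names every field, preserves the order you pass in, and always ends with an explicit done condition.

```python
from typing import Iterable, Tuple


def build_form_instruction(
    url: str,
    fields: Iterable[Tuple[str, str]],
    done_when: str,
) -> str:
    """Assemble a form-filling instruction from (label, value) pairs.

    Pairs are kept in the order given, so pass them in on-page order.
    """
    pairs = list(fields)
    if not pairs:
        raise ValueError("at least one (label, value) pair is required")
    listed = ", ".join(f"{label}: {value}" for label, value in pairs)
    return (
        f"Go to {url}. "
        f"Fill in the form with these values: {listed}. "
        "Submit the form. "
        f"You're done when {done_when}."
    )
```

The returned string can be passed as the `instruction` argument in any of the steps above. Repeat a label, such as ("Topping", "Bacon") and ("Topping", "Cheese"), to list several checkbox selections.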

What you learned

In this tutorial, you:

Used an AI agent to fill a form without writing coordinate-based code
Streamed agent progress to watch it work in real time
Used fire-and-poll for background form submission
Inspected results with persistent sessions
Steered a stuck agent with mid-task message injection

Next steps

Form automation — coordinate-based form filling for when you need precise control
Run an agent — more agent patterns including pause/resume
Agent Tasks — full configuration reference for agent tasks
Best practices — writing effective agent instructions
