March 31, 2026

How to Collect AI Training Data via API in 5 Minutes

Post a job, let real people capture photos, audio, or video, and get back quality-scored files with auto-generated annotations. Here is the whole flow from zero to downloaded data.

Prerequisites

You need a FirstHandAPI account and an API key. Sign up at the dashboard, navigate to Settings → API Keys, and create a live key. It will start with fh_live_. You also need credits in your account — the starter pack is $10.

Step 1: Install the SDK

FirstHandAPI has official SDKs for TypeScript and Python. Install the one that matches your stack.

TypeScript

npm install @firsthandapi/sdk

Python

pip install firsthandapi

Step 2: Initialize the Client

TypeScript

import FirstHandAPI from '@firsthandapi/sdk';

const fh = new FirstHandAPI({
  apiKey: process.env.FIRSTHAND_API_KEY!,
  // baseUrl defaults to https://api.firsthandapi.com
});

Python

import os

import firsthandapi

fh = firsthandapi.Client(
    api_key=os.environ["FIRSTHAND_API_KEY"],
    # base_url defaults to https://api.firsthandapi.com
)

Step 3: Post a Data Collection Job

A job describes what you need. Set the content type (image, audio, or video), write a clear prompt, specify how many files you want, and set a per-file reward. Workers on the FirstHandAPI iOS app will see your job and start capturing content.

TypeScript

const job = await fh.jobs.create({
  title: 'Photos of coffee shop menus',
  description: 'Take a clear, well-lit photo of the full menu board at any coffee shop. Must be legible.',
  type: 'data_collection',
  content_type: 'image',
  file_formats: ['jpeg', 'png'],
  quantity: 50,
  reward_per_file: 75, // credits (1 credit = $0.01)
});

console.log(job.id);
// => "job_01J5K9..."
console.log(job.status);
// => "active"

Python

job = fh.jobs.create(
    title="Photos of coffee shop menus",
    description="Take a clear, well-lit photo of the full menu board at any coffee shop. Must be legible.",
    type="data_collection",
    content_type="image",
    file_formats=["jpeg", "png"],
    quantity=50,
    reward_per_file=75,  # credits (1 credit = $0.01)
)

print(job.id)
# => "job_01J5K9..."
print(job.status)
# => "active"
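Before posting, it is worth sanity-checking the budget. At 75 credits per file and 1 credit = $0.01, the 50-file job above costs at most 3,750 credits, or $37.50, and you are only billed for approved files. A quick back-of-the-envelope check:

```python
# Estimate the maximum cost of a data collection job.
# 1 credit = $0.01, and only approved files are billed,
# so this is an upper bound, not a guaranteed spend.

def max_job_cost_usd(quantity: int, reward_per_file: int) -> float:
    """Worst-case cost in dollars if every requested file is approved."""
    total_credits = quantity * reward_per_file
    return total_credits / 100  # credits -> dollars

print(max_job_cost_usd(quantity=50, reward_per_file=75))
# => 37.5
```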

Step 4: Poll for Completion (or Use Webhooks)

You can poll the job status, or configure a webhook to get notified when files are approved. Polling is simpler to start with.

TypeScript

// Poll until we have at least 10 approved files
let files: any[] = []; // typed loosely so reassignment compiles under strict mode
while (files.length < 10) {
  const result = await fh.files.list({
    job_id: job.id,
    status: 'approved',
  });
  files = result.data;
  if (files.length < 10) {
    await new Promise((r) => setTimeout(r, 30_000)); // wait 30s
  }
}

console.log(`Got ${files.length} approved files`);

Python

import time

# Poll until we have at least 10 approved files
files = []
while len(files) < 10:
    result = fh.files.list(job_id=job.id, status="approved")
    files = result.data
    if len(files) < 10:
        time.sleep(30)  # wait 30s

print(f"Got {len(files)} approved files")
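If you would rather not poll, a webhook handler can react to approval events as they arrive. The event name and payload shape below are assumptions for illustration only; check the FirstHandAPI webhook docs for the actual schema. A minimal sketch of the handler logic:

```python
# Minimal sketch of a webhook handler for approval events.
# ASSUMPTION: the event type ("file.approved") and payload shape
# are illustrative, not a documented schema.

approved_files = []

def handle_webhook(event: dict) -> bool:
    """Collect approved files from incoming webhook events.

    Returns True if the event was handled, False if ignored.
    """
    if event.get("type") != "file.approved":
        return False
    approved_files.append(event["data"])
    return True

# Example event carrying a file record like the ones in Step 5:
handle_webhook({
    "type": "file.approved",
    "data": {"file_id": "file_01J5KA...", "job_id": "job_01J5K9..."},
})
print(len(approved_files))
# => 1
```

In production you would mount this behind an HTTPS endpoint and verify the webhook signature before trusting the payload.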

Step 5: Download Files with Annotations

Every approved file comes with a pre-signed download URL and auto-generated annotations. You do not need a separate labeling pipeline.

TypeScript

for (const file of files) {
  console.log(file.download_url);     // pre-signed S3 URL
  console.log(file.annotations);      // auto-generated labels
}

// Download all files to a local directory
await fh.files.downloadAll({
  job_id: job.id,
  output_dir: './training-data/coffee-menus',
});

Annotation Response

Here is what the annotation metadata looks like on an approved image file. This is included automatically — no extra API call or cost.

{
  "file_id": "file_01J5KA...",
  "job_id": "job_01J5K9...",
  "content_type": "image",
  "format": "jpeg",
  "score": 4,
  "status": "approved",
  "download_url": "https://files.firsthandapi.com/...",
  "annotations": {
    "object_labels": [
      { "label": "menu_board", "confidence": 0.97 },
      { "label": "text", "confidence": 0.95 },
      { "label": "price_list", "confidence": 0.91 },
      { "label": "chalkboard", "confidence": 0.84 }
    ],
    "ocr_text": "Espresso $4.50\nLatte $5.75\nCappuccino $5.50\nCold Brew $5.00...",
    "scene_classification": "indoor_retail",
    "color_palette": ["#2C1810", "#F5E6D3", "#8B4513", "#FFFFFF"],
    "resolution": { "width": 4032, "height": 3024 }
  },
  "created_at": "2026-03-31T14:22:08Z"
}
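The annotations block is plain JSON, so it slots directly into a preprocessing step. For example, you might keep only object labels above a confidence threshold before feeding files into training. A sketch using a payload shaped like the response above:

```python
# Filter auto-generated object labels by confidence.
# The payload mirrors the annotation response shown above.

annotations = {
    "object_labels": [
        {"label": "menu_board", "confidence": 0.97},
        {"label": "text", "confidence": 0.95},
        {"label": "price_list", "confidence": 0.91},
        {"label": "chalkboard", "confidence": 0.84},
    ],
    "scene_classification": "indoor_retail",
}

def confident_labels(annotations: dict, threshold: float = 0.9) -> list[str]:
    """Return label names whose confidence meets the threshold."""
    return [
        item["label"]
        for item in annotations.get("object_labels", [])
        if item["confidence"] >= threshold
    ]

print(confident_labels(annotations))
# => ['menu_board', 'text', 'price_list']
```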

Bonus: Use MCP for AI Agent Workflows

If you are using Claude Code, Cursor, or another MCP-compatible AI agent, you can skip the SDK entirely. Install the MCP server and your agent can post jobs and download results conversationally:

npx @firsthandapi/mcp-server

See the MCP integration guide for setup details.

What Happens Under the Hood

When you post a job, here is the pipeline that runs for every worker submission:

  1. A worker on the iOS app accepts your job and captures a photo, audio clip, or video.
  2. The file is uploaded to secure cloud storage.
  3. A multi-model AI ensemble scores the submission 1-5 stars (Claude Vision for images, Whisper for audio, ffmpeg + Claude Vision for video frames).
  4. Files scoring 3+ stars are approved. Files scoring 1 star count as a strike against the worker (3 strikes = ban).
  5. Approved files are delivered to your per-job folder with auto-generated annotation metadata.
  6. You are charged only for approved files. Rejected files cost you nothing.

This human-in-the-loop pipeline means you get real-world data captured by real people, quality-gated by AI, with data labeling included — all through a single API call.
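The scoring gate in step 4 is simple to state: 3+ stars approves the file, a 1-star score adds a strike, and three strikes bans the worker. A toy restatement of those rules (my own sketch, not the service's actual code):

```python
# Toy model of the approval gate described above:
# - scores of 3-5 stars approve the file
# - a 1-star score counts as a strike against the worker
# - 3 strikes bans the worker
# This restates the documented rules; it is not FirstHandAPI's code.

def review_submission(score: int, strikes: int) -> tuple[str, int, bool]:
    """Return (decision, new_strike_count, banned)."""
    decision = "approved" if score >= 3 else "rejected"
    if score == 1:
        strikes += 1
    return (decision, strikes, strikes >= 3)

print(review_submission(score=4, strikes=0))
# => ('approved', 0, False)
print(review_submission(score=1, strikes=2))
# => ('rejected', 3, True)
```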

Ready to collect your first dataset?

Create a free account, add $10 in credits, and post your first data collection job. Most teams have approved files within the hour.