---
slug: acceptance-criteria-for-ai-coders
title: "Acceptance Criteria for AI Coders: How Specific Is Too Specific?"
excerpt: "Acceptance criteria tell AI coding tools when a feature is done. Too vague and the AI invents the wrong thing. Too specific and it fights your stack."
primaryKeyword: "acceptance criteria for AI coders"
publishedAt: 2026-05-02
readingTimeMin: 7
author: "Robert Boylan"
tags:
  - acceptance-criteria
  - prd
  - ai-coding-tools
  - app-planning
  - prompt-engineering
---

You ask Lovable to build a login screen. It builds one. The user can type an email and password, hit a button, and get in. Technically correct. But it redirects to `/home` instead of `/dashboard`, shows a generic error message for every failure, and stores the session in a way that breaks your existing auth middleware. Nobody told it otherwise, so it made reasonable choices.

This is the acceptance criteria problem. Not a bug. A spec gap.

Acceptance criteria (AC for short) are short statements that describe exactly when a feature is considered done. Product teams have used them for decades to stop engineers building a technically correct but functionally wrong version of a feature. They work the same way for AI coding tools, except the AI has much higher confidence and much less embarrassment about winging it when the criteria are missing.

Getting acceptance criteria right for AI coders is less about volume and more about writing at the right altitude. Too low, and you're doing the AI's job. Too high, and you haven't told it anything useful.

## Why vague acceptance criteria produce the wrong thing

"Users can log in" is not an acceptance criterion. It is a feature name. The AI treats it as one and fills in every blank it didn't see with defaults: session duration, error handling, redirect destination, whether failed attempts are logged, how the loading state looks.

Some of those defaults will be fine. Some will conflict with decisions you made elsewhere and haven't written down yet. The frustrating part is you won't always know which until three sprints later when something quietly breaks.

Vague AC forces the AI to invent your product. Specificity is how you prevent that, but the specificity needs to be pointed at the right things.

Here is a login AC that is too vague:

```
# Too vague
- Users can log in
- Login should work correctly
- Errors should be handled
```

Nothing in there tells the AI what "correctly" means, what errors you care about, or where the user ends up after success. It will fill in all of that from training data. That filling-in is where surprises live.

## Why over-specified acceptance criteria cause different problems

The other failure mode is writing AC that reads like a technical spec.

```
# Over-specified
- MUST use bcrypt with 12 salt rounds for password hashing
- MUST set a JWT with a 24-hour expiry signed with RS256
- MUST set cookie with SameSite=Lax, HttpOnly=true, Secure=true
- MUST redirect to /dashboard after authentication
- MUST call POST /api/v1/auth/login with Content-Type application/json
```

If you are using Supabase, none of the first three bullets are your job. Supabase handles all of it. Writing those criteria doesn't make the code safer. It makes the AI build around the platform's defaults instead of with them, and you get an unnecessarily complicated auth implementation for a v1 that needs to ship this week.

Over-specified AC also creates fragile prompts. When you add a new feature later and the AI re-reads the spec, it treats those implementation details as hard constraints. It will sometimes choose worse solutions to honor them. The AI has no judgment about which of your criteria were real product decisions and which were just things you typed one afternoon.

Over-specification usually happens when you're anxious about the AI getting something wrong. That anxiety is understandable. The fix isn't writing more constraints, it's writing better-aimed ones.

## What good acceptance criteria actually look like

Good acceptance criteria for AI coders describe what the user experiences and what must be true after the action. Not how. They answer two questions: what can the user do, and what state is the system in when they're done?

The format that works: short, active voice, present tense, one condition per line.

```
# Good
- A registered user can sign in with email and password
- After signing in, the user lands on /dashboard
- A wrong password shows an inline error: "Incorrect email or password"
- Three consecutive failed attempts lock the account for 10 minutes
- A "Forgot password" link sends a reset email within 2 minutes
```

Every line here is about experience and outcome, not implementation. The AI knows how to build bcrypt hashing, JWT management, and session cookies. It does not know you want the error message copy to be "Incorrect email or password" instead of "Invalid credentials" unless you say so. It does not know about the account lockout unless you say so.

The pattern: you own the product decisions. The AI owns the implementation.
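
One way to sort the login example along that line (the split below is illustrative, not exhaustive):

```
# Yours: product decisions (put in the AC)
- Redirect destination (/dashboard)
- Exact error copy ("Incorrect email or password")
- Lockout rule (three attempts, 10 minutes)
- Reset email arrives within 2 minutes

# The AI's: implementation details (leave out)
- Hash algorithm and salt rounds
- Token format, expiry, and signing algorithm
- Cookie flags and session storage
- API route names and payload shapes
```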

This is also where writing good AC intersects with [thinking through your spec before you start prompting](/blog/vibe-coding-why-planning-matters). The act of writing "what must be true after this action" forces you to decide things you haven't decided yet. That is a feature, not extra work.

## Three worked examples: login, file upload, search

**Example 1: Login**

Not this:

```
- Users can log in to the app
```

Not this either:

```
- MUST use argon2id hashing with memory=65536, iterations=3
- MUST set Referrer-Policy: strict-origin header on redirect
```

This:

```
- A registered user can sign in with email and password
- After sign-in, the user is redirected to /dashboard
- Wrong credentials show: "Incorrect email or password"
- Successful sign-in persists the session across page reloads
```

---

**Example 2: File upload**

Not this:

```
- Users can upload files
```

Not this:

```
- MUST use multipart/form-data with a 5 MB part limit
- MUST store in S3 under the key pattern users/{uuid}/uploads/{timestamp}-{filename}
- MUST set Content-Disposition: attachment on the pre-signed URL
```

This:

```
- A user can upload a single image file (JPEG or PNG, up to 5 MB)
- Uploading shows a progress indicator
- After upload, the image appears in the user's media library within 2 seconds
- Files over 5 MB show an error before the upload starts: "File too large (max 5 MB)"
- Unsupported formats show: "Only JPEG and PNG files are supported"
```

(Don't worry about reading every line there. Focus on the shape: outcome, limit, error messages. Not storage paths.)

---

**Example 3: Search**

Not this:

```
- Search works well
```

Not this:

```
- MUST use trigram GIN index on the title column
- MUST debounce input by exactly 300ms
- MUST return results in under 200ms at p95
```

This:

```
- A user can search posts by title using the search bar
- Results update as the user types (after a short pause)
- If no results match, show: "No results for [query]"
- Search is not case-sensitive
- The search bar clears when the user navigates away
```

In the search example, the p95 latency target and the debounce value are real engineering constraints, but they are not acceptance criteria. They belong in a technical note if anywhere. Acceptance criteria are for the person at the keyboard, not the infrastructure.
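
If you do want to record those constraints, one option is a separate technical-notes block beside the AC, so hard product requirements stay distinct from engineering preferences. A sketch of that layout, reusing the search example (the numbers are the same illustrative ones from above, not recommendations):

```
# Acceptance criteria
- A user can search posts by title using the search bar
- Results update as the user types (after a short pause)

# Technical notes (preferences, not pass/fail criteria)
- Debounce input; ~300ms is a reasonable starting point
- Aim for p95 search latency under 200ms
```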

## How to write acceptance criteria when you're not technical

Most people who build with Lovable, v0, or Bolt.new are not engineers. The "don't over-specify the implementation" advice can feel abstract if you don't know which parts are implementation.

A shortcut that works: write one sentence in the form "A [user type] can [do a thing], and after they do it, [state of the world]." Then add the error cases. That structure almost always lands at the right altitude.

- "A user can search the product catalogue, and after typing a query, matching products appear." Good.
- "A user can upload a profile photo, and after uploading, it appears on their profile page." Good.
- "A user can reset their password, and after resetting, they are signed in and taken to /dashboard." Good.

If you are not sure whether something counts as an implementation detail, ask yourself: would a user notice if this changed? If the answer is yes (the error message copy, the redirect destination, whether you can search by name vs only by tag), it belongs in the AC. If a user would never know (which hash algorithm, which cloud bucket, which index type), leave it out.

This is the same instinct that makes [a good app spec](/blog/good-lovable-app-spec) work for tools like Lovable or Cursor. You are describing the experience, not the wiring.

For a deeper look at what goes into writing specs that AI tools can actually use, the [what is a PRD guide](/what-is-a-prd) covers the full picture, not just the AC layer.

## Getting acceptance criteria right is worth the ten minutes

Writing AC for every feature in a spec is not glamorous. It is also the single most reliable way to stop an AI coding tool from making product decisions on your behalf.

The pattern that works: describe what the user does, what they see, and what state the system is in when it's over. Skip the how. Add your error messages. Keep each criterion to one condition.
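
Condensed into a fill-in template you can copy for each feature (everything in square brackets is yours to decide):

```
- A [user type] can [action]
- After [action], [observable result or destination]
- [Failure case] shows: "[exact error copy]"
- [Any limit or rule the user would notice]
```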

If that still feels like overhead before you have even opened a tool, Draftlytic generates acceptance criteria for each feature when it builds your spec. You describe the idea, answer a few focused questions, and the output includes features with AC already written at the right level. It is not a substitute for your judgment about what the product should do, but it means you are not starting from a blank line before every feature when you should be building.

The AI coders are fast. The spec is still the thing that decides whether they build what you meant.
