When Not to Use AI: A Decision Framework

Everyone wants to add AI to their product. Almost nobody asks if they should.

I've watched teams spend months building AI features that would work better as simple rules. I've seen AI projects fail because the problem didn't need intelligence—it needed reliability. And I've built systems where AI was the obvious choice.

The gap isn't about capability. Modern LLMs are powerful. The gap is judgment. Here's a framework for deciding when AI makes sense and when it doesn't.

The Core Question: Is This a Problem or a Decision?

This is where most teams get it wrong.

A problem has a known solution. You need to execute it reliably and at scale. "Send an invoice after payment clears" is a problem. "Validate credit card format" is a problem. These need software—deterministic, testable, auditable software.

A decision requires judgment. You need to evaluate context, weigh tradeoffs, and choose between options. "Should we approve this loan?" is a decision. "What's the sentiment of this customer feedback?" is a decision. These are where AI shines.

The mistake: treating problems as decisions. You don't need an LLM to send an invoice. You need a workflow. You don't need Claude to validate input. You need a regex or a library.
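To make that concrete, here's a minimal sketch of what "a regex or a library" looks like for the card-format example above. The function name and the 13-to-19 digit limit are illustrative, not a payments spec, but the point stands: this is deterministic, testable, and needs no model.

```python
import re

def is_valid_card_number(number: str) -> bool:
    """Deterministic card-format check: digits only, plausible length, Luhn checksum."""
    digits = re.sub(r"[\s-]", "", number)
    if not re.fullmatch(r"\d{13,19}", digits):
        return False
    # Luhn checksum: double every second digit from the right,
    # subtract 9 from anything over 9, and the total must be divisible by 10.
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```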

The Three Questions

Before you reach for an LLM, answer these:

1. Can this be solved with rules?

If the answer is yes, use rules. Rules are fast, cheap, auditable, and deterministic. They fail predictably. They don't hallucinate.

Example: Routing a support ticket to the right team. You can do this with keyword matching and a decision tree. It will work 95% of the time. The remaining 5% go to a human. That's fine—and it's better than an AI system that sometimes routes things randomly.
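A rule-based router can be embarrassingly small. The team names and keywords below are made up; the shape is what matters: ordered rules, first match wins, and everything the rules can't place goes to a person.

```python
# Illustrative keyword router. Order matters: the first matching rule wins.
ROUTING_RULES = [
    ("billing",   ["invoice", "refund", "charge", "payment"]),
    ("security",  ["password", "2fa", "locked out", "phishing"]),
    ("technical", ["error", "crash", "bug", "not loading"]),
]

def route_ticket(subject: str, body: str) -> str:
    text = f"{subject} {body}".lower()
    for team, keywords in ROUTING_RULES:
        if any(keyword in text for keyword in keywords):
            return team
    return "human_triage"  # the remaining ~5% the rules can't place
```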

Example: Extracting structured data from a form. If the form is yours, you don't need AI. You need form validation. If you're parsing external documents, then AI makes sense.

2. What happens when it fails?

This is the real question. Not "will it fail?" It will. But what's the cost?

If failure means a customer gets frustrated—that's acceptable risk. You can retry, escalate, or fall back to a human. If failure means you lose money, violate compliance, or break trust—that's different.

High-stakes decisions need human oversight. That's not a limitation of AI. That's how good systems work. The best AI systems don't eliminate humans—they make humans more effective.

3. Do you have the data to evaluate it?

You need a way to measure whether your AI system actually works. Not "it seems good" but real metrics. Accuracy, precision, recall—something quantifiable.

If you can't measure it, you can't improve it. And if you can't improve it, you shouldn't ship it. This ties directly to The Evals Problem: Measuring What Matters—if you can't define success, you're not ready for production.
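"Quantifiable" doesn't have to mean a heavyweight eval platform. A toy sketch, assuming you have predictions and ground-truth labels for a small test set, is enough to turn "it seems good" into a number you can track:

```python
def evaluate(predictions: list[str], labels: list[str], positive: str) -> dict:
    """Accuracy, precision, and recall for one class on a labeled test set."""
    assert len(predictions) == len(labels)
    tp = sum(p == positive and l == positive for p, l in zip(predictions, labels))
    fp = sum(p == positive and l != positive for p, l in zip(predictions, labels))
    fn = sum(p != positive and l == positive for p, l in zip(predictions, labels))
    correct = sum(p == l for p, l in zip(predictions, labels))
    return {
        "accuracy": correct / len(labels),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```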

When AI Actually Makes Sense

AI is the right choice when:

  1. The task requires judgment — You're evaluating context, not executing a known procedure. Examples: analyzing customer feedback, reviewing code quality, making prioritization decisions.

  2. The rules are too complex — You could write rules, but they'd be brittle and hard to maintain. Examples: detecting fraud, identifying relevant documents, understanding intent.

  3. The domain changes — Your rules would need constant updates. AI adapts. Examples: content moderation, competitive analysis, market sentiment.

  4. You have good data — You can train, evaluate, and improve your system. You can measure what matters.

  5. Failure is acceptable — Mistakes are correctable or low-cost. You have a human in the loop. Examples: drafting emails, suggesting next steps, generating ideas.

When AI Is the Wrong Choice

Don't use AI when:

  1. The task is deterministic — There's a right answer that doesn't change. Use software. Examples: calculations, data validation, routing logic.

  2. You need 100% reliability — Critical systems need guarantees. AI gives you probabilities. Examples: payment processing, authentication, safety-critical operations.

  3. The rules are simple — Occam's Razor applies. If a regex works, use a regex. If a lookup table works, use a lookup table.

  4. You can't measure success — No evals, no shipping. If you can't define what "good" looks like, you can't build it.

  5. The cost of failure is high — Medical diagnosis, legal decisions, financial transactions. These need human expertise and accountability, not AI confidence scores.

  6. You don't have data — Building on guesses is how projects fail. As covered in Why Most AI Projects Fail, the foundation matters.

A Practical Example: Document Processing

Let me walk through a real decision.

You need to extract key information from contracts: parties involved, payment terms, renewal dates. Should you use AI?

If the contracts are yours: No. Use a form. Use a structured upload process. Make the user input the data. You get perfect accuracy and zero ambiguity.

If the contracts are from partners: Maybe. The documents vary. You could use OCR and a rule-based parser, but it's brittle. AI makes sense here—Claude can understand context and extract the right information from messy PDFs.

But you need evals. Create a test set of 50 contracts with ground truth. Measure accuracy. Define what "good enough" means. Set up human review for low-confidence extractions. This is Building Reliable AI Tools—structure your system so it fails gracefully.
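Here's one way that eval loop could look. This is a sketch, not a finished harness: `extract_fields` stands in for whatever model call you use, and it's assumed to return a value and a confidence score per field.

```python
FIELDS = ["parties", "payment_terms", "renewal_date"]
CONFIDENCE_THRESHOLD = 0.8  # below this, the extraction goes to human review

def run_evals(contracts: list[dict], extract_fields) -> dict:
    """contracts: test-set items with 'id', 'pdf_path', and 'ground_truth'.
    extract_fields: your model call (assumed), returning
    {field: {"value": ..., "confidence": float}} per document."""
    scores, needs_review = [], []
    for contract in contracts:  # e.g. the 50-contract test set
        extracted = extract_fields(contract["pdf_path"])
        correct = sum(
            extracted[f]["value"] == contract["ground_truth"][f] for f in FIELDS
        )
        scores.append(correct / len(FIELDS))
        if any(extracted[f]["confidence"] < CONFIDENCE_THRESHOLD for f in FIELDS):
            needs_review.append(contract["id"])
    return {
        "mean_field_accuracy": sum(scores) / len(scores),
        "flagged_for_review": needs_review,
    }
```

Whatever "good enough" means for you, it should be a threshold on `mean_field_accuracy`, decided before you look at the results.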

The Integration Layer

One more thing: even when you choose AI, don't let it own the entire flow.

AI is best at specific tasks within a larger system. It's not a replacement for architecture. You still need data pipelines, error handling, monitoring, and human escalation paths. The integration matters as much as the AI itself. I covered this in The Integration Layer Nobody Talks About—the boring stuff is what makes AI work in production.
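As a rough illustration, the AI call is one line inside a wrapper that handles the boring parts. The threshold, retry count, and `model_call` interface here are assumptions, not a prescription:

```python
import time

def analyze_with_fallback(document: str, model_call, max_retries: int = 2) -> dict:
    """Retries, error handling, and a human escalation path around one AI step."""
    for attempt in range(max_retries + 1):
        try:
            result = model_call(document)  # the AI-owned step, and nothing more
            if result.get("confidence", 0.0) >= 0.8:
                return {"status": "ok", "result": result}
            break  # low confidence: don't retry, escalate
        except TimeoutError:
            time.sleep(2 ** attempt)  # simple backoff before retrying
    return {"status": "escalated_to_human", "document": document}
```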

The Framework in Practice

Here's a decision tree you can use:

  1. Can this be solved with deterministic rules? → Yes: Use software. Done.
  2. Does this require judgment? → No: Use software. Done.
  3. What's the cost of failure? → High: Add human oversight. Proceed with caution.
  4. Can you measure success? → No: Don't build it yet. Define metrics first.
  5. Do you have data? → No: Start with rules. Collect data. Iterate.
  6. Is this a wrapper around existing software? → Yes: Read The Hidden Cost of AI Wrapper Products. Think hard.

If you pass all these, AI is probably the right choice.
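If it helps to make the checklist explicit, here is the same tree written as code. The field names are just one way to phrase the six questions; the branching mirrors the list above.

```python
from dataclasses import dataclass

@dataclass
class AIDecision:
    solvable_with_rules: bool
    requires_judgment: bool
    high_cost_of_failure: bool
    can_measure_success: bool
    has_data: bool
    is_thin_wrapper: bool

def should_use_ai(d: AIDecision) -> str:
    if d.solvable_with_rules or not d.requires_judgment:
        return "use plain software"
    if not d.can_measure_success:
        return "define metrics first"
    if not d.has_data:
        return "start with rules, collect data, iterate"
    if d.is_thin_wrapper:
        return "rethink the product, not just the model"
    if d.high_cost_of_failure:
        return "use AI, but with human oversight"
    return "AI is probably the right choice"
```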

The Real Win

The best teams I know aren't the ones using the most AI. They're the ones using AI where it matters. They use simple, boring software for simple, boring problems. They use AI for genuine judgment calls. They measure everything. They know when to say no.

That discipline—knowing when not to use AI—is what separates projects that work from projects that fail.

The next time someone asks "should we add AI?" ask instead: "what problem are we solving?" That one question will save you months of wasted work.