Why “AI chatbot” is almost never the right frame
When a business unit asks for “AI,” they usually describe a chatbot. But when you ask what problem they’re actually trying to solve, the answer is almost always one of six distinct generative AI patterns — each with its own architecture, evaluation criteria, and failure modes. Using the right pattern frame changes the entire conversation: from “how smart is the AI?” to “how accurate is the retrieval?” or “how consistent is the extraction?” — questions that have engineerable answers.
These six patterns are not mutually exclusive. Production systems often combine two or three — a code generation tool might use RAG to retrieve relevant existing code as context, and structured extraction to parse the output into a specific format. Understanding them individually is the prerequisite for combining them correctly.
Pattern 1: RAG-powered Q&A
The most widely deployed generative AI pattern. A user asks a question; the system retrieves relevant documents from a knowledge base; the LLM generates an answer grounded in the retrieved context. The answer is only as good as the retrieval — which is why struggling RAG projects are usually struggling with chunk quality, embedding model selection, or missing re-ranking, not with the generative model.
Best for: employee-facing internal knowledge bases, customer support deflection on well-documented topics, policy and compliance Q&A. Not appropriate for: questions where the answer requires calculation or reasoning over structured data (use SQL generation instead), or where 100% factual accuracy is required without human review.
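The retrieve-then-generate loop can be sketched in a few lines. This is a toy illustration, not a production implementation: a bag-of-words cosine similarity stands in for a real embedding model, and the prompt would be sent to an LLM API of your choice rather than printed.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: bag-of-words token counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the question; return the top k.
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(question: str, context: list[str]) -> str:
    # Ground the answer: instruct the model to use ONLY retrieved context.
    joined = "\n---\n".join(context)
    return f"Answer using ONLY this context:\n{joined}\n\nQuestion: {question}"

docs = [
    "Employees accrue 25 vacation days per year.",
    "The office is closed on public holidays.",
    "Expense reports are due by the 5th of each month.",
]
question = "How many vacation days do I get?"
context = retrieve(question, docs)
prompt = build_prompt(question, context)
```

The real engineering work hides inside `embed` and `retrieve` — chunking strategy, embedding model choice, and re-ranking — which is exactly where most RAG projects need attention.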
Pattern 2: Document generation
Data in, document out. The system takes structured inputs (a deal size, a customer name, specific contract terms) and generates a formatted document (a contract, a proposal, a report). The key engineering challenge is output consistency — you need the document to reliably contain exactly the right information, in the right format, with no hallucinated clauses.
This is solved through template-constrained generation: the LLM fills in sections of a pre-defined template rather than generating free-form documents. Combined with output validation (a second pass that checks required fields are present and formatted correctly), this pattern achieves very high reliability. We use this in our player support case study for generating structured resolution summaries.
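A minimal sketch of template-constrained generation with a validation pass, assuming a hypothetical contract template and field names. The LLM fills only the named slots; a second pass rejects any output with missing or malformed fields before the document is rendered.

```python
from string import Template

# Hypothetical contract template: the model fills named slots only,
# never free-form sections, so it cannot hallucinate clauses.
TEMPLATE = Template(
    "SERVICE AGREEMENT\n"
    "Customer: $customer\n"
    "Deal value: $$${deal_value} USD\n"
    "Term: $term_months months\n"
)

REQUIRED_FIELDS = {"customer", "deal_value", "term_months"}

def validate(fields: dict) -> list[str]:
    """Second pass: check required fields are present and well-formed."""
    errors = [f"missing: {f}" for f in REQUIRED_FIELDS - fields.keys()]
    if "deal_value" in fields and not str(fields["deal_value"]).isdigit():
        errors.append("deal_value must be numeric")
    return errors

def render(fields: dict) -> str:
    errors = validate(fields)
    if errors:
        raise ValueError("; ".join(errors))
    return TEMPLATE.substitute(fields)

doc = render({"customer": "Acme Corp", "deal_value": "50000",
              "term_months": "12"})
```

The validation step is what makes the pattern reliable: a failed check blocks the document instead of shipping a contract with a blank or invented clause.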
Pattern 3: Structured extraction
Unstructured text in, structured JSON out. This is one of the highest-ROI generative AI patterns in enterprise software — converting the vast amounts of unstructured data that already exist (emails, PDFs, call transcripts, form submissions, meeting notes) into structured records that CRM and ERP systems can process.
The critical engineering practice for extraction is output schema enforcement: define the exact JSON schema you need, instruct the model to produce only that schema, and validate every output against it before writing to your database. Modern LLMs are very good at this with a well-designed schema and a few examples in the system prompt.
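The enforcement step can be sketched with a hand-rolled validator; the schema and field names here are hypothetical, and a production system might use a JSON Schema library instead. The key idea is the same: nothing reaches the database unless it parses and matches the schema exactly.

```python
import json

# Hypothetical extraction schema: field name -> expected Python type.
SCHEMA = {"name": str, "company": str, "deal_value": int, "close_date": str}

def validate_extraction(raw: str) -> dict:
    """Parse model output and reject anything that violates the schema."""
    record = json.loads(raw)  # raises ValueError on malformed JSON
    extra = set(record) - set(SCHEMA)
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    for field_name, expected in SCHEMA.items():
        if field_name not in record:
            raise ValueError(f"missing field: {field_name}")
        if not isinstance(record[field_name], expected):
            raise ValueError(f"{field_name}: expected {expected.__name__}")
    return record  # only now is it safe to write to the database

model_output = ('{"name": "Jane Doe", "company": "Acme", '
                '"deal_value": 50000, "close_date": "2025-06-30"}')
record = validate_extraction(model_output)
```

On validation failure, typical pipelines either retry the model with the error message appended or route the document to human review, rather than silently dropping it.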
Pattern 4: Synthetic data generation
LLMs are excellent at generating realistic synthetic training data — examples of the input/output pairs you want a model to learn from. This is particularly valuable when you have limited real-world labeled data, when your real data contains PII that can’t be used for training, or when you need to generate edge cases that are rare in production but important for model robustness.
A typical workflow: define your task schema (input format, output format, quality criteria), generate 100–500 synthetic examples using a strong LLM (GPT-4, Claude), have a human review a sample, then use the validated examples to fine-tune a smaller model for the specific task. This “teacher-student” pattern produces specialized models that are faster and cheaper than the original large model.
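The review stage of that workflow can be sketched as follows. The examples here are stand-ins for what a teacher model would generate, and `passes_quality_check` is a hypothetical structural filter; the point is to run cheap automated checks first, then draw a reproducible sample for human review.

```python
import random

# Stand-ins for examples a strong "teacher" model would generate.
synthetic = [
    {"input": f"ticket text {i}",
     "output": {"category": "billing", "priority": i % 3}}
    for i in range(200)
]

def passes_quality_check(example: dict) -> bool:
    # Cheap automated filter before human review: structural checks only.
    return (
        isinstance(example.get("input"), str)
        and isinstance(example.get("output"), dict)
        and example["output"].get("priority") in (0, 1, 2)
    )

valid = [ex for ex in synthetic if passes_quality_check(ex)]

# Fixed-seed sample for human review; the validated set then goes on
# to fine-tune the smaller "student" model.
rng = random.Random(42)
review_sample = rng.sample(valid, k=20)
```

Seeding the sampler makes the review set reproducible, so reviewers and the pipeline always agree on which examples were audited.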
Pattern 5: Code generation
LLMs are now capable of generating production-quality code for well-defined tasks: Apex classes, Lightning Web Components, SQL queries, test suites, data migration scripts. The key constraints are context and specificity: the model needs to see relevant existing code (via RAG or tool context), receive a precise specification of what to generate, and produce output that is validated before execution.
In tools like Cursor and GitHub Copilot, this pattern is interactive — the developer iterates with the model. In automated pipelines (test generation, boilerplate scaffolding), it runs autonomously with validation gates. The pattern is mature and the ROI is proven — see our toolchain post for how we integrate code generation into our delivery workflow.
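A validation gate for an autonomous pipeline can be very simple at its first layer: reject generated code that does not even parse. This sketch checks Python syntax with the standard `ast` module; real gates add linting, type checks, and sandboxed test runs on top.

```python
import ast

def gate_generated_python(source: str) -> bool:
    """First validation gate: reject generated code that fails to parse.
    Later gates would add linting, type checks, and sandboxed tests."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b)\n    return a + b\n"  # missing colon

accepted = gate_generated_python(good)
rejected = not gate_generated_python(bad)
```

A failed gate typically triggers a retry with the error message fed back to the model, which resolves most syntax-level failures in one or two iterations.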
Pattern 6: Multimodal pipelines
Processing non-text inputs — images, PDFs, audio recordings, video — and extracting structured insights from them. This is the most complex pattern and the one with the most active model development. Use cases: extracting data from scanned invoices (vision), transcribing and summarizing sales calls (audio), analyzing product images for quality control (vision + classification).
The engineering challenge is pipeline design: each modality requires different preprocessing (OCR for scanned documents, speech-to-text for audio), and the outputs need to be normalized into a consistent format before downstream processing. Claude’s vision capabilities and Gemini’s multimodal API are the most capable options in the enterprise space today.
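The normalization step can be sketched as a set of small adapters converging on one record shape. The record fields and adapter functions here are hypothetical illustrations; the actual OCR and speech-to-text preprocessing is assumed to have run upstream.

```python
from dataclasses import dataclass, field

@dataclass
class NormalizedRecord:
    """Common shape every modality pipeline converges on before
    downstream processing."""
    source_type: str               # e.g. "scanned_invoice", "call_audio"
    text: str                      # OCR output or transcript
    metadata: dict = field(default_factory=dict)

def from_ocr(page_text: str, page: int) -> NormalizedRecord:
    # Hypothetical adapter for the OCR path.
    return NormalizedRecord("scanned_invoice", page_text, {"page": page})

def from_transcript(utterances: list[tuple[str, str]]) -> NormalizedRecord:
    # Hypothetical adapter for the speech-to-text path: flatten speaker turns.
    text = "\n".join(f"{speaker}: {line}" for speaker, line in utterances)
    return NormalizedRecord("call_audio", text, {"turns": len(utterances)})

records = [
    from_ocr("Invoice #123  Total: $450.00", page=1),
    from_transcript([("Rep", "Thanks for calling."), ("Customer", "Hi.")]),
]
```

Once everything is a `NormalizedRecord`, the downstream steps (extraction, summarization, routing) can stay modality-agnostic, which keeps the pipeline maintainable as new input types are added.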