Recipes for AI: From Pilots to Production

Álvaro de Nicolás · January 1, 2026

What follows are the practical lessons, methodologies and recipes I have distilled from the Revamp, HBX and Zapier playbooks, cross-checked against the latest industry reports. They are organized to be applied immediately: model management first, then segmentation, then the cultural traits and the timeline-by-timeline recipes that separate the companies extracting value from the companies still stuck in the pilot graveyard.

1. Managing the model: hallucinations, reasoning and “thinking”

The core lesson. Do not mistake fluency (sounding smart) for reliability (being right). Large language models act as a vast index of memorized patterns, not a logical brain.

Practical recipes to avoid hallucinations

Use “thinking slow” architectures. Do not ask the LLM to solve complex constraint problems directly. Use it as a universal UI to translate human language into code or commands for reliable tools (calculators, databases).
The “data structure” fix. Simple RAG implementations often fail because models struggle to link data silos. Invest in offline processing where you use LLMs to pre-build indices, summaries and links between your documents before a user asks a question.
Force “thinking” mode. For complex tasks, use chain-of-thought reasoning models (OpenAI o-series, DeepSeek R1). They plan before answering and raise accuracy on logic tasks — though they are not immune to errors.
The golden rule of prompting. If company-specific or recent information isn’t in the prompt, assume the model doesn’t know it. Treat the model as stateless and predictive: it predicts the next token, it does not check facts.
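The “thinking slow” recipe above can be sketched as follows. This is a minimal illustration, not a production design: the JSON command format, the `TOOLS` registry and the `run_model_command` helper are all assumptions made for the example, and the model call is replaced by a hard-coded string. The point is only that the model translates the request into a command, while a deterministic tool does the arithmetic.

```python
import ast
import json
import operator

# Arithmetic operators the calculator tool is allowed to execute.
_ALLOWED_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def calculate(expression: str) -> float:
    """Evaluate a plain arithmetic expression without using eval()."""

    def _eval(node):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _ALLOWED_OPS:
            return _ALLOWED_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _ALLOWED_OPS:
            return _ALLOWED_OPS[type(node.op)](_eval(node.operand))
        # Anything else (names, calls, attributes) is rejected.
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


# Hypothetical registry of reliable tools the model may address.
TOOLS = {"calculator": calculate}


def run_model_command(model_output: str):
    """Execute a tool call that the model emitted as JSON (assumed format)."""
    command = json.loads(model_output)
    tool = TOOLS.get(command.get("tool"))
    if tool is None:
        raise ValueError(f"unknown tool: {command.get('tool')!r}")
    return tool(command["argument"])


# Stand-in for a real model response to "What is 21% of 1,200?"
fake_model_output = '{"tool": "calculator", "argument": "1200 * 21 / 100"}'
result = run_model_command(fake_model_output)
print(result)
```

The model never computes the number itself. If it emits something the calculator does not recognize, the call fails loudly instead of producing a fluent but wrong answer.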

Questions to ask ourselves

Am I asking the AI to reason (high risk) or to transform information I provided (lower risk)?
If the data changes (a new CEO, a new policy), will the model’s training data contradict the facts I retrieve? RAG systems often hallucinate when new facts conflict with pre-training.

2. The Revamp Matrix and LLM cross-checking

The Revamp principles map cleanly onto current LLM capabilities — and reveal where the analogy strains.

“Data must remain in-house for high-sensitivity tasks.” Convergent. Open-weight models (DeepSeek R1, Llama) allow local hosting and align with the privacy mandate. Divergent: proprietary models still lead on multi-step agentic behaviour.
“Human-in-the-loop is mandatory for judgment-based tasks.” Convergent. Current AI achieves only ~60–70% reliability on basic reasoning. Using AI for drafting (contracts, HR letters) is comparatively safe because a human reviews the draft before it is used; using it for decisions (hiring) without oversight is dangerous.
“Focus on task automation, not job replacement.” Convergent. The PwC Jobs Barometer shows AI-affected industries see ~3× higher productivity growth, not job collapse. The focus is augmenting human judgment, not replacing it.

3. Methodology: segmentation and triage

Do not ask “can AI do this?” Ask “should we do this now?”

Step 1 — the feasibility filter

Data readiness. Is the data clean and accessible? If you need to scrape from three incompatible systems, the use case is Tier 3 or 4 (park / later).
Process maturity. If the process varies every time a human does it, AI cannot reliably automate it.
GDPR & privacy.
- Green (low risk): no personal data (policy summarization, coding). Deploy immediately.
- Yellow (medium risk): pseudonymized or internal data. Enterprise licence + DPIA required.
- Red (high risk): automated decision-making on personal data (firing/hiring). Do not deploy.
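The traffic-light rule above is simple enough to encode as a check that runs before any use case enters the backlog. The sketch below is one possible reading of it: the `UseCase` fields and the `gdpr_rating` function are hypothetical names, and lumping all personal or internal data into a single flag is a simplifying assumption, since the list above does not say how to rate identifiable personal data that is neither pseudonymized nor used for automated decisions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    name: str
    # Assumption: one flag covers pseudonymized, internal and other personal data.
    touches_personal_or_internal_data: bool
    automated_decision_on_personal_data: bool


def gdpr_rating(use_case: UseCase) -> tuple[str, str]:
    """Return (colour, action) following the green/yellow/red list above."""
    if use_case.automated_decision_on_personal_data:
        return ("red", "do not deploy")
    if use_case.touches_personal_or_internal_data:
        return ("yellow", "enterprise licence + DPIA required")
    return ("green", "deploy immediately")


examples = [
    UseCase("policy summarization", False, False),
    UseCase("burnout detection", True, False),
    UseCase("automated hiring decisions", True, True),
]
for uc in examples:
    print(uc.name, gdpr_rating(uc))
```

The red check comes first on purpose: a use case that makes automated decisions about people stays red regardless of how its data is prepared.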

Step 2 — the value filter

High value solves a strategic bottleneck (skills shortage) or a massive time-sink (reporting). Low value is “cool tech” with little business impact (generative avatars for internal meetings).

The resulting matrix

Tier	Profile	Example	When
Tier 1	High value + high feasibility	Standardized HR reply generator, code documentation	Year 1
Tier 2	High value + medium feasibility	Burnout detection (sensitive data)	Year 1–2 after governance
Tier 4	Low value + low feasibility	Full autonomous org design	Ignore
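As a small illustration, the matrix can be kept as a lookup table so that every proposed use case gets the same answer. The names `TIER_TABLE` and `triage` are hypothetical. The sketch encodes only the three rows listed above and returns `None` for any other combination rather than guessing a tier the matrix does not define.

```python
from typing import Optional

# (value, feasibility) -> (tier, when), copied from the matrix rows above.
# Combinations the matrix does not list are deliberately absent.
TIER_TABLE = {
    ("high", "high"): ("Tier 1", "Year 1"),
    ("high", "medium"): ("Tier 2", "Year 1–2 after governance"),
    ("low", "low"): ("Tier 4", "Ignore"),
}


def triage(value: str, feasibility: str) -> Optional[tuple[str, str]]:
    """Look up the tier for a use case, or None if the matrix has no row for it."""
    return TIER_TABLE.get((value.strip().lower(), feasibility.strip().lower()))


print(triage("High", "High"))
print(triage("high", "medium"))
print(triage("low", "high"))  # not covered by the matrix
```

Returning `None` forces a human discussion for the uncovered cases instead of silently assigning them a tier.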

4. Corporate traits and pitfalls

Traits of successful adopters

Executive clarity. The CEO or CFO must explicitly state “AI is a baseline capability.” This is a cultural signal, not just an IT project.
“Code red” urgency. They treat AI adoption as an immediate strategic necessity, not a “nice to have.” In the early stages, speed matters more than perfection.
Distributed innovation. They don’t wait for a central AI team to build everything. They empower AI champions in every department to build their own workflows.

Common pitfalls

The pilot graveyard. Many pilots, no plan to scale. Fix: define scaling metrics (ROI) before the pilot starts.
Legacy mindsets. Bolting AI onto broken processes. Fix: redesign the workflow first. Don’t automate a bad process — eliminate the friction.
Data paralysis. Waiting for “perfect data.” Fix: start with use cases that rely on unstructured data (text, documents), which does not require perfect schemas.

5. Recipes by timeline and impact

Year 1 — the quick wins (cost & efficiency)

Minimal integration, immediate time savings.

Recipe 1 — the “first draft” engine. AI drafts standard responses (HR queries, customer support), contracts and policy documents. Upload policy PDFs to a secure RAG system; prompt: “Answer this employee query based only on the attached policy.” ROI: 250–300 hours saved per team per year.
Recipe 2 — meeting & knowledge distillation. Auto-summarize meetings and extract action items via Microsoft Copilot or specialized plugins. ROI: 150+ hours per manager per year.
Recipe 3 — coding & technical acceleration. GitHub Copilot or Cursor for code generation, documentation and legacy refactoring. ROI: 20–50% developer productivity gain.
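For Recipe 1, the prompt itself is where the “based only on the attached policy” constraint is enforced. The sketch below shows one way to assemble such a prompt from retrieved policy excerpts. The function name `build_grounded_prompt`, the excerpt numbering and the extra instruction to admit when the policy is silent are assumptions added for illustration. Retrieval and the actual model call are out of scope here.

```python
def build_grounded_prompt(question: str, policy_excerpts: list[str]) -> str:
    """Assemble a prompt that restricts the model to the supplied policy text."""
    if not policy_excerpts:
        # With nothing retrieved, stop here instead of letting the model guess.
        raise ValueError("no policy text retrieved")
    numbered = "\n\n".join(
        f"[{i}] {text.strip()}" for i, text in enumerate(policy_excerpts, start=1)
    )
    return (
        "Answer this employee query based only on the attached policy.\n"
        # Assumption: an explicit fallback instruction, not part of the original prompt.
        "If the policy does not cover the question, say so.\n\n"
        f"Policy excerpts:\n{numbered}\n\n"
        f"Employee query: {question.strip()}"
    )


prompt = build_grounded_prompt(
    "How many days of parental leave do I get?",
    ["Parental leave: employees are entitled to 16 weeks of paid leave."],
)
print(prompt)
```

Numbering the excerpts makes it easier for a reviewer to check which passage a drafted answer relies on.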

Year 2 — strategic integration (value & revenue)

Requires data integration and governance maturity.

Recipe 4 — the skills intelligence layer. Shift from job titles to a skills-based organization. Use AI to infer skills from CVs, project history and reviews to build a dynamic skills graph. ROI: enables internal mobility, reduces hiring cost.
Recipe 5 — customer/talent signal detection. Sentiment analysis on sales calls or employee surveys flags churn and flight risk early. ROI: 10–15% reduction in unwanted attrition.
Recipe 6 — testing & calibration. Use AI to check performance reviews for bias and inconsistency across managers. Anonymize the data and score for clarity, bias and actionability. ROI: higher fairness and defensibility in HR decisions.
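For Recipe 6, names should be removed from review text before any scoring happens. The sketch below replaces known employee names with stable keyed-hash tokens. The function names, the `PERSON_` token format and the assumption that an HR system supplies the list of names are all illustrative. Strictly speaking this is pseudonymization rather than anonymization: whoever holds the key can re-identify people, so the output still belongs in the Yellow category from section 3.

```python
import hashlib
import hmac
import re


def pseudonym(name: str, secret_key: bytes) -> str:
    """Map a name to a stable token; the same name always yields the same token."""
    digest = hmac.new(secret_key, name.lower().encode("utf-8"), hashlib.sha256).hexdigest()
    return f"PERSON_{digest[:8]}"


def pseudonymize_review(text: str, known_names: list[str], secret_key: bytes) -> str:
    """Replace every known name in the review text with its token."""
    # Longer names first, so "Ana Ruiz" is replaced before the shorter "Ana".
    for name in sorted(known_names, key=len, reverse=True):
        pattern = re.compile(rf"\b{re.escape(name)}\b", flags=re.IGNORECASE)
        text = pattern.sub(pseudonym(name, secret_key), text)
    return text


key = b"replace-with-a-real-secret"  # hypothetical key; keep it out of source control
review = "Ana Ruiz delivered late twice. Ana is a strong communicator."
print(pseudonymize_review(review, ["Ana", "Ana Ruiz"], key))
```

A name list only catches the names it contains. Nicknames, misspellings and indirect identifiers (team, role, project) need separate handling before the data can be treated as low risk.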

The board-level takeaway

The competitive edge in 2026 is not adoption of AI — it is the discipline of AI. Everyone has access to the same frontier models. The moat is organizational: governance, security, workflow redesign, and an honest tiering of use cases that respects feasibility and value at the same time. Less hype, more plumbing.
