Enterprise AI Thinking
A Framework for
Getting Enterprise AI Right
AI adoption in enterprises fails more often than it succeeds — not because the technology doesn't work,
but because the organisational, data, and security foundations aren't in place. This page covers the
core questions every enterprise must answer and the horizontal capabilities that underpin serious AI deployments.
01 — Getting Started
Where Do We Start?
Most AI initiatives fail at the first step: choosing the wrong problem. The right entry point balances
near-term value with long-term capability building — and it starts with honest assessment, not enthusiasm.
Not every problem needs AI. Start by listing the 10–15 repetitive, high-effort tasks in your organisation
where the cost of a wrong answer is low and the volume is high. These are your best pilot candidates.
Avoid starting with high-stakes decisions (credit, compliance, HR) until you have operational experience.
Use a frontier model API (GPT-4, Claude, Gemini) when speed matters and your data isn't sensitive.
Fine-tune an open-weight model (Llama, Mistral) when you need data control or cost at scale.
Build from scratch only for competitive-moat use cases where the model behaviour is deeply proprietary.
Most enterprises should start with APIs and migrate selectively.
Quick wins (document summarisation, internal Q&A, meeting notes) deliver ROI within weeks and build
internal confidence. Strategic bets (agentic workflows, customer-facing AI, decision automation) need
12–18 months of infrastructure and trust-building. Run both as a portfolio, and don't let
quick wins become the permanent ceiling.
AI readiness isn't about GPU clusters. It's about: (1) clean, accessible data in the relevant domain,
(2) at least one technically literate champion in the business team, (3) a governance structure that
can say "no" to a use case, and (4) executive tolerance for a 6-month iteration cycle before
measuring ROI.
02 — Data Privacy
What About Data Privacy?
Enterprise data is fundamentally different from public internet data. Before any AI deployment touches
real business information, the data governance questions need clear answers — legally, contractually,
and architecturally.
India's Digital Personal Data Protection (DPDP) Act, sector regulations (RBI, SEBI, IRDAI), and
customer contracts may restrict where data can be processed. Audit which data crosses borders when
you call a cloud LLM API. Many enterprises are surprised to find their contracts prohibit sending
certain data to US-based inference endpoints.
Read the data processing addendum, not just the privacy policy. Key questions: Is my data used to
train future models? Are prompts logged, and for how long? What is the data retention policy on
conversation history? Enterprise-tier agreements from major vendors typically offer opt-outs, but
you must negotiate them — they are not on by default.
A data gateway (see capabilities section) can strip or pseudonymise PII before it leaves your
perimeter. Customer names, Aadhaar numbers, account IDs, and mobile numbers should be replaced
with tokens before hitting any external LLM. The model can still reason; the sensitive identifiers
never leave your network.
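The tokenisation step can be sketched in a few lines. This is a minimal illustration, not production-grade PII detection: the regex patterns, token format, and function names are all assumptions for the example, and a real gateway would use a proper entity recogniser and a secure token vault.

```python
import re

# Illustrative patterns only -- real PII detection needs more than regex.
PATTERNS = {
    "AADHAAR": re.compile(r"\b\d{4}\s?\d{4}\s?\d{4}\b"),  # 12-digit Aadhaar
    "MOBILE": re.compile(r"\b[6-9]\d{9}\b"),              # Indian mobile number
}

def pseudonymise(text: str) -> tuple[str, dict[str, str]]:
    """Replace sensitive identifiers with tokens before the text leaves
    the perimeter; return the mapping so the gateway can re-substitute
    real values into the model's response."""
    mapping: dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        for i, match in enumerate(pattern.findall(text)):
            token = f"<{label}_{i}>"
            mapping[token] = match
            text = text.replace(match, token, 1)
    return text, mapping

def restore(text: str, mapping: dict[str, str]) -> str:
    """Re-substitute the original values inside the perimeter."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text
```

The model sees `<MOBILE_0>` instead of the number, can still reason about "the customer's mobile number", and the gateway restores the real value only after the response is back inside the network.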
For the most sensitive workloads (legal, HR, patient data, trading signals), run models inside
your own infrastructure. Open-weight models like Llama 3 and Mistral can run on a modest 4×A10
cluster or even CPU-only for smaller models. Cloud providers also offer VPC-isolated endpoints
where your data never touches shared infrastructure.
03 — Security
What Are the Security Implications?
AI systems introduce attack surfaces that traditional enterprise security wasn't designed for.
The threats are real, the mitigations are known, but they require deliberate architectural choices.
Prompt injection is the AI equivalent of SQL injection: a malicious input in a document or email
can hijack what your AI agent does. If your AI reads external content (emails, PDFs, web pages)
and then takes actions, it is vulnerable. Mitigations include strict output parsing, privilege
separation, and never letting the AI both read untrusted input and take high-value actions in
the same pipeline step.
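Privilege separation can be made concrete with a sketch like the following. The function names, the action allowlist, and the two-step split are illustrative assumptions, but the principle is exactly as above: the step that reads untrusted content can only produce text, and the step that acts validates every requested action against an allowlist, so instructions injected into a document cannot name an arbitrary action.

```python
# Actions the pipeline is ever allowed to take -- illustrative allowlist.
ALLOWED_ACTIONS = {"draft_reply", "summarise", "escalate_to_human"}

def read_untrusted(document: str) -> str:
    """Reader step: its output is treated as data, never as instructions.
    In a real pipeline this would call a model that has no tool access."""
    return f"[summary of {len(document)} chars of untrusted content]"

def execute(action: str, payload: str) -> str:
    """Actor step: refuses anything outside the allowlist, so a prompt
    injected into the document cannot trigger an unlisted action."""
    if action not in ALLOWED_ACTIONS:
        raise PermissionError(f"action {action!r} not permitted")
    return f"executed {action}"
```

The key property is that no string produced by `read_untrusted` ever flows into the `action` argument of `execute`; the action is chosen by trusted orchestration code.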
Retrieval-augmented generation (RAG) pulls documents into the context window at query time.
If your retrieval layer doesn't enforce document-level access controls, a user can ask a question
that retrieves a document they aren't supposed to see. Apply row-level security in your vector
store, and ensure the retrieval pipeline respects the same permissions as your document system.
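The access check belongs inside the retrieval call itself, not in a layer above it. A minimal sketch, assuming an in-memory store with a per-document role list (both illustrative; in practice the filter is pushed down into the vector store query):

```python
# Illustrative document store with per-document ACLs.
DOCS = [
    {"id": "d1", "text": "Q3 board pack", "allowed_roles": {"cfo", "board"}},
    {"id": "d2", "text": "Public FAQ", "allowed_roles": {"cfo", "board", "analyst"}},
]

def retrieve(query: str, user_roles: set[str]) -> list[str]:
    """Return only documents the calling user is entitled to see.
    The ACL check runs before ranking, so restricted documents never
    enter the context window at all."""
    return [
        d["text"] for d in DOCS
        if d["allowed_roles"] & user_roles  # set intersection = any shared role
    ]
```

An analyst asking a question that would semantically match the board pack simply never retrieves it; the model cannot leak what it never saw.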
Treat AI agents like privileged service accounts, not superusers. Give each agent the minimum
permissions required for its task. Use short-lived credentials for external API calls. Apply
the same IAM policies you'd apply to a junior analyst — they can read, they can draft, but
they can't commit, transfer, or delete without a second confirmation.
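The "second confirmation" rule can be enforced in code rather than policy. A sketch, with an illustrative set of high-value verbs and a hypothetical approval flag standing in for a real approval workflow:

```python
# Verbs that always require explicit human approval -- illustrative list.
HIGH_VALUE = {"commit", "transfer", "delete"}

def run_tool(verb: str, args: dict, approved: bool = False) -> str:
    """Gate high-value verbs behind an approval flag. Low-stakes verbs
    (read, draft, summarise) pass straight through; high-value verbs
    return a pending marker until a human signs off."""
    if verb in HIGH_VALUE and not approved:
        return f"PENDING_APPROVAL:{verb}"
    return f"done:{verb}"
```

The agent itself never holds the credential that flips `approved` to `True`; that belongs to the human reviewer's session.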
Every AI decision that affects a customer or triggers a workflow should be logged: the prompt,
the retrieved context, the model response, and the downstream action. Not for ML debugging —
for compliance, incident response, and regulator conversations. Build structured logging into
your AI layer from day one; retrofitting it later is painful.
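The logging itself is simple if it is designed in from the start. A sketch of one structured record per AI decision, written as a JSON line; the field names are illustrative, but the four elements match the list above: prompt, retrieved context, response, and downstream action.

```python
import json
import time
import uuid

def audit_record(prompt: str, context_ids: list[str],
                 response: str, action: str) -> dict:
    """One structured record per AI decision: enough to reconstruct the
    decision later for compliance or incident response."""
    return {
        "event_id": str(uuid.uuid4()),
        "ts": time.time(),
        "prompt": prompt,
        "retrieved_context": context_ids,  # which documents were in scope
        "model_response": response,
        "downstream_action": action,       # what the system did with it
    }

def to_log_line(record: dict) -> str:
    """JSON lines: append-only, greppable, easy to ship to a SIEM."""
    return json.dumps(record, sort_keys=True)
```

Writing these as append-only JSON lines keeps the format trivially parseable when a regulator asks what the system saw before it acted.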
04 — From Pilots to Production
How Do We Scale Beyond the Pilot?
Most enterprise AI pilots succeed on their own terms. Most never reach production. The gap isn't
technical — it's organisational, economic, and architectural.
Common causes: (1) No production owner — the pilot was run by IT or a consultant with no
internal champion. (2) Data isn't clean enough at scale — the pilot used curated data.
(3) Integration cost was underestimated — connecting to legacy systems takes 10× longer than
the AI part. (4) The KPI was fuzzy — "better experience" doesn't survive a budget review.
The biggest scaling blockers are human, not technical. Risk and compliance teams need a risk
framework before they'll approve a live deployment. Legal needs liability language for AI
outputs. HR needs a reskilling narrative. Line managers need to trust the output enough to
act on it. These conversations take months and should start during the pilot, not after.
A pilot running 100 queries a day has very different infrastructure requirements from 100,000.
Latency, token costs, rate limits, and model availability become real engineering problems.
Plan for caching, request batching, fallback models, and cost alerting. A single LLM call
can cost ₹0.01 — at 10 million calls a month, that's ₹1 lakh in inference alone.
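The arithmetic is trivial, but it is worth automating so the alert fires before the invoice arrives. A sketch with illustrative numbers matching the example above:

```python
def monthly_inference_cost(cost_per_call_inr: float,
                           calls_per_month: int) -> float:
    """Projected monthly spend: straight multiplication."""
    return cost_per_call_inr * calls_per_month

def cost_alert(projected_inr: float, budget_inr: float) -> bool:
    """True when the projection breaches budget -- the signal to switch
    traffic to a cheaper model or serve more requests from cache."""
    return projected_inr > budget_inr
```

At ₹0.01 per call and 10 million calls, the projection lands at ₹1,00,000 (₹1 lakh), exactly the figure above; the alert is what turns that from a surprise into a routing decision.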
"Hours saved" is a start, but the CFO wants to see headcount efficiency, error rate reduction,
or revenue per FTE. Pick one KPI per use case, establish a baseline before the pilot, and
measure the same way after. Also track failure modes: hallucination rate, escalation rate,
user override rate. These tell you where the model isn't trusted yet.
05 — Horizontal Capabilities
The Enterprise AI Capabilities Stack
Beyond individual use cases, enterprises need a reusable infrastructure layer. These six capabilities
appear in every serious enterprise AI architecture — they don't belong to any one use case;
they serve all of them.
🔀
Model Routing & Orchestration
Route each request to the right model based on task type, required intelligence, latency, and cost. A question about a PDF header doesn't need GPT-4 — it needs a fast, cheap model. Orchestration layers like LangGraph or custom routers manage multi-step agent workflows.
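A router does not have to be sophisticated to pay for itself. A minimal rule-based sketch, where the task categories, model names, and latency threshold are all illustrative placeholders:

```python
# Illustrative routing table: cheap model by default, escalate by task type.
ROUTES = {
    "extraction": "small-fast-model",
    "qa": "small-fast-model",
    "reasoning": "frontier-model",
    "agentic": "frontier-model",
}

def route(task_type: str, latency_budget_ms: int = 2000) -> str:
    """Pick a model by task type; a tight latency budget overrides the
    escalation to a big model, since the big model is also the slow one."""
    model = ROUTES.get(task_type, "frontier-model")  # unknown tasks escalate
    if latency_budget_ms < 500:
        return "small-fast-model"
    return model
```

Even this crude version captures the point: the PDF-header question takes the cheap path, and only genuinely hard tasks pay frontier-model prices.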
🚪
Data Gateway
A single controlled layer through which all AI systems access enterprise data. Enforces access policies, strips PII before external calls, logs every data access event, and provides a consistent interface for RAG pipelines, agents, and analytics tools.
🔌
MCP — Model Context Protocol
An emerging open standard (Anthropic-originated, now broadly adopted) that lets AI models connect to tools, databases, and APIs through a standardised interface. Think of it as USB-C for AI integrations — instead of building custom connectors for each tool, you implement MCP once.
🛡️
Security-Friendly Architecture
VPC-isolated inference endpoints, zero-trust networking for agent-to-tool calls, air-gapped deployments for regulated workloads, and secrets management for API keys. AI adds new attack surfaces; the architecture must treat it as a privileged internal service, not a SaaS integration.
📡
Evaluation & Observability
Continuous measurement of model quality in production — not just accuracy on a benchmark, but hallucination rate, citation accuracy, user override rate, and latency percentiles. Tools like LangSmith, Arize, or custom eval harnesses catch model drift before it becomes a business incident.
🤝
Human-in-the-Loop Design
Deliberate insertion of human review at the right points in AI workflows — not everywhere (defeats the purpose) but at high-stakes, low-volume decision points. The design question is: when does the cost of a wrong AI answer exceed the cost of a human review? Build the answer into the workflow.
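That design question can be written down as an expected-value check. A sketch, with an illustrative review cost and a model-confidence input standing in for whatever calibration signal the system actually produces:

```python
def needs_review(stakes_inr: float, model_confidence: float,
                 review_cost_inr: float = 500.0) -> bool:
    """Route to a human when the expected cost of a wrong AI answer
    exceeds the cost of a human review. All numbers are illustrative."""
    expected_error_cost = (1.0 - model_confidence) * stakes_inr
    return expected_error_cost > review_cost_inr
```

A ₹1 lakh decision at 90% confidence carries an expected error cost of roughly ₹10,000, well above a ₹500 review, so it goes to a human; a ₹1,000 decision at the same confidence does not. The threshold makes the high-stakes/low-volume trade-off explicit instead of leaving it to intuition.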
Who This Is For
The Right Audience for This Conversation
Enterprise AI decisions are made by a specific set of people — and resisted by others.
This is for
- CEOs and MDs setting the AI agenda
- CIOs and CTOs evaluating architecture choices
- Business leaders owning a high-value use case
- Risk, compliance, and legal heads stress-testing AI
- Strategy and transformation teams running pilots
Not the focus
- Developers building AI applications (different conversation)
- Data scientists optimising model performance
- Vendors pitching AI products
- Those looking for a specific tool recommendation
About
Swapnil Pawar works at the intersection of quantitative finance, AI systems, and enterprise strategy.
He has built and deployed AI applications across financial services, legal, and research domains —
and has advised senior leadership teams on how to think about AI adoption at scale.
His background spans investment management, management consulting, and entrepreneurship.
He holds a degree from IIT Bombay and an MBA from IIM Ahmedabad.
This page reflects frameworks developed through engagements with enterprise teams — not vendor
material, not academic theory. It is updated as the landscape evolves.