Frequently asked questions

Real questions. Honest answers.

Everything buyers, procurement, security review, and architecture teams ask us, collected from years of enterprise AI conversations. Skip ahead with the section nav.

Section 01

Concepts & definitions

What is the difference between an AI agent and a chatbot?

A chatbot makes one LLM call and stops. An AI agent combines an LLM with tools and a reasoning loop: it plans, calls tools (CRM, ERP, email, databases), reads results, decides what to do next, and repeats until done. A single message can trigger zero, one, or many tool calls; the agent decides at runtime. See how agents work.
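The loop described above can be sketched in a few lines. This is a minimal illustration, not our production code: `Step`, `run_agent`, `scripted_decide`, and the `lookup_order` tool are hypothetical names, and the scripted policy stands in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    """One model decision: call a tool, or finish with an answer."""
    tool: Optional[str] = None
    args: Optional[dict] = None
    answer: Optional[str] = None


def run_agent(message, decide, tools, max_steps=10):
    """Ask the model what to do, run the chosen tool, feed the result back, repeat."""
    history = [("user", message)]
    for _ in range(max_steps):
        step = decide(history)
        if step.tool is None:  # the model decided it is done
            return step.answer, history
        result = tools[step.tool](**(step.args or {}))
        history.append((step.tool, result))
    raise RuntimeError("step budget exhausted before the agent finished")


def scripted_decide(history):
    # Hypothetical stand-in for an LLM: look the order up once, then answer.
    if len(history) == 1:
        return Step(tool="lookup_order", args={"order_id": "A-1"})
    status = history[-1][1]
    return Step(answer=f"Order A-1 is {status}.")


TOOLS = {"lookup_order": lambda order_id: "shipped"}  # hypothetical tool

answer, history = run_agent("Where is order A-1?", scripted_decide, TOOLS)
print(answer)  # Order A-1 is shipped.
```

A chatbot, in these terms, is the same code with an empty tool set and a single pass: the loop and the tool calls are what make it an agent.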

What is the difference between agentic AI and generative AI?

Generative AI produces output (text, images, code) on demand. Agentic AI uses that generation capability inside a goal-directed loop with tools: it doesn't just write a response, it takes actions across systems to accomplish a goal. Every agentic AI system uses generative AI under the hood; not every generative AI system is agentic.

How is an AI agent different from RPA (robotic process automation)?

RPA scripts a fixed sequence of UI clicks. It breaks the moment the underlying screen changes. AI agents read intent and decide what to do at runtime; they navigate variability rather than memorize a path. RPA works well for stable, structured, high-volume tasks. Agents win on workflows that involve judgment, unstructured input, or systems that change. See our deeper comparison.

What does "Model Context Protocol" (MCP) mean and why does it matter?

MCP is an emerging open standard for how AI agents discover and call tools. Instead of each agent shipping bespoke integrations for every data source, MCP defines a uniform way to expose tools, resources, and prompts to an agent. We build MCP-native agents wherever the protocol is supported, with direct adapters where it isn't yet. As MCP adoption grows, existing integrations carry forward without a rebuild.
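To make "a uniform way to expose tools" concrete, here is a small sketch of the idea: every tool is described by a name, a description, and an argument schema, so an agent can discover and call any of them the same way. This is not the MCP SDK or its wire format. `Tool`, `ToolRegistry`, and `crm_lookup` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    input_schema: Dict[str, Any]  # JSON-Schema-style description of the arguments
    handler: Callable[..., Any]


class ToolRegistry:
    """Hypothetical registry: one discovery call, one invocation call, for every tool."""

    def __init__(self):
        self._tools = {}

    def register(self, tool):
        self._tools[tool.name] = tool

    def list_tools(self):
        # What an agent reads to discover the available tools.
        return [
            {"name": t.name, "description": t.description, "input_schema": t.input_schema}
            for t in self._tools.values()
        ]

    def call(self, name, arguments):
        tool = self._tools[name]
        required = tool.input_schema.get("required", [])
        missing = [key for key in required if key not in arguments]
        if missing:
            raise ValueError(f"missing arguments: {missing}")
        return tool.handler(**arguments)


registry = ToolRegistry()
registry.register(Tool(
    name="crm_lookup",  # hypothetical tool
    description="Fetch a contact by email.",
    input_schema={
        "type": "object",
        "properties": {"email": {"type": "string"}},
        "required": ["email"],
    },
    handler=lambda email: {"email": email, "tier": "gold"},
))
```

Adding a second system means registering another `Tool`; the agent-side discovery and calling code does not change.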

What is a "capability ladder" for an agent?

The five stages we use to describe how an agent matures in production: Inform (answers questions), Qualify (routes intent), Operate (does the work), Triage (escalates urgency), Anticipate (runs on a schedule). Most agents start at Stage 1 and climb. See the full ladder.

Will agents replace our team?

No, and that's not the design goal. The pattern is agents and people working together. Agents absorb the long tail of routine, repetitive work, and the team focuses on judgment calls, relationships, and the work only humans should do. Most clients reallocate rather than reduce headcount.

Section 02

Pricing & timeline

How much does an enterprise AI agent cost?

Engagements run in three phases: a fixed-scope Strategy & Discovery Sprint (4–8 weeks), an Agent Build (3–6 months to production), and ongoing Operations (monthly retainer). Enterprise integrations are quoted per system. Pricing depends on scope: number of integrations, data residency and regulatory requirements, custom model work, multilingual scope, SLA, and whether it's a single agent or a fleet. The Strategy Sprint produces a firm, procurement-ready quote. See how we engage, or bring us a workflow for a firm number.

How long does it take to build an enterprise AI agent?

Strategy & discovery: 4–8 weeks. MVP build to production: 8–16 weeks. Enterprise integrations add 2–8 weeks per system. A focused proof-of-concept on a single workflow can be running in 6–8 weeks. Full agent fleets across multiple workflows are 6–18 months.

What's the ROI of an AI agent?

Four sources: (1) deflected work: tier-1 tickets handled without staff time; (2) accelerated work: drafting in minutes vs. days; (3) recovered revenue: leads qualified in real time, churn flagged early; (4) risk reduction: compliance enforced consistently, audit trail always on. Most production deployments break even in 6–12 months. See our ROI framework.
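A break-even month can be estimated with simple arithmetic. The sketch below assumes a one-time build cost, a flat monthly running cost, and a flat monthly benefit. The function name and the dollar figures are illustrative assumptions, not client data or a quote.

```python
import math


def break_even_month(upfront_cost, monthly_cost, monthly_benefit):
    """First month in which cumulative benefit covers cumulative cost, or None if never."""
    net_per_month = monthly_benefit - monthly_cost
    if net_per_month <= 0:
        return None  # the agent never pays for itself at these rates
    return math.ceil(upfront_cost / net_per_month)


# Illustrative inputs only: a $300k build, $15k/month to run, $50k/month in benefit.
print(break_even_month(300_000, 15_000, 50_000))  # 9
```

With these example inputs the net benefit is $35k per month, so the $300k build is recovered during month 9.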

Can we run a paid POC before committing to a full Build?

Yes. A focused proof-of-concept on a single workflow lets us build a working agent on a real integration, deliver a quality report, and let you decide whether to extend into a full Build. Most do. We scope and quote the POC up front.

Are there hidden costs we should plan for?

Three line items beyond engagement fees: (1) model inference (pass-through at provider rates), (2) infrastructure (your AWS/Azure/GCP bill for hosting, monitoring, storage), (3) any third-party SaaS the agent calls (CRM seats, messaging, paid API endpoints). We forecast all three during the Strategy Sprint.

Do you offer outcome-based or revenue-share pricing?

Occasionally, for workflows where outcomes are cleanly measurable and attributable (recovered revenue from churn flagging, deflected tier-1 ticket cost). We propose it where it fits. Most engagements are flat-fee or T&M because outcomes get muddied by everything else the business is doing.

Will you publish a real SOW we can take to procurement?

Yes. At the end of a Strategy Sprint, you walk out with a procurement-ready SOW: deliverables, timeline, milestones, dependencies, acceptance criteria, and a not-to-exceed ceiling. No surprises at signing.

Section 03

Integration & tech

Which CRM, ERP, EHR, and LMS systems can Quantilus integrate?

Pre-built connectors for Salesforce, HubSpot, Microsoft Dynamics, NetSuite, SAP, Workday, Oracle, ServiceNow, Zendesk, Freshdesk, Jira, Epic, Cerner, Athenahealth (via HL7/FHIR), Canvas, Blackboard, Moodle, Google Classroom, Schoology, PowerSchool, Workday Student, Salesforce EDA, Slate, Ellucian Banner, Klopotek, Biblio, Firebrand, Arc XP, WordPress VIP, Drupal. Custom adapters for legacy/homegrown systems in 2–8 weeks.

Can the agent run inside Slack or Microsoft Teams?

Yes. Most of our agents have a Slack/Teams surface alongside or instead of a web UI. The agent's logic stays the same; the UI is just one channel. Many clients also run the agent on email and webhooks.

What about voice agents for phone calls and IVR replacement?

Yes. We've shipped voice agents on Twilio and similar telephony providers. Architecture is the same (LLM + tools + loop) with speech-to-text and text-to-speech in front. Voice adds latency requirements (sub-2-second responses) that shape model selection and caching strategy.

Does Quantilus build agents on Bedrock, Azure OpenAI, or Vertex?

All three, plus direct API access to Anthropic/OpenAI/Google for clients who prefer that, plus open-weight self-hosting for clients who need full data isolation. Model gateway is a design decision we make during Strategy based on your data-handling policy and cost-quality preferences. See /security.

What programming languages and frameworks do Quantilus agents use?

Typically Python or TypeScript/Node for agent orchestration, with provider SDKs (Anthropic, OpenAI, Google AI). Frontend admin consoles in React. Integration adapters in whatever the target system expects.

What about open-source agent frameworks such as LangChain, LlamaIndex, AutoGen, and CrewAI?

We use whatever fits; LangGraph and the Anthropic Agents SDK are common in our newer builds. The framework is an implementation detail. What matters is the agent's behavior, your guardrails, and the eval harness. We don't lock you into a vendor-specific framework that may not exist in 18 months.

What if our target system has no public API?

Several options: (1) database-direct integration with read-only views, (2) browser automation via Playwright for SaaS apps with no API (we've done this on AgilLink, several legacy practice-management systems, and proprietary financial tools), (3) email/file-based integration where the agent emits structured artifacts your existing system ingests.

Section 04

Security & compliance

Can Quantilus deploy AI agents without sending data to OpenAI or Anthropic?

Yes. Three private deployment models: (1) open-weight models in your VPC (LLaMA, Mistral, Qwen, Gemma), (2) frontier models via your cloud account (AWS Bedrock, Azure OpenAI, Google Vertex), (3) fully air-gapped with no internet egress. Your data never leaves the perimeter you control. See /security.

Are Quantilus AI agents HIPAA compliant?

Yes. We ship HIPAA-aligned deployments with BAA availability, PHI redaction, full audit trails, clinician-in-the-loop approval gates, and inference inside HIPAA-aligned environments (open-weight on customer VPC or via AWS Bedrock in HIPAA-eligible regions).

Does Quantilus support FERPA-aware AI for education?

Yes. Student records stay scoped to the institution, disclosure controls are enforced at the tool layer, and every access is logged. The integration layer is FERPA-aware for Canvas, Blackboard, Moodle, PowerSchool, and others.

Is Quantilus SOC 2 Type 2 ready?

Yes. Quantilus engagements include SOC 2 Type 2 controls and audit evidence. Customer-side KMS, BYOK, and HSM integration are available. Per-tool permission boundaries and audit retention are configurable per customer policy.

Does Quantilus support FedRAMP or air-gapped deployments?

Yes. FedRAMP-aligned deployment patterns, GovCloud-region deployments, and fully air-gapped configurations for defense, classified, and sensitive-but-unclassified workloads. No internet egress, no model-provider relationship, open-weight models on your hardware only.

How do you handle data residency?

We deploy in your region. EU data stays in EU regions, US data in US regions, India data in Mumbai/Hyderabad. We use region pinning and never replicate across regions without explicit approval. We also deploy in sovereign clouds (GovCloud, Microsoft Sovereign).

What if the agent gets something wrong?

Three layers: (1) the agent cites its sources, so wrong answers are auditable; (2) low-confidence outputs route to a human before action; (3) every action is reversible via the audit log and tool design. We tune from real failures during Operations: the agent gets better and the eval harness expands.

How do humans stay in the loop?

Approval gates are defined at design time: pricing changes, contract actions, regulated communications, sensitive customer cases, and anything below a confidence threshold. The agent drafts and proposes; a human approves before the action takes effect. Every decision is logged with the agent's reasoning attached.
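A gate like this reduces to a small routing rule. The sketch below assumes that low-confidence proposals are the ones sent to a human, as described under "What if the agent gets something wrong?". The names `Proposal`, `route`, and `GATED_ACTIONS` and the 0.8 floor are hypothetical; real thresholds are set per workflow.

```python
from dataclasses import dataclass

# Action types that always need a human, taken from the categories named above.
GATED_ACTIONS = {
    "pricing_change",
    "contract_action",
    "regulated_communication",
    "sensitive_customer_case",
}
CONFIDENCE_FLOOR = 0.8  # hypothetical value


@dataclass
class Proposal:
    action: str
    confidence: float
    reasoning: str


def route(proposal, audit_log):
    """Return 'await_approval' or 'execute', and log the decision with its reasoning."""
    needs_human = (
        proposal.action in GATED_ACTIONS
        or proposal.confidence < CONFIDENCE_FLOOR
    )
    decision = "await_approval" if needs_human else "execute"
    audit_log.append({
        "action": proposal.action,
        "confidence": proposal.confidence,
        "decision": decision,
        "reasoning": proposal.reasoning,
    })
    return decision


log = []
print(route(Proposal("pricing_change", 0.99, "Renewal discount requested."), log))  # await_approval
print(route(Proposal("send_receipt", 0.95, "Payment confirmed."), log))             # execute
print(route(Proposal("send_receipt", 0.50, "Payment status unclear."), log))        # await_approval
```

Note that a gated action waits for approval even at high confidence: the action type and the confidence score are independent triggers.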

Section 05

Operating an agent

How is agent quality measured?

Every Quantilus agent ships with an evaluation harness: a versioned set of test cases drawn from your real workflow, scored automatically on every change, with regression alarms. We measure task completion rate, factual accuracy (with citations), latency, cost per task, and human-override frequency. You see the dashboard.
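In outline, a harness of this kind is a fixed set of cases, a scoring function, and a comparison against the last accepted score. The sketch below is a simplified illustration: `Case`, `pass_rate`, and `check_regression` are hypothetical names, the example cases are invented, and exact-match scoring stands in for the richer accuracy, latency, and cost metrics listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    prompt: str
    expected: str


def pass_rate(agent, cases):
    """Fraction of cases where the agent's output matches the expected answer."""
    passed = sum(1 for case in cases if agent(case.prompt) == case.expected)
    return passed / len(cases)


def check_regression(agent, cases, baseline, tolerance=0.0):
    """Score the agent and flag a regression if it falls below the baseline."""
    score = pass_rate(agent, cases)
    return {"score": score, "baseline": baseline, "regressed": score < baseline - tolerance}


# Invented example cases; a real suite is drawn from the client's workflow.
CASES = [
    Case("refund window?", "30 days"),
    Case("support hours?", "9-5 ET"),
    Case("plan tiers?", "basic, pro"),
    Case("sso supported?", "yes"),
]

# Two stand-in "agents": the current version and a candidate that lost one answer.
current = {case.prompt: case.expected for case in CASES}.get
candidate = {case.prompt: case.expected for case in CASES[:3]}.get

baseline = pass_rate(current, CASES)
report = check_regression(candidate, CASES, baseline)
print(report)
```

Here the candidate passes three of four cases against a baseline of four out of four, so `report["regressed"]` is `True` and the change would be blocked.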

What happens when AI models improve? Do we have to rebuild?

No. The agent's behavior is defined by your policies, your knowledge layer, your tool definitions, and your eval harness. None of these is tied to a specific model. Swapping to a newer model is a config change plus a regression test pass. Most clients move to better models within a week of release.
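"A config change plus a regression test pass" can be pictured as follows. This is a sketch under assumptions: `AgentConfig`, `swap_model`, the model identifiers, and the 0.95 pass mark are all hypothetical, and `run_evals` stands in for a real eval harness run.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentConfig:
    model: str       # the only field a model upgrade touches
    policies: tuple
    tools: tuple


def swap_model(config, new_model, run_evals, min_score):
    """Adopt the new model only if the eval suite still meets the pass mark."""
    candidate = replace(config, model=new_model)
    score = run_evals(candidate)
    return candidate if score >= min_score else config


current = AgentConfig(
    model="model-v1",  # hypothetical identifier
    policies=("no_pricing_without_approval",),
    tools=("crm_lookup",),
)

# Stand-in eval runs: one where the new model passes, one where it does not.
upgraded = swap_model(current, "model-v2", run_evals=lambda cfg: 0.97, min_score=0.95)
rejected = swap_model(current, "model-v3", run_evals=lambda cfg: 0.80, min_score=0.95)

print(upgraded.model)  # model-v2
print(rejected.model)  # model-v1
```

Policies and tools are carried over unchanged in both cases; only the `model` field differs, and only when the evals pass.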

What does ongoing Operations actually include?

24/7 monitoring (uptime, latency, cost, output quality, drift), quality regression tests on every change, monthly ops report with usage/cost/accuracy/deflection metrics, policy refinement from real conversations, quarterly capability reviews + roadmap, model upgrades, bug triage, incident response, customer-side support.

Can our internal team take over Operations after launch?

Yes. About 30% of our clients run Operations themselves after the first 6–12 months. We deliver the runbook, train the team, and stay on retainer for upgrades and incident response. The other 70% find that a six-figure annual Operations cost is cheaper than hiring an in-house team.

What if we want to switch models, say from Claude to Gemini?

Run the eval harness against the new model, review the diff, ship. We've migrated agents across providers in a single sprint when there's a clear cost or quality reason. The eval harness is what makes this safe.

How do we know the agent is actually saving us money?

Every monthly Operations report includes deflection metrics (tasks the agent completed without human time), acceleration metrics (tasks completed faster than human baseline), and cost breakdown (engagement + inference + infrastructure vs. equivalent staffing cost). Some clients also instrument net-new revenue captured via churn flagging or qualified-lead routing.
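The cost comparison in the report comes down to one subtraction: the staffing cost of the deflected work minus what the agent costs to run. The sketch below shows that arithmetic. The function name and every figure are illustrative assumptions, not numbers from a real report.

```python
def monthly_net_savings(tasks_deflected, minutes_per_task, loaded_hourly_rate,
                        engagement_cost, inference_cost, infrastructure_cost):
    """Equivalent staffing cost of deflected tasks minus the agent's monthly cost."""
    staffing_equivalent = tasks_deflected * minutes_per_task / 60 * loaded_hourly_rate
    agent_cost = engagement_cost + inference_cost + infrastructure_cost
    return staffing_equivalent - agent_cost


# Illustrative month: 4,000 deflected tasks at 12 minutes each, $45/hour loaded rate.
savings = monthly_net_savings(
    tasks_deflected=4_000,
    minutes_per_task=12,
    loaded_hourly_rate=45,
    engagement_cost=15_000,
    inference_cost=2_500,
    infrastructure_cost=1_500,
)
print(savings)  # 17000.0
```

In this example the deflected work equals 800 staff hours ($36,000), against $19,000 in agent costs. Acceleration and net-new revenue would be added on top where they are instrumented.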

Section 06

About Quantilus

Where is Quantilus located?

Headquarters at 1345 Avenue of the Americas in New York City. Additional offices in California, Mumbai, Bangalore, and Hangzhou. 200+ engineers across five offices. Founded 2004.

What's the difference between quantilus.ai and quantilus.com?

Same company. quantilus.com is the parent domain covering 20+ years of enterprise software, AI/ML, and big-data practice across Fortune 500 clients. quantilus.ai is the focused site for business AI agents: the agent and workflow-automation practice of the firm.

Who is Quantilus a fit for?

Quantilus focuses on small and mid-sized businesses (roughly 20–2,000 employees), bringing enterprise-grade AI engineering to companies that don't have a large in-house AI team. The best-fit clients have a real workflow problem, an executive sponsor, an approved AI budget, and data access. We also serve larger enterprises. We're a poor fit for pre-product startups, demo-only POCs without a path to production, or organizations that haven't decided what they want AI to do.

Does Quantilus do staff augmentation or only product engagements?

Both. eNamix, a Quantilus company, provides staff augmentation, technical recruiting, and project management staffing for AI/ML, data engineering, cloud, full-stack, and PM roles. See /staffing.

Can we white-label Quantilus agents to ship to our customers?

Yes, under a bespoke OEM agreement. Several Quantilus engagements deliver an agent that the client embeds in their own product to ship to their end customers. Pricing, IP, and SLA structure differ from a single-tenant deployment. Contact us to discuss.

Do you sign customer paper or only your own MSA?

Either. We have a standard MSA we'll provide on request, and we sign customer paper for enterprise clients with their own legal templates. We'll work through your procurement and security review process; most engagements clear in 4–8 weeks.

How do I get started?

Reach out via /contact, email info@quantilus.ai, or call (212) 768-8900. The first call is a 30-minute scoping conversation; we'll tell you whether a Strategy Sprint, a paid POC, or a full Build is the right starting point. Most clients are in contract within 2–4 weeks of the first conversation.

Still have a question?

Send it over. We'll answer it directly, on a call, and add it here if it's useful for the next person.

Ask Us