
Ontologies: What They Are, Why They Matter Now

Vivek Khandelwal
Chief Business Officer, CoFounder @ CogniSwitch
Jan 30, 2026·9 Min Read·Updated May 2, 2026
Reviewed by: Dilip Ittyera — CEO & Co-Founder, CogniSwitch

I first understood the word "ontology", in theory, in June 2024. It took another quarter before I saw it used in a POC. Only in Q1 2025, when we started actively pushing agents to production, did my understanding become firm, thanks to actual implementations.

Context graphs have made the concept of an ontology wildly popular. Outside of Palantir and knowledge management groups, it hasn't been a commonly used term in new-age AI communities, though "ontologist" as a job title has been around for a while, and regulated industries like pharma and healthcare have dedicated knowledge management teams.

The question really is — what's an ontology, why does it matter now, and how should you think about using it?

What's an Ontology?

Non-ELI5 version: a formal, machine-readable model of the concepts, attributes, relationships, and axioms within a domain, enabling shared understanding and reasoning. Sounds dense. Let's break it down.
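The definition is easier to see than to parse. Here's a minimal Python sketch of its four ingredients, using hypothetical clinical facts; none of these triples come from a real medical standard.

```python
# The four ingredients of the definition, on hypothetical clinical facts.

# Concepts: the things the domain talks about.
concepts = {"Metformin", "Type2Diabetes", "Biguanide", "KidneyFailure"}

# Attributes: properties attached to a concept.
attributes = {"Metformin": {"drug_class": "Biguanide"}}

# Relationships: typed links between concepts.
relationships = [
    ("Metformin", "treats", "Type2Diabetes"),
    ("Metformin", "contraindicated_in", "KidneyFailure"),
]

# Axioms: rules that derive facts nobody wrote down explicitly.
def with_derived(rels):
    out = set(rels)
    for s, p, o in rels:
        if p == "treats":
            # illustrative axiom: a drug that treats X is a therapy for X
            out.add((s, "is_therapy_for", o))
    return out

facts = with_derived(relationships)
```

The axiom is the part people miss: it lets software conclude things that were never stated, which is what "reasoning" means in the definition.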

Anatomy of an Ontology

Six building blocks — each adds a layer of machine-readable intelligence

Interactive Explainer

Built a quick interactive tool to show this instead of just explaining it. Try the Ontology vs Taxonomy Explainer →

Ontologies vs Taxonomies

Taxonomy (hierarchical categorization) is the far better understood concept. Think the category menu on Amazon: how products are classified and organized. An ontology captures the relationships between those concepts. As code.
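Literally as code, the contrast looks like this. Both structures below are illustrative sketches, not drawn from any real standard.

```python
# Taxonomy: pure hierarchy. Each node knows only its parent.
taxonomy = {
    "Metformin": "Biguanide",
    "Biguanide": "Antidiabetic",
    "Antidiabetic": "Drug",
}

def ancestors(term):
    """Walk up the hierarchy: answers 'what kind of thing is this?'"""
    chain = []
    while term in taxonomy:
        term = taxonomy[term]
        chain.append(term)
    return chain

# Ontology: typed relationships. It keeps the hierarchy edge but adds
# what the concept DOES and under what conditions.
ontology = [
    ("Metformin", "is_a", "Biguanide"),
    ("Metformin", "treats", "Type2Diabetes"),
    ("Metformin", "contraindicated_in", "KidneyFailure"),
]

def related(subject, predicate):
    return [o for s, p, o in ontology if s == subject and p == predicate]
```

The taxonomy can only answer "what kind of thing is Metformin?"; the ontology can also answer "what does Metformin treat?" and "when must it not be used?".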

Simply put — human teams route around incomplete information all the time. We know which concept means what in context without it being explicit. We fill gaps with judgment. AI agents don't have that. They take documents and words literally. They don't know that Metformin treats diabetes — they just know those words appear near each other in training data. That's correlation, not meaning.

Same Word, Different Meaning

Fig 1: Healthcare Context

"Discharge" — Patient leaving the hospital after treatment. A discharge summary documents medications, follow-up appointments, and care instructions.

"Exposure" — Contact with a pathogen or harmful substance. Requires isolation protocols and notification chains.

"Pipeline" — Clinical research: the sequence of drug candidates from discovery through FDA approval.

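Disambiguation like this is one of the first things an ontology buys you. A hedged sketch, with illustrative contexts and concept names:

```python
# Sketch: the same surface term resolves to different concepts depending on
# the active context. Contexts and concept names are illustrative.
senses = {
    ("discharge", "clinical_care"):    "PatientDischarge",
    ("discharge", "electrical"):       "ElectricalDischarge",
    ("pipeline", "clinical_research"): "DrugDevelopmentPipeline",
    ("pipeline", "data_engineering"):  "ETLPipeline",
}

def resolve(term, context):
    """Resolve a word to a concept; fail loudly instead of guessing."""
    return senses.get((term.lower(), context), "UNKNOWN_SENSE")
```

The important design choice is the failure mode: an unmapped term returns an explicit sentinel rather than letting the system guess from surface similarity.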

Ontologies give AI the relationship structure that humans carry in their heads. That's why everyone's suddenly talking about them.

Sounds Great. What's the Catch?

Most teams never get to ontologies. The ETL pipelines are built. The pilot has already shipped. Prompt iterations are ongoing to fix hallucinations. The instinct is to bolt the ontology on at the end, layering it on top of your existing data. That's backwards.

Don't LLMs Already Know This Stuff?

Sort of. But not in the way you need. LLMs learn patterns. They know that "Metformin" and "diabetes" frequently appear together. That's statistical correlation. It's not the same as knowing that Metformin treats Type 2 Diabetes, that it's contraindicated in patients with kidney failure, that it belongs to the biguanide drug class.

An LLM might get it right. It might not. Ask the same question differently and you might get a different answer. No guarantee of consistency, no logical structure underneath.

Using ontologies isn't about training LLMs. It's about extending a formal structure to LLMs for reasoning.
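One way to picture "extending a formal structure": the answer comes from a deterministic lookup, not from phrasing-sensitive pattern matching. A sketch with illustrative facts (not a real drug database):

```python
# Illustrative facts only — not a real formulary or drug database.
ontology = {
    ("Metformin", "treats"): {"Type2Diabetes"},
    ("Metformin", "contraindicated_in"): {"KidneyFailure"},
}

def holds(subject, predicate, obj):
    # However the upstream question is phrased, it reduces to one triple
    # check, and that check returns the same answer every time.
    return obj in ontology.get((subject, predicate), set())

# "Does Metformin treat Type 2 Diabetes?" and "Is Type 2 Diabetes treated
# by Metformin?" both reduce to the same lookup:
consistent = holds("Metformin", "treats", "Type2Diabetes")
```

The LLM still does the language work of mapping a question to a triple; the ontology guarantees that once mapped, the answer doesn't drift.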

Correlation vs Reasoning

Fig 2: LLM Sees (Correlation)

"Metformin" and "diabetes" frequently appear together in training data. Statistical co-occurrence. High probability of association.

Ask differently and you might get a different answer. No guarantee of consistency. No logical structure underneath.


Where Should You Start?

Short answer — start with the domain ontology. Not your enterprise data.

Enterprise data is dynamic: it evolves, new knowledge gets added, older information gets sunsetted. But the underlying domain meaning doesn't change. The domain ontology captures relationships that exist in your field regardless of your specific company. Then you introduce your enterprise data on top of that foundation.
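That layering can be sketched directly; the triples below are illustrative. The domain layer is the stable foundation, and the enterprise layer carries company-specific facts that can be sunsetted without touching it.

```python
# Stable layer: true of the field regardless of company. Illustrative facts.
domain = [
    ("Metformin", "treats", "Type2Diabetes"),
    ("Metformin", "is_a", "Biguanide"),
]

# Dynamic layer: company SOPs, formularies, protocols. Changes constantly.
enterprise = [
    ("Metformin", "on_formulary_at", "SiteA"),
]

def knowledge_base():
    # Enterprise facts layer on top of the domain foundation; they extend
    # it, they never replace it.
    return domain + enterprise
```

Sunsetting an enterprise fact means editing one list; the domain foundation, and everything reasoning over it, stays intact.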

Most teams hit this wall and try to fix it with more data. More documents. More context. Longer prompts. Unfortunately, the wrong fix. The foundation needs the ontology. And the ontology needs to come first, not last.

Go Deeper

For a deeper look at how ontologies enable and govern reasoning, read the Neuro-Symbolic AI Practitioner's Taxonomy →

Frequently Asked Questions

LLMs have seen Metformin paired with diabetes millions of times in training. Why do I need a formal ontology when the model has effectively learned the relationship?

Statistical co-occurrence is not a formal relationship. The model knows Metformin and diabetes appear together frequently — that's not the same as knowing Metformin treats Type 2 Diabetes specifically, is contraindicated in patients with eGFR < 30, and belongs to the biguanide class, as versioned auditable facts. Ask the same clinical question two different ways and you may get different answers. An ontology extends formal structure to the model for reasoning — deterministic, versioned, auditable. When compliance requires proof that it got it right every time, "it usually does" is not sufficient.

We have five years of proprietary clinical data. Why not learn the ontology from our own data rather than adopting an industry standard?

Industry standards exist because the underlying domain relationships don't change based on your enterprise's specific data. Your proprietary data is the enterprise layer: company-specific SOPs, protocols, exceptions, and operational decisions that sit on top of the domain foundation. Starting with your data and building up risks encoding your operational quirks as the foundational layer — which breaks when you bring in any external knowledge source.

Building an ontology took us four months and we still shipped late. The ontology-first argument kills timelines. How do you actually operationalize it?

The four-month cost usually comes from trying to model everything before shipping anything. The sequence that works: identify the three to five domain concepts that directly affect your highest-stakes outputs. Map those to an existing industry standard. Your enterprise SOPs and exceptions layer on top. You're not building a complete formal model upfront — you're building a governed foundation for the specific decision types your agent handles first, then extending.
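That scoping step can be sketched mechanically: start from the agent's decision types, collect only the concepts they touch, then map those to a standard. Every decision type, concept, and code below is a hypothetical placeholder, not a real identifier.

```python
# Hypothetical decision types and the concepts each one touches.
concept_index = {
    "dosing_check": ["Metformin", "RenalFunction"],
    "interaction_check": ["Metformin", "ContrastAgent"],
}

# Placeholder mapping to an industry standard; codes are NOT real identifiers.
standard_map = {
    "Metformin": ("SomeDrugStandard", "CODE-001"),
    "RenalFunction": ("SomeLabStandard", "CODE-002"),
}

def scope(decision_types):
    """Only the concepts the highest-stakes decisions actually touch."""
    needed = set()
    for d in decision_types:
        needed.update(concept_index.get(d, []))
    return needed

def unmapped(decision_types):
    """Concepts still missing a standard mapping — the real to-do list."""
    return scope(decision_types) - set(standard_map)
```

The output of `unmapped` is the whole point: the build effort shrinks from "model the domain" to "close this short list".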

Most ontologies go stale within 18 months in fast-moving regulatory environments. Who owns it and how does it stay current?

Large enterprises already have this role — it's just disconnected from AI initiatives. Pharma companies employ ontologists. Banks have taxonomists. Maintenance becomes tractable when there's a formal versioning layer: changes are versioned, downstream queries depending on affected concepts are traced, updates trigger impact analysis before promotion to the live corpus.
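That maintenance loop can be sketched too, with hypothetical names: each proposed change is versioned, and an impact check surfaces the downstream queries depending on the touched concept before anything is promoted.

```python
# Hypothetical index of which saved queries depend on which concept.
query_index = {
    "Metformin": ["dosing_check_v2", "renal_contraindication_check"],
}

changelog = []

def propose_change(version, concept, description):
    """Version the change and return impacted queries for review."""
    impacted = query_index.get(concept, [])
    changelog.append({
        "version": version,
        "concept": concept,
        "description": description,
        "impacted_queries": impacted,
    })
    return impacted  # review these before promoting to the live corpus

hits = propose_change("v1.4.0", "Metformin",
                      "updated contraindication threshold")
```

Nothing here is sophisticated; the tractability comes from the discipline of never mutating the live corpus without a version stamp and an impact list.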

A taxonomy can be mapped in a weekend. How long does a meaningful ontology actually take to build?

A taxonomy tells you what things are. An ontology tells you how they relate, under what conditions, with what rules. The time scales with decision complexity. For narrow, high-stakes use cases, a working ontology covering decision-critical concepts can be operational in weeks if you're building on an existing industry standard. Starting from scratch with no industry standard is where timelines inflate. The right question isn't how long it takes — it's which decision types justify it.

About the Author
Vivek Khandelwal

Chief Business Officer, CoFounder @ CogniSwitch·M.Sc. Chemistry, IIT Bombay

Vivek Khandelwal is the Chief Business Officer at CogniSwitch, where he leads go-to-market strategy, enterprise partnerships, and the company's thought leadership programs. He is the author of Signal, CogniSwitch's weekly newsletter that translates the complex machinery of enterprise AI infrastructure into clear, actionable intelligence for practitioners and executives in regulated industries.