S1 · Ep 2 · Neuro-Symbolic AI

Discover your ontological core

Tony Seale spent two decades stitching together siloed data inside investment banks — Lehman, Deutsche Bank, UBS — until he concluded the hard part was never the model, it was the data. Now widely followed as "the knowledge graph guy," he makes a precise argument: a large language model is rented intelligence you don't own, while the ontology and knowledge graph you build are the durable IP that stays inside your organization. He walks through the neuro-symbolic loop — why LLMs are continuous and knowledge graphs discrete, and why you cycle between them rather than pick one — coins Seale's Law on how marketing hollows out a word's meaning, and warns that companies that don't consolidate their "ontological core" will watch their advantage quietly leak out. Along the way: semantic data products and the DPROD standard, why true open standards (schema.org, W3C) beat vendor "open," and why the agentic web arriving next makes getting your data estate in order the only move that counts.

Tony Seale

Founder, The Knowledge Graph Guys

The speaker

Founder of The Knowledge Graph Guys, a consultancy that builds knowledge graphs for large organizations in the Age of AI. Over more than a decade he has delivered mission-critical knowledge graphs into production for Tier 1 investment banks, and his weekly LinkedIn writing on integrating large language models with knowledge graphs earned him the reputation of "The Knowledge Graph Guy" — what began as a secret side project, working from a computer hidden under his desk, became deep expertise in linking enterprise data.

Episode evaluation

What to do with this episode

Four clear reads — who should act, and how urgently.

01 · For builders · USE

Cycle the loop, don't pick a side

Don't choose between the LLM and the graph. Run the neuro-symbolic loop — let the model be creative in natural language, let the ontology hold the deterministic, verifiable structure — and move between the continuous and discrete representations to get answers that are both grounded and intelligent.

02 · For data & AI leads · TEST

Give every concept a URL

Start with decentralized identity: every concept in your ontology gets a unique, resolvable URL, and you expose data as semantic data products that hyperlink to each other. Look up the DPROD standard as the pattern before you build anything bespoke.

03 · For enterprise buyers · WATCH

Interrogate the word "open"

When a vendor departs from a proven open standard, such as schema.org or the W3C's JSON-LD, in favor of their own "open" ontology format, work out the motivation. Renting the model is fine; letting your ontological core get siphoned into someone else's platform is how your IP leaks out.

04 · Bottom line · SHIP

Concentrate on your data first

The one move: stop scattering prototype projects and point the AI you already have back at your data estate. Consolidate your ontological core — the essence that puts you out of distribution with everyone else — before the agentic web arrives. The rest falls into place.

Show notes

We discuss

01 From two decades of data integration inside investment banks (Lehman, Deutsche Bank, UBS) to "the knowledge graph guy," and the moment it clicked that the bottleneck was the data, not the model.
02 Rented intelligence vs owned IP: why an LLM is electricity you swap in and out, while the ontology and knowledge graph are the durable asset you keep.
03 Ontology without the jargon: a formal, first-order-logic conceptual model of your world that is verifiable and deterministic, the opposite of (and rocket fuel for) an LLM.
04 The neuro-symbolic loop: LLMs are continuous, knowledge graphs are discrete, and the "Apple: company or fruit?" example for why you want both.
05 Seale's Law (the semantic value of a word is inversely proportional to its marketing value) and how "ontology" and "context" are being hollowed out.
06 Why flat files and Markdown dumps aren't memory: good memory lives in a network structure, and "context" means how a fact connects to everything around it.
07 Discover your ontological core, or watch your IP leak out to ontology platforms and agent swarms that piece your strategy together.
08 Semantic data products, the DPROD standard, true open standards (schema.org, W3C) vs vendor "open," and the agentic web that makes getting your data estate in order urgent now.

Reference

Transcript

Vivek: You've spent a lot of your career inside investment banks: Lehman, Deutsche Bank, UBS. How does someone who worked in banks end up as "the knowledge graph guy" on LinkedIn?

Tony: My whole career was around data integration, and banks were early adopters of information technology, which is both a blessing and a curse. You end up with so many siloed applications and databases, and I spent a lot of my time stitching them together to get things done for traders. It was at Deutsche Bank, over a decade ago, that I thought there must be a better way, and started experimenting with knowledge graphs and ontologies. It snowballed from there; I've been obsessed with the idea ever since. Honestly, I wasn't even convinced it would work at first; I was surprised how well it did.

Vivek: When did you realize the problem wasn't the model, that it's the data? We see AI teams behaving as if a more powerful model is what was blocking their agents, as if they've outsourced the problem to the model.

Tony: My hypothesis, from before we had large language models, was that machine learning was going to be a huge force. But the players training these models, Google and others, had vast datasets: hundreds of millions of users doing the same thing, everything written on the internet. Inside an investment bank you don't have that. It's a different paradigm from deterministic programming: data has to be fed in for these things to predict well. So an organization needs lots of data going in. And if your data is heterogeneous (complicated, domain-specific), how do you turn that into "lots of data"? The inescapable conclusion is: you connect all the data you've got together. I went to knowledge graphs to connect information. I didn't go for the ontology part at first, but it becomes apparent that since you won't have as many samples as the foundational players, you have to put more information in. You have to define the semantics, take the knowledge sitting in people's heads, and encode it.

Tony: Each time a more powerful model arrives (Fable this week), it's a big deal; you can suddenly do more. But the realization that slowly dawned on me is that that's rented intelligence. You don't own it. So what does an organization do to preserve its IP? You could train your own model, and some do, but there's a brittleness to that: you're constantly retraining. There's a cleaner solution: connect your data into what some people call a context graph but is really just a knowledge graph. The learning then occurs in symbolic form, in your ontology, something humans can inspect, and you retain that IP inside your organization.

Vivek: You use the word ontology very passionately. A lot of people hear "ontology" and "taxonomy" as English words and can't relate them to something living inside an AI agent. Break it down: what is it?

Tony: First, the argument about whether this is the right thing to do is over. It's won. Databricks has an ontology; so do Snowflake, Amazon, Google, Microsoft. It's gone from a niche thing to consensus among all the major players as the way to convey meaning to AI agents. So, what is an ontology? It's a conceptual model of your world. If you run trains, it's tracks and trains and signal boxes; for a hospital, it's patients and beds; in law, precise legal concepts. It's getting very clear about those concepts and how they relate: not just a database schema, but a rich conceptual model that's linked together. And crucially it's done in a formalism.

Tony: Natural language is incredibly powerful (it's what large language models deal in, and you can say anything), but it's not formal. A computer language or mathematics is a formalism: you execute it and get the same result every time. An ontology uses first-order predicate logic, this 2,000-year-old idea of logic, put into a formalism, so you can deterministically ask questions of your domain and get the same answer back. That's good on its own, but it's dynamite once you pair it with a large language model, which is the exact opposite: creative, natural-language, able to go wherever it wants. That's its superpower. But sometimes you don't want it going wherever it wants: if a patient is about to be issued a drug dose, you don't want creativity; there's a prescriptive path it must not diverge from. The same goes for designing or operating a machine: use creativity to analyze what's wrong, but you need a hard "don't put that cog there," because that would blow the whole thing up. Ontology brings the clear, formal, deterministic structure; the LLM brings the creativity.

Tony: I'm going to write up a jokey thing I call Seale's Law: the semantic value of a word is inversely proportional to the marketing value of that word. Ontology has become popular, so now everyone's an ontologist, everyone's slapped ontology onto their product, and people talk very loosely about it.

Joshua: You've spoken about LLMs being continuous and knowledge graphs being discrete. Can you illustrate that difference, and where vectors and embeddings fit?

Tony: It goes down to base mathematics. A continuous representation is like a line chart: plot house prices as a line and you can zoom in forever, to as many decimal places as you like. Large language models operate in that continuous regime: back-propagation requires it, and the embedding vectors are continuous numbers. That's what those weights with all the decimal places are. A discrete representation is the bar chart: chop the house prices up by postcode, and something fundamental has happened. You've made what we call an ontological commitment. You've decided on categories and bucketed the numbers into them. That's what you do when you build an ontology: you decide what the meaningful concepts are and carve the world up along them, and the instance data you extract becomes discrete nodes.
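A minimal sketch of the line-chart versus bar-chart point, assuming a handful of made-up sales. The postcodes, prices, and variable names are illustrative and not from the episode. The only "ontological commitment" here is the choice of postcode district as the category.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample data: (postcode district, sale price in GBP).
sales = [
    ("SW1", 1_250_000.0),
    ("SW1", 1_100_000.5),
    ("E2", 640_000.0),
    ("E2", 610_000.0),
    ("N1", 815_250.75),
]

# Continuous view: prices are just real numbers on a line.
prices = sorted(price for _, price in sales)

# Discrete view: commit to "postcode district" as the category,
# then place every number into exactly one bucket.
by_postcode = defaultdict(list)
for postcode, price in sales:
    by_postcode[postcode].append(price)

average_by_postcode = {pc: mean(vals) for pc, vals in by_postcode.items()}
```

Choosing a different category (for example, number of bedrooms) would carve the same numbers into different buckets, which is why the commitment matters.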

Tony: Take the term Apple. Are you talking about Apple the company or Apple the fruit? Imagine a vector with just two numbers: one for how fruity something is, one for how techy. Banana is high fruity, low techy; Amazon high techy, low fruity. Apple has to sit in the middle, half techy, half fruity, so the continuous representation blurs the two concepts together. In a knowledge graph you'd have two separate nodes with two separate types. They're both networks: a neural network with probabilistic edges in the continuous realm, and the same network shape in a discrete representation. It's not that one is right and one is wrong; it's like line charts versus bar charts. I want both, and I want to use them together.
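The two-number vector can be written out directly. This is a toy sketch: the coordinates are invented to match the description, and the `example.org` identifiers are hypothetical.

```python
import math

# Toy 2-D embeddings as (fruity, techy). The numbers are illustrative only.
embedding = {
    "banana": (0.9, 0.1),
    "amazon": (0.1, 0.9),
    "apple": (0.5, 0.5),  # a single vector has to cover both senses
}

def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Discrete side: two nodes with separate identifiers and separate types.
# The example.org URLs are hypothetical.
nodes = {
    "https://example.org/id/apple-inc": {"label": "Apple", "type": "Company"},
    "https://example.org/id/apple-fruit": {"label": "Apple", "type": "Fruit"},
}

def nodes_labelled(label):
    """Return (identifier, type) for every node that carries this label."""
    return sorted((url, n["type"]) for url, n in nodes.items() if n["label"] == label)
```

In this toy space "apple" is exactly as similar to "banana" as it is to "amazon", while the graph lookup returns two distinct, typed nodes for the same label.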

Tony: The framing I use is the neuro-symbolic loop. You have your neural, continuous representation (embedding vectors, natural language, LLMs) and your discrete representation (knowledge graph, ontology). Each has certain strengths, and what you want to do is cycle between them, so the answer you give is both grounded in your world model and your data, and intelligent.
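One way to picture a single turn of that loop, as a sketch under heavy assumptions: the triples, the `ex:` prefixes, and the keyword-matching `fake_llm_pick` are all hypothetical stand-ins. A real system would call an actual LLM and a real graph store; nothing here is Tony's implementation.

```python
# Discrete side: a tiny knowledge graph of (subject, predicate, object) triples.
TRIPLES = {
    ("ex:AppleInc", "rdf:type", "ex:Company"),
    ("ex:AppleInc", "ex:label", "Apple"),
    ("ex:AppleFruit", "rdf:type", "ex:Fruit"),
    ("ex:AppleFruit", "ex:label", "Apple"),
}

def candidates(label):
    """Symbolic step: every node carrying this label, paired with its type."""
    ids = sorted(s for s, p, o in TRIPLES if p == "ex:label" and o == label)
    return [(i, next(o for s, p, o in TRIPLES if s == i and p == "rdf:type")) for i in ids]

def fake_llm_pick(question, options):
    """Stand-in for the neural step; keyword matching replaces a real model call."""
    hints = {"ex:Company": ("shares", "iphone", "ceo"), "ex:Fruit": ("eat", "pie", "orchard")}
    q = question.lower()
    for node_id, node_type in options:
        if any(word in q for word in hints.get(node_type, ())):
            return node_id
    return None

def answer(question, label):
    options = candidates(label)                # graph narrows to known entities
    choice = fake_llm_pick(question, options)  # model disambiguates from context
    # Symbolic check: only accept identifiers that exist in the graph.
    return choice if choice in {i for i, _ in options} else None
```

The design choice worth noting is the final check: whatever the neural step proposes, only identifiers already present in the graph are accepted, which is what keeps the answer grounded.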

Vivek: You said "own the ontology, which lets you rent the model." I was talking to a buyer today who loads everything into the agent's memory, but as flat text, MD files thrown in. A lot of people are literally doing that right now. There's a lot said about memory being the moat. Why aren't you advocating for flat files? What changes when you think about memory properly?

Tony: If you just pile in a load of text, what text do you retrieve, and how do you work that out? You can use embedding-vector RAG to pull it, and for some cases that's still the right thing, but it's proven not to be sufficient in most. A good memory system exists in a network structure; that's the structure learning occurs in. The neural network is itself a network; nearly all of nature, from trees to brains to the systems in your body, is a network. And you take the categorical approach: what do you actually care about in that text? A lawyer, a finance person, and a salesperson care about very different concepts. Forgetting matters too: you remember the facts and concepts that matter to you. And those facts don't exist in isolation. Everyone talks about context, and context literally means how this thing relates to the things around it. So you put the fact into a graph, with the connections to everything else you know, and you understand it in the context of everything it's related to.
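To make "context means how this thing relates to the things around it" concrete, here is a small sketch that stores facts as edges and reads a node's context as its neighborhood. The entities, relation names, and the `context` helper are invented for illustration.

```python
from collections import defaultdict

# Hypothetical facts as (subject, relation, object) edges.
edges = [
    ("Invoice-42", "billed_to", "Acme Ltd"),
    ("Acme Ltd", "managed_by", "Dana"),
    ("Invoice-42", "covers", "Contract-7"),
    ("Contract-7", "governed_by", "English law"),
]

# Index each edge in both directions so context can be walked either way.
graph = defaultdict(set)
for s, rel, o in edges:
    graph[s].add((rel, o))
    graph[o].add((f"inverse:{rel}", s))

def context(node, hops=1):
    """Return every node reachable within `hops` edges of `node`."""
    seen, frontier = {node}, {node}
    for _ in range(hops):
        frontier = {n for f in frontier for _rel, n in graph[f]} - seen
        seen |= frontier
    return seen - {node}
```

With one hop, `Invoice-42` is understood through its customer and contract; with two hops the account manager and governing law come into view, none of which a flat text chunk would surface on its own.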

Tony: What is your organization actually learning if you're just throwing chunks of text at your AI? I don't think the penny has dropped for most leaders about how existential this moment is. They need to discover their ontological core (the core concepts that are the essence of the business, what puts them out of distribution with everyone else), formalize it, and connect all their data into it. It's a big job; you can't just drop an AI in and have it done, because a lot of it is tacit knowledge in people's heads, spread across systems. If you don't, your advantage leaks out: someone comes in the back door and siphons your ontology into their platform to sell to your whole sector, or an agent swarm asks questions across parts of your organization that don't realize they're connected and pieces together your strategy. It bleeds out like heat from a body.

Vivek: People used to fear lock-in: am I locked into Salesforce, ServiceNow? You could at least brute-force export the data and migrate, painfully. People are confused about what migration looks like for agents and their context or ontology. Is it as simple as changing the model endpoint from OpenAI to Anthropic?

Tony: Swapping the model is easy; I do it all the time. At The Knowledge Graph Guys we have a powerful interconnected knowledge graph with an ontology, and I swap Codex or Claude Code through it and they don't even know the other exists. Intelligence is like electricity; the persistent artifact is the knowledge graph and the ontology. The bigger question is how you persist the ontology and knowledge graph, and that's where true open standards matter. Be wary: it's easy to slap "open" on something, or for a cohort of vendors to get together. There are real open standards: schema.org is in half the web already, little pieces of JSON-LD on half the world's pages, W3C, 100% open, no vendors, in mass production, proven. When someone pushes a brand-new proprietary ontology standard, compute why they're so motivated to depart from something open and proven.
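For readers who have not seen one of those "little pieces of JSON-LD", a minimal sketch follows. It uses the schema.org `Organization` and `Person` types with the `name` and `founder` properties; the `@id` under `example.org` is a hypothetical identifier, not a real resource.

```python
import json

# A minimal JSON-LD description using schema.org vocabulary.
# The example.org URL is a hypothetical identifier.
org = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": "https://example.org/id/knowledge-graph-guys",
    "name": "The Knowledge Graph Guys",
    "founder": {"@type": "Person", "name": "Tony Seale"},
}

# The persistent artifact is plain text: it round-trips through any JSON
# tooling and does not depend on which model or vendor reads it.
serialized = json.dumps(org, indent=2)
restored = json.loads(serialized)
```

Because the description is ordinary JSON with a shared vocabulary, the same file can be handed to whichever model is plugged in this week.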

Joshua: You've said a knowledge graph is more of an architectural practice than a product. How do you get a CDO or CFO to understand the line item, and how to decide on the systems and services?

Tony: I point people to the idea of a semantic data product. At Deutsche Bank and then UBS we put semantic data products around things. The data-product movement had started: you think about exposing your information for others to use, a shift-left. The problem with data products alone is that without semantics it gets messy and they don't link up. A semantic data product publishes information related to a concept in the ontology, and uses unique URLs to represent each data item, so one can hyperlink to another, just like the web. That simple architectural pattern puts you on the right path. There's a standard a group of us produced called DPROD (it rolled out first at UBS and now elsewhere), an ontology for describing semantic data products. Point your AI agent at it and think about how it maps to your organization.
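The pattern described here (one product per ontology concept, a unique URL per data item, links between products) can be sketched in a few lines. This is an illustration only: the class, field names, and `data.example.org` URLs are hypothetical and deliberately do not use the actual DPROD vocabulary, which should be taken from the published standard.

```python
from dataclasses import dataclass, field

BASE = "https://data.example.org"  # hypothetical base URL

@dataclass
class SemanticDataProduct:
    url: str       # unique identifier for the product itself
    concept: str   # URL of the ontology concept it publishes
    items: dict = field(default_factory=dict)  # item URL -> properties

customers = SemanticDataProduct(
    url=f"{BASE}/products/customers",
    concept=f"{BASE}/ontology/Customer",
    items={f"{BASE}/customer/17": {"name": "Acme Ltd"}},
)
trades = SemanticDataProduct(
    url=f"{BASE}/products/trades",
    concept=f"{BASE}/ontology/Trade",
    # The counterparty is a hyperlink into another data product.
    items={f"{BASE}/trade/9001": {"counterparty": f"{BASE}/customer/17"}},
)

def resolve(url, products):
    """Follow an item URL to the product that publishes it, if any."""
    for product in products:
        if url in product.items:
            return product.url, product.items[url]
    return None
```

Following the `counterparty` link from the trade lands in the customers product, which is the "just like the web" behavior: products stay separate but join up through shared URLs.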

Tony: Here's my next prediction. We had the web: we put our information on it and linked it with hyperlinks. Foundation models compressed all of that into a continuous representation of floating-point numbers, and for the first time we had something that looked like general intelligence. The web was phase one, the compression into LLMs phase two, and we're now in phase three: organizations realizing they need to consolidate their context and learn their own model. But the phase after that is coming faster than people think: the agents go back out onto the web, and we get the agentic web, a distributed data market where agents pull the data products they need. That's why open standards are so important: a proprietary ontology format doesn't prepare you for that market.

Vivek: One last question we ask every guest: one thing you could change about how enterprises implement AI, and it happens tomorrow. What is it?

Tony: Concentrate on your data first. That's the move. Don't run various prototype projects all over the place. Take the AI we've got now, focus it back on your data estate, get your data sorted and organized, and the rest will fall into place. You'll be fine.

Keep listening

S1 · Ep 5 · Casey Hart

An ontology fits on a Post-it note

Casey Hart is a rare thing: an actual ontologist — the job title Palantir once seemed to have a monopoly on. A philosophy PhD who answered a job ad from Cycorp and spent a decade building knowledge for machines under Doug Lenat and then at Olive, Amazon, Gro Intelligence, and Ford, he spends this episode deflating the word everyone is suddenly selling. An ontology, he argues, is just a summary of what your business cares about and how those things relate — you can start one on a Post-it note. He separates the machine-learning "system one" from the deterministic "system two" that ontologies supply, makes the case for a hybrid, and walks through building one from the ground up: taxonomies, relationships, turtle files and triple stores — or just the metadata, so you get value before migrating a single row. Along the way: why "hallucination" flatters a text generator doing exactly what it was built to do, the open-world versus closed-world assumption, and why vibe-coding an ontology out of an LLM is a fine way in but not a finished asset.

Ontology Engineering

S1 · Ep 5 · Karthik Soman

A different kind of answer, not a better number

Karthik Soman invented KG-RAG — knowledge-graph-based retrieval augmented generation — while building biomedical knowledge graphs at UCSF, and now leads enterprise-scale agentic AI at SAP America. One question carries from a PhD in computational neuroscience to the enterprise: how do you build intelligent systems that actually work in the real world? His answer isn't a better accuracy number but a different kind of answer — one a human can trace, question, and learn from. He walks through the case that convinced him: enriching patient records with a 40-million-node biomedical graph surfaced an olfactory-receptor gene that flagged Parkinson's five years early, catching a prodromal case a clinician had missed — not because the model was more accurate, but because it pointed at a mechanism. Then he moves to the enterprise, where the curated ontologies of biomedicine don't exist. You lean on the topology already inside your documents. Graph-based reasoning turns out to be a sixty-year-old idea that LLMs merely made usable on the fly. And a graph earns its keep over vector RAG in specific places — multi-hop questions, smaller models, private data the model never saw — before the least glamorous advice in AI: data hygiene first, then AI hygiene.

Knowledge Graphs

S1 · Ep 4 · Melli Annamalai

A graph is one tool, not the destination

Melli Annamalai has spent 27 years at Oracle watching technology waves crest and break — multimedia retrieval, the semantic web, big data, property-graph analytics, and now knowledge graphs for AI. As the Distinguished Product Manager who leads graph technologies there, she makes an argument you rarely hear from a database vendor: a knowledge graph is one tool in the kit, not the destination. She traces why semantic-web tech stalled for two decades: a steep learning curve, a custom RDF/OWL/SPARQL ecosystem, and a year-and-a-half payback that senior management wouldn't fund. Then what AI finally changed, and where these projects still quietly fail — over-engineering everything into a graph, tuning and tooling gaps, and the security officer who shuts it all down. Along the way, a working definition of "ontology" for non-technical buyers, natural language as the new query language, and Oracle's converged-database bet to collapse the graph, vector, and agent layers into the place the data already lives.

Knowledge Graphs

Go deeper

Further reading

Neuro-Symbolic AI

Intelligence Migration is NOT Possible

Intelligence Infrastructure
