
re:Invent 2025: Leveraging Symbolic AI and Knowledge Graphs in the Age of LLMs


Introduction

By transcribing various overseas lectures into Japanese articles, we aim to make valuable information that would otherwise remain hidden more accessible. The presentation covered in this installment of the project is this one!

Information on the re:Invent 2025 transcript articles is compiled in this spreadsheet, so please check that as well.

re:Invent 2025: AWS re:Invent 2025 - Symbolic AI in the age of LLMs (DAT443)

In this video, Ora explains the history and modern applications of symbolic AI. AI has a history of over 80 years, repeating cycles of excitement and disappointment. Knowledge graphs and ontologies are introduced as core technologies of symbolic AI, with practical examples in standardized ontology languages such as RDF, OWL, and SHACL. Using the dog Wally as a running example, the basic principles of set theory and logical inference are explained, and the differences between the open-world and closed-world assumptions are touched upon. Furthermore, to address issues LLMs face, such as hallucinations and energy inefficiency, a hybrid approach combining knowledge graphs and symbolic reasoning is proposed. The talk also covers an idea distinct from Graph RAG, namely using LLMs to improve the usability of knowledge graphs, as well as the potential of neuro-symbolic AI, and introduces an implementation example using Amazon Neptune.

https://www.youtube.com/watch?v=Atf4DVKGuMg

  • Please note that this article is automatically generated. While it strives to preserve the content of the original lecture as faithfully as possible, it may contain typographical errors or incorrect information.

Main Content


Re-evaluating Symbolic AI: Over 80 Years of AI History and the Third AI Summer

Hi, I'm Ora. I'm going to talk to you about symbolic AI, which may be a strange topic now that everyone wants to do generative AI, but I promise you if you're not already familiar with it, this is something everyone needs to understand and learn about. So, a little bit about me. These are scenes from my misspent youth. I have a PhD in AI and computer science. I was involved in creating the original vision for what became known as the Semantic Web. I'm a co-author of the original RDF specification. If you don't know what RDF is, I'll tell you about it later. I'm also the author of this book. And as a fun fact, my software at one point flew to the asteroid belt. That was symbolic AI, and we'll touch on these technologies in a moment.


This is my plan for today. I'm going to talk about a very brief history of AI, the part of AI that, contrary to what you might think, is more than three years old. I'll talk about what the elements of symbolic AI are, and then what symbolic AI is today, why it's relevant, and what useful technologies you can apply today. And then finally, I'll close with a little discussion about what this all means in terms of non-symbolic AI, generative AI, all the so-called new stuff.


Now, when I first wrote this talk, I had a much bigger section on history, but the powers that be didn't want that. So, I'm just going to give you a very brief introduction. This is essentially the history of AI. The important thing to understand is that it spans over 80 years, and it's been something of a pattern of excitement and disappointment. We're currently in the middle of our third AI summer. We call these summers and winters. And I was perhaps a little skeptical about the new, new, new AI, because I don't want to see a third AI winter. I went through the second one, and I continued to work in AI, but people were overtly hostile and would ask me why I was doing that. They'd say AI doesn't work. But here we are.


Constituent Technologies of Symbolic AI: Rule-Based Systems as Mainstream Technology and New Programming Paradigms Evolved from AI

Symbolic AI, or what was originally just called AI, is a collection of all sorts of technologies, and this is my incomplete list. But I think it's a representative list in the sense that these are the key core technologies in symbolic AI. And it's important to understand, of course, that the history of symbolic AI also includes successful deployments of systems that used these technologies. So not everything was just research. I'm not going to talk about all of these. I just don't have time. I'd be here all day. But if there's anything you're particularly interested in, please come up and talk to me afterward. I'm happy to talk about any of these at length.

Another really important thing to understand about the history of AI is that there's a pattern where a technology comes along, and you think it's AI, and then you learn what it's actually about. And once you understand it, you think, well, that can't be AI, because I understand how it works. And a lot of these constituent technologies in the past have stopped being AI. They just became mainstream. So rule-based systems, nobody thinks that's AI anymore. Heuristic search, no, certainly nobody thinks that's AI, but when they were invented, they were very much AI.

And this bottom bit: two new programming languages emerged from the long history of AI research. And you might think, well, programming languages always emerge. But these weren't just languages. These were two completely new programming paradigms: functional programming and logic programming, and they emerged from AI research. And now, we don't consider those AI anymore.


Foundations of Symbolic Reasoning: Mathematical Logic and Set Theory as Reasoning Rules, Using Wally as an Example

Throughout this talk, I'm going to have to talk about things like symbolic reasoning and logical inference. So, I'm going to give you a very brief introduction to that. So this is Wally. Wally is a good boy. In symbolic reasoning, we use mathematical logic and set theory as a sort of representation language. We can represent all dogs as a set. We'll call it dogs. And we can say that Wally is a member of this set. That's one way of saying that Wally is a dog.


So let me show you a simple rule of inference, or how we do symbolic reasoning. We start by saying, let's say X belongs to some set. Let's call that set C. And we also say that C is a subset of set D. If these things hold, then we can conclude that X is in fact also a member of this bigger set, set D. And this is a rule of inference. So we can say that dogs are a subset of mammals, and if we know that Wally belongs to dogs, then we can conclude that Wally also belongs to the set of mammals. Or in other words, Wally is a mammal.
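
Written out in the set notation just described (a minimal sketch; the slide's formula isn't reproduced in this article), the rule and its instantiation are:

    X ∈ C,  C ⊆ D   ⟹   X ∈ D
    Wally ∈ Dogs,  Dogs ⊆ Mammals   ⟹   Wally ∈ Mammals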


Now, this sounds incredibly trivial, of course, right? But that's what symbolic reasoning is. And real-world applications of symbolic reasoning are far more complex than this, but the principles are the same. So let me give you a visual representation. We have the set of dogs, and Wally is in there. And we have the bigger set of mammals. So now you all understand what symbolic reasoning is. Nobody? Excellent. Good. Let's move on.


Knowledge Graphs and Ontologies: Modern Realization of the Semantic Web and Applications to Data Integration

So, let's talk about knowledge graphs and ontologies. Because these are really the most prominent manifestations of symbolic AI today. And in some ways, they are because there was a sort of resurgence of symbolic AI technologies about 25 years ago, and that led to this. So first, let's cover some terminology. When we talk about graphs, we're referring to graphs in the mathematical sense. So that's a structure of nodes and edges. Typically, if you want to draw this on a whiteboard, it's circles and arrows between them. But we don't mean graphs in the graphical sense.


And a knowledge graph is one application of a graph, where essentially the nodes represent or denote real-world entities. And the edges denote various types of relationships between these entities. So you can use a knowledge graph to link and organize information. And there was this thing that came out 25 years ago called the Semantic Web, and knowledge graphs today are really the modern realization of that idea. The primary purpose of knowledge graphs is really to capture knowledge. And often, that's done within a collaborative organizational context. But there are other uses, very good uses, for the same technology.

One thing I've seen over and over again, and built systems myself, is making knowledge accessible to people, and now more recently, accessible to LLMs. There's a low-level feature of knowledge graphs that comes along almost like a bonus benefit. And that is that knowledge graphs are exceedingly good at data integration.

Any data that you have, we call it legacy data, you can map that into a knowledge graph. And if you do that correctly, the knowledge graph then represents an integration of these different data sources. And finally, there's this thing called digital twins, where people build complex representations of physical systems in digital form. So you can imagine a digital twin of an electrical system, for instance. And you can use that to ask questions like, what happens if I flip this switch, or if this circuit breaker blows, who is affected.

To build a knowledge graph, you need to have a model of what the information within that knowledge graph is. And this model is called an ontology. And I'll come back to what that means later. Now, once you finally build this knowledge graph, you have to store it somewhere. You have to be able to manipulate it. And for that, you need a graph database. For a purely random example, Amazon Neptune is such a graph database. And a very scalable one at that.


Definition of Ontology and Semantics: Concepts, Properties, Constraints, and How Meaning is Captured

So let's talk a little bit about what an ontology is. There's a generic definition of what an ontology is. And if you look it up, you always get the same definition, and it's completely unhelpful. So I'm not going to repeat it here. This is my definition of what an ontology is. An ontology consists, first of all, of concepts that you identify. You can think of them as classes. So following our example, dog would be one of those concepts.

Then you define properties on those concepts. So for dogs, that would be things like fur color. You also define relationships between instances of those concepts. So, for instance, a dog typically has an owner, and that owner is typically a human. You also specify logical constraints that must be met. And an example here is that the intersection of dog and cat, or dogs and cats, is empty. This means that if you think of these as sets, these sets have no common parts. So they're disjoint.

And then finally, although this is not true of all ontologies, ontologies can also include actual individuals, as we call them. So Wally, for instance, is an individual. You can think of Wally as an instance of the class dog. To define an ontology, you need an ontology language. And what we didn't have back in the nineties, and what we do have now, are standardized ontology languages. And I'll talk a little bit more about what those standardized ontology languages are in a moment. But having standards has really made the use of these sorts of things all that much easier. Because having standards means that people can build tools and so on.


Now, I mentioned that ontologies capture the semantics of a domain. So what is semantics actually? I mean, has anybody heard anybody use the word semantics? Yeah, I think people use it a lot. Most people don't know what it means. And I find that a little bit ironic, because semantics is really about meaning. And I think of semantics in a somewhat informal way, and it's really a definition of how software can interpret your data.

And given that, semantics can come from the relationship between your data and the definitions that you have in your ontology. The ontology is a means of capturing and communicating semantics. But it can also come from the relationship between two pieces of data. That is, they are somehow related, and that provides meaning. For instance, for humans, if we say that this person is the child of that person, we might be able to conclude that that other person is the parent.
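
As a sketch of how such data-to-data semantics can be declared rather than hardwired (this snippet is not from the talk; the names are illustrative), OWL lets you state that two properties are inverses of each other:

    @prefix : <http://example.org/family#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .

    # Declare the two relationships as inverses of each other
    :childOf owl:inverseOf :parentOf .

    # Asserting one direction...
    :Alice :childOf :Bob .
    # ...lets a reasoner entail the other:  :Bob :parentOf :Alice .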

And for those things that you can't represent this way, you just have to, of course, hardwire the semantics in code. And historically, of course, all semantics was hardwired.

The big goal is that we can move towards a situation where things become more definition and data-driven, and less buried in a black box that you can't easily inspect.

Challenges of Formal Semantics: Limitations of Machine Understanding as Seen Through JSON and the Finnish Language Example

What's interesting about semantics is that people usually have a very hard time distinguishing between formal semantics and, say, their own interpretation of some data. And this is important because it leads to unrealistic expectations about what software systems can actually do. And as a result, I often hear people say things like, JSON is understandable. How many people in this room think JSON is understandable? A few of you, okay. Not all of you, which is good.


So let me give you an example of what I'm talking about. So this is a little JSON snippet. I'll give you a moment to look at it. And as you read this, you're going to think, oh, okay, I know what that is. I know the meaning of this data. And that's a very human response.
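
The snippet itself isn't reproduced in this article. Judging from the symbols mentioned a little later (degree, hobbies), it was something along these lines, with illustrative values:

    {
      "name": "Jane Smith",
      "degree": "PhD",
      "hobbies": ["sailing", "photography"]
    }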


Let me show you another JSON snippet that on the surface has exactly the same meaning, and tell me if you still understand it. Perhaps somebody here actually speaks Finnish and can therefore interpret this in the same way as the previous one. But I think of Finnish as a kind of weak cipher among languages: the key is known, but very few people have it.
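
A stand-in for that slide (also not reproduced here), translating the keys and some values of the snippet above into Finnish (nimi = name, tutkinto = degree, harrastukset = hobbies):

    {
      "nimi": "Jane Smith",
      "tutkinto": "tohtori",
      "harrastukset": ["purjehdus", "valokuvaus"]
    }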

Anyway, this is the exact same thing. I've just translated the keys and some of the values into Finnish. In the English version, you basically performed a grounding of these symbols, the symbols like degree and hobbies. You interpreted what they were based on your experience and your understanding of the English language, and you now think that that's the meaning.


And of course, you could say that this data means something completely different, and it has nothing to do with those English words that are in there. But what you're seeing here is how all machines always see data. So no, I don't think JSON is understandable.

Ontology Selection and Evaluation: Criteria for "Build or Buy" and the Problem of Ontological Commitment

Okay, so this is still background. Now let me talk a little bit about what it means to work with ontologies. So let's say you've decided you need a knowledge graph, and therefore you need to have an ontology. Your first question is really build or buy. And by buy, I don't literally mean buy. Because ontologies tend not to be sold, they tend to be shared, they tend to be published. But it's kind of the same thing.


So your options are, first of all, you can go out and find some ontology and say, okay, this is perfect for my use case, I'm just going to use it as is. You have to evaluate the fitness very carefully. And I'll talk a little bit about what that evaluation actually is. And this will help you as you start choosing your ontologies.


You can also take an existing ontology and say, yeah, this is almost what I need, but it's not quite, and I'm going to extend it. The ontology languages are designed in such a way that you can take an existing ontology and extend it for your own purposes. And there are also things called upper ontologies, which can be a good starting point for this sort of work.


And it's important to understand that every ontology, or the designer of that ontology, has made some decisions about how the world is modeled. And you need to understand what those decisions are, or else bad things can happen. And then your third option is, you can just start from scratch. And you say, okay, I don't need anything that anybody else has done, I'm just going to define my own. That's perfectly fine.

Just be prepared for the work. Because I often hear these days, oh, I'll just have the LLM write an ontology for me. LLMs will write initial drafts of ontologies, but then you really need a human eye on it.


Okay, so let's say you found an ontology, and you're thinking maybe I'll extend this. But first, I have to evaluate its fitness. So what is that? Well, first of all, think about your use cases. It doesn't necessarily have to be a single use case. Pick a few use cases, and be prepared for more use cases to be found in the future. Knowledge graphs have an interesting property where you build something and you think, okay, I've covered my use case, and then later on you realize, oh my God, this knowledge graph can answer questions that I didn't even anticipate. And so there may be many more use cases and uses for that knowledge graph.

When we design ontologies, we usually define things called competency questions. These are really, if you do this right, the questions that you expect the resulting knowledge graph to be able to answer. And in a sense, competency questions should be translatable into graph queries at some point. And then coverage is, you have something you want to model, and you look around your company, your organization, and you say, okay, we deal with these sorts of concepts, and then you make sure that those concepts exist in the ontology that you're going to use. And if they don't, you extend the ontology to cover all of them.


Then there's the problem that I referred to earlier. People have already made decisions about how they model the world. I refer to this as ontological commitment. And this is where what I call the law of unintended consequences comes into play. That is, in other words, it seemed like a good idea at the time. You adopt this ontology, and then later on you find, oh my God, I can't do certain things because of decisions that were made way back when. This happens all the time.

And then finally, there's the problem of expressivity. Expressivity is really a measure of what you can say using the ontology. And for that, you typically need something called a reasoner. And there's always this trade-off. It's a trade-off between expressivity and the computational expense of using a particular ontology that requires that particular level of expressivity. Generally speaking, as a rough generalization, the more expressive it is, the more computationally difficult it is as well.


Practical Application of Ontologies: From Automating Data Processing to Concrete Implementations with RDF and OWL

So what can you use these things for? A very typical use for an ontology is that you can operationalize or automate the processing of data. So you essentially have data, and the meaning, the semantics, of that data comes along with that data. And the ideal end goal is that you can have software that can process that data, that doesn't actually have to be coded, but can somehow interpret that data. You just feed it the ontology, and it interprets the ontology and can do meaningful things with that data. And this is actually something that can be done.

Ontologies can also be used to make implicit information explicit. So that whole example with Wally is a dog, and now we realize, oh, Wally is also a mammal. We didn't say Wally is a mammal. We made explicit something that was implicit in the data. Ontologies also serve as documentation. And having said all these things, there's a lot of sharing here, and at least for me, there's an implication that I really want to move away from closed, proprietary ontologies. While they can certainly solve specific problems, you won't get the full benefit of what ontologies can do.


Now, some examples of shared public ontologies that you can use. First of all, there are these upper ontologies that you can use as a starting point. Basic Formal Ontology is very, very abstract, and I always feel like you really need a graduate degree in philosophy to use it. GIST is much more pragmatic. Both are popular.


Then there are domain-specific ontologies. FIBO is about the financial industry and various concepts in the financial industry. SNOMED is about medicine, and CIDOC is for people who want to deal with cultural heritage, museums, archives, that sort of thing.


Now, in addition to these two categories, there are these narrow ontologies for single purposes that you can incorporate into your own definitions. These are four of my favorites. Dublin Core has been around for a long time. PROV-O is an excellent model for capturing provenance, and I'll show you an example of what SKOS does later on when we build taxonomies and thesauri.


So let's build a simple ontology and see how it works. This uses RDF and OWL as the ontology languages, and it uses a particular syntax of RDF called Turtle. RDF has multiple syntaxes, depending on what you prefer. First, we define a class, dog. And we say that it's first of all a subclass of mammal, and it's also a subclass of pet. And we say that it's disjoint with cat. This is in line with what I talked about earlier. Then we define some classes: cat, human, mammal, pet, very straightforward stuff.
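
The slide isn't reproduced in this article, but in Turtle the class definitions just described would look roughly like this (the prefix and exact identifiers are assumptions):

    @prefix : <http://example.org/pets#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .

    :Dog a owl:Class ;
        rdfs:subClassOf :Mammal, :Pet ;
        owl:disjointWith :Cat .

    :Cat a owl:Class .
    :Human a owl:Class .
    :Mammal a owl:Class .
    :Pet a owl:Class .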


And then we define some properties. We define a fur color property, and we say that the fur color property applies to dogs, and its value is some sort of string. We don't specify what's in that string. And owner is a relationship that links a pet and a human. So this is our simple ontology, and in this representation, we refer to this as the T-box or the terminology box. These are the definitions of the data. And then there's typically something else called the A-box or the assertions box. And that's actually the instantiation of those definitions.
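
A sketch of those property definitions in the same assumed vocabulary (the T-box part):

    @prefix : <http://example.org/pets#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    # Fur color applies to dogs; its value is an unconstrained string
    :furColor a owl:DatatypeProperty ;
        rdfs:domain :Dog ;
        rdfs:range xsd:string .

    # Owner links a pet to a human
    :owner a owl:ObjectProperty ;
        rdfs:domain :Pet ;
        rdfs:range :Human .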


So Wally is a dog, has white and black fur, and has an owner. Coco is another dog. Dora is a human. That sort of thing. These are individuals, and they don't necessarily have to be part of the ontology, but they go into the knowledge graph. And because we're using RDF, there's a graphical representation of this. So here it is. Wally is an instance of class dog. Dora is an instance of class human, and Wally has owner Dora. Simple graph structure, but RDF has this kind of self-similar structure.
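
The corresponding A-box, again as a sketch with assumed identifiers:

    @prefix : <http://example.org/pets#> .

    :Wally a :Dog ;
        :furColor "white and black" ;
        :owner :Dora .

    :Coco a :Dog .
    :Dora a :Human .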


So you can actually include the definitions themselves in the graph as well. In this case, dog and human are both instances of a class called class. So this would be kind of a metaclass. And this is an example of querying the data that we just looked at using Neptune. I've extended Jupyter notebooks so that you can query directly from the cell and get results. And those results can either be in tabular form or visualized as a graph. Everything I'm talking about is perfectly real, and when you want to start building knowledge graphs, we can help you with that.
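
The notebook cell itself isn't shown in this article; using the Neptune workbench's %%sparql cell magic, a query over this data might look like the following sketch:

    %%sparql
    PREFIX : <http://example.org/pets#>

    # List each dog and its owner; for the data above this
    # returns :Wally / :Dora
    SELECT ?dog ?owner
    WHERE {
      ?dog a :Dog ;
           :owner ?owner .
    }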

Comparing Ontology Languages: Features of RDF, OWL, SHACL, and Open/Closed World Assumptions

Okay, I mentioned that there are different ontology languages. So first of all, RDF is a very simple ontology language. I'll give you a few examples. We define pet, we define the owner relationship. And then we can use this to infer new data. So we say that owner is a relationship that links a pet and a human. Then for Wally, who is a dog (and we know that dogs are pets), we say he has an owner, Dora, but we don't say anything else about Dora. From that, we can infer that Dora must be a human.
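
A sketch of that inference in RDF Schema terms (identifiers assumed as before):

    @prefix : <http://example.org/pets#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # In the ontology (T-box):
    :owner rdfs:range :Human .

    # In the data (A-box):
    :Wally :owner :Dora .

    # An RDFS reasoner entails:
    :Dora a :Human .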


Now, OWL is a more expressive ontology language. And we can use the subclass mechanism to express constraints.

So here we're saying that a pet must have exactly one owner, and that owner must be a human. And when you process the graph with a reasoner, if there are constraints that are not met, two things can happen. The reasoner will either report that there's an inconsistency, or it will add new information to the graph such that those constraints are met. Similarly, if we introduce an owner but don't say that it's a human, the reasoner can actually infer that that owner is a human.
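
In OWL, such a constraint can be expressed via the subclass mechanism with restrictions, roughly like this sketch (identifiers assumed as before):

    @prefix : <http://example.org/pets#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    # Every pet has exactly one owner, and every owner is a human
    :Pet rdfs:subClassOf
        [ a owl:Restriction ;
          owl:onProperty :owner ;
          owl:cardinality "1"^^xsd:nonNegativeInteger ] ,
        [ a owl:Restriction ;
          owl:onProperty :owner ;
          owl:allValuesFrom :Human ] .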


SHACL is an RDF-based language that was originally designed to validate RDF data. But you can also use it to express constraints. And these constraints in SHACL are referred to as shapes. But unlike RDF and OWL, SHACL never infers new data. So if there's a constraint that's not met, the SHACL engine will simply tell you that the data didn't validate.
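
A sketch of what the equivalent constraint might look like as a SHACL shape (not the slide's exact code):

    @prefix : <http://example.org/pets#> .
    @prefix sh: <http://www.w3.org/ns/shacl#> .

    :PetShape a sh:NodeShape ;
        sh:targetClass :Pet ;
        sh:property [
            sh:path :owner ;
            sh:class :Human ;
            sh:minCount 1 ;
            sh:maxCount 1
        ] .

    # A SHACL engine reports a violation if a pet has no single
    # human owner; it never adds data to fix it.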


So why is that? Why do RDF and OWL behave differently from SHACL? For that, we need to talk about something called world assumptions. First of all, the closed-world assumption assumes that everything in your database is all there is, and nothing else. So all of the inferences you make can rely on the fact that you have complete knowledge. And that means you can use something called negation as failure: if you can't prove something to be true, you must conclude that it's false. Traditional databases tend to use the closed-world assumption.


The open-world assumption, on the other hand, is the idea that we don't know everything. There might be more information out there, and we can't assume that we have everything. And that means you can't use negation as failure as a rule of inference. We adopted this for the Semantic Web because we assumed that on the web there might always be more out there: maybe we just haven't reached a particular server, or haven't collected a particular piece of data. It made perfect sense.


So, let me just briefly talk about these ontological commitments, these modeling commitments. We defined that a pet can only have one owner. But what if we have a case like co-ownership of a pet? How do you deal with that? This happens all the time. You pick an ontology, you do your modeling, and then later on you realize, oh, I can't do this. Possible solutions are, if you're in control of those upper definitions, you can refactor the code a little bit, the ontology code I mean. And make it so that you can have multiple owners, or you can introduce a new relationship. Co-owner, or that sort of thing. There are ways to fix this.


Modeling Dog Breeds: Class Hierarchies vs. SKOS Concept Schemes for Different Representations

And then when you're defining an ontology and you have different sorts of concepts, sometimes you have concepts that are variations of each other. You can capture this using class hierarchies. The alternative is to capture these variations using something called a concept scheme. And I'll explain this. So for instance, let's say we want to represent dogs, including their breeds.


So here are some definitions. Dog is a class, Toy Dog is a subclass of Dog, Shih Tzu is a subclass of Toy Dog. These are, what is it, American Kennel Club classifications. So we introduce all these definitions, and now we say Wally is a Shih Tzu and a Poodle. Yes, these are what are now referred to as designer breeds. I would call them mutts, but anyway, so here we're doing it using a class hierarchy, and it looks something like this.
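
A sketch of that class hierarchy in Turtle (the placement of Poodle in the hierarchy is an assumption here):

    @prefix : <http://example.org/pets#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    :ToyDog rdfs:subClassOf :Dog .
    :ShihTzu rdfs:subClassOf :ToyDog .
    :Poodle rdfs:subClassOf :Dog .

    # An instance can be an instance of multiple classes:
    :Wally a :ShihTzu, :Poodle .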


And the thing to note there is that you can have an instance that is an instance of multiple classes. So in RDF, multiple inheritance is not just for classes; instances can have multiple types as well.


If you don't like that, you can introduce a designer breed. This one is actually called a Shih-Poo, and it's a real thing. You can say that a Shih-Poo is a subclass of both Shih Tzu and Poodle, and that Wally is an instance of this class. So that works out fine.
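
The designer-breed alternative in sketch form:

    @prefix : <http://example.org/pets#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    :ShihPoo rdfs:subClassOf :ShihTzu, :Poodle .
    :Wally a :ShihPoo .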


Now, the other way is the idea that all dogs are actually pretty similar. We don't need multiple classes. We don't want to clutter up our ontology with tons and tons of classes. So let's just give dogs a property called breed, and let's decide what values that property can take. And for that, we need this SKOS vocabulary that I referred to earlier. And then you build something called a concept scheme. And then all these things that we had defined earlier as classes, you now define as SKOS concepts. And then you can just say Wally is a dog, and Wally's breed is Shih Tzu and Poodle. So we're basically expressing the same thing. It's just now we only have one class.
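
A sketch of the concept-scheme approach with SKOS (identifiers are again assumptions):

    @prefix : <http://example.org/pets#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix skos: <http://www.w3.org/2004/02/skos/core#> .

    :DogBreeds a skos:ConceptScheme .

    :shihTzu a skos:Concept ;
        skos:prefLabel "Shih Tzu"@en ;
        skos:inScheme :DogBreeds .

    :poodle a skos:Concept ;
        skos:prefLabel "Poodle"@en ;
        skos:inScheme :DogBreeds .

    # One class, one property, values drawn from the scheme
    :breed a owl:ObjectProperty ;
        rdfs:domain :Dog ;
        rdfs:range skos:Concept .

    :Wally a :Dog ;
        :breed :shihTzu, :poodle .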


And again here, if you want to only have one value for breed, you can introduce the designer breed as another concept, and you can do it that way. So that's how that works. That's how you define ontologies and deal with classes and definitions. So now let me just briefly talk about a few ways that you can actually use inference to work. And you don't need a math PhD for this. You can all do this. Although some of you may have a math PhD, and that's okay, but it's not a prerequisite here.


Practical Inference and Materialization: From Discovering Implicit Information to Query-Time Inference and Truth Maintenance

There are many use cases for inference. One of them is to make implicit information explicit, which we've already talked about. If I say Wally is a Shih-Poo, I can conclude Wally is a dog. I can also conclude Wally is a mammal. These are useful things to know. You can also discover inconsistencies. If I said that Wally is a cat and a dog, the reasoner will tell me that there's a fundamental inconsistency that can't be fixed by adding data to the graph.


Inference can also be used to make it easier to author queries. You can, for instance, operate using only the base classes, and your ontology can evolve. Inference essentially isolates your queries from changes in your ontology: even if I add new dog breeds, this query will continue to work. RDF basically provides only taxonomic inference. But in addition to class hierarchies, properties can also have hierarchies, and this is a very useful feature when you're extending somebody else's ontology. You can introduce your own properties, but say that they are actually extensions of properties that came from some other ontology.
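
A sketch of a property hierarchy (the sub-property here is illustrative, not from the talk):

    @prefix : <http://example.org/pets#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # Your own property, declared as an extension of an existing one
    :adoptedBy rdfs:subPropertyOf :owner .

    :Coco :adoptedBy :Dora .
    # A reasoner entails the base property as well,
    # :Coco :owner :Dora , so queries written against :owner keep working.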


So here I've defined an OWL class, black poodle. And I've defined it in such a way that the breed must have a value poodle, and the fur color must have a value black. Now, I told you earlier that you use subclassing to introduce constraints, but here I've used something called an equivalent class. And the difference is this. When you use a subclass, you introduce necessary conditions. All black poodles must meet these constraints. But when you use an equivalent class, they are necessary and sufficient conditions. Which means that if the reasoner finds something whose breed is poodle and whose fur color is black, it can actually conclude that this is an instance of the black poodle class. Which means that the reasoner can be used to classify data.
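
A sketch of that definition (exact identifiers assumed):

    @prefix : <http://example.org/pets#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .

    :BlackPoodle owl:equivalentClass [
        a owl:Class ;
        owl:intersectionOf (
            [ a owl:Restriction ;
              owl:onProperty :breed ;
              owl:hasValue :poodle ]
            [ a owl:Restriction ;
              owl:onProperty :furColor ;
              owl:hasValue "black" ]
        )
    ] .

    # Necessary and sufficient: anything with breed poodle and black fur
    # is classified by the reasoner as a :BlackPoodle.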


Now, the natural question here is, how and when do you do this? Some triplestores and graph databases for RDF have built-in inference engines. If yours doesn't, there are still options. You can use SPARQL, the query language, as an inference engine, if you will.


So here's an example of figuring out what the class of an instance is. If you don't use inference, you can rewrite your query to basically climb up the ontology hierarchy and do what a real inference engine would do. And this is done at query time. If you want to do this before query time, you can pre-materialize the results of these inferences. By the way, the results of inferences are called entailments. You'll see that word pop up from time to time. So when I say entailment, think of it as the result of an inference.
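
A sketch of that query-time approach, using a SPARQL 1.1 property path to climb the class hierarchy:

    PREFIX : <http://example.org/pets#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    # Returns the asserted class of :Wally and all of its
    # superclasses, without needing an inference engine
    SELECT ?class
    WHERE {
      :Wally a/rdfs:subClassOf* ?class .
    }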


So this is how you would materialize the entailments in the graph. And you have a choice between these two: you can do it at query time, or you can do it beforehand. If your data changes frequently, it might be more convenient to do it at query time. But if your data is mostly static, then if you do it beforehand, you don't have to do anything at query time.
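
A sketch of materializing those entailments ahead of query time with a SPARQL UPDATE:

    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    # Write the inferred class memberships back into the graph
    INSERT {
      ?x a ?super .
    }
    WHERE {
      ?x a ?class .
      ?class rdfs:subClassOf+ ?super .
    }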


When you do materialization, you run into this problem that we refer to as truth maintenance. And I used to think truth maintenance was something that authoritarian governments did, but truth maintenance is really the problem of, okay, I've materialized some entailments, and now I've changed my data. Which entailments do I have to undo because they're no longer supported by the original data? And this can be a very expensive operation.


If you're using SPARQL, which is a query language, as an inference engine, you can do what I think of as poor man's truth maintenance. You can materialize them, put them into a different named graph. RDF has this feature called named graphs. And then when there's a change, you just drop that graph and recompute it. So, that's that.
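
As a sketch, assuming a named graph set aside for the entailments:

    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    # Materialize entailments into a dedicated named graph
    INSERT {
      GRAPH <http://example.org/entailments> { ?x a ?super . }
    }
    WHERE {
      ?x a ?class .
      ?class rdfs:subClassOf+ ?super .
    } ;

    # When the base data changes, drop the graph and recompute
    DROP GRAPH <http://example.org/entailments>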


Challenges of New AI and the Role of Symbolic AI: Hallucinations, Accountability, and Energy Efficiency Issues

So let's talk a little bit about the new AI and its relationship to the old AI, if you will. And so the real question is, can the old AI help the new AI? So you've all seen this quote. The important thing to note about this quote is that Arthur C. Clarke said, any sufficiently advanced technology is indistinguishable from magic. He did not say it's magic. And there are a lot of people who treat the new AI as if it's some sort of magic, and it's not. And it's really important to keep that in mind.


Now, there are some challenges. You all know that these are challenges, so first of all, hallucinations, obviously. LLMs make stuff up. And that's a feature. They're designed to do that. But that makes it difficult to build systems. Anthropomorphism. People start to think of their chatbots as if they're actual human beings. And the answers sound convincing. People believe them. They don't check. And that's a problem. And computational efficiency is definitely a problem. I mean, if you already know how to calculate something, just calculate it. Don't ask an LLM. It's incredibly inefficient. And our energy supplies are finite.


Related to this, there are some interesting historical criticisms of AI. Joseph Weizenbaum built what was arguably the first chatbot back in the 1960s. And it was, of course, very crude compared to what we have today. It used things like rules and pattern matching. But he noticed that people started to converse with this chatbot as if it were a real human being, and Weizenbaum then became a vocal critic of AI. He didn't like the idea of anthropomorphizing computers.

The Chinese Room thought experiment raises an interesting philosophical question: what is consciousness, and what is understanding? Can a machine understand something? Can an artificial system be conscious? I used to think this was just a philosophical question, but more recently it has perhaps become a practical one. These questions are very relevant today.


The history of AI is about scarcity. We've always had a scarcity of something. We didn't have the right algorithms. We didn't have enough memory. We didn't have fast enough processors. These are more or less solved in many ways. There's enough of them. But today, we have other scarcities.

So here's a list of the scarcities that we face today. So let's talk a little bit about this.


We want correct answers, right? We don't want the system to hallucinate nonsense. So instead of having the LLM generate the answer, you can extract the answer from a knowledge graph. Accountability, trustworthiness, this is the same thing. You want to know where the answer came from. If it came from your knowledge graph, you can ensure that that knowledge graph is curatable and auditable and contains verifiable factual information.


You want to be able to explain the answer. An insurance company tells a policyholder that their claim has been rejected. The first question that the policyholder is going to have is, probably, why was it rejected? They want an explanation, and explainability is a salient feature of symbolic reasoning. When you do symbolic reasoning, you can always backtrack to why you got to a particular answer.


And then there's this energy problem. Getting an answer to a question from a knowledge graph is literally orders of magnitude more efficient than getting it from an LLM. So there's a growing understanding that future AI systems are really going to be hybrid. They're going to combine different sorts of technologies, symbolic and these new non-symbolic ones.

Hybrid AI Architectures: Dual Process Models and Integration of LLMs and Knowledge Graphs

But in cognitive science, there was this idea of cognition referred to as the dual process model. It's an idea that was popularized by Daniel Kahneman's book, "Thinking, Fast and Slow." It wasn't originally his idea, but he popularized it. And the idea is that you have a fast process that gives you an answer immediately, but that answer might be inaccurate or biased; or you can deliberate a little bit more, and then you'll get a more reliable answer.

And back in the seventies, there was a lot of very interesting research when people started to really seriously think about intelligence and consciousness, and they started to draw what we would now think of as architecture diagrams, software architecture diagrams. Aaron Sloman did a lot of that, Daniel Dennett did a lot of that. So here's my very simplified idea of how you could use an LLM and a knowledge graph.


So you ask a question, the LLM converts the question into a query. The query is executed, and the answer comes back from the knowledge graph. And of course, that could include symbolic reasoning, not just basic query processing. So how would this fast and slow mechanism work? Well, it might work something like this. The LLM gives you a fast answer, or you can go through another route and get a more effortful answer. Other architectures are possible, of course, and completely reasonable, but as a simple example to explain what these might look like, I like this one.


Now you're going to ask, well, this is Graph RAG, right? This is not Graph RAG. Graph RAG uses graphs, not necessarily knowledge graphs, but it essentially enhances the performance or the accuracy or whatever of the LLM. But I want to turn this around, and I want to use LLMs to enhance the performance and the usability of knowledge graphs. And now with LLMs, we finally have decent natural language processing capabilities. So taking a question and converting it into something, but the answer comes from a knowledge graph, seems to me like an excellent idea.


Neurosymbolic AI: Diverse Approaches Combining Symbolic and Non-Symbolic Processing

And then finally, people are talking about neurosymbolic AI, so let me just explain what that is. Symbolic AI is often seen as unrealistic and too idealistic. Everything has to be perfect for things to work well. On the positive side, the representations are completely transparent, and you can explain the results.


Non-symbolic AI, on the other hand, can deal with messy real-world data, but the representations are opaque. You can't actually look inside, and you can't explain the results. So what we really want is to get all the green bits, right?


So let me give you a few examples of what neurosymbolic AI might be. It's not a single architecture or a single methodology. There are many ways of combining symbolic and non-symbolic processing. You can combine the two, for instance, by using heuristic search, but then the evaluation function in your search actually uses neural processing. And obviously, this System One and System Two architecture is also an example of that.

You can also take non-symbolic inputs, like pictures and images, convert them into some sort of symbolic representation, and then use symbolic techniques to reason from those symbolic representations. Or you can take symbolic data and use it together with non-symbolic machine learning algorithms. For instance, there's been some research where they simplify mathematical equations and give that as training data to a machine learning system, and then they claim that they've created a machine learning system that can do mathematical simplification. This is still speculative, but the results are surprisingly good.


Conclusion: The Modern Significance of 80-Year-Old AI Technologies and the Future of Hybrid Systems

Okay, I should probably wrap things up. I hope you take something away from this. AI has been around for a long time, much longer than three years, right? And over this 80-year period, many useful techniques and technologies have been developed. And these technologies, once they become well-understood, actually stop being called AI. In the early 2000s, there was a dramatic shift from symbolic to non-symbolic AI, but symbolic AI technologies are still relevant.


Ontologies, you can do this. You can use ontologies, build knowledge graphs, operationalize the processing of your data, use ontologies to better document your data, and even use inference. All these things are very real. And then finally, you can use symbolic techniques to mitigate some of the problems that LLMs have. Not just ontologies and knowledge graphs, but things like planning and autonomous agents that came out of symbolic AI are also very useful. I didn't have time to talk about that, but I'm here, so if you're really interested, come and ask. The whole idea of planning, in particular, tends not to be done very much in modern agent implementations, but I've always thought that for an agent to exhibit autonomy, it has to be able to plan. So you're going to see a lot more of these hybrid systems in the future. We have a little bit of time. I won't take questions now, but come up and talk to me afterward, because it's much easier to answer questions that way.

So with that, thank you very much. And if you have any questions later, feel free to email me. Just remind me of this lecture. My email address, I'm Ora, and my email address is very hard to remember. It's Ora@Amazon.com. So thank you very much.


  • This article was automatically generated using Amazon Bedrock, while striving to maintain the original video's information as much as possible.
