AI Knowledge Map: How To Classify AI Technologies
I have been in the artificial intelligence space for a while and am aware that multiple classifications, distinctions, landscapes, and infographics exist to represent and track the different ways to think about AI. However, I am not a big fan of those categorization exercises, mainly because the effort of classifying dynamic data points into predetermined fixed boxes is often not worth the benefits of having such a “clear” framework (this is a generalization, of course; sometimes they are extremely useful).
I also believe this landscape is useful for people new to the space to grasp at-a-glance the complexity and depth of this topic, as well as for those more experienced to have a reference point and to create new conversations around specific technologies.
What follows, then, is an effort to draw an architecture to access knowledge on AI and follow emergent dynamics: a gateway to pre-existing knowledge on the topic that will allow you to scout around for additional information and eventually create new knowledge on AI. I call it the AI Knowledge Map (AIKM).
On the axes, you will find two macro-groups: the AI Paradigms and the AI Problem Domains. The AI Paradigms (X-axis) are the approaches AI researchers use to solve specific AI-related problems (including up-to-date approaches). The AI Problem Domains (Y-axis), on the other hand, are the types of problems AI has historically been used to solve. In some sense, they also indicate the potential capabilities of an AI technology.
Hence, I have identified the following AI paradigms:
- Logic-based tools: tools that are used for knowledge representation and problem-solving
- Knowledge-based tools: tools based on ontologies and huge databases of notions, information, and rules
- Probabilistic methods: tools that allow agents to act in incomplete information scenarios
- Machine learning: tools that allow computers to learn from data
- Embodied intelligence: an engineering toolbox that assumes a body (or at least a partial set of functions such as movement, perception, interaction, and visualization) is required for higher intelligence
- Search and optimization: tools that allow intelligent search through many possible solutions.
These six paradigms also fall into three different macro-approaches, namely Symbolic, Sub-symbolic, and Statistical (represented by different colors above). Briefly, the Symbolic approach states that human intelligence can be reduced to symbol manipulation; the Sub-symbolic approach is one in which no specific representation of knowledge is provided ex ante; and the Statistical approach is based on mathematical tools to solve specific sub-problems.
The vertical axis instead lays out the problems AI has been used for, and the classification here is quite standard:
- Reasoning: the capability to solve problems
- Knowledge: the ability to represent and understand the world
- Planning: the capability of setting and achieving goals
- Communication: the ability to understand language and communicate
- Perception: the ability to transform raw sensorial inputs (e.g., images, sounds, etc.) into usable information.
The patterns of the boxes divide the technologies into two groups, i.e., narrow applications and general applications. The words are used on purpose, yet may appear slightly misleading, so bear with me for a moment while I explain. For anyone getting started in AI, knowing the difference between Weak/Narrow AI (ANI), Strong/General AI (AGI), and Artificial Super Intelligence (ASI) is paramount. For the sake of clarity, ASI is, to date, mere speculation; General AI is the final goal and holy grail of researchers; and Narrow AI is what we really have today, i.e., a set of technologies unable to cope with anything outside their scope (which is the main difference from AGI).
The two types of lines used in the graph (continuous and dotted) make this distinction explicit, which should add some confidence when you read other introductory AI material. More precisely, the difference outlines technologies that can only solve a specific task (generally better than humans: Narrow applications) and those that solve, or will in the future solve, multiple tasks and interact with the world (better than many humans: General applications).
Finally, let’s see what lies within the graph itself. The map represents the different classes of AI technologies. Note that I am intentionally not naming specific algorithms but rather clustering them into macro-groups. Nor am I providing a value assessment of what works and what does not; I am simply listing what researchers and data scientists can tap into.
So how do you read and interpret the map? Let me give you two examples. If you look at Natural Language Processing, it embeds a class of algorithms that use a combination of a knowledge-based approach, machine learning, and probabilistic methods to solve problems in the domain of perception. At the same time, though, if you look at the blank space at the intersection of the Logic-based paradigm and Reasoning problems, you might wonder why no technologies sit there. The map is not conveying that no method categorically exists to fill that space, but rather that when people approach a reasoning problem, they prefer to use, for instance, machine learning.
Here is a list of technologies:
- Robotic Process Automation (RPA): technology that extracts the list of rules and actions to perform by watching the user perform a certain task
- Expert Systems: a computer program that has hard-coded rules to emulate the human decision-making process. Fuzzy systems are a specific example of rule-based systems that map variables onto a continuum of values between 0 and 1, in contrast to traditional digital logic, which results in a 0/1 outcome
- Computer Vision (CV): methods to acquire and make sense of digital images (usually divided into activities recognition, images recognition, and machine vision)
- Natural Language Processing (NLP): sub-field that handles natural language data (three main blocks belong to this field, i.e., language understanding, language generation, and machine translation)
- Neural Networks (NNs or ANNs): a class of algorithms loosely modeled after the neuronal structure of the human/animal brain that improve their performance without being explicitly instructed on how to do so. The two major and most well-known sub-classes of NNs are Deep Learning (a neural net with multiple layers) and Generative Adversarial Networks (GANs: two networks that train each other)
- Autonomous Systems: sub-field that lies at the intersection between robotics and intelligent systems (e.g., intelligent perception, dexterous object manipulation, plan-based robot control, etc.)
- Distributed Artificial Intelligence (DAI): a class of technologies that solve problems by distributing them to autonomous “agents” that interact with each other. Multi-agent systems (MAS), Agent-based modeling (ABM), and Swarm Intelligence are three useful specifications of this subset, where collective behaviors emerge from the interaction of decentralized self-organized agents
- Affective Computing: a sub-field that deals with emotion recognition, interpretation, and simulation
- Evolutionary Algorithms (EA): a subset of a broader computer science domain called evolutionary computation that uses mechanisms inspired by biology (e.g., mutation, reproduction, etc.) to look for optimal solutions. Genetic algorithms are the most used sub-group of EAs; these are search heuristics that follow the natural selection process to choose the “fittest” candidate solutions
- Inductive Logic Programming (ILP): a sub-field that uses formal logic to represent a database of facts and formulate hypotheses derived from those data
- Decision Networks: a generalization of the well-known Bayesian networks/inference, which represent a set of variables and their probabilistic relationships through a map (also called a directed acyclic graph)
- Probabilistic Programming: a framework that does not force you to hardcode specific variables but rather works with probabilistic models. Bayesian Program Synthesis (BPS) is in some sense a form of probabilistic programming, in which Bayesian programs write new Bayesian programs (instead of humans doing it, as in the broader probabilistic programming approach)
- Ambient Intelligence (AmI): a framework that requires physical devices in digital environments to sense, perceive, and respond with context awareness to an external stimulus (usually triggered by human action).
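To make the fuzzy-systems entry above concrete, here is a minimal sketch of the core idea: instead of digital logic’s 0/1 outcome, a fuzzy system maps a variable onto a continuum of membership degrees between 0 and 1. The “warm” membership function, its triangular shape, and its temperature thresholds are illustrative assumptions of mine, not taken from any specific fuzzy-logic library.

```python
def warm_membership(temp_celsius: float) -> float:
    """Degree (0..1) to which a temperature counts as 'warm'.

    Illustrative triangular membership: fully 'warm' at 25 degrees,
    fading linearly to 0 at 15 and 35 degrees.
    """
    if temp_celsius <= 15 or temp_celsius >= 35:
        return 0.0
    if temp_celsius <= 25:
        return (temp_celsius - 15) / 10  # rising edge of the triangle
    return (35 - temp_celsius) / 10      # falling edge of the triangle

print(warm_membership(25))  # 1.0: fully warm
print(warm_membership(20))  # 0.5: partially warm
print(warm_membership(10))  # 0.0: not warm at all
```

A rule-based fuzzy system then combines such membership degrees across several rules (min/max operators are a common choice) rather than firing rules as strictly true or false.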
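The genetic-algorithm entry above can likewise be sketched in a few lines: selection keeps the “fittest” candidates, reproduction recombines them, and mutation injects variation. The toy task (evolving a bit string toward all ones), the parameter values, and the function names are my own illustrative choices, not a reference implementation.

```python
import random

POP_SIZE = 50        # number of candidate solutions per generation
GENOME_LEN = 20      # bits per candidate
MUTATION_RATE = 0.02 # per-bit flip probability
GENERATIONS = 100

def fitness(individual):
    """Fitness = number of 1 bits; the optimum is all ones."""
    return sum(individual)

def crossover(a, b):
    """Single-point crossover: the 'reproduction' step."""
    point = random.randint(1, GENOME_LEN - 1)
    return a[:point] + b[point:]

def mutate(individual):
    """Flip each bit with a small probability: the 'mutation' step."""
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit
            for bit in individual]

def evolve():
    population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
                  for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        # Natural selection: keep the fitter half as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: POP_SIZE // 2]
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(POP_SIZE - len(parents))]
        population = parents + children
    return max(population, key=fitness)

best = evolve()
print(fitness(best))  # typically at or near the maximum of 20
```

Because the fittest parents are carried over unchanged each generation, the best fitness never decreases, which is why even this crude sketch reliably converges on such a small search space.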
In order to solve a specific problem, you might take one or more approaches, which in turn may mean using one or more technologies, since many of them are not mutually exclusive but rather complementary.
Teaching computers how to learn without the need to be explicitly programmed is a hard task that involves several technologies to deal with multiple nuances, and even though this map is far from perfect, it is at least a first attempt to make sense of a messy landscape.
I am aware that a strong Pareto principle emerges here, i.e., 80% (if not more) of current efforts and results are driven by 20% of the technologies pictured in the map (namely, deep learning, NLP, and computer vision), yet I am also sure that having a full spectrum might help researchers, startups, and investors.
I am open to feedback on this first version and am planning to take two additional steps. The first is creating a layer for the types of challenges AI is facing (e.g., memory issues and catastrophic forgetting, transfer learning, learning from less data with techniques like zero- and one-shot learning, etc.) and the technologies that can be used to overcome each specific issue. The second is applying lenses to look at the different technologies in terms not of the problems they solve but rather the ones they create (e.g., ethical issues, data-intensive problems, the black-box and explainability problem, etc.).