With increasing interest in Artificial Intelligence and cognitive systems, it may be time to lay the foundations for robopsychology. What can we learn from human psychology about cognitive systems?
I believe that a useful analogy can be based on the work of Nobel laureate Daniel Kahneman and his late colleague Amos Tversky, as described in Kahneman’s wonderful book, Thinking, Fast and Slow. They describe two modes of thought, called “System 1” (“fast thinking”) and “System 2” (“slow thinking”). System 1 is automatic, instinctive, fast, parallel, and emotional; it enables jumping to conclusions quickly. In contrast, System 2 is deliberative, conscious, slow, and logical; it involves careful and explicit reasoning. System 1 is optimized for common cases, and therefore has biases that result in wrong conclusions in unusual situations. The book explores these biases by presenting a series of problems, and asking the reader to respond immediately. In most cases, your responses will be wrong; I know mine were. This is because these problems were carefully chosen to expose the various biases of System 1 thinking. In contrast, System 2 thinking is more accurate, but requires considerably more effort. It is therefore impossible to use it for every task, or we would accomplish very little. Just imagine having to compute mathematically the trajectories of all the cars you see before being able to cross the street.
The early days of AI were mostly concerned with the mechanization
of slow thinking, and resulted in knowledge-based systems (or “expert
systems”), planning systems such as STRIPS,
and machine learning based on models (for example, explanation-based learning). Much of this work was based on the Knowledge
Representation Hypothesis, which states that:
Any mechanically embodied intelligent process will be
comprised of structural ingredients that (a) we as external observers naturally
take to represent a propositional account of the knowledge that the overall
process exhibits, and (b) independent of such external semantic attribution,
play a formal but causal and essential role in engendering the behavior that
manifests that knowledge.
According to this hypothesis, in order for us to assign
intelligence to a computer system, we need to be able to see how it represents
its knowledge explicitly, and how it uses that knowledge to perform tasks that
we would consider to be intelligent. It
isn’t sufficient for the task to require intelligence, since machines can use
other means (such as computation speed or access to large sources of data) to
perform the task without an explicit representation of knowledge.
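As a minimal sketch of what such an explicit representation might look like, here is a toy forward-chaining rule system in Python. The facts, the rules, and the name forward_chain are all invented for illustration; this is not taken from any particular expert-system product. The point is only that the knowledge is stored as readable propositions, and that the system can show which rules led to each conclusion.

```python
# Hypothetical rules, invented for illustration only.
# Each rule is (set of premises, conclusion).
RULES = [
    ({"has_feathers", "lays_eggs"}, "is_bird"),
    ({"is_bird", "cannot_fly", "swims"}, "is_penguin"),
]


def forward_chain(facts, rules):
    """Apply rules until no new fact can be derived.

    Returns the full set of known facts and a trace of the rules that fired,
    so the reasoning can be inspected afterwards.
    """
    known = set(facts)
    trace = []
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire a rule only if all its premises are known
            # and its conclusion is not already known.
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                trace.append((sorted(premises), conclusion))
                changed = True
    return known, trace


known, trace = forward_chain(
    {"has_feathers", "lays_eggs", "cannot_fly", "swims"}, RULES
)
for premises, conclusion in trace:
    print(f"{' and '.join(premises)} => {conclusion}")
```

The trace is what distinguishes this style of system: every conclusion can be traced back to the explicit propositions that produced it.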
Knowledge-based technology, originally developed for AI, is
widely used today in various industries, such as insurance and finance, and
there are several successful commercial and open-source rule systems available
for building such applications. For most
AI tasks, unfortunately, knowledge-based systems did not scale well, and were
not very successful commercially.
This led to a backlash in AI research (sometimes called the “AI winter”), the abandonment of the Knowledge Representation Hypothesis, and the rise of statistical methods in AI. These methods, including various forms of neural networks used to support large-scale machine learning, were very successful in many applications, such as deciphering handwriting and understanding speech, which had previously been very difficult to automate. These methods scale well with increasing computational power, and can solve larger and more difficult problems as more resources become available.
This manifest success of System 1 methods in AI has profoundly changed the field, its methods and goals. Instead of focusing on understanding, it is now concerned with classification and information retrieval. These systems learn, but cannot in general externalize what they learn, since it is encoded in a large set of numeric weights in a neural network or similarly opaque representation.
The domain-specific focus of knowledge-based systems was driven by the need to create domain knowledge manually, but the goal was to create universal mechanisms for representing the knowledge, so that different sources could eventually be used together in a single system. In contrast, statistical methods are now being tuned for each problem separately. They can work together to the extent that they can be combined through their end results, but they don’t share a common representation. For example, in this paradigm it is impossible to combine one system that solves geometry problems with another that solves algebraic problems to obtain a system that can solve problems that require both geometrical and algebraic reasoning. Thus, each of these systems is an island; it can perhaps be pushed further in a single direction, but is unlikely to work with other systems to solve more complex problems (except by combining results rather than mechanisms).
Pat Langley’s paper “The Cognitive Systems Paradigm” (Advances in Cognitive Systems 1, 2012) analyzes this shift in the field of AI and cognitive systems, and proposes six features that distinguish research on fundamental issues of machine intelligence from more recent approaches, which he characterizes as producing “impressive but narrow idiot savants.” I recommend this paper for anyone interested in the original goals of AI and in getting it back on track.
Very interesting. I just read "The Master Algorithm" by Pedro Domingos. He describes the efforts to combine the symbolic and statistical approaches into a unified "master algorithm". Would that be a move towards human-like reasoning that combines Kahneman’s System 1 and System 2?