Tuesday, June 7, 2016

Robopsychology, Part 1: What Goes on in the Mind of a Cognitive System?

Dr Susan Calvin in Asimov’s I, Robot is a robopsychologist.  As Asimov explains in the preface, this means a psychologist of robots, and is not to be misunderstood as a robot who is also a psychologist.  Robopsychology should perhaps be a sub-discipline of Cognitive Psychology, but the latter term is firmly entrenched as relating exclusively to human psychology.

With increasing interest in Artificial Intelligence and cognitive systems, it may be time to lay the foundations for robopsychology.  What can we learn from human psychology about cognitive systems?

I believe that a useful analogy can be based on the work of Nobel laureate Daniel Kahneman and his late colleague Amos Tversky, as described in Kahneman’s wonderful book, Thinking, Fast and Slow.  They describe two modes of thought, called “System 1” (“fast thinking”) and “System 2” (“slow thinking”).  System 1 is automatic, instinctive, fast, parallel, and emotional; it enables jumping to conclusions quickly.  In contrast, System 2 is deliberative, conscious, slow, and logical; it involves careful and explicit reasoning.  System 1 is optimized for common cases, and therefore has biases that lead to wrong conclusions in unusual situations.  The book explores these biases by presenting a series of problems and asking the reader to respond immediately.  In most cases, your responses will be wrong; I know mine were.  That is because these problems were carefully chosen to expose the various biases of System 1 thinking.  System 2 thinking, in contrast, is more accurate, but it requires considerably more effort; if we used it for every task, we would accomplish very little.  Just imagine having to compute mathematically the trajectories of all the cars you see before being able to cross the street.

Cognition requires both types of thinking.  For example, you would use fast thinking to zero in on a small number of candidate solutions, then use slow thinking to evaluate those candidates carefully, in order to consciously overcome System 1 biases.  To complete the cycle, you can use slow thinking to direct a new burst of fast thinking, leading to a new evaluation step.
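
To make the cycle concrete, here is a minimal sketch in Python of a propose-fast, verify-slowly loop, applied to the bat-and-ball puzzle from Kahneman’s book (a bat and a ball cost $1.10 together, and the bat costs $1.00 more than the ball).  The function names and the division of labor are my own illustration, not an established algorithm:

    # Toy sketch (my own illustration) of the fast/slow cycle on the
    # bat-and-ball puzzle: total cost $1.10, bat costs $1.00 more than ball.

    def fast_guess(total, difference):
        """System 1: jump to the intuitive answer by simple subtraction."""
        return total - difference              # 0.10 -- plausible but wrong

    def slow_verify(ball, total, difference):
        """System 2: explicitly check a candidate against both constraints."""
        bat = ball + difference
        return abs((bat + ball) - total) < 1e-9

    def slow_solve(total, difference):
        """System 2 fallback: deliberate algebra instead of intuition."""
        return (total - difference) / 2        # ball = 0.05

    total, difference = 1.10, 1.00
    candidate = fast_guess(total, difference)      # fast thinking proposes
    if not slow_verify(candidate, total, difference):
        candidate = slow_solve(total, difference)  # slow thinking corrects
    print(candidate)                               # ~0.05

The intuitive guess ($0.10) fails the explicit check, and the deliberate computation corrects it to $0.05, which is exactly the division of labor described above.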

The early days of AI were mostly concerned with the mechanization of slow thinking, and resulted in knowledge-based systems (or “expert systems”), planning systems such as STRIPS, and machine learning based on models (for example, explanation-based learning).  Much of this work was based on the Knowledge Representation Hypothesis, formulated by Brian Cantwell Smith, which states that:
Any mechanically embodied intelligent process will be comprised of structural ingredients that (a) we as external observers naturally take to represent a propositional account of the knowledge that the overall process exhibits, and (b) independent of such external semantic attribution, play a formal but causal and essential role in engendering the behavior that manifests that knowledge.
According to this hypothesis, in order for us to assign intelligence to a computer system, we need to be able to see how it represents its knowledge explicitly, and how it uses that knowledge to perform tasks that we would consider to be intelligent.  It isn’t sufficient for the task to require intelligence, since machines can use other means (such as computation speed or access to large sources of data) to perform the task without an explicit representation of knowledge.
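
To illustrate what an explicit representation looks like, here is a toy forward-chaining rule engine in Python.  This is my own minimal sketch, not the design of any particular expert system: the point is that the facts and rules are ordinary data that an external observer can read, and the very same structures causally drive the inference.

    # Toy illustration (my own, not from any real system): knowledge as
    # explicit, inspectable data.  Facts are tuples; a rule derives a new
    # fact whenever all of its premises match, binding variables like "?x".

    facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}

    rules = [
        ([("parent", "?x", "?y"), ("parent", "?y", "?z")],   # premises
         ("grandparent", "?x", "?z")),                       # conclusion
    ]

    def match(premise, fact, bindings):
        """Unify one premise with one fact, extending the variable bindings."""
        new = dict(bindings)
        for p, f in zip(premise, fact):
            if p.startswith("?"):
                if new.setdefault(p, f) != f:
                    return None
            elif p != f:
                return None
        return new

    def forward_chain(facts, rules):
        """Naive forward chaining: apply rules until no new facts appear."""
        changed = True
        while changed:
            changed = False
            for premises, conclusion in rules:
                envs = [{}]
                for premise in premises:       # satisfy premises left to right
                    envs = [b2 for b in envs for f in facts
                            if len(f) == len(premise)
                            for b2 in [match(premise, f, b)] if b2 is not None]
                for env in envs:
                    new_fact = tuple(env.get(t, t) for t in conclusion)
                    if new_fact not in facts:
                        facts.add(new_fact)
                        changed = True
        return facts

    print(forward_chain(set(facts), rules))
    # derives ("grandparent", "alice", "carol") -- and we can read exactly why

Real rule systems are far more sophisticated (efficient pattern matching such as the Rete algorithm, conflict resolution, and so on), but they share this property: the knowledge is inspectable, and it is what produces the behavior.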

Knowledge-based technology, originally developed for AI, is widely used today in various industries, such as insurance and finance, and there are several successful commercial and open-source rule systems available for building such applications.  For most AI tasks, unfortunately, knowledge-based systems did not scale well, and were not very successful commercially.

This led to a backlash in AI research (sometimes called the “AI winter”), the abandonment of the Knowledge Representation Hypothesis, and the rise of statistical methods in AI.  These methods, including various forms of neural networks used to support large-scale machine learning, were very successful in many applications, such as handwriting recognition and speech understanding, which had previously been very difficult to automate.  They scale well with increasing computational power, and can solve larger and more difficult problems as more resources become available.

This manifest success of System 1 methods in AI has profoundly changed the field, its methods, and its goals.  Instead of focusing on understanding, it is now concerned with classification and information retrieval.  These systems learn, but cannot in general externalize what they have learned, since it is encoded in a large set of numeric weights in a neural network or a similarly opaque representation.
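
For contrast with the rule engine sketched earlier, here is an equally toy example (again my own illustration): a single sigmoid unit trained by gradient descent to compute logical AND.  It learns the task, but everything it “knows” ends up in three floating-point numbers with no propositional reading.

    import math

    data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]  # logical AND
    w1, w2, b = 0.0, 0.0, 0.0      # all the unit will ever "know"
    lr = 0.5                       # learning rate

    for _ in range(2000):
        for (x1, x2), target in data:
            out = 1 / (1 + math.exp(-(w1 * x1 + w2 * x2 + b)))   # sigmoid unit
            err = out - target     # gradient of the cross-entropy loss
            w1 -= lr * err * x1
            w2 -= lr * err * x2
            b -= lr * err

    print(w1, w2, b)   # e.g. on the order of (8, 8, -12): correct, but opaque

The trained unit behaves correctly, yet nothing in (w1, w2, b) reads as a rule; scale this up to millions of weights and you have precisely the opacity described above.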

The domain-specific focus of knowledge-based systems was driven by the need to create domain knowledge manually, but the goal was to create universal mechanisms for representing the knowledge, so that different sources could eventually be used together in a single system.  In contrast, statistical methods are now being tuned for each problem separately.  They can work together to the extent that they can be combined through their end results, but they don’t share a common representation.  For example, in this paradigm it is impossible to combine one system that solves geometry problems with another that solves algebraic problems to obtain a system that can solve problems that require both geometrical and algebraic reasoning.  Thus, each of these systems is an island; it can perhaps be pushed further in a single direction, but is unlikely to work with other systems to solve more complex problems (except by combining results rather than mechanisms).

Pat Langley’s paper “The Cognitive Systems Paradigm” (Advances in Cognitive Systems 1, 2012) analyzes this shift in the field of AI and cognitive systems, and proposes six features that distinguish research on fundamental issues of machine intelligence from more recent approaches, which he characterizes as producing “impressive but narrow idiot savants.”  I recommend this paper to anyone interested in the original goals of AI and in getting the field back on track.