What does a programming language invented in the 1970s have to do with today’s cutting-edge A.I.?
Quite a lot, as it turns out. But first a little background: Most of today’s breakthroughs in artificial intelligence have been the result of neural networks. Self-driving cars and software that can beat the world’s best players at games like Go and Starcraft, write uncannily humanlike prose, and detect breast cancer in mammograms better than an experienced radiologist—that’s all neural networks.
But neural networks have some drawbacks. It is difficult to endow them with existing knowledge: The laws of physics, for instance, or the grammar of a language. Neural networks can learn these rules from scratch, by trial and error, but that takes lots of time, computing power, and data—all of which can be expensive.
Another problem: Neural networks have a tendency towards what data scientists call “overfitting.” That’s when a machine learning model finds correlations in its training data that seem to have predictive power, but which turn out to be spurious in the context of the algorithm’s intended purpose. This problem comes about because neural networks can ingest so much data and encode relationships along so many dimensions that they can always find patterns, but they can’t easily figure out causation.
A famous example: University of Pittsburgh researchers used a neural network to try to predict which patients with pneumonia were most at risk of sudden deterioration. After being trained on historical data, the algorithm falsely classified patients with asthma as extremely low risk. It turned out that (human) doctors, knowing asthma patients were at high risk, were more vigilant with them and intervened earlier and more aggressively. So, yes, in the training data, it looked like asthma correlated with good patient outcomes. But that wasn’t a very helpful correlation for an algorithm that triages pneumonia patients.
Recently, a team of researchers from Johns Hopkins University and Bloomberg came up with a method with the potential to overcome some of these problems. (Full disclosure: I used to work at Bloomberg and know one of the people involved in the research.) Surprisingly, at the heart of their solution is Datalog, a logic programming language developed in the late 1970s.
Datalog is a derivative of Prolog, a programming language invented in 1972 by A.I. researchers interested in getting computers to understand French. Datalog followed five years later and was designed specifically for writing rules that query a database of facts. Logic programming languages like these were important in that era of A.I. research, when scientists thought the best way to imbue computers with intelligence was through a series of high-level rules or instructions for how the software should manipulate data. This kind of “symbolic A.I.” reached its apogee in the 1970s and early 1980s with so-called “expert systems,” software that tried to mimic the decision-making of human specialists in various fields from accounting to chemistry.
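To make "rules for querying a database of facts" concrete, here is a minimal sketch in Python of how a Datalog-style rule could be evaluated. It is not the researchers' method, and it is not real Datalog syntax outside the comments. The family facts, the names, and the `ancestors` function are all hypothetical, chosen only for illustration.

```python
# Hypothetical facts: parent(X, Y) means "X is a parent of Y".
parent = {("alice", "bob"), ("bob", "carol"), ("carol", "dave")}


def ancestors(parent_facts):
    """Derive every ancestor(X, Y) fact implied by two Datalog-style rules:

        ancestor(X, Y) :- parent(X, Y).
        ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).

    The rules are applied repeatedly until no new facts appear.
    """
    derived = set(parent_facts)  # rule 1: every parent is an ancestor
    while True:
        # Rule 2: join parent(X, Z) with the ancestor(Z, Y) facts found so far.
        new = {
            (x, y)
            for (x, z) in parent_facts
            for (z2, y) in derived
            if z == z2
        }
        if new <= derived:  # nothing new was derived, so we are done
            return derived
        derived |= new


ancestor = ancestors(parent)

# Query: who are alice's descendants?
print(sorted(y for (x, y) in ancestor if x == "alice"))
```

The programmer states only the facts and the rules. The engine, here a simple loop, works out everything that follows from them, which is the basic appeal of logic programming.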
Computer science turned away from symbolic A.I. because, while it was good at logic, it wasn’t very good with perception (is that a cat …