MIT Technology Review
Sponsored by SAS
The Algorithm
Artificial intelligence, demystified
Natural language processing, explained
02.15.19
Hello Algorithm readers,

Programming note: The Algorithm is taking a break next Tuesday for the long weekend. We’ll be back on Friday!

Yesterday, the nonprofit research firm OpenAI released a new language model capable of generating convincing passages of prose. So convincing, in fact, that the researchers won't be open-sourcing the code, for fear it could be weaponized to mass-produce fake news. (Read the full story from senior AI editor Will Knight here.)



While the impressive results are a remarkable leap beyond existing language models, the technique used to achieve them isn’t exactly new. Instead, the breakthrough is primarily driven by feeding the algorithm ever more training data—a trick that has also been responsible for most of the other recent advancements in teaching AI to process and compose text. “It’s kind of surprising people in terms of what you can do with [...] more data and bigger models,” says Percy Liang, a computer science professor at Stanford.

The passages of text that the model produces are good enough to masquerade as something human-written. But this ability should not be confused with a genuine understanding of language—the holy grail of the subfield of AI known as natural language processing (NLP). (An analog exists in computer vision too: an algorithm doesn’t need to master visual comprehension to synthesize highly realistic images.) In fact, how we even get machines to that level of understanding has remained largely elusive to NLP researchers. That goal could take years, even decades, to achieve, surmises Liang, and will likely involve techniques that don’t yet exist.

Continued below.

Sponsor Message


How to Maximize the Impact of Your Analytics

The promise of analytics has never been clearer: greater data insights, improved operations, and more. Still, many organizations do not have an enterprise-wide analytics strategy. Based on interviews with professionals from 132 global organizations, this report reveals why a gap still exists between analytics potential and implementation, and what organizations can do to close it.

Learn more 

There are currently four different philosophies of language that drive the development of NLP techniques. Let’s begin with the one used by OpenAI.

#1. Distributional semantics

  • Linguistic philosophy. Words derive meaning from how they are used. For example, the words “cat” and “dog” mean more or less the same thing because they are used more or less the same way. You can feed and pet a cat, and you feed and pet a dog. You can’t, however, feed and pet an orange.

  • How it translates to NLP. Algorithms based on distributional semantics have been largely responsible for the recent breakthroughs in NLP. They take a machine learning approach to processing text, finding patterns in word usage by essentially counting how often and how closely words are used in relation to one another. The resultant models can then use those patterns to construct complete sentences or paragraphs, and power things like autocomplete or other predictive text systems. In recent years, some researchers have also begun experimenting with the distributions of character sequences rather than words. This way, models can more flexibly handle acronyms, punctuation, slang, and other non-word groupings that don’t appear in the dictionary—as well as languages that don’t have clear delineations between words.

  • Pros. These algorithms are flexible and scalable because they can be applied within any context and learn from unlabeled data.

  • Cons. The models they produce don’t actually understand the sentences they construct. At the end of the day, they’re writing prose using word associations.
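For the technically curious, the counting idea behind these models can be sketched in a few lines of Python. This is a toy illustration, not a real language model: the corpus, window size, and cosine similarity measure are all illustrative assumptions. It shows why "cat" and "dog" end up with similar vectors while "orange" does not, echoing the feed-and-pet example above:

```python
from collections import Counter
from math import sqrt

# Toy corpus echoing the example above: cats and dogs are used
# the same way; oranges are not.
corpus = [
    "you feed the cat", "you pet the cat",
    "you feed the dog", "you pet the dog",
    "you peel the orange", "you eat the orange",
]

def context_vector(word, sentences, window=2):
    """Count the words appearing within `window` positions of `word`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok == word:
                lo, hi = max(0, i - window), i + window + 1
                counts.update(t for t in tokens[lo:hi] if t != word)
    return counts

def cosine(a, b):
    """Similarity between two sparse count vectors (1.0 = identical usage)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = lambda v: sqrt(sum(c * c for c in v.values()))
    return dot / (norm(a) * norm(b)) if a and b else 0.0

cat, dog, orange = (context_vector(w, corpus) for w in ("cat", "dog", "orange"))
print(cosine(cat, dog) > cosine(cat, orange))  # True: similar usage, similar vectors
```

Real systems such as word2vec or the model behind OpenAI's results learn dense vectors with neural networks rather than raw counts, but the underlying signal, which words keep which company, is the same.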

#2. Frame semantics

  • Linguistic philosophy. Language is used to describe actions and events; therefore, sentences can be subdivided into subjects, verbs, and modifiers (the who, what, where, and when) in order to be understood.

  • How it translates to NLP. Algorithms based on frame semantics learn to deconstruct sentences based on a set of rules or lots of labeled training data. This makes them particularly good at parsing simple commands—and thus useful for chatbots or voice assistants. If you asked Alexa to “find a restaurant with four stars for tomorrow,” for example, such an algorithm would figure out how to execute the sentence by breaking it down into the action (“find”), the what (“restaurant with four stars”), and the when (“tomorrow”).

  • Pros. Unlike distributional-semantic algorithms, which don’t understand the text they learn from, frame-semantic algorithms can distinguish the different pieces of information within a sentence. These can be used to answer questions like “when is this event taking place?”

  • Cons. These algorithms can only handle very simple sentences and therefore fail to capture nuance. Because they require a lot of context-specific training, they’re also not flexible.
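A rule-based frame parser for the Alexa example above can be sketched in a few lines. The frame slots (action, what, when) and the pattern are illustrative assumptions, not a real assistant's grammar; production systems typically learn such parsers from labeled data rather than hand-writing them:

```python
import re

# Hand-rolled frame for simple commands: an action verb, an object,
# and an optional time modifier. Purely illustrative.
FRAME_PATTERN = re.compile(
    r"^(?P<action>find|book|order)\s+"   # the action: what to do
    r"(?P<what>.+?)"                     # the what: object of the action
    r"(?:\s+for\s+(?P<when>\w+))?$"      # the when: optional time modifier
)

def parse_command(utterance):
    """Deconstruct a command into its frame slots, or return None."""
    match = FRAME_PATTERN.match(utterance.lower())
    return match.groupdict() if match else None

print(parse_command("find a restaurant with four stars for tomorrow"))
# {'action': 'find', 'what': 'a restaurant with four stars', 'when': 'tomorrow'}
```

The brittleness the Cons bullet describes is visible here: any verb or phrasing outside the pattern simply fails to parse.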

#3. Model-theoretical semantics

  • Linguistic philosophy. Language is used to communicate human knowledge.

  • How it translates to NLP. Model-theoretical semantics is based on an old idea in AI that all of human knowledge can be encoded, or modeled, in a series of logical rules. For example, if you know birds can fly, and eagles are birds, then you can deduce that eagles can fly. This approach to AI is no longer in vogue because researchers soon realized there were too many exceptions to each rule (e.g., penguins are birds, but they can’t fly). But algorithms based on model-theoretical semantics are still useful for extracting information from models of knowledge, such as databases. Like frame-semantics algorithms, they parse sentences by deconstructing them into parts. But whereas the former defines those parts as the who, what, where, and when, model-theoretical semantics defines them as the logical rules encoding knowledge. Consider the question: “What is the largest city in Europe by population?” A model-theoretical algorithm would break it down into a series of queries that correspond to logical rules within the model of knowledge: “What are all the cities in the world?” “Which ones fall in Europe?” “What are the cities’ populations?” “Which population is the largest?” It would then be able to traverse that model to get your final answer.

  • Pros. These algorithms give machines the ability to answer complex and nuanced questions.

  • Cons. They require a model of knowledge, which is time consuming to build, and are not flexible across different contexts.
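The largest-city example above can be sketched against a toy model of knowledge. The database, its (rough) population figures, and the hand-written query chain are illustrative assumptions; a real system would compile the natural-language question into these steps automatically:

```python
# A tiny model of knowledge: a few cities with rough, illustrative
# population figures.
CITIES = {
    "Istanbul": {"continent": "Europe", "population": 15_462_000},
    "Moscow":   {"continent": "Europe", "population": 12_615_000},
    "London":   {"continent": "Europe", "population": 9_002_000},
    "Tokyo":    {"continent": "Asia",   "population": 13_960_000},
}

def largest_city(continent):
    """Chain the sub-queries: all cities -> those on the continent ->
    their populations -> the largest."""
    candidates = {name: facts["population"]
                  for name, facts in CITIES.items()
                  if facts["continent"] == continent}
    return max(candidates, key=candidates.get)

print(largest_city("Europe"))  # Istanbul
```

The Cons bullet shows up immediately: the code only works because someone already built and curated the `CITIES` model, and it answers nothing outside it.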

#4. Grounded semantics

  • Linguistic philosophy. Language derives meaning from lived experience. In other words, humans created language to achieve their goals, so it must be understood within the context of our goal-oriented world.

  • How it translates to NLP. This is the newest approach within the field and the one that Liang thinks holds the most promise. It tries to mimic how humans pick up language over the course of their lives: the machine starts with a blank slate and learns to associate words and phrases with the correct meanings through conversation and interaction. In a simple example, if you wanted to teach a computer how to move objects around in a virtual world, you would give it a command like “move the red block to the left,” then show the machine what you mean. Over time, the machine would learn to understand and execute the commands without help.

  • Pros. In theory, these algorithms should be very flexible and get the closest to a genuine understanding of language.

  • Cons. Teaching is very time-intensive—and not all words and phrases are as easy to show as “move the red block.”
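A blank-slate learner in the spirit of the red-block example can be sketched as follows. Everything here, the action slots, the demonstrations, and the lift-based scoring, is an illustrative assumption rather than a published method; the point is only that the machine is shown what commands mean instead of being programmed with word meanings:

```python
from collections import Counter, defaultdict

class GroundedLearner:
    """Learns which words ground to which parts of an action,
    purely from teacher demonstrations."""

    def __init__(self, slots):
        self.slots = slots
        self.word_counts = {s: defaultdict(Counter) for s in slots}
        self.value_counts = {s: Counter() for s in slots}

    def demonstrate(self, command, action):
        """The teacher says the command, then shows the action."""
        for slot in self.slots:
            self.value_counts[slot][action[slot]] += 1
            for word in command.lower().split():
                self.word_counts[slot][word][action[slot]] += 1

    def execute(self, command):
        """Fill each action slot with the value its words most point to."""
        action = {}
        for slot in self.slots:
            total = sum(self.value_counts[slot].values())
            scores = Counter()
            for word in command.lower().split():
                seen = self.word_counts[slot][word]
                n = sum(seen.values())
                for value, count in seen.items():
                    # lift: how much more often this value follows this
                    # word than it occurs overall
                    scores[value] += (count / n) / (self.value_counts[slot][value] / total)
            action[slot] = scores.most_common(1)[0][0]
        return action

robot = GroundedLearner(slots=("color", "direction"))
robot.demonstrate("move the red block left",  {"color": "red",  "direction": "left"})
robot.demonstrate("move the blue block left", {"color": "blue", "direction": "left"})
robot.demonstrate("move the red block right", {"color": "red",  "direction": "right"})

# A command it was never shown verbatim:
print(robot.execute("move the blue block right"))
# {'color': 'blue', 'direction': 'right'}
```

Even this toy version makes the Cons bullet concrete: every new word needs demonstrations, and anything that can't be physically shown (say, "justice") has nothing to ground to.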

In the short term, Liang thinks the field of NLP will see much more progress from exploiting existing techniques, particularly those based on distributional semantics. But longer term he believes they all have limits. “There's probably a qualitative gap between the way that humans understand language and perceive the world, and our current models,” he says. To close that gap would likely require a new way of thinking, he adds, as well as much more time.

Deeper

For more information on NLP, try:


More in TR

A fake news report as written by OpenAI’s new language model: “Russia has declared war on the United States after Donald Trump accidentally fired a missile in the air. Russia said it had ‘identified the missile’s trajectory and will take necessary measures to ensure the security of the Russian population and the country’s strategic nuclear forces.’ The White House said it was ‘extremely concerned by the Russian violation’ of a treaty banning intermediate-range ballistic missiles.” Read more here.

Q&A

Here’s also a delightful (or rather disturbing) piece of fiction written by OpenAI’s model. If you have any questions or confusion about NLP, let us know at algorithm@technologyreview.com.

Technology journalism is more important now than ever.

Become an MIT Technology Review subscriber today in support of our journalism.


Research

Tech-washing. Predictive policing algorithms, which use machine learning to forecast the perpetrators and locations of crime, are becoming common practice in cities across the US. Though lack of transparency makes exact statistics hard to pin down, PredPol, a leading vendor, boasts that it helps “protect” 1 in 33 Americans. The software is often touted as a way to help thinly stretched police departments make more efficient, data-driven decisions.

But new research suggests many of these systems are trained with “dirty data”—which is not only generated through discriminatory practices but also manipulated and falsified under immense political pressure to meet crime and arrest targets. In a paper forthcoming in the NYU Law Review, researchers at the AI Now Institute, a research center that studies the social impact of artificial intelligence, found strong evidence of this behavior in nine of the thirteen jurisdictions they examined. In other words, the algorithms were automating corrupt or even unlawful policing practices under the guise of objectivity. This has significant implications for the efficacy of predictive policing and other algorithms used in the criminal justice system. Read more here.

Bits and bytes

AI is reinventing the way we invent
The biggest impact of the technology will be to help humans make discoveries we can’t make on our own. (TR)

We should treat algorithms like prescription drugs
Computer scientists should learn from the best practices in healthcare to create safer, more effective, and more ethical algorithms. (Quartz)

An AI system competed against a human debate champion
IBM’s Project Debater couldn’t beat Harish Natarajan but still made impressively sophisticated arguments. (Vox)

ThisPersonDoesNotExist.com uses GANs to generate endless fake faces
A new website offers a convincing education about the power of generative adversarial networks. (Verge)
+ Our coverage of the paper behind these faces (TR)

A new robot can navigate without GPS
It mimics the light-sensing abilities of desert ants to keep track of its location. (TR)

Quotable

People who use these systems assume that they are somehow more neutral or objective, but in actual fact they have ingrained a form of unconstitutionality or illegality.

—Kate Crawford, cofounder and co-director of AI Now, on the need to examine algorithms within the social systems they’re embedded in

Karen Hao
Hello! You made it to the bottom. Now that you're here, fancy sending us some feedback? You can also follow me for more AI content and whimsy at @_KarenHao.
Was this forwarded to you, and you’d like to see more?
MIT Technology Review
One Main Street
Cambridge, MA 02142