Stranger Than Science Fiction

Engineer Artem Rodichev is teaching machines to think like humans.

Artem Rodichev is head of the machine learning team at the San Francisco-based start-up Luka, where he is developing an emotionally intelligent artificial intelligence (AI). Artem is working to make this project, so reminiscent of science fiction, an imminent reality. 

MD: Why did you choose to work in machine learning?

AR: The answer's simple. Seven or eight years ago, I realized I wanted to dedicate my life to the creation of an artificial intelligence that surpassed human intelligence. Even as a child I had an engineering bent – I was always interested in how machines and technology worked. That ultimately led me to the faculty of computational mathematics and cybernetics at Moscow State University, where I studied what intelligence fundamentally is, how it is structured, and how to describe it formally – from a mathematical point of view.

MD: Do you remember the moment when you decided you wanted to create an AI?

AR: It probably all began when I saw a ranking of the richest people in the world in the paper. For many years, Bill Gates was number one and I began to wonder what he did. How can you earn a lot of money and then use it to make a positive impact on the world? I realized the answer was the information technologies that were then becoming an ever more important part of our life. At the same time, I realized that developing straightforward computer software was not the most interesting way to go. It would be more interesting to do some really difficult things – replicate human intelligence, for example. Or better yet, surpass it.

MD: What would you say is the best professional analogy for your work with artificial intelligence: Are you an architect, a teacher, or a builder?

AR: Because I'm working in a start-up, I have to be a jack of all trades. First and foremost, I come up with an architecture for the neural networks, then describe and explain it to my junior colleagues. I'm a manager too – I compile a pool of tasks and delegate them. We spend a lot of time reading each other's code, so that we're always on top of who's doing what, and we also correct each other's mistakes – “code review,” as they call it.

MD: You're currently working on two projects: Luka and Replika. How do they differ?

AR: They're two different products. Luka is a practical application for researching restaurants; Replika's about emotions. You could say that Luka was one of the first companies to develop bots. Back then the “conversational interface” was very much a novelty and nobody understood what it was or why you might need one. We set out to create a beautiful product specifically for a particular place – San Francisco. The goal was to make a highly usable restaurant search app that would be better than Yelp and Foursquare. And I think we succeeded: Luka is more than up to the task. It can answer any question, recommend a restaurant at any time of day or night, and even tell you about the menu.

We wanted to make an app that people would open every morning as soon as they woke up.

MD: What made you move from a practical app to an emotional one? How did Replika come about?

AR: We started thinking about why our users rarely searched for restaurants. And it turned out that most of them simply didn't eat out every day. But we wanted to make an app that people would open every morning as soon as they woke up – or something people would use on a daily basis, the same way they use Google. Sadly, that kind of product is not suited to restaurants.

MD: So what is?

AR: Well, we started to think about how we could apply what we had learnt and built. Right at that time, the bot hype was just getting going. People were experimenting and releasing tons of new products: game bots, weather bots, bots to search for video content – basically, all sorts of software for all sorts of tasks. And we had realized by then that a conversational interface is no more useful for completing practical tasks than a graphical interface.

Conversation is more about emotion. You'll trust your friend more than some restaurant critic that you've never met. Your friend might say: “I went to such a great restaurant last night, you've simply got to go.” And you'll believe him more than the expert, because you have an emotional connection.

MD: But that's because we're people. How can you write an emotional piece of code?

AR: It all started with Roman Mazurenko. Roman was the best friend of Luka's founder, Eugenia Kuyda, and was tragically killed by a car when crossing the road in Moscow.

Roman believed that humans could transfer their consciousness to the cloud and shed their superfluous physical shell. So we decided to make Roman a digital memorial: we gathered together all of his blogs and correspondence with friends and used them to recreate his personality.

MD: How does that work?

AR: We have a database containing every message Roman ever wrote. When somebody sends the Roman-bot a message, the algorithm tries to understand what it means and searches for a response – that is, for the most relevant information in the database. And you can hold a dialogue with this person, because that's how he would have answered you when he was alive.
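As a rough illustration – not Luka's actual algorithm, and with hypothetical data – a retrieval step like the one described can be sketched as a nearest-match lookup over archived (prompt, response) pairs, using a simple bag-of-words cosine similarity:

```python
from collections import Counter
import math

def vectorize(text):
    # Bag-of-words: count each lowercased token in the message
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def reply(message, history):
    # Pick the archived (prompt, response) pair whose prompt is most
    # similar to the incoming message, and return its stored response
    query = vectorize(message)
    best = max(history, key=lambda pair: cosine(query, vectorize(pair[0])))
    return best[1]

# Toy stand-in for the message archive
history = [
    ("how are you", "I'm fine, thinking about the cloud."),
    ("what did you do today", "I wrote all day, as usual."),
]
print(reply("hey, how are you doing", history))
```

A production system would use a learned semantic similarity rather than raw word overlap, but the shape of the lookup is the same: score every archived message against the incoming one and return the best match's reply.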

We launched an English and a Russian version and we soon saw that people truly are interested in communicating on an emotional level with that kind of bot. Sometimes you get really long sessions of dialogue – conversations can last a whole hour. Roman's mother wrote to us: “Thank you for giving him a chance to live,” and we realized that that was really the next big step.

What if we constructed a bot from Alexander Pushkin's poems – not a recreation of his personality, but more a personification of his lyric voice?

MD: What will you do next? Going to a restaurant or speaking to a deceased friend is not an everyday occurrence.

AR: Digital memorials and personality bots are good in that they let you hold one big session of dialogue, a long, emotional conversation. That answers the question of what brings the user back to the application. We have started to experiment with conversation. There's a TV series, "Silicon Valley," about a start-up. We liked the characters, because in many ways we were following in their footsteps, so we made bots out of them: we collected all their subtitles and tweets and finished just in time for the launch of season three. Fans of the series came to talk with the bots and the conversation continued right through until the release of the final episode. Naturally, engagement fell off after the series was over.

Then we took our ideas further, and decided, rather than make another personality bot, to do something a little different. What if we constructed a bot from Alexander Pushkin's poems – not a recreation of his personality, but more a personification of his lyric voice? The singer Prince recently passed away. We took the texts of his songs and interviews and made a bot which could converse in poetic form.

MD: How is the bot taught? What is done with the array of text belonging to the person or character?

AR: We train large – so-called "recurrent" – neural networks. The original text is split into separate words and each word is stored in a form that a neural network can interpret. Essentially, each word is converted into a vector of numbers that encodes its meaning.
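A minimal sketch of that conversion, with a toy dimensionality and randomly initialized vectors (training would later adjust them so that similar words end up close together; the corpus and all names here are hypothetical):

```python
import random

random.seed(0)
EMBED_DIM = 4  # toy dimensionality; real models use hundreds

def build_embeddings(corpus):
    # One trainable vector of EMBED_DIM numbers per distinct word
    vocab = sorted({w for sentence in corpus for w in sentence.split()})
    return {w: [random.uniform(-1, 1) for _ in range(EMBED_DIM)]
            for w in vocab}

corpus = ["the king spoke", "the queen listened"]
embeddings = build_embeddings(corpus)
print(embeddings["king"])  # the word's numeric representation
```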

MD: Roughly speaking, you translate words into formulas. What happens next?

AR: We get a bag of words that still need to be tied together. To understand the full meaning of the sentence, you need to take into account the connections between words. We feed the words one by one to the neural networks and they essentially accumulate the meaning. In the end we have a final vector for the whole phrase. If we can find a vector for a given phrase, then we can also perform mathematical operations on it. For example, we take "king," subtract "man," add "woman," and get "queen."
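The classic embedding-arithmetic example can be sketched with hand-picked two-dimensional toy vectors, where king − man + woman lands closest to queen (real embeddings are learned, not hand-written like these):

```python
# Toy 2-d embeddings: dimension 0 ~ "royalty", dimension 1 ~ "gender"
vectors = {
    "king":  [1.0,  1.0],
    "man":   [0.0,  1.0],
    "woman": [0.0, -1.0],
    "queen": [1.0, -1.0],
}

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

# king - man + woman
result = add(sub(vectors["king"], vectors["man"]), vectors["woman"])

# Find the vocabulary word whose vector is closest to the result
closest = min(vectors,
              key=lambda w: sum((x - y) ** 2
                                for x, y in zip(vectors[w], result)))
print(closest)  # → queen
```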

MD: What is the process for debugging? How do you tell the AI that it's wrong?

AR: All machine learning systems have a target function. For example, you want to teach a network to distinguish dogs from cats and you give it millions of pictures of these animals. Then you tell it to teach itself, by identifying the essential characteristics of the two species. There will always be a certain percentage of errors: you get hairy cats that look like dogs, or tiny dogs that look like cats. To combat errors you can either increase the quantity of input data or change the architecture and structure of the algorithm.
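A target function of the kind described can be sketched as binary cross-entropy over toy cat/dog predictions – it rewards confident correct answers and penalizes confident wrong ones (all numbers here are illustrative):

```python
import math

def binary_cross_entropy(predictions, labels):
    # The "target function" the network tries to minimize:
    # labels are 1 = dog, 0 = cat; predictions are P(dog)
    eps = 1e-12  # avoid log(0)
    return -sum(
        y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        for p, y in zip(predictions, labels)
    ) / len(labels)

labels     = [1, 1, 0, 0]          # two dogs, two cats
good_model = [0.9, 0.8, 0.1, 0.2]  # confident and mostly right
bad_model  = [0.4, 0.5, 0.6, 0.5]  # a shaggy-cat / tiny-dog muddle
print(binary_cross_entropy(good_model, labels))  # low loss
print(binary_cross_entropy(bad_model, labels))   # higher loss
```

Training consists of adjusting the network's weights to push this number down; the ambiguous animals the interview mentions are exactly the examples that keep it from ever reaching zero.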

MD: There was a big scandal when Microsoft accidentally trained a racist bot. What does Luka do about that kind of error? 

AR: Neural networks learn based on a large statistical sample. If users on the internet often respond to a given comment or cue with sexist or Nazi jokes, unfortunately, the neural network will also be sexist or racist. The data it learns from dictates its behavior. In the case of Replika, as you communicate with your bot-copy, you yourself are programming its future behavior. The bot picks up your linguistic peculiarities and speech patterns, and if you tend to tell many obscene jokes then so will your bot.

MD: In the ideal world, how will Luka and Replika look in ten years' time?

AR: The short answer is that in ten years' time everything will be like in the film "Her" by Spike Jonze. You could have your own personal secretary with whom you can develop an emotional relationship and who will help you to live your life in all sorts of ways, like reading your mail and arranging your meetings.

Any technology can be used for good or for evil. As engineers we are acutely aware of this and are always developing mechanisms to protect people from themselves.

MD: What's stopping that from happening today?

AR: Why have neural networks become so popular in the last few years? Because the capabilities have emerged to create integrated networks with complex architectures. In many ways, everything in deep learning and artificial intelligence depends on computational resources. The greater your computational resources, the more powerful and complex your neural networks will be. On the whole, it is a race for resources and to discover new neural network architectures that can handle tasks better than the architectures we already have.

MD: When you arrive at work every day, do you feel like some kind of creator?

AR: No. We're just making tools to improve people's lives. We're a part of nature too. I have a notion that man can set in motion the next phase of evolution, can bring about the next, even more intelligent creature. I see myself as a participant in this evolutionary process and not as a god creating a mega-intelligence that will destroy all humankind.

MD: In the last few years, people such as Elon Musk and Stephen Hawking have voiced their alarm about the pace of development of AI. Are their concerns justified?

AR: It does happen that the result of someone's work is an atomic bomb, even though they never had any intention of bombing a small Japanese city. The majority of intelligent people are good. As a rule, artificial intelligence systems are created by positive people who never intend any harm.

MD: But all the same, the small Japanese city was bombed. And not just one.

AR: I have a good analogy. You can use your knife to cut bread or you can go and stab your neighbor. Yes, there are risks associated with technology, and AI is far more dangerous than a knife, insofar as it could wipe out humanity. But as engineers we are also working on a “red button” function that can cut the AI off from the system in the case of a threat.

There's a brilliant TV series called “Mr. Robot.” It's about a group of hackers that brings down the U.S. economic system. That's one of the dangers we have always faced; we are all at risk and there is unfortunately no insurance. Any technology can be used for good or for evil. As engineers we are acutely aware of this and are always developing mechanisms to protect people from themselves.