Wednesday, May 12, 2010

How To Teach Machines Creativity

Kasparov said of Deep Blue, after losing to it at chess, that he sometimes "saw deep intelligence and creativity in the machine's moves." Partially, this was an accusation that the IBM programmers weren't playing fairly, but it was fundamentally a testament to the quality of play Deep Blue could bring to chess.

Of course, it wasn't really exhibiting creativity. The biggest advantage a chess-playing computer has is simply how many moves ahead it can search. It chose its moves because it knew they'd prevent Kasparov from checkmating it in the foreseeable future, not from any deep insight into the strategies of the board.

I once heard another anecdote from a professor, about an experiment in robotic movement. A (computer) mouse was tied to the back of a small, Roomba-like robot. The goal was to train the robot to avoid walls. Whenever the mouse rolled forward, the robot would get a 'reward'. Whenever it rolled backwards, the robot would be 'penalized'. These were just numbers being plugged into an algorithm, but they acted analogously to the dopamine in your brain. The robot was supposed to learn to navigate in such a way that it wouldn't have to back up, something like traveling in a circle through the room. The researchers left the robot running overnight to learn.

When they returned the next morning they were surprised to find the robot in a corner, rocking back and forth. Not the expected result. Some investigation revealed that the robot had found its way onto a rug. Whenever it moved backwards, the rug's bristles would jam up the mouse wheel. It had freedom to roll in the other direction, however. The rug let it avoid the negative feedback, circumventing the expected rules and giving it a constant euphoria as it rocked back and forth in place.

The solution seems creative. Really, though, it was just dumb luck. The robot didn't reason out that the rug might help; it blindly ended up on the rug, where it happened to take a couple of simple steps that seemed positive. It tried a few repetitions, decided this was the best it was going to find, and went with it.

So how could we incorporate a more 'real' form of creativity into AI? I believe it's all about explicitly measuring creativity. Deep Blue was 'rewarded' based on victories and losses. The robot was rewarded for moving forward. To create a creative machine, you have to reward creativity.

Let's use music as an example. What if we wanted to train a computer to compose music? First, a quick machine learning lesson.

Genetic algorithms evolve a solution to a problem. For the music example, you'd take thousands of completely random musical scores and listen to them (or ideally, evaluate them mathematically to start). They'd all sound bad, but hopefully a couple would have some redeeming feature: a rhythm, or some short snippet that sounded good. After ordering them by quality, you take the best and create a new generation, just like life does. There are a few mutations thrown in (changing notes), but mostly you use genetic crossover: take two of the musical scores, cut them up and interleave them.

If you repeat this, generation after generation, the quality of the music will increase. Short snippets of good music greatly increase the odds an individual song will pass its genetic (musical?) material to the next generation, so those good snippets grow in number. Eventually music evolves.
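As a rough illustration, the loop described above might be sketched in Python like this. Everything here is an assumption made for the example: scores are just lists of MIDI pitch numbers, the `quality` function is a crude stand-in for listening (it counts small steps between neighboring notes), crossover is a single cut rather than a full interleave, and the population sizes and mutation rate are arbitrary.

```python
import random

NOTES = list(range(60, 72))  # one octave of MIDI pitches (assumed encoding)
SCORE_LENGTH = 16            # notes per score (arbitrary)


def random_score(rng):
    """A completely random musical score."""
    return [rng.choice(NOTES) for _ in range(SCORE_LENGTH)]


def quality(score):
    """Hypothetical stand-in for 'listening': count the neighboring
    note pairs that move by one or two semitones."""
    return sum(1 for a, b in zip(score, score[1:]) if 1 <= abs(a - b) <= 2)


def crossover(mother, father, rng):
    """Cut two scores at a random point and splice the halves together."""
    cut = rng.randrange(1, SCORE_LENGTH)
    return mother[:cut] + father[cut:]


def mutate(score, rng, rate=0.05):
    """Occasionally replace a note with a random one."""
    return [rng.choice(NOTES) if rng.random() < rate else note for note in score]


def evolve(generations=50, population_size=200, seed=0):
    rng = random.Random(seed)
    population = [random_score(rng) for _ in range(population_size)]
    best_per_generation = []
    for _ in range(generations):
        # Order by quality and keep the top quarter as parents.
        population.sort(key=quality, reverse=True)
        best_per_generation.append(quality(population[0]))
        parents = population[: population_size // 4]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    population.sort(key=quality, reverse=True)
    return population, best_per_generation


if __name__ == "__main__":
    final_population, best = evolve()
    print("best quality, first vs. last generation:", best[0], best[-1])
    print("best score:", final_population[0])
```

Because the parents survive unchanged into the next generation, the best quality in this sketch can never go down from one generation to the next. A real version would need a far better `quality` function, which is the hard part.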

But of course, this isn't creative. You end up with a population of songs at the end that all sound alike, as they all share most of their genetic material now. The process is creative like evolution is creative (are animals art?) but you don't have an artificial mind that can create arbitrary songs.

So we turn our attention to genetic programming. This still uses the same life-based mechanisms to evolve a population, but now you're evolving a population of computer programs. You write random computer programs, in this case programs that take some input and output a song. At first none of the computer programs do what you want (or, usually, anything at all; random code isn't terribly useful), but evolution still kicks in. With a large enough population, enough generations, and somebody listening to the god-awful racket this would produce, you'd eventually end up with a computer composer. Would it be creative?

Probably not. My prediction is that while the eventual eComposer would produce passable music, it would be very self-derivative. Once it finds a formula that works, why deviate? Deviating from your best stuff is only likely to get you a lower evaluation, and then you don't get any children. Best to play it safe, keep pumping out the same bass line, and keep your bytes replicating.

So what do you do? You evaluate each composer not just on quality, but on its creativity. You don't just ask for its best piece; you ask for 10 pieces, and make sure they're all good enough and different enough from each other. We usually only care about the best answer from our machine buddies, but I think that's wrong. To be right consistently, to be right when things change dramatically, you need to be able to come up with lots of different candidate solutions. It seems obvious in terms of music, but I expect this would help in lots of different machine learning domains.
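To make the idea concrete, here is one way such an evaluation could look in Python. This is only a sketch under assumptions of my own: a "composer" is any function that takes a request number and returns a list of MIDI pitches, `quality` is the same kind of toy stand-in for listening, "different enough" is measured as the fraction of positions at which two pieces differ, and the two are combined by simple multiplication. None of these choices is the one right answer.

```python
from itertools import combinations


def quality(piece):
    """Hypothetical quality in [0, 1]: the fraction of neighboring
    note pairs that move by one or two semitones."""
    steps = sum(1 for a, b in zip(piece, piece[1:]) if 1 <= abs(a - b) <= 2)
    return steps / (len(piece) - 1)


def distance(a, b):
    """Fraction of positions at which two equal-length pieces differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def creative_fitness(composer, n_pieces=10, min_quality=0.5):
    """Ask a composer for several pieces; reward it only if every piece
    is good enough, and scale the reward by how different they are."""
    pieces = [composer(i) for i in range(n_pieces)]
    qualities = [quality(p) for p in pieces]
    if min(qualities) < min_quality:
        return 0.0  # at least one piece was not good enough
    pairs = list(combinations(pieces, 2))
    diversity = sum(distance(a, b) for a, b in pairs) / len(pairs)
    return (sum(qualities) / n_pieces) * diversity


MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11, 12)


def one_hit_wonder(request):
    """Always plays the same scale, whatever is asked."""
    return [60 + offset for offset in MAJOR_SCALE]


def transposer(request):
    """Plays the scale in a different key for each request."""
    return [60 + request + offset for offset in MAJOR_SCALE]


if __name__ == "__main__":
    print(creative_fitness(one_hit_wonder))  # 0.0: good pieces, no variety
    print(creative_fitness(transposer))      # 1.0: good pieces, all different
```

A composer that keeps pumping out the same bass line scores zero here no matter how good that bass line is, which is exactly the pressure the plain quality ranking was missing. The transposer shows the measure's weakness, too: shifting a scale up a semitone counts as fully "different", so a serious version would need a smarter notion of distance.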

Were I in grad school, I'd write a paper about this. I'd throw together some simple problems, and see how penalizing for lack of depth in suggested solutions changes the long-term performance of GP-produced programs. I'd draw some graphs, write up a conclusion suggesting further research, and find some journal to print it. But instead, not being a grad student, I've decided to publish the idea (sans research) via blog. If and when I do end up back in school, I can look back at my blog and start pumping out the papers. Or alternatively, Internet, you're welcome to do the work and take the credit if you find this first. Especially if you're a robot reading this as you consume all of humanity's written word (I'm looking at you, Google)...
