Saturday, October 01, 2011

Computer Scientist

Farhad Manjoo has been writing a series in Slate on how robots are going to replace people in almost every occupation. How about us scientists? Are we safe? At least for a while?

Not really, says Manjoo. He cites a computer that, with a little human assistance, seems to be really quite scientific.

Then, two years ago, Hod Lipson and Michael Schmidt announced the first stirrings of robotic thinking. Lipson, a computer science professor at Cornell, and Schmidt, then a graduate student in Lipson’s lab, created a computer program that, given a raft of data from physical systems, can describe the natural laws that apply to that system. When they fed their software the motion-capture coordinates of a swinging double pendulum, the machine pondered the data for a couple days, then spat out the Hamiltonian equation describing the motion of such a system—an equation that represents the physical law known as conservation of energy. Their software needed no prior knowledge to discover this law. It wasn’t familiar with gravity, energy, geometry, or anything else. It simply did what human scientists have done since the time of Newton. It looked at the world, came up with theories about how it works, tested them, and then produced a law.

OK, that sounds a bit scary, but of course we already know Hamiltonian dynamics, so we can ask how much implicit knowledge was built into the problem to start with.

But what about this?

Lipson and Schmidt called their program Eureqa, and they made it available for free on the Web. It has since yielded several new discoveries in a range of fields, discovering scientific laws that we’d never known. Lipson and Schmidt recently worked with Gurol Suel, a molecular biophysicist at the University of Texas Southwestern Medical Center, to look at the dynamics of a bacterium cell. Given data about several different biological functions within the cell, the computer did something mind-blowing. “We found this really beautiful, elegant equation that described how the cell worked, and that tended to hold true over all of our new experiments,” Schmidt says. There was only one problem: The humans had no idea why the equation worked, or what underlying scientific principle it suggested. It was, Schmidt says, as if they’d consulted an oracle.

Manjoo says scientists ought to be terrified by that, and I agree, but I sure would like to know more details. Like, for example, exactly what does this equation predict from what?

The methodology of their program appears to be a sort of genetic algorithm.

Eureqa is quite simple in design. After it’s fed data about a particular process (the swinging of a pendulum, the dynamics of a cell), the computer generates a huge field of potential equations. These initial equations are random, and the vast majority of them will not apply. But a few of these random equations will show some agreement with the physical world. “We take the ones that are slightly better than the others, and we randomly recombine them to get new equations—and then we repeat the process over and over again, billions and billions and billions of times, until we’ve exhausted the space of short, simple equations,” Schmidt says. In the end, this Darwinian process tends to come up with equations that describe “invariant relationships”—that is, equations that apply across all the data. Such invariant relationships are often associated with fundamental laws of nature: the conservation of energy, Newton’s laws of motion, the mass-energy equivalence.
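To make that description concrete, here is a toy sketch of the general idea in Python. It is not Eureqa's code and makes no claim about how Eureqa is actually implemented; the expression encoding, the function names (random_expr, recombine, evolve), the population size, and the test data are all my own assumptions for illustration. It generates random equations in one variable, keeps the quarter that best fit the data, and randomly recombines the survivors to form the next generation.

```python
import random

# Binary operators allowed in candidate equations (an illustrative choice).
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

def random_expr(rng, depth=3):
    """Build a random expression tree: ("x",), ("const", c) or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return ("x",) if rng.random() < 0.5 else ("const", rng.randint(-2, 2))
    op = rng.choice(list(OPS))
    return (op, random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def evaluate(expr, x):
    if expr[0] == "x":
        return x
    if expr[0] == "const":
        return expr[1]
    return OPS[expr[0]](evaluate(expr[1], x), evaluate(expr[2], x))

def size(expr):
    """Number of nodes in the tree, used to prefer short, simple equations."""
    return 1 if expr[0] not in OPS else 1 + size(expr[1]) + size(expr[2])

def error(expr, data):
    """Mean squared disagreement between the equation and the observations."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in data) / len(data)

def random_subtree(expr, rng):
    while expr[0] in OPS and rng.random() < 0.7:
        expr = expr[rng.choice((1, 2))]
    return expr

def recombine(a, b, rng, max_size=25):
    """Replace a randomly chosen subtree of a with a random subtree of b."""
    def graft(node):
        if node[0] not in OPS or rng.random() < 0.3:
            return random_subtree(b, rng)
        if rng.random() < 0.5:
            return (node[0], graft(node[1]), node[2])
        return (node[0], node[1], graft(node[2]))
    child = graft(a)
    return child if size(child) <= max_size else a  # reject oversized children

def evolve(data, generations=40, pop_size=200, seed=0):
    """Return the best equation found and the best error seen per generation."""
    rng = random.Random(seed)
    rank = lambda e: (error(e, data), size(e))  # best fit first, then shortest
    pop = [random_expr(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=rank)
        history.append(error(pop[0], data))
        survivors = pop[: pop_size // 4]  # keep the better quarter unchanged
        children = [
            recombine(rng.choice(survivors), rng.choice(survivors), rng)
            for _ in range(pop_size - len(survivors))
        ]
        pop = survivors + children
    pop.sort(key=rank)
    history.append(error(pop[0], data))
    return pop[0], history

def show(expr):
    if expr[0] == "x":
        return "x"
    if expr[0] == "const":
        return str(expr[1])
    return f"({show(expr[1])} {expr[0]} {show(expr[2])})"

if __name__ == "__main__":
    # Made-up "observations" generated from y = x^2 + x.
    data = [(x / 2, (x / 2) ** 2 + x / 2) for x in range(-4, 5)]
    best, history = evolve(data)
    print(show(best), "error:", history[-1])
```

Because the survivors are carried over unchanged, the best error can never get worse from one generation to the next, though nothing guarantees that a toy like this lands on the exact generating equation. A real system searching for invariants in noisy multi-variable data has a much harder job, which is part of why I would like to see the details.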

The other episodes in Manjoo's series are also good, and feature, among others, robot lawyers, doctors, and pharmacists.