An article I wrote in May of 1997 about the Kasparov-Deep Blue Match
Deep Blue Wins,
But What Does it Mean?
Are you feeling a little
cramped, now?
According to many people, the
world just got a little smaller. Enthusiasm is running very high, because Man
has made a Machine that can beat the most Machine-like Man in the game of
Chess. And it all happened, ironically enough, on Ludek Pachman’s birthday.
Now that we understand chess that well, the world seems so much smaller, or so
the argument goes.
Pardon me, but I’m a little
skeptical. No, I don’t buy into the conspiracy theories running amok about
Garry Kasparov throwing his match, or that there was some kind of “fix” or
“cheating” involved. I am skeptical about what has really been accomplished.
Long ago I learned in an
undergraduate Artificial Intelligence (AI) class that chess was interesting to
AI academia because it was a good test of the state of the art of AI, and it
provided a good way to test the performance of various algorithms in a
practical situation.
The problem is that it
appears Deep Blue may have more in common with a big Oracle database than it
does with an algorithm incorporating artificial intelligence. Actually, very
little is really known about the specific programming of Deep Blue, because its
programmers, C. J. Tan and Feng Hsu, have not yet released their promised report.
Some details have leaked out though, because of the marketing process. What
is known is that they have advocated an approach that de-emphasizes
intelligence. In a chess playing algorithm, the critical step is referred to
as eval(), or the evaluation function. The purpose of eval() is to assign
heuristics, or numerical evaluations, that are used to compare positions that
can result from the choice of moves. Deep Blue evaluates positions very fast,
but it does so at the expense of the depth of positional understanding
represented in those evaluations.
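To make the idea of eval() concrete, here is a minimal sketch in Python of the simplest kind of evaluation function, one that only counts material. It is not Deep Blue's evaluation, which has not been published. The piece values, the flat string used for a board, and the name evaluate() are all assumptions made for illustration.

```python
# Hypothetical piece values in pawns (a common textbook convention,
# not Deep Blue's actual weights).
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}


def evaluate(board):
    """Return a material score from White's point of view.

    `board` is any iterable of piece letters: uppercase for White,
    lowercase for Black, "." for an empty square.
    """
    score = 0
    for square in board:
        if square == ".":
            continue
        value = PIECE_VALUES[square.upper()]
        score += value if square.isupper() else -value
    return score


# Two made-up fragments of positions that might follow two candidate moves.
after_move_a = "K.R....." "........" "k......."  # White is a rook up
after_move_b = "K......." "........" "k..n...."  # Black is a knight up

print(evaluate(after_move_a), evaluate(after_move_b))
```

A function this shallow is very fast, which is the trade-off described above: it says nothing about pawn structure, king safety, or piece activity, so any positional understanding has to come from searching deeper.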
So, without getting into the
details, there is a relatively small amount of intelligence in Deep Blue, at
least in comparison to most good chess playing programs. The Deep Blue
approach to chess is “speed at all costs”. The Deep Blue team designed the
evaluation process to work as fast as possible so that Deep Blue can search as
many positions as possible. One way they sped up the computer was to press
much of the logic into custom chips. Part of the thinking here is that the
silicon based intelligence not found in the evaluation algorithm is derived on
the fly from the results of the deep search. It has been presumed that they
sacrificed smarter evaluation algorithm concepts for speed, and no one from the
Deep Blue team is disputing that assumption. This is where the danger lies.
In the process of choosing a
move, Deep Blue will compare positions that can result from one move against
the positions that can result from another move. While all those positions
occur 7 or 8 moves from the current position, if the intelligence used to
evaluate those positions is even a little superficial, then it will not make as
good a positional judgement as Garry Kasparov. Or Susan Polgar. Or even Joel
Benjamin. And that fact cannot be denied.
I can back up my skepticism
about Deep Blue with one other indisputable fact. Chess belongs to a class of
problems computer scientists call “intractable”. Roughly explained, this means
that as you linearly increase the number of moves you want the program to
search ahead, the number of positions it must search increases exponentially.
Consider that, according to the IBM press notes, the Deep Blue team doubled the
search speed of the 1996 version of Deep Blue. In 1996 it could evaluate 100
million positions every second, and the 1997 version could evaluate 200 million
positions per second. But that does not mean they doubled the search depth
from 12 ply (i.e., 6 moves) to 24. In fact, Deeper Blue (as the 1997 version
is sometimes called) only searches 2 to 3 ply deeper as a result of the
increase. Doubling the search speed again might not add even one single move
more.
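The exponential growth can be put into rough numbers. The sketch below, in Python, assumes a simplified model in which every extra ply multiplies the number of positions by a fixed branching factor. The value 6 is a hypothetical effective branching factor chosen only for illustration, not a measured figure for Deep Blue, and the function name is my own.

```python
def positions_to_search(depth_in_ply, branching_factor):
    """Approximate tree size when each ply multiplies the work."""
    return branching_factor ** depth_in_ply


BRANCHING = 6  # assumed effective branching factor, illustration only

work_12 = positions_to_search(12, BRANCHING)
work_13 = positions_to_search(13, BRANCHING)
work_24 = positions_to_search(24, BRANCHING)

# How many times more positions one extra ply costs under this model.
print(work_13 // work_12)
# How many times more positions doubling the depth from 12 to 24 ply costs.
print(work_24 // work_12)
```

Under these assumptions, doubling the depth requires the machine to search billions of times more positions, which is why doubling the speed comes nowhere near doubling the depth.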
It is also important to
understand how they doubled the search speed. Deep Blue has several computers
handling various tasks for it. All of these computers have been upgraded to be
much faster. There are twice as many “chess chips”, the (now) 512
microprocessors that can be thought of as processing positions in the native
language of chess. Memory has been expanded everywhere. My point here is that
they have done much more than doubling the clock speed of the processors. They
have increased the resources quite a bit, too, spending quite a bit of money in
the process. At some point, it will be impractical to increase resources
enough to attain the next level of search depth and so increase the strength
of the computer. I question whether or not the Deep Blue team will be able to
improve their system’s play much more.
But why would they need to improve the program? By now I
can hear the cries of “But, Deep Blue won! Isn’t that the bottom line?” Not
exactly. Few chess players will argue that Garry Kasparov brought his best
game to the match, or even that he played well. He had a clearly better
position in Games 3, 4, and 5. Kasparov resigned in game 2 when he was on the
brink of saving a draw after playing an opening known to be inferior. Game 6
was, … well, … I can’t explain that one, can you? How often do Grandmasters
confuse the order of opening moves and fall into schoolboy traps? Some have
argued that Garry’s 8th move was book, and that theory indicates
black’s position is sound. I haven’t seen many recent Grandmaster games in
that line, and it is certainly not the kind of position you want to play
against a computer. Game 6 proved that human beings make mistakes in
mechanical mundane processes, and computers rarely do so. Anyone who thinks
Deep Blue proved it is as good as Garry Kasparov should note the bottom line in
the difference in the score of this year’s match and last year’s match. Last
year the score was 4-2 in Garry’s favor, while Deep Blue’s 3.5-2.5 victory in
this year’s match implies a difference of 1.5 points. But it should be noted
this difference in score can be accounted for in a (game 2) resignation in a
drawn position and a (game 6) opening accident. This year’s Deep Blue was good
enough to get a position that Garry thought he was losing, though, while Garry
was never in trouble in the first match after the first game. I consider this
to have been a lucky happenstance, but I have to admit luck like that happens
to good players against lesser competition all the time.
In the last 2 games of the 1996
match Kasparov made it all look easy. He did so by pointing out that this
computer did not have a quality possessed by chess grandmasters described quite
eloquently by Kotov in his “Play/Think Like a Grandmaster” books when he said
that chess mastery is “knowing what to do when there’s nothing to be done”.
Garry did not just win the 1996 match, he refuted the entire Deep Blue approach
to chess in those last 2 games, and to some extent in game 2 of that match.
This is scientifically valid – theoretically, a computer with little chess
knowledge in its algorithms should not be able to judge well between
“equal-ish” positions. And that theory was validated by the 1996 experiment.
That is also why Kasparov was
able to gain advantages so easily in games 1, 3, and 5 by playing offbeat
openings. Garry forced the computer to play purely on positional concepts,
because a deep search algorithm is almost completely useless in the openings
unless the program can accurately judge all those positions. This causes me to
question whether this year’s program is really that much different than last
year’s.
This year’s program is not
really stupid in the Artificial Intelligence sense. It does have some very
narrow criteria for when it will selectively extend its search of a particular
line. I also know from some of Joel Benjamin’s comments that they worked very
hard on this program’s ability to determine the value of the Bishops and
Knights relative to any given position. Other comments told the story of
correcting Deep Blue’s misevaluation of pins in the center of the board that
plagued it in the previous match, and game 1 of this year’s match is definite
evidence of that correction, despite Deep Blue’s loss in that game.
So, what has been proven?
Sadly, I have an answer to this. It may not really have been “proven”, per se,
but it appears that the art of AI is not as advanced as we think. This program
still outperforms every other chess playing computer program in the world by a
great deal. Computer intelligence has been beaten by an exhaustive search to a
depth of 15 ply, so intelligence isn’t doing us much good, right? While this
realization saddens me, I am consoled knowing that a tremendous amount of money
and human capital has been spent on the Deep Blue effort. I would like to see
a similar amount of effort expended on an algorithm that emphasizes
intelligence.
One of the most intelligent
programs is called Crafty. Crafty was written by Dr. Robert Hyatt, professor
of Computer and Information Sciences at the University of Alabama at
Birmingham, and (I think) he has some relationship with the Deep Blue team. He
and others have put a considerable amount of effort into Crafty on a
strictly volunteer basis, and the source code can be had for free, so many
others have extended it. Crafty searches only about 80 thousand positions per second,
compared to Deep Blue’s 200 million positions per second. Yet, because Crafty
uses intelligence to pare down its search tree, it consistently sees 10 ply
ahead, and it can see further because it selectively extends the search. Now
that we know that the deep exhaustive search technique can only be made good
enough to provide a good match for the best human player(s), I would like to
know if pressing Crafty-like intelligence into silicon chips would make a great
chess player. Several have suggested this, but Robert Hyatt has stated (in a
Usenet contribution to the newsgroup rec.games.chess.computer) that it would
not do any better than Deep Blue. I think his reasoning is that while this
deeper search would improve Crafty, it wouldn’t search nearly as deep as Deep
Blue and whatever benefits the added speed would give the program would be
balanced against that lack of search depth. He knows both programs well enough
that I can’t dispute that assessment, but I still want to see a more
“intelligent” approach. One other drawback is that it may be much harder to
press that much logic into silicon chips.
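The idea of paring down a search tree can be illustrated with a toy example. The Python sketch below compares a plain exhaustive minimax search with alpha-beta pruning, a standard technique that skips branches which cannot change the result. It is used here only as a familiar example of pruning; I am not claiming it is how Crafty or Deep Blue decides what to search. The tree and its leaf scores are made up.

```python
def build_tree(depth, branching, seed=1):
    """Build a uniform toy tree as nested lists; leaves are made-up scores."""
    if depth == 0:
        return (seed * 37) % 21 - 10  # deterministic score in [-10, 10]
    return [build_tree(depth - 1, branching, seed * branching + i + 1)
            for i in range(branching)]


def minimax(node, maximizing, counter):
    """Exhaustive search: every leaf is evaluated."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    values = [minimax(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing, counter,
              alpha=float("-inf"), beta=float("inf")):
    """Same result as minimax, but stops exploring refuted branches."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    best = float("-inf") if maximizing else float("inf")
    for child in node:
        value = alphabeta(child, not maximizing, counter, alpha, beta)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:
            break  # the opponent would never allow this line
    return best


tree = build_tree(depth=4, branching=3)
full_count, pruned_count = [0], [0]
full_value = minimax(tree, True, full_count)
pruned_value = alphabeta(tree, True, pruned_count)
print("exhaustive:", full_value, "leaves evaluated:", full_count[0])
print("pruned:    ", pruned_value, "leaves evaluated:", pruned_count[0])
```

Both searches return the same value, but the pruned search never evaluates more leaves than the exhaustive one and typically evaluates fewer. A program that evaluates fewer positions per second can therefore still reach a respectable depth if it chooses well what not to look at.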
Finally, I would like to
point out several things I noticed about this year’s match. First, Game 5 was
the only “open” middle game position. Pawns were locked up in the center of
the board early in the other games. I notice this because I think the best
position to have against Deep Blue is one with an open center. My reasoning is
that when positions are closed, the Rooks, Bishops, and Queens do not have as
much scope, and Deep Blue does not have as many “equal-ish” positions to choose
between in its 15-ply search. Testing my theory, I went back to the 1996
match. In the 1996 match, Game 2 was open in the sense that the middle game
play occurred all on the queen side, and that side of the board was quite open.
Game 3 was semi-open. Game 4 was an open position, and Deep Blue showed its
amazing resourcefulness to hold a draw. Game 5 was open, and in game 6, Garry
had a tremendous amount of activity and was able to cramp up black’s position
nicely. I would like to see someone employ this strategy against the current Deep
Blue, but I must admit that it may not be a good idea to give the master
calculation monster’s pieces so much scope. I have posed this approach to
several good chess players and chess programmers, and they seem to all be
skeptical, but intrigued.
There was another thing I
noticed about this year’s match. I watched the match on the Internet Chess
Club, and saw the commentary from people operating various computers. I
noticed that in this year’s match, Fritz 4 seemed to predict Deep Blue’s
move quite often in the critical positions. Fritz 4 is a Windows program that
retails for less than $300. Presumably, the operator of that program
commenting on the match was running on a computer not unlike the typical home
Pentium I am using to write this article.
If you had access to the
Internet, coverage of this match was excellent. Numerous servers carried the
games live with commentary from titled players. At one point on the Internet Chess
Club, my screen was continuously scrolling when it was set to read only the
commentary of titled players (IMs, GMs, and FMs). Also, vendors of chess
magazines, books, equipment, and products on the World Wide Web carried some excellent
analysis updated nightly. Obviously, they wanted to pull you in and advertise
their products. Whatever their motives, I really appreciated those efforts,
and coverage of chess events is becoming very entertaining.