## Monday, September 24, 2007

### Are the Reals real?

We learn in elementary school about different sorts of numbers. Each seems to be named so as to disparage a larger class of numbers. We have:
• The Naturals. These consist of non-negative integers (negative integers are unnatural). We can define the naturals inductively. 0 is a natural number, and for every natural number x, (x+1) is also natural.
• The Rationals. These consist of numbers that can be written "as fractions" (Postulating the existence of any other sort of number would be irrational). We can define the rationals: for every two integers x and y, x/y is in the set of rationals.
• The Reals. These consist of numbers that can be written as (infinite) decimal expansions (Any other kind of number must be imaginary). We can define the reals as the limits of sequences of binary expansions (i.e. 3, 3.1, 3.14, 3.141, 3.1415,...). In this view, real numbers are actually functions from the naturals to the rationals: A real number f corresponds to the sequence whose i'th term is f(i).

In each case, we can perhaps ascribe the disparaging labels of these class of numbers to the closed-mindedness of the namers. After all, there have been times in history when negative numbers were considered unnatural, and the square root of 2 was actually thought of as not rational. But have we gone too far? Are real numbers really real?

Real numbers pose a problem for people with a computational world view -- those that believe that in principle physical phenomena can be simulated. If quantities in the universe such as position, mass, velocity, temperature, etc. are represented by (continuous) real values, then they can't be perfectly simulated.

The first and most obvious problem we encounter is representation size -- most reals can't be finitely represented. So in any simulation with finite memory, even barring any other problems, we're just not going to have the space to keep track of even a single value.

A second and more insidious problem is even worse -- most real numbers aren't computable. That is, there is no finitely representable procedure for calculating most real numbers, even if we were to avoid the first problem by granting our simulation infinite memory, and even if we were to allow your procedure an infinite amount of time. This is because, no matter what representation you choose for writing down your finite procedure (Turing Machines are popular), the set of such procedures is only countably infinite, whereas there are uncountably many reals.

A third (related) problem is that most anything we would want to do with our real values would be undecidable. Even the problem of comparing two reals is undecidable -- consider the process of comparing the real x_T to pi, where x_T is defined to be the real with the i'th digit of pi in its i'th decimal place for all i less than T_h, where T_h is the step at which Turing machine T halts, and with the i'th digit of e in its i'th decimal place for all i greater than or equal to T_h.

So is anything in our universe actually represented by "variables" defined over the reals? Believing this seems to entail believing that the universe is inherently non-computational, regularly resolves undecidable problems and maintains information to infinite precision.

P.S. Many people object to the axiom of choice, because although it is consistent with (and independent of) other mathematical axioms, it allows for things that go counter to our intuition. For example, the Banach-Tarski Paradox: One may partition a solid 3-dimensional sphere into finitely many non-overlapping pieces, and then with only rotations and translations can reassemble the pieces into two identical copies of the original ball. But maybe it isn't the axiom of choice that challenges our intuition ("we can choose an element from any nonempty set"), which after all, seems reasonable -- maybe it is the existence of the reals and other uncountably infinite sets. It is really only these sets that require the axiom of choice -- we can choose elements from countably infinite sets without resorting to the axiom (We can define a choice function by first putting the elements of our countable set in one to one correspondence with the integers. Then we may choose the smallest element in each nonempty set.)

Can we really choose an element from an uncountable set? It seems less reasonable. Consider the reals - an uncountable set. Remove from this set all reals that can be named (or defined) - a countable set. Now choose an element!

## Sunday, September 23, 2007

### Alice, Bob, a Simple Game, a Surprise, and a Paradox

Consider the following simple game:

Alice thinks of two integers, not both equal. She writes them down, and places one (at random) in each of her hands. She presents both of her closed hands to Bob, who gets to pick one. Alice will show Bob the integer in the hand he picks. Now Bob has to guess which hand contains the larger integer. This is a very simple game, but has some very interesting properties.

A Surprise

First, suppose that Bob knows nothing of the process that Alice uses to pick her integers -- he doesn't know her distribution, or any bounds on the values she might choose. Can he win more than 50% of the time? It would seem that he cannot -- If he picks one of her hands, and sees some number i, for every distribution that she might be using so that he would expect i to be the smaller number, there is another translated distribution such that he would expect i to be the larger number. Now what if Alice has a spy, and gets to hear Bob's strategy before the game begins, and can change her strategy accordingly? Does Bob have any hope now?

Well, Bob can of course guarantee that he wins 50% of the time, even if Alice knows his strategy, by simply guessing a hand at random and guessing that it contains the larger integer (And this is the basic rationale for randomized algorithms -- Alice can't thwart this strategy even if she knows what Bob will do). In fact, to do this, Bob doesn't even have to look at the number in Alice's hand when she shows it to him. But it turns out that if he does look, he can guarantee that he wins with probability strictly more than 50%, no matter what Alice does, even if she knew his strategy ahead of time. Here's one thing Bob can do: Before the game begins, Bob can pick a random integer Q from some distribution that has support over all of the integers (Say he flips a coin i times until he gets a heads, and then flips another coin. If it comes up heads, his number is i, otherwise it is -i). Bob then picks one of Alice's hands at random, and looks at the number in it. Bob pretends that the number in Alice's other hand is Q + 0.5, and plays accordingly. That's it.

Why does this work? Its funny that an overactive imagination can help Bob win a game like this -- after all, he doesn't really have any idea what number is in Alice's other hand. But as we will see, it does work. Suppose Alice's numbers are L and R, and that L is less than R.

• Q+0.5 less than L less than R: Then Bob wins with probability 50%, since he always guesses that the hand he picked is the larger one, and he picks the larger one initially half the time.
• L less than R less than Q+0.5: Then Bob wins half the time again, since he always guesses that the hand he picked is the smaller one, which again, occurs initially 50% of the time.
• L less than Q+0.5 less than R: In this case, Bob wins whichever hand he picks! If he guesses the smaller hand, he imagines (correctly) that the other hand is larger. If he guesses the larger hand, he imagines (correctly!) that the other hand is smaller.

The point is, the third case occurs with positive probability no matter what numbers Alice picked, because Bob picked Q from a distribution with support over all of the integers! Say case 3 occurs with probability p > 0. Then Bob's probability of winning is (1-p)*0.5 + p = 0.5 + p/2 > 0.5!

Strangely, things can get funnier if Bob knows Alice's strategy. Lets change Bob's incentives a bit: Suppose now, Bob gets paid an amount of money equal to the value in the hand that he guessed was larger, minus the value in the hand that he guessed was smaller (So if Bob guesses correctly he wins money, but if he guesses incorrectly, he loses money). Suppose Alice picks her integers as follows: With probability 1/2, she picks (1, 3). With probability 1/4 she picks (3, 9). With probability 1/8, she picks (9, 27). In general, with probability 1/2^i she picks (3^(i-1), 3^i). It is easy to verify that this is a well-defined probability distribution (In fact, Alice can implement it by simply flipping a coin i times until she gets a heads).

Now how should Bob play? If he picks a hand, and sees a '1', he knows that Alice's other hand must contain a 3, and so he should guess her other hand is larger, guaranteeing a payoff of 2 (rather than a guaranteed loss of 2). Suppose he sees a '3'? In this case, he knows that in her other hand, she has either a 1 (with probability 2/3), or a 9 (with probability 1/3). His expected value for guessing that the other hand is larger is (1/3)*(9-3) - (2/3)*(3-1) = 2/3. His expected value for guessing that the hand he sees is larger is -2/3. In general, conditioned on seeing 3^i for i > 0, the event that Alice's other hand contains a larger number is twice as half as likely as the event that Alice's other hand contains a smaller number, and we can calculate that Bob's expected value for switching hands is 2*3^(i-2), while his expected value for not switching is -2*3^(i-2). So what should Bob do?

He should pick a hand at random. He should look at the number. If he sees a 1, he should switch hands. Otherwise, he should calculate the expected value of switching, and if it is higher than the expected value of staying, he should switch. Equivalently, he should pick a hand at random, and look at the number. If he sees a 1, he should switch hands. Otherwise, he should switch hands (Since we calculated that the expected value of switching is ALWAYS larger, no matter what Bob saw). Equivalently, he should pick a hand, look at the number, and switch hands. Equivalently, he should pick a hand, NOT look at the number, and switch hands (since we calculated that the value he would see is immaterial to his decision). So we have argued that his expected value by picking a hand at random and guessing that the other one contains the larger number has a positive expected value, and that picking a hand at random and guessing that IT contains the larger number has a negative expected value.

But wait a minute -- the two strategies are clearly equivalent -- both of them involve picking the left hand with probability 50%, and picking the right hand with probability 50% -- they must both yield the same expected value. What gives? Surely the value of Bob's strategy isn't affected by the physical act of looking at the number in the hand that he picks (since he won't act on it). Yet we have just argued that two clearly equivalent strategies have different expected values... Hm...

This is sometimes referred to as a variant of the two envelopes paradox. You can read some supposed explanations of it, and decide whether they are convincing (I'm not completely convinced). It seems related to the St. Petersburg "paradox".
• ### The human-computer algorithm

The New York Times has published this ode to the algorithm. They write:
Algorithms, as closely guarded as state secrets, buy and sell stocks and mortgage-backed securities, sometimes with a dispassionate zeal that crashes markets. Algorithms promise to find the news that fits you, and even your perfect mate. You can’t visit Amazon.com without being confronted with a list of books and other products that the Great Algoritmi recommends.

The article is in a section titled "Artificial Intelligence - Computers and the Internet", but the focus of the article is the trend of algorithmically harnessing human intelligence -- the goal of Luis Von Ahn. They mention his Google Image Labeller, which is an explicit algorithm with human subroutines. But perhaps more interestingly, they describe Wikipedia as a human algorithm --

A constantly buzzing mechanism with replaceable human parts. Submit an article or change one and a swarm of warm- and sometimes hot-blooded proofreading routines go to work making corrections and corrections to the corrections.

Generally Wikipedia is thought of as an entirely human project, but is it a giant distributed fault-tolerant algorithm? Why not? I think that it is good that algorithms should be thought of as ideas, rather than software code. Algorithms can exist independently of computer hardware, as entities (worthy of study) entirely unto themselves. After all, we don't think of algebraic fields as objects whose existence is inherently tied to paper or chalkboards. Why should algorithms exist only in silicon?

## Monday, September 10, 2007

### The Hunt for a New Solution Concept

The Hunt is On:

In 2006, Daskalakis, Goldberg, and Papadimitriou published a proof that computing Nash Equilibria in matrix games is complete for the complexity class PPAD. For the purposes of this blog entry, lets take that to mean that there is no efficient algorithm to find Nash equilibria in general games (although I'm not sure how strong the consensus is that PPAD is intractable). More recently, it was shown that computing equilibria even in infinitely repeated games (despite the seeming prevalence of such equilibria) is also PPAD complete.

What these results do is to cast doubt upon the Nash equilibrium as a plausible solution concept: Do we really expect a set of disorganized, computationally bounded, possibly ill-informed players to perform a massive intractable computation in some kind of distributed way? This is a call for theory building: As computer scientists, we have to find a solution concept that is as appealing as Nash's, but that is also feasible in a world with computational limitations.

What are we Looking For?

Fabrikant and Papadimitriou have just published a paper that outlines in its introduction three criteria we would like for a good solution concept. It's a good list. In summary:

1. Realism: We are looking, after all, for a solution concept that will help us predict the behavior of selfish agents. The solutions that fall our of our models should be plausibly rational. Pure strategy Nash equilibria have this property (if we can find them): Since everyone is simultaneously playing a best response to everyone else, it is plausible that no one should want to deviate (although one might also wish to model players who are not perfectly rational, or who wish to deviate from a Nash so as to jolt the system into a more favorable Nash). Mixed Nash are sometimes less plausible: They might require that players randomize among equally good alternatives so as to induce the same behavior in other players -- so that players are optimizing not only their own payoff, but the stability of the system. (On the other hand, in large population games, you can imagine mixed Nash as modeling the behavior of a much larger, but deterministic set of players)

2. Ubiquity: We would like that our solution concept always exist. Nash proved that mixed Nash always exist, but unfortunately pure Nash equilibria do not (Try to find the pure strategy equilibria in rock-paper-scissors).

3. Tractability: This is the viewpoint that computer scientists bring to the field, but perhaps this requirement should really be the same as 1: We can't really consider a solution concept to be a realistic model of selfish behavior if it is computationally intractable to find. Unfortunately, neither pure nor mixed Nash equilibria are computationally tractable to find (modulo some big surprises).

We might add a fourth bullet:

4. Flexibility: Playing your part in a Nash equilibrium only gives a meaningful guarantee if everyone else is simultaneously playing their part. If someone deviates from the Nash profile, all bets are off. But do we really expect everyone in a large population game to behave how we predict they will? What if some are irrational, ill-informed, adversarial, mistake-prone, or just have a slightly different objective function than we think they do? It would be nice to be able to still make some predictions about behavior. Degradation in some global objective function due to selfishness is referred to as the Price of Anarchy -- it is odd that existing bounds based on Nash equilibria only hold if everyone is acting in a proscribed manner. Where is the anarchy?

What Have We Got?

There have been some recent attempts at defining new solution concepts. All of them make assumptions about the dynamics of the game, and so really are modeling repeated games. So far, none of them satisfy the full list of desiderata:

Sink Equlibria: A solution concept suggested by Goemans, Mirrokni, and Vetta, sink equilibria live in the better-response graph of a game. Each vertex in the graph corresponds to an action profile, assigning an action to each player. Edges correspond to better-response moves: There is an edge from u to v (with label i) if u and v differ only in the action of player i, and by changing her action from u_i to v_i, player i increases her utility. In the best-response graph (which is what GMV actually study), each player has at most one edge from each vertex, corresponding to the action that improves her payoff the most.

The best-response graph models a dynamic in which players move sequentially, each making a myopic best response (they care only about their immediate payoff). We may imagine play proceeding as a random walk on the graph. Pure strategy Nash equilibria correspond to sinks in the graph (since everyone is playing a best response already, the corresponding vertex has no outgoing edges). Such sinks do not necessarily exist (think rock paper scissors). However, if we imagine collapsing each maximal connected component (Those parts of the graph that we might enter, but can never leave) into a single vertex, the graph is now guaranteed to have sinks. These sinks are the sink equilibria of the game, and their value is the expected value of a random walk on the sink component. The idea is, a random walk in this best-response graph will eventually get caught in a sink, and the value of the game will be dominated by the expected value of the sink.

1. Realism: This model assumes that players are myopic -- they strategize only to the extent that they greedily pick the best payoff in the next time step. This is limiting in that it sometimes leads to foolish play, and would seem to underestimate the rationality of the player. On the other hand, this might be an ok first order approximation, and in many games it at least allows the players to calculate their next move efficiently, which is an important criteria in realism.

2. Ubiquity: By construction, sink equilibria always exist (although they may include every state in the game, and so be of limited value).

3. Tractability: Unfortunately, the FP paper shows that in a large class of concicely representable games, even identifying whether a given game state is part of a sink is PSPACE complete, and so it seems we are no better off than when we started, in terms of computational efficiency.

4. Flexibility: This model assumes that players play on a best-response graph, and sinks are defined with respect to the graph. But if players deviate -- perhaps they incorporate randomness into their play, fail to calculate their best-response move, or strategize more than one move ahead, the graph that the players are playing on changes, as do the sinks. As a result, this model is brittle to unmodeled behavior.

CUREs: The FP paper suggests another model for repeated games. Imagine that players commit to a strategy that is determined by a restricted class of finite automata, that FP call Unit Recall Strategies. The states of the URS for player i correspond to player i's actions, and the automata reads a tape that at time t tells it what action all the other players are playing at time t. The URS is defined by a set of transitions from action profiles to actions. (So the strategy that a player is committing to is limited by the fact that it must specify an action to play at time t+1 given only the state of the game at time t -- it may not remember anything else). If each player commits to an URS, then we have a completely specified set of game dynamics. Since each player's strategy is a finite automata, game play will eventually loop, and the payoff to player i is defined to be her average payoff of state sin this loop. A Unit Recall Equilibria (URE) is a set of URS such that no player can unilaterally improve her payoff by switching to another URS. A component-wise Unit Recall Equilibria (CURE) is a set of URS such that no player can unilaterally improve her payoff by switching only a single transition rule in her existing URS.

1. Realism: The model assumes that players commit to a restricted class of finite automata strategies. In general, this is a reasonable assumption, although the restriction that player strategies cannot have memory seems a bit strong. CURE seem particularly restrictive in that they assume myopic play the same way the sink-equilibria do: Even if players may only switch one transition in their automata strategy per timestep, the CURE concept is assuming that they also cannot plan more than one time-step into the future.

2. Ubiquity: FP show that unfortunately, the more realistic solution concept of the URE need not necessarily exist. The less realistic one, the CURE, however, always exists.

3. Tractability: FP conjecture that finding an example of the more realistic solution concept, the URE, is in polynomial time, although prove only that it is in NP. It is easy, however, to find a CURE. Unfortunately, the CURE that FP show how to construct is a good example of why the CURE lacks realism as a solution concept -- it depends critically on the fact that players are not permitted to think beyond one step into the future.

4. Flexibility: The model depends on all players using restricted finite automata strategies. If any player deviates, URE guarantees no longer apply, and CUREs are no longer equilibria if any player may change more than one transition at a time, or may anticipate more than one move into the future.

No Regret Play: Avrim Blum, MohammadTaghi Hajiaghayi, Katrina Ligett, and I have proposed the following solution concept: We imagine that players behave so that they have no regret. By that, we mean the following: In retrospect, after seeing (and fixing) how all of the other players played, no player should have been able to improve his payoff by playing the same fixed action at every turn. Except for this restriction, they may play arbitrarily. This definition is appealing for several reasons.

a) It is strictly a generalization of the one-shot-game Nash equilibrium, because if players repeatedly play a Nash equilibrium, they will satisfy the no-regret property.

b) Although we make no assumptions about the particulars of how players behave, there are simple and efficient algorithms that players could use to guarantee that they have no regret, regardless of how other players behave.

c) Our generalization is enough to get strong guarantees. In particular, in valid games, traffic routing games (both atomic and nonatomic), generalized Hotelling games, average-load-balancing, and presumably others, the "price of anarchy" for no regret play matches the price of anarchy for Nash equilibria.

d) Proving a no-regret price-of-anarchy upper bound implies a price-of-anarchy upper for pure and mixed Nash equilibria, and is often simpler than proving the result directly for mixed Nash.

1. Realism: The model makes only the assumption that players play so as to have no regret. How realistic is this? Well, its strictly a generalization of the one-shot Nash equilibrium (both pure and mixed), so in this sense its strictly more realistic. In most large-player games, it is rational to desire no regret, in the sense that if you have regret to some fixed action a, you would have been better off pursuing some other strategy (such as playing a the whole time), and there are efficient algorithms that you could run to get this guarantee. On the other hand, the no-regret assumption implicitly assumes that your actions have a negligible affect on other players actions. For example, in a repeated prisoners dilemma game, the (lucrative) collaborative strategy of both players cooperating every round is not no-regret -- both players would have regret to defecting, not realizing that had they defected every round, they would have changed the behavior of their opponent, thus not truly obtaining a higher payoff. So it seems that the no-regret assumption is a realistic model of rational behavior, but only in games for which ones own actions have a minimal affect on the actions of others. (Perhaps formalizing what such games look like would be a fruitful area of research).
2. Ubiquity: Every game has a no-regret history of play. The proof of this is simply the existence of algorithms that give each player the no-regret guarantee.
3. Tractability: It is easy for players to guarantee that they have no regret (and thus find no-regret histories of play) in most classes of games. In particular, we can do this in polynomial time for any game that has only a polynomial number of actions for players, and also in some games that have an exponential number of actions per player (such as routing games, and many valid games).
4. Flexibility: The no-regret property is flexible enough to allow occasional random behavior (A player could build up 'negative regret', and then occasionally spend it with 'irrational' behavior). Since we don't assume players use particular algorithms, players could also use arbitrary strategies that just happen to generate no-regret histories (even if they don't explicitly guarantee this). Perhaps most significantly, we can often prove "price-of-anarchy" style bounds even when only a subset of the players experience no-regret, and others behave arbitrarily (or adversarially). This flexibility seems essential when quantifying "anarchy".