Sunday, July 05, 2009

Asynchronous Games

I'm heading off to California for the week, where I will present my paper with Moshe Babaioff and Liad Blumrosen, "Auctions with Online Supply", at the ad-auctions workshop that precedes EC. If you are going to be around, you should come to my talk!

But this is a post on something fun that I was thinking about this past semester at Microsoft Research, with Nina Balcan, Adam Kalai, and Yishay Mansour.

In game theory it often goes without saying that when we talk about an (infinitely) repeated interaction amongst agents, we are assuming that the agents act synchronously: that is, we implicitly have a universally known sequence of discrete clock ticks, and every agent acts simultaneously at each clock tick, without knowledge of the others' actions. Think of this standard model as modelling what goes on when you play "Rock, Paper, Scissors". I'm not sure why simultaneous move repeated games became the standard historically -- my guess is that they are just easier to think about. But it's not obvious that when modelling interactions (say) on the internet, the simultaneous move model is the right one -- perhaps some model that has players acting asynchronously, with knowledge of previous actions, is more realistic. Often we are without any synchronizing mechanism.

There is another reason why the simultaneous move model might not be the best -- the complexity results in this model have been almost universally negative. It turns out to be hard to compute the Nash equilibria of even a 1-shot (non-repeated) two-player n x n game in which the payoffs are restricted to be in {0,1}. You might think that it is easier to compute the equilibria of a repeated game (since, due to the "Folk Theorem", almost anything is an equilibrium in a repeated simultaneous move game!), but for n > 2 players, that's known to be hard as well. In addition to accurately modelling the games we are interested in, we would like our solution concepts to be computationally plausible, and simultaneous move Nash equilibria often seem to fail on both counts.

A reasonable first pass at a model of asynchrony in a two player game might go as follows: The game is still defined by an action set A for player 1, an action set B for player 2, and utility functions u_i : A x B -> [-1,1]. Every day, a random player gets to perform an action from his action set, with full knowledge of the past history. If I (player 1) play action 'a' today, and my opponent last played action 'b', I get payoff u_1(a,b) and my opponent gets payoff u_2(a,b), and vice versa. We keep playing forever, and what we care about is our average per-day payoff in the limit, as time goes to infinity.
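To make the model concrete, here is a minimal simulation sketch in Python. All the names and the fair-coin turn order are my own illustrative choices (as is the convention that both players start at an arbitrary action so payoffs are defined from day one):

```python
import random

def average_payoffs(u1, u2, strategy1, strategy2, days=100_000, seed=0):
    """Simulate the random-turn game; return empirical per-day average payoffs.

    strategy_i(history) -> action, where history is the full list of
    (player, action) moves made so far.
    """
    rng = random.Random(seed)
    a, b = 0, 0          # arbitrary starting profile
    history = []         # visible to both players
    total1 = total2 = 0.0
    for _ in range(days):
        if rng.random() < 0.5:
            a = strategy1(history)      # player 1 moves, seeing the full history
            history.append((1, a))
        else:
            b = strategy2(history)      # player 2 moves
            history.append((2, b))
        total1 += u1[a][b]              # each day, payoffs come from the current profile
        total2 += u2[a][b]
    return total1 / days, total2 / days

# Example: matching-pennies payoffs, with two stubborn single-action strategies.
u1 = [[1, -1], [-1, 1]]
u2 = [[-1, 1], [1, -1]]
print(average_payoffs(u1, u2, lambda h: 0, lambda h: 0))   # -> (1.0, -1.0)
```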

So how should I think about equilibria in this model? It turns out that this model is strategically equivalent to the model in which the two players take turns alternately making moves, instead of getting to make moves in a random order (informally, this is because if I get two turns in a row, it doesn't make sense to change my action, so equilibria from one model can be converted to equilibria in the other model and vice versa). This sounds like good news -- if we were only playing for finitely many rounds, this would just be a game of "perfect information", and we could find equilibria of the game just by "solving it", starting from the last round and working backwards. (Of course the game tree might be pretty big, so let's hope "finite" means "really small".)
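For intuition, here is what that backward induction looks like for a finite-horizon version. This is only a sketch under illustrative assumptions of mine: player 1 moves on even rounds, player 2 on odd rounds, both start at action 0, and each round both players collect u_i of the current action profile:

```python
from functools import lru_cache

def solve_finite(u1, u2, horizon):
    """Subgame-perfect payoffs of a finite-horizon alternating-move game."""
    A, B = len(u1), len(u1[0])

    @lru_cache(maxsize=None)
    def value(t, a, b):
        # Optimal remaining total payoffs (v1, v2) from round t onward,
        # when the current action profile is (a, b).
        if t == horizon:
            return (0.0, 0.0)
        if t % 2 == 0:
            # Player 1 moves: pick the action maximizing her continuation payoff.
            options = []
            for a2 in range(A):
                v1, v2 = value(t + 1, a2, b)
                options.append((u1[a2][b] + v1, u2[a2][b] + v2))
            return max(options, key=lambda v: v[0])
        else:
            # Player 2 moves symmetrically.
            options = []
            for b2 in range(B):
                v1, v2 = value(t + 1, a, b2)
                options.append((u1[a][b2] + v1, u2[a][b2] + v2))
            return max(options, key=lambda v: v[1])

    return value(0, 0, 0)
```

Note that the continuation value depends only on the round and the current profile, so memoization collapses the exponentially large game tree into horizon x |A| x |B| states -- the "pretty big" tree is less scary than it looks.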

What about the infinitely repeated case? It turns out that infinitely repeated alternating move games always admit a particularly appealing kind of equilibrium: one in which each player plays according to a stationary strategy: a unit-memory strategy in which the next action is just a (possibly randomized) function of the opponent's last action. If these stationary strategies are not randomized (pure), they are even simpler to think about: they eventually lead game play into a deterministic cycle of actions, and players just care about their average payoff along this cycle.
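Here is that observation as a sketch in code: with pure stationary strategies f (for player 1) and g (for player 2), alternating play is deterministic on a finite state space, so the action profile must eventually cycle, and we can read off the average per-move payoff along the cycle. (The start profile, and representing f and g as lookup tables, are illustrative choices of mine.)

```python
def cycle_average(u1, u2, f, g, a0=0, b0=0):
    """Limit average payoffs when player 1 plays a = f[b] and player 2 plays b = g[a]."""
    a, b = a0, b0
    seen = {}       # profile at the start of a round -> index of that round
    payoffs = []    # per-move payoffs (p1, p2); two moves per round
    while (a, b) not in seen:
        seen[(a, b)] = len(payoffs) // 2
        a = f[b]                          # player 1 responds to the opponent's last action
        payoffs.append((u1[a][b], u2[a][b]))
        b = g[a]                          # then player 2 responds
        payoffs.append((u1[a][b], u2[a][b]))
    cycle = payoffs[2 * seen[(a, b)]:]    # payoffs accrued since this profile first appeared
    n = len(cycle)
    return (sum(p for p, _ in cycle) / n, sum(q for _, q in cycle) / n)
```

The loop must terminate within |A| x |B| rounds, since the dynamics are deterministic and that is the whole state space.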

We show the following in this alternating move model:
  • All but a negligible fraction (smaller than any inverse polynomial) of games actually do admit equilibria in pure stationary strategies.
  • If a game admits equilibria in pure stationary strategies, there is a PTAS to compute epsilon-approximate equilibria in pure stationary strategies. (For tiny games, you can even find them by the brute-force search sketched below.)
  • In any game, there is an FPTAS to compute epsilon-approximate equilibria in (non-stationary) pure strategies, even for n > 2 players.
This last bullet confirms the "backwards induction" intuition that alternating move games are actually more tractable than simultaneous move games: there is no FPTAS for computing equilibria in simultaneous move repeated games with n > 2 players unless PPAD = P.
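These results are easy to play with on small examples. Here is a naive exponential-time search (fine for two or three actions per player) for pairs of pure stationary strategies where neither player can gain by switching to a different pure stationary strategy. It reuses the cycle_average helper sketched above, and checking deviations only from a fixed start profile, and only against pure stationary deviations, are simplifying assumptions:

```python
from itertools import product

def pure_stationary_equilibria(u1, u2):
    A, B = len(u1), len(u1[0])
    fs = list(product(range(A), repeat=B))   # f[b] = player 1's response to b
    gs = list(product(range(B), repeat=A))   # g[a] = player 2's response to a
    equilibria = []
    for f, g in product(fs, gs):
        v1, v2 = cycle_average(u1, u2, f, g)   # helper from the sketch above
        # No unilateral switch to another pure stationary strategy should help.
        if all(cycle_average(u1, u2, f2, g)[0] <= v1 + 1e-9 for f2 in fs) and \
           all(cycle_average(u1, u2, f, g2)[1] <= v2 + 1e-9 for g2 in gs):
            equilibria.append((f, g))
    return equilibria

# Example: feed in any small payoff matrices, e.g. 2 x 2 matching pennies.
print(pure_stationary_equilibria([[1, -1], [-1, 1]], [[-1, 1], [1, -1]]))
```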

So how about finding exact equilibria? My intuition is that there should be an algorithm to efficiently compute exact equilibria in these games, but I have no idea how to do it. Even for the special case of zero-sum games, the problem of computing -exact- policy equilibria seems to be closely related to a long-standing open problem in model checking, so solving the problem might be hard...

But I think that this model deserves more attention than it has received: why should we restrict our attention to a model that (often artificially) assumes simultaneous play, when this restriction is making our lives computationally much harder than they need to be?

(One reason is that simultaneous move equilibria have a nice simple description that makes analyzing things like the price of anarchy particularly easy. Unfortunately, since the model is computationally intractable, those price of anarchy guarantees might not always be so predictive... It would be cool if someone came up with a good technique for studying the price of anarchy in alternating move games, or in another tractable model.)