Sunday, July 05, 2009

Asynchronous Games

I'm heading off to California for the week where I will present my paper with Moshe Babaioff and Liad Blumrosen, "Auctions with Online Supply", at the ad-auctions workshop that preceeds EC. If you are going to be around, you should come to my talk!

But this is a post on something fun that I was thinking about this past semester at Microsoft research, with Nina Balcan, Adam Kalai, and Yishay Mansour.

In game theory it often goes without saying that when we are talking about an (infinitely) repeated interaction amongst agents, that we are assuming that the agents are acting syncronously: that is, we implicitely have a universally known sequence of discrete clock ticks, and every agent acts simultaniously at each clock tick, without knowledge of the other's actions. Think about this standard model as modelling what goes on when you play "Rock, Paper, Scissors". I'm not sure why simultanious move repeated games became the standard historically -- my guess is that they are just easier to think about. But its not obvious that when modelling interactions (say) on the internet, that the simultanious move model is the right one -- perhaps some model that has players acting asynchronously, with knowledge of previous actions, is more realistic -- often we are without any syncronizing mechanism.

There is another reason why the simultanious move model might not be the best -- the complexity results in this model have been almost universally negative. It turns out to be hard to compute the Nash equilibria of even a 1-shot (non-repeated) two-player n x n game in which the payoffs are restricted to be in {0,1}. You might think that it is easier to compute the equilibria of a repeated game (since due to the "Folk Theorem", almost anything is an equilibrium in a repeated simultanious move game!), but for n > 2 players, that's known to be hard as well. In addition to accurately modelling the games we are interested in, we would like our solution concepts to be computationally plausible, and simultanious move Nash equilibria often seem to fail in both counts.

A reasonable first pass at a model of asynchrony in a two player game might go as follows: The game is still defined by an action set A for player 1, B for player 2, and utility functions u_i :AxB -> [-1,1]. Every day, a random player gets to perform an action from his action set, with full knowledge of the past history. If I (player 1) play action 'a' today, and my opponent last played action 'b', I get payoff u_1(a,b), and my opponent gets payoff u_2(a,b). And vice versa. We keep playing forever, and what we care about is our average per-day payoff in the limit, as time goes to infinity.

So how should I think about equilibria in this model? It turns out that this model is strategically equivalent to the model in which the two players take turns alternately making moves, instead of getting to make moves in a random order (informally, this is because if I get two turns in a row, it doesn't make sense to change my action, so equilibria from one model can be converted to equilibria in the other model and vice versa). This sounds like good news -- if we were only playing for finitely many rounds, this would just be a game of "perfect information", and we could find equilibria of the game just by "solving it", starting from the last round and working backwards. (Of course the game tree might be pretty big, so lets hope "finite" means "really small")

What about the infinitely repeated case? It turns out that infinitely repeated alternating move games always admit a particularly appealing kind of equilibrium: one in which each player plays according to a stationary strategy: a unit-memory strategy in which the next action is just a (possibly randomized) function of the opponent's last action. If these stationary strategies are not randomized (pure), they are even simpler to think about: they eventually lead game play into a deterministic cycle of actions, and players just care about their average payoff along this cycle.

We show the following in this alternating move model:
  • All but a negligible fraction (smaller than any inverse polynomial) of games actually do admit equilibria in pure stationary strategies.
  • If a game admits equilibria in pure stationary strategies, there is a PTAS to compute epsilon-approximate equilibria in pure stationary strategies.
  • In any game, there is an FPTAS to compute epsilon-approximate equilibria in (non-stationary) pure strategies, even for n > 2 players.
This last bullet confirms the "backwards induction" intuition that alternating move games are actually more tractable than simultanious move games: there isn't an FPTAS for computing equilibria in simultanious move repeated games with n > 2 players unless PPAD = P.

So how about finding exact equilibria? My intuition is that there should be an algorithm to efficiently compute exact equilibria in these games, but I have no idea how to do it. Even for the special case of zero-sum games, the problem of computing -exact- policy equilibria seems to be closely related to a long-standing open problem in model checking, so solving the problem might be hard...

But I think that this model deserves more attention than it has recieved: why should we restrict our attention to a model that (often artificially) assumes simultanious play, when this restriction is making our lives computationally much harder than they need to be?

(One reason is that simultanious move equilibria have a nice simple description that makes analyzing things like the price of anarchy particularly easy. Unfortunately, since the model is computationally intractable, those price of anarchy guarantees might not always be so predictive... It would be cool if someone came up with a good technique for studying the price of anarchy in alternating move games, or in another tractable model.)


Thursday, January 15, 2009

Oh, Pittsburgh

From the Post Gazette

Our mayor, Luke Ravenstahl, has begun the process to officially change his name to Luke Steelerstahl. Why would he do such a thing? Its because on Sunday the Pittsburgh Steelers will be playing the Baltimore Ravens. According to our mayor:
On behalf of the Steelers Nation, I've decided to remove the word 'Ravens' from my name just like the Steelers will remove them from the AFC Championship.
Stahl, of course, means "Steel" in German.




Wednesday, December 03, 2008

CS theory in 60 minutes for highschool students

I might have the opportunity to give a talk at a high school, as part of a "what is science" style seminar. The audience would presumably tilt towards the more nerdy (since attendance is voluntary), but still probably wouldn't have more than a standard high school background in math or computer science (which means, likely, that they have never heard of computer science).  So I'd like to give a talk that introduces them to computer science as the study of computation, and I'd like to make it sound cool. So there should be perhaps some proofs of cool theorems, but all of the proofs should be simple enough to do by picture on a powerpoint slide. I'm looking for suggestions for what to present.

My first thought is a talk titled something like "What we can't do, infinity, and even bigger". It would cover two proofs that I think have a particularly high coolness to complicatedness ratio: Cantor's proof of the uncountability of reals, and Turing's proof of the undecidability of the halting problem. The outline would perhaps be something like:

1) Infinities are not all equal. There are lots of integers, but there are WAY more real numbers. Spend 20 minutes explaining diagonalization, and what it means.

2) Computer programs can do a lot, but not everything. 
Spend 20 minutes explaining the halting problem, and what it means to be undecidable. I'd probably avoid defining Turing Machines, since that seems like a mess, and just talk about it in terms of computer programs, which are hopefully a little more familiar

3) The punch line: Computer programs are countable, like the integers, but the problems that you might want to solve (languages) are uncountable, like the reals. So although computer programs can solve lots of things, actually, most things can't be solved. Hopefully this can be done in 10 minutes given that I've already talked about uncountability and undecidability.

I like these proofs since they don't require any machinery, and seem like they can be done visually with some animations. Thoughts?

Friday, November 28, 2008

Incentives, computation, and privacy

What do computer scientists bring to the study of game theory? One thing is simply a computer science style of analysis: worst case input, competitive ratios, approximation factors, etc. Perhaps the more interesting thing, however, is the study of how incentive constraints and computation interact: since we want to actually implement any mechanism we might dream up (in principle), it is subject not only to traditional game theoretic constraints such as truthfulness, but also standard computational limitations.

One fundamental question is: What can be computed by algorithms that satisfy constraint X, and it is natural to substitute "incentive compatibility" for "constraint X". An interesting study along these lines appeared in the last FOCS: "On The Hardness of Being Truthful"by Papadimitriou, Schapira, and Singer. This paper gives an example of a natural NP hard problem that can be well approximated efficiently, can be implemented truthfully, but cannot be efficiently well approximated by a truthful implementation: that is, they show that for this problem, truthfulness and computational limitation restrict what is algorithmically possible, although neither limitation is insurmountable on its own. 

They call the problem the "Combinatorial Public Projects Problem" (CPPP), although it is really just submodular maximization. In the CPPP, there are n agents, and m potential public goods. a mechanism for the CPP problem will select a subset of k of these goods. The agents have an opinion over which subsets are best: each agent i has a submodular valuation function f_i over sets of goods. The mechanism wishes to maximize the social welfare: it wants to pick the set of k goods that maximizes the sum of the agents valuation functions. We'll call this sum F.    Notice that since each f_i is submodular, F is submodular as well, and so the problem the mechanism faces is simply submodular maximization, and it is easy to show that the greedy algorithm (which selects at each stage the good that produces the largest marginal utility, until k goods are selected) achieves a (1-1/e) approximation. Not bad! But also not truthful -- agents may be incentivized to misreport their valuations. Whats more, the classic VCG mechanism achieves perfect social welfare, and enforces truthfulness -- but of course cannot be implemented efficiently, since it involves solving an NP hard problem exactly. What Papadimitriou, Schapira, and Singer show is that no truthful efficient algorithm can guarantee social welfare better by any polynomial factor than Sqrt(m) -- a huge gap from what is possible with either constraint alone.

Now whenever you see a striking impossibility result like this, its tempting to ask how resilient it is. Does it still hold if no player by himself constitutes a large fraction of the total social welfare? What if we relax truthfulness to approximate truthfulness, so that perhaps players can benefit by lying, but can't gain more than some small epsilon?

It turns out this particular problem is no longer so hard when we relax some of our constraints. Anupam Gupta, Katrina Ligett, Frank McSherry, Kunal Talwar, and I have a mostly unrelated paper, called "Differentially Private Approximation Algorithms". In this paper we also ask "What can be computed by algorithms that satisfy constraint X?", but in this paper, constraint X is differential privacy. We get mostly positive results: many NP hard problems, including the CPPP problem can be well approximated efficiently by privacy preserving algorithms. So what does this have to do with truthfulness?

In their 2007 FOCS paper "Mechanism Design via Differential Privacy", Mcsherry and and Talwar observed that mechanisms that satisfy differential privacy are also approximately truthful: privacy guarantees that any single player who changes his input does not change the probability of any output by more than a (1+epsilon) factor; therefore, he cannot increase his utility by more than a (1+epsilon) factor by deviating from a truthful input. In general, lets say that a mechanism satisfies gamma-truthfulness if no agent can benefit by more than an additive gamma factor by deviating from truthful behavior. So suppose in the CPP problem we scale every agent's valuation function so that her maximum valuation is 1. As a corollary of one of our approximation results, we show that for any gamma, there exists an efficient, gamma-truthful mechanism that achieves an additive O(k log m * log(1/gamma)/gamma) approximation factor. (Contrasting with the impossibility result of a multiplicative Sqrt(m) factor for 0-truthful mechanisms). In particular, for instances in which we have OPT = Ω(k log m * log(1/gamma)/gamma), this is an approximately truthful constant factor approximation algorithm.

I think that computer scientists are just starting to scratch the surface of the interplay between game theory and computation. It seems that there is a rich theory of incentive compatible computation waiting to be discovered, and that algorithmic game theory will prove to fit in nicely with the rest of theoretical computer science as a discipline that studies the fundamental abilities and limits of computation.

 

Friday, November 21, 2008

Computer science in the news

The New York Times magazine has a feature this week on the NetFlix challenge:  http://www.nytimes.com/2008/11/23/magazine/23Netflix-t.html?pagewanted=1&hp

Its actually puts computer science in a pretty nice light -- it makes it seem very exciting. In particular, it seems to take the position that complex machine learning systems have something approaching real intelligence -- or at least that they are complex enough that they aren't fully understood by their own creators:

There’s a sort of unsettling, alien quality to their computers’ results. When the teams examine the ways that singular value decomposition is slotting movies into categories, sometimes it makes sense to them — as when the computer highlights what appears to be some essence of nerdiness in a bunch of sci-fi movies. But many categorizations are now so obscure that they cannot see the reasoning behind them. Possibly the algorithms are finding connections so deep and subconscious that customers themselves wouldn’t even recognize them. At one point, Chabbert showed me a list of movies that his algorithm had discovered share some ineffable similarity; it includes a historical movie, “Joan of Arc,” a wrestling video, “W.W.E.: SummerSlam 2004,” the comedy “It Had to Be You” and a version of Charles Dickens’s “Bleak House.”  For the life of me, I can’t figure out what possible connection they have, but Chabbert assures me that this singular value decomposition scored 4 percent higher than Cinematch — so it must be doing something right.

The author is definitely giving the computer too much credit for these anomylous clusters (and holds up "singular value decomposition" throughout the article as some sort of magical technique), but the point made is the right one. The difference between real intelligence and "just an algorithm" does in large part seem to be whether or not you can anticipate what its going to do. When you create a program that can come up with output that surprises you, thats pretty cool.

Saturday, November 08, 2008

Auctions with Unknown Supply

Its been a long time since my last post; I spent the summer at Microsoft Research in Silicon Valley, which was tons of fun, and since then I've been writing up some of the work I did there. Oh, and I've also been wasting an inordinate amount of time reading political blogs. But all of these things are now over, and its time for a post. Today, I'll write about one of the cool projects I worked on at Microsoft, with Moshe Babaioff and Liad Blumrosen.

Lots of people have heard talks about truthful auctions by now: A seller has some object he wants to sell, and bidders each have some private value for the item. The seller wants to sell the item to the highest bidder. How should he do it? If he just asks for bids and sells to the highest bidder, charging the bidder his own bid, buyers will lie about how much they value the item, and the item may not actually go to the person who values it the most. A classic result of Vickrey says that the seller should solicit bids, and give the item to the highest bidder, but charge him the second highest price. Then, the bidders can't do any better for themselves by lying about their valuations -- they should all just tell the truth! Auctions that have this property are called truthful auctions. Its easy to generalize this to the case in which the seller has not just a single item, but instead is selling k identical items.

This is all well and good -- the problem of selling items seems solved. And now that auctions are used to sell billions of advertisements on the internet, the theory of truthful auctions can be put to good use.

Unfortunately, there are some characteristics of these auctions that haven't been modelled. Even if we abstract away the details of particular ad systems (different slots worth different amounts, etc.), the items we are selling are page views, and page views are peculiar items indeed: First and foremost, we don't know how many items we will have available to sell! The items arrive online as users make clicks, and since we can't delay showing the user a page, we must decide on the allocation and price of each item as soon as it arrives, before we know how many future items will arrive. So the auctions we really want to study are online. And unlike previous work on online auctions, the auctioneer knows who the bidders are -- its the items that arrive online.

The Vickery auction maximizes social welfare -- the sum of the valuations of the bidders who recieve items. A basic question is how well we can approximate the optimal social welfare when items arrive online. At first blush, it seems that perhaps we can achieve optimal social welfare: After all, if we knew bidders' valuations, the greedy algorithm which simply allocated items as they came in to the unsatisfied bidder with the highest bid would achieve this. The problem is designing a payment rule that incentivizes bidders to actually bid their true values.

But lets step back -- what do we mean, anyway, when we say that the supply is unknown to the mechanism? There are several things we might mean. Perhaps, as is standard in the analysis of online algorithms, we mean that the supply is adversarial -- items arrive one at a time, but may stop at any time, and we want good guarantees on social welfare regardless of how many items arrive. However, this might not be totally realistic. After all, if we are Microsoft (or, lets see... Google?), we're running lots of these auctions every day, and we have good distributional information about how many items arrive each day. So another thing we might mean is that the total number of items that will arrive in any given day is drawn from a distribution that is known to the auctioneer. In this case, we only want good welfare guarantees in expectation over this distribution.  

And this gets to the heart of our result: distributional information is -really- important. What we show is that in the adversarial setting, its impossible to get a constant approximation to social welfare. In fact, no deterministic truthful mechanism gets better than the trivial n-approximation in the worst case. Even randomized truthful mechanisms can't achieve better than an Ω(loglog(n)) approximation.  On the other hand, in the stochastic setting, it is possible to get a constant approximation to social welfare. For any distribution that satisfies a technical condition known as monotone hazard rate (and many distributions do), there is a deterministic truthful mechanism which achieves a ~ 16 approximation. 

So what is the take home message? Pay attention to distributional information, at least when it comes to your supply. Economists often make distributional assumptions about the values of the bidders, and our result shows that this is less important in our setting -- we don't need this to get a good approximation to welfare. This is good, because a search engine has direct access to the distribution over clicks and page views, but must do some guessing to figure out the true values of its bidders (as distinct from their bids). And this is how it should be -- if we have a model in which what we want to accomplish is impossible, we should fix it with the information that we have right under our noses.

Saturday, September 13, 2008

Grim Trigger

It seems that the late army scientist turned anthrax suspect Bruce Ivins was not just a microbiologist, he was a game theorist. In an article today in the New York Times, we learn that before the FBI focused its anthrax investigation on him, he wrote a will, requesting that in the event of his death, his remains be cremated and scattered. But talk is cheap, and he wasn't sure that his wife and children would honor his request. 

It turns out that Ms. Ivins, his wife, was an anti-abortion activist and a former president of Frederick County Right to Life, a pro-life organization. So Ivins wrote the following into his will:
“If my remains are not cremated and my ashes are not scattered or spread on the ground, I give to Planned Parenthood of Maryland $50,000"
or about 1/3 of his estate. 

His concern seems to have stemmed from the fact that cremation is discouraged by Catholic doctrine, and in the case of cremation, ashes must be buried in sacred ground rather than scattered. But the threat-from-beyond-the-grave has worked, and it seems that Ivin will get his wish. So consider this a victory of game theory over theology.