Saturday, November 08, 2008

Auctions with Unknown Supply

Its been a long time since my last post; I spent the summer at Microsoft Research in Silicon Valley, which was tons of fun, and since then I've been writing up some of the work I did there. Oh, and I've also been wasting an inordinate amount of time reading political blogs. But all of these things are now over, and its time for a post. Today, I'll write about one of the cool projects I worked on at Microsoft, with Moshe Babaioff and Liad Blumrosen.

Lots of people have heard talks about truthful auctions by now: A seller has some object he wants to sell, and bidders each have some private value for the item. The seller wants to sell the item to the highest bidder. How should he do it? If he just asks for bids and sells to the highest bidder, charging the bidder his own bid, buyers will lie about how much they value the item, and the item may not actually go to the person who values it the most. A classic result of Vickrey says that the seller should solicit bids, and give the item to the highest bidder, but charge him the second highest price. Then, the bidders can't do any better for themselves by lying about their valuations -- they should all just tell the truth! Auctions that have this property are called truthful auctions. Its easy to generalize this to the case in which the seller has not just a single item, but instead is selling k identical items.

This is all well and good -- the problem of selling items seems solved. And now that auctions are used to sell billions of advertisements on the internet, the theory of truthful auctions can be put to good use.

Unfortunately, there are some characteristics of these auctions that haven't been modelled. Even if we abstract away the details of particular ad systems (different slots worth different amounts, etc.), the items we are selling are page views, and page views are peculiar items indeed: First and foremost, we don't know how many items we will have available to sell! The items arrive online as users make clicks, and since we can't delay showing the user a page, we must decide on the allocation and price of each item as soon as it arrives, before we know how many future items will arrive. So the auctions we really want to study are online. And unlike previous work on online auctions, the auctioneer knows who the bidders are -- its the items that arrive online.

The Vickery auction maximizes social welfare -- the sum of the valuations of the bidders who recieve items. A basic question is how well we can approximate the optimal social welfare when items arrive online. At first blush, it seems that perhaps we can achieve optimal social welfare: After all, if we knew bidders' valuations, the greedy algorithm which simply allocated items as they came in to the unsatisfied bidder with the highest bid would achieve this. The problem is designing a payment rule that incentivizes bidders to actually bid their true values.

But lets step back -- what do we mean, anyway, when we say that the supply is unknown to the mechanism? There are several things we might mean. Perhaps, as is standard in the analysis of online algorithms, we mean that the supply is adversarial -- items arrive one at a time, but may stop at any time, and we want good guarantees on social welfare regardless of how many items arrive. However, this might not be totally realistic. After all, if we are Microsoft (or, lets see... Google?), we're running lots of these auctions every day, and we have good distributional information about how many items arrive each day. So another thing we might mean is that the total number of items that will arrive in any given day is drawn from a distribution that is known to the auctioneer. In this case, we only want good welfare guarantees in expectation over this distribution.  

And this gets to the heart of our result: distributional information is -really- important. What we show is that in the adversarial setting, its impossible to get a constant approximation to social welfare. In fact, no deterministic truthful mechanism gets better than the trivial n-approximation in the worst case. Even randomized truthful mechanisms can't achieve better than an Ω(loglog(n)) approximation.  On the other hand, in the stochastic setting, it is possible to get a constant approximation to social welfare. For any distribution that satisfies a technical condition known as monotone hazard rate (and many distributions do), there is a deterministic truthful mechanism which achieves a ~ 16 approximation. 

So what is the take home message? Pay attention to distributional information, at least when it comes to your supply. Economists often make distributional assumptions about the values of the bidders, and our result shows that this is less important in our setting -- we don't need this to get a good approximation to welfare. This is good, because a search engine has direct access to the distribution over clicks and page views, but must do some guessing to figure out the true values of its bidders (as distinct from their bids). And this is how it should be -- if we have a model in which what we want to accomplish is impossible, we should fix it with the information that we have right under our noses.

2 comments:

Jonathan Katz said...

Interesting post. But since the motivation is practical, I have a question: for any reasonable system, wouldn't log log n be less than 16?

Aaron said...

Hi Jonathan,

My answer has two parts: I left out some details, and my motivation is not purely practical.

Sticking with the practical motivation: Actually, if you are a company like Google selling things to customers, you really want to do so in a non-discriminatory way so that you don't piss your customers off. The mechanisms we give are in a natural class that we call "online envy-free", which basically means that bidders who are given exactly the same thing shouldn't be charged different prices. If you stick to mechanisms like this, then we strengthen our lower bound to ~Ω(log n) (up to a loglog factor -- and this is tight). Since we're talking about things like pageviews and clicks on popular websites, log(n) is now starting to look pretty big.

But I don't want to pretend that we've designed a completely realistic model, because we haven't -- we've chosen a simple model that gets to the heart of the issue we want to study: online supply. And that answer, I think, sells the field of mechanism design a bit short: auctions are interesting mathematical structures in their own right, not only because they are useful for making money. Cute introductory paragraphs aside, it is interesting what restrictions the semantic constraint of truthfulness places on algorithms. If we ignore incentive issues, the problem we study is trivial: the greedy algorithm which assigns items as they come in to the highest unsatisfied bidders achieves optimal social welfare, even in the adversarial setting. But truthfulness turns out to be a real barrier in this online setting: one that can't be overcome in the adversarial setting, but which can be in the stochastic setting.

So my real answer is this: I think that the constraints that semantic restrictions like truthfulness place on computation are interesting in the same way that time and space constraints are. These problems often relate to making money, but thats not the only reason we should study them.