Friday, November 25, 2016
At the beginning of the semester, I mentioned that after the law school's semester of events (click for videos) on fairness, machine learning, and the law, we would host a technical workshop on recent work on fairness in machine learning.
We have now finished putting together the program, which will be terrific. The workshop will take place here at Penn from January 19-20. Take a look at our great line-up of speakers here: https://sites.google.com/view/fairnessconfererencepenn
The event is open to the public, but registration is required.
Monday, October 24, 2016
Designing the Digital Economy
I'm on a train to New Haven, where I'll be giving a guest lecture (together with Solon Barocas) in Glen Weyl's class, "Designing the Digital Economy" (n.b. I need to get advice from Glen about how to get such good publicity for my classes...)
Solon and I will be sharing the 3-hour class, talking about fairness in machine learning, starting at 2:30. Pop by if you are around -- otherwise, here are my slides.
Wednesday, September 14, 2016
Semester on Fairness and Algorithms at Penn
This year, the "Fels Policy Research Initiative" is funding two exciting events, both related to fairness and machine learning. The first, joint between the law school and statistics, is called "Optimizing Government", and will host a series of 4 seminars over the course of this semester touching on technical and legal aspects of fairness.
EDIT: The Optimizing Government Project now has a website: https://www.law.upenn.edu/institutes/ppr/optimizing-government-project/ and the talks will be livestreamed here: https://www.law.upenn.edu/institutes/ppr/optimizing-government-project/media.php
I will be speaking at the first one, introducing the basics of Machine Learning and scenarios in which its use can lead to inadvertent discrimination. My inimitable colleague Richard Berk (who actually builds the models used by the state of Pennsylvania to predict criminal recidivism) will be offering his comments following my talk.
The second seminar will provide a panel discussion on what the law demands in terms of "fair and equal treatment", and how it relates to the use of machine learning. The panelists will come from Philosophy, Political Science, and Law.
The third seminar will be given by our excellent Warren Center postdoc Jamie Morgenstern, and will focus on technical solutions to the problem of unfairness in machine learning, and on how fairness can be squared with learning the optimal policy in online decision-making settings.
Finally, the fourth seminar will be an exciting keynote delivered by the current Deputy U.S. CTO Ed Felten on uses of machine learning in government.
I believe the talks will be recorded.
We will begin next semester with the second Fels-sponsored workshop, organized jointly between computer science and economics -- a two-day intensive workshop exploring current research on technical and economic approaches to addressing unfairness in decision making. More details to come.
Below is the schedule for this semester:
The “Optimizing Government” interdisciplinary research collaboration, supported by the Fels Policy Research Initiative, will hold the following workshops this fall:
Thursday, 9/22/16: What is Machine Learning (and Why Might it be Unfair)?
Fundamentals of machine learning with a focus on what makes it different from traditional statistical analysis and why it might lead to unfair outcomes.
Speakers: Aaron Roth (Penn Computer Science), with comments from Richard Berk (Wharton Statistics; Chair of SAS Criminology)

Thursday, 10/6/16: What Does Fair and Equal Treatment Demand?
Current legal and moral norms about fairness and equal protection as they relate to the use of machine learning in government.
Speakers: Panel featuring Samuel Freeman (Penn Philosophy), Nancy Hirschmann (Penn Political Science), and Seth Kreimer (Penn Law)

Thursday, 11/3/16: Fairness and Performance Trade-Offs in Machine Learning
Technical solutions to fairness challenges raised by machine learning and their impacts on algorithm effectiveness.
Speaker: Jamie Morgenstern (Penn Computer Science)

Thursday, 11/17/16: Keynote on Machine Learning and Government
How to use machine learning for a variety of administrative and policy functions, and findings from a White House initiative on artificial intelligence in government.
Speaker: Ed Felten, Deputy U.S. CTO (Invited)
Each workshop will take place from 4:30-6:00 pm in Gittis 213 (Penn Law). You can enter the Law School through its main entrance at 3501 Sansom Street.
Wednesday, August 24, 2016
Call for Papers: Second Workshop on Adaptive Data Analysis
As part of NIPS 2016, we will be running the second annual workshop on adaptive data analysis. Last year's workshop was a big hit. New this year, we are soliciting contributed papers in addition to invited speakers. The call for papers is below. If you have relevant work, definitely submit it to our workshop! More information at: http://wadapt.org/
Call for Papers
The overall goal of WADAPT is to stimulate discussion on the theoretical and practical aspects of adaptive data analysis. We seek contributions from different research areas of machine learning, statistics, and computer science. Submissions focused on a particular area of application are also welcome.
Submissions will undergo a lightweight review process and will be judged on originality, relevance, clarity, and the extent to which their presentation can stimulate the discussion between different communities at the workshop. Submissions may describe either novel work (completed or in progress), or work already published or submitted elsewhere provided that it first appeared after September 1, 2015.
Authors are invited to submit either a short abstract (2-4 pages) or a complete paper by the regular deadline of Oct 25, 2016. Information about previous publication, if applicable, should appear prominently on the first page of the submission. Abstracts must be written in English and be submitted as a single PDF file at EasyChair.
All accepted abstracts will be presented at the workshop as posters and some will be selected for an oral presentation. The workshop will not have formal proceedings, and presentation at the workshop is not intended to preclude later publication at another venue.
Those who need to receive a notification before the NIPS early registration deadline (Oct 6, 2016) should submit their work by the early submission deadline of Sept 23, 2016.
Important Dates:
Submission deadlines. Early: Sep 23, 2016; Regular: Oct 25, 2016. Submit at EasyChair.
Notification of acceptance. Early: Oct 3, 2016; Regular: Nov 7, 2016.
Workshop: December 9, 2016
Specific topics of interest for the workshop include (but are not limited to):
Selective/post-selection inference
Sequential/online false discovery rate control
Algorithms for answering adaptively chosen data queries
Computational and statistical barriers to adaptive data analysis
Stability measures and their applications to generalization
Information-theoretic approaches to generalization
Saturday, June 04, 2016
Machine Learning Postdoc
My brand new colleague Shivani Agarwal is in the market for a postdoc; the announcement is below. One of the targeted areas is machine learning and economics. Whoever takes this position will join a growing group of exceptional postdocs in this area at Penn, including Jamie Morgenstern and Bo Waggoner.
Postdoctoral Position in Machine Learning at UPenn

Applications are invited for a postdoctoral position in machine learning in the Department of Computer and Information Science at the University of Pennsylvania. The position is expected to begin in Fall 2016, and is for a period of up to two years (with renewal in the second year contingent on performance in the first year). Applications in all areas of machine learning will be considered, with special emphasis on the following areas: ranking and choice modeling; connections between machine learning and economics; and learning of complex structures.

The ideal candidate will demonstrate both an ability for independent thinking and an interest in co-mentoring graduate students. The candidate will work primarily with Shivani Agarwal (joining the UPenn faculty in July 2016), but will also have opportunities to collaborate with other faculty in machine learning and related areas at UPenn, including Michael Kearns, Daniel Lee, Sasha Rakhlin, Aaron Roth, and Lyle Ungar.

UPenn is located in the vibrant city of Philadelphia, which is known for its rich culture, history, museums, parks, and restaurants. It is less than 1.5 hrs by train to NYC, 1.5 hrs by flight to Boston, 2 hrs by train to Washington DC, and 40 mins by train to Princeton. For more details about the CIS department at UPenn, see: http://www.cis.upenn.edu/

To apply, send the following materials by email, with the subject line "Application for Postdoctoral Position", by June 17, 2016:
- curriculum vitae
- 2-page statement of research interests and goals
- 3 representative publications or working papers
- 3 letters of recommendation (to be sent separately by the same date)

Shortlisted candidates will be invited for a short meeting/interview at ICML/COLT in NYC during June 23-26 (in your email, please indicate your availability for this).
Tuesday, May 24, 2016
Fairness in Learning
The very real problem of (un)fairness in algorithmic decision making in general, and machine learning in particular, seems to have finally reached the forefront of public attention. Every day there is a new popular article about the topic. Just in the last few weeks, we have seen articles in the Times about built-in bias at Facebook, and an in-depth ProPublica study of racial bias in statistical models for predicting criminal recidivism. Earlier this month, the White House released a report on the challenges of promoting fairness in Big Data.
The tricky thing is saying something concrete and technical about this problem -- even defining what "fairness" means is delicate. There has been some good technical work in this area that I have long admired from afar -- see e.g. the "FATML" (Fairness and Transparency in Machine Learning) Workshop to get an idea of the range of work being done, and the folks doing it. People like Cynthia Dwork, Moritz Hardt, Solon Barocas, Suresh Venkatasubramanian, Sorelle Friedler, Cathy O'Neil, and others have been doing important work thinking about these problems for quite some time. A particularly nice early paper that I recommend everyone interested in the area read is Fairness Through Awareness, by Dwork, Hardt, Pitassi, Reingold, and Zemel. It was first posted online in 2011(!), and in retrospect is quite prescient in its discussion of algorithmic fairness.
So I'm happy to finally have something interesting to say about the topic! My student Matthew Joseph, Michael Kearns, Jamie Morgenstern, and I just posted a new paper online that I'm excited about: Fairness in Learning: Classic and Contextual Bandits. I'll mostly let the paper speak for itself, but briefly, we write down a simple but (I think) compelling definition of fairness in a stylized general model of sequential decision making called the "contextual bandit setting". To keep a canonical problem in your mind, imagine the following: There are a bunch of different populations (say racial or socioeconomic groups), and you are a loan officer. Every day, an individual from each population applies for a loan. You get to see the loan application for each person (this is the "context"), and have to decide who to give the loan to. When you give out the loan, you observe some reward (e.g. you see if they paid back the loan), but you don't see what reward you -would- have gotten had you given the loan to someone else. Our fairness condition says roughly that an algorithm is "fair" if it never preferentially gives a loan to a less qualified applicant over a more qualified applicant -- where the quality of an applicant in our setting is precisely the probability that they pay back the loan. (It prohibits discriminating against qualified applicants on an individual basis -- even if they happen to come from a population that is less credit-worthy on average, or from a population that the bank doesn't understand as well).
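To make the condition concrete, here is a rough Python sketch of the check it imposes in a single round. This is my informal paraphrase: the paper states the definition over the algorithm's entire history and only requires it to hold with high probability, and the function and variable names below are illustrative rather than taken from the paper.

```python
def is_round_fair(payback_prob, play_prob, tol=1e-9):
    # payback_prob[i]: true probability that applicant i repays the loan
    #                  (the applicant's "quality" -- unknown to the learner).
    # play_prob[i]:    probability the algorithm gives the loan to applicant i.
    # Rough condition: never give a strictly less qualified applicant a
    # strictly higher probability of receiving the loan.
    k = len(payback_prob)
    for i in range(k):
        for j in range(k):
            if payback_prob[i] > payback_prob[j] and play_prob[j] > play_prob[i] + tol:
                return False
    return True

# Treating equally qualified applicants alike is fair:
print(is_round_fair([0.7, 0.7, 0.4], [0.5, 0.5, 0.0]))  # True
# Favoring the less qualified third applicant is not:
print(is_round_fair([0.7, 0.7, 0.4], [0.2, 0.2, 0.6]))  # False
```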
It might seem like this definition of fairness is entirely consistent with the profit motivation of a bank -- why would a bank ever want to give a loan to an applicant less likely to pay it back? Indeed, this would be true if the bank had nothing to learn -- i.e. if it already knew the optimal rule mapping loan applications to credit-worthiness. Said another way, implementing the optimal policy is entirely consistent with our fairness definition. Our main conceptual message is that fairness can nevertheless be an obstruction to learning the optimal policy.
What our results say is that "fairness" always has a cost in terms of the optimal learning rate achievable by algorithms in this setting. For some kinds of problems, the cost is mild in that the cost of fairness on the learning rate is only polynomial (e.g. when credit-worthiness is determined by a simple linear regression model on the features of a loan application). On the other hand, for other kinds of problems, the cost of fairness on the learning rate is severe, in that it can slow learning by an exponential factor (e.g. when credit-worthiness is determined by an AND of features in the loan application). Put another way, for the problems in which the cost of fairness is severe, if the bank were to use a fast learning algorithm (absent a fairness constraint), the algorithm might be "unfair" for a very long time, even if in the limit, once it learned the truly optimal policy, it would eventually be fair. One friction to fairness is that we don't live in the limit -- we are always in a state of learning.
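For the classic (non-contextual) version of the problem, the paper's algorithm gets its fairness guarantee by chaining confidence intervals: maintain an interval around each arm's empirical mean, link together every arm whose interval (transitively) overlaps the interval with the highest upper bound, and play uniformly at random within that linked set, since fairness forbids favoring any arm we cannot yet statistically distinguish from the best. Here is a minimal, unoptimized sketch of that idea; the function name, the confidence-interval constants, and the simulation interface are my own simplifications, not the paper's exact algorithm.

```python
import math
import random

def fair_interval_chaining(arms, horizon, seed=0):
    # arms: list of zero-argument callables; arms[i]() returns a reward
    #       in [0, 1] (e.g. 1 if a loan is repaid, 0 otherwise).
    rng = random.Random(seed)
    k = len(arms)
    counts = [0] * k
    means = [0.0] * k

    def radius(i):
        # Generic Hoeffding-style confidence width; the paper tunes this.
        if counts[i] == 0:
            return float("inf")
        return math.sqrt(math.log(2.0 * k * horizon) / (2.0 * counts[i]))

    for _ in range(horizon):
        lo = [means[i] - radius(i) for i in range(k)]
        hi = [means[i] + radius(i) for i in range(k)]

        # Chain every arm whose interval is transitively linked to the
        # interval with the highest upper bound: these arms cannot yet be
        # statistically distinguished, so fairness forces uniform treatment.
        order = sorted(range(k), key=lambda i: hi[i], reverse=True)
        chain, chain_lo = [order[0]], lo[order[0]]
        for i in order[1:]:
            if hi[i] >= chain_lo:      # interval touches the chain's span
                chain.append(i)
                chain_lo = min(chain_lo, lo[i])
            else:
                break                  # remaining arms have even lower hi

        # Play uniformly at random within the chained set.
        i = rng.choice(chain)
        reward = arms[i]()
        counts[i] += 1
        means[i] += (reward - means[i]) / counts[i]
    return means

# Example: three populations with repayment rates 0.7, 0.6, and 0.3.
def bernoulli(p):
    return lambda: float(random.random() < p)

print(fair_interval_chaining([bernoulli(p) for p in (0.7, 0.6, 0.3)], 5000))
```

The cost of fairness shows up directly in the chaining step: until the confidence intervals separate, the algorithm must spread its decisions uniformly over the entire linked set, which is exactly the slowdown in the learning rate described above.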