Schedule

We will aim to fill in lecture topics at least 1 week in advance. Assignment due dates are final, unless there are exceptional unforeseen circumstances.

  • Event
    Date
    Day
    Description / Course Material
  • Lecture
    08/26
    Monday
    Lecture 1 - Course Introduction
  • Lecture
    08/28
    Wednesday
    Lecture 2 - Data challenges
  • Assignment
    08/29
    Thursday
    Homework #1 - Polling and Data Collection released!
  • Lecture
    09/02
    Monday
    NO CLASS -- Labor Day
  • Lecture
    09/04
    Wednesday
    Lecture 3 - Survey weighting
  • Lecture
    09/09
    Monday
    Lecture 4 - Weighting 2
  • Lecture
    09/11
    Wednesday
    Lecture 5 - Other aspects of data collection
  • Lecture
    09/16
    Monday
    Lecture 6 - Recommendations introduction
  • Assignment
    09/17
    Tuesday
    Homework #2 - Recommendation Systems released!
  • Due
    09/17 23:59 ET
    Tuesday
    Homework #1 due
  • Lecture
    09/18
    Wednesday
    Lecture 7 - Recommendations, from predictions to decisions
  • Lecture
    09/23
    Monday
    Lecture 8 - Algorithmic Pricing basics
  • Lecture
    09/25
    Wednesday
    Lecture 9 - Algorithmic Pricing complications
  • Lecture
    09/30
    Monday
    Special Lecture (by Kenny, Sophie, Vince) -- Recommendations and matching

    Abstract of Kenny’s Talk: In recommendation settings, there is an apparent trade-off between the goals of accuracy (to recommend items a user is most likely to want) and diversity (to recommend items representing a range of categories). As such, real-world recommender systems often explicitly incorporate diversity separately from accuracy. This approach, however, leaves a basic question unanswered: Why is there a trade-off in the first place? We show how the trade-off can be explained via a user’s consumption constraints—users typically only consume a few of the items they are recommended. In a stylized model we introduce, objectives that account for this constraint induce diverse recommendations, while objectives that do not account for this constraint induce homogeneous recommendations. This suggests that accuracy and diversity appear misaligned because standard accuracy metrics do not consider consumption constraints.
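
    A rough sketch of the intuition (a toy construction of mine, not the speakers' model): when the objective credits every liked item, the optimal slate concentrates on the single most likely category, while an objective that respects the consumption constraint (here, "the user will consume at most one item") is maximized by a diverse slate. All categories, probabilities, and slate sizes below are illustrative assumptions.

    ```python
    # Toy model (illustrative assumptions throughout): a user has a latent
    # preferred category, and an item is liked mostly when it matches it.
    import itertools

    categories = ["news", "sports", "music"]
    prior = {"news": 0.5, "sports": 0.3, "music": 0.2}  # P(user prefers c)
    items = [(f"{c}{i}", c) for c in categories for i in range(3)]
    k = 3  # slate size

    def p_like(item_cat, user_cat):
        return 0.8 if item_cat == user_cat else 0.05

    def expected_likes(slate):
        # "Accuracy" objective: sum of marginal like-probabilities.
        return sum(prior[u] * sum(p_like(c, u) for _, c in slate)
                   for u in categories)

    def p_at_least_one(slate):
        # Consumption-constrained objective: chance one recommendation lands.
        total = 0.0
        for u in categories:
            p_none = 1.0
            for _, c in slate:
                p_none *= 1 - p_like(c, u)
            total += prior[u] * (1 - p_none)
        return total

    best_sum = max(itertools.combinations(items, k), key=expected_likes)
    best_one = max(itertools.combinations(items, k), key=p_at_least_one)
    print("sum-of-marginals slate:", [c for _, c in best_sum])  # homogeneous
    print("at-least-one slate:   ", [c for _, c in best_one])   # diverse
    ```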

    Abstract of Sophie’s Talk: In the basic recommendation paradigm, the most relevant item is recommended to each user. This may result in some items receiving lower exposure than they “should”; to counter this, several algorithmic approaches have been developed to ensure item fairness. These approaches necessarily degrade recommendations for some users to improve outcomes for items, leading to user fairness concerns. In turn, a recent line of work has focused on developing algorithms for multi-sided fairness, to jointly optimize user fairness, item fairness, and overall recommendation quality.

    In this talk, I will introduce item fairness and multi-sided fairness in recommendation. Then, I will discuss our recent work addressing the question: what is the trade-off between user fairness and item fairness, and what are the characteristics of (multi-objective) optimal solutions? In this work, we develop a theoretical model of recommendations with user and item fairness objectives and characterize the solutions of fairness-constrained optimization. We identify two phenomena: (a) when user preferences are diverse, there is “free” item and user fairness; and (b) users whose preferences are misestimated can be especially disadvantaged by item fairness constraints. Empirically, we build a recommendation system for preprints on arXiv and implement our framework, measuring the phenomena in practice and showing how these phenomena inform the design of markets in which matching is intermediated by recommendation systems.
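
    As a rough illustration of fairness-constrained optimization (my own toy formulation, not the model from the talk), the sketch below spreads recommendation probability across items to maximize total estimated value while guaranteeing every item a minimum expected exposure; the value matrix and the exposure floor are assumed for illustration.

    ```python
    # Toy item-fairness LP (illustrative assumptions throughout).
    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    n_users, n_items = 6, 4
    V = rng.uniform(size=(n_users, n_items))  # estimated user-item values
    floor = 1.0                               # each item gets >= 1 expected slot

    # Variables x[u, i] in [0, 1], flattened row-major; linprog minimizes.
    c = -V.ravel()
    A_eq = np.zeros((n_users, n_users * n_items))
    for u in range(n_users):
        A_eq[u, u * n_items:(u + 1) * n_items] = 1.0  # sum_i x[u, i] = 1
    A_ub = np.zeros((n_items, n_users * n_items))
    for i in range(n_items):
        A_ub[i, i::n_items] = -1.0                    # encodes sum_u x[u, i] >= floor
    res = linprog(c, A_ub=A_ub, b_ub=-floor * np.ones(n_items),
                  A_eq=A_eq, b_eq=np.ones(n_users), bounds=(0, 1))
    x = res.x.reshape(n_users, n_items)
    print("item exposures:", x.sum(axis=0).round(2))  # all >= floor
    print("total value:", -res.fun)
    ```

    Tightening the floor degrades some users' recommendations to raise item exposure, which is exactly the user-fairness/item-fairness tension described above.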

    Abstract of Vince’s Talk: Every year in the US, over 3 million older adults are discharged from hospitals into long-term care facilities, such as nursing homes. Many of these patients remain in the hospital even after they are ready to be discharged. Meanwhile, both hospital staff and care facility operators devote significant resources to facilitating placement, for example by placing calls to refresh outdated placement data, in order to mitigate the significant operating costs of an occupied hospital bed or a vacant facility bed. I present my work supporting the placement of over 200 patients into long-term care facilities in Hawai’i.

    First, I will present work deploying a conversational SMS agent to address hospital staff data needs across 1,047 care facilities – showing that aligning with existing workflows and improving the accuracy and timeliness of data can provide value and improve outcomes, even absent sophisticated algorithmic recommendations. Second, I will present an experiment to better understand care homes’ preferences with respect to patients, and how on-the-ground factors shape interest in a patient placement. Lastly, I will present ongoing work on the design of levers within a dashboard used by hospital staff to improve the timeliness of data and, in turn, patient placement outcomes.

  • Due
    10/01 23:59 ET
    Tuesday
    Homework #2 due

  • Lecture
    10/02
    Wednesday
    Special Lecture (by Gabriel, Sidhika, Zhi) -- Heterogeneous reporting in NYC311 systems

    Abstract of Gabriel’s Talk: Decision-makers often observe the occurrence of events through a reporting process. City governments, for example, rely on resident reports to find and then resolve urban infrastructural problems such as fallen street trees, flooded basements, or rat infestations. Without additional assumptions, there is no way to distinguish events that occur but are not reported from events that truly did not occur. Because disparities in reporting rates correlate with resident demographics, addressing incidents only on the basis of reports leads to systematic neglect in neighborhoods that are less likely to report events.

    We show how to overcome this challenge in three different settings and estimate reporting rates. First, we leverage the fact that events are spatially correlated and propose a latent variable Bayesian model. Second, we propose a method to fit graph neural networks with both sparsely observed, unbiased data and densely observed, biased data. And third, we incorporate the report delay and the possibility of multiple reports in a Poisson Bayesian model.

    Our work lays the groundwork for more equitable proactive government services, even with disparate reporting behavior.
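
    As a loose illustration of the first setting (a method-of-moments toy of my own, far simpler than the latent variable Bayesian model described above): if two adjacent sites share the same latent event and report it independently, the co-report pattern separates the occurrence rate from the reporting rate. All rates below are assumptions.

    ```python
    # Toy separation of occurrence vs. reporting via spatial correlation
    # (illustrative assumptions throughout).
    import numpy as np

    rng = np.random.default_rng(2)
    p_true, r_true, n = 0.3, 0.6, 100_000  # occurrence / reporting rates
    z = rng.random(n) < p_true             # latent event shared by a site pair
    rep1 = z & (rng.random(n) < r_true)    # report observed at site 1
    rep2 = z & (rng.random(n) < r_true)    # report observed at site 2

    both = np.mean(rep1 & rep2)            # ~ p * r^2
    one = np.mean(rep1 ^ rep2)             # ~ 2 * p * r * (1 - r)

    r_hat = 2 * both / (one + 2 * both)    # solve the two moment equations
    p_hat = both / r_hat**2
    print(f"reporting rate: {r_hat:.3f} (true {r_true})")
    print(f"occurrence rate: {p_hat:.3f} (true {p_true})")
    ```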

    Abstract of Sidhika’s Talk: Graph neural networks (GNNs) are widely used to make predictions on graph-structured data in urban spatiotemporal forecasting applications, such as predicting infrastructure problems and weather events. In urban settings, nodes have a true latent state (e.g., street condition) that is sparsely observed (e.g., via government inspection ratings). We more frequently observe biased proxies for the latent state (e.g., via crowdsourced reports) that correlate with resident demographics. We introduce a GNN-based model that uses both unbiased rating data and biased reporting data to predict the true latent state. We show that our approach can both recover the latent state at each node and quantify the reporting biases. We apply our model to a case study of urban incidents using reporting data from New York City 311 complaints across 141 complaint types and rating data from government inspections. We show (i) that our model’s predicted latent states correlate more strongly with the ground truth than those of prior work, which trains models only on the biased reporting data, (ii) that our model’s inferred reporting biases capture known demographic biases, and (iii) that our model’s learned ratings capture correlations across locations and between complaint types. Especially in urban crowdsourcing applications, our analysis reveals a widely applicable approach for using GNNs and sparse ground truth data to estimate latent states.

    Abstract of Zhi’s Talk: Modern city governance relies heavily on crowdsourcing to identify problems such as downed trees and power lines. A major concern is that residents do not report problems at the same rates, with heterogeneous reporting delays directly translating to downstream disparities in how quickly incidents can be addressed. Here, we develop a method to identify reporting delays without using external ground-truth data. Our insight is that the rate of duplicate reports about the same incident can be leveraged to disentangle whether an incident has occurred from how quickly it is reported once it has occurred. We apply our method to over 100,000 resident reports made in New York City and to over 900,000 reports made in Chicago, finding that there are substantial spatial disparities in how quickly incidents are reported. We further validate our methods using external data and demonstrate how estimating reporting delays leads to practical insights and interventions for a more equitable, efficient government service.
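
    A toy version of the duplicate-reports insight (my own simplification, which ignores window censoring and heterogeneity): if reports about an open incident arrive as a Poisson process, the observable gaps between duplicate reports identify the reporting rate, which in turn implies the unobserved delay from occurrence to first report.

    ```python
    # Toy estimate of reporting delay from duplicate-report gaps
    # (illustrative assumptions throughout).
    import numpy as np

    rng = np.random.default_rng(1)
    lam_true = 0.5                       # reports/day while incident is open
    n_incidents, horizon = 2000, 30.0    # observation window (days)

    gaps, first_delays = [], []          # first_delays kept only to validate
    for _ in range(n_incidents):
        t, times = 0.0, []
        while True:
            t += rng.exponential(1 / lam_true)
            if t > horizon:
                break
            times.append(t)
        if times:
            first_delays.append(times[0])
        gaps.extend(np.diff(times))      # observable duplicate inter-arrivals

    lam_hat = 1 / np.mean(gaps)          # rate from duplicate gaps alone
    print(f"estimated rate: {lam_hat:.3f} (true {lam_true})")
    print(f"implied mean delay to first report: {1 / lam_hat:.2f} days")
    print(f"actual mean delay: {np.mean(first_delays):.2f} days")
    ```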

  • Lecture
    10/07
    Monday
    Guest Lecture -- Allison Koenecke

    Title: Equitable Decision-Making in Public Resource Allocation

    Abstract: Algorithmically guided decisions are becoming increasingly prevalent and, if left unchecked, can amplify pre-existing societal biases. In this talk, I audit the equity of decision-making in two public resource allocation settings. First, I present a methodological framework for online advertisers to determine a demographically equitable allocation of individuals being shown ads for SNAP (food stamp) benefits – specifically, considering budget-constrained trade-offs between ad conversions for English-speaking and Spanish-speaking SNAP applicants. Second, I discuss sensitivity analyses on public funding allocation algorithms such as CalEnviroScreen, an algorithm used to promote environmental justice by aiding disadvantaged census tracts – which we find to encode bias against tracts with high immigrant populations. In both case studies, we will discuss methods to mitigate allocative harm and to foster equitable outcomes using accountability mechanisms.

    Required reading: CalEnviroScreen

    Optional reading: SNAP

    Bio: Allison Koenecke is an Assistant Professor of Information Science at Cornell University. Her research on algorithmic fairness applies computational methods, such as machine learning and causal inference, to study societal inequities in domains from online services to public health. Koenecke is regularly quoted as an expert on disparities in automated speech-to-text systems. She was previously a postdoctoral researcher at Microsoft Research and received her PhD from Stanford’s Institute for Computational and Mathematical Engineering. Her awards include the NSF Graduate Research Fellowship and Forbes 30 Under 30 in Science.

  • Assignment
    10/07
    Monday
    Homework #3 - Dynamic and Personalized Pricing released!
  • Lecture
    10/09
    Wednesday
    Guest Lecture -- Aaron Schein
  • Lecture
    10/16
    Wednesday
    Lecture 10 - Algorithmic Pricing complications 2
  • Lecture
    10/21
    Monday
    Lecture 11 - Algorithmic Pricing practice -- ride-hailing
  • Due
    10/22 23:59 ET
    Tuesday
    Homework #3 due
  • Assignment
    10/24
    Thursday
    Homework #4 - Experimentation released!
  • Lecture
    10/28
    Monday
    Lecture 12 - Experimentation -- Introduction
  • Lecture
    10/30
    Wednesday
    Lecture 13 - Experimentation -- Peeking and Interference
  • Lecture
    11/04
    Monday
    Guest Lecture -- Max Nickel

    Content: I’m planning to cover the basics of model validation, how its assumptions are often violated in recommender systems, and what to do about it, with some philosophy sprinkled in via Hume’s problem of induction and how it arises in model validation.
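
    A minimal sketch of the kind of assumption violation at stake (my own example, not from the talk): when relationships in the data drift over time, a random train/test split, which satisfies the i.i.d. assumption by construction, reports much lower error than the forward-in-time evaluation a deployed system actually faces.

    ```python
    # Random vs. temporal validation under drift (illustrative example).
    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000
    t = np.arange(n) / n                                 # observation "time"
    x = rng.normal(size=n)
    y = (1 + 3 * t) * x + rng.normal(scale=0.1, size=n)  # drifting relationship

    def fit_and_mse(train, test):
        w = np.polyfit(x[train], y[train], 1)            # simple linear model
        return np.mean((y[test] - np.polyval(w, x[test])) ** 2)

    perm = rng.permutation(n)                            # i.i.d.-style split
    print("random-split MSE:  ", fit_and_mse(perm[:800], perm[800:]))

    idx = np.arange(n)                                   # train past, test future
    print("temporal-split MSE:", fit_and_mse(idx[:800], idx[800:]))
    ```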

    Bio: Maximilian Nickel is a research scientist manager at FAIR, Meta AI, where he leads the AI & Society team and has also served as a research area lead for Machine Learning and Society & Responsible AI. In 2023, Max served as a Program Chair for ICLR. Before joining FAIR, Max was a postdoctoral fellow at MIT, affiliated with the Laboratory for Computational and Statistical Learning and the Center for Brains, Minds and Machines. He received his PhD summa cum laude from the Ludwig Maximilian University Munich while working as a research assistant at Siemens Corporate Technology. Max’s research focuses on understanding the interplay of AI and social systems; to that end, he combines machine learning theory and methods with complex systems theory, including networks, dynamics, and emergence. Max aims to establish the theoretical and methodological foundations necessary for AI to interact positively with society, and to obtain results that have a direct impact on AI practice, methods, and governance.

    Papers:

    • Shai Shalev-Shwartz and Shai Ben-David: Understanding Machine Learning: From Theory to Algorithms, Chapter 11 Model Selection and Validation (available for free here: https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/)
    • Maximilian Nickel: No free delivery service - Epistemic limits of passive data collection in complex social systems (NeurIPS 2024)
    • Maximilian Nickel, Kevin Murphy, Volker Tresp, Evgeniy Gabrilovich: A Review of Relational Machine Learning for Knowledge Graphs, until Section IV
    • Nathan Srebro, Russ R. Salakhutdinov: Collaborative Filtering in a Non-Uniform World: Learning with the Weighted Trace Norm (NeurIPS 2010)
  • Due
    11/05 23:59 ET
    Tuesday
    Homework #4 due
  • Lecture
    11/06
    Wednesday
    Lecture 14 - Experimentation in marketplaces
  • Lecture
    11/11
    Monday
    Lecture 15 - Project description and game theory; Synthetic control
  • Lecture
    11/13
    Wednesday
    Lecture 16 - Discrimination in Platforms
  • Lecture
    11/18
    Monday
    Guest Lecture -- Keyon Vafa

    Title: The LLM Conundrum: Evaluation

    Abstract: Large language models (LLMs) present an evaluation challenge: a model can simultaneously achieve impressive metrics and produce baffling responses when used in practice. How can we trust metrics that don’t capture this behavior? In this talk, I’ll present two alternative approaches for evaluating LLMs. In the first, we develop metrics for assessing whether LLMs learn coherent world models, finding that even in simple domains like navigation, models fail to capture underlying structure despite strong predictive performance. In the second, we evaluate LLMs by incorporating behavioral models of how people decide to use them, finding that misaligned expectations about model capabilities can lead to systematic deployment errors.

    Papers:

    • Evaluating the World Model Implicit in a Generative Model (https://arxiv.org/abs/2406.03689)
    • Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function (https://arxiv.org/abs/2406.01382)

    Bio: Keyon Vafa is a postdoctoral fellow at the Harvard Data Science Initiative. His research focuses on developing ML methods to address economic questions and on using insights from the behavioral sciences to improve ML methods. Keyon completed his PhD in computer science at Columbia University under David Blei, where he was an NSF GRFP Fellow, a Cheung-Kong Innovation Doctoral Fellow, and the recipient of the Morton B. Friedman Memorial Prize for excellence in engineering. During his PhD, he launched and co-organized the ML-NYC speaker series. He is a member of the Early Career Board of the Harvard Data Science Review.

  • Lecture
    11/20
    Wednesday
    Lecture 17 - Differential Privacy
  • Lecture
    11/25
    Monday
    Guest Lecture -- Arvind Narayanan

    Title: Against Predictive Optimization

  • Lecture
    12/02
    Monday
    Lecture 18 - Monoculture in AI
  • Lecture
    12/04
    Wednesday
    Project work day
  • Lecture
    12/09
    Monday
    Lecture 19 - Course conclusion