You can download the lectures here. We will try to upload lectures before their corresponding classes. Initial versions of the slides (from previous years) may be updated in the days preceding the lecture.

  • Lecture 1 - Course Introduction
    tl;dr: Course introduction.
    [slides]
  • Lecture 2 - Data challenges
    tl;dr: Common challenges in data collection.
    [slides]
  • NO CLASS -- Labor Day
  • Lecture 4 - Weighting 2
    tl;dr: More on weighting; uncertainty quantification with selection on unknown covariates
    [slides]
  • Lecture 6 - Recommendations introduction
    tl;dr: Introduction to Recommendations, including data challenges and collaborative filtering
    [slides]
  • Lecture 7 - Recommendations, from predictions to decisions
    tl;dr: From predicting ratings to making decisions: capacity constraints and multi-sided recommendations
    [slides]
  • Lecture 8 - Algorithmic Pricing basics
    tl;dr: Introduction to algorithmic pricing
    [slides]
  • Lecture 9 - Algorithmic Pricing complications
    tl;dr: Algorithmic pricing: capacity, price differentiation, and competition
    [slides]
  • Special Lecture (by Kenny, Sophie, and Vince) -- Recommendations and matching
    tl;dr: Kenny, Sophie, and Vince on their research on recommendation systems and matching

    Abstract of Kenny’s Talk: In recommendation settings, there is an apparent trade-off between the goals of accuracy (to recommend items a user is most likely to want) and diversity (to recommend items representing a range of categories). As such, real-world recommender systems often explicitly incorporate diversity separately from accuracy. This approach, however, leaves a basic question unanswered: Why is there a trade-off in the first place? We show how the trade-off can be explained via a user’s consumption constraints—users typically only consume a few of the items they are recommended. In a stylized model we introduce, objectives that account for this constraint induce diverse recommendations, while objectives that do not account for this constraint induce homogeneous recommendations. This suggests that accuracy and diversity appear misaligned because standard accuracy metrics do not consider consumption constraints.
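
    To make the consumption-constraint intuition concrete, here is a minimal sketch (our own toy setup with made-up categories and probabilities, not the model from the talk): a user likes an item only if it matches their one latent taste category, and they consume at most one recommended item. Maximizing the expected number of liked items fills the slate from the single most probable category, while maximizing the chance that at least one item is liked spreads the slate across categories.

    ```python
    import itertools

    # Stylized setup (made-up numbers): a user has one latent taste category
    # and likes an item iff it matches that category. Prior over the category:
    category_prob = {"news": 0.5, "sports": 0.3, "music": 0.2}

    # Ten interchangeable candidate items per category; slates have K items.
    items = [(cat, i) for cat in category_prob for i in range(10)]
    K = 3

    def expected_likes(slate):
        # "Accuracy" objective: expected number of liked items on the slate.
        return sum(category_prob[cat] for cat, _ in slate)

    def prob_at_least_one(slate):
        # Consumption-aware objective: the user consumes at most one item, so
        # what matters is whether *any* slate item matches their category.
        return sum(category_prob[c] for c in {cat for cat, _ in slate})

    best_accuracy = max(itertools.combinations(items, K), key=expected_likes)
    best_consumption = max(itertools.combinations(items, K), key=prob_at_least_one)

    print([cat for cat, _ in best_accuracy])     # homogeneous: all 'news'
    print([cat for cat, _ in best_consumption])  # diverse: one per category
    ```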

    Abstract of Sophie’s Talk: In the basic recommendation paradigm, the most relevant item is recommended to each user. This may result in some items receiving lower exposure than they “should”; to counter this, several algorithmic approaches have been developed to ensure item fairness. These approaches necessarily degrade recommendations for some users to improve outcomes for items, leading to user fairness concerns. In turn, a recent line of work has focused on developing algorithms for multi-sided fairness, to jointly optimize user fairness, item fairness, and overall recommendation quality.

    In this talk, I will introduce item fairness and multi-sided fairness in recommendation. Then, I will discuss our recent work addressing the question: what is the trade-off between user fairness and item fairness, and what are the characteristics of (multi-objective) optimal solutions? In this work, we develop a theoretical model of recommendations with user and item fairness objectives and characterize the solutions of fairness-constrained optimization. We identify two phenomena: (a) when user preferences are diverse, there is “free” item and user fairness; and (b) users whose preferences are misestimated can be especially disadvantaged by item fairness constraints. Empirically, we build a recommendation system for preprints on arXiv and implement our framework, measuring the phenomena in practice and showing how these phenomena inform the design of markets in which matching is intermediated by recommendation systems.
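
    As a toy illustration of the user-fairness cost of item-fairness constraints (a brute-force instance with invented relevance scores, not the paper's model), the sketch below compares the welfare-maximizing assignment with one that guarantees every item at least one exposure, and shows that the burden falls unevenly across users:

    ```python
    import itertools

    # Estimated user-item relevance (rows: users, cols: items); invented
    # numbers in which item 0 is every user's estimated favorite.
    V = [[0.9, 0.5, 0.2],
         [0.8, 0.4, 0.3],
         [0.7, 0.6, 0.1]]
    users, items = range(3), range(3)

    def welfare(assignment):  # assignment[u] = the single item shown to user u
        return sum(V[u][assignment[u]] for u in users)

    def item_fair(assignment, floor=1):
        # Item-fairness constraint: every item gets at least `floor` exposures.
        return all(assignment.count(i) >= floor for i in items)

    slates = list(itertools.product(items, repeat=3))
    unconstrained = max(slates, key=welfare)
    fair = max(filter(item_fair, slates), key=welfare)

    print(unconstrained, round(welfare(unconstrained), 2))  # (0, 0, 0) 2.4
    print(fair, round(welfare(fair), 2))                    # (0, 2, 1) 1.8
    for u in users:  # the cost of item fairness is spread unevenly over users
        print(f"user {u} loses {V[u][unconstrained[u]] - V[u][fair[u]]:.1f}")
    ```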

    Abstract of Vince’s Talk: Every year in the US, over 3 million older adults are discharged from hospitals into long-term care facilities, like nursing homes. Many of these patients remain in the hospital even after they are ready to be discharged. Meanwhile, both hospital staff and care facility operators devote significant resources to facilitating placement, for example by placing calls to refresh outdated placement data, in order to mitigate the significant operating costs of an occupied hospital bed or a vacant facility bed. I present my work supporting the placement of over 200 patients into long-term care facilities in Hawai’i.

    First, I will present work deploying a conversational SMS agent with 1,047 care facilities to address hospital staff's data needs, showing that aligning with existing workflows and improving the accuracy and timeliness of data can provide value and improve outcomes, even absent sophisticated algorithmic recommendations. Second, I will present an experiment to better understand care homes' preferences over patients, and how on-the-ground factors shape their interest in a patient placement. Lastly, I will present ongoing work on the design of levers within a dashboard used by hospital staff to improve the timeliness of data and, in turn, patient placement outcomes.

  • Special Lecture (by Gabriel, Sidhika, and Zhi) -- Heterogeneous reporting in NYC311 systems
    tl;dr: Gabriel, Sidhika, and Zhi on their research on heterogeneous reporting in 311 systems

    Abstract of Gabriel’s Talk: Decision-makers often observe the occurrence of events through a reporting process. City governments, for example, rely on resident reports to find and then resolve urban infrastructural problems such as fallen street trees, flooded basements, or rat infestations. Without additional assumptions, there is no way to distinguish events that occur but are not reported from events that truly did not occur. Because reporting rates correlate with resident demographics, addressing incidents only on the basis of reports leads to systematic neglect in neighborhoods that are less likely to report events.

    We show how to overcome this challenge in three different settings and estimate reporting rates. First, we leverage the fact that events are spatially correlated and propose a latent variable Bayesian model. Second, we propose a method to fit graph neural networks with both sparsely observed, unbiased data and densely observed, biased data. And third, we incorporate the report delay and the possibility of multiple reports in a Poisson Bayesian model.

    Our work lays the groundwork for more equitable proactive government services, even with disparate reporting behavior.
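
    A minimal sketch of the first idea, spatial correlation (two areas and made-up numbers, not the talk's actual model): a report from one area shifts the posterior toward an event having occurred, but gone unreported, in its under-reporting neighbor.

    ```python
    # Two neighboring areas with positively correlated latent events; each
    # area reports an occurred event with its own rate (all numbers invented).
    prior = {(0, 0): 0.55, (0, 1): 0.10, (1, 0): 0.10, (1, 1): 0.25}
    report_rate = [0.8, 0.2]  # area B systematically under-reports

    def p_reports_given_events(reports, events):
        p = 1.0
        for rep, ev, rate in zip(reports, events, report_rate):
            if ev:               # an occurred event is reported w.p. `rate`
                p *= rate if rep else 1 - rate
            else:                # no event, no report (no false alarms)
                p *= 0.0 if rep else 1.0
        return p

    observed = (1, 0)            # a report from A, silence from B
    post = {ev: pr * p_reports_given_events(observed, ev)
            for ev, pr in prior.items()}
    p_event_in_B = sum(v for ev, v in post.items() if ev[1]) / sum(post.values())
    print(round(p_event_in_B, 3))  # ~0.667, far above B's prior of 0.35,
                                   # even though B filed no report at all
    ```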

    Abstract of Sidhika’s Talk: Graph neural networks (GNNs) are widely used to make predictions on graph-structured data in urban spatiotemporal forecasting applications, such as predicting infrastructure problems and weather events. In urban settings, nodes have a true latent state (e.g., street condition) that is sparsely observed (e.g., via government inspection ratings). We more frequently observe biased proxies for the latent state (e.g., via crowdsourced reports) that correlate with resident demographics. We introduce a GNN-based model that uses both unbiased rating data and biased reporting data to predict the true latent state. We show that our approach can both recover the latent state at each node and quantify the reporting biases. We apply our model to a case study of urban incidents, using reporting data from New York City 311 complaints across 141 complaint types and rating data from government inspections. We show (i) that our model's predicted latent states correlate better with the ground truth than those of prior work, which trains only on the biased reporting data; (ii) that our model's inferred reporting biases capture known demographic biases; and (iii) that our model's learned ratings capture correlations across locations and between complaint types. Our analysis demonstrates a widely applicable approach for using GNNs and sparse ground-truth data to estimate latent states, especially in urban crowdsourcing applications.
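
    The following minimal PyTorch sketch (a toy graph and our own parameterization, not the paper's architecture) illustrates the two-signal training idea: fit a latent state from dense but demographically biased reports together with sparse, unbiased ratings, and read off the inferred reporting bias.

    ```python
    import torch

    torch.manual_seed(0)

    # Toy path graph over 6 nodes: row-normalized adjacency with self-loops.
    A = torch.eye(6) + torch.diag(torch.ones(5), 1) + torch.diag(torch.ones(5), -1)
    A = A / A.sum(1, keepdim=True)

    X = torch.randn(6, 4)                          # node features
    demo = torch.tensor([1., 1., 1., 0., 0., 0.])  # demographic covariate

    # Ground truth, hidden from the model except at two "inspected" nodes:
    s_true = torch.tensor([2., 1.5, 1., 1., 1.5, 2.])
    rate_true = 0.3 + 0.6 * demo                   # group-level reporting bias
    reports = s_true * rate_true                   # dense but biased signal
    rated = torch.tensor([0, 5])                   # sparse unbiased ratings

    # One-layer "GNN" (propagate, then linear readout) plus a logistic
    # reporting-rate head driven by the demographic covariate.
    W = torch.randn(4, 1, requires_grad=True)
    a = torch.zeros(2, requires_grad=True)
    opt = torch.optim.Adam([W, a], lr=0.05)
    for _ in range(2000):
        s_hat = (A @ X @ W).squeeze()              # predicted latent state
        rate_hat = torch.sigmoid(a[0] * demo + a[1])
        loss = ((s_hat * rate_hat - reports) ** 2).mean() \
             + ((s_hat[rated] - s_true[rated]) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

    print(rate_hat.detach())  # inferred reporting rates; compare rate_true
    print(s_hat.detach())     # latent-state estimates at *every* node
    ```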

    Abstract of Zhi’s Talk: Modern city governance relies heavily on crowdsourcing to identify problems such as downed trees and power lines. A major concern is that residents do not report problems at the same rates, and heterogeneous reporting delays translate directly into downstream disparities in how quickly incidents can be addressed. Here, we develop a method to identify reporting delays without using external ground-truth data. Our insight is that duplicate reports about the same incident can be leveraged to disentangle whether an incident has occurred from the rate at which it is reported once it has occurred. We apply our method to over 100,000 resident reports made in New York City and over 900,000 reports made in Chicago, finding substantial spatial disparities in how quickly incidents are reported. We further validate our methods using external data and demonstrate how estimating reporting delays leads to practical insights and interventions for more equitable, efficient government services.
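
    A simplified sketch of the duplicate-reports idea (simulated data; censoring at resolution time is ignored for simplicity): once an incident occurs, reports about it arrive as a Poisson process with rate lam, so the gaps between duplicate reports identify lam, and 1/lam gives the expected delay to the first report, all without ever observing occurrence times.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def simulate_reports(lam, open_for=200.0):
        """Report times for one incident: Poisson arrivals at rate `lam`
        (per day, say) from occurrence until the incident is resolved."""
        t, times = 0.0, []
        while True:
            t += rng.exponential(1 / lam)
            if t > open_for:
                return times
            times.append(t)

    for area, lam in [("high-reporting area", 1.0), ("low-reporting area", 0.2)]:
        gaps = []
        for _ in range(500):
            gaps.extend(np.diff(simulate_reports(lam)))
        lam_hat = 1 / np.mean(gaps)   # rate identified from duplicates alone
        print(f"{area}: lam_hat = {lam_hat:.2f}, "
              f"implied mean delay to first report = {1 / lam_hat:.1f} days")
    ```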

  • Guest Lecture -- Allison Koenecke
    tl;dr: Allison Koenecke (Cornell).

    Title: Equitable Decision-Making in Public Resource Allocation

    Abstract: Algorithmically guided decisions are becoming increasingly prevalent and, if left unchecked, can amplify pre-existing societal biases. In this talk, I audit the equity of decision-making in two public resource allocation settings. First, I present a methodological framework for online advertisers to determine a demographically equitable allocation of individuals being shown ads for SNAP (food stamp) benefits – specifically, considering budget-constrained trade-offs between ad conversions for English-speaking and Spanish-speaking SNAP applicants. Second, I discuss sensitivity analyses on public funding allocation algorithms such as CalEnviroScreen, an algorithm used to promote environmental justice by aiding disadvantaged census tracts – which we find to encode bias against tracts with high immigrant populations. In both case studies, we will discuss methods to mitigate allocative harm and to foster equitable outcomes using accountability mechanisms.
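
    To make the first trade-off concrete, here is a hypothetical sketch (invented costs and conversion rates, not the talk's data): splitting an ad budget between English- and Spanish-language campaigns, with and without a parity constraint on per-group conversions.

    ```python
    # Per-dollar conversion rates: ad cost and conversion probability differ
    # by language group (hypothetical numbers, not the talk's data).
    budget = 1000.0
    per_dollar = {"english": 0.04 / 0.50, "spanish": 0.05 / 0.80}

    def conversions(spend_english):
        return (per_dollar["english"] * spend_english,
                per_dollar["spanish"] * (budget - spend_english))

    splits = [s / 10 for s in range(10001)]  # grid over the budget split
    best = max(splits, key=lambda s: sum(conversions(s)))
    fair = max((s for s in splits
                if abs(conversions(s)[0] - conversions(s)[1]) < 0.05),
               key=lambda s: sum(conversions(s)))

    for label, s in [("unconstrained", best), ("parity-constrained", fair)]:
        eng, spa = conversions(s)
        print(f"{label}: spend {s:.0f}/{budget - s:.0f}, "
              f"conversions {eng:.1f} (EN) / {spa:.1f} (ES)")
    ```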

    Required reading: CalEnviroScreen

    Optional reading: SNAP

    Bio: Allison Koenecke is an Assistant Professor of Information Science at Cornell University. Her research on algorithmic fairness applies computational methods, such as machine learning and causal inference, to study societal inequities in domains from online services to public health. Koenecke is regularly quoted as an expert on disparities in automated speech-to-text systems. She previously held a postdoctoral researcher role at Microsoft Research and received her PhD from Stanford’s Institute for Computational and Mathematical Engineering. Her awards include the NSF Graduate Research Fellowship and Forbes 30 Under 30 in Science.

  • Guest Lecture -- Aaron Schein
    tl;dr: Aaron Schein (UChicago).

  • Lecture 10 - Algorithmic Pricing complications 2
    tl;dr: Algorithmic pricing: capacity, price differentiation, and competition
    [slides]
  • Lecture 11 - Algorithmic Pricing practice -- ride-hailing
    tl;dr: Algorithmic pricing: ride-hailing case study
    [slides]
  • Lecture 13 - Experimentation -- Peeking and Interference
    tl;dr: Experimentation challenges in online marketplaces: peeking and interference
    [slides]
  • Guest Lecture -- Max Nickel
    tl;dr: Max Nickel (Meta AI Research).

    Content: I’m planning to cover the basics of model validation, how its assumptions are often violated in recommender systems, and what to do about it. I’ll also sprinkle in some philosophy via Hume’s problem of induction and how it arises in model validation.
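
    As a minimal illustration of the kind of violation described above (our own toy construction, not Max's example): when interactions are logged missing-not-at-random, a random split of the log validates a model against the same biased distribution it was trained on, overstating deployment performance.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # True ratings: additive user and item effects (a toy construction).
    n_users, n_items = 500, 50
    true = rng.normal(size=(n_users, 1)) + rng.normal(size=(1, n_items))

    # Missing-not-at-random logging: liked items are far more likely rated.
    p_obs = 0.3 / (1 + np.exp(-2 * true))
    logged = rng.random(true.shape) < p_obs
    train = logged & (rng.random(true.shape) < 0.5)
    test = logged & ~train                      # standard random log split

    # "Model": predict each item's mean rating as seen in the training log.
    pred = np.broadcast_to(np.nanmean(np.where(train, true, np.nan), axis=0),
                           true.shape)

    mse_heldout = np.mean((pred - true)[test] ** 2)  # what validation reports
    mse_deploy = np.mean((pred - true) ** 2)         # uniform over all pairs
    print(f"held-out logged MSE: {mse_heldout:.2f}")
    print(f"deployment MSE:      {mse_deploy:.2f}")  # notably worse
    ```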

    Bio: Maximilian Nickel is a research scientist manager at FAIR, Meta AI, where he leads the AI & Society team and has also served as a research area lead for Machine Learning and Society & Responsible AI. In 2023, Max served as a Program Chair for ICLR. Before joining FAIR, Max was a postdoctoral fellow at MIT, where he was with the Laboratory for Computational and Statistical Learning and the Center for Brains, Minds and Machines. He received his PhD summa cum laude from Ludwig Maximilian University Munich while working as a research assistant at Siemens Corporate Technology. Max’s research focuses on understanding the interplay of AI and social systems. For this purpose, he combines machine learning theory and methods with complex systems theory, including networks, dynamics, and emergence. Max aims to establish the theoretical and methodological foundations necessary for AI to interact positively with society, and to obtain results that have a direct impact on AI practice, methods, and governance.

    Papers:

    • Shai Shalev-Shwartz and Shai Ben-David: Understanding Machine Learning: From Theory to Algorithms, Chapter 11 Model Selection and Validation (available for free here: https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/)
    • Maximilian Nickel: No free delivery service - Epistemic limits of passive data collection in complex social systems (NeurIPS 2024)
    • Maximilian Nickel, Kevin Murphy, Volker Tresp, Evgeniy Gabrilovich: A Review of Relational Machine Learning for Knowledge Graphs, through Section IV
    • Nathan Srebro, Russ R. Salakhutdinov: Collaborative Filtering in a Non-Uniform World: Learning with the Weighted Trace Norm (NeurIPS 2010)
  • Lecture 14 - Experimentation in marketplaces
    tl;dr: Experimentation in marketplaces: interference, spatial randomization, and switchbacks
    [slides]
  • Lecture 15 - Project description and game theory; Synthetic control
    tl;dr: How to succeed in the class project; experimentation conclusion: synthetic control, culture, and communication
    [project_slides] [experiment_slides]
  • Lecture 16 - Discrimination in Platforms
    tl;dr: Discrimination in ad platforms and two-sided marketplaces; some design mitigations
    [slides]
  • Guest Lecture -- Keyon Vafa
    tl;dr: Keyon Vafa (Harvard).

    Title: The LLM Conundrum: Evaluation

    Abstract: Large language models (LLMs) present an evaluation challenge: a model can simultaneously achieve impressive metrics and produce baffling responses when used in practice. How can we trust metrics that don’t capture this behavior? In this talk, I’ll present two alternative approaches for evaluating LLMs. In the first, we develop metrics for assessing whether LLMs learn coherent world models, finding that even in simple domains like navigation, models fail to capture underlying structure despite strong predictive performance. In the second, we evaluate LLMs by incorporating behavioral models of how people decide to use them, finding that misaligned expectations about model capabilities can lead to systematic deployment errors.
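
    A toy version of the first evaluation idea (a deliberately simple, destination-blind Markov "model" on a six-node graph; our construction rather than the paper's setup): every sampled step is a legal move, yet full generated routes often fail to reach the goal.

    ```python
    import random

    random.seed(0)

    # A six-node ring of intersections; legal moves only along edges.
    edges = {0: [1, 3], 1: [0, 2], 2: [1, 5], 3: [0, 4], 4: [3, 5], 5: [2, 4]}

    def true_route(start, goal):            # "training data": valid routes
        path, node = [start], start
        while node != goal:
            node = random.choice(edges[node])
            path.append(node)
        return path

    # The stand-in "model": P(next | current), fit ignoring the goal.
    counts = {}
    for _ in range(2000):
        p = true_route(random.randrange(6), random.randrange(6))
        for a, b in zip(p, p[1:]):
            counts.setdefault(a, []).append(b)

    def model_step(node):
        return random.choice(counts[node])

    # (a) Token-level check: every sampled step is a legal move.
    legal = all(model_step(n) in edges[n] for n in edges for _ in range(100))
    print("all sampled steps legal:", legal)              # True

    # (b) World-model check: do *generated* routes actually reach the goal?
    def route_succeeds(start, goal, max_len=8):
        node = start
        for _ in range(max_len):
            if node == goal:
                return True
            node = model_step(node)
        return node == goal

    wins = sum(route_succeeds(0, 5) for _ in range(1000))
    print(f"generated routes reaching the goal: {wins / 1000:.0%}")  # well below 100%
    ```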

    Papers:

    • Evaluating the World Model Implicit in a Generative Model (https://arxiv.org/abs/2406.03689)
    • Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function (https://arxiv.org/abs/2406.01382)

    Bio: Keyon Vafa is a postdoctoral fellow at the Harvard Data Science Initiative. His research focuses on developing ML methods to address economic questions and on using insights from the behavioral sciences to improve ML methods. Keyon completed his PhD in computer science at Columbia University under David Blei, where he was an NSF GRFP Fellow, a Cheung-Kong Innovation Doctoral Fellow, and the recipient of the Morton B. Friedman Memorial Prize for excellence in engineering. During his PhD, he launched and co-organized the ML-NYC speaker series. He is a member of the Early Career Board of the Harvard Data Science Review.

  • Guest Lecture -- Arvind Narayanan
    tl;dr: Arvind Narayanan (Princeton).

    Title: Against Predictive Optimization

  • Lecture 18 - Monoculture in AI
    tl;dr: Introduction to algorithmic monoculture
    [slides]
  • Project work day
    tl;dr: In-person project work day

  • Lecture 20 - Course conclusion
    tl;dr: Course summary and discussion