Starting with Homework 2, the programming portion of each homework will build toward a final project, which will be an algorithmic recommendation and pricing competition. While the homeworks should be completed independently, the project may be done in groups of up to 3.

See here for a living document with the project instructions. We will post an EdStem announcement whenever there are major changes/updates to the document.

See here for a strategy document with suggestions on how to complete each part of the project.

We recognize that students enter the course with widely varying programming and data science experience, and that the tournament is partially luck-driven. Thus, project grading will not be based primarily on your performance in the tournament – rather, most of the project grade will be determined by effort and by how soundly you apply course concepts, as discussed in your report. Exceptional performance on the class project will warrant an A+ in the course.


  1. Part 1 submission form: Please submit a CSV of your prices for each customer in the test set. The CSV should have four columns: (1) user_index, (2) price_item_0, (3) price_item_1, and (4) expected_revenue. The first row should contain headers with the above column names, and the first row after the headers should have user_index equal to 14001. The filename should be “[your_teamname].csv”.
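The submission format above can be sketched in Python as follows. This is a minimal illustration using only the standard library, not an official course script: the helper names `write_submission` and `check_submission`, the team name `example_team`, and all numeric values in the demo are hypothetical placeholders. Only the four column names, the filename pattern, and the first user_index of 14001 come from the instructions.

```python
import csv
import tempfile
from pathlib import Path

# Column names and order as required by the submission instructions.
COLUMNS = ["user_index", "price_item_0", "price_item_1", "expected_revenue"]


def write_submission(rows, team_name, out_dir="."):
    """Write one row per test-set customer to '<team_name>.csv' (hypothetical helper)."""
    path = Path(out_dir) / f"{team_name}.csv"
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()      # first row: the four column names
        writer.writerows(rows)    # one row per customer
    return path


def check_submission(path, first_user_index=14001):
    """Check the header and the first data row; raise ValueError on a mismatch."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        if reader.fieldnames != COLUMNS:
            raise ValueError(f"unexpected header: {reader.fieldnames}")
        first = next(reader, None)
    if first is None or int(first["user_index"]) != first_user_index:
        raise ValueError("first data row must have user_index 14001")
    return True


# Demo with made-up placeholder prices and revenues (not real results).
demo_rows = [
    {"user_index": 14001, "price_item_0": 1.0, "price_item_1": 2.0, "expected_revenue": 0.5},
    {"user_index": 14002, "price_item_0": 1.5, "price_item_1": 2.5, "expected_revenue": 0.75},
]
with tempfile.TemporaryDirectory() as tmp:
    demo_path = write_submission(demo_rows, "example_team", out_dir=tmp)
    demo_ok = check_submission(demo_path)
    demo_name = demo_path.name
```

Running a check like `check_submission` before uploading may catch a misordered header or a file whose rows do not start at user_index 14001.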


  1. Part 1 training data
  2. Part 1 test data
  3. Part 1 test data, compatible loading with colaboratory (USE THIS; same data as above but in pickling format readable by Pandas version in Google Colaboratory)