EE3001 - Machine Learning (Fall 2021)

Basic Information

Lectures

All course materials will be shared via this page.

Index | Date | Topic | Lecture Notes | Homework
00 | Sept 6, 2021 | Introduction | Lec00-Introduction.pdf |
01 | Sept 8, 2021 | Linear Regression I | |
02 | Sept 13, 2021 | Linear Regression II | Lec01-LinearRegression.pdf, Lec01slides.pdf | HW01.pdf
03 | Sept 15, 2021 | Bias-Variance Decomposition | Lec02-BiasVarianceDecomposition.pdf |
04 | Sept 18, 2021 | Review of Linear Algebra | |
05 | Sept 22, 2021 | Bayesian Linear Regression I | Lec03-BayesianLinearRegression.pdf |
06 | Sept 27, 2021 | Bayesian Linear Regression II & Basics of Analysis I | |
07 | Sept 29, 2021 | Basics of Analysis II | Lec04-BasicsofAnalysis.pdf | HW02.pdf
08 | Oct 11, 2021 | Convex Sets I | |
09 | Oct 13, 2021 | Convex Sets II & Separation Theorems I | Lec05-ConvexSets.pdf |
10 | Oct 18, 2021 | Separation Theorems II | Lec06-SeparationTheorems.pdf |
11 | Oct 20, 2021 | Separation Theorems III | |
12 | Oct 25, 2021 | Convex Functions I | Lec07-ConvexFunctions.pdf | HW03.pdf
13 | Oct 27, 2021 | Convex Functions II | |
14 | Nov 1, 2021 | Subdifferentials | Lec08-Subdifferentials.pdf |
15 | Nov 3, 2021 | Convex Optimization Problems | Lec09-ConvexOptimizationProblems.pdf | HW04.pdf
16 | Nov 8, 2021 | Decision Tree | Lec10-DecisionTree.pdf |
17 | Nov 10, 2021 | Naive Bayes Classifier | Lec11-NaiveBayesClassifier.pdf |
18 | Nov 15, 2021 | Mid-term Exam | |
19 | Nov 17, 2021 | Logistic Regression | Lec12-LogisticRegression.pdf | HW05.pdf
20 | Nov 22, 2021 | Analysis of the Mid-term Exam | |
21 | Nov 24, 2021 | SVM I | Lec13-SVM I.pdf |
22 | Nov 29, 2021 | SVM II | Lec14-SVM II.pdf |
23 | Dec 1, 2021 | Neural Networks I | Lec15-NeuralNetworks.pdf |
24 | Dec 6, 2021 | Neural Networks II | Lec16-ConvolutionalNeuralNetwork.pdf | HW06.pdf
25 | Dec 8, 2021 | Principal Component Analysis | Lec17-PrincipalComponentAnalysis.pdf |
26 | Dec 13, 2021 | Reinforcement Learning I | Lec18-ElementaryRL_DeterministicEnvironment.pdf |
27 | Dec 15, 2021 | Reinforcement Learning II | Lec19-Multi-armedBandits.pdf |
28 | Dec 20, 2021 | Reinforcement Learning III | Lec20-ElementaryRL_StochasticEnvironment.pdf, Lec21-ValueIteration.pdf | HW07.pdf

Project

Description

In this project, you are expected to implement a Reinforcement Learning algorithm to teach an agent to play an Atari game, Pong.

Environment

Pong is available as a Reinforcement Learning environment in OpenAI Gym. You can install Gym by running pip install gym[atari]. For more information about this environment, see Pong.

Agent

We provide a code framework for reference, which covers creating the Pong environment and evaluating the agent. You may obtain the code here. To run the code, go into the Example directory and execute python scripts/main.py.

We define a base class called RL_alg in src/alg/RL_alg.py. Implement your algorithm in the src/alg/[your student ID] directory as a class inheriting from RL_alg. We give an example in src/alg/PB00000000, which takes a random action at each step.
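
The actual interface of RL_alg is defined in the provided framework and is not reproduced on this page. As a rough sketch only, a random-action agent in the spirit of src/alg/PB00000000 might look like the following. The stand-in base class, the method name step, the class name RandomAgent, and the action count of six are all assumptions made for illustration; check src/alg/RL_alg.py for the real signatures.

```python
import random


class RL_alg:
    """Hypothetical stand-in for the base class in src/alg/RL_alg.py.

    The real class may expose different methods; this one exists only so
    that the sketch runs on its own.
    """

    def step(self, state):
        raise NotImplementedError


class RandomAgent(RL_alg):
    """Picks a uniformly random action at every step, ignoring the state."""

    def __init__(self, n_actions=6, seed=None):
        # n_actions=6 assumes Pong's discrete action set in Gym.
        self.n_actions = n_actions
        self.rng = random.Random(seed)

    def step(self, state):
        # Return an integer action index in [0, n_actions).
        return self.rng.randrange(self.n_actions)


agent = RandomAgent(seed=0)
actions = [agent.step(state=None) for _ in range(5)]
print(actions)
```

A learning agent would keep the same outer shape and replace the body of step with a policy computed from the observed state.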

For the final submission, upload the src/alg/[your student ID] directory. TAs will use the provided code to evaluate your trained agent.

Requirements

  • Do NOT use any autograd tools or optimization tools from machine learning packages. You are supposed to implement your algorithm from scratch. For example, if you use a neural network, you are expected to implement both the forward and backward passes yourself. You may use the libraries in the WhiteList. TAs will update the WhiteList if your requests are reasonable.

  • You can work as a team with no more than three members in total. Please list the percentage of each member’s contribution in your report, e.g., {San Zhang: 30%, Si Li: 35%, Wu Wang: 35%}.

  • Send a package named [your student ID].zip to ml_homework@163.com. For team submissions, use the team leader’s student ID in the package name and have the team leader submit the package. The package should contain the [your student ID] directory, organized as follows.

    [your student ID]
      |- [your student ID]-report.pdf (your report)
      |- (your code and model)
    
  • Remember to save the trained model and send it to the aforementioned e-mail address. In the final test, you are supposed to use your trained agent to play Pong ten times and generate animations instantly.

  • Please submit a detailed report. The report should include all the details of your project, e.g., the implementation, the experimental settings, and the analysis of your results. For example, if you use deep reinforcement learning algorithms, you should clearly describe your technical approach, including image preprocessing, the structure of your neural networks, and the replay memory settings. The report should also include a learning curve plot showing the performance of your algorithm. The x-axis should correspond to the number of time steps, and the y-axis should show the mean 100-episode reward as well as the best mean reward. Game animations of the trained agent are also necessary.
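
The two learning curve quantities requested for the report can be computed from per-episode logs. The following is a minimal sketch, assuming you record the total reward and the number of time steps of every training episode. The function name learning_curve and the choice to average over however many episodes are available before the first 100 are completed are assumptions, not requirements stated by the course.

```python
def learning_curve(episode_rewards, episode_lengths, window=100):
    """Return (time_steps, mean_rewards, best_mean_rewards), one entry per episode.

    time_steps[i]        cumulative environment steps after episode i (x-axis)
    mean_rewards[i]      mean reward over the last `window` episodes up to i
    best_mean_rewards[i] highest mean_rewards value seen up to episode i
    """
    time_steps, mean_rewards, best_mean_rewards = [], [], []
    total_steps = 0
    best = float("-inf")
    for i, length in enumerate(episode_lengths):
        total_steps += length
        # Fewer than `window` episodes so far: average what is available.
        recent = episode_rewards[max(0, i + 1 - window): i + 1]
        mean = sum(recent) / len(recent)
        best = max(best, mean)
        time_steps.append(total_steps)
        mean_rewards.append(mean)
        best_mean_rewards.append(best)
    return time_steps, mean_rewards, best_mean_rewards


# Toy data with a window of 2 instead of 100, to keep the example short.
rewards = [-21, -17, -21, -21]
lengths = [800, 900, 850, 1200]
steps, means, best = learning_curve(rewards, lengths, window=2)
print(steps)  # [800, 1700, 2550, 3750]
print(means)  # [-21.0, -19.0, -19.0, -21.0]
print(best)   # [-21.0, -19.0, -19.0, -19.0]
```

Plotting means and best against steps with any plotting library on the WhiteList then gives the requested curve. The best mean reward never decreases, which makes it easy to tell the two lines apart.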

Grading

  • Total score = min(Base score (up to 20 pts) + Bonus (up to 5 pts), 20 pts).
  • The base score is determined by the average score over the ten games.
  • The bonus consists of two parts. The first part is determined by the novelty of your approach, which should be highlighted in your report. The second part is related to the readability of your code and report. Please make them easy to read.
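
As a small worked example of the grading formula, the sketch below combines a base score and a bonus and applies the 20-point cap. The function name total_score is hypothetical, and how the average game score maps to the base score is not specified on this page, so only the combination rule is shown.

```python
def total_score(base, bonus):
    """Total = min(base + bonus, 20), with base in [0, 20] and bonus in [0, 5]."""
    if not 0 <= base <= 20:
        raise ValueError("base score must be between 0 and 20")
    if not 0 <= bonus <= 5:
        raise ValueError("bonus must be between 0 and 5")
    return min(base + bonus, 20)


print(total_score(17, 2))  # 19
print(total_score(17, 5))  # 20, capped at the maximum
print(total_score(20, 3))  # 20, the bonus cannot exceed the cap
```

In other words, the bonus can only make up for points missing from the base score; it never raises the total above 20.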

System Requirements

  • We will evaluate your model on a GeForce RTX 2080 Ti (about 10 GB of memory) under Ubuntu 18.04. Please limit the size of your model to avoid out-of-memory (OOM) errors.

Hint

  • You can visit the Gym website to obtain more information about Gym, which is a widely used benchmark in reinforcement learning.
  • The following papers may be helpful to you.
    • [1] Mnih, V., Kavukcuoglu, K., Silver, D. et al. Playing Atari with Deep Reinforcement Learning. arXiv:1312.5602 (2013).
    • [2] Mnih, V., Kavukcuoglu, K., Silver, D. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015). https://doi.org/10.1038/nature14236
    • [3] van Hasselt, H., Guez, A. & Silver, D. Deep Reinforcement Learning with Double Q-Learning. Proceedings of the AAAI Conference on Artificial Intelligence 30(1) (2016).
    • [4] Wang, Z., Schaul, T., Hessel, M., van Hasselt, H., Lanctot, M. & de Freitas, N. Dueling Network Architectures for Deep Reinforcement Learning. Proceedings of the 33rd International Conference on Machine Learning, PMLR 48:1995-2003 (2016). https://proceedings.mlr.press/v48/wangf16.html

Due Day

  • Team leaders should inform the TAs of their team members before 23:59, November 21, 2021.
  • Please submit your report, code, and trained model before 23:59, January 23, 2022.
  • No late submissions will be accepted.
