Basic Information
- Instructor: Jie Wang
- Email: jiewangx@ustc.edu.cn
- Time and Location: M, W 9:45 AM - 11:20 AM (3A311)
- TAs:
- Xize Liang (xizeliang@miralab.ai)
- Runxin Liu (runxinliu@miralab.ai)
- Zijie Geng (zijiegeng@miralab.ai)
- Shuling Yang (shulingyang@miralab.ai)
Lectures
All course materials will be shared via this page.
Index | Date | Topic | Lecture Notes | Homework |
---|---|---|---|---|
00 | Sept 6, 2021 | Introduction | Lec00-Introduction.pdf | |
01 | Sept 8, 2021 | Linear Regression I | ||
02 | Sept 13, 2021 | Linear Regression II | Lec01-LinearRegression.pdf, Lec01slides.pdf | HW01.pdf |
03 | Sept 15, 2021 | Bias-Variance Decomposition | Lec02-BiasVarianceDecomposition.pdf | |
04 | Sept 18, 2021 | Review of Linear Algebra | ||
05 | Sept 22, 2021 | Bayesian Linear Regression I | Lec03-BayesianLinearRegression.pdf | |
06 | Sept 27, 2021 | Bayesian Linear Regression II & Basics of Analysis I | ||
07 | Sept 29, 2021 | Basics of Analysis II | Lec04-BasicsofAnalysis.pdf | HW02.pdf |
08 | Oct 11, 2021 | Convex Sets I | ||
09 | Oct 13, 2021 | Convex Sets II & Separation Theorems I | Lec05-ConvexSets.pdf | |
10 | Oct 18, 2021 | Separation Theorems II | Lec06-SeparationTheorems.pdf | |
11 | Oct 20, 2021 | Separation Theorems III | ||
12 | Oct 25, 2021 | Convex Functions I | Lec07-ConvexFunctions.pdf | HW03.pdf |
13 | Oct 27, 2021 | Convex Functions II | ||
14 | Nov 1, 2021 | Subdifferentials | Lec08-Subdifferentials.pdf | |
15 | Nov 3, 2021 | Convex Optimization Problems | Lec09-ConvexOptimizationProblems.pdf | HW04.pdf |
16 | Nov 8, 2021 | Decision Tree | Lec10-DecisionTree.pdf | |
17 | Nov 10, 2021 | Naive Bayes Classifier | Lec11-NaiveBayesClassifier.pdf | |
18 | Nov 15, 2021 | Mid-term Exam | ||
19 | Nov 17, 2021 | Logistic Regression | Lec12-LogisticRegression.pdf | HW05.pdf |
20 | Nov 22, 2021 | Analysis of the Mid-term Exam | ||
21 | Nov 24, 2021 | SVM I | Lec13-SVM I.pdf | |
22 | Nov 29, 2021 | SVM II | Lec14-SVM II.pdf | |
23 | Dec 1, 2021 | Neural Networks I | Lec15-NeuralNetworks.pdf | |
24 | Dec 6, 2021 | Neural Networks II | Lec16-ConvolutionalNeuralNetwork.pdf | HW06.pdf |
25 | Dec 8, 2021 | Principal Component Analysis | Lec17-PrincipalComponentAnalysis.pdf | |
26 | Dec 13, 2021 | Reinforcement Learning I | Lec18-ElementaryRL_DeterministicEnvironment.pdf | |
27 | Dec 15, 2021 | Reinforcement Learning II | Lec19-Multi-armedBandits.pdf | |
28 | Dec 20, 2021 | Reinforcement Learning III | Lec20-ElementaryRL_StochasticEnvironment.pdf, Lec21-ValueIteration.pdf | HW07.pdf |
Project
Description
In this project, you are expected to implement a Reinforcement Learning algorithm to teach an agent to play an Atari game, Pong.
Environment
Pong is provided as a reinforcement learning environment in OpenAI Gym. You can install Gym by running `pip install gym[atari]`. Moreover, you can find more information about the Pong environment in the Gym documentation.
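Gym environments expose a small `reset`/`step` interface, and the evaluation loop is the same regardless of the agent. The sketch below uses a stand-in environment so it runs without Gym or the Atari ROMs installed; the observation shape, action count, and reward are placeholders, not the real Pong values.

```python
import random

class DummyPongEnv:
    """Stand-in with the same reset/step shape as a Gym Atari env.
    Real Pong observations are RGB frames; here a short list is used
    as a placeholder so the sketch stays self-contained."""
    def __init__(self, episode_len=10):
        self.episode_len = episode_len
        self.t = 0
        self.n_actions = 6  # Atari Pong exposes 6 discrete actions

    def reset(self):
        self.t = 0
        return [0] * 4  # placeholder observation

    def step(self, action):
        self.t += 1
        obs = [self.t] * 4
        reward = 1.0 if action == 2 else 0.0  # toy reward for illustration
        done = self.t >= self.episode_len
        return obs, reward, done, {}

def run_episode(env, policy):
    """Generic Gym-style rollout: repeatedly query the policy and step."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total

score = run_episode(DummyPongEnv(), policy=lambda obs: random.randrange(6))
```

With Gym installed, the same `run_episode` loop works if `DummyPongEnv()` is replaced by the environment created via `gym.make` (check the Gym documentation for the current Pong environment id).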
Agent
We provide a code framework for reference, which includes creating a Pong environment and evaluating the agent. You may obtain the code here. To run the code, go into the `Example` directory and execute `python scripts/main.py`.
We define a base class called `RL_alg` in `src/alg/RL_alg.py`. You are supposed to implement your algorithm in a `src/alg/[your student ID]` directory, inheriting from `RL_alg`. We give an example in `src/alg/PB00000000`, which takes a random action at each step.
In the end, you are supposed to upload the `src/alg/[your student ID]` directory. The TAs will use the provided code to evaluate your trained agent.
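The structure is analogous to the sketch below. Note that the real `RL_alg` base class lives in `src/alg/RL_alg.py` and its constructor and method names may differ from this illustration; a stub base class is defined here only so the example is self-contained.

```python
import random

class RL_alg:
    """Stub of the framework's base class (the real one is in
    src/alg/RL_alg.py -- adapt method names to the actual interface)."""
    def __init__(self, n_actions):
        self.n_actions = n_actions

    def step(self, observation):
        raise NotImplementedError

class MyAgent(RL_alg):
    """Minimal agent in the spirit of the src/alg/PB00000000 example:
    it ignores the observation and picks a uniformly random action."""
    def __init__(self, n_actions=6):
        super().__init__(n_actions)

    def step(self, observation):
        return random.randrange(self.n_actions)
```

Your own agent would replace the random choice with a learned policy while keeping the same interface, so the evaluation script can drive it unchanged.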
Requirements
- Do NOT use any autograd tools or any optimization tools from machine learning packages; you are supposed to implement your algorithm from scratch. For example, if you want to use a neural network, you are expected to implement both the forward and backward passes. You can use the libraries in the WhiteList, and the TAs will update the WhiteList if your requests are reasonable.
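"Implementing both passes" means deriving the gradients by hand instead of calling an autograd engine. A minimal sketch, using only the standard library: a single sigmoid unit with squared loss, where the backward pass applies the chain rule explicitly.

```python
import math

def forward(w, b, x):
    """Forward pass: p = sigmoid(w*x + b)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def backward(w, b, x, y):
    """Manual backward pass for L = (p - y)^2.
    Chain rule: dL/dw = 2*(p - y) * p*(1 - p) * x,
                dL/db = 2*(p - y) * p*(1 - p)."""
    p = forward(w, b, x)
    dz = 2.0 * (p - y) * p * (1.0 - p)
    return dz * x, dz

def train(data, lr=0.5, epochs=200):
    """Plain SGD using the hand-derived gradients."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            dw, db = backward(w, b, x, y)
            w -= lr * dw
            b -= lr * db
    return w, b

# toy data: label 1 for positive x, 0 for negative x
data = [(1.0, 1.0), (2.0, 1.0), (-1.0, 0.0), (-2.0, 0.0)]
w, b = train(data)
```

A real network for this project would stack many such layers, but the principle is the same: every layer needs a hand-written gradient, and a numerical finite-difference check is a cheap way to verify each one.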
- You can work as a team with no more than three members in total. Please list the percentage of each member's contribution in your report, e.g., {San Zhang: 30%, Si Li: 35%, Wu Wang: 35%}.
- You are supposed to send a package named `[your student ID].zip` to ml_homework@163.com, which contains the `[your student ID]` directory organized as follows. For teamwork, please use the team leader's student ID in the package name and have the team leader submit the package.

  [your student ID]
  |- [your student ID]-report.pdf (your report)
  |- (your code and model)
- Remember to save the trained model. You are supposed to send your trained model to the aforementioned e-mail address. In the final test, you are supposed to use your trained agent to play Pong ten times and generate the animations on the spot.
- Please submit a detailed report. The report should include all the details of your project, e.g., the implementation, the experimental settings, and the analysis of your results. For example, if you use deep reinforcement learning algorithms, you are supposed to present a clear technical route covering image preprocessing, the structure of your neural networks, and the replay memory settings. Besides, the report should include a learning curve plot showing the performance of your algorithm: the x-axis should correspond to the number of time steps, and the y-axis should show the mean 100-episode reward as well as the best mean reward. Moreover, game animations of the trained agent are necessary.
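The two curves the plot asks for can be computed from the per-episode rewards alone. A small helper, sketched here with a plain sliding window (partial windows at the start are a choice, not a requirement):

```python
def learning_curve_points(episode_rewards, window=100):
    """For each episode, return the mean reward over the last `window`
    episodes and the best such mean seen so far -- the two y-axis
    quantities the report's learning curve should show."""
    means, best_means = [], []
    best = float("-inf")
    for i in range(len(episode_rewards)):
        recent = episode_rewards[max(0, i - window + 1): i + 1]
        mean = sum(recent) / len(recent)
        best = max(best, mean)
        means.append(mean)
        best_means.append(best)
    return means, best_means
```

Logging the cumulative time-step count alongside each episode's reward during training gives the x-axis values to plot these against.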
Grading
- The full score = min(base score (up to 20 pts) + bonus (up to 5 pts), 20 pts).
- The base score is determined by the average score over the ten games.
- The bonus consists of two parts. The first part is determined by the novelty of your approach, which should be highlighted in your report. The second part is related to the readability of your code and report; please make them easy to read.
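In other words, the bonus cannot push the total above 20 points; it can only compensate for base points lost. A one-liner makes the rule concrete:

```python
def final_score(base, bonus):
    """Grading rule: total is capped at 20 pts, so bonus points only
    help when the base score is below 20."""
    return min(min(base, 20.0) + min(bonus, 5.0), 20.0)
```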
System Requirements
- We will evaluate your model on a GeForce RTX 2080 Ti (about 10 GB of memory) under Ubuntu 18.04. Please limit the size of your model to avoid out-of-memory (OOM) errors.
Hint
- You can visit the Gym website to obtain more information about Gym, which is a widely used benchmark in reinforcement learning.
- The following papers may be helpful to you.
- [1] Mnih, V., Kavukcuoglu, K., Silver, D., et al. Playing Atari with Deep Reinforcement Learning. arXiv preprint arXiv:1312.5602, 2013.
- [2] Mnih, V., Kavukcuoglu, K., Silver, D., et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015). https://doi.org/10.1038/nature14236
- [3] van Hasselt, H., Guez, A., and Silver, D. Deep Reinforcement Learning with Double Q-Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1), 2016.
- [4] Wang, Z., Schaul, T., Hessel, M., van Hasselt, H., Lanctot, M., and de Freitas, N. Dueling Network Architectures for Deep Reinforcement Learning. Proceedings of the 33rd International Conference on Machine Learning, PMLR 48:1995–2003, 2016. https://proceedings.mlr.press/v48/wangf16.html
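The papers above all build on the Q-learning update, which DQN [1, 2] approximates with a neural network. As a hedged sketch of the underlying rule (tabular form, with a dict standing in for the Q-table; the network version replaces the table lookup with a forward pass):

```python
def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99, done=False):
    """One Q-learning step on a dict-based Q-table:
    Q(s,a) <- Q(s,a) + alpha * (target - Q(s,a)),
    target = r + gamma * max_a' Q(s',a'), or just r at episode end."""
    old = Q.get((s, a), 0.0)
    if done:
        target = r
    else:
        target = r + gamma * max(Q.get((s_next, a2), 0.0) for a2 in actions)
    Q[(s, a)] = old + alpha * (target - old)
    return Q
```

DQN's main additions on top of this rule are experience replay and a separate target network [2]; Double Q-learning [3] and dueling architectures [4] refine how the target is computed and how Q is parameterized.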
Due Dates
- Team leaders should inform the TAs of your team members before 23:59, November 21, 2021.
- Please submit your report, code, and trained model before 23:59, January 23, 2022.
- No late submissions will be accepted.