Homework for the Spring 2020 Fudan University Machine Learning course by Prof. Chen Qin.
The datasets are real observations downloaded from the website of the Central Meteorological Administration. Please use linear regression to predict the PM2.5 value (a minimal sketch follows the dependency list).
- numpy==1.18.3
- pandas==0.25.3
- seaborn==0.10.1
- matplotlib==3.2.1
- sklearn.model_selection.train_test_split
- sklearn.metrics.mean_squared_error
- sklearn.linear_model.LinearRegression
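For reference, a minimal end-to-end sketch using the listed dependencies; the file name `train.csv` and the `PM2.5` column name are assumptions about the dataset layout, not part of the assignment spec:

```python
# Minimal sketch: fit a linear model on the weather features and score it.
# "train.csv" and the "PM2.5" column name are hypothetical; adapt them to
# the provided dataset layout.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

df = pd.read_csv("train.csv")                 # hypothetical file name
X = df.drop(columns=["PM2.5"]).to_numpy()     # remaining observations as features
y = df["PM2.5"].to_numpy()                    # target: PM2.5 concentration

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)
print("test MSE:", mean_squared_error(y_test, model.predict(X_test)))
```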
Implement two models (a probabilistic generative model and a logistic regression model) to predict whether a person makes over 50K a year from their personal information (a sketch of the logistic-regression half follows the dependency list).
- numpy==1.18.3
- pandas==0.25.3
- seaborn==0.10.1
- matplotlib==3.2.1
- sklearn.preprocessing
- sklearn.preprocessing.MinMaxScaler
- sklearn.preprocessing.LabelEncoder
- sklearn.linear_model.LogisticRegression
- sklearn.metrics.accuracy_score
- sklearn.model_selection.train_test_split
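For reference, a minimal sketch of the preprocessing and the logistic-regression baseline using the listed helpers (the probabilistic generative model has no direct sklearn counterpart here); the file name `adult.csv`, the `income` column, and its label encoding are assumptions about the dataset layout:

```python
# Minimal sketch: encode categorical columns, scale features, and fit
# logistic regression. "adult.csv" and the "income" column are hypothetical.
import pandas as pd
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("adult.csv")                        # hypothetical file name
for col in df.select_dtypes(include="object"):
    df[col] = LabelEncoder().fit_transform(df[col])  # categorical -> integer codes

X = MinMaxScaler().fit_transform(df.drop(columns=["income"]))
y = df["income"]                                     # assumed: 1 if >50K, else 0

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```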
This task is based on Subtask 2 of SemEval-2014 Task 4: Aspect-Based Sentiment Analysis.
You are required to implement two neural networks (an RNN and a CNN, or their variants) for sentiment classification with respect to a given aspect (a sketch follows the dependency list).
For example:
- “Even though its good seafood, the prices are too high”.
- This sentence contains two aspects, namely “seafood” and “prices”. The sentiments for the two aspects are positive and negative, respectively.
- numpy==1.18.3
- pandas==0.25.3
- torch==1.2.0
- torch.optim
- torch.nn.functional
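As a possible starting point for the RNN half, a minimal sketch of an aspect-aware LSTM classifier in PyTorch; the layer sizes, vocabulary size, and three-class label set are illustrative assumptions, not part of the assignment spec:

```python
# Minimal sketch: encode the sentence with an LSTM, pool the aspect's word
# embeddings, and classify from the concatenation of the two.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AspectLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, n_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # classify from [final sentence state ; mean aspect embedding]
        self.fc = nn.Linear(hidden_dim + embed_dim, n_classes)

    def forward(self, sentence, aspect):
        # sentence: (batch, seq_len) word ids; aspect: (batch, aspect_len)
        _, (h, _) = self.lstm(self.embed(sentence))
        aspect_vec = self.embed(aspect).mean(dim=1)
        return self.fc(torch.cat([h[-1], aspect_vec], dim=1))

model = AspectLSTM(vocab_size=5000)
logits = model(torch.randint(0, 5000, (4, 20)), torch.randint(0, 5000, (4, 2)))
loss = F.cross_entropy(logits, torch.tensor([0, 1, 2, 1]))  # pos/neg/neutral
```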
Please implement an auto-encoder for the digit images (a sketch follows the dependency list).
- Use the trained encoder to obtain the 2-dimensional code of the last 1000 images in the test set, and visualize them with a scatterplot where different colors represent different digits.
- Use the decoder to generate 20 images by sampling some codes.
- numpy == 1.18.3
- scipy == 1.2.1
- Pillow == 7.1.2
- tensorflow == 1.15.3
- torch == 1.2.0
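A minimal sketch of such an auto-encoder in PyTorch, assuming flattened 28×28 grayscale digit images; the layer sizes are illustrative:

```python
# Minimal sketch: auto-encoder with a 2-dimensional bottleneck, so the codes
# can be plotted directly and sampled codes can be decoded into new images.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(28 * 28, 256), nn.ReLU(),
            nn.Linear(256, 2))            # 2-D code for the scatterplot
        self.decoder = nn.Sequential(
            nn.Linear(2, 256), nn.ReLU(),
            nn.Linear(256, 28 * 28), nn.Sigmoid())

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), code

model = AutoEncoder()
x = torch.rand(16, 28 * 28)                      # dummy batch of flattened images
recon, code = model(x)
loss = nn.functional.mse_loss(recon, x)          # reconstruction objective
# after training: scatter code[:, 0] vs code[:, 1] colored by digit label,
# and decode sampled 2-D codes to generate the 20 images.
```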
With the application and development of pre-trained models in natural language processing, machine reading comprehension no longer relies solely on the combination of network structures and word embeddings. This paper briefly introduces the concepts of machine reading comprehension and pre-trained language models, summarizes the research progress of machine reading comprehension based on the ALBERT model, and analyzes the performance of current pre-trained models on the relevant datasets.
- python == 3.7
- pytorch == 1.0.1
- cuda version == 10.1
- Stanford Question Answering Dataset (SQuAD): a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text (a span) from the corresponding reading passage, or the question may be unanswerable.
- Microsoft Research Paraphrase Corpus (MRPC): a text file containing 5,800 pairs of sentences extracted from news sources on the web, along with human annotations indicating whether each pair captures a paraphrase/semantic-equivalence relationship. No more than one sentence was extracted from any given news article, and a concerted effort was made to correctly associate each sentence with information about its provenance and its author.
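For illustration, extractive QA with a pre-trained ALBERT can be sketched with the HuggingFace transformers library; this is an assumption, not part of the environment listed above (a recent 4.x release is assumed), and the `albert-base-v2` checkpoint would need SQuAD fine-tuning before its answers are meaningful:

```python
# Hypothetical sketch: predict an answer span with ALBERT, assuming the
# HuggingFace transformers library (>= 4.x) is installed.
import torch
from transformers import AlbertTokenizer, AlbertForQuestionAnswering

tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")
model = AlbertForQuestionAnswering.from_pretrained("albert-base-v2")

question = "Where is Fudan University located?"
passage = "Fudan University is a university in Shanghai, China."
inputs = tokenizer(question, passage, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# the answer span is the argmax start/end position over the input tokens
start = int(outputs.start_logits.argmax())
end = int(outputs.end_logits.argmax())
print(tokenizer.decode(inputs["input_ids"][0][start:end + 1]))
```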
Model | Parameters | SQuAD1.1 (F1/EM) | SQuAD2.0 (F1/EM) |
---|---|---|---|
ALBERT base | 12M | 89.3/82.1 | 79.1/76.1 |
ALBERT large | 18M | 90.9/84.1 | 82.1/79.0 |
ALBERT xlarge | 59M | 93.0/86.5 | 85.9/83.1 |
ALBERT xxlarge | 233M | 94.1/88.3 | 88.1/85.1 |
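A self-contained sketch of how the two metrics in each cell (F1 and exact match) are computed for a single prediction, using the standard SQuAD normalization (lowercase, strip punctuation and articles):

```python
# Sketch of the two SQuAD metrics reported above, per prediction/gold pair.
import re
import string
from collections import Counter

def normalize(s):
    s = s.lower()
    s = "".join(ch for ch in s if ch not in string.punctuation)
    s = re.sub(r"\b(a|an|the)\b", " ", s)   # drop English articles
    return " ".join(s.split())

def exact_match(prediction, truth):
    return float(normalize(prediction) == normalize(truth))

def f1_score(prediction, truth):
    pred, gold = normalize(prediction).split(), normalize(truth).split()
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

print(f1_score("in Shanghai China", "Shanghai, China"))  # 0.8
print(exact_match("the Shanghai", "Shanghai"))           # 1.0
```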
- BERT: Pre-training of deep bidirectional transformers for language understanding
- ELECTRA: Pre-training text encoders as discriminators rather than generators
- ALBERT: A lite BERT for self-supervised learning of language representations
- Know what you don’t know: Unanswerable questions for SQuAD
- NeurQuRI: Neural question requirement inspector for answerability prediction in machine reading comprehension
- Attention-over-attention neural networks for reading comprehension
- Gated-attention readers for text comprehension
- Text understanding with the attention sum reader network
- XLNet: Generalized autoregressive pretraining for language understanding