The two tasks follow the same pipeline: task 1 uses the orientation dataset and task 2 uses the power dataset. The main distinction is that for the masked LM, task 1 fine-tunes and evaluates on the original-language text, whereas task 2 does so on the English translation of the original text.
Downloads the dataset, selects the predetermined language, and splits the training data. This section must be run every time for the rest of the code to work.
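
A minimal sketch of what this preparation step looks like; the file name, column names, language code, and split ratio below are assumptions for illustration, not the notebook's exact values:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

LANGUAGE = "tr"  # hypothetical: the predetermined language code

# hypothetical path/format; the notebook downloads the dataset itself
df = pd.read_csv(f"orientation-{LANGUAGE}.tsv", sep="\t")

# stratified 90/10 split of the training data (ratio assumed)
train_df, eval_df = train_test_split(
    df, test_size=0.1, stratify=df["label"], random_state=42
)
print(len(train_df), len(eval_df))
```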
Uses XLMRoberta and can be run independently of the "Causal LM" section. For the Turkish dataset (the largest), fine-tuning takes approximately an hour on a T4.
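
A hedged sketch of masked-LM fine-tuning with `xlm-roberta-base`, assuming the standard 15% token-masking objective; the hyperparameters and the text column are illustrative assumptions:

```python
from datasets import Dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")

# toy stand-in for the training split produced in the dataset section
train_ds = Dataset.from_dict({"text": ["example speech one", "example speech two"]})
train_ds = train_ds.map(
    lambda b: tokenizer(b["text"], truncation=True, max_length=256),
    batched=True,
    remove_columns=["text"],
)

# randomly masks 15% of tokens per batch, the standard MLM objective
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="mlm-out",
    per_device_train_batch_size=16,  # assumed; sized to fit a T4
    num_train_epochs=1,              # assumed
    fp16=True,                       # mixed precision speeds up T4 runs
)

Trainer(model=model, args=args, train_dataset=train_ds, data_collator=collator).train()
```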
Uses distilgpt2 and can be run independently of the "Masked LM" section. It performs a total of 4 evaluations; details can be found in the .ipynb files. Fine-tuning is run twice and takes approximately 20 minutes on a T4.
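
A hedged sketch of one causal-LM fine-tuning run with `distilgpt2`; the hyperparameters and text column are assumptions, and the notebook's evaluation details are in the .ipynb files rather than shown here:

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 tokenizers ship without one

model = AutoModelForCausalLM.from_pretrained("distilgpt2")

# toy stand-in for the real training split
train_ds = Dataset.from_dict({"text": ["example speech one", "example speech two"]})
train_ds = train_ds.map(
    lambda b: tokenizer(b["text"], truncation=True, max_length=256),
    batched=True,
    remove_columns=["text"],
)

# mlm=False makes the collator produce next-token (causal) labels
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="clm-out",
    per_device_train_batch_size=16,  # assumed
    num_train_epochs=1,              # assumed
)

Trainer(model=model, args=args, train_dataset=train_ds, data_collator=collator).train()
```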
Results of the evaluations.