Diversity Over Size: On the Effect of Sample and Topic Sizes for Topic-Dependent Argument Mining Datasets

This repository provides the means to download the newly created Few-Shot-150T Corpus (FS150T-Corpus), introduced in the paper "Diversity Over Size: On the Effect of Sample and Topic Sizes for Topic-Dependent Argument Mining Datasets".

Abstract: Topic-Dependent Argument Mining (TDAM), that is extracting and classifying argument components for a specific topic from large document sources, is an inherently difficult task for machine learning models and humans alike, as large TDAM datasets are rare and recognition of argument components requires expert knowledge. The task becomes even more difficult if it also involves stance detection of retrieved arguments. In this work, we investigate the effect of TDAM dataset composition in few- and zero-shot settings. Our findings show that, while fine-tuning is mandatory to achieve acceptable model performance, using carefully composed training samples and reducing the training sample size by up to almost 90% can still yield 95% of the maximum performance. This gain is consistent across three TDAM tasks on three different datasets.

Contact person: Benjamin Schiller

UKP Lab | TU Darmstadt | summetix

Don't hesitate to send us an e-mail or report an issue, if something is broken (and it shouldn't be) or if you have further questions.

Getting started

Due to license reasons, we cannot provide the download to the full dataset files directly. Instead, all the sentences have to be retrieved from Common Crawl WARC files and are missing in the dataset files in this repository. To download the sentences, follow these instructions:

First, create a virtual environment and install the requirements:

python3.12 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Next, use src/main.py to complete the datasets:

python src/main.py -i dataset/test_full_no_sents.tsv -o dataset/test_full.tsv
python src/main.py -i dataset/dev_full_no_sents.tsv -o dataset/dev_full.tsv
python src/main.py -i dataset/train_full_no_sents.tsv -o dataset/train_full.tsv

The code checks via hashes if the retrieved sentences are correct and prints out a message if not. The code also checks if all sentences were retrieved at the end.

Retrieving all sentences can take up to 4 hours and the retrieval process may get interrupted. Hence, every 500 rows, a checkpoint file will be saved (e.g. test_full.chkpt500.tsv). In case of an interruption, this file can be used as in-file (-i) to start the process from the checkpoint.

The code was tested with Python 3.12. In case you cannot retrieve the dataset, please request it from us at https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/4353.

Download the other corpora used in the paper at:

Cite

Please use the following citation:

@inproceedings{schiller-etal-2024-diversity,
    title = "Diversity Over Size: On the Effect of Sample and Topic Sizes for Topic-Dependent Argument Mining Datasets",
    author = "Schiller, Benjamin  and
      Daxenberger, Johannes  and
      Waldis, Andreas  and
      Gurevych, Iryna",
    editor = "Al-Onaizan, Yaser  and
      Bansal, Mohit  and
      Chen, Yun-Nung",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.emnlp-main.608",
    pages = "10870--10887",
    abstract = "Topic-Dependent Argument Mining (TDAM), that is extracting and classifying argument components for a specific topic from large document sources, is an inherently difficult task for machine learning models and humans alike, as large TDAM datasets are rare and recognition of argument components requires expert knowledge. The task becomes even more difficult if it also involves stance detection of retrieved arguments. In this work, we investigate the effect of TDAM dataset composition in few- and zero-shot settings. Our findings show that, while fine-tuning is mandatory to achieve acceptable model performance, using carefully composed training samples and reducing the training sample size by up to almost 90{\%} can still yield 95{\%} of the maximum performance. This gain is consistent across three TDAM tasks on three different datasets. We also publish a new dataset and code for future benchmarking.",
}

Disclaimer

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
dataset		dataset
src		src
.gitignore		.gitignore
LICENSE		LICENSE
NOTICE.txt		NOTICE.txt
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Diversity Over Size: On the Effect of Sample and Topic Sizes for Topic-Dependent Argument Mining Datasets

Getting started

Cite

Disclaimer

About

Releases

Packages

Languages

License

UKPLab/argument-topic-diversity

Folders and files

Latest commit

History

Repository files navigation

Diversity Over Size: On the Effect of Sample and Topic Sizes for Topic-Dependent Argument Mining Datasets

Getting started

Cite

Disclaimer

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages