GitHub - simonlindgren/bambambam: Few-shot learning for text classification in Python

bambambam 💥💥💥

Few-shot learning is a machine learning approach in which a model makes predictions about new, unseen data after seeing only a small number of training examples. The model learns from just a few 'shots' and then applies that knowledge to novel tasks.
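
To make the idea concrete, here is a toy sketch. It is not how bambambam or BERT works: plain word counts stand in for sentence embeddings, and the names `embed`, `cosine`, `classify`, and the example sentences are all made up for illustration. A new sentence is scored against each label by its average similarity to that label's few examples.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, shots):
    """Score each label by mean similarity to its handful of examples."""
    vec = embed(text)
    return {
        label: sum(cosine(vec, embed(ex)) for ex in examples) / len(examples)
        for label, examples in shots.items()
    }


# Two labels, two 'shots' each (hypothetical examples).
shots = {
    "weather": ["the rain fell all night", "a cold wind blew"],
    "food": ["the pies were sweet", "he ate the cakes"],
}
scores = classify("a cold rain fell", shots)
best = max(scores, key=scores.get)
print(best, scores)
```

A pretrained language model replaces the word counts with far richer representations, which is what lets a few examples per label go a long way.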

This method requires the spaCy and classy-classification packages.

pip install spacy
pip install classy-classification

Running python bambambam.py does the following:

1. Look for label data in a subdirectory named labels. All *.txt files in it are assumed to contain example sentences, where the file name (filename.txt) is the label name and each example is on its own line.
2. Prepare a classifier by loading a pretrained BERT model and showing it the labels. The code in bambambam.py can be edited to use any other Hugging Face sentence-transformers model, and/or to use the GPU instead of the CPU.
3. Read the unseen data, line by line, from a file named data/unseen.txt. The example data used here comes from the public domain story The Legend of Sleepy Hollow (1820) by Washington Irving.
4. Classify each line of the unseen data using BERT and the labels. Save the scores for every line to bambambam.csv, and offer the option (y/n) of printing examples of the top matches per label.
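
The steps above could be sketched roughly as follows. This is not the contents of bambambam.py. The function names (`load_labels`, `build_classifier`, `write_scores`, `main`) and the CSV column layout are hypothetical. The `classy_classification` pipe name, the `data`/`model`/`device` config keys, and the model name follow the classy-classification README as best understood and may differ between versions, so check them against the installed release.

```python
import csv
from pathlib import Path


def load_labels(label_dir="labels"):
    """Map each *.txt file stem to its non-empty lines (one example per line)."""
    data = {}
    for path in sorted(Path(label_dir).glob("*.txt")):
        lines = path.read_text(encoding="utf-8").splitlines()
        examples = [line.strip() for line in lines if line.strip()]
        if examples:
            data[path.stem] = examples
    return data


def build_classifier(
    data,
    model="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    device="cpu",
):
    """Assumed classy-classification usage; pipe name and config keys may vary by version."""
    # Imported lazily so the file helpers work without spaCy installed.
    import spacy
    import classy_classification  # noqa: F401  (registers the pipeline component)

    nlp = spacy.blank("en")
    nlp.add_pipe(
        "classy_classification",
        config={"data": data, "model": model, "device": device},
    )
    return nlp


def write_scores(rows, labels, out_path="bambambam.csv"):
    """Write one CSV row per text; rows is an iterable of (text, {label: score})."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["text", *labels])
        for text, cats in rows:
            writer.writerow([text, *(cats.get(label, "") for label in labels)])


def main():
    data = load_labels("labels")
    nlp = build_classifier(data)
    raw = Path("data/unseen.txt").read_text(encoding="utf-8").splitlines()
    lines = [line.strip() for line in raw if line.strip()]
    # doc._.cats is assumed to hold a {label: score} dict per document.
    rows = [(line, nlp(line)._.cats) for line in lines]
    write_scores(rows, list(data))
```

Calling `main()` from the repository root would run the whole sketch, assuming both packages are installed and the model can be downloaded. The interactive y/n prompt for printing top matches is left out here.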
