Few-shot learning for text classification in Python

simonlindgren/bambambam

bambambam 💥💥💥

Few-shot learning is a machine learning approach in which a model learns from only a small number of labelled training examples (a few 'shots') and then applies that knowledge to make predictions about new, unseen data.

This method requires spacy and classy-classification.

pip install spacy
pip install classy-classification

Running python bambambam.py does the following:

  1. Look for label data in a subdir named labels. Each *.txt file in there is assumed to hold the example sentences for one label: the file name gives the label name, and the examples are on separate lines in the file.

  2. Prepare a classifier by loading a pretrained BERT model and showing it the labelled examples. The code in bambambam.py can be edited to use any other HuggingFace sentence-transformers model, and/or to run on GPU instead of CPU.

  3. Read the unseen data, line by line, from a file named data/unseen.txt. The example data used here comes from the public domain short story The Legend of Sleepy Hollow (1820) by Washington Irving.

  4. Classify each line of the unseen data using the BERT model and the label examples. Save all scores to bambambam.csv, and offer the option (y/n) of printing out examples of the top matches per label.
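
The four steps above can be sketched roughly as follows. This is not the code from bambambam.py, only a minimal illustration under stated assumptions: the helper names (load_labels, read_lines, build_classifier, write_scores, run) are hypothetical, the label name is assumed to be the file name without its .txt extension, the CSV column layout is a guess, and the model name is a placeholder for whichever sentence-transformers model you choose. The classifier step uses the classy_classification spaCy pipe as described in the classy-classification documentation.

```python
import csv
from pathlib import Path


def load_labels(label_dir):
    """Step 1: map each label to its example sentences.

    Assumption: the label name is the file stem (spooky.txt -> "spooky").
    """
    data = {}
    for path in sorted(Path(label_dir).glob("*.txt")):
        lines = path.read_text(encoding="utf-8").splitlines()
        examples = [ln.strip() for ln in lines if ln.strip()]
        if examples:  # skip empty label files
            data[path.stem] = examples
    return data


def read_lines(path):
    """Step 3: read the unseen data, one non-empty line per item."""
    text = Path(path).read_text(encoding="utf-8")
    return [ln.strip() for ln in text.splitlines() if ln.strip()]


def build_classifier(data, model, device="cpu"):
    """Step 2: build a spaCy pipeline with a few-shot classifier.

    Imports are local so the file helpers work without spaCy installed.
    """
    import spacy
    import classy_classification  # noqa: F401  (registers the pipe)

    nlp = spacy.blank("en")
    nlp.add_pipe(
        "classy_classification",
        config={"data": data, "model": model, "device": device},
    )
    return nlp


def write_scores(rows, labels, out_path):
    """Step 4: save one CSV row per line, one score column per label.

    Assumption: this layout is illustrative, not the actual bambambam.csv.
    """
    with open(out_path, "w", encoding="utf-8", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["text"] + list(labels))
        for text, scores in rows:
            writer.writerow([text] + [scores.get(lb, 0.0) for lb in labels])


def run(label_dir="labels", unseen="data/unseen.txt", out="bambambam.csv",
        model="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"):
    """Tie the steps together; the model name is a placeholder assumption."""
    data = load_labels(label_dir)
    nlp = build_classifier(data, model)
    # doc._.cats is a dict of label -> score set by classy-classification
    rows = [(line, nlp(line)._.cats) for line in read_lines(unseen)]
    write_scores(rows, sorted(data), out)
```

Calling run() from the repository root would read labels/ and data/unseen.txt and write the scores file. The interactive y/n prompt for printing top matches is left out of this sketch.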

