Few-shot learning is a machine learning approach in which a model makes predictions about new, unseen data based on only a small number of training examples. The model learns from just a few 'shots' per label and then applies that knowledge to novel inputs.
This method requires `spacy` and `classy-classification`.
pip install spacy
pip install classy-classification
Running `python bambambam.py` does the following:
- Look for label data in a subdir named `labels`. Assume that all `*.txt` files in there contain example sentences, where the `filename.txt` is the label name and the examples are on separate lines in the file.
- Prepare a classifier by loading a pretrained BERT model and showing it the labels. The code in `bambambam.py` can be edited to use any other HuggingFace `sentence-transformers` model, and/or to use `gpu` instead of `cpu`.
- Read the unseen data, line by line, from a file named `data/unseen.txt`. The example data used here comes from the public domain novel The Legend of Sleepy Hollow (1820) by Washington Irving.
- Classify each line of the unseen data by leveraging BERT and the labels. Save a `bambambam.csv` with all scores, and also give the option (y/n) of printing out examples of top matches per label.
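
The steps above can be sketched roughly as follows. This is an illustrative outline, not the actual contents of `bambambam.py`. The function names, the CSV column layout, the use of the file stem as the label name, and the default of three top matches are all assumptions made for the example. The component name `classy_classification`, its `data`/`model`/`device` config keys, the `doc._.cats` attribute, and the model name follow the classy-classification README and may differ between versions.

```python
import csv
from pathlib import Path


def load_labels(label_dir="labels"):
    """Map each label (assumed: the .txt file stem) to its example sentences."""
    data = {}
    for path in sorted(Path(label_dir).glob("*.txt")):
        lines = path.read_text(encoding="utf-8").splitlines()
        examples = [line.strip() for line in lines if line.strip()]
        if examples:
            data[path.stem] = examples
    return data


def read_unseen(path="data/unseen.txt"):
    """Return the non-empty lines of the unseen-data file."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def build_classifier(
    data,
    model="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    device="cpu",
):
    """Build a blank spaCy pipeline with a few-shot classification component."""
    # Imported lazily so the file helpers work without the heavy dependencies.
    import spacy
    import classy_classification  # noqa: F401  (registers the spaCy component)

    nlp = spacy.blank("en")
    # Assumed config keys; check the installed classy-classification version.
    nlp.add_pipe(
        "classy_classification",
        config={"data": data, "model": model, "device": device},
    )
    return nlp


def classify(nlp, lines):
    """Return one {label: score} dict per input line."""
    return [dict(nlp(line)._.cats) for line in lines]


def write_scores(lines, scores, labels, out_path="bambambam.csv"):
    """Write one row per line: the text followed by one score column per label."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["text", *labels])
        for line, row in zip(lines, scores):
            writer.writerow([line, *(row.get(label, 0.0) for label in labels)])


def top_matches(lines, scores, label, n=3):
    """Return the n (text, score) pairs that score highest for the given label."""
    ranked = sorted(
        zip(lines, scores),
        key=lambda pair: pair[1].get(label, 0.0),
        reverse=True,
    )
    return [(text, row.get(label, 0.0)) for text, row in ranked[:n]]


def run():
    """Run the whole flow: load labels, classify, save CSV, optionally print."""
    data = load_labels()
    lines = read_unseen()
    scores = classify(build_classifier(data), lines)
    write_scores(lines, scores, sorted(data))
    if input("Print top matches per label? (y/n) ").strip().lower() == "y":
        for label in sorted(data):
            print(label)
            for text, score in top_matches(lines, scores, label):
                print(f"  {score:.3f}  {text}")
```

Calling `run()` from the project root would perform all four steps. Keeping the file handling separate from `build_classifier` means the label loading and CSV output can be checked without downloading a model.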