How to use model for making predictions? #6

adityakapri · 2019-09-12T15:56:05Z

Once the model has been trained, how do I use it for prediction? I have examples with no labels, and I need to find all the predicted labels.

ThilinaRajapakse · 2019-09-12T17:52:22Z

Easiest way to do it would probably be something like this. I am setting label to 0 for all the examples, but the labels will not be used.

import numpy as np
import torch
from torch.utils.data import DataLoader, SequentialSampler, TensorDataset

# InputExample and convert_examples_to_features are assumed to be imported already.
# model_type, max_seq_len, tokenizer, output_mode and eval_batch_size are assumed
# to be defined already, as they are for training and evaluation.

def tokenize(all_data):
    # all_data is an iterable of raw sentences (strings).
    # Every example gets the dummy label '0'; it is never used for prediction.
    test_examples = [InputExample(0, sentence, None, '0') for sentence in all_data]
    label_list = ["0", "1"]

    test_features = convert_examples_to_features(test_examples, label_list, max_seq_len, tokenizer, output_mode,
        cls_token_at_end=bool(model_type == 'xlnet'),            # xlnet has a cls token at the end
        cls_token=tokenizer.cls_token,
        cls_token_segment_id=2 if model_type == 'xlnet' else 0,
        sep_token=tokenizer.sep_token,
        sep_token_extra=bool(model_type == 'roberta'),
        pad_on_left=bool(model_type == 'xlnet'),                 # pad on the left for xlnet
        pad_token=tokenizer.convert_tokens_to_ids([tokenizer.pad_token])[0],
        pad_token_segment_id=4 if model_type == 'xlnet' else 0)

    all_input_ids = torch.tensor([f.input_ids for f in test_features], dtype=torch.long)
    all_input_mask = torch.tensor([f.input_mask for f in test_features], dtype=torch.long)
    all_segment_ids = torch.tensor([f.segment_ids for f in test_features], dtype=torch.long)
    all_label_ids = torch.tensor([f.label_id for f in test_features], dtype=torch.long)

    test_data = TensorDataset(all_input_ids, all_input_mask, all_segment_ids, all_label_ids)
    return test_data

def get_predictions(model, test_data):
    model.eval()
    device = next(model.parameters()).device  # run on whichever device the model is on
    test_sampler = SequentialSampler(test_data)
    eval_dataloader = DataLoader(test_data, sampler=test_sampler, batch_size=eval_batch_size)
    preds = None
    for batch in eval_dataloader:
        with torch.no_grad():
            batch = tuple(t.to(device) for t in batch)
            inputs = {'input_ids': batch[0],
                      'attention_mask': batch[1],
                      'token_type_ids': batch[2],
                      'labels': batch[3]}

            outputs = model(**inputs)
            _, logits = outputs[:2]  # (loss, logits) because labels are passed
        if preds is None:
            preds = logits.detach().cpu().numpy()
        else:
            preds = np.append(preds, logits.detach().cpu().numpy(), axis=0)

    # Take the argmax once, after the logits from every batch have been collected
    preds = np.argmax(preds, axis=1)

    return preds

You can use the tokenize() function to prepare the data, send it to get_predictions() and collect the predictions.

There may be cleaner ways of doing this, but it didn't seem worth the trouble to me (the class specification for InputExample says label can be set to None for test data, but that would require many more changes to the code). These two functions are adapted from something similar I wrote for an API that generates predictions. That API works, so the approach is sound. However, I haven't tested the specific code provided here, so let me know if it throws any errors and I can see about fixing them.
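
get_predictions() returns class indices rather than label strings. Below is a minimal sketch of mapping them back, assuming index i corresponds to label_list[i] (the ordering that enumerate(label_list) gives). indices_to_labels is a hypothetical helper, not part of the repo.

```python
def indices_to_labels(preds, label_list):
    """Map predicted class indices back to label strings.

    Assumes index i corresponds to label_list[i]. Accepts any iterable
    of integer-like values, such as a list or a 1-D NumPy array.
    """
    return [label_list[int(i)] for i in preds]


# Hypothetical usage with the two-label setup from the snippet above
label_list = ["0", "1"]
print(indices_to_labels([1, 0, 1], label_list))
```

With more descriptive labels (for example ["negative", "positive"]), the same helper would return those strings instead.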

Magpi007 · 2019-09-24T04:11:16Z

Doesn't the get_mismatched function already extract the wrong predictions? Would it be possible to adjust that function to return both the right and the wrong predictions?

ThilinaRajapakse · 2019-09-24T04:19:18Z

It's certainly possible. Its original purpose was to give insight into examples that the model was getting wrong.
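
As a rough illustration of that adjustment, the sketch below splits examples into correctly and incorrectly predicted groups. It does not reproduce get_mismatched; split_by_correctness and its arguments are hypothetical, and it assumes gold labels are available for the examples.

```python
def split_by_correctness(texts, labels, preds):
    """Return (correct, wrong) lists of records.

    texts, labels and preds are assumed to be parallel sequences:
    the input text, its gold label, and the model's predicted label.
    """
    correct, wrong = [], []
    for text, label, pred in zip(texts, labels, preds):
        record = {"text": text, "label": label, "pred": pred}
        if label == pred:
            correct.append(record)
        else:
            wrong.append(record)
    return correct, wrong


# Hypothetical usage
correct, wrong = split_by_correctness(
    ["great phone", "too expensive"], [1, 0], [1, 1]
)
print(len(correct), len(wrong))
```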

Mahhos · 2020-01-28T21:12:38Z

I've got two questions.

What is the format of all_data in the tokenize(all_data) function? Is it a ".tsv" file in the same format as "train.tsv" and "dev.tsv"?
Where should these functions go, and how should they be called?
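
Judging from the snippet earlier in this thread, tokenize() iterates over all_data and treats each item as a sentence, so a plain list of strings should work rather than a file path. Below is a minimal sketch of building such a list from a tab-separated file. load_sentences is a hypothetical helper, and the file name and column index are assumptions to adjust for your data.

```python
import csv


def load_sentences(tsv_file, text_column=0, has_header=False):
    """Read one column of a tab-separated file into a list of strings.

    tsv_file is an open text file (or any iterable of lines).
    text_column is the zero-based index of the column holding the text.
    """
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    if has_header:
        next(reader, None)  # skip the header row
    # Skip blank or short rows that lack the requested column
    return [row[text_column] for row in reader if len(row) > text_column]


# Hypothetical usage (file name and column are assumptions):
# with open("test.tsv", newline="", encoding="utf-8") as f:
#     all_data = load_sentences(f, text_column=0)
# test_data = tokenize(all_data)
# preds = get_predictions(model, test_data)
```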

Mahhos · 2020-01-29T04:29:49Z

When I am running the tokenize function, I am getting ValueError: Number of processes must be at least 1. However, when I print os.cpu_count() it shows 2. Do you have any idea why?
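
One possible cause, which is an assumption and not something confirmed in this thread: if the feature conversion sizes its multiprocessing pool as the CPU count minus a reserved number of cores, a 2-core machine ends up requesting 0 processes, and multiprocessing.Pool raises this ValueError for any count below 1. Below is a sketch of clamping the count; safe_process_count and reserved are hypothetical names.

```python
import os


def safe_process_count(reserved=2, cpu_count=None):
    """Return a worker count that is never below 1.

    reserved is the number of cores to leave free (an assumed policy).
    cpu_count defaults to os.cpu_count(), which can return None.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    return max(1, cpu_count - reserved)


# On a 2-core machine, 2 - 2 would be 0; the clamp keeps it at 1.
print(safe_process_count(reserved=2, cpu_count=2))  # prints 1
```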

djSharma7 · 2020-01-29T06:34:22Z

Can we get classification results per aspect, along with their polarities?
For example: The product is good, but the price is very high.
Results --
Product -Positive (Polarity)
Price - Negative (Polarity)
