spark-nlp explainability #9328
-
Hi, this is actually a very big topic, and until now it has been handled mostly (if not entirely) by data scientists through different evaluation methods. I would personally keep a test dataset that matters to me and represents the real-world data I am going to use my model(s) on, and then evaluate each pipeline separately over that test dataset to understand the value/importance of pre-processing, lowercasing, lemmatizing, and anything else in the pipeline (accuracy, false positives, F1, etc.). Spark NLP, like many other NLP libraries, doesn't have such a feature, or at least not out of the box. But it is an interesting subject, so I converted your issue to a discussion to avoid it being closed.
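Below is a minimal sketch of that evaluation loop, assuming PySpark, a labeled train/test split (`train_df`/`test_df` with `text` and `label` columns), and a `candidate_pipelines` dict mapping variant names (e.g. with/without lowercasing or lemmatization) to Spark ML `Pipeline` objects; these names are illustrative assumptions, not Spark NLP APIs.

```python
from pyspark.ml.evaluation import MulticlassClassificationEvaluator

def evaluate_pipeline(pipeline, train_df, test_df):
    """Fit one candidate pipeline and score it on the held-out test set."""
    model = pipeline.fit(train_df)
    predictions = model.transform(test_df)
    scores = {}
    for metric in ("accuracy", "f1"):
        evaluator = MulticlassClassificationEvaluator(
            labelCol="label", predictionCol="prediction", metricName=metric
        )
        scores[metric] = evaluator.evaluate(predictions)
    return scores

# Compare each preprocessing variant on the same real-world test data.
for name, pipeline in candidate_pipelines.items():
    print(name, evaluate_pipeline(pipeline, train_df, test_df))
```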
-
I am using spark-nlp to preprocess data, which I then pass into a logistic regression model. Because I am building a number of similar models with different subsets of data, I created a pipeline for each model.
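For context, here is a sketch of that kind of pipeline, assuming PySpark and the usual Spark NLP → Finisher → Spark ML bridge (column names and stage choices are illustrative assumptions, not the poster's actual setup):

```python
from sparknlp.base import DocumentAssembler, Finisher
from sparknlp.annotator import Tokenizer, Normalizer
from pyspark.ml import Pipeline
from pyspark.ml.feature import CountVectorizer, IDF
from pyspark.ml.classification import LogisticRegression

document = DocumentAssembler().setInputCol("text").setOutputCol("document")
tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")
normalizer = (Normalizer().setInputCols(["token"])
              .setOutputCol("normalized").setLowercase(True))
# Finisher converts Spark NLP annotations back into plain string arrays
# so downstream Spark ML stages can consume them.
finisher = Finisher().setInputCols(["normalized"]).setOutputCols(["tokens"])

tf = CountVectorizer(inputCol="tokens", outputCol="tf")
idf = IDF(inputCol="tf", outputCol="features")
lr = LogisticRegression(labelCol="label", featuresCol="features")

pipeline = Pipeline(stages=[document, tokenizer, normalizer, finisher, tf, idf, lr])
model = pipeline.fit(train_df)  # train_df: assumed DataFrame with text/label columns
```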
I am looking into model explainability: how to see which text is most affecting the model. I looked into LIME and SHAP but could not find anything that works with spark-nlp. Does spark-nlp have some sort of explainability, and if so, can you point me to docs?
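Spark NLP doesn't ship a LIME/SHAP-style explainer. For a linear model, one simple workaround (a sketch, not a Spark NLP feature) is to pair the fitted `CountVectorizer` vocabulary with the logistic-regression coefficients to see which tokens push predictions hardest; the stage indices below assume the pipeline sketched above. For LIME/SHAP proper, you would have to collect the finished features locally and wrap a non-Spark model.

```python
import pandas as pd

cv_model = model.stages[4]    # fitted CountVectorizerModel (position is an assumption)
lr_model = model.stages[-1]   # fitted LogisticRegressionModel

# Binary case: one coefficient per vocabulary term.
# For multiclass, use lr_model.coefficientMatrix instead.
weights = pd.Series(lr_model.coefficients.toArray(), index=cv_model.vocabulary)
print(weights.sort_values(ascending=False).head(20))  # tokens pushing toward class 1
print(weights.sort_values().head(20))                 # tokens pushing toward class 0
```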
I am also looking into incremental training, but I can't seem to find anything that works with spark-nlp. If this is possible with spark-nlp and you can point me to docs, that would be really helpful.
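Spark ML's `LogisticRegression` has no partial-fit API, and Spark NLP doesn't add one. One hedged workaround: freeze the fitted feature stages (swapping `CountVectorizer` for `HashingTF` so the feature space stays fixed across batches) and incrementally train scikit-learn's `SGDClassifier` locally. `feature_model` and `new_batches` below are hypothetical names, not library APIs.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(loss="log_loss")  # logistic regression trained via SGD
classes = np.array([0, 1])            # all labels must be declared up front

for batch_df in new_batches:          # hypothetical iterator of new DataFrames
    pdf = (feature_model.transform(batch_df)   # frozen featurization stages only
           .select("features", "label")
           .toPandas())
    X = np.stack(pdf["features"].apply(lambda v: v.toArray()).to_list())
    y = pdf["label"].to_numpy()
    clf.partial_fit(X, y, classes=classes)     # incremental update per batch
```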
Thanks in advance.