produce_characteristic_explorer for non-English-language Corpus #83

Answered by JasonKessler
cedoard asked this question in Q&A

Hi Edoardo,

This is still a bit experimental and inelegant (like much of Scattertext), but you can pass an instance of a CharacteristicScorer subclass to the characteristic_scorer parameter of produce_scattertext_explorer (or any similar function).

You can see an example of this in https://github.com/JasonKessler/scattertext/blob/5af28c8860d718feb9da1d24c3cb698a946b9c70/scattertext/characteristic/DenseRankCharacteristicness.py

In short, your subclass should implement a get_scores(self, corpus) method that returns a tuple consisting of:

  • a float in [0, 1] giving the median value of the series described next (this is currently unused)
  • a pd.Series which is indexed on each term in the corpus (e.g…
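
As a rough illustration of that get_scores contract, here is a minimal sketch, not Scattertext's own implementation. The class name BackgroundRankScorer, the background_counts argument, and StandInCorpus are hypothetical and exist only for this example. It assumes the series values are scores in [0, 1] where higher means more characteristic (the description above is cut off, so this is a guess), and that the corpus exposes get_term_freq_df(). In real use the class would subclass CharacteristicScorer from the module linked above; it is left duck-typed here so the snippet runs with pandas alone.

```python
import pandas as pd


class BackgroundRankScorer:
    """Hypothetical scorer; in real use, subclass CharacteristicScorer."""

    def __init__(self, background_counts):
        # Assumption: the caller supplies term counts from a general corpus
        # in the target language, as a dict mapping term -> count.
        self.background_counts = pd.Series(background_counts, dtype=float)

    def get_scores(self, corpus):
        # Total frequency of each term across all categories of the corpus.
        corpus_counts = corpus.get_term_freq_df().sum(axis=1)
        # Align background counts to the corpus terms; unseen terms count as 0.
        background = self.background_counts.reindex(corpus_counts.index).fillna(0.0)
        # Dense ranks, normalised by the top rank so both sides are comparable.
        corpus_rank = corpus_counts.rank(method="dense")
        background_rank = background.rank(method="dense")
        diff = corpus_rank / corpus_rank.max() - background_rank / background_rank.max()
        # Rescale the rank difference to [0, 1].
        low, high = diff.min(), diff.max()
        if high == low:
            scores = pd.Series(0.5, index=diff.index)
        else:
            scores = (diff - low) / (high - low)
        # Tuple of (median as a float, per-term pd.Series), as described above.
        return float(scores.median()), scores


class StandInCorpus:
    """Stand-in for a Scattertext corpus, for demonstration only."""

    def get_term_freq_df(self):
        return pd.DataFrame(
            {"a freq": [10, 7, 2], "b freq": [8, 5, 1]},
            index=["der", "quark", "haus"],
        )


scorer = BackgroundRankScorer({"der": 1000, "haus": 50})
median, scores = scorer.get_scores(StandInCorpus())
```

An instance such as scorer would then be passed as the characteristic_scorer argument mentioned above. The background counts are the part to replace with frequencies from a general corpus in your own language.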

Answer selected by cedoard