This repository has been archived by the owner on Sep 3, 2023. It is now read-only.

[WIP] Common Derogatory Words and Linguistic Structures #4

Open
3 tasks
malteserteresa opened this issue Dec 4, 2019 · 2 comments

malteserteresa commented Dec 4, 2019

Objective
Identify derogatory terms and linguistic structures in the data sets.

Description

Skills

Tools

Time Estimation

Tasks

  • Step 1
  • Step 2
  • Step 3
@mgeyrdnlr self-assigned this on May 29, 2020
@mgeyrdnlr changed the title from "[WIP] Common internet dialects" to "[WIP] Common Derogatory Words and Linguistic Structures" on Jun 7, 2020
mgeyrdnlr commented Jun 7, 2020

In the gold data set, out of 318,765 word tokens, the most frequent terms and word groups that can be considered misogynistic as well as discriminatory (without knowing the context) are:

  1. bitch (freq. 1779)
  2. feminazi (freq. 773)
  3. whore (freq. 615)
  4. cunt (freq. 543)
  5. hysterical (freq. 413)
  6. skank (freq. 350)
  7. hoe (freq. 326)
  8. pussy (freq. 291)
  9. slut (freq. 214)
  10. ugly (freq. 176)
  11. bitches (freq. 131)
  12. womensuck (freq. 123)
  13. dyke (freq. 105)
  14. nigga (freq. 63)
  15. kunt (freq. 55)
  16. whores (freq. 37)
  17. cunts (freq. 24)
  18. feminismisawful (freq. 16)
  19. skanks (freq. 16)
  20. bitchy (freq. 14)
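
For anyone who wants to reproduce counts like these, here is a minimal sketch of one way to do it. It is not necessarily how the numbers above were produced. It assumes the texts are lowercased and split on runs of letters and digits, which would explain why hashtags appear as single tokens in the list. The function name `count_terms`, the two-term watch list, and the sample texts are all made up for illustration.

```python
import re
from collections import Counter


def count_terms(texts, terms):
    """Count whole-token occurrences of each watch-list term across texts."""
    counts = Counter()
    for text in texts:
        # Assumed tokenization: lowercase, then keep runs of letters/digits,
        # so "#Ugly" becomes the token "ugly".
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(tok for tok in tokens if tok in terms)
    return counts


# Hypothetical watch list and sample texts, not taken from the gold data set.
watch_list = {"hysterical", "ugly"}
sample_texts = ["She is so HYSTERICAL, #ugly", "ugly ugly truth", "uglyduckling"]

for term, freq in count_terms(sample_texts, watch_list).most_common():
    print(f"{term} (freq. {freq})")
```

Because matching is done on whole tokens, a term embedded in a longer token (such as a run-together hashtag) is not counted unless that longer token is itself on the watch list.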

@mgeyrdnlr

In the gold data set, out of 318,765 word tokens, the most frequent words and word groups around which a misogynistic statement could potentially be built are:

  1. women (freq. 1748)
  2. sexist (freq. 1036)
  3. she (freq. 909)
  4. her (freq. 882)
  5. woman (freq. 832)
  6. fucking (freq. 682)
  7. fuck (freq. 677)
  8. rape (freq. 651)
  9. stupid (freq. 365)
  10. girl (freq. 363)
  11. hate (freq. 273)
  12. female (freq. 265)
  13. notsexist (freq. 208)
  14. male (freq. 143)
  15. feminism (freq. 139)
  16. feminist (freq. 139)
  17. feminists (freq. 138)
  18. womensuck (freq. 123)
  19. abuse (freq. 88)
  20. realdonaldtrump (freq. 88)
  21. sexual (freq. 88)
  22. ladies (freq. 77)
  23. crazy (freq. 76)
  24. cooking (freq. 73)
  25. females (freq. 69)
  26. gender (freq. 69)
  27. camel (freq. 65)
  28. toe (freq. 64)
  29. equality (freq. 63)
  30. raped (freq. 63)
  31. equal (freq. 62)
  32. fuckin (freq. 60)
  33. fucked (freq. 57)
  34. trans (freq. 55)
  35. mother (freq. 54)
  36. sexism (freq. 54)
  37. womenagainstfeminism (freq. 49)
  38. disgusting (freq. 48)
  39. abortion (freq. 46)
  40. notallmen (freq. 37)
  41. lady (freq. 36)
  42. vagina (freq. 28)
  43. rapists (freq. 27)
  44. weinstein (freq. 27)
  45. penis (freq. 22)
  46. lesbian (freq. 19)
  47. maledominance (freq. 19)
  48. sexually (freq. 18)
  49. boobs (freq. 11)
  50. feminine (freq. 9)
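
Raw counts are easier to compare with other corpora once they are normalized by corpus size. A small sketch, using the 318,765-token total and three of the frequencies reported in this list; the helper name `per_10k` and the per-10,000-token scale are illustrative choices, not something used elsewhere in this thread.

```python
TOTAL_TOKENS = 318_765  # corpus size reported in this thread


def per_10k(freq, total=TOTAL_TOKENS):
    """Convert a raw count into occurrences per 10,000 tokens."""
    return freq / total * 10_000


# Three counts copied from the list above.
reported = {"women": 1748, "sexist": 1036, "she": 909}

for term, freq in reported.items():
    print(f"{term}: {per_10k(freq):.1f} per 10,000 tokens")
```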
