- Paper: KOCOH: A Dataset for Detecting Context-Dependent Hate Speech
- Authors: Park Eunah, Song Sanghoun
- Contact: [email protected]
- Hugging Face🤗
This is KOCOH, a Korean dataset for detecting context-dependent hate speech.
Korean description (한국어 설명)
- Hate speech
  - Verbal or non-verbal expressions that propagate or promote prejudice or discrimination against minorities (groups with a common identity who have relatively diminished political and social power), that denigrate, insult, or threaten individuals or groups based on their attributes as minorities, or that incite discrimination, hostility, or violence against them (Hong, 2018)
- Context-dependent hate speech
  - Hate speech that is interpreted as hateful through contextual factors rather than through its explicit content alone
  - In other words, without context, the statement itself may appear neutral or even positive
- Examples

| Context | Comment | Hate |
|---|---|---|
| 여성 고용이 미흡해 정부가 불이익을 준 기업의 이름이 공개되었다.<br>A list of companies penalized by the government for inadequate female employment was released. | 믿을 만한 그룹이라는 얘기네<br>This means they're a trustworthy group. | 1 |
| 직원에게 높은 수준의 복지를 제공한 기업의 이름이 공개되었다.<br>A list of companies that provide high-level welfare benefits to employees was released. | 믿을 만한 그룹이라는 얘기네<br>This means they're a trustworthy group. | 0 |

- Source: DC Inside Real-time Best Gallery
- Period: 2024/06/02 to 2024/06/23
- Size

| Total | Type 1 | Type 2 | Type 3 |
|---|---|---|---|
| 2,005 | 539 | 539 | 927 |

- Types and examples

| Type | Context | Comment | Hate |
|---|---|---|---|
| Type 1 | Actually written context A<br>광주광역시의 맛있는 음식 다섯 가지를 소개했다.<br>Introduced five delicious foods from Gwangju Metropolitan City. | Actually written comment C<br>먹고 싶은데 여권 들고 가기 귀찮아::<br>I want to eat them but it's annoying to bring my passport. | 1 |
| Type 2 | Created context B<br>일본 오사카의 맛있는 음식 다섯 가지를 소개했다.<br>Introduced five delicious foods from Osaka, Japan. | Comment D (same as comment C)<br>먹고 싶은데 여권 들고 가기 귀찮아::<br>I want to eat them but it's annoying to bring my passport. | 0 |
| Type 3 | Actually written context A<br>광주광역시의 맛있는 음식 다섯 가지를 소개했다.<br>Introduced five delicious foods from Gwangju Metropolitan City. | Actually written comment E<br>역시 광주다<br>That's Gwangju for you. | 0 |

- Columns

| Column | Description |
|---|---|
| index | Data index |
| set | Post index |
| type | Type number (1~3 labeling) |
| date | The date the DC Inside post was written |
| link | The link to the DC Inside post |
| title | The title of the DC Inside post |
| context | Summary of the DC Inside post content / Created context |
| comment | Comments collected from DC Inside |
| hate speech | Whether it is hate speech (0 or 1 labeling) |
| gender | Target: gender (0 or 1 labeling) |
| disability | Target: disability (0 or 1 labeling) |
| race/nation | Target: race/nation (0 or 1 labeling) |
| region | Target: region (0 or 1 labeling) |
| age | Target: age (0 or 1 labeling) |
| note | Ikiyano style (-no) |

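
The type scheme and columns above suggest one way to recover "minimal pairs": a Type 1 row and a Type 2 row that share the same comment but differ in context and hate label. The following is a minimal sketch in plain Python, not the authors' tooling. The three rows are copied from the examples in this README. The dictionary keys assume the released file uses the column names exactly as listed, and `minimal_pairs` and `TYPE_COUNTS` are hypothetical names introduced only for illustration.

```python
from collections import defaultdict

# Row counts per type, as reported in the Size table.
TYPE_COUNTS = {1: 539, 2: 539, 3: 927}

# Toy rows copied from the "Types and examples" table.
# Keys are ASSUMED to match the column names listed in this README.
ROWS = [
    {"type": 1,
     "context": "광주광역시의 맛있는 음식 다섯 가지를 소개했다.",
     "comment": "먹고 싶은데 여권 들고 가기 귀찮아::",
     "hate speech": 1},
    {"type": 2,
     "context": "일본 오사카의 맛있는 음식 다섯 가지를 소개했다.",
     "comment": "먹고 싶은데 여권 들고 가기 귀찮아::",
     "hate speech": 0},
    {"type": 3,
     "context": "광주광역시의 맛있는 음식 다섯 가지를 소개했다.",
     "comment": "역시 광주다",
     "hate speech": 0},
]


def minimal_pairs(rows):
    """Pair each Type 1 row with every Type 2 row that reuses its comment."""
    type2_by_comment = defaultdict(list)
    for row in rows:
        if row["type"] == 2:
            type2_by_comment[row["comment"]].append(row)

    pairs = []
    for row in rows:
        if row["type"] == 1:
            # .get avoids inserting empty entries for unmatched comments.
            for twin in type2_by_comment.get(row["comment"], []):
                pairs.append((row, twin))
    return pairs


pairs = minimal_pairs(ROWS)
for hateful, neutral in pairs:
    print("comment:", hateful["comment"])
    print("  hate =", hateful["hate speech"], "|", hateful["context"])
    print("  hate =", neutral["hate speech"], "|", neutral["context"])

# The per-type counts should add up to the reported total.
print("total rows:", sum(TYPE_COUNTS.values()))
```

Matching on the comment text follows the table's note that comment D is the same as comment C. Whether paired rows also share a `set` value is not stated here, so the sketch does not rely on it.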
Hong, S. (2018). When Words Hurt. Across.
@article{ART003173761,
author={박은아 and 송상헌},
title={KOCOH: 맥락 의존적 혐오 표현 탐지를 위한 데이터 세트},
journal={한국어학},
issn={1226-9123},
year={2025},
volume={106},
pages={251-277}
}
TBD