Enabling sexual orientation attribute by PS5138 · Pull Request #233 · cvs-health/langfair

PS5138 · 2026-02-15T17:56:00Z

Closes #142

Hi! This is my second open-source contribution, so I appreciate your patience. I've done my best to follow the existing code patterns and contributing guidelines, but if I've missed anything or if there are changes you'd like me to make, please let me know and I'll be happy to work on it.

Description

This PR adds sexual orientation as a third protected attribute to the CounterfactualGenerator, alongside the existing gender and race attributes. The implementation follows the same substitution strategy used for race (all-to-one replacement) with four groups: heterosexual, gay, lesbian, and bisexual.

Changes in detail

langfair/constants/word_lists.py

Added SEXUAL_ORIENTATION_WORDS_NOT_REQUIRING_CONTEXT (13 terms: homosexual, heterosexual, bisexual, lesbian, queer, lgbtq, etc.)
Added SEXUAL_ORIENTATION_WORDS_REQUIRING_CONTEXT (3 terms: gay, straight, pride; these only match when followed by a person word, to avoid false positives like "straight line" or "go straight")
Word lists informed by the HRC Glossary of Terms
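
For reviewers, here is a minimal sketch of the context rule described above. It is not the code in this PR: the lists are abbreviated, and `PERSON_WORDS` and `find_orientation_terms` are hypothetical names used only for illustration. A context-requiring term matches only when a person word follows it.

```python
import re

# Abbreviated, illustrative lists (assumptions, not the lists added in this PR).
WORDS_NOT_REQUIRING_CONTEXT = ["heterosexual", "homosexual", "bisexual", "lesbian"]
WORDS_REQUIRING_CONTEXT = ["gay", "straight"]
PERSON_WORDS = ["man", "men", "woman", "women", "person", "people"]  # hypothetical


def find_orientation_terms(text: str) -> list[str]:
    """Return matched terms (lowercased) in order of appearance."""
    standalone = "|".join(map(re.escape, WORDS_NOT_REQUIRING_CONTEXT))
    contextual = "|".join(map(re.escape, WORDS_REQUIRING_CONTEXT))
    persons = "|".join(map(re.escape, PERSON_WORDS))
    # Standalone terms match as whole words. Contextual terms match only
    # when the lookahead sees whitespace followed by a person word.
    pattern = re.compile(
        rf"\b(?:{standalone}|(?:{contextual})(?=\s+(?:{persons})\b))\b",
        re.IGNORECASE,
    )
    return [m.group(0).lower() for m in pattern.finditer(text)]
```

With this sketch, "straight" in "straight line" or "go straight" is skipped, while "straight woman" is picked up.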

langfair/generator/counterfactual.py

Built STRICT_SEXUAL_ORIENTATION_WORDS mappings (mirroring the race pattern)
Added sexual_orientation to attribute_to_word_lists, group_mapping, and validation
Added _get_sexual_orientation_subsequences, _counterfactual_sub_sexual_orientation, and _replace_sexual_orientation helper methods (mirroring _get_race_subsequences, _counterfactual_sub_race, and _replace_race)
Replacement sorts terms by length (longest first) so that a shorter term never matches inside a longer one (e.g., "homosexual" inside "homosexuals")
Added sexual_orientation support to neutralize_tokens (uses [MASK], same as race)
Updated all relevant docstrings
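
To illustrate why the longest-first ordering matters, here is a small sketch. It is an assumption-laden illustration, not the `_replace_sexual_orientation` helper itself: `substitute` and the two-entry `mapping` are hypothetical. In a regex alternation the first alternative that matches wins, so the order of terms decides the result.

```python
import re

# Hypothetical mapping from a matched term to its replacement.
mapping = {
    "homosexual": "bisexual",
    "homosexuals": "bisexual people",
}


def substitute(text: str, mapping: dict[str, str], longest_first: bool = True) -> str:
    """Replace every mapped term in text, optionally trying longer terms first."""
    terms = list(mapping)
    if longest_first:
        # Longer terms go first so "homosexuals" is tried before "homosexual".
        terms = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return pattern.sub(lambda m: mapping[m.group(0).lower()], text)
```

Without the sort, "homosexuals" is matched only up to "homosexual" and the trailing "s" is left behind, which is the partial match the PR avoids.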

langfair/auto/auto.py

Added sexual_orientation to Protected_Attributes
Fixed protected_words initialization to derive dynamically from Protected_Attributes instead of being hardcoded to just race and gender
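
A rough sketch of the idea behind the second change, under stated assumptions: the PR description does not say what shape `Protected_Attributes` or `protected_words` have, so the list of names and the per-attribute counter dictionary below are guesses made for illustration, and `init_protected_words` is a hypothetical helper.

```python
# Assumed shape: a plain list of attribute names (not confirmed by the PR text).
PROTECTED_ATTRIBUTES = ["race", "gender", "sexual_orientation"]


def init_protected_words(attributes: list[str]) -> dict[str, int]:
    """Build one zeroed counter per attribute instead of hardcoding the keys."""
    return {attribute: 0 for attribute in attributes}


protected_words = init_protected_words(PROTECTED_ATTRIBUTES)
```

Deriving the keys from a single list means a newly added attribute is picked up without touching the initialization code.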

tests/test_counterfactualgenerator.py

Added test_counterfactual_sexual_orientation covering parse_texts, create_prompts, generate_responses, check_ftu, neutralize_tokens, and validation error handling

Contributor License Agreement

confirm you have signed the LangFair CLA

Tests

no new tests required
new tests added
existing tests adjusted

All tests passed.

Documentation

no documentation changes needed
README updated
API docs added or updated
example notebook added or updated

Screenshots

N/A

Enabling sexual orientation attribute

9a38009

PS5138 mentioned this pull request Feb 15, 2026

Enable sexual orientation attribute for CounterfactualGenerator #142

Open

PS5138 wants to merge 1 commit into cvs-health:main from PS5138:sexual-orientation-attribute
