Release v2.1.0 #81

Merged
merged 6 commits into from
Feb 5, 2025

RomiconEZ
Owner

Version update for main.

RomiconEZ and others added 6 commits December 16, 2024 14:58
What's New:

New Features & Enhancements
- Introduced Multistage Attack: We've added a new `multistage_depth` parameter to the `start_testing()` function that lets users specify the depth of a dialogue during testing, enabling more sophisticated and targeted LLM red-teaming strategies.
- Refactored Sycophancy Attack: The `sycophancy_test` has been renamed to `sycophancy`, transforming it into a multistage attack for increased effectiveness in uncovering model vulnerabilities.
- Enhanced Logical Inconsistencies Attack: The `logical_inconsistencies_test` has been renamed to `logical_inconsistencies` and restructured as a multistage attack to better detect and exploit logical weaknesses within language models.
- New Multistage Harmful Behavior Attack: Introducing `harmful_behaviour_multistage`, a more nuanced version of the original harmful behavior attack, designed for deeper penetration testing.
- Innovative System Prompt Leakage Attack: We've developed a new multistage attack, `system_prompt_leakage`, which leverages jailbreak examples from a dataset to target and exploit model internals.
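
The renames and the depth limit described above can be sketched roughly as follows. This is an illustrative, self-contained sketch, not the framework's actual code: `migrate_test_names`, `run_multistage`, and the `(name, attempts)` tuple shape are hypothetical names and assumptions introduced here. Only the old and new test names and the `multistage_depth` parameter name come from these notes.

```python
from typing import Callable, List, Tuple

# Renames stated in the release notes above (old name -> new name).
RENAMED_TESTS = {
    "sycophancy_test": "sycophancy",
    "logical_inconsistencies_test": "logical_inconsistencies",
}


def migrate_test_names(tests: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Hypothetical helper: map pre-2.1.0 test names to their 2.1.0 names.

    Assumes tests are given as (name, attempts) pairs; unknown names pass through.
    """
    return [(RENAMED_TESTS.get(name, name), attempts) for name, attempts in tests]


def run_multistage(
    attacker: Callable[[List[str]], str],
    target: Callable[[str], str],
    is_broken: Callable[[str], bool],
    multistage_depth: int,
) -> Tuple[bool, int]:
    """Hypothetical dialogue loop bounded by multistage_depth.

    Each turn, the attacker sees the dialogue so far and produces the next
    prompt. The loop stops at the first reply judged broken, or once
    multistage_depth turns have been used.
    """
    history: List[str] = []
    for turn in range(1, multistage_depth + 1):
        prompt = attacker(history)
        reply = target(prompt)
        history.extend([prompt, reply])
        if is_broken(reply):
            return True, turn
    return False, multistage_depth
```

In this sketch a larger `multistage_depth` gives the attacker more turns to adapt to the target's replies, at the cost of more model calls per attempt.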

Improvements & Refinements
- Conducted extensive refactoring for improved code efficiency and maintainability across the framework.
- Made numerous small improvements and optimizations to enhance overall performance and user experience.

---------

Co-authored-by: Timur Nizamov <[email protected]>
Co-authored-by: Nikita Ivanov <[email protected]>
* small fix for attacks and add strip parameter for ChatSession

---------

Co-authored-by: Низамов Тимур Дамирович <[email protected]>
@RomiconEZ RomiconEZ added the dependencies Pull requests that update a dependency file label Feb 5, 2025
@RomiconEZ RomiconEZ self-assigned this Feb 5, 2025
@RomiconEZ RomiconEZ merged commit d439156 into main Feb 5, 2025
2 of 4 checks passed
@RomiconEZ RomiconEZ deleted the release-2-1-0 branch February 5, 2025 09:14