Auto merge of #11781 - partiallytyped:11710, r=xFrednet
Verify Borrow<T> semantics for types that implement Hash, Borrow<str> and Borrow<[u8]>.

Fixes #11710

The essence of the issue is that a type implementing `Borrow<T>` provides a facet or representation of the underlying type. Under these semantics, `hash(a) == hash(a.borrow())` must hold. This becomes a problem when a type implements `Borrow<str>`, `Borrow<[u8]>` and `Hash`: the hashes of all three forms are expected to be identical, but the hash of a `[u8]` is not the same as that of a `String`, even when the byte slice is obtained from `.as_bytes()`.

- [x] Followed [lint naming conventions][lint_naming]
- [x] Added passing UI tests (including committed `.stderr` file)
- [x] `cargo test` passes locally
- [x] Executed `cargo dev update_lints`
- [x] Added lint documentation
- [x] Run `cargo dev fmt`

---

- [x] Explanation of the issue in the code
- [x] Tests reproducing the issue
- [x] Lint rule and emission
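The mismatch described in the commit message can be observed directly. The following is a minimal sketch, not part of the commit: `hash_of` is a hypothetical helper, and the check relies on the current standard-library `Hash` impls, whose exact output is not a stability guarantee.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: hashes any `Hash` value with a fresh `DefaultHasher`.
fn hash_of<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let owned = String::from("abc");
    let as_str: &str = owned.as_str();
    let as_bytes: &[u8] = owned.as_bytes();

    // `String` hashes by delegating to `str`, so `Borrow<str>` is consistent.
    assert_eq!(hash_of(&owned), hash_of(as_str));

    // `str` and `[u8]` feed different data to the hasher, so the hashes differ
    // even though both views cover the same bytes.
    assert_ne!(hash_of(as_str), hash_of(as_bytes));

    println!("str: {:#x}, bytes: {:#x}", hash_of(as_str), hash_of(as_bytes));
}
```

Since `String` already agrees with `str`, a type that hashes like its `String` field can satisfy `Borrow<str>` but not `Borrow<[u8]>` at the same time.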
Showing 6 changed files with 287 additions and 0 deletions.
106 changes: 106 additions & 0 deletions
clippy_lints/src/impl_hash_with_borrow_str_and_bytes.rs
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,106 @@ | ||
use clippy_utils::diagnostics::span_lint_and_then; | ||
use clippy_utils::ty::implements_trait; | ||
use rustc_hir::def::{DefKind, Res}; | ||
use rustc_hir::{Item, ItemKind, Path, TraitRef}; | ||
use rustc_lint::{LateContext, LateLintPass}; | ||
use rustc_middle::ty::Ty; | ||
use rustc_session::{declare_lint_pass, declare_tool_lint}; | ||
use rustc_span::symbol::sym; | ||
|
||
declare_clippy_lint! { | ||
/// ### What it does | ||
/// | ||
/// Checks for types that implement all three of `Hash`, `Borrow<str>` and | ||
/// `Borrow<[u8]>`. Such a type cannot uphold the contract that `Borrow` | ||
/// places on `Hash`, because `str` and `[u8]` hash differently, so the two | ||
/// borrowed forms can never both hash like the owning type. | ||
/// | ||
/// ### Why is this bad? | ||
/// | ||
/// When providing implementations for `Borrow<T>`, one should consider whether the different | ||
/// implementations should act as facets or representations of the underlying type. Generic code | ||
/// typically uses `Borrow<T>` when it relies on the identical behavior of these additional trait | ||
/// implementations. These traits will likely appear as additional trait bounds. | ||
/// | ||
/// In particular `Eq`, `Ord` and `Hash` must be equivalent for borrowed and owned values: | ||
/// `x.borrow() == y.borrow()` should give the same result as `x == y`. | ||
/// It follows then that the following equivalence must hold: | ||
/// `hash(x) == hash((x as Borrow<[u8]>).borrow()) == hash((x as Borrow<str>).borrow())` | ||
/// | ||
/// Unfortunately, this equivalence does not hold, because `hash("abc") != hash("abc".as_bytes())`. | ||
/// The `Hash` impl for `str` passes an additional `0xFF` byte to the hasher to avoid | ||
/// collisions. For example, the tuples `("a", "bc")` and `("ab", "c")` would have | ||
/// the same hash value if the `0xFF` byte were not added. | ||
/// | ||
/// ### Example | ||
/// | ||
/// ``` | ||
/// use std::borrow::Borrow; | ||
/// use std::hash::{Hash, Hasher}; | ||
/// | ||
/// struct ExampleType { | ||
/// data: String | ||
/// } | ||
/// | ||
/// impl Hash for ExampleType { | ||
/// fn hash<H: Hasher>(&self, state: &mut H) { | ||
/// self.data.hash(state); | ||
/// } | ||
/// } | ||
/// | ||
/// impl Borrow<str> for ExampleType { | ||
/// fn borrow(&self) -> &str { | ||
/// &self.data | ||
/// } | ||
/// } | ||
/// | ||
/// impl Borrow<[u8]> for ExampleType { | ||
/// fn borrow(&self) -> &[u8] { | ||
/// self.data.as_bytes() | ||
/// } | ||
/// } | ||
/// ``` | ||
/// As a consequence, hashing a `&ExampleType` and hashing the results of its two | ||
/// borrows cannot all produce the same value. | ||
/// | ||
#[clippy::version = "1.76.0"] | ||
pub IMPL_HASH_BORROW_WITH_STR_AND_BYTES, | ||
correctness, | ||
"ensures that the semantics of `Borrow` for `Hash` are satisfied when `Borrow<str>` and `Borrow<[u8]>` are implemented" | ||
} | ||
|
||
declare_lint_pass!(ImplHashWithBorrowStrBytes => [IMPL_HASH_BORROW_WITH_STR_AND_BYTES]); | ||
|
||
impl LateLintPass<'_> for ImplHashWithBorrowStrBytes { | ||
/// We are emitting this lint at the Hash impl of a type that implements all | ||
/// three of `Hash`, `Borrow<str>` and `Borrow<[u8]>`. | ||
fn check_item(&mut self, cx: &LateContext<'_>, item: &Item<'_>) { | ||
if let ItemKind::Impl(imp) = item.kind | ||
&& let Some(TraitRef {path: Path {span, res, ..}, ..}) = imp.of_trait | ||
&& let ty = cx.tcx.type_of(item.owner_id).instantiate_identity() | ||
&& let Some(hash_id) = cx.tcx.get_diagnostic_item(sym::Hash) | ||
&& Res::Def(DefKind::Trait, hash_id) == *res | ||
&& let Some(borrow_id) = cx.tcx.get_diagnostic_item(sym::Borrow) | ||
// since we are in the `Hash` impl, we don't need to check for that. | ||
// we need only to check for `Borrow<str>` and `Borrow<[u8]>` | ||
&& implements_trait(cx, ty, borrow_id, &[cx.tcx.types.str_.into()]) | ||
&& implements_trait(cx, ty, borrow_id, &[Ty::new_slice(cx.tcx, cx.tcx.types.u8).into()]) | ||
{ | ||
span_lint_and_then( | ||
cx, | ||
IMPL_HASH_BORROW_WITH_STR_AND_BYTES, | ||
*span, | ||
"the semantics of `Borrow<T>` around `Hash` can't be satisfied when both `Borrow<str>` and `Borrow<[u8]>` are implemented", | ||
|diag| { | ||
diag.note("the `Borrow` semantics require that `Hash` must behave the same for all implementations of Borrow<T>"); | ||
diag.note( | ||
"however, the hash implementations of a string (`str`) and the bytes of a string `[u8]` do not behave the same ..." | ||
); | ||
diag.note("... as (`hash(\"abc\") != hash(\"abc\".as_bytes())`"); | ||
diag.help("consider either removing one of the `Borrow` implementations (`Borrow<str>` or `Borrow<[u8]>`) ..."); | ||
diag.help("... or not implementing `Hash` for this type"); | ||
}, | ||
); | ||
} | ||
} | ||
} |
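The lint documentation above explains the difference by the extra `0xFF` byte that the `Hash` impl for `str` passes to the hasher. A rough way to see this is to hash into a hasher that only records its input. The sketch below assumes a hypothetical `RecordingHasher` and `fed_bytes` helper, neither of which is part of the commit, and it deliberately asserts only that the byte streams differ, since the exact bytes are an implementation detail of the standard library.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical helper: a `Hasher` that records every byte it is fed
/// instead of mixing the bytes into a digest.
#[derive(Default)]
struct RecordingHasher {
    bytes: Vec<u8>,
}

impl Hasher for RecordingHasher {
    fn write(&mut self, bytes: &[u8]) {
        self.bytes.extend_from_slice(bytes);
    }

    fn finish(&self) -> u64 {
        0 // The digest is irrelevant here; only the recorded input matters.
    }
}

/// Returns the raw byte stream that `value` passes to the hasher.
fn fed_bytes<T: Hash + ?Sized>(value: &T) -> Vec<u8> {
    let mut hasher = RecordingHasher::default();
    value.hash(&mut hasher);
    hasher.bytes
}

fn main() {
    // Plain concatenation cannot tell the two tuples apart ...
    assert_eq!(["a", "bc"].concat(), ["ab", "c"].concat());
    // ... but the streams fed to the hasher can, thanks to the separator.
    assert_ne!(fed_bytes(&("a", "bc")), fed_bytes(&("ab", "c")));

    // The same mechanism makes `str` and `[u8]` disagree.
    assert_ne!(fed_bytes("abc"), fed_bytes("abc".as_bytes()));

    println!("str:   {:?}", fed_bytes("abc"));
    println!("bytes: {:?}", fed_bytes("abc".as_bytes()));
}
```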
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,136 @@ | ||
#![warn(clippy::impl_hash_borrow_with_str_and_bytes)] | ||
|
||
use std::borrow::Borrow; | ||
use std::hash::{Hash, Hasher}; | ||
|
||
struct ExampleType { | ||
data: String, | ||
} | ||
|
||
impl Hash for ExampleType { | ||
//~^ ERROR: can't | ||
fn hash<H: Hasher>(&self, state: &mut H) { | ||
self.data.hash(state); | ||
} | ||
} | ||
|
||
impl Borrow<str> for ExampleType { | ||
fn borrow(&self) -> &str { | ||
&self.data | ||
} | ||
} | ||
|
||
impl Borrow<[u8]> for ExampleType { | ||
fn borrow(&self) -> &[u8] { | ||
self.data.as_bytes() | ||
} | ||
} | ||
|
||
struct ShouldNotRaiseForHash {} | ||
impl Hash for ShouldNotRaiseForHash { | ||
fn hash<H: Hasher>(&self, state: &mut H) { | ||
todo!(); | ||
} | ||
} | ||
|
||
struct ShouldNotRaiseForBorrow {} | ||
impl Borrow<str> for ShouldNotRaiseForBorrow { | ||
fn borrow(&self) -> &str { | ||
todo!(); | ||
} | ||
} | ||
impl Borrow<[u8]> for ShouldNotRaiseForBorrow { | ||
fn borrow(&self) -> &[u8] { | ||
todo!(); | ||
} | ||
} | ||
|
||
struct ShouldNotRaiseForHashBorrowStr {} | ||
impl Hash for ShouldNotRaiseForHashBorrowStr { | ||
fn hash<H: Hasher>(&self, state: &mut H) { | ||
todo!(); | ||
} | ||
} | ||
impl Borrow<str> for ShouldNotRaiseForHashBorrowStr { | ||
fn borrow(&self) -> &str { | ||
todo!(); | ||
} | ||
} | ||
|
||
struct ShouldNotRaiseForHashBorrowSlice {} | ||
impl Hash for ShouldNotRaiseForHashBorrowSlice { | ||
fn hash<H: Hasher>(&self, state: &mut H) { | ||
todo!(); | ||
} | ||
} | ||
|
||
impl Borrow<[u8]> for ShouldNotRaiseForHashBorrowSlice { | ||
fn borrow(&self) -> &[u8] { | ||
todo!(); | ||
} | ||
} | ||
|
||
#[derive(Hash)] | ||
//~^ ERROR: can't | ||
struct Derived { | ||
data: String, | ||
} | ||
|
||
impl Borrow<str> for Derived { | ||
fn borrow(&self) -> &str { | ||
self.data.as_str() | ||
} | ||
} | ||
|
||
impl Borrow<[u8]> for Derived { | ||
fn borrow(&self) -> &[u8] { | ||
self.data.as_bytes() | ||
} | ||
} | ||
|
||
struct GenericExampleType<T> { | ||
data: T, | ||
} | ||
|
||
impl<T: Hash> Hash for GenericExampleType<T> { | ||
fn hash<H: Hasher>(&self, state: &mut H) { | ||
self.data.hash(state); | ||
} | ||
} | ||
|
||
impl Borrow<str> for GenericExampleType<String> { | ||
fn borrow(&self) -> &str { | ||
&self.data | ||
} | ||
} | ||
|
||
impl Borrow<[u8]> for GenericExampleType<&'static [u8]> { | ||
fn borrow(&self) -> &[u8] { | ||
self.data | ||
} | ||
} | ||
|
||
struct GenericExampleType2<T> { | ||
data: T, | ||
} | ||
|
||
impl Hash for GenericExampleType2<String> { | ||
//~^ ERROR: can't | ||
// The lint correctly fires here: the generic type has concrete impls of | ||
// all three traits for the same instantiation, `GenericExampleType2<String>`. | ||
fn hash<H: Hasher>(&self, state: &mut H) { | ||
self.data.hash(state); | ||
} | ||
} | ||
|
||
impl Borrow<str> for GenericExampleType2<String> { | ||
fn borrow(&self) -> &str { | ||
&self.data | ||
} | ||
} | ||
|
||
impl Borrow<[u8]> for GenericExampleType2<String> { | ||
fn borrow(&self) -> &[u8] { | ||
self.data.as_bytes() | ||
} | ||
} |
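The lint's help text suggests removing one of the `Borrow` implementations. One possible shape of that fix, sketched here under assumptions, keeps `Hash` and `Borrow<str>` and exposes the bytes through `AsRef<[u8]>`, which does not carry the hash-equivalence requirement of `Borrow`. `FixedType` and `lookup_by_str` are hypothetical names and do not appear in the test file.

```rust
use std::borrow::Borrow;
use std::collections::HashMap;

/// Hypothetical fixed variant of `ExampleType`: only one `Borrow` facet remains.
#[derive(Hash, PartialEq, Eq)]
struct FixedType {
    data: String,
}

impl Borrow<str> for FixedType {
    fn borrow(&self) -> &str {
        &self.data
    }
}

// `AsRef` is a plain cheap conversion, so offering a byte view through it
// makes no promise that the bytes hash like `FixedType`.
impl AsRef<[u8]> for FixedType {
    fn as_ref(&self) -> &[u8] {
        self.data.as_bytes()
    }
}

/// Looks up a `FixedType` key by `&str`, which relies on `Borrow<str>`.
fn lookup_by_str() -> Option<u32> {
    let mut map = HashMap::new();
    map.insert(FixedType { data: "abc".to_owned() }, 1u32);
    map.get("abc").copied()
}

fn main() {
    // The derived `Hash` hashes the `String` field, which agrees with `str`.
    assert_eq!(lookup_by_str(), Some(1));

    // The byte view is still available, just not as a map lookup key.
    let value = FixedType { data: "abc".to_owned() };
    let bytes: &[u8] = value.as_ref();
    assert_eq!(bytes, &b"abc"[..]);
}
```

Whether `Borrow<str>` or `Borrow<[u8]>` is the one to keep depends on how the type is meant to be used as a key; if the bytes are the primary facet, `Hash` would need to be written by hand to match `[u8]`.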
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
error: the semantics of `Borrow<T>` around `Hash` can't be satisfied when both `Borrow<str>` and `Borrow<[u8]>` are implemented | ||
--> $DIR/impl_hash_with_borrow_str_and_bytes.rs:10:6 | ||
| | ||
LL | impl Hash for ExampleType { | ||
| ^^^^ | ||
| | ||
= note: the `Borrow` semantics require that `Hash` must behave the same for all implementations of Borrow<T> | ||
= note: however, the hash implementations of a string (`str`) and the bytes of a string `[u8]` do not behave the same ... | ||
= note: ... as (`hash("abc") != hash("abc".as_bytes())` | ||
= help: consider either removing one of the `Borrow` implementations (`Borrow<str>` or `Borrow<[u8]>`) ... | ||
= help: ... or not implementing `Hash` for this type | ||
= note: `-D clippy::impl-hash-borrow-with-str-and-bytes` implied by `-D warnings` | ||
= help: to override `-D warnings` add `#[allow(clippy::impl_hash_borrow_with_str_and_bytes)]` | ||
|
||
error: the semantics of `Borrow<T>` around `Hash` can't be satisfied when both `Borrow<str>` and `Borrow<[u8]>` are implemented | ||
--> $DIR/impl_hash_with_borrow_str_and_bytes.rs:73:10 | ||
| | ||
LL | #[derive(Hash)] | ||
| ^^^^ | ||
| | ||
= note: the `Borrow` semantics require that `Hash` must behave the same for all implementations of Borrow<T> | ||
= note: however, the hash implementations of a string (`str`) and the bytes of a string `[u8]` do not behave the same ... | ||
= note: ... as (`hash("abc") != hash("abc".as_bytes())` | ||
= help: consider either removing one of the `Borrow` implementations (`Borrow<str>` or `Borrow<[u8]>`) ... | ||
= help: ... or not implementing `Hash` for this type | ||
= note: this error originates in the derive macro `Hash` (in Nightly builds, run with -Z macro-backtrace for more info) | ||
|
||
error: the semantics of `Borrow<T>` around `Hash` can't be satisfied when both `Borrow<str>` and `Borrow<[u8]>` are implemented | ||
--> $DIR/impl_hash_with_borrow_str_and_bytes.rs:117:6 | ||
| | ||
LL | impl Hash for GenericExampleType2<String> { | ||
| ^^^^ | ||
| | ||
= note: the `Borrow` semantics require that `Hash` must behave the same for all implementations of Borrow<T> | ||
= note: however, the hash implementations of a string (`str`) and the bytes of a string `[u8]` do not behave the same ... | ||
= note: ... as (`hash("abc") != hash("abc".as_bytes())` | ||
= help: consider either removing one of the `Borrow` implementations (`Borrow<str>` or `Borrow<[u8]>`) ... | ||
= help: ... or not implementing `Hash` for this type | ||
|
||
error: aborting due to 3 previous errors | ||
|