Skip to content

Commit

Permalink
Auto merge of #11781 - partiallytyped:11710, r=xFrednet
Browse files Browse the repository at this point in the history
Verify Borrow<T> semantics for types that implement Hash, Borrow<str> and Borrow<[u8]>.

Fixes #11710

The essence of the issue is that types that implement Borrow<T> provide a facet or a representation of the underlying type. Under these semantics `hash(a) == hash(a.borrow())`.

This is a problem when a type implements `Borrow<str>`, `Borrow<[u8]>` and Hash, it is expected that the hash of all three types is identical. The problem is that the hash of [u8] is not the same as that of a String, even when the byte reference ([u8]) is derived from `.as_bytes()`

- [x] Followed [lint naming conventions][lint_naming]
- [x] Added passing UI tests (including committed `.stderr` file)
- [x] `cargo test` passes locally
- [x] Executed `cargo dev update_lints`
- [x] Added lint documentation
- [x] Run `cargo dev fmt`

---

 - [x] Explanation of the issue in the code
 - [x] Tests reproducing the issue
 - [x] Lint rule and emission
  • Loading branch information
bors committed Nov 19, 2023
2 parents dbd19f9 + 3c1e0af commit 3eef533
Show file tree
Hide file tree
Showing 6 changed files with 287 additions and 0 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -5128,6 +5128,7 @@ Released 2018-09-13
[`if_then_some_else_none`]: https://rust-lang.github.io/rust-clippy/master/index.html#if_then_some_else_none
[`ifs_same_cond`]: https://rust-lang.github.io/rust-clippy/master/index.html#ifs_same_cond
[`ignored_unit_patterns`]: https://rust-lang.github.io/rust-clippy/master/index.html#ignored_unit_patterns
[`impl_hash_borrow_with_str_and_bytes`]: https://rust-lang.github.io/rust-clippy/master/index.html#impl_hash_borrow_with_str_and_bytes
[`impl_trait_in_params`]: https://rust-lang.github.io/rust-clippy/master/index.html#impl_trait_in_params
[`implicit_clone`]: https://rust-lang.github.io/rust-clippy/master/index.html#implicit_clone
[`implicit_hasher`]: https://rust-lang.github.io/rust-clippy/master/index.html#implicit_hasher
Expand Down
1 change: 1 addition & 0 deletions clippy_lints/src/declared_lints.rs
Original file line number Diff line number Diff line change
Expand Up @@ -204,6 +204,7 @@ pub(crate) static LINTS: &[&crate::LintInfo] = &[
crate::if_not_else::IF_NOT_ELSE_INFO,
crate::if_then_some_else_none::IF_THEN_SOME_ELSE_NONE_INFO,
crate::ignored_unit_patterns::IGNORED_UNIT_PATTERNS_INFO,
crate::impl_hash_with_borrow_str_and_bytes::IMPL_HASH_BORROW_WITH_STR_AND_BYTES_INFO,
crate::implicit_hasher::IMPLICIT_HASHER_INFO,
crate::implicit_return::IMPLICIT_RETURN_INFO,
crate::implicit_saturating_add::IMPLICIT_SATURATING_ADD_INFO,
Expand Down
106 changes: 106 additions & 0 deletions clippy_lints/src/impl_hash_with_borrow_str_and_bytes.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,106 @@
use clippy_utils::diagnostics::span_lint_and_then;
use clippy_utils::ty::implements_trait;
use rustc_hir::def::{DefKind, Res};
use rustc_hir::{Item, ItemKind, Path, TraitRef};
use rustc_lint::{LateContext, LateLintPass};
use rustc_middle::ty::Ty;
use rustc_session::{declare_lint_pass, declare_tool_lint};
use rustc_span::symbol::sym;

declare_clippy_lint! {
/// ### What it does
///
/// This lint is concerned with the semantics of `Borrow` and `Hash` for a
/// type that implements all three of `Hash`, `Borrow<str>` and `Borrow<[u8]>`
/// as it is impossible to satisfy the semantics of Borrow and `Hash` for
/// both `Borrow<str>` and `Borrow<[u8]>`.
///
/// ### Why is this bad?
///
/// When providing implementations for `Borrow<T>`, one should consider whether the different
/// implementations should act as facets or representations of the underlying type. Generic code
/// typically uses `Borrow<T>` when it relies on the identical behavior of these additional trait
/// implementations. These traits will likely appear as additional trait bounds.
///
/// In particular `Eq`, `Ord` and `Hash` must be equivalent for borrowed and owned values:
/// `x.borrow() == y.borrow()` should give the same result as `x == y`.
/// It follows then that the following equivalence must hold:
/// `hash(x) == hash((x as Borrow<[u8]>).borrow()) == hash((x as Borrow<str>).borrow())`
///
/// Unfortunately it doesn't hold as `hash("abc") != hash("abc".as_bytes())`.
/// This happens because the `Hash` impl for str passes an additional `0xFF` byte to
/// the hasher to avoid collisions. For example, given the tuples `("a", "bc")`, and `("ab", "c")`,
/// the two tuples would have the same hash value if the `0xFF` byte was not added.
///
/// ### Example
///
/// ```
/// use std::borrow::Borrow;
/// use std::hash::{Hash, Hasher};
///
/// struct ExampleType {
/// data: String
/// }
///
/// impl Hash for ExampleType {
/// fn hash<H: Hasher>(&self, state: &mut H) {
/// self.data.hash(state);
/// }
/// }
///
/// impl Borrow<str> for ExampleType {
/// fn borrow(&self) -> &str {
/// &self.data
/// }
/// }
///
/// impl Borrow<[u8]> for ExampleType {
/// fn borrow(&self) -> &[u8] {
/// self.data.as_bytes()
/// }
/// }
/// ```
/// As a consequence, hashing a `&ExampleType` and hashing the result of the two
/// borrows will result in different values.
///
#[clippy::version = "1.76.0"]
pub IMPL_HASH_BORROW_WITH_STR_AND_BYTES,
correctness,
"ensures that the semantics of `Borrow` for `Hash` are satisfied when `Borrow<str>` and `Borrow<[u8]>` are implemented"
}

declare_lint_pass!(ImplHashWithBorrowStrBytes => [IMPL_HASH_BORROW_WITH_STR_AND_BYTES]);

impl LateLintPass<'_> for ImplHashWithBorrowStrBytes {
/// We are emitting this lint at the Hash impl of a type that implements all
/// three of `Hash`, `Borrow<str>` and `Borrow<[u8]>`.
fn check_item(&mut self, cx: &LateContext<'_>, item: &Item<'_>) {
if let ItemKind::Impl(imp) = item.kind
&& let Some(TraitRef {path: Path {span, res, ..}, ..}) = imp.of_trait
&& let ty = cx.tcx.type_of(item.owner_id).instantiate_identity()
&& let Some(hash_id) = cx.tcx.get_diagnostic_item(sym::Hash)
&& Res::Def(DefKind::Trait, hash_id) == *res
&& let Some(borrow_id) = cx.tcx.get_diagnostic_item(sym::Borrow)
// since we are in the `Hash` impl, we don't need to check for that.
// we need only to check for `Borrow<str>` and `Borrow<[u8]>`
&& implements_trait(cx, ty, borrow_id, &[cx.tcx.types.str_.into()])
&& implements_trait(cx, ty, borrow_id, &[Ty::new_slice(cx.tcx, cx.tcx.types.u8).into()])
{
span_lint_and_then(
cx,
IMPL_HASH_BORROW_WITH_STR_AND_BYTES,
*span,
"the semantics of `Borrow<T>` around `Hash` can't be satisfied when both `Borrow<str>` and `Borrow<[u8]>` are implemented",
|diag| {
diag.note("the `Borrow` semantics require that `Hash` must behave the same for all implementations of Borrow<T>");
diag.note(
"however, the hash implementations of a string (`str`) and the bytes of a string `[u8]` do not behave the same ..."
);
diag.note("... as (`hash(\"abc\") != hash(\"abc\".as_bytes())`");
diag.help("consider either removing one of the `Borrow` implementations (`Borrow<str>` or `Borrow<[u8]>`) ...");
diag.help("... or not implementing `Hash` for this type");
},
);
}
}
}
2 changes: 2 additions & 0 deletions clippy_lints/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -144,6 +144,7 @@ mod if_let_mutex;
mod if_not_else;
mod if_then_some_else_none;
mod ignored_unit_patterns;
mod impl_hash_with_borrow_str_and_bytes;
mod implicit_hasher;
mod implicit_return;
mod implicit_saturating_add;
Expand Down Expand Up @@ -1066,6 +1067,7 @@ pub fn register_lints(store: &mut rustc_lint::LintStore, conf: &'static Conf) {
store.register_late_pass(move |_| Box::new(manual_hash_one::ManualHashOne::new(msrv())));
store.register_late_pass(|_| Box::new(iter_without_into_iter::IterWithoutIntoIter));
store.register_late_pass(|_| Box::new(iter_over_hash_type::IterOverHashType));
store.register_late_pass(|_| Box::new(impl_hash_with_borrow_str_and_bytes::ImplHashWithBorrowStrBytes));
// add lints here, do not remove this comment, it's used in `new_lint`
}

Expand Down
136 changes: 136 additions & 0 deletions tests/ui/impl_hash_with_borrow_str_and_bytes.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,136 @@
#![warn(clippy::impl_hash_borrow_with_str_and_bytes)]

use std::borrow::Borrow;
use std::hash::{Hash, Hasher};

struct ExampleType {
data: String,
}

impl Hash for ExampleType {
//~^ ERROR: can't
fn hash<H: Hasher>(&self, state: &mut H) {
self.data.hash(state);
}
}

impl Borrow<str> for ExampleType {
fn borrow(&self) -> &str {
&self.data
}
}

impl Borrow<[u8]> for ExampleType {
fn borrow(&self) -> &[u8] {
self.data.as_bytes()
}
}

struct ShouldNotRaiseForHash {}
impl Hash for ShouldNotRaiseForHash {
fn hash<H: Hasher>(&self, state: &mut H) {
todo!();
}
}

struct ShouldNotRaiseForBorrow {}
impl Borrow<str> for ShouldNotRaiseForBorrow {
fn borrow(&self) -> &str {
todo!();
}
}
impl Borrow<[u8]> for ShouldNotRaiseForBorrow {
fn borrow(&self) -> &[u8] {
todo!();
}
}

struct ShouldNotRaiseForHashBorrowStr {}
impl Hash for ShouldNotRaiseForHashBorrowStr {
fn hash<H: Hasher>(&self, state: &mut H) {
todo!();
}
}
impl Borrow<str> for ShouldNotRaiseForHashBorrowStr {
fn borrow(&self) -> &str {
todo!();
}
}

struct ShouldNotRaiseForHashBorrowSlice {}
impl Hash for ShouldNotRaiseForHashBorrowSlice {
fn hash<H: Hasher>(&self, state: &mut H) {
todo!();
}
}

impl Borrow<[u8]> for ShouldNotRaiseForHashBorrowSlice {
fn borrow(&self) -> &[u8] {
todo!();
}
}

#[derive(Hash)]
//~^ ERROR: can't
struct Derived {
data: String,
}

impl Borrow<str> for Derived {
fn borrow(&self) -> &str {
self.data.as_str()
}
}

impl Borrow<[u8]> for Derived {
fn borrow(&self) -> &[u8] {
self.data.as_bytes()
}
}

struct GenericExampleType<T> {
data: T,
}

impl<T: Hash> Hash for GenericExampleType<T> {
fn hash<H: Hasher>(&self, state: &mut H) {
self.data.hash(state);
}
}

impl Borrow<str> for GenericExampleType<String> {
fn borrow(&self) -> &str {
&self.data
}
}

impl Borrow<[u8]> for GenericExampleType<&'static [u8]> {
fn borrow(&self) -> &[u8] {
self.data
}
}

struct GenericExampleType2<T> {
data: T,
}

impl Hash for GenericExampleType2<String> {
//~^ ERROR: can't
// this is correctly throwing an error for generic with concrete impl
// for all 3 types
fn hash<H: Hasher>(&self, state: &mut H) {
self.data.hash(state);
}
}

impl Borrow<str> for GenericExampleType2<String> {
fn borrow(&self) -> &str {
&self.data
}
}

impl Borrow<[u8]> for GenericExampleType2<String> {
fn borrow(&self) -> &[u8] {
self.data.as_bytes()
}
}
41 changes: 41 additions & 0 deletions tests/ui/impl_hash_with_borrow_str_and_bytes.stderr
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
error: the semantics of `Borrow<T>` around `Hash` can't be satisfied when both `Borrow<str>` and `Borrow<[u8]>` are implemented
--> $DIR/impl_hash_with_borrow_str_and_bytes.rs:10:6
|
LL | impl Hash for ExampleType {
| ^^^^
|
= note: the `Borrow` semantics require that `Hash` must behave the same for all implementations of Borrow<T>
= note: however, the hash implementations of a string (`str`) and the bytes of a string `[u8]` do not behave the same ...
= note: ... as (`hash("abc") != hash("abc".as_bytes())`
= help: consider either removing one of the `Borrow` implementations (`Borrow<str>` or `Borrow<[u8]>`) ...
= help: ... or not implementing `Hash` for this type
= note: `-D clippy::impl-hash-borrow-with-str-and-bytes` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(clippy::impl_hash_borrow_with_str_and_bytes)]`

error: the semantics of `Borrow<T>` around `Hash` can't be satisfied when both `Borrow<str>` and `Borrow<[u8]>` are implemented
--> $DIR/impl_hash_with_borrow_str_and_bytes.rs:73:10
|
LL | #[derive(Hash)]
| ^^^^
|
= note: the `Borrow` semantics require that `Hash` must behave the same for all implementations of Borrow<T>
= note: however, the hash implementations of a string (`str`) and the bytes of a string `[u8]` do not behave the same ...
= note: ... as (`hash("abc") != hash("abc".as_bytes())`
= help: consider either removing one of the `Borrow` implementations (`Borrow<str>` or `Borrow<[u8]>`) ...
= help: ... or not implementing `Hash` for this type
= note: this error originates in the derive macro `Hash` (in Nightly builds, run with -Z macro-backtrace for more info)

error: the semantics of `Borrow<T>` around `Hash` can't be satisfied when both `Borrow<str>` and `Borrow<[u8]>` are implemented
--> $DIR/impl_hash_with_borrow_str_and_bytes.rs:117:6
|
LL | impl Hash for GenericExampleType2<String> {
| ^^^^
|
= note: the `Borrow` semantics require that `Hash` must behave the same for all implementations of Borrow<T>
= note: however, the hash implementations of a string (`str`) and the bytes of a string `[u8]` do not behave the same ...
= note: ... as (`hash("abc") != hash("abc".as_bytes())`
= help: consider either removing one of the `Borrow` implementations (`Borrow<str>` or `Borrow<[u8]>`) ...
= help: ... or not implementing `Hash` for this type

error: aborting due to 3 previous errors

0 comments on commit 3eef533

Please sign in to comment.