Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data.hash() hashes the first 80 bytes of data instead of the whole data #4862

Open
endeavour42 opened this issue Jan 9, 2024 · 0 comments
Open

Comments

@endeavour42
Copy link

endeavour42 commented Jan 9, 2024

Data.hash currently hashes the first 80 bytes of data, which means that all data values of the same size whose first 80 bytes are the same hash equal. Hashing only the first 80 bytes opens an opportunity of DOS attacks when the attacker provides a data key of the same size whose first 80 bytes are the same.

Example app:

import Foundation

let keySize = 88
let N = 100_000

var dictionary: [Data: Int] = [:]
var data0 = Data([UInt8](repeating: 0, count: keySize - 8))

func makeData(data0: Data, value: Int) -> Data {
    let a = (value >>  0) & 0xFF
    let b = (value >>  8) & 0xFF
    let c = (value >> 16) & 0xFF
    let d = (value >> 24) & 0xFF
    var data = data0
    data.append(UInt8(a))
    data.append(UInt8(b))
    data.append(UInt8(c))
    data.append(UInt8(d))
    return data
}

print("start, N = \(N), keySize = \(keySize)")

let keys = (0 ..< N).map {
    makeData(data0: data0, value: $0)
}

var startTime = Date()
for (i, key) in keys.enumerated() {
    dictionary[key] = i
}
var elapsedTime = Date().timeIntervalSince(startTime)
print("initializing dictionary took \(elapsedTime) seconds")

var key = makeData(data0: data0, value: N)
startTime = Date()
dictionary[key] = N
elapsedTime = Date().timeIntervalSince(startTime)
print("dictionary insert took: \(elapsedTime) seconds")

key = makeData(data0: data0, value: N/2)
startTime = Date()
let result = dictionary[key]!
elapsedTime = Date().timeIntervalSince(startTime)
print("dictionary lookup (\(result)) took: \(elapsedTime) seconds")

For N = 100K and keySize = 88 this gives:

 start, N = 100_000 and keySize = 88
 initializing dictionary took 120 seconds
 dictionary insert took: 0.0025 seconds
 dictionary lookup (50000) took: 0.00195 seconds

For comparison for the same N and keySize = 80 this gives a sharply different result:

 start, N = 100_000, keySize = 80
 initializing dictionary took 0.054 seconds (2000+ times faster)
 dictionary insert took: 0.0 seconds (many times faster)
 dictionary lookup (50000) took: 0.0 seconds (many times faster)

This is the relevant portion of the code in the current (at the time of writing) main branch:

// At most, hash the first 80 bytes of this data.
let range = startIndex ..< Swift.min(startIndex + 80, endIndex)
storage.withUnsafeBytes(in: range) {
hasher.combine(bytes: $0)
}

// Hash at most the first 80 bytes of this data.
let range = startIndex ..< Swift.min(startIndex + 80, endIndex)
storage.withUnsafeBytes(in: range) {
hasher.combine(bytes: $0)
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant