Use colorhash to find similarity in percentage #207
Comments
The code is here: https://github.com/JohannesBuchner/imagehash/blob/master/imagehash/__init__.py#L435 It computes 14 numbers: one for black, one for gray, and 6 histogram bins each for faint and bright colors. Each number is between 0 and 2^binbits - 1. The bits of these numbers are then flattened into a single, large array of binary values. The subtraction operation is here: https://github.com/JohannesBuchner/imagehash/blob/master/imagehash/__init__.py#L111 So I guess the maximum possible difference is binbits*14?
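As a rough sketch of how that guess could be turned into a percentage: assuming the hash really consists of 14 values of binbits bits each, and that subtracting two hashes counts the differing bits, the largest conceivable difference is 14 * binbits. The helper names `max_difference` and `similarity_percent` are hypothetical and not part of imagehash.

```python
# Number of values in the hash, per the description above:
# black, gray, 6 faint-color bins, 6 bright-color bins.
N_VALUES = 14


def max_difference(binbits: int = 3) -> int:
    """Upper bound on the bit difference, assuming 14 values of binbits bits each."""
    return N_VALUES * binbits


def similarity_percent(difference: int, binbits: int = 3) -> float:
    """Map a bit difference onto 0..100, where 100 means identical hashes."""
    upper = max_difference(binbits)
    if not 0 <= difference <= upper:
        raise ValueError(f"difference must be between 0 and {upper}")
    return 100.0 * (1.0 - difference / upper)


print(max_difference(3))   # 42
print(max_difference(32))  # 448
print(similarity_percent(75, binbits=32))
```

With imagehash, `difference` would be the result of subtracting one colorhash from another. The bound is theoretical: real image pairs may never come close to it, so the resulting percentage is only a loose scale.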
Similar images should have a small difference. This function is designed with small binbits (default=3) in mind. If two numbers are very different, all 3 bits are likely to differ, while if they are similar, likely only one or two (the least significant bits) differ. This does not always hold (in decimal digits, 9 vs 10 differ in 2 digits, although the numbers are actually close together), so it is not ideal. But if you choose binbits=64, then counting the number of differing bits is not a good approach, and does not really group similar things together. All that said, colorhash is just one possible implementation, and there are probably better approaches.
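To illustrate the caveat about close numbers differing in many bits, here is a small sketch in plain Python. It does not use imagehash; `bit_difference` is a hypothetical helper that counts differing bits between two integers.

```python
def bit_difference(a: int, b: int) -> int:
    """Count the bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


binbits = 3
for a, b in [(6, 7), (3, 4), (0, 7)]:
    # Show both values as zero-padded binary strings of width binbits.
    print(f"{a:0{binbits}b} vs {b:0{binbits}b}: {bit_difference(a, b)} bit(s) differ")

# 110 vs 111: 1 bit(s) differ
# 011 vs 100: 3 bit(s) differ
# 000 vs 111: 3 bit(s) differ
```

The neighbouring values 3 and 4 differ in as many bits as the extremes 0 and 7, which is why the bit count is only a rough proxy for how far apart two bin values are.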
@JohannesBuchner I'm a little bit confused; I'm new to this. For example, I have one blank black image and one blank white image, with binbits=32. In the end I get 128. Shouldn't it be 448 (32 * 14)?
Ooh. So working with high binbits is not that efficient?
Maybe copy the function code of colorhash and run it line by line on an example image, and look at the variables. frac_black and frac_gray are probably what you expect, but I am not sure about h_bright_counts and h_faint_counts.
My question is: can I use colorhash to express the similarity of two images as a percentage?
Example:
Let's imagine I get 75. The question is what the maximum possible value is for these two images. If it is 80, my images are not similar; if it is 800, they are quite similar.