-
The first one! It's brilliant! I'd like to rename/restructure test-iris-imagehash to make it clear that the file names no longer have anything to do with their imagehash values, with corresponding renaming of the list items in imagerepo.json. Then each graphical test:
Not entirely sure if this is what @wjbenfold meant, but I like it anyway. If it ends up taking longer - due to 're-hashing' all potential image matches every time - then I recommend creating a dedicated CI run.
-
Another question - other than the ones in the post - how do we provide backwards compatibility? Can someone trying to test an older version of Iris just use an older version of the image repo too, or do we need to support it more?
-
Proposal in a PR: #4602
-
I'll document the investigation I've done into this here, so if anyone else wants to pursue it while I'm away they can. Attached is a notebook for exploring imagehash. Based on reviewing the imagehash library, the algorithm for calculating the perceptual image hash of a single image goes something like this:
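A minimal sketch of that algorithm, assuming the defaults used by imagehash's phash (hash_size=8, highfreq_factor=4); the step numbers in the comments are just for cross-reference with the notes below, not the library's own code:

```python
import numpy as np
import scipy.fftpack
from PIL import Image


def phash_sketch(image: Image.Image, hash_size=8, highfreq_factor=4):
    # 1. Start from the rendered test output as a PIL image.
    img_size = hash_size * highfreq_factor
    # 2. GRAYSCALE: convert to a single luminance channel (PIL).
    # 3. RESIZE: shrink to img_size x img_size with the Lanczos filter (PIL).
    image = image.convert("L").resize((img_size, img_size), Image.LANCZOS)
    pixels = np.asarray(image)
    # 4. Two-dimensional DCT of the pixel array (scipy).
    dct = scipy.fftpack.dct(scipy.fftpack.dct(pixels, axis=0), axis=1)
    # 5. Keep only the low-frequency corner of the DCT.
    lowfreq = dct[:hash_size, :hash_size]
    # 6. One bit per coefficient: above or below the median (numpy).
    return lowfreq > np.median(lowfreq)
```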
The only steps that require PIL are 2 and 3; the rest rely on numpy and scipy. GRAYSCALE is easy: we can produce a bit-for-bit reproduction of that (see notebook). RESIZE is trickier. A general resize is straightforward using scipy instead of Pillow, but the imagehash algorithm that we used to generate all the hashes uses a specific resize algorithm, the Lanczos filter. I've had a couple of quick goes at reimplementing that in numpy - it's not impossible, but it's also not simple. Again, see the notebook. So if we wanted to drop the PIL/Pillow dependency, a couple of options:
-
This is motivated by the pinned dependency on Pillow that Iris currently has for the graphic tests.
Current process
When the graphic tests run, each test creates an image and then imagehashes it. This imagehash is tested against the known good results in imagerepo.json. If there's a close enough match then the test passes; if not, it fails. In the failure case, the developer then follows a process of checking their images and (if the failure should be a pass) adding updated success options to both imagerepo.json and the test-iris-imagehash repository.
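Roughly, the per-test check looks something like the following sketch (the helper name, the tolerance value, and the shape of the imagerepo mapping are assumptions here, not the actual Iris test code):

```python
import imagehash
import PIL.Image


def assert_graphic_matches(test_id, result_image_path, imagerepo, tolerance=2):
    """Pass if the fresh hash is close enough to any known good hash."""
    fresh = imagehash.phash(PIL.Image.open(result_image_path))
    known_good = [imagehash.hex_to_hash(h) for h in imagerepo[test_id]]
    # Subtracting two ImageHash objects gives their Hamming distance.
    if not any(fresh - known <= tolerance for known in known_good):
        raise AssertionError(f"{test_id}: no close imagehash match in imagerepo.json")
```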
Current issues
An update to Pillow or imagehash can break all of the imagehashes simultaneously. The last time this happened, we responded by pinning the version of Pillow.
The test-iris-imagehash repo only grows over time (it's currently ~45 MB).
Suggestions
Use imagehash to just compare images in the moment (rather than storing historical imagehashes)
imagerepo.json could store known good sha256 values of graphic test results, with tests failing every time the sha256 doesn't match an acceptable value. The process for fixing the tests could involve comparing the freshly computed imagehashes of the known good image(s) in test-iris-imagehash (these now being indexed by test) and of the test result. A successful pass here would add a sha256 to imagerepo.json; a failure that was deemed acceptable by the developer could lead to a new image being uploaded to test-iris-imagehash.
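A rough sketch of that flow, purely to illustrate the idea (the function name, the tolerance, and how the known good image is located are all invented here):

```python
import hashlib
from pathlib import Path

import imagehash
import PIL.Image


def check_graphic(test_id, result_path, accepted_sha256s, known_good_path, tolerance=2):
    """Suggestion 1: exact sha256 check first, imagehash only when updating."""
    digest = hashlib.sha256(Path(result_path).read_bytes()).hexdigest()
    if digest in accepted_sha256s:
        return True  # sha256 already recorded in imagerepo.json: test passes

    # Mismatch: compare perceptual hashes of the test result and the stored
    # known good image (now indexed by test name rather than by hash value).
    fresh = imagehash.phash(PIL.Image.open(result_path))
    known = imagehash.phash(PIL.Image.open(known_good_path))
    if fresh - known <= tolerance:
        accepted_sha256s.append(digest)  # i.e. add the new sha256 to imagerepo.json
        return True

    # Otherwise the developer decides: if the change is acceptable, upload the
    # new image to test-iris-imagehash and record its sha256.
    return False
```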
Good:
Bad:
imagerepo.json could end up with a lot of sha256 values in it for all of the permutations of cartopy / matplotlib / etc. versions
Keep using imagehash as we have, but make it easier to adapt to changes in the hash algorithm
Given that symbolic links take very little space and we could generate them programmatically, we could have a script in the test-iris-imagehash repo that generates a new folder of symbolic links to a centrally stored folder of images (sketched below); we only need the images to exist once. If we're worried about hash collisions, particularly between hashes generated with different algorithms, we could store some metadata in each folder specifying which versions of Pillow and imagehash it's good for. If we haven't got a certain combination covered then we could automatically generate it by pulling a known good version of Iris and running the tests (or, for each image in the central image store, knowing which lockfile and Iris commit it was generated with).
We would also need to update the values in imagerepo.json to match the new Pillow version. If we want to be able to specify Pillow and imagehash versions to make tests stricter then we'd also need a bit of a tweak to the infrastructure in Iris.
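A small sketch of the kind of script this implies (the folder layout, naming scheme, and metadata format are invented for illustration):

```python
"""Sketch: build a per-Pillow/imagehash-version folder of symbolic links into a
single central image store, so each known good image only exists once."""
import json
from pathlib import Path

CENTRAL_STORE = Path("images")  # assumed location of the single copy of each image


def build_link_folder(repo_root, pillow_version, imagehash_version, hash_to_image):
    """hash_to_image maps each newly computed hash to an image name in CENTRAL_STORE."""
    folder = Path(repo_root) / f"pillow-{pillow_version}_imagehash-{imagehash_version}"
    folder.mkdir(parents=True, exist_ok=True)
    for hash_hex, image_name in hash_to_image.items():
        link = folder / f"{hash_hex}.png"
        if not link.exists():
            link.symlink_to(CENTRAL_STORE / image_name)
    # Metadata guards against collisions between hashes from different algorithms:
    # it records which Pillow/imagehash versions these links are valid for.
    (folder / "metadata.json").write_text(
        json.dumps({"pillow": pillow_version, "imagehash": imagehash_version})
    )
```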
Good:
Bad:
Questions
Alternative approaches I've not gone into