-
The behaviour of
Do you find it equally surprising that it is possible to create a non-masked
I personally think it is important that it is still possible to have non-masked arrays on these objects, because unfortunately masked arrays often have slower performance [1]. This seems to be because NumPy and Dask developers consider them somewhat niche, so less optimisation effort goes into masks.
Given the performance costs, I would say that the best solution is simply to be vigilant for mistakes like this. This can work: elsewhere in Iris it is a given that a dimensional object may or may not be masked, and we write and review code accordingly.
@mo-gill perhaps there is a case to be made that
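To illustrate what "writing accordingly" can look like, here is a minimal NumPy-only sketch of a function that accepts either a plain or a masked array. The function name `count_valid_nodes` and the example index values are made up for illustration; this is not code from Iris.

```python
import numpy as np


def count_valid_nodes(face_nodes):
    """Count unmasked node indices per face (hypothetical helper)."""
    # np.ma.getmaskarray returns a full boolean mask for either input kind;
    # for a plain ndarray the mask is all False.
    mask = np.ma.getmaskarray(face_nodes)
    return (~mask).sum(axis=1)


# A plain (non-masked) array: every entry counts as valid.
plain = np.array([[0, 1, 2, 3], [1, 4, 5, 2]])

# A masked array: the last entry of the second face is masked out.
masked = np.ma.masked_array(
    [[0, 1, 2, 3], [1, 4, 5, 0]],
    mask=[[False, False, False, False], [False, False, False, True]],
)

plain_counts = count_valid_nodes(plain)
masked_counts = count_valid_nodes(masked)
```

Going through `np.ma.getmaskarray` rather than reading `.mask` directly means the caller never needs to know which kind of array it was given.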
-
I'll jump in on @mo-gill's behalf here - we've been working on this together.
No, and I think there is a fundamental difference: cubes with meshes can only be loaded from netCDF. Non-mesh cubes can be loaded from other formats, which do allow loading normal, non-masked arrays. So I would expect objects that can be loaded from PP, GRIB, etc. to contain either a non-masked or a masked array, but I would not expect objects that can only be loaded from netCDF to contain a non-masked array.

We encountered this issue in our testing: we construct a synthetic cube with a mesh, and this included using a non-masked NumPy array for the connectivity. We have now fixed our test mesh generator to use a masked array, so that it is more representative of the equivalent data loaded from a file. This means the behaviour that surprised us has been fixed for our use case. Given that it surprised us, though, I thought it worth flagging to see if we could prevent others being surprised.

(At the risk of a tangent: in ANTS, we convert all cubes on load to use masked arrays for the data payload. We mostly use netCDF files, so making the behaviour of non-netCDF files consistent helps to avoid surprises within our processing code. I don't think that's appropriate for Iris, since other workflows could be using netCDF files rarely or not at all, but I think it may be context for why we were surprised.)
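For anyone building synthetic meshes in tests, a rough sketch of the kind of fix described above might look like the following. The index values and the `-1` placeholder are assumptions chosen for illustration, not what our generator or Iris actually uses.

```python
import numpy as np

# Hypothetical face-to-node indices: one quad and one triangle.
# -1 is an assumed placeholder for the triangle's missing fourth node.
PLACEHOLDER = -1
raw = np.array([[0, 1, 2, 3], [1, 4, 2, PLACEHOLDER]])

# Build the connectivity indices as a masked array, masking the placeholder,
# so the synthetic data looks more like indices loaded from a file.
face_nodes = np.ma.masked_equal(raw, PLACEHOLDER)

# Number of real nodes per face (masked entries are not counted).
nodes_per_face = face_nodes.count(axis=1)

# A mesh with no missing nodes can still be wrapped; nothing ends up masked.
all_quads = np.ma.masked_array([[0, 1, 2, 3], [1, 4, 5, 2]])
has_masked_points = np.ma.is_masked(all_quads)
```

The second case is the one that caught us out: an all-valid connectivity is easy to leave as a plain array in a test, even though the file-loaded equivalent would be masked.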
Yeah, it's a constant source of frustration ☹
-
From @scitools/peloton: this is effectively a decision between:
We're undecided what direction to take, particularly since it's hard to be objective when every person is either an Iris developer or a downstream user/developer.
-
You can currently create a non-masked connectivity array. Members of the ANTS team have had a discussion with @stephenworsley about whether it might be beneficial to automatically convert connectivity arrays to masked arrays when connectivities are created.
There are two reasons we would appreciate this being implemented:
If there's a decision not to implement this, it would be much appreciated if the documentation could be made more explicit in regard to the connectivity behaviour and masked arrays.
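As a rough idea of what the requested conversion could amount to, here is a NumPy-only sketch. The helper name `ensure_masked` is hypothetical and this is not a proposal for where or how Iris would do it; it only shows the coercion itself.

```python
import numpy as np


def ensure_masked(indices):
    """Return indices as a masked array (hypothetical helper).

    Plain arrays are wrapped with nothing masked; arrays that are
    already masked keep their existing mask.
    """
    return np.ma.asanyarray(indices)


# A plain array becomes a masked array with no masked points.
plain = np.array([[0, 1, 2], [1, 3, 2]])
coerced = ensure_masked(plain)

# An existing masked array keeps its mask.
already_masked = np.ma.masked_array(
    [[0, 1, 2], [1, 3, 2]],
    mask=[[False, False, False], [False, False, True]],
)
unchanged = ensure_masked(already_masked)
```

Downstream code could then rely on the indices always being a `MaskedArray`, at the cost of the masked-array overhead mentioned in the replies.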