Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

36863 find global b matrix warnings #38574

Merged
merged 28 commits into from
Jan 16, 2025

Conversation

andy-bridger
Copy link
Collaborator

Description of work

Summary of work

Reworking of the algorithm to incorporate some 'quality of life' changes. Mainly code refactoring and updating of error/ warning messaging.

Fixes #36863 .

Further detail of work

As per the comment on the issue (#36863 (comment)) the following changes have been made:

  1. Single wrapper function around calling a child algorithm added and replacing individual function calls
  2. Update to the find_initial_indexing function:
  • If no UB on any table then call FindUBUsingLatticeParameters on tables until successful then break
  • If there are UBs, copy UB to each workspace to get NUB x NUB triangular matrix of nindexed peaks - find best UB to use as a reference (currently: find UB which gives the most ws with n_peaks >= thresh. Tie-breaks by choosing the UB which indexes the most peaks across all ws).
  • Once a reference UB has been found, index peaks in each table using this UB - if a table doesn't have enough indexed peaks then try and call FindUBUsingLatticeParameters and make_UB_consistent - if still not successful gives warning and excludes this table from optimisation.
  1. After optimisation completed, loop over workspaces, call IndexPeaks and print warning for any runs which n_peaks < MIN_NUM_INDEXED
  2. Move try/except to be around the call to CalculateUMatrix in calcResiduals, so we can throw error telling user which workspace/run is problematic
  3. Relax MIN_NUM_INDEXED = 2
  4. In validateInputs require each table to have a minimum of 2 indexed peaks if a UB is present, or 6 peaks (don't need to be indexed) if no UB is present (atm just requires total of 6 peaks).
  5. Updated the tests to cover the change of validateInputs and the exclusion of peak tables with fewer than thresh peaks

To test:

Following script has been use for developing:

# import mantid algorithms, numpy and matplotlib
from mantid.simpleapi import *
import matplotlib.pyplot as plt
import numpy as np

# generate peak table
ws = LoadEmptyInstrument(InstrumentName='SXD', OutputWorkspace='ws')
axis = ws.getAxis(0)
axis.setUnit("TOF")  # need this to add peak to table

SetGoniometer(Workspace=ws, Axis0='-5,0,1,0,1')
peaks = CreatePeaksWorkspace(InstrumentWorkspace='ws', NumberOfPeaks=0)
UB = np.diag([0.3333, 0.25, 0.09091])  # alatt = [3,4,11] 
SetUB(peaks, UB=UB)

peaks2 = CreatePeaksWorkspace(InstrumentWorkspace='ws', NumberOfPeaks=0)
UB = np.diag([0.30, 0.25, 0.1])  # alatt = [3.3,4,10] 
SetUB(peaks2, UB=UB)

SetGoniometer(Workspace=ws, Axis0='5,0,1,0,1')
peaks3 = CreatePeaksWorkspace(InstrumentWorkspace='ws', NumberOfPeaks=0)
UB = np.diag([0.3333, 0.25, 0.09091])  # alatt = [3,4,11] 
SetUB(peaks3, UB=UB)

peaks_list = [peaks, peaks2, peaks3]
for k in range(1,6):
    for l in range(1,6):
        for peak_ws in peaks_list:
            pk = peak_ws.createPeakHKL([2, k, l])
            if pk.getDetectorID() > 0:
               peak_ws.addPeak(pk)

#ClearUB(peaks2)

FindGlobalBMatrix(PeakWorkspaces = peaks_list,
    a=3, b=4,c=11,alpha=90,beta=90,gamma=90)
    
for ws in peaks_list:
    nindexed, *_ = IndexPeaks(ws, Tolerance=0.15, CommonUBForAll=True, EnableLogging=False)
    print(f"{ws.name()} has {nindexed} indexed peaks")

Changing UB = np.diag([0.30, 0.25, 0.1]) for peaks2 to UB = np.diag([0.33, 0.25, 0.1]) should see peaks2 included within the refinement.


Reviewer

Please comment on the points listed below (full description).
Your comments will be used as part of the gatekeeper process, so please comment clearly on what you have checked during your review. If changes are made to the PR during the review process then your final comment will be the most important for gatekeepers. In this comment you should make it clear why any earlier review is still valid, or confirm that all requested changes have been addressed.

Code Review

  • Is the code of an acceptable quality?
  • Does the code conform to the coding standards?
  • Are the unit tests small and test the class in isolation?
  • If there is GUI work does it follow the GUI standards?
  • If there are changes in the release notes then do they describe the changes appropriately?
  • Do the release notes conform to the release notes guide?

Functional Tests

  • Do changes function as described? Add comments below that describe the tests performed?
  • Do the changes handle unexpected situations, e.g. bad input?
  • Has the relevant (user and developer) documentation been added/updated?

Does everything look good? Mark the review as Approve. A member of @mantidproject/gatekeepers will take care of it.

Gatekeeper

If you need to request changes to a PR then please add a comment and set the review status to "Request changes". This will stop the PR from showing up in the list for other gatekeepers.

@andy-bridger andy-bridger marked this pull request as ready for review January 10, 2025 13:07
@RichardWaiteSTFC RichardWaiteSTFC self-assigned this Jan 10, 2025
@sf1919
Copy link
Contributor

sf1919 commented Jan 13, 2025

Can we get a milestone on this please?

@RichardWaiteSTFC RichardWaiteSTFC added this to the Release 6.12 milestone Jan 13, 2025
@RichardWaiteSTFC RichardWaiteSTFC added Diffraction Issues and pull requests related to diffraction Single Crystal Issues and pull requests related to single crystal ISIS Team: Diffraction Issue and pull requests managed by the Diffraction subteam at ISIS labels Jan 13, 2025
Copy link
Contributor

@RichardWaiteSTFC RichardWaiteSTFC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on, it looks like this will be an improvement and provide more informative troubleshooting warnings to the user. Thanks for adding tests for these warnings as well!
I have some suggested improvements (feel free to push back), I think in particular find_initial_indexing could be made a bit clearer (I'm aware the same could probably be said of the original function!). Perhaps if there are no UBs found initially the loop over workspaces with the call to FindUBUsingLatticeParameters could be put in a separate function e.g. find_intial_ub_using_latt_param?

# require PeakWorkspaces with different req for those with/without UBs
# with: to have the more relaxed, n_peaks is at least _MIN_NUM_INDEXED_PEAKS
# without UBs: must have at least _MIN_NUM_PEAKS
if isinstance(ws, IPeaksWorkspace) and n_peaks >= (_MIN_NUM_PEAKS, _MIN_NUM_INDEXED_PEAKS)[has_UB]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't quite what I had in mind, I think if there is a UB set then we will need to call IndexPEaks.
Something like this (not tested, no warranty)

def validateInputs(self):
    issues = dict()
    ws_list = self.getProperty("PeakWorkspaces").value
    for ws_name in self.getProperty("PeakWorkspaces").value:
        ws = AnalysisDataService.retrieve(wsname)
        if not isinstance(ws, IPeaksWorkspace):
            issues["PeakWorkspaces"] = f"{ws_name} is not a PeaksWorkspace."
            break
        if ws.getOrientedLattice():
            nindexed, *_ = self.exec_child_alg("IndexPeaks", PeaksWorkspace=ws_name, RoundHKLs=True, CommonUBForAll=True)
            if nindexed < MIN_NUM_INDEXED_PEAKS:
                issues["PeakWorkspaces"] = f"{ws_name} has a UB set, therefore it must contain at least {_MIN_NUM_INDEXED_PEAKS} peaks that can be indexed."
        elif ws.getNumberPeaks() < _MIN_NUM_PEAKS:
            issues["PeakWorkspaces"] = f"{ws_name} does not have a UB set, therefore it must contain at least {_MIN_NUM_PEAKS} peaks."
    return issues

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I've also incorporated the check that at least 2 ws provided are valid

self.exec_child_alg("IndexPeaks", PeaksWorkspace=cws, RoundHKLs=True, CommonUBForAll=False)

# for the ws with fewer peaks than threshold, try and find some more by adjusting U slightly
for iws in np.where(n_indexed_by_ref < _MIN_NUM_INDEXED_PEAKS)[0]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this loop can be combined with the one above

for iws, cws in enumerate(ws_list):
SetUB(cws, UB=ref_ub)
self.exec_child_alg("IndexPeaks", PeaksWorkspace=cws, RoundHKLs=True, CommonUBForAll=False)

If indexed more than minimum number then set reference UB, otherwise then call FindUBUsingLatticeParameter

total_indexed_peaks = np.sum(indexed_peaks, axis=1)

# find which UBs give the most over threshold and then, of these, which fit the most peaks
max_over_thresh = np.argwhere(n_ws_over_thresh == np.max(n_ws_over_thresh))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to throw a warning if none of the UBs can index enough peaks in any of the other runs, this might indicate an error in the goniometer angles or axes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to use np.argmax instead? https://numpy.org/doc/stable/reference/generated/numpy.argmax.html

Copy link
Contributor

@RichardWaiteSTFC RichardWaiteSTFC Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this case there could be multiple runs indexing the same number of peaks.
I wondered about using np.lexsort e.g.

best_iub = np.lexsort(( -total_indexed_peaks, -n_ws_over_thresh))[0]  # sort by nws_indexed then total number peaks indexed

then thought sorting would be wasteful! But need to remeber only <~10 peak tables typically so this stuff won't be a performance bottleneck! Probably readability is most important thing here!

if nindexed < _MIN_NUM_INDEXED_PEAKS:
logger.warning(f"Fewer than the desired {_MIN_NUM_INDEXED_PEAKS} peaks were indexed for Workspace {iws}")
ws_list.pop(iws)
logger.warning(f"Workspace {iws} removed")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think better to combine the warning messages together in a single call to logger.warning

if foundUB:
iws_unindexed.remove(iws)
# iterate over the potential reference UBs to find the best one
potential_ref_UBs = [
Copy link
Contributor

@RichardWaiteSTFC RichardWaiteSTFC Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think potential_ref_UBs doesn't need to be stored in a variable in find_initial_indexing - it's only useful inside evaluate_best_ref_UB which could return the UB rather than the index of the best UB

@@ -129,66 +138,110 @@ def find_initial_indexing(self, a, b, c, alpha, beta, gamma, ws_list):
# check if a UB exists on any run and if so whether it indexes a sufficient number of peaks
foundUB = False
ws_with_UB = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's difficult to come up with a variable name that infers the object contains iws, nindexed pairs. I might be tempted just to store nindexed in a list and have something like

iws_have_ub = [iws, for iws, ws in enumerate(ws_list) if AnalysisDataService.retrieve(ws).sample().hasOrientedLattice()]
nindexed = np.array([self.exec_child_alg("IndexPeaks", PeaksWorkspace=ws_list[iws], RoundHKLs=True, CommonUBForAll=False)[0] for iws in iws_have_ub])
iws_potential_ref_ub = iws_have_ub[np.flatnonzero(nindexed > _MIN_NUM_INDEXED_PEAKS)]
foundUB = len(iws_potential_ref_ub) > 0

what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the list comprehensions are getting a bit much, there is also:

def _index_workspace_peaks(workspace_index: int) -> int:
    """whatever the heck this does"""
    return self.exec_child_alg(
        "IndexPeaks",
        PeaksWorkspace=ws_list[workspace_index],
        RoundHKLs=True,
        CommonUBForAll=False,
    )[0]

nindexed = np.fromiter(map(_index_workspace_peaks, iws_have_ub), dtype=int)

List comprehension might be slightly faster though 😅

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeh, I agree, I've done this along with the change above (#38574 (comment)) and some other little bits to generally reduce the number of repeated loops and some of the repeated/unnecessary data storing

@RichardWaiteSTFC
Copy link
Contributor

For my reference when testing, on main the example script produces
For np.diag([0.30, 0.25, 0.1])

Lattice parameters successfully refined for workspaces: ['peaks', 'peaks2', 'peaks3']
Lattice Parameters: [ 3.001757  4.006301 11.020517 89.848446 89.904955 90.009059]
Parameter Errors  : [1.552047e-08 7.713167e-02 1.503726e+00 4.947268e+00 4.130064e+00
 1.757921e+00]
peaks has 10 indexed peaks
peaks2 has 4 indexed peaks
peaks3 has 9 indexed peaks

and for np.diag([0.33, 0.25, 0.1])

Lattice parameters successfully refined for workspaces: ['peaks', 'peaks2', 'peaks3']
Lattice Parameters: [ 2.992984  4.008921 10.873919 89.542916 89.187807 90.773892]
Parameter Errors  : [0.035323 0.045496 0.3641   3.065943 1.982192 1.249527]
peaks has 10 indexed peaks
peaks2 has 8 indexed peaks
peaks3 has 9 indexed peaks

@andy-bridger
Copy link
Collaborator Author

All those comments should now have changes made, plus an extra handling and test of make_ub_consistent

Copy link
Contributor

@RichardWaiteSTFC RichardWaiteSTFC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, almost there! Just some minor changes to wording of warnings.
I have tried this on some real runs (WISH, Fe2Me3O8) and it does seem like a better reference UB is now found - it seems there are now only 2 runs in the sequence of 9 we can't index post global B optimisation, and the angle between the U matrices are much more consistent. Previously there were at least 3 runs we couldn't index.

I think there's some extra debug info we could give the user post optimisation but happy to leave that for a later PR.

Here are some results from Fe2Me3O8 runs, mostly for my reference!

# New with UB from FindUBUsingLatticeParameters and CalculateUMatrix
####################################################################
Lattice Parameters: [  5.776435   5.787138  10.074228  89.919849  90.012372 119.975833]
Parameter Errors  : [0.003698 0.006794 0.009006 0.111621 0.071437 0.076058]
# angles between UBs
[[ 0.   9.4  8.3  8.7  2.3 57.8  7.2  7.8  5.1]
 [ 0.   0.   8.5  2.3  9.3 48.9  3.8  8.7  4.9]
 [ 0.   0.   0.   6.7  8.7 51.7  9.4 10.5  7.7]
 [ 0.   0.   0.   0.   8.4 49.3  4.   7.6  4.3]
 [ 0.   0.   0.   0.   0.  57.6  6.5  5.7  4.4]
 [ 0.   0.   0.   0.   0.   0.  52.1 54.8 53.4]
 [ 0.   0.   0.   0.   0.   0.   0.   5.7  2.4]
 [ 0.   0.   0.   0.   0.   0.   0.   0.   5.1]
 [ 0.   0.   0.   0.   0.   0.   0.   0.   0. ]]
WISH00057712_peaks_int: indexed 1 out of 94 peaks within 0.15
WISH00057713_peaks_int: indexed 94 out of 94 peaks within 0.15
WISH00057714_peaks_int: indexed 98 out of 98 peaks within 0.15
WISH00057715_peaks_int: indexed 5 out of 91 peaks within 0.15
WISH00057716_peaks_int: indexed 100 out of 100 peaks within 0.15
WISH00057718_peaks_int: indexed 90 out of 91 peaks within 0.15
WISH00057719_peaks_int: indexed 97 out of 98 peaks within 0.15
WISH00057720_peaks_int: indexed 94 out of 94 peaks within 0.15
WISH00057721_peaks_int: indexed 79 out of 79 peaks within 0.15

# New with no intial UB
######################
Lattice Parameters: [  5.782949   5.779583  10.072472  90.062373  90.019683 119.970795]
Parameter Errors  : [0.005746 0.005422 0.00805  0.082286 0.090764 0.084685]
# angles between UBs
[[ 0.  24.9 48.1 55.3 23.5  5.7 19.1 22.   7.2]
 [ 0.   0.  60.  74.6  9.2 20.2 17.7  8.7 21.2]
 [ 0.   0.   0.  24.6 52.7 51.7 43.  56.1 52.4]
 [ 0.   0.   0.   0.  68.1 60.4 58.6 69.9 60.8]
 [ 0.   0.   0.   0.   0.  20.6 13.6  5.8 22.4]
 [ 0.   0.   0.   0.   0.   0.  17.4 18.9  2.8]
 [ 0.   0.   0.   0.   0.   0.   0.  17.3 18.2]
 [ 0.   0.   0.   0.   0.   0.   0.   0.  20.9]
 [ 0.   0.   0.   0.   0.   0.   0.   0.   0. ]]
WISH00057712_peaks_int: indexed 94 out of 94 peaks within 0.15
WISH00057713_peaks_int: indexed 94 out of 94 peaks within 0.15
WISH00057714_peaks_int: indexed 98 out of 98 peaks within 0.15
WISH00057715_peaks_int: indexed 91 out of 91 peaks within 0.15
WISH00057716_peaks_int: indexed 100 out of 100 peaks within 0.15
WISH00057718_peaks_int: indexed 90 out of 91 peaks within 0.15
WISH00057719_peaks_int: indexed 5 out of 98 peaks within 0.15
WISH00057720_peaks_int: indexed 94 out of 94 peaks within 0.15
WISH00057721_peaks_int: indexed 5 out of 79 peaks within 0.15

# Old with UB from FindUBUsingLatticeParameters and CalculateUMatrix
#####################################################################
Lattice Parameters: [  5.780991   5.77608   10.075893  89.975613  90.028252 120.230395]
Parameter Errors  : [0.008211 0.012893 0.016293 0.171744 0.165701 0.150146]
# angles between UBs
[[  0.   21.2 105.9  41.3  39.6  90.9  88.4  41.2   5.9]
 [  0.    0.   91.4  33.2  26.1  81.3  75.9  27.3  20.9]
 [  0.    0.    0.   65.5  67.8  27.2  20.3  65.2 103.3]
 [  0.    0.    0.    0.   14.   50.   47.5  11.8  38.7]
 [  0.    0.    0.    0.    0.   57.6  50.8   5.7  36.5]
 [  0.    0.    0.    0.    0.    0.   17.3  54.6  88.3]
 [  0.    0.    0.    0.    0.    0.    0.   48.8  85.3]
 [  0.    0.    0.    0.    0.    0.    0.    0.   38.7]
 [  0.    0.    0.    0.    0.    0.    0.    0.    0. ]]
WISH00057712_peaks_int: indexed 9 out of 94 peaks within 0.15
WISH00057713_peaks_int: indexed 2 out of 94 peaks within 0.15
WISH00057714_peaks_int: indexed 98 out of 98 peaks within 0.15
WISH00057715_peaks_int: indexed 2 out of 91 peaks within 0.15
WISH00057716_peaks_int: indexed 100 out of 100 peaks within 0.15
WISH00057718_peaks_int: indexed 90 out of 91 peaks within 0.15
WISH00057719_peaks_int: indexed 6 out of 98 peaks within 0.15
WISH00057720_peaks_int: indexed 94 out of 94 peaks within 0.15
WISH00057721_peaks_int: indexed 2 out of 79 peaks within 0.15

# Old with no intial UB
#######################
Lattice Parameters: [  5.782866   5.779128  10.072993  90.067975  90.037199 119.936395]
Parameter Errors  : [0.00565  0.004945 0.009092 0.078644 0.098838 0.095805]
# angles between UBs
[[ 0.  24.9 48.1 45.1 23.5  5.7 19.1 22.  23.3]
 [ 0.   0.  60.  63.4  9.2 20.2 17.7  8.7  4.9]
 [ 0.   0.   0.  17.  52.7 51.7 43.  56.1 56.1]
 [ 0.   0.   0.   0.  56.9 49.9 47.2 58.8 59.7]
 [ 0.   0.   0.   0.   0.  20.6 13.6  5.8  4.4]
 [ 0.   0.   0.   0.   0.   0.  17.4 18.9 19.5]
 [ 0.   0.   0.   0.   0.   0.   0.  17.3 15.1]
 [ 0.   0.   0.   0.   0.   0.   0.   0.   5.2]
 [ 0.   0.   0.   0.   0.   0.   0.   0.   0. ]]
WISH00057712_peaks_int: indexed 94 out of 94 peaks within 0.15
WISH00057713_peaks_int: indexed 94 out of 94 peaks within 0.15
WISH00057714_peaks_int: indexed 98 out of 98 peaks within 0.15
WISH00057715_peaks_int: indexed 2 out of 91 peaks within 0.15
WISH00057716_peaks_int: indexed 100 out of 100 peaks within 0.15
WISH00057718_peaks_int: indexed 90 out of 91 peaks within 0.15
WISH00057719_peaks_int: indexed 5 out of 98 peaks within 0.15
WISH00057720_peaks_int: indexed 94 out of 94 peaks within 0.15
WISH00057721_peaks_int: indexed 79 out of 79 peaks within 0.15

ws = AnalysisDataService.retrieve(wsname)
ws.sample().getOrientedLattice().setError(*err)
if n_peaks[iws] < _MIN_NUM_INDEXED_PEAKS:
logger.warning(
f"Workspace: {wsname}, has only {n_peaks[iws]} indexed peaks, fewer than the desired {_MIN_NUM_INDEXED_PEAKS}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this clearee that this is the number of peaks indexed after global B matrix optimisation?

# otherwise warn the user
else:
logger.warning(
f"Fewer than the desired {_MIN_NUM_INDEXED_PEAKS} peaks were indexed for Workspace {iws}. "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this clearer that this is prior to the global B optimisation?

@jclarkeSTFC jclarkeSTFC changed the base branch from main to release-next January 16, 2025 15:42
Copy link
Contributor

@RichardWaiteSTFC RichardWaiteSTFC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for the hard work getting this into release!

@SilkeSchomann SilkeSchomann self-assigned this Jan 16, 2025
@SilkeSchomann SilkeSchomann merged commit 61bccff into release-next Jan 16, 2025
10 checks passed
@SilkeSchomann SilkeSchomann deleted the 36863-FindGlobalBMatrix-warnings branch January 16, 2025 18:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Diffraction Issues and pull requests related to diffraction ISIS Team: Diffraction Issue and pull requests managed by the Diffraction subteam at ISIS Single Crystal Issues and pull requests related to single crystal
Projects
Status: Merged
Status: Done
Development

Successfully merging this pull request may close these issues.

FindGlobalBMatrix should throw warning if no peaks indexed in a run
5 participants