Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: correct the test loop in test_arpack.eval_evec #14798

Merged
merged 9 commits into from
Oct 10, 2021

Conversation

dimpase
Copy link
Contributor

@dimpase dimpase commented Oct 3, 2021

In scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py, eval_evec
one tests for correctness eigenpairs, in a loop,
which is meant to allow for an error (the
k-tuple of eigenpairs found differs from what's asked by which
argument) in Krylov subspaces method. Cf. comments there:

# on rare occasions, ARPACK routines return results that are proper
# eigenvalues and -vectors, but not necessarily the ones requested in
# the parameter which. This is inherent to the Krylov methods, and
# should not be treated as a failure. If such a rare situation
# occurs, the calculation is tried again (but at most a few times).

One should 1st check whether one got correct eigenvalues, and try again
if they are not the wanted ones.
The current implementation does it backwards: it checks whether
the eigenpair works, without 1st checking for eigenvalues.

This commit corrects this programming error.

Reference issue

Requested by @rgommers in review of gh-14786

Additional information

#14786 (comment)

@dimpase dimpase changed the title correct the test loop in test_arpack.eval_evec BUG: correct the test loop in test_arpack.eval_evec Oct 3, 2021
@dimpase
Copy link
Contributor Author

dimpase commented Oct 3, 2021

ok, a test in GitHub Actions fails in the changed function on Linux Python 3.8:

_____________________________ test_hermitian_modes _____________________________
[gw1] linux -- Python 3.8.0 /usr/bin/python3.8-dbg
scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py:421: in test_hermitian_modes
    eval_evec(symmetric, D, typ, k, which,
        D          = <gen-hermitian-Mc>
        k          = 2
        mattype    = <built-in function asarray>
        params     = <scipy.sparse.linalg.eigen.arpack.tests.test_arpack.SymmetricParams object at 0x7fd8a0dcf500>
        sigma      = None
        symmetric  = True
        typ        = 'F'
        which      = 'LA'
scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py:271: in eval_evec
    assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err)
E   AssertionError: 
E   Not equal to tolerance rtol=0.000357628, atol=0.000357628
E   error for eigsh:general, typ=F, which=LA, sigma=None, mattype=asarray, OPpart=None, mode=normal
E   Mismatched elements: 1 / 12 (8.33%)
E   Max absolute difference: 0.00344393
E   Max relative difference: 0.00106749
...

and a different failure on Nightly CPython

_____________________________ test_hermitian_modes _____________________________
scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py:421: in test_hermitian_modes
    eval_evec(symmetric, D, typ, k, which,
        D          = <gen-hermitian-Mc>
        k          = 2
        mattype    = <class 'scipy.sparse.csr.csr_matrix'>
        params     = <scipy.sparse.linalg.eigen.arpack.tests.test_arpack.SymmetricParams object at 0x7f0e211f24a0>
        sigma      = None
        symmetric  = True
        typ        = 'D'
        which      = 'LM'
scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py:271: in eval_evec
    assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err)
E   AssertionError: 
E   Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13
E   error for eigsh:general, typ=D, which=LM, sigma=None, mattype=csr_matrix, OPpart=None, mode=normal
E   Mismatched elements: 1 / 12 (8.33%)
E   Max absolute difference: 3.6483776e-12
E   Max relative difference: 9.24609212e-13
...

and on macOS - a similar to Nightly Python failure (same test data):

_____________________________ test_hermitian_modes _____________________________
scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py:421: in test_hermitian_modes
    eval_evec(symmetric, D, typ, k, which,
        D          = <gen-hermitian-Mc>
        k          = 2
        mattype    = <class 'scipy.sparse.csr.csr_matrix'>
        params     = <scipy.sparse.linalg.eigen.arpack.tests.test_arpack.SymmetricParams object at 0x131cf32e0>
        sigma      = None
        symmetric  = True
        typ        = 'D'
        which      = 'LM'
scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py:271: in eval_evec
    assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err)
E   AssertionError: 
E   Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13
E   error for eigsh:general, typ=D, which=LM, sigma=None, mattype=csr_matrix, OPpart=None, mode=normal
E   Mismatched elements: 1 / 12 (8.33%)
E   Max absolute difference: 4.38825759e-12
E   Max relative difference: 1.14883516e-12
...

@rgommers rgommers added the maintenance Items related to regular maintenance tasks label Oct 3, 2021
@rgommers
Copy link
Member

rgommers commented Oct 3, 2021

The new failures are not from reversing the checks right, but from this piece of code have the wrong indentation for assert_allclose(...) so it's not run at all for general is True?

        LHS = np.dot(a, evec)
        if general:
            RHS = eigenvalues * np.dot(b, evec)
        else:
            RHS = eigenvalues * evec

            assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err)

@rgommers
Copy link
Member

rgommers commented Oct 3, 2021

Reversing the checks makes sense to me, and the last two failures seem minor. The first one is a large change, increasing atol by an order of magnitude (0.000357 to 0.00344). Maybe that is what it is, but it would be good to bump the tolerance only for the subset of test parameters where that is really necessary.

@dimpase
Copy link
Contributor Author

dimpase commented Oct 3, 2021 via email

@dimpase
Copy link
Contributor Author

dimpase commented Oct 3, 2021

Reversing the checks makes sense to me, and the last two failures seem minor. The first one is a large change, increasing atol by an order of magnitude (0.000357 to 0.00344).

It's a bit suspect to me that it depends on the Python version. 3.8 is getting old, perhaps time to move on 3.9?
(or perhaps it's cause it's a "dbg" build, whatever it is)

@rgommers
Copy link
Member

rgommers commented Oct 4, 2021

It's a bit suspect to me that it depends on the Python version. 3.8 is getting old, perhaps time to move on 3.9?
(or perhaps it's cause it's a "dbg" build, whatever it is)

We can't drop 3.8, we're only just dropping 3.7 now - 3.8 will be around for another ~1.5 years. This is also not likely to be dependent on Python version I'd say.

A dbg build is a debug build of Python, so debug symbols are included. Such builds are typically also shipped in Linux distros as separate packages (typically named python-dbg).

@dimpase
Copy link
Contributor Author

dimpase commented Oct 6, 2021

Is there documentation on how to fix a particular test?
We have to fix 2 instances, which differ from one another in 3 places, marked by * below:

    eval_evec(symmetric, D, typ, k, which,
        D          = <gen-hermitian-Mc>
        k          = 2
*       mattype    = <built-in function asarray>
        params     = <scipy.sparse.linalg.eigen.arpack.tests.test_arpack.SymmetricParams object at 0x7fd8a0dcf500>
        sigma      = None
        symmetric  = True
 *      typ        = 'F'
 *      which      = 'LA'

and

    eval_evec(symmetric, D, typ, k, which,
        D          = <gen-hermitian-Mc>
        k          = 2
*       mattype    = <class 'scipy.sparse.csr.csr_matrix'>
        params     = <scipy.sparse.linalg.eigen.arpack.tests.test_arpack.SymmetricParams object at 0x131cf32e0>
        sigma      = None
        symmetric  = True
 *      typ        = 'D'
 *      which      = 'LM'

@rgommers
Copy link
Member

rgommers commented Oct 6, 2021

There aren't any docs, this is too detailed to have docs. I'd say bump the test tolerance unconditionally to make the 2 tests with a small mismatch pass, and then special-case the last test based on the parameters you point out above.

@dimpase
Copy link
Contributor Author

dimpase commented Oct 7, 2021

I actually find $\ell_\infty$ norm used by assert_allclose not very suitable here, it seems to me that $\ell_n$ norm, with n=1 or 2, is more suitable for checking vectors for near equality in the context of numeric linear algebra.

@rgommers
Copy link
Member

rgommers commented Oct 7, 2021

I actually find $\ell_\infty$ norm used by assert_allclose not very suitable here, it seems to me that $\ell_n$ norm, with n=1 or 2, is more suitable for checking vectors for near equality in the context of numeric linear algebra.

Agreed. I don't think that has been discussed before, at least not recently. Worth a separate follow-up.

@dimpase
Copy link
Contributor Author

dimpase commented Oct 8, 2021

I am running into some sort of Heisenbug here. Namely, the 1st error in #14798 (comment) consistently reproduces on a Linux system, by running

python3 runtests.py -j4 -v -t scipy.sparse.linalg.eigen.arpack.tests.test_arpack

and get one failing test.

Then I apply the following change, special-casing this error, and changing tolerances for it:

--- a/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py
+++ b/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py
@@ -28,7 +28,7 @@ from scipy._lib._gcutils import assert_deallocated, IS_PYPY
 _ndigits = {'f': 3, 'd': 11, 'F': 3, 'D': 11}
 
 
-def _get_test_tolerance(type_char, mattype=None):
+def _get_test_tolerance(type_char, mattype=None, D_type=None, which=None):
     """
     Return tolerance values suitable for a given test:
 
@@ -67,6 +67,14 @@ def _get_test_tolerance(type_char, mattype=None):
         # sparse in single precision: worse errors
         rtol *= 5
 
+
+    if mattype is np.asarray and type_char == 'F' and which == 'LA' \
+          and D_type.name == "gen-hermitian-Mc":
+        # missing case from PR 14798
+        tol = 30 * np.finfo(np.float32).eps
+        rtol *= 5
+
+
     return tol, rtol, atol
 
 
@@ -223,8 +231,7 @@ def eval_evec(symmetric, d, typ, k, which, v0=None, sigma=None,
         kwargs['OPpart'] = OPpart
 
     # compute suitable tolerances
-    kwargs['tol'], rtol, atol = _get_test_tolerance(typ, mattype)
-
+    kwargs['tol'], rtol, atol = _get_test_tolerance(typ, mattype, d, which)
     # on rare occasions, ARPACK routines return results that are proper
     # eigenvalues and -vectors, but not necessarily the ones requested in
     # the parameter which. This is inherent to the Krylov methods, and

With it, on the same system I consistently reproduce 2nd error from #14798 (comment), instead of the 1st error.

This feels like something something floating point interrupts touched by NumPy...

@dimpase
Copy link
Contributor Author

dimpase commented Oct 8, 2021

OK, I've bumped whatever was needed

@dimpase
Copy link
Contributor Author

dimpase commented Oct 8, 2021

I think I've fixed everything I could (some test pipelines crash for unrelated reasons).

@rgommers
Copy link
Member

rgommers commented Oct 8, 2021

@dimpase the change to the PROPACK submodule was a mistake I assume?

@dimpase
Copy link
Contributor Author

dimpase commented Oct 8, 2021

the change to the PROPACK submodule was a mistake I assume?

indeed, sorry, I seldom work with submodules - and this change does not show up in "normal" git.
Will be fixed.

@dimpase
Copy link
Contributor Author

dimpase commented Oct 8, 2021

somehow scipy/PROPACK#1 got mixed in.

@dimpase
Copy link
Contributor Author

dimpase commented Oct 8, 2021

it's a pure evil - the submodule update got mixed into d6f6eec - nothing and nobody told me this happened...

@rgommers
Copy link
Member

rgommers commented Oct 8, 2021

it's a pure evil - the submodule update got mixed into d6f6eec - nothing and nobody told me this happened...

Yes this happens often unfortunately, if you change branches and then forget to re-run git submodule update.

@charris
Copy link
Member

charris commented Oct 9, 2021

it's a pure evil

git status is your boon companion during this quest :) Also, don't write commit messages in line, but rather in the editor where you can check what is being committed.

@dimpase
Copy link
Contributor Author

dimpase commented Oct 9, 2021

macOS tests fail while building quadpack extension, something unrelated:

/usr/local/bin/gfortran -Wall -g -Wall -g -undefined dynamic_lookup -bundle build/temp.macosx-10.14-x86_64-3.9/scipy/integrate/_quadpackmodule.o -L/usr/local/lib -L/usr/local/gfortran/lib/gcc/x86_64-apple-darwin13/4.9.0 -L/usr/local/gfortran/lib/gcc/x86_64-apple-darwin13/4.9.0/../../.. -L/usr/local/gfortran/lib/gcc/x86_64-apple-darwin13/4.9.0/../../.. -L/Users/runner/hostedtoolcache/Python/3.9.7/x64/lib -Lbuild/temp.macosx-10.14-x86_64-3.9 -Wl,-rpath,/usr/local/lib -lquadpack -lmach -lopenblas -lopenblas -lgfortran -o build/lib.macosx-10.14-x86_64-3.9/scipy/integrate/_quadpack.cpython-39-darwin.so
ld: library not found for -lSystem
collect2: error: ld returned 1 exit status

The rest passes.

@dimpase
Copy link
Contributor Author

dimpase commented Oct 9, 2021

git status is your boon companion during this quest :)

I tend to rely on git diff, which ignores submodules :-(

In scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py, eval_evec
one tests for correctness eigenpairs, in a loop,
which is meant to allow for an error (the
`k`-tuple of eigenpairs found differs from what's asked by `which`
argument) in Krylov subspaces method. Cf. comments there:

    # on rare occasions, ARPACK routines return results that are proper
    # eigenvalues and -vectors, but not necessarily the ones requested in
    # the parameter which. This is inherent to the Krylov methods, and
    # should not be treated as a failure. If such a rare situation
    # occurs, the calculation is tried again (but at most a few times).

One should 1st check whether one got correct eigenvalues, and try again
if they are not the wanted ones.
The current implementation does it backwards: it checks whether
the eigenpair works, without 1st checking for eigenvalues.

This commit corrects this programming error.
@dimpase
Copy link
Contributor Author

dimpase commented Oct 10, 2021

the failing check is due to the rebase over the current master, unrelated to the PR.

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, this should be good to go now - let's give it a try. Thanks @dimpase!

@rgommers rgommers merged commit cf3b30c into scipy:master Oct 10, 2021
@rgommers rgommers added this to the 1.8.0 milestone Oct 10, 2021
@dimpase
Copy link
Contributor Author

dimpase commented Oct 12, 2021

Agreed. I don't think that has been discussed before, at least not recently. Worth a separate follow-up.

implemented in #14846

@dimpase dimpase deleted the test_arpack_fix branch October 12, 2021 17:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Items related to regular maintenance tasks scipy.sparse.linalg
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants