Add QR algorithm #472

Open · wants to merge 10 commits into master

Conversation

Aweptimum (Contributor)

While investigating the problems in #468, I decided to try implementing the QR algorithm to have an alternative way of getting the eigenvectors.

This is still a work in progress, but it adds:

  1. NumericMatrix::upperHessenberg - converts a matrix to upper Hessenberg form using Householder reflections. I only have one test case, but it implicitly passes every Eigenvalue::qrAlgorithm test case, so that's neat.
  2. Eigenvalue::qrAlgorithm - finds the eigenvalues of a matrix using the QR algorithm, which follows the same principle as the power method. It relies on upperHessenberg. I implemented it using the Wilkinson shift strategy and deflation, so when the lowest diagonal element converges it recurses on the submatrix that excludes that diagonal entry (see the shift sketch at the end of this description). It's pretty fast: none of the test cases take more than ~100 iterations to converge. All tests pass.
  3. Eigenvector::qrAlgorithm - this just takes the output of the above method and feeds it to Eigenvector::eigenvectors. The problem is that it fails in cases where the eigenvalues are correct, so I'm pretty sure something is wrong in the eigenvectors method. There is a way to solve directly for the eigenvectors using the QR algorithm, but I haven't fully understood it yet.
  4. Adds the failing test cases from Add Moore-Penrose Inverse #468.

I'm not sure how you'd want to expose Eigenvector::qrAlgorithm in the public API, so I haven't worked on that. It might need a refactor similar to Eigenvector.
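For context, here is a rough sketch of the classic Wilkinson shift for a symmetric trailing 2x2 block (illustrative only; the static calcShift helper in this PR may differ in the details, e.g. for non-symmetric matrices):

/**
 * Classic Wilkinson shift, computed from the trailing 2x2 block
 *   [a, b]
 *   [b, d]
 * of the current iterate. It returns the eigenvalue of that block closest to d,
 * and shifting by it is what makes the bottom-right diagonal entry converge quickly.
 */
function wilkinsonShift(float $a, float $b, float $d): float
{
    $delta = ($a - $d) / 2;
    if ($delta == 0.0 && $b == 0.0) {
        return $d; // block is already diagonal with equal entries
    }
    $sign = ($delta >= 0) ? 1.0 : -1.0;
    return $d - ($sign * $b * $b) / (abs($delta) + sqrt($delta * $delta + $b * $b));
}

For example, for the block [3, 1; 1, 1] this returns about 0.586, which is the eigenvalue of that block (2 ± √2) closest to d = 1.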

Comment on lines +1511 to +1523
yield [
'A' => [
[3, 0, 0, 0],
[0, 1, 0, 1],
[0, 0, 2, 0],
[0, 1, 0, 1],
],
'H' => [
[3, 0, 0, 0],
[0, 1, 1, 0],
[0, 1, 1, 0],
[0, 0, 0, 2],
]
markrogoyski (Owner)

Question about Hessenberg matrices: Is there only one upper Hessenberg matrix for any matrix, or are there multiple possible Hessenberg matrices that satisfy the criteria?

I tried this test case in R and SciPy, and both get a slightly different answer.

R

> A <- rbind(c(3, 0, 0, 0), c(0, 1, 0, 1), c(0, 0, 2, 0), c(0, 1, 0, 1))
> hb <- hessenberg(A)
> hb
$H
     [,1]          [,2]          [,3]          [,4]
[1,]    3  0.000000e+00  0.000000e+00  0.000000e+00
[2,]    0  1.000000e+00 -1.000000e+00  2.220446e-16
[3,]    0 -1.000000e+00  1.000000e+00 -5.551115e-16
[4,]    0  2.220446e-16 -4.440892e-16  2.000000e+00

$P
     [,1] [,2]          [,3]          [,4]
[1,]    1    0  0.000000e+00  0.000000e+00
[2,]    0    1  0.000000e+00  0.000000e+00
[3,]    0    0  2.220446e-16 -1.000000e+00
[4,]    0    0 -1.000000e+00  2.220446e-16

Python

In [1]: import numpy as np
In [2]: from scipy.linalg import hessenberg

In [10]: A = np.array([[3, 0, 0, 0], [0, 1, 0, 1], [0, 0, 2, 0], [0, 1, 0, 1]])

In [11]: A
Out[11]:
array([[3, 0, 0, 0],
       [0, 1, 0, 1],
       [0, 0, 2, 0],
       [0, 1, 0, 1]])

In [12]: H = hessenberg(A)

In [13]: H
Out[13]:
array([[ 3.,  0.,  0.,  0.],
       [ 0.,  1., -1.,  0.],
       [ 0., -1.,  1.,  0.],
       [ 0.,  0.,  0.,  2.]])

This online calculator, however, gets the same answer as you have here.

I'd like to help with generating test cases, but I don't know enough about it so I want to make sure I understand how it works before offering help.

Thanks,
Mark

Aweptimum (Contributor, Author)

I think the values are unique, but the signs are not. In addition to the upper Hessenberg form, the algorithm also computes a permutation matrix P such that A = P H Pᵀ. Wolfram returns the Hessenberg form with the added context of the permutation matrix, and multiplying accordingly gives the original matrix back.

In the NumericMatrix::upperHessenberg method, the P matrix calculated has exactly the same values as the ones from R, Python, and Wolfram; it just has different signs:

[1, 0, 0, 0]
[0, 1, 0, 0]
[0, 0, 0, 1]
[0, 0, 1, 0]

Multiplying P H Pᵀ gives the original matrix as well.

So it might be better to return both matrices for context rather than just the upper Hessenberg matrix.
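If it helps with generating test cases, that relationship is easy to sanity-check with the library itself (a sketch using MatrixFactory::create and the matrix multiply/transpose methods; a loose == comparison is fine here only because the values are exact small integers):

use MathPHP\LinearAlgebra\MatrixFactory;

$A = MatrixFactory::create([
    [3, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 2, 0],
    [0, 1, 0, 1],
]);

// H and P as returned by NumericMatrix::upperHessenberg in this PR
$H = MatrixFactory::create([
    [3, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 2],
]);
$P = MatrixFactory::create([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]);

// P H Pᵀ should reproduce A regardless of which sign convention produced H and P
$PHPt = $P->multiply($H)->multiply($P->transpose());
var_dump($PHPt->getMatrix() == $A->getMatrix()); // expect bool(true)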

Also, I had originally tried the other approach of getting the matrix into upper Hessenberg form, using Givens rotations. It seems more widely used than the Householder method, and I think that's because it allows for parallelization and works better on sparse matrices. This particular matrix needed (I think) a planar rotation of π/2 to zero out the 1 at (4, 1), so the Givens rotation block was just:

[cos(π/2), -sin(π/2)]    =    [0, -1]
[sin(π/2),  cos(π/2)]         [1,  0]

All that to say, I wouldn't be surprised if the Givens method introduced negative signs into both P and H.
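For reference, this is roughly how a single Givens rotation gets built to zero out one entry (generic textbook construction, not code from this PR), which also shows where the sign ambiguity comes from:

/**
 * Cosine/sine pair of a Givens rotation that zeroes out b when applied to (a, b):
 *   [ c, -s ] [a]   [r]
 *   [ s,  c ] [b] = [0]
 * Sign conventions vary between texts and libraries, which is why P and H
 * can come back with flipped signs while still satisfying A = P H Pᵀ.
 */
function givensRotation(float $a, float $b): array
{
    if ($b == 0.0) {
        return ['c' => 1.0, 's' => 0.0]; // nothing to zero out
    }
    $r = hypot($a, $b);
    return ['c' => $a / $r, 's' => -$b / $r];
}

For a = 0, b = 1 this gives c = 0, s = -1, i.e. the π/2 rotation block above up to the choice of sign.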

Aweptimum (Contributor, Author) commented Nov 3, 2023

So the Eigenvector::qrAlgorithm test failures aren't actually that bad. The first two failures return the same eigenvectors, but with the columns swapped around. The other two failures have repeated eigenvectors, but they are cases where the eigenvalues are repeated. I think this is because I'm feeding the eigenvalues directly into the eigenvectors method. It looks like the method tries to deal with duplicates, but I'm not sure if I need to change the eigenvalue input to match its expectations.

Given:
[2, 0, 1]
[2, 1, 2]
[3, 0, 4]

Eigenvalues: -3, 1, 1

Expected eigenvectors:
[0.26726124191242, 0, 0.70710678118655]
[0.53452248382485, 1, 0]
[0.80178372573727, 0, -0.70710678118655]

Actual:
[0.26726124191242, 0, 0]
[0.53452248382485, 1, 1]
[0.80178372573727, 0, 0]

@Beakerboy, what sort of input does the eigenvectors method expect for the $eigenvalues variable? I thought I'd try giving it only unique eigenvalues, but it complains that the length of the array doesn't match the length of the diagonal.

Beakerboy (Contributor) commented Nov 6, 2023

@Aweptimum According to the source, it's an array of floats the same size as the matrix. Duplicated eigenvalues should be listed multiple times. When you say "it complains", what is the error that is reported?
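Something like this, using your 3x3 example (signature from memory, so double-check it against the source):

use MathPHP\LinearAlgebra\MatrixFactory;
use MathPHP\LinearAlgebra\Eigenvector;

$A = MatrixFactory::create([
    [2, 0, 1],
    [2, 1, 2],
    [3, 0, 4],
]);

// One entry per eigenvalue, repeats included, so the array length matches the matrix size
$eigenvalues  = [-3, 1, 1];
$eigenvectors = Eigenvector::eigenvectors($A, $eigenvalues);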

Aweptimum (Contributor, Author)

I just called array_unique on the eigenvalues returned from the QR algorithm and passed them in. The error is at line 72 in Eigenvectors: MathPHP\Exception\MatrixException: Matrices have different number of rows.
The identity matrix is constructed using the length of the $eigenvalues array, so I figured it might be intended that the duplicates be passed in.

Beakerboy (Contributor)

Yeah, that's what I said: I think the duplicates should be passed in, so don't call array_unique.

Aweptimum (Contributor, Author) commented Nov 6, 2023

Yeah, I meant to communicate that it was my assumption, and I'm glad to have your confirmation. I'm having a hard time writing today.

I guess there's some weird edge case in the eigenvectors method when the supplied eigenvalues contain repeats. I put the failing eigenvector cases (#4 and #5 in dataProviderForEigenvector) into the eigenvalue test and they pass. The baffling part is that the identity matrix case, where the eigenvalues are all 1s, returns the correct eigenvectors.

Beakerboy (Contributor)

From scanning the code, it looks like the algorithm is supposed to return two orthogonal vectors when it sees the first “1” in the eigenvalue list. Then, when it is generating the return matrix, it will sort them so that one of them is assigned to the position of the first “1” and the other to the position of the second “1”. It's been years since I wrote this, so I have no idea what might be going on.

Aweptimum (Contributor, Author) commented Nov 6, 2023

It's OK, I found the culprit! It's this line here.
The floating-point comparison fails in every case (except the identity matrix one), so $key is always false when it isn't meant to be. I rewrote it to this:

// Collect the eigenvalues already solved for, then match the current one within tolerance
$found = \array_column($solution_array, 'eigenvalue');
$filtered = \array_filter($found, fn($v) => Arithmetic::almostEqual($v, $eigenvalue, $A->getError()));
$key = array_key_first($filtered);
if ($key === null) {
    // ... solve
}

And it returns the correct solutions for the repeats. I think I'll submit this patch as a separate PR so Mark can just merge it, although array_key_first was introduced in PHP 7.3, so I'll have to use something else.
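Something like this plain foreach should behave the same on PHP < 7.3 (a sketch, not the exact patch):

// Look for an already-solved eigenvalue that matches within tolerance.
// A plain loop avoids both the strict float comparison and array_key_first (PHP 7.3+).
$key = false;
foreach ($solution_array as $i => $solution) {
    if (Arithmetic::almostEqual($solution['eigenvalue'], $eigenvalue, $A->getError())) {
        $key = $i;
        break;
    }
}
if ($key === false) {
    // ... solve
}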

Commits:

  - array_search seems to fail in most cases when looking for a float in an array of floats. And even if it does find a match, if the key is 0, PHP evaluates `!0` to true. To find a match, we can instead loop through and compare the numbers with `Arithmetic::almostEqual`, then explicitly check whether `$key === false`.
  - Converts a matrix to Hessenberg form using Householder reflections.
  - The Wilkinson shift accelerates the rate of convergence of the QR algorithm; it's in the static calcShift helper. Additionally, moved the iteration into an inner helper function to hide the $values parameter from the public method.
  - Just gets the eigenvalues from the QR algorithm and feeds them to the eigenvectors method. They work, so the problem must be in the eigenvectors method.
Aweptimum (Contributor, Author)

I have found another problem: the original matrix that produces a pseudo-inverse failure also produces an RREF failure.

If you take this guy and pass it to the eigenvectors method with its eigenvalues:

[2, 0, 0, 0, 0, 0]
[0, 2, 0, 0, 0, 1729.7]
[0, 0, 2, 0, -1729.7, 0]
[0, 0, 0, 0, 0, 0]
[0, 0, -1729.7, 0, 2828785.69, 0]
[0, 1729.7, 0, 0, 0, 2828785.69]

The first thing the eigenvector method does when it loops over the given eigenvalues is subtract the largest eigenvalue from the matrix (2828791.05765 in the first loop) and then get the RREF. The RREF returned by the matrix method on line 74 is this:

[1, 0, 0, 0, 0, 0]
[0, 1, 0, 0, 0, -0.00061146305530409]
[0, 0, 1, 0, 0.00061146305530409, 0]
[0, 0, 0, 1, 0, 0]
[0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0]

Whereas Wolfram and a random RREF calculator I found both produce an identity matrix (the latter using the scaled-down matrix from the eigenvectors method).

I'll add the matrix as a test case to the ReducedRowEchelonFormTest and submit a separate PR with it so anyone can look at what's going on

Aweptimum mentioned this pull request Nov 8, 2023
Aweptimum (Contributor, Author) commented Feb 23, 2024

OK, it's been a bit, but I found the real problem, and I'm not sure what to do about it.

In the above test case, when it gets to the 3rd eigenvalue, the resulting RREF is an identity matrix.
This causes the algorithm to die.

The matrix copied from above:

[2, 0, 0, 0, 0, 0]
[0, 2, 0, 0, 0, 1729.7]
[0, 0, 2, 0, -1729.7, 0]
[0, 0, 0, 0, 0, 0]
[0, 0, -1729.7, 0, 2828785.69, 0]
[0, 1729.7, 0, 0, 0, 2828785.69]

The eigenvalues: 2828786.747649, 2828786.747649, 2, 0.94235235527, 0.94235235527, 0

The offending eigenvalue, 2, produces this $T (A-λI) matrix (after everything is scaled down by the largest matrix element):

[3.7388808264281E-7, 0, 0, 0, 0, 0]
[0, 3.7388808264281E-7, 0, 0, 0, 0.00061146378324616]
[0, 0, 3.7388808264281E-7, 0, -0.00061146378324616, 0]
[0, 0, 0, -3.3312906859285E-7, 0, 0]
[0, 0, -0.00061146378324616, 0, 0.99999966687093, 0]
[0, 0.00061146378324616, 0, 0, 0, 0.99999966687093]

I plugged this into Wolfram and asked for the RREF. It also gives an identity matrix.

So nothing is wrong with the RREF implementation or the eigenvalue implementation. However, Wolfram can solve for the eigenvectors of the original matrix, so the eigenvector procedure must be missing something: either there is a way to recover when the RREF doesn't point to a solution, or there is an alternative method of getting the eigenvectors.

Edit: @Beakerboy, I don't know if you would know anything about this, but any thoughts? Would you know of a resource to look at?

markrogoyski (Owner)

Hi @Aweptimum,

Thanks for your continued work on this.

I'm presuming the eigenvalue calculation is correct. If that is the case, and we think the eigenvector calculation is the issue, are you able to reproduce the issue with a simpler matrix, or another matrix, which might help identify what characteristics are causing the issue?

Alternatively, is there a more robust method of calculating eigenvectors that we could be using?

Aweptimum (Contributor, Author)

The eigenvalue calculation is correct, yes; I have the correct values verified by NumPy, Wolfram, and a professor who has MATLAB.
I'm not sure how to reproduce it with a simpler matrix :/ but I can try some ideas.

I'm not sure if there's a more "robust" way, because the eigenvector algorithm is just solving (A − λI)x = 0 directly.
That said, the professor I tried getting help from just pointed me at the MATLAB docs for eig, which uses the QZ algorithm in the general case.

The QZ Algorithm is analogous to the QR algorithm, but it uses the results of the QR algorithm to form what's called the Schur Decomposition, then solves the system using the decomposition rather than the original matrix, and then transforms the eigenvectors back to match the original matrix.

The original paper is behind a paywall, and the only implementations I can find are in C or LAPACK (and I can read neither). However, there is a NASA paper someone wrote on extending the algorithm that might be plain enough to work from: https://ntrs.nasa.gov/citations/19730018792

Beakerboy (Contributor)

I thought I’d at least chime in to say that I have no idea how to help, but a quick scan of the nasa paper looks like there’s enough direction that a well motivated individual could just follow the provided steps and implement it. I am not a mathematician, most of my contributions here were just implementing algorithms from Wikipedia, textbooks, or other software packages in PHP, just like this case.
