Add PInv Moore-Penrose pseudoinverse #292

Open
emiddleton opened this issue May 24, 2021 · 9 comments

Comments

@emiddleton

I was porting some code from NumPy, and needed the pinv[1][2] function (Moore-Penrose pseudoinverse). I have started implementing it but have run into some issues making it generic over real and complex numbers.

pub trait Pinv {
    type B;
    type E;
    type C;
    fn pinv(&self, rcond: Option<Self::E>) -> Result<Self::B, Error>;
}

pub trait PInvVInto {
    type B;
    type E;
    fn pinv(self, rcond: Option<Self::E>) -> Result<Self::B, Error>;
}

pub trait PInvInplace {
    type B;
    type E;
    fn pinv(&mut self, rcond: Option<Self::E>) -> Result<Self::B, Error>;
}

impl<A, S> Pinv for ArrayBase<S, Ix2>
where
    A: Scalar + Lapack + PartialOrd + Zero,
    S: Data<Elem = A>,
{
    type E = A::Real;
    type B = Array2<A>;
    type C = Array1<A::Real>;

    fn pinv(&self, rcond: Option<Self::E>) -> Result<Self::B, Error> {
        let result: (Option<Self::B>, Self::C, Option<Self::B>) =
            self.svd(true, true).map_err(|_| Error::SvdFailed)?;
        if let (Some(u), s, Some(vt)) = result {
            // discard small singular values
            let s: Self::C = if let Some(rcond) = rcond {
                s.mapv(|v| if v > rcond { v } else { Zero::zero() })
            } else {
                s
            };

            let u = u.reversed_axes();
            let s = s.insert_axis(Axis(1));
            let x1 = s * u;
            Ok(&vt.reversed_axes() * x1)
        } else {
            Err(Error::SvdFailed)
        }
    }
}

At the moment I am getting this error:

error[E0277]: cannot multiply `ndarray::ArrayBase<OwnedRepr<<A as ndarray_linalg::Scalar>::Real>, Dim<[usize; 2]>>` by `ndarray::ArrayBase<OwnedRepr<A>, Dim<[usize; 2]>>`
  --> numrs/src/linalg.rs:58:24
   |
58 |             let x1 = s * u;
   |                        ^ no implementation for `ndarray::ArrayBase<OwnedRepr<<A as ndarray_linalg::Scalar>::Real>, Dim<[usize; 2]>> * ndarray::ArrayBase<OwnedRepr<A>, Dim<[usize; 2]>>`
   |
   = help: the trait `Mul<ndarray::ArrayBase<OwnedRepr<A>, Dim<[usize; 2]>>>` is not implemented for `ndarray::ArrayBase<OwnedRepr<<A as ndarray_linalg::Scalar>::Real>, Dim<[usize; 2]>>`
help: consider extending the `where` bound, but there might be an alternative better way to express this requirement
   |
38 |     S: Data<Elem = A>, ndarray::ArrayBase<OwnedRepr<<A as ndarray_linalg::Scalar>::Real>, Dim<[usize; 2]>>: Mul<ndarray::ArrayBase<OwnedRepr<A>, Dim<[usize; 2]>>>

My understanding is that this is because I am trying to multiply a Scalar::Real by a Scalar, but I am not sure how this should be fixed.

  1. https://numpy.org/doc/stable/reference/generated/numpy.linalg.pinv.html
  2. https://www.mathworks.com/help/matlab/ref/pinv.html
@jturner314
Member

jturner314 commented May 26, 2021

You could use s.mapv(A::from_real) to convert the real singular values to the corresponding complex type. There are a couple of other issues with the implementation in your comment. The first is that the multiplications are element-wise, when they should be matrix products. The second is that we need to compute the conjugate transpose of the matrices, instead of just the transpose, in the complex case.

Based on Wikipedia, given U, Σ, and V^H, where U and V^H are complex and Σ is real, we need to compute V Σ⁺ U^H. So, we need to do a few things:

  1. Determine how many nonzero singular values to consider, and compute their reciprocals.
  2. Get the conjugate transposes of U and V^H. (Actually, we only need the portions corresponding to "nonzero" singular values.)
  3. Compute the matrix multiplications.

It's also useful to note a couple of things:

  • The value of D A, where D is a diagonal matrix, can be computed by multiplying the rows of A by the values in the diagonal of D. Similarly, the value of A D, where D is a diagonal matrix, can be computed by multiplying the columns of A by the values in the diagonal of D.
  • The singular values are nonnegative, and the LAPACK docs say that they're sorted in descending order.
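The first note above (computing D A by scaling the rows of A) can be sketched with plain Vec-based matrices; `diag_mul_left` is a hypothetical helper name, and the real implementation would operate on ndarray views via `rows_mut` instead:

```rust
// Sketch: computing D * A, where D is diagonal, by scaling the rows of A
// with the diagonal entries of D. Plain Vec<Vec<f64>> is used only for
// illustration; the actual implementation works on ndarray arrays.
fn diag_mul_left(diag: &[f64], a: &[Vec<f64>]) -> Vec<Vec<f64>> {
    diag.iter()
        .zip(a)
        .map(|(&d, row)| row.iter().map(|&x| d * x).collect())
        .collect()
}

fn main() {
    let a = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    let d = [10.0, 0.5]; // diagonal of D
    let da = diag_mul_left(&d, &a);
    // D * A scales row i of A by d[i]:
    assert_eq!(da, vec![vec![10.0, 20.0], vec![1.5, 2.0]]);
    println!("{:?}", da);
}
```

Computing A D by scaling the columns of A is the mirror image: column j of A is multiplied by d[j].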

Here's an implementation with the steps (a) compute Σ⁺ U^H, (b) compute V, and then (c) compute V Σ⁺ U^H.

            if let (Some(u), s, Some(v_h)) = result {
                // Determine how many singular values to keep and compute the
                // values of `Σ⁺ U^H` (up to `num_keep` rows).
                let (num_keep, s_inv_u_h) = {
                    let mut u_t = u.reversed_axes();
                    let mut num_keep = 0;
                    for (&sing_val, mut u_t_row) in s.iter().zip(u_t.rows_mut()) {
                        if sing_val > threshold {
                            let sing_val_recip = sing_val.recip();
                            u_t_row.map_inplace(|u_t| *u_t = A::from_real(sing_val_recip) * u_t.conj());
                            num_keep += 1;
                        } else {
                            break;
                        }
                    }
                    u_t.slice_axis_inplace(Axis(0), Slice::from(..num_keep));
                    (num_keep, u_t)
                };

                // Compute `V` (up to `num_keep` columns).
                let v = {
                    let mut v_h_t = v_h.reversed_axes();
                    v_h_t.slice_axis_inplace(Axis(1), Slice::from(..num_keep));
                    v_h_t.map_inplace(|x| *x = x.conj());
                    v_h_t
                };

                Ok(v.dot(&s_inv_u_h))
            } else {
                ...
            }

Here's another implementation with the steps (a) compute V Σ⁺, (b) compute U^H, and then (c) compute V Σ⁺ U^H.

            if let (Some(u), s, Some(v_h)) = result {
                // Determine how many singular values to keep and compute the
                // values of `V Σ⁺` (up to `num_keep` columns).
                let (num_keep, v_s_inv) = {
                    let mut v_h_t = v_h.reversed_axes();
                    let mut num_keep = 0;
                    for (&sing_val, mut v_h_t_col) in s.iter().zip(v_h_t.columns_mut()) {
                        if sing_val > threshold {
                            let sing_val_recip = sing_val.recip();
                            v_h_t_col.map_inplace(|v_h_t| *v_h_t = A::from_real(sing_val_recip) * v_h_t.conj());
                            num_keep += 1;
                        } else {
                            break;
                        }
                    }
                    v_h_t.slice_axis_inplace(Axis(1), Slice::from(..num_keep));
                    (num_keep, v_h_t)
                };

                // Compute `U^H` (up to `num_keep` rows).
                let u_h = {
                    let mut u_t = u.reversed_axes();
                    u_t.slice_axis_inplace(Axis(0), Slice::from(..num_keep));
                    u_t.map_inplace(|x| *x = x.conj());
                    u_t
                };

                Ok(v_s_inv.dot(&u_h))
            } else {
                ...
            }

Please test these before using them. I haven't verified their correctness, and they may have bugs.

@emiddleton
Author

Thank you for doing this, @jturner314; this is awesome. I really like the optimization of breaking after the first singular value below the threshold. Is this something we could get integrated into ndarray-linalg (if I clean it up and make a pull request)? I don't have time tonight, but if there is interest in integrating it, I will work on it over the weekend.

@jturner314
Member

Is this something we could get integrated into the ndarray-linalg (if I clean it up and make a pull request)?

Yes, this seems useful to include in ndarray-linalg. Other than docs, the biggest thing I'd want to see before merging is tests. The cases that come to mind are combinations of square/rectangular, real/complex, C/Fortran layout, and zero-rank/partial-rank/full-rank (and maybe a case where a singular value is nonzero but below the threshold).

Although not strictly necessary, it seems useful to have a method which uses a heuristic to choose the threshold value rather than requiring the user to manually specify a threshold. Wikipedia says, "For example, in the MATLAB, GNU Octave, or NumPy function pinv, the tolerance is taken to be t = ε⋅max(m, n)⋅max(Σ), where ε is the machine epsilon."
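The heuristic quoted above can be sketched directly in plain Rust; `default_tolerance` and `sigma_max` are names chosen here for illustration, not part of any existing API:

```rust
// Sketch of the MATLAB/Octave/NumPy-style default tolerance quoted from
// Wikipedia: t = eps * max(m, n) * max(sigma), where eps is the machine
// epsilon and sigma_max is the largest singular value.
fn default_tolerance(m: usize, n: usize, sigma_max: f64) -> f64 {
    f64::EPSILON * (m.max(n) as f64) * sigma_max
}

fn main() {
    // e.g. a 100 x 50 matrix whose largest singular value is 3.0
    let t = default_tolerance(100, 50, 3.0);
    // For f64, eps is about 2.22e-16, so t is on the order of 1e-14 here.
    assert!(t > 0.0 && t < 1e-12);
    println!("tolerance = {:e}", t);
}
```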

@emiddleton
Author

I am working through this. I noticed Octave[1] calculates the threshold using

ε⋅max(m, n)⋅max(Σ)

but NumPy[2] just does

rcond⋅max(Σ)

where rcond is an argument that defaults to 1e-15, which is about the same as ε⋅max(m, n) for f64 when the maximum dimension is 10.
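The comparison checks out numerically (a quick sketch):

```rust
fn main() {
    // NumPy's default rcond is a flat 1e-15, while Octave effectively uses
    // eps * max(m, n). For f64, eps is about 2.22e-16, so with
    // max(m, n) = 10 the two agree to within a small constant factor.
    let numpy_rcond = 1e-15_f64;
    let octave_rcond = f64::EPSILON * 10.0; // eps * max(m, n) with max dim 10
    let ratio = octave_rcond / numpy_rcond;
    assert!(ratio > 1.0 && ratio < 3.0);
    println!("eps * 10 = {:e}, ratio to 1e-15 = {:.2}", octave_rcond, ratio);
}
```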

  1. http://octave.org/doxygen/4.0/dd/d4d/dMatrix_8cc_source.html#l00884
  2. https://github.com/numpy/numpy/blob/v1.20.0/numpy/linalg/linalg.py#L1915-L2011

@jturner314
Member

Huh, that's interesting. NumPy's approach of multiplying a user-specified parameter by max(Σ) seems like a good choice. The user could specify rcond = ε⋅max(m, n) to reproduce Octave's behavior. Another option would be to multiply a user-specified parameter by max(m, n)⋅max(Σ). I haven't used pseudoinverses myself, so I'm not familiar enough with the area to have a strong opinion.

@emiddleton
Author

I was thinking of having an optional argument rcond, like NumPy's, that defaults to ε⋅max(m, n) if it's not specified, but I would really like to know where the 1e-15 in NumPy came from.

let rcond = rcond.unwrap_or_else(|| {
    let (n, m) = self.dim();
    Self::E::epsilon() * Self::E::real(n.max(m))
});
let threshold = rcond * s[0];

@emiddleton
Author

emiddleton commented May 31, 2021

The plot thickens: numpy.linalg.matrix_rank[1] uses the threshold

ε⋅max(m, n)⋅max(Σ)

and references

W. H. Press, S. A. Teukolsky, W. T. Vetterling and B. P. Flannery,
"Numerical Recipes (3rd edition)", Cambridge University Press, 2007, page 795.

Which says

if a singular value w_i is nonzero but very small, you should also define its reciprocal to be zero, since its apparent value is probably an artifact of roundoff error, not a meaningful number. A plausible answer to the question "how small is small?" is to edit in this fashion all singular values whose ratio to the largest singular value is less than N times the machine precision
And just to add more confusion, on page 71 there is

ε⋅0.5⋅sqrt(m+n+1)⋅max(Σ)
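For concreteness, the two thresholds can be compared side by side (a sketch; the matrix shape and variable names here are arbitrary):

```rust
fn main() {
    // Two threshold heuristics mentioned above, for an m x n matrix with
    // largest singular value sigma_max:
    //   t1 = eps * max(m, n) * sigma_max             (numpy.linalg.matrix_rank)
    //   t2 = eps * 0.5 * sqrt(m + n + 1) * sigma_max (Numerical Recipes, p. 71)
    let (m, n, sigma_max) = (100usize, 50usize, 1.0f64);
    let t1 = f64::EPSILON * (m.max(n) as f64) * sigma_max;
    let t2 = f64::EPSILON * 0.5 * ((m + n + 1) as f64).sqrt() * sigma_max;
    // For this shape, t1 is larger, i.e. it truncates more aggressively:
    assert!(t1 > t2);
    println!("t1 = {:e}, t2 = {:e}", t1, t2);
}
```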

The numpy.linalg.matrix_rank documentation also states that an absolute value might be appropriate in some cases:

The thresholds above deal with floating point roundoff error in the calculation of the SVD. However, you may have more information about the sources of error in M that would make you consider other tolerance values to detect effective rank deficiency. The most useful measure of the tolerance depends on the operations you intend to use on your matrix. For example, if your data come from uncertain measurements with uncertainties greater than floating point epsilon, choosing a tolerance near that uncertainty may be preferable. The tolerance may be absolute if the uncertainties are absolute rather than relative.

But I think it's probably better not to support this unless there is a compelling need.

  1. https://github.com/numpy/numpy/blob/v1.20.0/numpy/linalg/linalg.py#L1804-L1906

@emiddleton
Author

I have implemented tests for matrices of various ranks, but it seems like there must be a better way to create them.

@powpingdone

Sorry for bumping, but what is the status of this?
