Using Gram matrices to speed up opnorm #1185

Open
@njericha


In Julia version 1.11.2, it appears as though I can speed up calculation of opnorm by taking advantage of the identity

$$ \left\lVert A^\top A \right\rVert_{op} = \left\lVert A A^\top \right\rVert_{op} = \left\lVert A \right\rVert_{op}^2. $$

See https://en.wikipedia.org/wiki/Operator_norm#Operators_on_a_Hilbert_space .
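As a quick numerical sanity check of the identity (not part of the benchmarks below; the variable names here are ad hoc):

```julia
using LinearAlgebra: opnorm, Symmetric
using Random: randn

# The operator norm of A is its largest singular value, which equals the
# square root of the largest eigenvalue of the Gram matrix A'A.
A = randn(200, 20)
@assert isapprox(opnorm(A), sqrt(opnorm(Symmetric(A'A))); rtol=1e-8)
```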

In the following test, we can speed up the calculation of the operator norm for tall matrices $A$ by using $\sqrt{\left\lVert A^\top A \right\rVert_{op}}$ (and presumably wide matrices $A$ by using $\sqrt{\left\lVert AA^\top \right\rVert_{op}}$). Note that wrapping the Gram matrix with Symmetric also reduces the memory needed.

```julia
using Random: randn
using LinearAlgebra: opnorm, Symmetric
using BenchmarkTools

fopnorm1(X) = opnorm(X)
fopnorm2(X) = sqrt(opnorm(X'X))
fopnorm3(X) = sqrt(opnorm(Symmetric(X'X)))

dimensions = (10000, 10)

b = @benchmark fopnorm1(X) setup=(X=randn(dimensions))
display(b)
b = @benchmark fopnorm2(X) setup=(X=randn(dimensions))
display(b)
b = @benchmark fopnorm3(X) setup=(X=randn(dimensions))
display(b)
```
```
BenchmarkTools.Trial: 6522 samples with 1 evaluation per sample.
 Range (min … max):  391.800 μs …   5.152 ms  ┊ GC (min … max): 0.00% … 84.40%
 Time  (median):     463.000 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   503.166 μs ± 153.189 μs  ┊ GC (mean ± σ):  4.63% ± 10.26%

   ▄▆▇██▇▇▆▄▃▂▂▁▁▁ ▁▂▂▁▁  ▁                                     ▂
  ▇█████████████████████████▇▇▇▆▅▆▆▅▄▆▃▅▃▅▄▄▄▁▃▄▄▅▅▆▇▇█▇▇▇██▇██ █
  392 μs        Histogram: log(frequency) by time       1.14 ms <

 Memory estimate: 787.58 KiB, allocs estimate: 13.
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  82.300 μs … 483.100 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     97.400 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   99.002 μs ±  16.227 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▃▄▅▆▇▆▆▅▅▅▄▆▅█▅▆▅▃▃▂▂▁ ▁                                     ▂
  ███████████████████████████▆▇▇██▆█▅▇▇▇▇▆▇▇▇▇▇▆▆▇█▇█▇▆▇▇▇▅▅▄▅ █
  82.3 μs       Histogram: log(frequency) by time       163 μs <

 Memory estimate: 8.09 KiB, allocs estimate: 14.
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  79.000 μs … 376.400 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     95.700 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   95.234 μs ±  12.877 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                █
  ▂▂▃▄▂▄▃▃▄▃▃▂▆▃█▅▄▃▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  79 μs           Histogram: frequency by time          149 μs <

 Memory estimate: 6.08 KiB, allocs estimate: 19.
```

This trick can also pay off for square matrices: in the benchmark below, the Symmetric variant is still the fastest, although the plain Gram-matrix version is actually slower than opnorm, and both use more memory.

```julia
dimensions = (100, 100)

b = @benchmark fopnorm1(X) setup=(X=randn(dimensions))
display(b)
b = @benchmark fopnorm2(X) setup=(X=randn(dimensions))
display(b)
b = @benchmark fopnorm3(X) setup=(X=randn(dimensions))
display(b)
```
```
BenchmarkTools.Trial: 7594 samples with 1 evaluation per sample.
 Range (min … max):  531.600 μs …   5.870 ms  ┊ GC (min … max): 0.00% … 88.93%
 Time  (median):     597.600 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   623.509 μs ± 133.170 μs  ┊ GC (mean ± σ):  1.44% ±  5.50%

     ▂█▇▃
  ▂▂▄█████▇▅▄▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▁▂▁▁▂▁▁▂▂▂▁▁▂▂▂▂▂ ▃
  532 μs           Histogram: frequency by time         1.21 ms <

 Memory estimate: 137.97 KiB, allocs estimate: 14.
BenchmarkTools.Trial: 6847 samples with 1 evaluation per sample.
 Range (min … max):  574.900 μs …   8.099 ms  ┊ GC (min … max): 0.00% … 91.36%
 Time  (median):     669.000 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   695.524 μs ± 163.002 μs  ┊ GC (mean ± σ):  2.15% ±  6.82%

     ▃▄▆█▇▄
  ▂▃████████▆▄▄▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▂▂▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  575 μs           Histogram: frequency by time         1.41 ms <

 Memory estimate: 216.17 KiB, allocs estimate: 17.
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  308.100 μs …   5.089 ms  ┊ GC (min … max): 0.00% … 90.75%
 Time  (median):     366.400 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   383.650 μs ± 123.907 μs  ┊ GC (mean ± σ):  2.79% ±  7.16%

    ▃▄█▄
  ▄▇████▇▆▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▂▁▁▂▂▂▂▂▂▂▂ ▃
  308 μs           Histogram: frequency by time          1.1 ms <

 Memory estimate: 194.64 KiB, allocs estimate: 24.
```

More testing might be needed to see whether this performance boost carries over to other array types, such as sparse matrices, and to other element types, such as complex numbers. But it may be worth redefining opnorm if this performance gain can be achieved reliably.
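If this were pursued, the method could pick whichever Gram matrix is smaller based on the shape of the input. A minimal sketch, assuming real element types (gram_opnorm is a hypothetical name, not an existing LinearAlgebra function; complex inputs would need Hermitian instead of Symmetric):

```julia
using LinearAlgebra: opnorm, Symmetric

# Hypothetical sketch: compute the 2-operator norm via the smaller Gram
# matrix. For a tall m×n matrix (m >= n), A'A is n×n; for a wide one,
# AA' is m×m.
function gram_opnorm(A::AbstractMatrix{<:Real})
    m, n = size(A)
    G = m >= n ? Symmetric(A' * A) : Symmetric(A * A')
    return sqrt(opnorm(G))
end
```

A real implementation would also have to weigh accuracy, since forming the Gram matrix squares the condition number, though for the largest singular value the loss is modest.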
