Closed
Description
Description
@ricardoV94 was doing some benchmarking of numba code generated by scan
, and we discovered that wrapping output outer_in
values with np.array
significantly slows down the scans. Some speed tests are here. Cliff's notes:
- The numba code scan generates is 100x slower than a vectorized numba function
- This speed difference is entirely due to wrapping
outer_in
withnp.array
in theidx_to_str
function.
I'm going to test removing the np.array
part entirely and run the test suite to see what happens, but in the meantime I'm opening this issue to ask if anyone knows why this exists.
Activity
ricardoV94 commentedon Mar 4, 2023
I suspect it exists for tensor scalar inputs which our transpilation of numba has a tendency to "downcast" to floats here and there.
This is evidenced by the fact that helper function has an "allow_scalar" which is set to True just for nit-sot (outputs that don't feed back to the internal function).
This safeguard should not be needed for tensor inputs withndim > 0
(which is know at compile time) and which was the case in the gistThis safeguard applies exactly to our case where indexing the vector sequence would return a scalar instead of a scalar array
jessegrabowski commentedon Mar 4, 2023
Right. In the case the speed test considers, the input is 1d. We also tested the 2d case, but there's no speedup to be had here.
As you said to me before, the danger in passing a scalar to the inner function would be if there is some kind of indexing operation that would error out if the input were just a floatX. But if scalars were handled somehow in optimization, it might end up being safe to remove this
np.array
?ricardoV94 commentedon Mar 4, 2023
Yup, I don't know exactly if indexing is allowed but there must be some Ops that work with scalar arrays but not floats.
We should invest in #107 so that we can properly support scalar graphs, also in Scans
ricardoV94 commentedon Mar 12, 2024
Closing this for now. We avoided useless creation of arrays at every step. The scalar optimizations are more general than the Numba backend and should come for free in a rewritten Scan Op like #191