Closed
Description
The following function is roughly 3x slower for me compared to regular JS for large values. Am I doing something wrong, or is there something I can do to speed it up? Any kind of help is appreciated!
function map_page_rank(pages: Array<Int32Array>, page_ranks: Float64Array, maps: Array<Float64Array>, noutlinks: Int32Array, n: i32): void {
  const t1 = performanceNow();
  for (let i = 0; i < n; ++i) {
    const outbound_rank = unchecked(page_ranks[i]) / unchecked(noutlinks[i]);
    for (let j = 0; j < n; ++j) {
      if (unchecked(pages[i][j]) === 0) {
        unchecked(maps[i][j] = 0);
      } else {
        unchecked(maps[i][j] = unchecked(pages[i][j]) * outbound_rank);
      }
    }
  }
  const t2 = performanceNow();
  consoleLog(((t2 - t1) / 1000).toString());
}
Activity
jtenner commented on Mar 25, 2020
Using unchecked() will get you some of the way, but it does appear that JS is good at doing JS things. You can probably speed up your code a lot by using StaticArray<T> instead of Array<T>, Int32Array and Float64Array. If pages must be resized, you will need regular Array instead.
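For illustration, a minimal sketch of what the suggested swap to StaticArray might look like, assuming none of the containers ever need to be resized (the function name and parameter types here are not from the thread):

function map_page_rank_static(
  pages: StaticArray<StaticArray<i32>>,
  page_ranks: StaticArray<f64>,
  maps: StaticArray<StaticArray<f64>>,
  noutlinks: StaticArray<i32>,
  n: i32
): void {
  for (let i = 0; i < n; ++i) {
    const outbound_rank = unchecked(page_ranks[i]) / unchecked(noutlinks[i]);
    for (let j = 0; j < n; ++j) {
      if (unchecked(pages[i][j]) === 0) {
        unchecked(maps[i][j] = 0);
      } else {
        unchecked(maps[i][j] = unchecked(pages[i][j]) * outbound_rank);
      }
    }
  }
}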
dcodeIO commented on Mar 25, 2020
Would imagine two reasons for the slowdown there: One is the level of indirection of normal and typed arrays, as mentioned by jtenner (using StaticArray can help there), and the other is reference counting overhead inside the loop. For instance, let map = unchecked(maps[i]) will retain a reference to the i'th Float64Array within maps, use it and release it again when continuing or breaking. In general it's fair to say that AssemblyScript isn't super fast when it comes to idiomatic TypeScript code like in this snippet, and will require two things to become better at it: Wasm GC, so we can get rid of reference counting overhead, and an optimization pass to directize array accesses (for example by representing arrays as multi-value structs, with one value being a direct pointer to the backing buffer).
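A minimal sketch of that row-caching idea applied to the original function, hoisting the unchecked(pages[i]) and unchecked(maps[i]) lookups out of the inner loop so each row reference is retained and released once per i rather than once per access (the function name is illustrative):

function map_page_rank_cached(pages: Array<Int32Array>, page_ranks: Float64Array,
                              maps: Array<Float64Array>, noutlinks: Int32Array, n: i32): void {
  for (let i = 0; i < n; ++i) {
    const outbound_rank = unchecked(page_ranks[i]) / unchecked(noutlinks[i]);
    const page = unchecked(pages[i]); // row references cached once per i
    const map = unchecked(maps[i]);
    for (let j = 0; j < n; ++j) {
      if (unchecked(page[j]) === 0) {
        unchecked(map[j] = 0);
      } else {
        unchecked(map[j] = unchecked(page[j]) * outbound_rank);
      }
    }
  }
}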
nischayv commented on Mar 25, 2020
Thanks, this helped a lot! let map = unchecked(maps[i]) had a much more significant impact on performance than StaticArray, but both improved the performance!
nischayv commented on Mar 25, 2020
I had a follow-up question with a similar issue. I'm doing a transpose of a matrix, and I need to access the j'th element in the inner loop like matrix[j][i], so I can't maintain a reference for the inner loop like in the situation above. Is there anything else that can be done to speed it up? The exact code is below. I'm working on building some benchmarks to see how AssemblyScript-compiled Wasm performs in comparison to JS for scientific programming.
dcodeIO commented on Mar 25, 2020
Looks like the ideal thing to do there is to map the multi-dimensional StaticArray<PolarArray> to a one-dimensional StaticArray<Float64Array> with two subsequent elements per entry.
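Since the PolarArray type from the original code isn't shown in the thread, here is one illustrative reading of that flattening suggestion: keep the two f64 components of each (i, j) entry side by side in a single flat Float64Array, so a transpose only touches one backing buffer and needs no per-row references (all names and the layout are assumptions):

function transpose_flat(src: Float64Array, dst: Float64Array, n: i32): void {
  for (let i = 0; i < n; ++i) {
    for (let j = 0; j < n; ++j) {
      const from = 2 * (j * n + i); // read column-wise from the source
      const to = 2 * (i * n + j);   // write row-wise into the destination
      unchecked(dst[to] = src[from]);
      unchecked(dst[to + 1] = src[from + 1]);
    }
  }
}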
MaxGraey commented on Mar 26, 2020
Also try caching all matrix elements:
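The snippet that originally followed this comment isn't preserved here. One way to read the suggestion, applied to a generic transpose loop with made-up names, is to pull each indexed lookup into a local before using it:

function transpose(src: Array<Float64Array>, dst: Array<Float64Array>, n: i32): void {
  for (let i = 0; i < n; ++i) {
    const dstRow = unchecked(dst[i]);   // destination row cached once per i
    for (let j = 0; j < n; ++j) {
      const srcRow = unchecked(src[j]); // column-wise access still needs a lookup per j
      const value = unchecked(srcRow[i]);
      unchecked(dstRow[j] = value);
    }
  }
}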
nischayv commented on Apr 13, 2020
Caching all elements helped a bit, but the best solution is probably turning it into a 1D array. Before I close the issue, @dcodeIO could you explain this statement a bit more:
"an optimization pass to directize array accesses (for example by representing arrays as multi-value structs, with one value being a direct pointer to the backing buffer)"
How are arrays being represented currently, then? Also, regarding the reference counting overhead, just to be clear: is the reference to the i'th element released after each iteration of j? Appreciate the help!
dcodeIO commented on Apr 13, 2020
Unless using StaticArray, the representation currently involves one level of indirection when accessing an array element. This leads to two loads, Array#buffer first, then loading the element from that buffer, on each access. Lowering the array struct to multiple values (once we have multi-value support), passing these around instead of just a pointer to the array, might help to get rid of the indirection, but whether this also has a caveat in actual Wasm engines needs to be investigated.
I might not be understanding the question fully, but references are not re-retained when entering an inner code block. So, what's referenced in the outer block remains retained during execution of the inner block. Perhaps looking at the generated text format will give a few hints about what exactly is happening there. Instructions to look out for are calls to __retain and __release, which modify reference counts. There are usually fewer of those after optimizations, because there is a pass eliminating unnecessary ones.
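To make the retain/release timing concrete, a small sketch of the behaviour described above (as explained in the comments; not verified against a particular compiler build):

function zero_rows(maps: Array<Float64Array>, n: i32): void {
  for (let i = 0; i < n; ++i) {
    const row = unchecked(maps[i]); // the row reference is retained once here, per outer iteration
    for (let j = 0; j < n; ++j) {
      unchecked(row[j] = 0.0);      // no extra retain/release inside the inner loop
    }
    // per the comments above, the reference is released again when the outer
    // iteration continues or breaks, not after every j
  }
}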
nischayv commented on Apr 16, 2020
I just misunderstood. I looked at the array code and it makes sense now. Thanks for the help! Closing this issue.