Skip to content

Add a binary_heap module #459

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 4, 2025
Merged

Add a binary_heap module #459

merged 1 commit into from
Apr 4, 2025

Conversation

dr-m
Copy link
Contributor

@dr-m dr-m commented Feb 23, 2025

binary_heap.make_heap(array,cmp) transforms a Berry list into a binary heap where the heap property is defined by a comparison cmp.

binary_heap.remove_heap(array,cmp) removes and returns the first element of a binary heap and maintains the heap property for the remaining elements. By repeatedly invoking it, one can obtain the first few elements of a Berry list in a sorted order, without having to sort all elements.

binary_heap.sort(array,cmp) implements a heap sort of a Berry list.

The example program can be run as follows:

berry -m modules modules/examples/binary_heap_sort.be

Copy link
Contributor

@s-hadinger s-hadinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I'm not sure to understand what use the heap serves? There are no heap traversal functions, so it looks like it's only used to keep a sorted list of items?

binary_heap.make_heap(array,cmp) transforms a Berry list into a
binary heap where the heap property is defined by a comparison cmp.

binary_heap.remove_heap(array,cmp) removes and returns the first element
of a binary heap and maintains the heap property for the remaining
elements.  By repeatedly invoking it, one can obtain the first few
elements of a Berry list in a sorted order, without having to sort
all elements.

binary_heap.sort(array,cmp) implements a heap sort of a Berry list.

The example program can be run as follows:
berry -m modules modules/examples/binary_heap_sort.be
@s-hadinger
Copy link
Contributor

Hi, lots of good discussion. I have the impression we are at a good state now. Is there any more change needed? Or should we merge as-is?

@dr-m
Copy link
Contributor Author

dr-m commented Mar 18, 2025

Hi, lots of good discussion. I have the impression we are at a good state now. Is there any more change needed? Or should we merge as-is?

I was not planning to revise this further, but I am of course open to any suggestions.

One addition that might be useful would be making this and other examples part of your regular testing, for catching any unexpected correctness or performance changes whenever the interpreter is changed. I experimented with the following:

diff --git a/testall.be b/testall.be
index 57414db..5636ea3 100755
--- a/testall.be
+++ b/testall.be
@@ -11,7 +11,7 @@ var total = 0, failed = 0
 for i : testcases
     if os.path.splitext(i)[1] == '.be'
         print('\033[0;36mrun testcase: ' + i + '\033[0m')
-        var ret = os.system(exec, os.path.join(path, i))
+        var ret = os.system(exec, '-m modules ' + os.path.join(path, i))
         if ret != 0
             print('\033[0;31mreturn code:', ret, '\033[0m')
             failed += 1

and the following modified version of my example added to the tests directory:

--- modules/examples/binary_heap_sort.be	2025-03-12 20:08:52.933266643 +0200
+++ tests/binary_heap_sort.be	2025-03-18 21:35:56.339839650 +0200
@@ -32,3 +32,10 @@
 end
 sort(max, less)
 while i < 7 print(f"{max[i]}: {pi[max[i]]}") i += 1 end
+assert(max[0] == 3)
+assert(max[1] == 6)
+assert(max[2] == 7)
+assert(max[3] == 15)
+assert(max[4] == 18)
+assert(max[5] == 20)
+assert(max[6] == 21)

We’d likely want to disable the print statements somehow (redefining the function?), and maybe import the example file somehow as is, and just add some assert() at the end. The examples/qsort.be could be invoked in a similar fashion.

For catching any performance regression, the test would have to run on some relevant embedded targets, and setting up a CI pipeline for that is nontrivial. I got the impression that the current testall.be is not designed for that either.

@dr-m
Copy link
Contributor Author

dr-m commented Mar 26, 2025

@s-hadinger @sfromis Can this be merged?

@s-hadinger
Copy link
Contributor

On my side yes. I would prefer to have @skiars green light too. But he's often quite busy and not always very responsive.

@skiars
Copy link
Member

skiars commented Apr 4, 2025

No problem on my side either.

As for performance regression testing, I'm dealing with a similar issue in my current work. But I'm not planning to run benchmarks on embedded targets - x86 can give a close enough measurement, and it's way easier to set up and track metrics. The catch is, even regular CI servers suck at running benchmark jobs reliably. You’d need a dedicated, stable environment to pull it off.

@dr-m
Copy link
Contributor Author

dr-m commented Apr 4, 2025

Right, automated performance testing is a challenge. You would most likely want dedicated environments and some queue management. It is possible to integrate GitHub with external CI, but it is not trivial, especially if you would want to include some embedded targets.

Maybe I would have observed more substantial performance differences between variations on x86 if I had scaled up the test program and data so that it would not fit in L1 cache and one run would take seconds instead of milliseconds.

@skiars
Copy link
Member

skiars commented Apr 4, 2025

On embedded targets, there's a huge performance gap between PSRAM and SRAM. Plus, optimizing code meant to run in PSRAM (which usually has cache) requires completely different tricks. This makes truly objective performance benchmarking damn near impossible.

@skiars skiars merged commit 5bf1266 into berry-lang:master Apr 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants