Skip to content

remove last #ifdef #193

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 3, 2022

Conversation

guijan
Copy link
Contributor

@guijan guijan commented Jul 25, 2022

Here's #192 but without the extra baggage.

@daltomi
Copy link
Collaborator

daltomi commented Jul 30, 2022

Again?
Don't you understand that it's not a "micro-optimization"? what do you call it, but to make use of the data that we already have.
On a 32-bit architecture, we don't need to allocate new memory, we don't need to copy memory into new memory,
we don't need to use that new memory later. Why do all that when we already have the data?

I'm going to send the next patch, later, now I don't have much time.

diff --git a/src/scrot.c b/src/scrot.c
index b63f9bd..3f41d1e 100644
--- a/src/scrot.c
+++ b/src/scrot.c
@@ -402,10 +402,10 @@ void scrotGrabMousePointer(const Imlib_Image image, const int xOffset,
     const int y = (xcim->y - xcim->yhot) - yOffset;
     DATA32 *pixels = NULL;

-#ifdef __i386__
+#if defined(__i386__) && (UINT_MAX == UINT_LONG)
     pixels = (DATA32 *)xcim->pixels;
 #else
-    DATA32 data[width * height * 4];
+    DATA32 data[width * height];

     unsigned int i;
     for (i = 0; i < (width * height); i++)

@daltomi daltomi closed this Jul 30, 2022
@guijan
Copy link
Contributor Author

guijan commented Aug 2, 2022

And that's a micro-optimization, it complicates code to optimize something that I very much doubt shows up in scrot's runtime or that anyone benchmarked. Most likely, scrot's runtime is dominated by the image codec or writing to disk, not copying a few kilobytes of cursor for a bitmap that is a few megabytes.
I'm more worried about simplicity and who's going to test all the code inside #ifdef, best to just run the same code everywhere.

I assume "UINT_LONG" is a placeholder, but it adds nothing. Even the #ifdef itself is wrong because i386 isn't the only 32-bit arch in the world, the checking should be if (sizeof(*xcim->pixels) == sizeof(*pixels)) { ... }. The compiler will be able to remove the branch that is dead code in the target machine, sizeof can't be used in the preprocessor.

Cleaning up the code bit by bit is the way to make scrot actually faster in the long run. That alone has already removed more kilobytes of code that would get copied into memory by the OS than not copying the cursor here, and it gives us an intelligible program that could receive some real optimizations later.

@daltomi
Copy link
Collaborator

daltomi commented Aug 28, 2022

Hi,
Sorry for not responding sooner, I'm taking exams.

I assume "UINT_LONG" is a placeholder

It's a typo, I meant ULONG_MAX.

Although we don't agree with the "micro-optimization" thing, you have a good point that makes me reopen this PR, and it is this:

I'm more worried about simplicity and who's going to test all the code inside #ifdef, best to just run the same code everywhere.

It is for the same reason that I rejected a PR long ago that involved using #ifdef everywhere.

TODO:

  • change title and description similar to remove i386 ifdef #192
  • the declaration of the variable outside the for loop was proposed here 32b9411, I think it's not strictly necessary anymore.

If you can, complete these tasks, otherwise I will do it when I can. Thanks.

@daltomi daltomi reopened this Aug 28, 2022
Partially revert: 32b9411 ("Fix compilation on Sparc64", 2021-01-23)
@daltomi daltomi merged commit 2bac114 into resurrecting-open-source-projects:master Sep 3, 2022
@@ -411,21 +411,12 @@ void scrotGrabMousePointer(const Imlib_Image image, const int xOffset,
const unsigned short height = xcim->height;
const int x = (xcim->x - xcim->xhot) - xOffset;
const int y = (xcim->y - xcim->yhot) - yOffset;
DATA32 *pixels = NULL;
DATA32 data[width * height];
Copy link
Collaborator

@N-R-K N-R-K Sep 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd avoid using VLA, especially unchecked like this.

Compiling with -Wvla shows this to be the only VLA in the codebase and it could be easily replaced by malloc/free.

Copy link
Collaborator

@N-R-K N-R-K Sep 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually since unsigned long cannot be less than 32bits, you don't need malloc either. You can easily do the mutation in place. Though that would violate the strict aliasing rule.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would consider violating C's aliasing rules by reusing the buffer xcim->pixels points to a bug.

The original version of this PR used reallocarray() to get rid of the VLA, but it was rejected partly because using reallocarray() introduced a dependency on libobsd on OS X, so I kept the VLA in this version.
Looking at this code again, it would be a bug in X if it gave us a cursor that was too large to measure, and it wouldn't be able to allocate xcim->pixels anyway, so we can just call malloc(sizeof(DATA32)*width*height) and assume it doesn't wrap around.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just looked at the Imlib2 API and found a function that can help us. Can anyone of you test if this is correct?
Sorry for the rush, see you next week.

diff --git a/src/scrot.c b/src/scrot.c
index 8783602..274e3c4 100644
--- a/src/scrot.c
+++ b/src/scrot.c
@@ -411,12 +411,8 @@ void scrotGrabMousePointer(const Imlib_Image image, const int xOffset,
     const unsigned short height = xcim->height;
     const int x = (xcim->x - xcim->xhot) - xOffset;
     const int y = (xcim->y - xcim->yhot) - yOffset;
-    DATA32 data[width * height];
-    
-    for (size_t i = 0; i < width*height; i++)
-            data[i] = xcim->pixels[i];
 
-    Imlib_Image imcursor = imlib_create_image_using_data(width, height, data);
+    Imlib_Image imcursor = imlib_create_image_using_copied_data(width, height, (DATA32*)xcim->pixels);
 
     XFree(xcim);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just looked at the Imlib2 API and found a function that can help us. Can anyone of you test if this is correct?

I don't see how that helps. The problem here wasn't ownership, the problem was imlib2 expects 32bit pixels, but XFixesGetCursorImage gives us pixels in unsigned long which might not be 32bits (and is typically 64bits on modern platform).

I would consider violating C's aliasing rules by reusing the buffer xcim->pixels points to a bug.

-fno-strict-alising flag can be enforced in the build if you're worried. The following patch works fine with or without that flag.

diff --git a/src/scrot.c b/src/scrot.c
index 2567b31..d66244a 100644
--- a/src/scrot.c
+++ b/src/scrot.c
@@ -412,14 +412,16 @@ void scrotGrabMousePointer(const Imlib_Image image, const int xOffset,
     const unsigned short height = xcim->height;
     const int x = (xcim->x - xcim->xhot) - xOffset;
     const int y = (xcim->y - xcim->yhot) - yOffset;
-    DATA32 data[width * height];
-    
-    for (size_t i = 0; i < width*height; i++)
-            data[i] = xcim->pixels[i];
+    DATA32 *data = (DATA32 *)xcim->pixels;
 
-    Imlib_Image imcursor = imlib_create_image_using_data(width, height, data);
+    assert(sizeof(*xcim->pixels) >= sizeof(*data));
+    if (sizeof(*xcim->pixels) != sizeof(*data)) {
+        /* NB: this violates strict aliasing rule, but whatever. */
+        for (size_t i = 0; i < (size_t)width*height; i++)
+            data[i] = (DATA32)xcim->pixels[i];
+    }
 
-    XFree(xcim);
+    Imlib_Image imcursor = imlib_create_image_using_data(width, height, data);
 
     if (!imcursor) {
         errx(EXIT_FAILURE,
@@ -433,6 +435,7 @@ void scrotGrabMousePointer(const Imlib_Image image, const int xOffset,
         height);
     imlib_context_set_image(imcursor);
     imlib_free_image();
+    XFree(xcim);
 }
 
 // It assumes that the local variable 'scrot.c:Imlib_Image image' is in context

@N-R-K
Copy link
Collaborator

N-R-K commented Sep 5, 2022

Also FYI, the DATA{n} types are depreciated 1 and imlib2 advises to use the standard fixed width types instead.

Footnotes

  1. https://git.enlightenment.org/old/legacy-imlib2/src/commit/0c687ee837dffdc7a715d7bce5dfe8a4d9ccf0df/src/lib/Imlib2.h.in#L2915

@N-R-K N-R-K mentioned this pull request Sep 24, 2022
daltomi pushed a commit that referenced this pull request Oct 9, 2022
@N-R-K pointed out the deprecated type and revived interest  in removing
the only usage of a VLA in this project (see the discussion: #193).

Author: Guilherme Janczak <[email protected]>
Reviewed-By: NRK <[email protected]>
Reviewed-By: Daniel T. Borelli <[email protected]>
Closes: #198
@guijan guijan deleted the no-ifdef branch January 26, 2023 08:41
N-R-K added a commit to N-R-K/scrot that referenced this pull request May 25, 2023
initially this path wasn't chosen because we suspected this might
violate C's "strict-aliasing" rule. but a more careful look shows that
this is not the case.

xcim->pixels (due to being dynamically allocated memory) does not have
any declared type, and so it's effective type is determined by the last
write, in our case `unsigned long`.

	pixels[i] = xcim->pixels[i];

in all iteration of the loop, `xcim->pixels[i]` at the time of the read,
has the effective type of `unsigned long` which is what we are reading
it as, so there's no issue.

the `pixels[i]` write changes the written-to region's effective type to
`uint32_t` but that value is never read until the `imlib_create_image_using_data`
call later, which reads it as `uint32_t` - matching it's effective type.

as such, there's no real strict-aliasing violation here and we can
simply fixup the array in place. this reduces a potential runtime error
and (in theory) uses less memory.

ref: https://port70.net/~nsz/c/c99/n1256.html#6.5p6
ref: resurrecting-open-source-projects#193 (comment)
N-R-K added a commit that referenced this pull request May 30, 2023
initially this path wasn't chosen because we suspected this might
violate C's "strict-aliasing" rule. but a more careful look shows that
this is not the case.

xcim->pixels (due to being dynamically allocated memory) does not have
any declared type, and so it's effective type is determined by the last
write, in our case `unsigned long`.

	pixels[i] = xcim->pixels[i];

in all iteration of the loop, `xcim->pixels[i]` at the time of the read,
has the effective type of `unsigned long` which is what we are reading
it as, so there's no issue.

the `pixels[i]` write changes the written-to region's effective type to
`uint32_t` but that value is never read until the `imlib_create_image_using_data`
call later, which reads it as `uint32_t` - matching it's effective type.

as such, there's no real strict-aliasing violation here and we can
simply fixup the array in place. this reduces a potential runtime error
and (in theory) uses less memory.

ref: https://port70.net/~nsz/c/c99/n1256.html#6.5p6
ref: #193 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants