A way to position lights with transforms (maybe a variant of worldToScreen?) #7889

Open
1 of 17 tasks
davepagurek opened this issue Jun 8, 2025 · 1 comment
@davepagurek (Contributor)
Increasing access

While working on this sketch https://openprocessing.org/sketch/2670771 I was positioning an exit sign that I also wanted to cast light. I was positioning the sign model using translations and rotations. Getting the light to go in the same spot was going to be tough because functions like pointLight don't take those transformations into account, so matching the position requires matrix math. This is likely not something everyone is comfortable with, and it may discourage them from using lights in their scene at all.

Most appropriate sub-area of p5.js?

  • Accessibility
  • Color
  • Core/Environment/Rendering
  • Data
  • DOM
  • Events
  • Image
  • IO
  • Math
  • Typography
  • Utilities
  • WebGL
  • Build process
  • Unit testing
  • Internationalization
  • Friendly errors
  • Other (specify if possible)

Feature enhancement details

I'm not sure yet what the best way to deal with this is, but here are some ideas:

  • Make lights take transformations into account
    • Benefits: probably the most straightforward conceptually
    • Downsides: to not be a breaking change, this would likely have to be optional, and lighting overloads are already quite complex (see the long list for spotLight, for example)
  • Add a way to get world-space coordinates from local coordinates
    • Benefits: we have something similar for screen coordinates with worldToScreen and screenToWorld. This is what I ended up doing in my sketch, defining a similar worldPoint method at the top
    • Downsides:
      • The target space is world space, but I think we maybe named those other methods slightly inaccurately, since they also use "world" but in a less standard definition. They go from a local coordinate to a screen coordinate. In shader hooks, we use the terms object space, world space, and camera space. Based on those, it would more accurately be objectToScreen. Not sure how to navigate that naming just yet, open to suggestions!
      • It requires some more steps to use it -- you have to first grab the world coordinate given the current transforms, and then pass it into a lighting function
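The second idea can be sketched in plain JavaScript. This is a hypothetical worldPoint-style helper, not p5.js API: it assumes the current model matrix is available as a 16-element column-major array (the convention p5 and WebGL use), and applies it to a local-space point to get world-space coordinates that could then be passed to pointLight().

```javascript
// Hypothetical helper: apply a 4x4 column-major model matrix to a
// local-space point (implicit w = 1) to get world-space coordinates.
function applyMatrixToPoint(m, x, y, z) {
  const w = m[3] * x + m[7] * y + m[11] * z + m[15];
  return [
    (m[0] * x + m[4] * y + m[8] * z + m[12]) / w,
    (m[1] * x + m[5] * y + m[9] * z + m[13]) / w,
    (m[2] * x + m[6] * y + m[10] * z + m[14]) / w,
  ];
}

// Example: a model matrix that translates by (10, 20, 30).
const translateMatrix = [
  1, 0, 0, 0,
  0, 1, 0, 0,
  0, 0, 1, 0,
  10, 20, 30, 1,
];
applyMatrixToPoint(translateMatrix, 1, 2, 3); // → [11, 22, 33]
```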
@GregStanton (Collaborator)

GregStanton commented Jun 9, 2025

Hi @davepagurek!

I’m so glad you posted this issue. It led me to some exciting ideas! I think I have a solution to all the problems you raised. I'll start with the overall concept. Then I'll explain how it solves (a) your light problem, (b) the issue you uncovered with worldToScreen() and screenToWorld(), and (c) other related, longstanding issues. I'd love to hear your feedback.

Transform class

As we’ve discussed, I’m working on a proposal for a Transform class that’s essentially an object-oriented version of the existing Transform features. It provides a user-friendly interface that abstracts away the necessary matrix math. Even for users who know a lot about matrices, putting them behind a user-friendly interface has many advantages:

  • It makes user code readable and self-documenting
  • It prevents errors
  • It enforces conventions
  • It even turns some multiline operations into one-liners

Here are some of the core methods, to give you a sense of it:

// getting/setting reference frame
xAxis()
yAxis()
zAxis()
origin()

// building transforms from standard operations
translate()
scale()
rotate()

// building transforms from general operations
applyTransform()
applyToTransform()
invert()

// using transforms
applyToPoint()
applyToDirection()
applyToNormal()

With the Transform class in hand, we can easily provide access to the important Transform objects, including the usual model, view, and projection transforms. Access to such features has been requested for a long time, and it'd solve current problems faced by WEBGL users. At the moment, they're resorting to using unstable, undocumented internal features, as you pointed out. That's a desire path! We can pave that path by providing the features outlined below.
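To make the proposal concrete, here is a minimal standalone sketch of what such a Transform class could look like. This is not actual p5.js code; the column-major 4x4 internals, the z-axis rotate(), and the method behaviors are all assumptions for illustration.

```javascript
// Minimal sketch of a Transform class wrapping a 4x4 column-major matrix.
class Transform {
  constructor() {
    // Start as the identity transform.
    this.m = [1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1];
  }
  _mult(b) { // this.m = this.m * b (column-major)
    const a = this.m, out = new Array(16).fill(0);
    for (let col = 0; col < 4; col++)
      for (let row = 0; row < 4; row++)
        for (let k = 0; k < 4; k++)
          out[col * 4 + row] += a[k * 4 + row] * b[col * 4 + k];
    this.m = out;
    return this;
  }
  translate(x, y, z = 0) {
    return this._mult([1,0,0,0, 0,1,0,0, 0,0,1,0, x,y,z,1]);
  }
  rotate(angle) { // rotation about the z-axis, in radians
    const c = Math.cos(angle), s = Math.sin(angle);
    return this._mult([c,s,0,0, -s,c,0,0, 0,0,1,0, 0,0,0,1]);
  }
  applyToPoint(x, y, z = 0) { // w = 1: affected by translation
    const m = this.m;
    return [
      m[0]*x + m[4]*y + m[8]*z + m[12],
      m[1]*x + m[5]*y + m[9]*z + m[13],
      m[2]*x + m[6]*y + m[10]*z + m[14],
    ];
  }
  applyToDirection(x, y, z = 0) { // w = 0: ignores translation
    const m = this.m;
    return [
      m[0]*x + m[4]*y + m[8]*z,
      m[1]*x + m[5]*y + m[9]*z,
      m[2]*x + m[6]*y + m[10]*z,
    ];
  }
}

// Usage: translate, then rotate 90° about z, all without raw matrix math.
const t = new Transform().translate(5, 0, 0).rotate(Math.PI / 2);
t.applyToPoint(1, 0, 0);     // ≈ [5, 1, 0]
t.applyToDirection(1, 0, 0); // ≈ [0, 1, 0]
```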

Standalone getters

The most basic getter is a function for getting the currently active transform (the one that's set using standalone features like translate() and rotate()). Like much of p5's existing Transform API, this feature has the same name as a feature on the native CanvasRenderingContext2D.

  • getTransform()
    • Description: wrapper for the model matrix
    • Conversion: local to world
    • Context: Works with P2D (2D) and WEBGL (2D and 3D)

Advanced users may want to convert straight from local coordinates to screen-plus-depth coordinates (i.e. clip space). For them, we can also have a getter for the transform that covers the whole graphics pipeline at once. We can name it after the pipeline to indicate that it's useful for 3D graphics, and it can gracefully degrade to getTransform() in the 2D case.

  • getPipeline()
    • Description: wrapper for the model-view-projection matrix
    • Conversion: local to clip
    • Context: Works with P2D (2D) and WEBGL (2D and 3D)
    • Note: This provides an equivalent of the misnamed worldToScreen(), as well as screenToWorld() via invert().

Class-based getters and setters

The following features expand access to all the key matrices in the graphics pipeline. They also include a simple but powerful API for getting composite transforms like model-view, view-projection, and model-view-projection.

  • Camera.prototype.getTransform(source, target)
    • Description: source, target can each take any value in the set {WORLD, EYE, CLIP}
    • Conversion: source to target
  • Camera.prototype.setEyeTransform(transform)
    • Description: sets transform corresponding to inverse view matrix (alternatively set via camera())
    • Conversion: eye to world
    • Note: places the camera's eye in the world; its origin()'s coordinates are the existing eyeX, eyeY, eyeZ properties
  • Camera.prototype.setProjectionTransform(transform)
    • Description: sets transform corresponding to projection matrix (alternatively set via ortho()/frustum()/perspective())
    • Conversion: eye to clip
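To illustrate the parameterized getter, here is a hypothetical sketch of how getTransform(source, target) could compose the pipeline's component matrices on demand. The space constants, the storage scheme, and the makeCamera() helper are all invented for illustration; the inverse direction would go through invert() and is omitted here.

```javascript
// Multiply two 4x4 column-major matrices: returns a * b.
function mult(a, b) {
  const out = new Array(16).fill(0);
  for (let col = 0; col < 4; col++)
    for (let row = 0; row < 4; row++)
      for (let k = 0; k < 4; k++)
        out[col * 4 + row] += a[k * 4 + row] * b[col * 4 + k];
  return out;
}

const IDENTITY = [1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1];
const WORLD = 0, EYE = 1, CLIP = 2;

function makeCamera(viewMatrix, projectionMatrix) {
  const step = [viewMatrix, projectionMatrix]; // step[i]: space i -> space i+1
  return {
    getTransform(source, target) {
      if (source > target) {
        throw new Error('inverse direction: compose, then invert() (omitted here)');
      }
      let m = IDENTITY;
      for (let i = source; i < target; i++) m = mult(step[i], m);
      return m; // a fresh copy, so internal state stays protected
    },
  };
}

// Example: a view that translates by (0, 0, -5) and a toy "projection"
// that just scales x and y by 2.
const view = [1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,-5,1];
const proj = [2,0,0,0, 0,2,0,0, 0,0,1,0, 0,0,0,1];
const cam = makeCamera(view, proj);
cam.getTransform(WORLD, CLIP); // projection * view, in one call
```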

Hypothetically, if we eventually add a Model class for 3D objects (built from a geometry/shape and a material), we could also add the following features.

  • Model.prototype.getTransform(source, target)
    • Description: source, target can each take any value in the set {LOCAL, WORLD, EYE, CLIP}
    • Conversion: source to target
  • Model.prototype.setLocalTransform(transform)
    • Description: sets the model's local transform, corresponding to the model matrix
    • Conversion: local to world
Comments on select design considerations (for the curious)
  1. Benefits of parameterized getter: The class-based getTransform(source, target) provides a super easy, economical, discoverable, and flexible way to directly get any of the composite transforms like the model-view transform, as well as their inverses. It also allows us to potentially implement optimizations on behalf of the user.

  2. Return value of getters (copies not references): The getters would return copies rather than references, to protect internal state. This is what the native getTransform() method of CanvasRenderingContext2D does, and it's especially important for the composite transforms. Allowing the user to set those would mean we'd need to be able to factor their input into their component pieces and update them accordingly, which would add unnecessary complexity (if it's possible at all).

  3. Feature selection and naming: The local and eye transforms are both named after their respective source spaces, making their reference frames intuitive. For example, the origin() of the eye transform is simply eyeX, eyeY, eyeZ, as noted above. This is one reason to have the eye transform represent the inverse of the view transform. Another reason is that three.js sets a strong precedent for making the inverse-view transform the default feature. That allows us to address an inconsistency in the usual names: the model transform is named after its source space whereas the view transform is named after its target space. Those names would cause confusion if included directly in our API.

  4. Reasons to include one-shot setters: In the initial feature set, we might consider omitting setEyeTransform() and setProjectionTransform() for the sake of economy. But there's probably a reasonable case to be made for including them. Intuitively, one of the advantages of having transform objects is reusability and flexibility, so allowing users to directly set an eye or projection transform from a transform object seems to make sense. It'd also allow us to establish a useful naming convention for the inverse-view transform, and it'd let us document the main component transforms more clearly. And, if we eventually develop a Model class, then setLocalTransform() would be more essential; including that and not including dedicated setters for the eye and projection transforms would introduce an inconsistency that'd reduce predictability.

Small blocker in existing features: Can we fix this?

There's one aspect of the current feature set that would prevent users from working with the active view and projection transforms: there doesn't seem to be a way for users to get the active camera instance, or at least, they can't get it from camera().

If camera() were to work like other joint getters/setters, then users could get the active camera instance with the line let activeCamera = camera(). After that, they could access the active view and projection transforms via activeCamera.getTransform(). Since the standalone getTransform() feature lets users work with the model transform, they could easily access all three of the component transforms in the standard graphics pipeline (as well as combinations of those transforms and their inverses).

Do you know why camera() doesn't currently return the active camera instance, @davepagurek? Also, somewhat tangentially, do you know if there's a reason why the functionality of setCamera() isn't just implemented as an overload on camera()?

Lights

Now I'll come back to the exit sign that you wanted to light up in your sketch. A user facing the same problem could get the transform used to position and orient the sign, via getTransform(). Then they could apply it to their light's position and direction using applyToPoint() and applyToDirection(), respectively. And they could do all that without knowing any matrix math. That'd solve the problem without a breaking change, even if the solution isn't built in to the light functions themselves. If we think it's a good idea, we can change the light functions themselves in the next major version release.
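To show why applyToPoint() and applyToDirection() would be separate operations here, this self-contained sketch (hypothetical helpers, not p5.js API) treats the light's position as a point with implicit w = 1, affected by translation, and the light's aim as a direction with w = 0, affected only by rotation.

```javascript
// Apply a 4x4 column-major matrix to a point (w = 1).
function transformPoint(m, [x, y, z]) {
  return [
    m[0]*x + m[4]*y + m[8]*z + m[12],
    m[1]*x + m[5]*y + m[9]*z + m[13],
    m[2]*x + m[6]*y + m[10]*z + m[14],
  ];
}
// Apply the same matrix to a direction (w = 0): translation drops out.
function transformDirection(m, [x, y, z]) {
  return [
    m[0]*x + m[4]*y + m[8]*z,
    m[1]*x + m[5]*y + m[9]*z,
    m[2]*x + m[6]*y + m[10]*z,
  ];
}

// A sign placed at (100, 50, 0) and rotated 180° about y:
const signMatrix = [
   -1, 0,  0, 0,
    0, 1,  0, 0,
    0, 0, -1, 0,
  100, 50, 0, 1,
];
transformPoint(signMatrix, [0, 0, 0]);     // light position: [100, 50, 0]
transformDirection(signMatrix, [0, 0, 1]); // light aim: [0, 0, -1] (flipped, not shifted)
```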

Cleaning up worldToScreen() and screenToWorld()

Problem

Thanks for pointing out the API issue with the worldToScreen name. That will cause a lot of confusion. Specifically, "world" conveys the opposite of the intended meaning (i.e. global instead of local). So I think we do need to deprecate worldToScreen()/screenToWorld() and introduce an alternative.

To clarify the naming in my proposal, I'll also note that "screen" is problematic in this context. Sheesh! API design is hard. Fundamentally, the issue is this:

  • In everyday understanding, screens are two-dimensional.
  • As a term of art, screen space is also two-dimensional.
  • As a p5.js feature, screens are… three-dimensional.

On the surface, referring to clip space as screen space looks like a user-friendly white lie that conveys the core idea. However, the lie turns out to produce considerable complexity. This includes confusing inconsistencies and inaccuracies in the documentation, where screen space is variously described as both two-dimensional and three-dimensional. While I'm all for simplification, I think an important design principle is to simplify as much as possible, but no more. This is especially true for relatively advanced features such as these.
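To make the distinction concrete, here is a minimal sketch of what separates clip space from an everyday 2D screen coordinate: a perspective divide followed by a viewport mapping. The clipToScreen() helper and the y-flip convention are assumptions for illustration, not p5.js internals.

```javascript
// Convert a clip-space position [x, y, z, w] to a pixel coordinate
// plus depth. Dropping `depth` gives the everyday 2D "screen"; keeping
// it gives the three-dimensional "screen" that p5's docs describe.
function clipToScreen([x, y, z, w], width, height) {
  // Perspective divide: clip -> normalized device coordinates in [-1, 1].
  const ndcX = x / w, ndcY = y / w, ndcZ = z / w;
  // Viewport transform: NDC -> pixels (y flipped so +y points down).
  return {
    x: ((ndcX + 1) / 2) * width,
    y: ((1 - ndcY) / 2) * height,
    depth: ndcZ, // the extra dimension that makes the term ambiguous
  };
}

clipToScreen([0, 0, 0.5, 1], 400, 400); // → { x: 200, y: 200, depth: 0.5 }
```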

Solution

The features I proposed above provide users with the functionality of both worldToScreen() and screenToWorld() (and much more). This is pretty cool, since these kinds of features have been requested for a long time. See, for example, #1553 and #4743; the latter issue lists requests going back to 2014!

EDIT 1: Revised the design after a bunch more analysis, in order to address some hidden sources of confusion.
EDIT 2: Revised the design again, after a lot more analysis, to resolve a flaw. The result is more intuitive and more powerful.
EDIT 3: Revised the writing and added structure to this comment, to make it easier to parse.
