
Implement a segmentation and rastermask UI frontend module #18722


Open
wants to merge 1 commit into
base: master

jenshannoschwalm
Collaborator

@jenshannoschwalm jenshannoschwalm commented Apr 23, 2025

After a lot of experimenting and showing preliminary versions - see #18528 - be reminded about #18356 where this all started from.

See this as a generic UI frontend for such AI segmentation stuff or other tools that work on complete image data and should be computed once a) for performance reasons, b) because the output is not stable, as with AI algos, or c) because they require some other external source.

To give something useful - and also to play around with right now - there is a variance-analysing tool generating segment masks for each RGB channel, somewhat like the details threshold. The next tool is a "file mask" reader allowing external files to be used as raster masks.
[Screenshot from 2025-04-23 14-07-13]

@MikoMikarro @andriiryzhkov I haven't heard about your work for a while, but this should hopefully make it easier to work on.

@TurboGit and @dterrahe - this is an example how the new module->data is used as dt_iop_segmap_module_data_t.
Please note that segmentation data is simply too costly to be included as a parameter. (Can easily be many MB massively blowing up xmp files) I would suggest to save/restore data provided in struct dt_segmentation_t via another "sidecar" file with another extension.


From commit docs

Note 1: we want to generate rastermasks based on full image data or from some external source that can be used in all other modules as usual.
Examples for this would be AI segmentation algorithms, external mask files or other tools based on whole image data.

As those algorithms can be computationally very costly, we would like to run them just once, based on full image data, for all pipes.
To make this compliant with darktable's roi strategy without a significant performance drop, this module must be very early in the pipe and requires some care as we want to support all image types. The segmentation is done either while developing in the darkroom fullpipe or while exporting, and the results are kept in per-instance module->data for later usage.

The results are validated via dt_dev_pixelpipe_piece_hash(), including model versioning; if we find a differing hash we run the segmentation algorithm again and refresh the preview pipes. Please note that all reads and writes of segmentation data must be protected via a mutex, as all pipes share this data.
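
A minimal sketch of the "validate by hash, recompute under a mutex" idea from Note 1. This is not the PR's code: `seg_cache_t`, `seg_cache_init()`, `seg_cache_ensure()` and `run_segmentation()` are hypothetical names, the hash is passed in as a plain integer rather than obtained from `dt_dev_pixelpipe_piece_hash()`, and a plain `pthread_mutex_t` stands in for whatever locking the module really uses.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-instance data shared by all pipes. */
typedef struct seg_cache_t
{
  pthread_mutex_t lock;
  uint64_t hash; /* hash the stored result was computed for */
  bool valid;    /* false until a segmentation has been stored */
  int runs;      /* how often the costly algorithm actually ran */
} seg_cache_t;

static void seg_cache_init(seg_cache_t *c)
{
  pthread_mutex_init(&c->lock, NULL);
  c->hash = 0;
  c->valid = false;
  c->runs = 0;
}

/* Placeholder for the costly full-image algorithm (assumption). */
static void run_segmentation(seg_cache_t *c)
{
  c->runs++;
}

/* Make sure the cached result matches current_hash.
   Returns true if the segmentation had to be recomputed. */
static bool seg_cache_ensure(seg_cache_t *c, uint64_t current_hash)
{
  bool recomputed = false;
  pthread_mutex_lock(&c->lock);
  if(!c->valid || c->hash != current_hash)
  {
    run_segmentation(c);
    c->hash = current_hash;
    c->valid = true;
    recomputed = true;
  }
  pthread_mutex_unlock(&c->lock);
  return recomputed;
}
```

In this sketch a caller that gets `true` back would be the one to trigger a refresh of the preview pipes; every other pipe calling with the same hash reuses the stored result.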


Note 2: the resulting rastermask is calculated from a selection of segment maps; the maximum number of possible segments is SEGMAP_MAXSEGMENTS. For all image locations we have data in segment maps; the number of generated segment maps depends on the algorithm. AI algorithms might do a segmentation, where the maps for each segment can overlap, providing multiple segment maps. Other algorithms might provide just one map or three, maybe one for each RGB channel.

To keep memory consumption within limits we

  • keep segment map information in uint8_t maps
  • possibly save & restore maps in lower resolution and do a bilinear interpolation before they get transformed via the module->distort_mask() functions to the final rastermask.
  • when keeping maps in lower resolution, a model might provide a post_process function that is called when providing the rastermask; the default is a slight gaussian blur.
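
A rough sketch of how a low-resolution uint8_t map could be read back at full resolution with bilinear interpolation. The function name `sample_map_bilinear()` and its parameters are made up for illustration, and the edge-aligned coordinate mapping is an assumption; the distort_mask() transform and the optional post_process step are not shown.

```c
#include <stdint.h>

/* Sample a low-resolution uint8_t map (mw x mh) at the full-resolution
   pixel (x, y) of a width x height image using bilinear interpolation.
   Returns a mask value in [0, 1]. */
static float sample_map_bilinear(const uint8_t *map, int mw, int mh,
                                 int x, int y, int width, int height)
{
  /* map the full-resolution coordinate into map space (corners aligned) */
  const float fx = (width > 1) ? (float)x * (float)(mw - 1) / (float)(width - 1) : 0.0f;
  const float fy = (height > 1) ? (float)y * (float)(mh - 1) / (float)(height - 1) : 0.0f;

  /* the four neighbouring map cells, clamped at the right/bottom border */
  const int x0 = (int)fx;
  const int y0 = (int)fy;
  const int x1 = (x0 + 1 < mw) ? x0 + 1 : x0;
  const int y1 = (y0 + 1 < mh) ? y0 + 1 : y0;

  /* fractional position inside the cell */
  const float wx = fx - (float)x0;
  const float wy = fy - (float)y0;

  const float top = (1.0f - wx) * map[y0 * mw + x0] + wx * map[y0 * mw + x1];
  const float bot = (1.0f - wx) * map[y1 * mw + x0] + wx * map[y1 * mw + x1];
  return ((1.0f - wy) * top + wy * bot) / 255.0f;
}
```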

Note 3: the generated rastermask is combined from a list of selected segment maps. The module provides the user interface to select/deselect maps for the combined list.
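
The notes do not say how the selected maps are merged, so the following is only one plausible reading: a per-pixel maximum (union) over all selected maps. `combine_maps()` and the bitmask representation of the selection are hypothetical, and the limit of 32 maps is an assumption of this sketch, not the value of SEGMAP_MAXSEGMENTS.

```c
#include <stddef.h>
#include <stdint.h>

/* Combine the selected segment maps into one float mask in [0, 1].
   Bit s of `selected` marks map s as part of the combined list.
   Assumption: overlapping maps are merged by taking the maximum. */
static void combine_maps(const uint8_t *const *maps, int nmaps,
                         uint32_t selected, size_t npixels, float *mask)
{
  for(size_t k = 0; k < npixels; k++)
  {
    uint8_t best = 0;
    for(int s = 0; s < nmaps && s < 32; s++)
    {
      if((selected & (1u << s)) && maps[s][k] > best) best = maps[s][k];
    }
    mask[k] = (float)best / 255.0f;
  }
}
```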

Whenever the module has focus we are in UI visualizing mode showing a false color representation on a dark grayscale image background. A pixel is

  • brightened if it is in any segment map.
  • red if it belongs to the segment under the mouse
  • green if it is included in the combined raster map list.
  • yellow if it belongs to the segment under the mouse and that segment is included in the combined list.
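
The colour rules above can be sketched as a small classification function. The names (`pix_color_t`, `classify_pixel()`) and the bitmask encoding of segment membership are hypothetical, and the precedence yellow > red > green > brightened is an assumption made so that exactly one colour applies per pixel.

```c
#include <stdint.h>

typedef enum
{
  PIX_BACKGROUND, /* plain dark grayscale background */
  PIX_BRIGHTENED, /* in some segment map */
  PIX_RED,        /* in the segment under the mouse */
  PIX_GREEN,      /* in a segment of the combined list */
  PIX_YELLOW      /* in the hovered segment, which is also in the combined list */
} pix_color_t;

/* member:   bit s set if the pixel belongs to segment s
   hover:    index of the segment under the mouse, or -1 for none
   selected: bit s set if segment s is in the combined list */
static pix_color_t classify_pixel(uint32_t member, int hover, uint32_t selected)
{
  const uint32_t hover_bit = (hover >= 0 && hover < 32) ? (1u << hover) : 0u;
  const int in_hover = (member & hover_bit) != 0;

  if(in_hover && (selected & hover_bit)) return PIX_YELLOW;
  if(in_hover) return PIX_RED;
  if(member & selected) return PIX_GREEN;
  if(member) return PIX_BRIGHTENED;
  return PIX_BACKGROUND;
}
```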

We can select/deselect segments from the combined list via the mouse,

  • a left click adds the segment under the mouse to the combined segments
  • a right click removes it
  • if combined with shift a click adds/removes all segments to/from the combined segments
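
A sketch of the click handling just described, with the combined list held as a bitmask. `apply_click()`, `button_t` and the 32-segment limit are illustrative assumptions; the real module works on its own event and data types.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { BUTTON_LEFT, BUTTON_RIGHT } button_t;

/* Returns the new selection bitmask after a click.
   selected:  bit s set if segment s is in the combined list
   nsegments: number of available segments (at most 32 in this sketch)
   hover:     segment under the mouse, or -1 for none */
static uint32_t apply_click(uint32_t selected, int nsegments, int hover,
                            button_t button, bool shift)
{
  uint32_t all = 0u;
  if(nsegments >= 32) all = 0xffffffffu;
  else if(nsegments > 0) all = (1u << nsegments) - 1u;

  /* shift + click acts on all segments at once */
  if(shift) return (button == BUTTON_LEFT) ? all : 0u;

  /* without a segment under the mouse there is nothing to change */
  if(hover < 0 || hover >= nsegments || hover >= 32) return selected;

  const uint32_t bit = 1u << hover;
  return (button == BUTTON_LEFT) ? (selected | bit) : (selected & ~bit);
}
```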

A double-click de-focuses the module for convenience.


Note 4: the segment maps are kept in a dt_segmentation_t struct so we can save/read all data via files (after agreeing how/where) to keep edits after possibly changing a segmentation model.

EDIT: Later commits add the tool to provide a rastermask from external image files.

@jenshannoschwalm jenshannoschwalm added feature: new new features to add scope: image processing correcting pixels labels Apr 23, 2025
@jenshannoschwalm jenshannoschwalm force-pushed the rastermaps branch 3 times, most recently from de994d6 to 5a939e4 Compare April 24, 2025 15:58
@jenshannoschwalm
Collaborator Author

Latest squashed and force-pushed commit version has some fixes and improvements:

  1. Slightly faster and quality-improved visual mode
  2. Better logs and dt_control log in case of errors
  3. Fixed superfluous recalculation of segmentation
  4. Added OpenCL code. It's not a full implementation, as we simply do a fallback in visualizing mode
    or if we find a bad hash, but afterwards it is much faster.

@jenshannoschwalm
Collaborator Author

The latest commit adds a tool that loads any PFM file and makes that available as a rastermask. (In fact, with RGB PFM files you can choose the segment per RGB channel and combine as usual.)

First image shows the selected raster map and second image the visualizing of the raster mask as we know it while being used by the exposure module.

[Screenshot from 2025-04-25 18-35-03]

[Screenshot from 2025-04-25 18-35-19]

@jenshannoschwalm jenshannoschwalm force-pushed the rastermaps branch 2 times, most recently from 400d0f8 to b9007bc Compare April 26, 2025 06:06
@TurboGit
Member

I would suggest to save/restore data provided in struct dt_segmentation_t via another "sidecar" file with another extension.

My proposal would be to not save the raster/segmentation mask but to create a path shape for the masking. So the segmentation mask is only a temporary object in memory and after that we get back to standard selection mask that we can adjust. Yes I know creating a path mask out of a raster mask is certainly difficult. IIRC someone has commented that GIMP has such an algorithm, maybe we could reuse it? Or maybe there is a lib for this?

@jenshannoschwalm
Collaborator Author

Split the raster file stuff to another module and force-pushed.

My proposal would be to not save the raster/segmentation mask but to create a path shape for the masking. So the segmentation mask is only a temporary object in memory and after that we get back to standard selection mask that we can adjust. Yes I know creating a path mask out of a raster mask is certainly difficult.

a) we want to support possibly overlapping segments. This could be handled in this module and we provide a "combined path mask"

b) A simple "path" or "brush" mask wouldn't work if the segments are not morphologically "one region", could we agree that we only want "morphologically one part of the image" to be used as a segment? That could be done with a path or brush mask, possibly doing some morphological cleaning up (as done in highlights segmentation) @MikoMikarro would that be sufficient?

@TurboGit
Member

TurboGit commented May 4, 2025

A simple "path" or "brush" mask wouldn't work if the segments are not morphologically "one region", could we agree that we only want "morphologically one part of the image" to be used as a segment?

If there is more than one region we can create multiple path masks. And let users combine them on the mask manager as needed.
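
To illustrate the "more than one region" case: before tracing paths, a segment's binary mask could be split into its connected regions, each of which would then become one path mask. The sketch below only does the labelling, using a 4-connected flood fill with an explicit stack. `label_regions()` is a hypothetical helper, not darktable code, and the raster-to-path tracing itself is not shown.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Label the 4-connected regions of a binary mask (non-zero = inside).
   labels receives 0 for background and 1..n for the regions.
   Returns the number of regions, or -1 if memory allocation fails. */
static int label_regions(const uint8_t *mask, int w, int h, int *labels)
{
  const size_t n = (size_t)w * (size_t)h;
  if(n == 0) return 0;

  /* every pixel is pushed at most once, so n entries are enough */
  int *stack = malloc(n * sizeof(int));
  if(!stack) return -1;

  for(size_t k = 0; k < n; k++) labels[k] = 0;

  int count = 0;
  for(size_t start = 0; start < n; start++)
  {
    if(!mask[start] || labels[start]) continue;

    /* found an unlabelled pixel: flood-fill its region */
    count++;
    size_t top = 0;
    labels[start] = count;
    stack[top++] = (int)start;

    while(top > 0)
    {
      const int p = stack[--top];
      const int x = p % w;
      const int y = p / w;
      const int nb[4] = { x > 0 ? p - 1 : -1, x < w - 1 ? p + 1 : -1,
                          y > 0 ? p - w : -1, y < h - 1 ? p + w : -1 };
      for(int i = 0; i < 4; i++)
      {
        const int q = nb[i];
        if(q >= 0 && mask[q] && !labels[q])
        {
          labels[q] = count;
          stack[top++] = q;
        }
      }
    }
  }

  free(stack);
  return count;
}
```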

@TurboGit
Member

TurboGit commented May 4, 2025

To explain what I fear. To me, having such a feature would be quite powerful and it will certainly be used more and more, and maybe at some point most people would use only the segmentation mask, and so a huge raster mask. Then masking would no longer be part of the .xmp, making it no longer self-contained. This would be a shame really.

So:

  • A raster mask is huge
  • A raster mask is not saved into XMP
  • A raster mask cannot be edited in dt

If we have a way to convert them as path masks we would have all the power already supported in dt:

  • Can be edited (path & feather)
  • Multiple masks can be selected and combined (union, intersection...)
  • Is saved into XMP

@jenshannoschwalm
Collaborator Author

To explain what I fear ...

Good point and indeed i didn't think of this in depth before.

So we can agree on: This module

  1. should not provide a raster mask
  2. has a UI that allows selecting and possibly combining selected "segments".
  3. will provide brush/path mask(s) that can be used further on via the mask manager
  4. must keep segmentation results stable, as the resulting brush/path masks depend on this to keep edits from run to run.

@TurboGit
Member

TurboGit commented May 5, 2025

1,2,3 OK

I'm not sure about 4. My view is that you run the segmentation tool and create path/brush out of it. Then the segmentation result can be scratched and if the result is different 2 years later that's not a big problem. The saved path/brush are kept as-is anyway. So to summarize the segmentation result is only a temporary object. Or maybe I don't understand your 4th point.

@jenshannoschwalm
Collaborator Author

I meant, we should not do the segmentation again if once done as that would overwrite the generated masks.

@TurboGit
Member

TurboGit commented May 5, 2025

I meant, we should not do the segmentation again if once done as that would overwrite the generated masks.

Agreed!

EDIT: Or at least, if the segmentation process is restarted, it should create new paths only if they have changed.

@MikoMikarro

I'm really happy with how the conversation turned out! It is a very good idea to produce the paths instead of storing the images. At the end, the segmentations are 512x512 or 1024x1024 depending on the model, and having the path would help to further refine the mask.

As I understand, the workflow would work like the following:

  1. You open a module that allows brush masks
  2. You select the subject
  3. If it is the first time this is selected (since opening DT), in this image, the algorithm is executed and generates all the possible subjects.
  4. The artist uses the pointer of the subject selection until it highlights what he wants (maybe adding multiple points would be awesome, as some of the masks are not perfect and your subject is part of "multiple" subjects generated by the model).
  5. Once the subject is confirmed, we generate the path (or paths), and those are stored as normal DT masks
  6. If a new mask is wanted, go to step 1
  7. Once DT is closed, the temporary raster subjects created by the AI model are lost.

This would also work with other models that were presented in #18356 that are even more precise, so it would make it much more modular for future developers and AI model integrations. Maybe some kind of garbage collection may need to be implemented so the temporary raster subjects are not stored for many pictures and the RAM goes 🔥, but they are indeed only 1 512x512x1 uint8 picture for each of the images that wanted subject selection, so it shouldn't be a problem.
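
For a sense of scale, the memory cost mentioned above is simple arithmetic: a 512x512 single-channel uint8 map takes 262144 bytes (256 KiB) and a 1024x1024 map takes 1 MiB, so keeping a 512x512 map for 100 images would need 25 MiB. The helper names below are hypothetical and only spell out that calculation.

```c
#include <stddef.h>

/* Bytes needed for one single-channel uint8 map of the given size. */
static size_t map_bytes(size_t width, size_t height)
{
  return width * height * sizeof(unsigned char);
}

/* Total bytes when one such map is kept for each of `images` pictures. */
static size_t cache_bytes(size_t width, size_t height, size_t images)
{
  return map_bytes(width, height) * images;
}
```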

@TurboGit
Member

@MikoMikarro : Yes that's basically my thinking about this. This way the AI masks will be properly integrated into darktable.

@MikoMikarro MikoMikarro mentioned this pull request May 17, 2025
@XDjackieXD

I'm unsure how well "raster to path" converted masks work for fine details. There are cases - especially combined with any machine learning based foreground/background extraction - where I'd want an extremely accurate selection for it to work well (and in such cases I'd have no issue with large XMP files). Hair is one good example where, depending on the filters used, it can be very obvious if the selection isn't perfect.

@MikoMikarro

I understand, but there isn't any current implementation of the masks that would allow for hair-level detail. In any case, the mask fine-tuning you can do with the feather + contrast and detail contrast should be more than enough to get that going. The other advantage of generating paths is that they can be further edited in a way most users are used to. Do you propose any way to store the accurate selections without the problem of the big XML files?

However, it brings up a good point that currently, in Darktable, it is hard to generate very detailed masks "by hand". Usually, I can use the mask magic sliders that are on the bottom to get it to the point I like, but it is an interesting conversation to bring.

@XDjackieXD

XDjackieXD commented May 27, 2025

I understand, but there isn't any current implementation of the masks that would allow for hair-level detail.

That is true but even a raster mask import for masks with the same size as the image could be helpful for experimenting with new things or some other workflows.

Do you propose any way to store the accurate selections without the problem of the big XML files?

I know that this is suboptimal in every possible imaginable way (see the Matrix protocol as an example of how shitty this can be...).
I also wouldn't propose to use this all the time but have it as an option for when it is actually useful.
Even if it is base64 encoded - which would be an extremely inefficient way but probably the most obvious one if you'd like to stay 100% XML spec compliant - it would be nice to have it as an option for those cases where it is useful.
If I need/want it for a specific thing I can live with absurdly large XMP files.
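
To put a number on the base64 overhead: standard padded base64 turns every 3 input bytes into 4 output characters, so the encoded text is about a third larger than the raw data. As an illustrative assumption, a full-size 8-bit mask for a 6000x4000 image is 24,000,000 bytes raw and 32,000,000 characters once encoded, before any XML wrapping. `base64_encoded_len()` is a hypothetical helper for this calculation only.

```c
#include <stddef.h>

/* Length in characters of the padded base64 encoding of n raw bytes. */
static size_t base64_encoded_len(size_t n)
{
  return 4 * ((n + 2) / 3);
}
```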

in Darktable is hard to generate very detailed masks "by hand".

Yes, and I don't need (at least not right now) tooling to edit such detailed masks in Darktable. A pure import/export would be a very good start (ideally there'd be an interface in the plugin system to set such a mask but even that I'd consider "bonus" and not required for the beginning).

EDIT: Oh and I forgot to mention that I definitely see your point about the "magic sliders" usually being enough. Right now it's what I do and it works fine. Sometimes I'm too much of a perfectionist though (non-perfect masks are easy to spot when doing some rather extreme localized adjustments so I sometimes spend waaay too much time tinkering with the masks ^^')

Labels
feature: new new features to add scope: image processing correcting pixels

4 participants