Implement a segmentation and rastermask UI frontend module #18722

jenshannoschwalm · 2025-04-23T12:10:17Z

After a lot of experimenting and showing preliminary versions (see #18528), here is a reminder that #18356 is where this all started.

See this as a generic UI frontend for AI segmentation or other tools that work on complete image data and should be computed only once, a) for performance reasons, b) because the output is not stable, as with AI algorithms, or c) because they require some external source.

To give something useful, and also something to play around with right now, there is a variance analysing tool generating segment masks for each RGB channel, somewhat like the details threshold. The next tool is "file mask", which allows external files to be used as raster masks.

@MikoMikarro @andriiryzhkov I haven't heard about your work for a while, but this should hopefully be something that is easier to build on.

@TurboGit and @dterrahe - this is an example of how the new module->data is used as dt_iop_segmap_module_data_t.
Please note that segmentation data is simply too costly to be included as a parameter (it can easily be many MB, massively blowing up xmp files). I would suggest to save/restore data provided in struct dt_segmentation_t via another "sidecar" file with another extension.

From commit docs

Note 1: we want to generate rastermasks based on full image data or from some external source that can be used in all other modules as usual.
Examples for this would be AI segmentation algorithms, external mask files or other tools based on whole image data.

As those algorithms can be very costly in terms of performance, we would like to run the algorithm just once, based on full image data, for all pipes.
To make this compliant with darktable's roi strategy without a significant performance drop, this module must be very early in the pipe and requires some care as we want to support all image types. The segmentation is done either while developing in the darkroom full pipe or while exporting, and the results are kept in per-instance module->data for later use.

The results are validated via dt_dev_pixelpipe_piece_hash(), including model versioning; if we find a differing hash we run the segmentation algorithm again and refresh the preview pipes. Please note that all reads and writes of segmentation data must be protected by a mutex, as all pipes share this data.
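The validate-or-recompute step described above could be sketched as follows. This is a minimal illustration and not the module's code: `seg_cache_t`, `run_segmentation()` and `seg_cache_ensure()` are hypothetical names, the hash is assumed to be a plain 64-bit value, and a POSIX mutex stands in for whatever locking primitive the module really uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for the per-instance data shared by all pipes.
typedef struct seg_cache_t
{
  pthread_mutex_t lock; // guards every read and write of the fields below
  uint64_t hash;        // hash the stored segmentation was computed for
  bool valid;           // false until a segmentation has been stored
  int runs;             // how often the costly algorithm ran (illustration only)
} seg_cache_t;

// Hypothetical placeholder for the costly whole-image algorithm.
static void run_segmentation(seg_cache_t *c)
{
  c->runs++;
}

// Make sure the cache matches wanted_hash; returns true if the
// segmentation had to be recomputed (callers would then refresh previews).
static bool seg_cache_ensure(seg_cache_t *c, const uint64_t wanted_hash)
{
  assert(c != NULL);
  pthread_mutex_lock(&c->lock);
  const bool stale = !c->valid || c->hash != wanted_hash;
  if(stale)
  {
    run_segmentation(c);
    c->hash = wanted_hash;
    c->valid = true;
  }
  pthread_mutex_unlock(&c->lock);
  return stale;
}
```

Holding the lock across the recomputation keeps the sketch simple; whether that is acceptable when several pipes run concurrently is a design question this sketch does not answer.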

Note 2: the resulting rastermask is calculated from a selection of segment maps; the maximum number of possible segments is SEGMAP_MAXSEGMENTS. For all image locations we have data in segment maps; the number of generated segment maps depends on the algorithm. AI algorithms might do a segmentation providing multiple segment maps, where the maps for each segment can overlap. Other algorithms might provide just one map, or three, for example one per RGB channel.

To keep memory consumption within limits we

- keep segment map information in uint8_t maps
- possibly save & restore maps in lower resolution and do a bilinear interpolation before they get transformed via the module->distort_mask() functions to the final rastermask.
- When keeping maps in lower resolutions a model might provide a post_process function called when providing the rastermask; the default is a slight gaussian blur.
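As an illustration of the second point, here is a minimal sketch of sampling a low-resolution uint8_t map at a full-resolution pixel position with bilinear interpolation. The function name `segmap_sample()` and the corner-aligned coordinate mapping are assumptions made for this example; the post_process step (the gaussian blur) is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Sample a low-resolution uint8_t map (sw x sh) at full-resolution pixel
// (x, y) of an fw x fh image using bilinear interpolation.
// Returns a mask value in [0, 1].
static float segmap_sample(const uint8_t *map, const int sw, const int sh,
                           const int fw, const int fh, const int x, const int y)
{
  assert(map != NULL && sw > 0 && sh > 0 && fw > 0 && fh > 0);
  assert(x >= 0 && x < fw && y >= 0 && y < fh);
  // Map the full-resolution coordinate onto the low-resolution grid so that
  // the corner pixels of both grids coincide (an assumption of this sketch).
  const float fx = fw > 1 ? (float)x * (float)(sw - 1) / (float)(fw - 1) : 0.0f;
  const float fy = fh > 1 ? (float)y * (float)(sh - 1) / (float)(fh - 1) : 0.0f;
  const int x0 = (int)fx;
  const int y0 = (int)fy;
  const int x1 = x0 + 1 < sw ? x0 + 1 : x0; // clamp at the right border
  const int y1 = y0 + 1 < sh ? y0 + 1 : y0; // clamp at the bottom border
  const float wx = fx - (float)x0;
  const float wy = fy - (float)y0;
  const float top = (1.0f - wx) * map[(size_t)y0 * sw + x0] + wx * map[(size_t)y0 * sw + x1];
  const float bot = (1.0f - wx) * map[(size_t)y1 * sw + x0] + wx * map[(size_t)y1 * sw + x1];
  return ((1.0f - wy) * top + wy * bot) / 255.0f;
}
```

Storing the maps as uint8_t costs one byte per low-resolution pixel, and the conversion to float only happens when the mask is actually requested.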

Note 3: the generated rastermask is combined from a list of selected segment maps. The module provides the user interface to select/deselect maps for the combined list.
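How the selected maps are merged is not spelled out here. One plausible reading is a per-pixel maximum over all selected maps (a union), sketched below. `combine_segmaps()`, the bitmask representation of the selection and the value of `MAX_SEGMENTS` are assumptions for illustration; the text only names SEGMAP_MAXSEGMENTS, not its value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SEGMENTS 32 // hypothetical value, chosen to fit a uint32_t bitmask

// Combine the selected segment maps into one map by taking the per-pixel
// maximum. Bit s of `selected` says whether maps[s] is in the combined list.
static void combine_segmaps(const uint8_t *const *maps, const int nmaps,
                            const uint32_t selected, uint8_t *out,
                            const size_t npixels)
{
  assert(maps != NULL && out != NULL);
  assert(nmaps >= 0 && nmaps <= MAX_SEGMENTS);
  for(size_t k = 0; k < npixels; k++)
  {
    uint8_t v = 0;
    for(int s = 0; s < nmaps; s++)
    {
      if(((selected >> s) & 1u) && maps[s][k] > v)
      {
        v = maps[s][k];
      }
    }
    out[k] = v;
  }
}
```

With overlapping segments the maximum keeps the strongest response at each pixel; other combination rules (for example a saturating sum) would fit the same interface.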

Whenever the module has focus we are in UI visualizing mode, showing a false color representation on a dark grayscale image background. A pixel is

- brightened if it is in any segment map
- red if it belongs to the segment under the mouse
- green if it is included in the combined raster map list
- yellow if it belongs to the segment under the mouse and that segment is included in the combined list
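The colour rules above can be read as a small decision function. The sketch below is one interpretation: the names are hypothetical, and the precedence (yellow, then red, then green, then brightened) is an assumption for the case where several rules apply to the same pixel, for example with overlapping segments.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum
{
  PIX_DARK,   // plain dark grayscale background
  PIX_BRIGHT, // in some segment map
  PIX_RED,    // in the segment under the mouse
  PIX_GREEN,  // in a segment of the combined list
  PIX_YELLOW  // in the hovered segment, which is also in the combined list
} pix_class_t;

// Decide the false colour for one pixel.
//   in_any           pixel lies in at least one segment map
//   in_hovered       pixel lies in the segment under the mouse
//   hovered_selected the hovered segment is part of the combined list
//   in_combined      pixel lies in some segment of the combined list
static pix_class_t classify_pixel(const bool in_any, const bool in_hovered,
                                  const bool hovered_selected, const bool in_combined)
{
  // A pixel in the hovered or a selected segment is by definition in a map.
  assert(!in_hovered || in_any);
  assert(!in_combined || in_any);
  if(in_hovered && hovered_selected) return PIX_YELLOW;
  if(in_hovered) return PIX_RED;
  if(in_combined) return PIX_GREEN;
  if(in_any) return PIX_BRIGHT;
  return PIX_DARK;
}
```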

We can select/deselect segments from the combined list via the mouse:

- a left click adds the segment under the mouse to the combined segments
- a right click removes it
- if combined with shift, a click adds/removes all segments to/from the combined segments

A left-mouse double-click defocuses the module for convenience.
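The click rules can be sketched as a pure function on a selection bitmask. Everything here is illustrative: `apply_click()`, `click_t`, the bitmask representation and the limit of 32 segments are assumptions, and the double-click handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { CLICK_LEFT, CLICK_RIGHT } click_t;

// Return the new selection bitmask after a click.
//   selected   current combined list, bit s set means segment s is included
//   nsegments  number of available segments (1..32 in this sketch)
//   hovered    index of the segment under the mouse, or -1 if there is none
//   shift      true if shift was held: the click then affects all segments
static uint32_t apply_click(const uint32_t selected, const int nsegments,
                            const int hovered, const click_t button, const bool shift)
{
  assert(nsegments > 0 && nsegments <= 32);
  assert(hovered >= -1 && hovered < nsegments);
  const uint32_t all = nsegments == 32 ? UINT32_MAX : ((1u << nsegments) - 1u);
  if(shift) return button == CLICK_LEFT ? all : 0u;
  if(hovered < 0) return selected; // nothing under the mouse, nothing to do
  const uint32_t bit = 1u << hovered;
  return button == CLICK_LEFT ? (selected | bit) : (selected & ~bit);
}
```

For example, starting from an empty selection, a left click on segment 1 yields the mask 0x2, and a shift + left click with three segments yields 0x7.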

Note 4: the segment maps are kept in a dt_segmentation_t struct so we can save/read all data via files (after agreeing how/where) to keep edits after possibly changing a segmentation model.

EDIT: Later commits add the tool to provide a rastermask from external image files.

jenshannoschwalm · 2025-04-24T16:02:21Z

Latest squashed and force-pushed commit version has some fixes and improvements:

- Slightly faster and quality-improved visual mode
- Better logs and dt_control log in case of errors
- Fixed superfluous recalculation of segmentation
- Added OpenCL code. It's not a full implementation, as we simply fall back in visualizing mode or if we find a bad hash, but afterwards it is much faster.

jenshannoschwalm · 2025-04-25T17:15:16Z

The latest commit adds a tool that loads any PFM file and makes it available as a rastermask. (In fact, with RGB PFM files you can choose the segment per RGB channel and combine as usual.)

The first image shows the selected raster map, and the second image shows the visualization of the raster mask as we know it, while being used by the exposure module.

TurboGit · 2025-04-30T06:30:26Z

> I would suggest to save/restore data provided in struct dt_segmentation_t via another "sidecar" file with another extension.

My proposal would be to not save the raster/segmentation mask but to create a path shape for the masking. So the segmentation mask is only a temporary object in memory and after that we get back to standard selection mask that we can adjust. Yes I know creating a path mask out of a raster mask is certainly difficult. IIRC someone has commented that GIMP has such an algorithm, maybe we could reuse it? Or maybe there is a lib for this?

jenshannoschwalm · 2025-05-04T08:20:38Z

Split the raster file stuff to another module and force-pushed.

> My proposal would be to not save the raster/segmentation mask but to create a path shape for the masking. So the segmentation mask is only a temporary object in memory and after that we get back to standard selection mask that we can adjust. Yes I know creating a path mask out of a raster mask is certainly difficult.

a) we want to support possibly overlapping segments. This could be handled in this module and we provide a "combined path mask"

b) A simple "path" or "brush" mask wouldn't work if the segments are not morphologically "one region", could we agree that we only want "morphologically one part of the image" to be used as a segment? That could be done with a path or brush mask, possibly doing some morphological cleaning up (as done in highlights segmentation) @MikoMikarro would that be sufficient?

TurboGit · 2025-05-04T16:41:26Z

> A simple "path" or "brush" mask wouldn't work if the segments are not morphologically "one region", could we agree that we only want "morphologically one part of the image" to be used as a segment?

If there is more than one region we can create multiple path masks. And let users combine them on the mask manager as needed.
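Splitting a mask into its separate regions, as a first step before tracing one path per region, could look like the sketch below. It labels 4-connected regions with an iterative flood fill; `label_regions()` is a hypothetical helper, the choice of 4-connectivity is an assumption, and the actual raster-to-path tracing is not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

// Label 4-connected regions of non-zero pixels in `mask` (w x h).
// `labels` receives 0 for background and 1..n for the regions.
// Returns the number of regions n, or -1 if memory allocation fails.
static int label_regions(const uint8_t *mask, const int w, const int h, int *labels)
{
  assert(mask != NULL && labels != NULL && w > 0 && h > 0);
  const size_t n = (size_t)w * (size_t)h;
  // Each pixel is pushed at most once, so n entries are always enough.
  size_t *stack = malloc(n * sizeof(size_t));
  if(!stack) return -1;
  for(size_t k = 0; k < n; k++) labels[k] = 0;

  int count = 0;
  for(size_t start = 0; start < n; start++)
  {
    if(!mask[start] || labels[start]) continue;
    count++; // found a pixel of a region not seen before
    size_t top = 0;
    stack[top++] = start;
    labels[start] = count;
    while(top > 0)
    {
      const size_t p = stack[--top];
      const int x = (int)(p % (size_t)w);
      const int y = (int)(p / (size_t)w);
      const int nx[4] = { x - 1, x + 1, x, x };
      const int ny[4] = { y, y, y - 1, y + 1 };
      for(int d = 0; d < 4; d++)
      {
        if(nx[d] < 0 || nx[d] >= w || ny[d] < 0 || ny[d] >= h) continue;
        const size_t q = (size_t)ny[d] * (size_t)w + (size_t)nx[d];
        if(mask[q] && !labels[q])
        {
          labels[q] = count; // mark when pushed so no pixel is pushed twice
          stack[top++] = q;
        }
      }
    }
  }
  free(stack);
  return count;
}
```

Each label could then be traced into its own path mask, leaving the combination to the mask manager as suggested above.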

TurboGit · 2025-05-04T16:50:58Z

To explain what I fear. To me, having such a feature would be quite powerful, and it will certainly be used more and more; maybe at some point most people would use only the segmentation mask, and so a huge raster mask. Then masking would no longer be part of the .xmp, making it no longer self-contained. This would be a shame really.

So:

- A raster mask is huge
- A raster mask is not saved into XMP
- A raster mask cannot be edited in dt

If we have a way to convert them to path masks we would have all the power already supported in dt:

- Can be edited (path & feather)
- Multiple masks can be selected and combined (union, intersection...)
- Is saved into XMP

jenshannoschwalm · 2025-05-05T02:32:48Z

> To explain what I fear ...

Good point, and indeed I didn't think about this in depth before.

So we can agree on: This module

1. should not provide a raster mask
2. has a UI that allows selecting and possibly combining "segments"
3. will provide brush/path mask(s) that can be used further on via the mask manager
4. must keep segmentation results stable, as the resulting brush/path masks depend on them to keep edits from run to run

TurboGit · 2025-05-05T06:08:44Z

1,2,3 OK

I'm not sure about 4. My view is that you run the segmentation tool and create paths/brushes out of it. Then the segmentation result can be discarded, and if the result is different 2 years later that's not a big problem. The saved paths/brushes are kept as-is anyway. So to summarize, the segmentation result is only a temporary object. Or maybe I don't understand your 4th point.

jenshannoschwalm · 2025-05-05T07:32:04Z

I meant, we should not do the segmentation again if once done as that would overwrite the generated masks.

TurboGit · 2025-05-05T07:46:53Z

> I meant, we should not do the segmentation again if once done as that would overwrite the generated masks.

Agreed!

EDIT: Or at least, if the segmentation process is restarted, it should create new paths only if they have changed.


MikoMikarro · 2025-05-14T07:00:21Z

I'm really happy with how the conversation turned out! It is a very good idea to produce the paths instead of storing the images. In the end, the segmentations are 512x512 or 1024x1024 depending on the model, and having the path would help to further refine the mask.

As I understand, the workflow would work like the following:

1. You open a module that allows brush masks
2. You select the subject
3. If it is the first time this is selected for this image (since opening DT), the algorithm is executed and generates all the possible subjects.
4. The artist moves the subject-selection pointer until it highlights what they want (maybe adding multiple points would be awesome, as some of the masks are not perfect and your subject may be part of "multiple" subjects generated by the model).
5. Once the artist confirms the subject, we generate the path (or paths), and those are stored as normal DT masks
6. If a new mask is wanted, go to step 1
7. Once DT is closed, the temporary raster subjects created by the AI model are lost.

This would also work with other models that were presented in #18356 that are even more precise, so it would make it much more modular for future developers and AI model integrations. Maybe some kind of garbage collection may need to be implemented so the temporary raster subjects are not stored for many pictures and the RAM goes 🔥, but they are indeed only one 512x512x1 uint8 picture for each of the images that wanted subject selection, so it shouldn't be a problem.

TurboGit · 2025-05-14T15:46:40Z

@MikoMikarro : Yes that's basically my thinking about this. This way the AI masks will be properly integrated into darktable.

XDjackieXD · 2025-05-25T11:43:13Z

I'm unsure how well "raster to path" converted masks work for fine details. There are cases, especially combined with any machine-learning-based foreground/background extraction, where I'd want an extremely accurate selection for it to work well (and in such cases I'd have no issue with large XMP files). Hair is one good example where, depending on the filters used, it can be very obvious if the selection isn't perfect.

MikoMikarro · 2025-05-26T15:30:43Z

I understand, but there isn't any current implementation of the masks that would allow for hair-level detail. In any case, the mask fine-tuning you can do with the feather + contrast and detail contrast should be more than enough to get that going. The other advantage of generating paths is that they can be further edited in a way most users are used to. Do you propose any way to store the accurate selections without the problem of the big XML files?

However, it brings a good point that currently, in Darktable is hard to generate very detailed masks "by hand". Usually, I can use the mask magic sliders that are on the bottom to get it to the point I like, but it is an interesting conversation to bring.

XDjackieXD · 2025-05-27T09:14:59Z

> I understand, but there isn't any current implementation of the masks that would allow for hair-level detail.

That is true but even a raster mask import for masks with the same size as the image could be helpful for experimenting with new things or some other workflows.

> Do you propose any way to store the accurate selections without the problem of the big XML files?

I know that this is suboptimal in every possible imaginable way (see the Matrix protocol as an example of how shitty this can be...).
I also wouldn't propose to use this all the time but have it as an option for when it is actually useful.
Even if it is base64 encoded - which would be an extremely inefficient way but probably the most obvious one if you'd like to stay 100% XML spec compliant - it would be nice to have it as an option for those cases where it is useful.
If I need/want it for a specific thing I can live with absurdly large XMP files.
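To put rough numbers on the size discussion, the arithmetic can be sketched as below. The helper names are hypothetical; the only facts used are that a uint8 map needs one byte per pixel and that padded base64 emits 4 characters for every started group of 3 input bytes.

```c
#include <assert.h>
#include <stddef.h>

// Bytes needed for one uncompressed uint8 map of w x h pixels.
static size_t raw_map_bytes(const size_t w, const size_t h)
{
  assert(w > 0 && h > 0);
  return w * h;
}

// Characters produced by padded base64 encoding of n bytes:
// 4 characters per started group of 3 bytes.
static size_t base64_chars(const size_t n)
{
  return 4 * ((n + 2) / 3);
}
```

A 512x512 map is 262144 bytes (256 KiB) raw and 349528 characters in base64. A full-resolution map for a 6000x4000 image (a size picked only as an example) would be 24000000 bytes raw and 32000000 characters in base64, per map and before any compression.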

> in Darktable is hard to generate very detailed masks "by hand".

Yes, and I don't need (at least not right now) tooling to edit such detailed masks in Darktable. A pure import/export would be a very good start (ideally there'd be an interface in the plugin system to set such a mask, but even that I'd consider "bonus" and not required for the beginning).

EDIT: Oh and I forgot to mention that I definitely see your point about the "magic sliders" usually being enough. Right now it's what I do and it works fine. Sometimes I'm too much of a perfectionist though (non-perfect masks are easy to spot when doing some rather extreme localized adjustments so I sometimes spend waaay too much time tinkering with the masks ^^')

jenshannoschwalm added the labels "feature: new" (new features to add) and "scope: image processing" (correcting pixels) Apr 23, 2025

jenshannoschwalm force-pushed the rastermaps branch 3 times, most recently from de994d6 to 5a939e4 Compare April 24, 2025 15:58

jenshannoschwalm force-pushed the rastermaps branch from f72ba00 to 8087365 Compare April 25, 2025 17:10

jenshannoschwalm force-pushed the rastermaps branch 2 times, most recently from 400d0f8 to b9007bc Compare April 26, 2025 06:06

marc-fouquet mentioned this pull request May 1, 2025

Tone equalizer 2025-04-06 preview version #18656

Open

jenshannoschwalm force-pushed the rastermaps branch from b9007bc to d44e82c Compare May 4, 2025 07:23

jenshannoschwalm force-pushed the rastermaps branch from d44e82c to 206a71c Compare May 8, 2025 17:39

MikoMikarro mentioned this pull request May 17, 2025

AI Masks #12295

Open
