Implement a segmentation and rastermask UI frontend module #18722
Force-pushed from de994d6 to 5a939e4.
Latest squashed and force-pushed commit version has some fixes and improvements:
Force-pushed from f72ba00 to 8087365.
The latest commit adds a tool that loads any PFM file and makes it available as a rastermask. (In fact, with RGB PFM files you can choose the segment per RGB channel and combine as usual.) The first image shows the selected raster map and the second image the visualization of the raster mask as we know it, here while being used by the exposure module.
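For readers unfamiliar with PFM: the header is three text fields (`PF` for RGB or `Pf` for grayscale, then width and height, then a scale factor whose sign encodes the byte order), followed by raw 32-bit floats stored bottom row first. A minimal sketch of parsing that header is shown here; `pfm_parse_header` is a hypothetical helper for illustration, not the code from this PR.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper, not darktable code: parse a PFM header from memory.
   "PF" = 3 float channels (RGB), "Pf" = 1 channel (grayscale).
   A negative scale factor means little-endian samples.
   Returns the byte offset of the pixel data, or -1 on error. */
static int pfm_parse_header(const char *buf, int *width, int *height,
                            int *channels, int *little_endian)
{
  assert(buf && width && height && channels && little_endian);
  char type = 0;
  float scale = 0.0f;
  int consumed = 0;

  /* %n stores the number of bytes read so far and is not counted in the result */
  if(sscanf(buf, "P%c %d %d %f%n", &type, width, height, &scale, &consumed) != 4)
    return -1;

  if(type == 'F') *channels = 3;
  else if(type == 'f') *channels = 1;
  else return -1;

  if(*width <= 0 || *height <= 0 || scale == 0.0f) return -1;

  /* exactly one whitespace byte separates the header from the raster;
     the rows that follow are stored bottom-to-top */
  if(!isspace((unsigned char)buf[consumed])) return -1;

  *little_endian = scale < 0.0f;
  return consumed + 1;
}
```

With three channels reported, each RGB channel can then be treated as its own segment map, which matches the per-channel selection described above.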
Force-pushed from 400d0f8 to b9007bc.
My proposal would be to not save the raster/segmentation mask but to create a path shape for the masking. So the segmentation mask is only a temporary object in memory, and after that we get back to a standard selection mask that we can adjust. Yes, I know creating a path mask out of a raster mask is certainly difficult. IIRC someone commented that GIMP has such an algorithm, maybe we could reuse it? Or maybe there is a lib for this?
Force-pushed from b9007bc to d44e82c.
Split the raster file stuff to another module and force-pushed.
a) We want to support possibly overlapping segments. This could be handled in this module, and we provide a "combined path mask".
b) A simple "path" or "brush" mask wouldn't work if the segments are not morphologically "one region". Could we agree that we only want "morphologically one part of the image" to be used as a segment? That could be done with a path or brush mask, possibly doing some morphological cleanup (as done in highlights segmentation). @MikoMikarro would that be sufficient?
If there is more than one region we can create multiple path masks. And let users combine them on the mask manager as needed.
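To make "more than one region" concrete: before tracing paths, a raster segment could be split into its connected regions, one future path mask per region. The following is only a sketch of that splitting step under the assumption of 4-connectivity; `label_regions` is a hypothetical helper, and the actual raster-to-path tracing is not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: label the 4-connected regions of a binary mask.
   mask:   width*height bytes, non-zero = inside the segment
   labels: width*height ints, receives 0 for background and 1..n per region
   Returns the number of regions, or -1 if the allocation fails. */
static int label_regions(const uint8_t *mask, int width, int height, int *labels)
{
  assert(mask && labels && width > 0 && height > 0);
  const size_t npix = (size_t)width * height;
  size_t *stack = malloc(npix * sizeof(size_t));
  if(!stack) return -1;
  for(size_t i = 0; i < npix; i++) labels[i] = 0;

  const int dx[4] = { 1, -1, 0, 0 };
  const int dy[4] = { 0, 0, 1, -1 };
  int count = 0;

  for(size_t start = 0; start < npix; start++)
  {
    if(!mask[start] || labels[start]) continue;

    /* unlabeled foreground pixel: flood-fill a new region from here */
    count++;
    size_t top = 0;
    labels[start] = count;
    stack[top++] = start;

    while(top > 0)
    {
      const size_t p = stack[--top];
      const int x = (int)(p % (size_t)width);
      const int y = (int)(p / (size_t)width);
      for(int k = 0; k < 4; k++)
      {
        const int nx = x + dx[k], ny = y + dy[k];
        if(nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
        const size_t q = (size_t)ny * width + nx;
        if(mask[q] && !labels[q])
        {
          labels[q] = count; /* label on push so every pixel is pushed once */
          stack[top++] = q;
        }
      }
    }
  }

  free(stack);
  return count;
}
```

Each label value would then be handed to whatever contour tracer is chosen, producing one path per region for the mask manager.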
To explain what I fear: to me such a feature would be quite powerful and will certainly be used more and more, and maybe at some point most people would use only the segmentation mask, and so a huge raster mask. Then the masking would no longer be part of the .xmp, making it no longer self-contained. This would be a shame really. So:
If we have a way to convert them as path masks we would have all the power already supported in dt:
Good point, and indeed I didn't think about this in depth before. So we can agree on: This module
1, 2, 3 OK. I'm not sure about 4. My view is that you run the segmentation tool and create a path/brush out of it. Then the segmentation result can be discarded, and if the result is different 2 years later that's not a big problem. The saved paths/brushes are kept as-is anyway. So to summarize, the segmentation result is only a temporary object. Or maybe I don't understand your 4th point.
I meant, we should not run the segmentation again once it has been done, as that would overwrite the generated masks.
Agreed! EDIT: Or at least, if the segmentation process is restarted, it should create new paths only if they have changed.
**Note 1:** we want to generate rastermasks based on **full** image data or from some external source that can be used in all other modules as usual.

Examples for this would be AI segmentation algorithms, external mask files or other tools based on whole image data.

As those algorithms can be very costly, we would like to run the algorithm just once, based on full image data, for all pipes.

To make this compliant with darktable's roi strategy without a significant performance drop, this module must be very early in the pipe and requires some care as we want to support all image types. The segmentation is either done while developing in the darkroom full pipe or while exporting, and the results are kept in the per-instance module->data for later use.

The results are validated via `dt_dev_pixelpipe_piece_hash()`, including model versioning; if we find a differing hash we run the segmentation algorithm again and refresh the preview pipes. Please note, all reads and writes of segmentation data must be protected via a mutex as all pipes share it.
____________________________________________________________________________________________________________________
**Note 2:** the resulting rastermask is calculated from a selection of segment maps; the maximum number of possible segments is SEGMAP_MAXSEGMENTS. For all image locations we have data in segment maps; the number of generated segment maps depends on the algorithm. AI algorithms might do a segmentation, providing multiple segment maps (here the maps for each segment can overlap). Other algorithms might provide just one map, or three, maybe one for each RGB channel.

To keep memory consumption within limits we
- keep segment map information in uint8_t maps
- possibly save & restore maps in lower resolution and do a bilinear interpolation before they get transformed via the module->distort_mask() functions to the final rastermask.
- When keeping maps in lower resolutions a model might provide a post_process function called when providing the rastermask; the default is a slight gaussian blur.
____________________________________________________________________________________________________________________
**Note 3:** the generated rastermask is combined from a list of selected segment maps. The module provides the user interface to select/deselect maps for the combined list.

Whenever the module has focus we are in UI visualizing mode, showing a false color representation on a dark grayscale image background. A pixel is
- *brightened* if it is in any segment map.
- *red* if it belongs to the segment under the mouse
- *green* if it is included in the combined raster map list.
- *yellow* if it belongs to the segment under the mouse and that segment is included in the combined list.

We can select/deselect segments from the combined list via the mouse,
- a left click *adds* the segment under the mouse to the combined segments
- a right click *removes* it
- if combined with shift a click adds/removes **all** segments to/from the combined segments

A left-mouse double-click de-focuses the module for convenience.
____________________________________________________________________________________________________________________
**Note 4:** the segment maps are kept in a dt_segmentation_t struct so we can save/read all data via files (after agreeing how/where) to keep edits after possibly changing a segmentation model.
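The validate-then-recompute step from Note 1 can be sketched as follows. This is an illustration only: `seg_cache_t`, `seg_cache_ensure` and `count_run` are hypothetical names, the real layout of the data kept in module->data is not shown in this thread, and the hash is simply assumed to be a 64-bit value that already includes the model version.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical per-instance cache shared by all pipes (a stand-in for the
   data kept in module->data; the real struct is not shown in the thread). */
typedef struct seg_cache_t
{
  pthread_mutex_t lock;
  uint64_t hash; /* hash the stored result was computed for */
  int valid;     /* 0 until the first segmentation has run */
  int runs;      /* how often the costly algorithm was executed */
} seg_cache_t;

/* Stand-in for the expensive segmentation on full image data. */
typedef void (*seg_compute_fn)(seg_cache_t *cache);

/* Returns 1 if the segmentation was (re)computed, 0 if the cached result
   could be reused. `hash` is assumed to cover pipe state and model version. */
static int seg_cache_ensure(seg_cache_t *cache, uint64_t hash, seg_compute_fn compute)
{
  assert(cache && compute);
  int recomputed = 0;

  /* every pipe goes through the same lock before touching the shared data */
  pthread_mutex_lock(&cache->lock);
  if(!cache->valid || cache->hash != hash)
  {
    compute(cache);
    cache->hash = hash;
    cache->valid = 1;
    recomputed = 1; /* the caller would now refresh the preview pipes */
  }
  pthread_mutex_unlock(&cache->lock);

  return recomputed;
}

/* Dummy segmentation used for demonstration: it only counts invocations. */
static void count_run(seg_cache_t *cache)
{
  cache->runs++;
}
```

Holding the lock during the computation keeps the sketch simple, at the price of blocking the other pipes while the segmentation runs; whether that is acceptable depends on how long a given model takes.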
Force-pushed from d44e82c to 206a71c.
I'm really happy with how the conversation turned out! It is a very good idea to produce the paths instead of storing the images. At the end, the segmentations are 512x512 or 1024x1024 depending on the model, and having the path would help to further refine the mask. As I understand, the workflow would work like the following:
This would also work with the other models presented in #18356 that are even more precise, so it would make things much more modular for future developers and AI model integrations. Maybe some kind of garbage collection needs to be implemented so the temporary raster subjects are not kept around for many pictures and the RAM goes 🔥, but they are indeed only one 512x512x1 uint8 picture for each image that wanted subject selection, so it shouldn't be a problem.
@MikoMikarro: Yes, that's basically my thinking about this. This way the AI masks will be properly integrated into darktable.
I'm unsure how well "raster to path" converted masks work for fine details. There are cases, especially combined with any machine learning based foreground/background extraction, where I'd want an extremely accurate selection for it to work well (and in such cases I'd have no issue with large XMP files). Hair is one good example where, depending on the filters used, it can be very obvious if the selection isn't perfect.
I understand, but there isn't any current implementation of the masks that would allow for hair-level detail. In any case, the mask fine-tuning you can do with feather + contrast and detail contrast should be more than enough to get that going. The other advantage of generating paths is that they can be further edited in a way most users are used to. Do you propose any way to store the accurate selections without the problem of the big XML files? However, it brings up a good point: currently, in darktable it is hard to generate very detailed masks "by hand". Usually, I can use the mask magic sliders at the bottom to get it to the point I like, but it is an interesting conversation to have.
That is true but even a raster mask import for masks with the same size as the image could be helpful for experimenting with new things or some other workflows.
I know that this is suboptimal in every possible imaginable way (see the Matrix protocol as an example of how shitty this can be...).
Yes, and I don't need (at least not right now) tooling to edit such detailed masks in darktable. A pure import/export would be a very good start (ideally there'd be an interface in the plugin system to set such a mask, but even that I'd consider "bonus" and not required for the beginning). EDIT: Oh, and I forgot to mention that I definitely see your point about the "magic sliders" usually being enough. Right now it's what I do and it works fine. Sometimes I'm too much of a perfectionist though (non-perfect masks are easy to spot when doing some rather extreme localized adjustments, so I sometimes spend waaay too much time tinkering with the masks ^^')
After a lot of experimenting and showing preliminary versions - see #18528 - be reminded about #18356 where this all started from.
See this as a generic UI frontend for such AI segmentation stuff or other tools that work on complete image data and should be computed once, a) for performance reasons, b) because the output is not stable, as with AI algos, or c) because they require some other external source.
To give something useful - and also to play around with right now - there is a variance analysing tool generating segment masks for each RGB channel, somewhat like details threshold. The next tool is "file mask", allowing external files to be used as raster masks.

@MikoMikarro @andriiryzhkov I haven't heard about your work for a while, but this should hopefully be something to build on more easily.
@TurboGit and @dterrahe - this is an example of how the new `module->data` is used, here as `dt_iop_segmap_module_data_t`.
Please note that segmentation data is simply too costly to be included as a parameter. (It can easily be many MB, massively blowing up xmp files.) I would suggest saving/restoring the data provided in `struct dt_segmentation_t` via another "sidecar" file with another extension.
From commit docs:
Note 1: we want to generate rastermasks based on full image data or from some external source that can be used in all other modules as usual.
Examples for this would be AI segmentation algorithms, external mask files or other tools based on whole image data.
As those algorithms can be very costly, we would like to run the algorithm just once, based on full image data, for all pipes.
To make this compliant with darktable's roi strategy without a significant performance drop, this module must be very early in the pipe and requires some care as we want to support all image types. The segmentation is either done while developing in the darkroom full pipe or while exporting, and the results are kept in the per-instance module->data for later use.
The results are validated via `dt_dev_pixelpipe_piece_hash()`, including model versioning; if we find a differing hash we run the segmentation algorithm again and refresh the preview pipes. Please note, all reads and writes of segmentation data must be protected via a mutex as all pipes share it.
Note 2: the resulting rastermask is calculated from a selection of segment maps; the maximum number of possible segments is SEGMAP_MAXSEGMENTS. For all image locations we have data in segment maps; the number of generated segment maps depends on the algorithm. AI algorithms might do a segmentation, providing multiple segment maps (here the maps for each segment can overlap). Other algorithms might provide just one map, or three, maybe one for each RGB channel.
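To keep memory consumption within limits we
- keep segment map information in uint8_t maps
- possibly save & restore maps in lower resolution and do a bilinear interpolation before they get transformed via the module->distort_mask() functions to the final rastermask.
- When keeping maps in lower resolutions a model might provide a post_process function called when providing the rastermask; the default is a slight gaussian blur.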
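As an illustration of the memory-saving idea in Note 2, a low-resolution uint8_t map can be expanded to a full-size float mask with bilinear interpolation. The sketch below is not the module's code: `segmap_upscale` is a hypothetical helper, aligning the corner pixels of source and destination is an assumption, and the optional post_process blur is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: upscale a low-resolution uint8_t segment map to a
   float mask in [0, 1] with bilinear interpolation. The corner pixels of
   source and destination are aligned (an assumption of this sketch). */
static void segmap_upscale(const uint8_t *src, int sw, int sh,
                           float *dst, int dw, int dh)
{
  assert(src && dst && sw > 0 && sh > 0 && dw > 0 && dh > 0);
  /* source step per destination pixel */
  const float sx = dw > 1 ? (float)(sw - 1) / (float)(dw - 1) : 0.0f;
  const float sy = dh > 1 ? (float)(sh - 1) / (float)(dh - 1) : 0.0f;

  for(int y = 0; y < dh; y++)
  {
    const float fy = y * sy;
    const int y0 = (int)fy;
    const int y1 = y0 + 1 < sh ? y0 + 1 : y0; /* clamp at the last row */
    const float wy = fy - y0;

    for(int x = 0; x < dw; x++)
    {
      const float fx = x * sx;
      const int x0 = (int)fx;
      const int x1 = x0 + 1 < sw ? x0 + 1 : x0; /* clamp at the last column */
      const float wx = fx - x0;

      const float top = (1.0f - wx) * src[(size_t)y0 * sw + x0]
                        + wx * src[(size_t)y0 * sw + x1];
      const float bot = (1.0f - wx) * src[(size_t)y1 * sw + x0]
                        + wx * src[(size_t)y1 * sw + x1];
      dst[(size_t)y * dw + x] = ((1.0f - wy) * top + wy * bot) / 255.0f;
    }
  }
}
```

Storing one byte per pixel at reduced resolution and interpolating on demand trades a little edge sharpness for a much smaller footprint, which is why a post-processing step such as a slight blur can make sense afterwards.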
Note 3: the generated rastermask is combined from a list of selected segment maps. The module provides the user interface to select/deselect maps for the combined list.
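How the selected maps are merged is not spelled out in this thread. One plausible reading is a per-pixel union, sketched here as the maximum over all selected maps. `combine_segments` and `DEMO_MAXSEGMENTS` are hypothetical names (the real limit is SEGMAP_MAXSEGMENTS, whose value is not given here), and the selection is assumed to be a bitmask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder limit so the selection fits one 32-bit mask; the real
   SEGMAP_MAXSEGMENTS value is not stated in the thread. */
#define DEMO_MAXSEGMENTS 32

/* Combine the selected segment maps into one float mask in [0, 1] by taking,
   per pixel, the maximum over all selected maps (a union; an assumption of
   this sketch). maps[i] points to npix bytes; bit i of `selected` marks
   map i as part of the combined list. */
static void combine_segments(const uint8_t *const *maps, int nmaps,
                             uint32_t selected, size_t npix, float *mask)
{
  assert(maps && mask && nmaps >= 0 && nmaps <= DEMO_MAXSEGMENTS);
  for(size_t p = 0; p < npix; p++)
  {
    uint8_t best = 0;
    for(int i = 0; i < nmaps; i++)
    {
      if(((selected >> i) & 1u) && maps[i][p] > best)
        best = maps[i][p];
    }
    mask[p] = best / 255.0f;
  }
}
```

Taking the maximum keeps overlapping segments from adding up beyond full opacity, which matters because AI-generated segment maps are allowed to overlap.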
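Whenever the module has focus we are in UI visualizing mode, showing a false color representation on a dark grayscale image background. A pixel is
- brightened if it is in any segment map.
- red if it belongs to the segment under the mouse
- green if it is included in the combined raster map list.
- yellow if it belongs to the segment under the mouse and that segment is included in the combined list.
We can select/deselect segments from the combined list via the mouse,
- a left click adds the segment under the mouse to the combined segments
- a right click removes it
- if combined with shift a click adds/removes all segments to/from the combined segments
A double-click de-focuses the module for convenience.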
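The false-color rules and the mouse handling from Note 3 boil down to a small amount of bit logic. The sketch below assumes segment membership and the combined list are stored as 32-bit masks; `classify_pixel`, `apply_click` and the `PX_*` names are hypothetical, and the priority order between the colors is a guess, not taken from the PR.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { PX_DARK, PX_BRIGHT, PX_RED, PX_GREEN, PX_YELLOW } px_color_t;

/* member:   bit i set if the pixel lies in segment map i
   selected: bit i set if segment i is in the combined list
   hovered:  index of the segment under the mouse, or -1 for none
   The priority yellow > red > green > bright is an assumption of this sketch. */
static px_color_t classify_pixel(uint32_t member, uint32_t selected, int hovered)
{
  assert(hovered >= -1 && hovered < 32);
  const uint32_t hbit = hovered >= 0 ? (uint32_t)1 << hovered : 0;

  if(member & hbit) return (selected & hbit) ? PX_YELLOW : PX_RED;
  if(member & selected) return PX_GREEN;
  return member ? PX_BRIGHT : PX_DARK;
}

/* A left click adds and a right click removes the hovered segment; with
   shift the click applies to all `nsegments` segments.
   Returns the new combined-list bitmask. */
static uint32_t apply_click(uint32_t selected, int hovered, int nsegments,
                            int left_button, int shift)
{
  assert(nsegments >= 0 && nsegments <= 32);
  assert(hovered >= -1 && hovered < 32);
  const uint32_t all = nsegments == 32 ? UINT32_MAX
                                       : (((uint32_t)1 << nsegments) - 1);
  const uint32_t one = hovered >= 0 ? (uint32_t)1 << hovered : 0;
  const uint32_t target = shift ? all : one;

  return left_button ? (selected | target) : (selected & ~target);
}
```

Keeping the selection as a bitmask makes "add all" and "remove all" a single operation and lets overlapping segments be tested per pixel without any lookups.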
Note 4: the segment maps are kept in a dt_segmentation_t struct so we can save/read all data via files (after agreeing how/where) to keep edits after possibly changing a segmentation model.
EDIT: Later commits add the tool to provide a rastermask from external image files.