Advice on creating a simple SegmentAnything2 integration #18168

Open
kalmjasper opened this issue Jan 6, 2025 · 7 comments

Comments

@kalmjasper

I love using darktable to edit my photos, but my main annoyance is that I still need to hand-draw masks for portraits, and that quickly adds up to a lot of masks.

So I thought I'd see whether I could use the output of Segment Anything 2 in darktable to save me from making all these masks manually. Here is a screenshot of the UI that I whipped up:

image

r/DarkTable - Working on a simple Segment Anything 2 integration for Darktable, looking for input
My current approach is:

  1. Select some points in the image and let Segment Anything do its magic

  2. Convert the points to the format that the path mask uses (I'd change the `darktable:mask_points` attribute)

<rdf:li
darktable:mask_num="11"
darktable:mask_id="1736033909"
darktable:mask_type="2"
darktable:mask_name="path #2"
darktable:mask_version="6"
darktable:mask_points="gz03eJzL+hFttylZwj4Ljea6rmwDwowMDAwqr2PsEgX47NFpZDVHVyTaxYr+sUOnkdUAAAAHJBY="
darktable:mask_nb="3"
darktable:mask_src="0000000000000000"/>
</rdf:Seq>
  3. Reopen darktable to reload the XMP file with the newly added masks
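
For step 2, the `darktable:mask_points` value has to be unpacked before it can be edited. The sketch below assumes the layout that the value above suggests: the literal prefix `gz`, a two-digit number, then base64 text wrapping a zlib stream, with plain hex as the uncompressed alternative. That layout is an assumption and should be checked against the darktable source; the function names are made up for illustration, and the encoder here always compresses.

```python
import base64
import binascii
import zlib


def decode_xmp_blob(text: str) -> bytes:
    """Turn an XMP attribute value into raw bytes (assumed layout, see above)."""
    if text.startswith("gz"):
        # Skip "gz" plus the two-digit number, then undo base64 and zlib.
        return zlib.decompress(base64.b64decode(text[4:]))
    # Assumption: uncompressed values are stored as plain hex.
    return binascii.unhexlify(text)


def encode_xmp_blob(raw: bytes) -> str:
    """Inverse of decode_xmp_blob for the compressed form."""
    packed = zlib.compress(raw)
    # Assumption: the two digits hold a rough compression ratio, capped at 99.
    factor = min(len(raw) // len(packed) + 1, 99)
    return "gz%02d" % factor + base64.b64encode(packed).decode("ascii")
```

The decoded bytes still have to be interpreted as the path's point records, whose field order and size depend on `darktable:mask_version`.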

Some problems / thoughts that I currently have:

  • I'm currently having trouble writing the points back to the file. Reading and editing existing points (for example, translating a mask) works fine in my code, but when I replace the mask's path points with the output of the masking code, I get very weirdly shaped masks.

  • It doesn't really feel right to first extract the outline from the mask and then use that as a mask. Using the mask as defined on the pixels would be a lot nicer. Is there any way that I can get a custom image into the intermediate masks in the rendering pipeline?
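
One possible cause of oddly shaped path masks is that each path node carries two Bézier control handles in addition to its corner, so writing only outline vertices leaves the handles undefined. The sketch below is a guess at a remedy, not a confirmed fix: it scales pixel coordinates to the 0..1 range and derives handles with the standard Catmull-Rom to Bézier rule (handles at one sixth of the vector between the neighbouring vertices). The dictionary keys, the assumption that coordinates are stored normalized, and which handle is incoming or outgoing are all hypothetical and need checking against darktable's path point struct.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def normalize(points: List[Point], width: float, height: float) -> List[Point]:
    """Scale pixel coordinates to 0..1 (assumed to be what the path mask stores)."""
    return [(x / width, y / height) for x, y in points]


def smooth_handles(points: List[Point]) -> List[Dict[str, Point]]:
    """Give every vertex of a closed outline two Bezier handles.

    Uses the Catmull-Rom to cubic Bezier conversion: the handles sit at
    corner -/+ (next - prev) / 6, which yields a smooth closed curve.
    """
    n = len(points)
    nodes = []
    for i, (x, y) in enumerate(points):
        px, py = points[(i - 1) % n]  # previous vertex, wrapping around
        nx, ny = points[(i + 1) % n]  # next vertex, wrapping around
        dx, dy = (nx - px) / 6.0, (ny - py) / 6.0
        nodes.append({
            "corner": (x, y),
            "ctrl1": (x - dx, y - dy),  # hypothetical: handle toward the previous node
            "ctrl2": (x + dx, y + dy),  # hypothetical: handle toward the next node
        })
    return nodes
```

Simplifying the SAM2 outline to a few dozen vertices before this step keeps the resulting path editable by hand.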

Some input from darktable devs, who know a lot better than me how this all works, would be really appreciated! I've also been looking into whether it would make sense to turn this into a darktable plugin once it works reliably, but I've had a lot of trouble finding good resources...

I've also read through the Lua API docs but didn't find any functionality there to alter masks. Maybe there is a way to accomplish this that I missed in the docs?

Here is the code:
https://github.com/kalmjasper/segmentanything_darktable

@dterrahe
Member

dterrahe commented Jan 6, 2025

  • is there any way that I can get a custom image into the intermediate masks in the rendering pipeline?

Well, no.

Unless you like really dirty hacks...

You can use the composite module to load the bitmap output of SAM2 into the image pipeline (first import the mask as a separate image and then use that as the source for composite).

Then you can create a raster mask on the output of composite (by selecting parametric mask and enabling "show output channels" in the blending hamburger menu). With the first output four-slider selector (gray value), select only the portions of the mask image that are near black. (If you are interested in the bright areas of the mask image, you can probably use negadoctor on the imported mask to invert it.) That mask will also be used to blend the composite itself (i.e. the mask image would get blended into your original image), which you don't want, but it doesn't matter if you add more black. So you have to select the "addition" blend mode: if your mask selects only black, then adding that to your original image will have no impact.

Toggle "display mask" to see if you got the correct raster mask.
image

Now in the module where you want to use the mask, select "raster mask" and as source of your mask select "composite".

@wpferguson
Member

wpferguson commented Jan 7, 2025

What if....

  • you have an image you want to mask, so you draw a crude path mask.
  • I access the mask and get the points
  • I have an exporter that exports the image to jpg and a points file for the sam2 engine
  • the sam2 engine generates a good mask from the crude set of points and jpg and outputs the points to a file
  • I import the file and replace the crude mask points with the sam2 points

I guess this fits the definition of dirty hack :-)
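
The points file in the steps above could be as simple as a small JSON document. The sketch below is purely hypothetical: the field names and the choice of JSON are not an existing darktable or SAM2 format. It stores each point with a label of 1 for foreground or 0 for background, which is the convention SAM-style point prompts use.

```python
import json
from typing import List, Tuple

Point = Tuple[float, float]


def write_points_file(path: str, image_name: str,
                      points: List[Point], labels: List[int]) -> None:
    """Save prompt points for one exported image (hypothetical format)."""
    if len(points) != len(labels):
        raise ValueError("each point needs exactly one label")
    payload = {
        "image": image_name,
        "points": [[x, y] for x, y in points],
        "labels": list(labels),  # 1 = foreground, 0 = background
    }
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)


def read_points_file(path: str):
    """Load the image name, points and labels written by write_points_file."""
    with open(path, "r", encoding="utf-8") as handle:
        payload = json.load(handle)
    points = [(x, y) for x, y in payload["points"]]
    return payload["image"], points, payload["labels"]
```

The same structure works in the other direction, with the SAM2 side writing the refined outline points back for import.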

EDIT: What if...

  • we had a module, sam2 whose only purpose was to have/hold the mask.
  • you put it before the modules that need the mask so they can access it as a raster mask
  • you could have multiple instances at different places in the pipeline.
  • since the mask would truly be a mask, couldn't you just reuse the shape?

@MStraeten
Collaborator

  • we had a module, sam2 whose only purpose was to have/hold the mask.

Or a more generic module that hands over the content of the processed image at that state to a customizable app (e.g. Python scripts to play around with arbitrary AI applications) and receives a raster mask (maybe also a processed image) in return. Since an increasing number of AI models can be run locally, this wouldn't restrict the use cases.

Very important to avoid performance issues: that shouldn't be updated automatically on changes in the pixel pipe, only on an explicit command by the user.
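
The "only on explicit command" behaviour can be sketched independently of darktable. In the example below, `ExplicitMaskCache` and its methods are made-up names, and the byte-stream exchange over stdin/stdout is an assumption for illustration only. Pipeline changes merely record the latest image data, and the external program runs only when `refresh()` is called.

```python
import subprocess
from typing import List, Optional


class ExplicitMaskCache:
    """Hold a mask from an external program, refreshed only on request."""

    def __init__(self, command: List[str]):
        self.command = command              # external app, e.g. a Python script
        self.pending: Optional[bytes] = None
        self.mask: Optional[bytes] = None
        self.runs = 0

    def on_pipeline_change(self, image_bytes: bytes) -> None:
        # Cheap: remember the newest input but do not start the program.
        self.pending = image_bytes

    def refresh(self) -> bytes:
        # Expensive: called only when the user explicitly asks for an update.
        if self.pending is None:
            raise RuntimeError("no image data received yet")
        result = subprocess.run(self.command, input=self.pending,
                                capture_output=True, check=True)
        self.mask = result.stdout
        self.runs += 1
        return self.mask
```

Until `refresh()` is called again, consumers keep reading the cached `mask`, so edits earlier in the pipeline never trigger the slow model by themselves.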

@kalmjasper
Author

kalmjasper commented Jan 7, 2025

@dterrahe Honestly not a completely terrible idea haha. I can't see a way in the Lua docs to script modules though... Will try to get it going over the weekend.

@wpferguson @MStraeten Having something like a module that can handle this type of input / output and is nicely scriptable would be ideal. I couldn't really find a way to make a custom darktable plugin that works on this level, would be happy to give it a shot otherwise. How would you approach this?

I also posted the same question on the darktable reddit and people generally really like the idea. I think there'd be quite some interest if there is a not too hacky way to integrate this.

@wpferguson
Member

The boilerplate code for a module is src/iop/useless.c. Since all we are interested in is a mask, it might not need many changes.

Years ago AP created a module to call Krita from darkroom, pass an image, make changes in Krita, and return the changed data. It would be nice if someone had that code lying around.

@kalmjasper
Author

Yeah that'd be amazing... How do people generally go about custom modules? Is it easy to have a github repo up where people can use the custom module? I haven't been able to find examples of this so far...

@MStraeten
Collaborator

You can fork the darktable repository and then play around. If it's good enough for a field test, you can open a pull request.
