Description
Choosing the right Annotator Resolution is always difficult, and users are likely to get frustrating results if it is set incorrectly.
For example, to diffuse 1024 × 1024 images, the Annotator Resolution should be 1024 rather than the default 512 (except when using depth as the Annotator).
On top of that, because multiple resize modes are available, the correct Annotator Resolution also depends on whether “Crop and Resize” or “Resize and Fill” is used, which makes it really difficult to work out:
For example, if the A1111 resolution is 640 × 512, the input control image is 512 × 768, and we use “Crop and Resize”, then ControlNet will first resize the control image to (512×640/512) × (768×640/512) = 640 × 960, and then crop it to 640 × 512.
In this case, if we want the annotator (say canny) to be pixel-perfect, we need to use the short side of 640 × 960, which is 640 (not the short side of 640 × 512, which is 512!), and then work out in our heads the nearest multiple of 64, which is 64×round(640/64) = 64×10 = 640. Luckily, it is still 640.
So the final correct Annotator Resolution is 640. What the heck. Who is able to do such a computation in their head? I also get confused from time to time.
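To make the arithmetic above concrete, here is a minimal Python sketch of the “Crop and Resize” case only. The function name and parameters are hypothetical and written just for this illustration; this is not ControlNet's actual code, and it assumes the control image is scaled uniformly until it covers the target, as in the example.

```python
def crop_and_resize_annotator_resolution(target_w, target_h, image_w, image_h, multiple=64):
    """Hypothetical helper: estimate the Annotator Resolution for "Crop and Resize".

    Assumption: the control image is scaled uniformly until it fully covers
    the target resolution, and the overflow is cropped afterwards.
    """
    # The larger of the two ratios is the scale at which the image covers the target.
    scale = max(target_w / image_w, target_h / image_h)
    resized_w = round(image_w * scale)
    resized_h = round(image_h * scale)

    # The annotator should run at the short side of the resized (not yet cropped) image.
    short_side = min(resized_w, resized_h)

    # Snap to the nearest multiple of 64, as in the hand calculation.
    snapped = multiple * round(short_side / multiple)
    return (resized_w, resized_h), snapped


# The example from the text: target 640 x 512, control image 512 x 768.
resized, resolution = crop_and_resize_annotator_resolution(640, 512, 512, 768)
print(resized, resolution)  # (640, 960) 640
```

“Resize and Fill” would need a different scale rule, which is not worked out here.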
I think we should have a solution to this, but I think it is a bad idea to force a correct value, because we also want to let users control the resolution as they want.
Perhaps a better idea is to add some hints, but I am not sure where to add them, and users may be put off by an overly crowded UI. (But if we can implement it in gradio, I can give it a try.)
Does anyone have ideas?