DiffSeg is an unsupervised zero-shot segmentation method using attention information from a stable-diffusion model. This repo implements the main DiffSeg algorithm and additionally includes an experimental feature to add semantic labels to the masks based on a generated caption.
Why do you think that https://github.com/BR-IDL/PaddleViT is a good alternative to diffseg