Posts by Collection

Deep Collocative Learning for Immunofixation Electrophoresis Image Analysis

Published in IEEE Transactions on Medical Imaging, 2021

We propose collocative learning, in which a collocative tensor is constructed to transform binary relations into unary relations that are compatible with conventional deep networks. For accurate localization, we also propose a location-label-free method that uses Grad-CAM saliency maps for evidence backtracking. In addition, we propose Coached Attention Gates, which regulate inference so that it is more consistent with human logic and thus supports the evidence backtracking.

Recommended citation: Xiao-Yong Wei, Zhen-Qun Yang, Xu-Lu Zhang, et al., (2021). "Deep Collocative Learning for Immunofixation Electrophoresis Image Analysis." IEEE Transactions on Medical Imaging.
Download Paper
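
As a rough illustration of how a pairwise (binary) relation can be turned into a property of a single object, the sketch below pairs every position of one lane profile with every position of another. This is a toy example and not the paper's construction: the 1-D intensity profiles, the product-based pairing, and the function names `collocative_matrix` and `diagonal_peak` are all assumptions made for illustration.

```python
# Toy sketch (assumed, not the published method): two lanes are 1-D intensity
# profiles, and "co-location" means both lanes have a band at the same position.

def collocative_matrix(lane_a, lane_b):
    """Pair every position of lane_a with every position of lane_b."""
    return [[a * b for b in lane_b] for a in lane_a]

def diagonal_peak(matrix):
    """Return (index, value) of the strongest entry on the main diagonal."""
    size = min(len(matrix), len(matrix[0]))
    diag = [matrix[i][i] for i in range(size)]
    idx = max(range(size), key=diag.__getitem__)
    return idx, diag[idx]

# Hypothetical profiles: lanes a and b share a band at position 2, lane c does not.
lane_a = [0.0, 0.1, 0.9, 0.1, 0.0]
lane_b = [0.0, 0.2, 0.8, 0.1, 0.0]
lane_c = [0.7, 0.1, 0.0, 0.0, 0.1]

print(diagonal_peak(collocative_matrix(lane_a, lane_b)))
print(diagonal_peak(collocative_matrix(lane_a, lane_c)))
```

In this toy setting, a shared band appears as a strong diagonal entry of one matrix, so a model that takes a single input can respond to a relation between two lanes.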

Identifying the kind behind SMILES—anatomical therapeutic chemical classification using structure-only representations

Published in Briefings in Bioinformatics, 2022

Anatomical Therapeutic Chemical (ATC) classification of compounds and drugs plays an important role in drug development and basic research. However, previous methods depend on interactions extracted from the STITCH dataset, which may make them dependent on lab experiments. We present a pilot study that explores whether ATC prediction can be conducted solely from molecular structures. The motivation is to eliminate the reliance on costly lab experiments, so that the characteristics of a drug can be pre-assessed before actual development, supporting better decisions and saving effort. To this end, we construct a new benchmark of 4545 compounds, which is larger than the one used in previous work. We also propose a lightweight prediction model. The model offers better explainability in the sense that it consists of a straightforward tokenization, which extracts and embeds statistically and physicochemically meaningful tokens, and a deep network backed by a set of pyramid kernels that capture multi-resolution chemical structural characteristics. Its efficacy has been validated in experiments, where it outperforms the state-of-the-art methods by 15.53% in accuracy and by 69.66% in efficiency. We make the benchmark dataset, source code, and web server openly available to ease reproduction of this study.

Recommended citation: Yi Cao, Zhen-Qun Yang, Xu-Lu Zhang, et al., (2022). "Identifying the kind behind SMILES—anatomical therapeutic chemical classification using structure-only representations." Briefings in Bioinformatics.
Download Paper
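
To make the idea of structure-only input concrete, the sketch below splits a SMILES string into tokens and counts token windows of several widths. It is a minimal illustration under stated assumptions, not the paper's tokenizer or network: the regular expression `SMILES_TOKEN` and the helpers `tokenize` and `ngram_counts` are hypothetical, and the windows of width 1 to 3 only stand in loosely for the multi-resolution view that the pyramid kernels provide.

```python
import re
from collections import Counter

# Assumed token pattern: bracket atoms, two-letter halogens, two-digit
# ring-closure labels such as %12, then any remaining single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")

def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)

def ngram_counts(tokens, sizes=(1, 2, 3)):
    """Count contiguous token windows at several widths."""
    counts = Counter()
    for n in sizes:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

print(tokenize("[Na+].[Cl-]"))   # ['[Na+]', '.', '[Cl-]']
print(tokenize("ClCCBr"))        # ['Cl', 'C', 'C', 'Br']
print(ngram_counts(tokenize("ClCCBr"), sizes=(1, 2)).most_common(1))
```

Counting windows at several widths is the simplest way to expose both single atoms and short substructures; a learned model would replace the counts with embeddings and convolutional kernels.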

Compositional inversion for stable diffusion models

Published in AAAI, 2024

Inversion methods, such as Textual Inversion, generate personalized images by incorporating concepts of interest provided by user images. However, existing methods often suffer from overfitting, where the dominant presence of inverted concepts leads to the absence of other desired concepts. This stems from the fact that, during inversion, irrelevant semantics in the user images are also encoded, forcing the inverted concepts to occupy locations far from the core distribution in the embedding space. To address this issue, we propose a method that guides the inversion process towards the core distribution for compositional embeddings. Additionally, we introduce a spatial regularization approach to balance the attention on the concepts being composed. Our method is designed as a post-training approach and can be seamlessly integrated with other inversion methods. Experimental results demonstrate that the proposed approach mitigates the overfitting problem and generates more diverse and balanced compositions of concepts in the synthesized images. The source code is available at https://github.com/zhangxulu1996/Compositional-Inversion.

Recommended citation: Xulu Zhang et al., (2024). "Compositional inversion for stable diffusion models." AAAI 2024.
Download Paper
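
One simple way to picture "guiding the inversion towards the core distribution" is a penalty that pulls a learned embedding towards the centroid of a set of reference embeddings. The sketch below shows one gradient step on such a penalized objective. It is an assumed stand-in and not the loss used in the paper: the squared-distance penalty, the centroid anchor, and the names `centroid`, `sq_dist`, and `regularized_step` are all hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def regularized_step(embedding, task_grad, anchor, lr=0.1, weight=0.5):
    """One gradient-descent step on task_loss + weight * ||e - anchor||^2.

    task_grad is the gradient of the (unspecified) task loss w.r.t. the
    embedding; the penalty contributes 2 * weight * (e - anchor).
    """
    return [
        e - lr * (g + 2.0 * weight * (e - a))
        for e, g, a in zip(embedding, task_grad, anchor)
    ]

# Hypothetical 2-D example: reference embeddings cluster around the origin,
# while the inverted embedding has drifted far away from them.
references = [[1.0, -1.0], [-1.0, 1.0]]
anchor = centroid(references)
embedding = [4.0, 4.0]
updated = regularized_step(embedding, task_grad=[0.0, 0.0], anchor=anchor)
print(sq_dist(embedding, anchor), sq_dist(updated, anchor))
```

With a zero task gradient, each step shrinks the distance to the anchor; in practice the `weight` parameter would trade off fidelity to the user images against staying near the reference embeddings.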

Generative active learning for image synthesis personalization

Published in ACM MM, 2024

This paper presents a pilot study that explores the application of active learning, traditionally studied in the context of discriminative models, to generative models. We specifically focus on image synthesis personalization tasks. The primary challenge in conducting active learning on generative models lies in the open-ended nature of querying, which differs from the closed form of querying in discriminative models that typically target a single concept. We introduce the concept of anchor directions to transform the querying process into a semi-open problem. We propose a direction-based uncertainty sampling strategy to enable generative active learning and tackle the exploitation-exploration dilemma. Extensive experiments are conducted to validate the effectiveness of our approach, demonstrating that an open-source model can achieve superior performance compared to closed-source models developed by large companies, such as Google’s StyleDrop. The source code is available at https://github.com/zhangxulu1996/GAL4Personalization.

Recommended citation: Xulu Zhang, et al., (2024). "Generative active learning for image synthesis personalization." ACM MM 2024.
Download Paper
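
For readers unfamiliar with uncertainty sampling, the sketch below shows the classical entropy-based version from discriminative active learning: among several candidates, query the one whose predicted distribution is least certain. This is the generic technique, not the direction-based strategy proposed in the paper, and the candidate names and probabilities are made up for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def most_uncertain(candidates):
    """Return the key whose distribution has the highest entropy.

    candidates maps a candidate name to its predicted distribution.
    """
    return max(candidates, key=lambda name: entropy(candidates[name]))

# Hypothetical candidates with assumed prediction scores.
candidates = {
    "sample_a": [0.9, 0.1],
    "sample_b": [0.5, 0.5],
    "sample_c": [1.0, 0.0],
}
print(most_uncertain(candidates))  # sample_b
```

Querying the highest-entropy candidate favors exploration; a practical strategy usually mixes this with exploitation of candidates that are already known to be good.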

A Survey on Personalized Content Synthesis with Diffusion Models

Published in arXiv, 2024

Recommended citation: Xulu Zhang, et al., (2024). "A Survey on Personalized Content Synthesis with Diffusion Models." arXiv.
Download Paper