Minghui Hu

I am currently a Ph.D. candidate at Nanyang Technological University (NTU) in Singapore, advised by Prof. P. N. Suganthan. Previously, I received my M.Sc. degree in Electrical and Electronic Engineering from NTU in 2019. I am now serving as a research scientist at Temasek Lab @ NTU, under the supervision of Dr. Sirajudeen s/o Gulam Razul.

I am also fortunate to collaborate closely with Prof. T. J. Cham at SCSE, NTU; Dr. Chuanxia Zheng at VGG, University of Oxford; and Dr. Chaoyue Wang at the University of Sydney.

Email  /  CV  /  Google Scholar  /  Github  /  LinkedIn

Research

My research focuses on generative models, multi-modality learning, and their applications across domains, particularly 2D image generation. Prior to this, I worked on randomized neural networks, a family of network models with a simple topology.

Cocktail🍸: Mixing Multi-Modality Controls for Text-Conditional Image Generation
Minghui Hu, Jianbin Zheng, Daqing Liu, Chuanxia Zheng, Chaoyue Wang, Dacheng Tao, Tat-Jen Cham
arXiv, 2023
project page / arXiv / PDF / code

We develop a generalized HyperNetwork for multi-modality control in text-to-image generation.

MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis
Jianbin Zheng, Daqing Liu, Chaoyue Wang, Minghui Hu, Zuopeng Yang, Changxing Ding, Dacheng Tao
arXiv, 2023
project page / arXiv / PDF

We introduce a Mixture-of-Modality-Tokens Transformer (MMoT) that adaptively fuses fine-grained multimodal control signals for multi-modality image generation.

Versatile LiDAR-Inertial Odometry with SE(2) Constraints for Ground Vehicles
Jiayang Chen, Han Wang, Minghui Hu, P. N. Suganthan
RA-L  
IEEE

We propose a hybrid LiDAR-inertial SLAM framework that leverages both the on-board perception system and prior information such as motion dynamics to improve localization performance.

Class-Incremental Learning on Multivariate Time Series Via Shape-Aligned Temporal Distillation
Zhongzheng Qiao, Minghui Hu, Xudong Jiang, P. N. Suganthan, Ramasamy Savitha
ICASSP, 2023  
IEEE

We propose to exploit Soft-Dynamic Time Warping (Soft-DTW) for knowledge distillation, which aligns the feature maps along the temporal dimension before calculating the discrepancy.

Unified Discrete Diffusion for Simultaneous Vision-Language Generation
Minghui Hu, Chuanxia Zheng, Tat-Jen Cham, Chaoyue Wang, Zuopeng Yang, Dacheng Tao, P. N. Suganthan
ICLR, 2023  
project page / arXiv / PDF

We construct a unified discrete diffusion model for simultaneous vision-language generation.

Global Context with Discrete Diffusion in Vector Quantised Modelling for Image Generation
Minghui Hu, Yujie Wang, Tat-Jen Cham, Jianfei Yang, P. N. Suganthan
CVPR, 2022  
arXiv / PDF

Instead of autoregressive Transformers, we use a discrete diffusion model to obtain better global context for image generation.

Academic Services

Conference Reviewer

CVPR    2022, 2023
ICCV    2023
NeurIPS    2023
ICLR    2023
ICASSP    2023
IJCNN    2020, 2021, 2022, 2023

Journal Reviewer

PR, ASOC, EAAI, TNNLS, NeuNet, Neucom


Yep it's another Jon Barron website.
Last updated Feb. 2023.