ISIPLab

This is the Intelligent Signal and Image Processing Lab (ISIPLab) at the Computer Science and Technology Institute of Anhui University. Our research focuses on developing state-of-the-art technologies and exploring theories for modern signal and image processing. Specifically, we have proposed a series of fast real-valued discrete Gabor transform theories and algorithms for efficient signal transformation, with the aim of obtaining a spectrum with high resolution in both the time and frequency domains. These technologies have been widely used in speech processing, time-frequency signal analysis, image compression, and related areas.

In addition, we are interested in speech and image processing for next-generation human-computer interfaces and in improving human health through intelligent signal processing. For example, we have proposed efficient techniques that improve communication for patients with laryngocarcinoma by converting their whisper-like speech into normal speech. To improve communication in very noisy environments, we provide the receiver with normal voiced speech reconstructed from a bone-conducted microphone signal, an approach suited to noisy factories and military settings. We also pay close attention to vein and palmprint recognition technology, which has been successfully applied in industry.

Currently, we also devote part of our attention to physiological signal processing, such as EEG for emotion conversion or recognition and brain PET signals for Alzheimer’s disease prediction.

news

Sep 21, 2023	Our paper “Controllable Multi-Speaker Emotional Speech Synthesis With Generalization Module” received a “Major Revision” decision from IEEE Transactions on Affective Computing.

latest posts

Dec 2, 2023	Demo for "Controllable Multi-Speaker Emotional Speech Synthesis With Generalization Module"
Jul 12, 2023	a post with bibliography
Jul 4, 2023	a post with jupyter notebook

selected publications

A Novel Attention-Guided Generative Adversarial Network for Whisper-to-Normal Speech Conversion

Teng Gao, Qing Pan, Jian Zhou, and 3 more authors

Cognitive Computation, Jan 2023

@article{20230115,
  author = {Gao, Teng and Pan, Qing and Zhou, Jian and Wang, Huabin and Tao, Liang and Kwan, Hon},
  year = {2023},
  month = jan,
  title = {A Novel Attention-Guided Generative Adversarial Network for Whisper-to-Normal Speech Conversion},
  volume = {15},
  journal = {Cognitive Computation},
  doi = {10.1007/s12559-023-10108-9},
}

SETransformer: Speech Enhancement Transformer

Weiwei Yu, Jian Zhou, HuaBin Wang, and 1 more author

Cognitive Computation, May 2022

@article{20220514001,
  author = {Yu, Weiwei and Zhou, Jian and Wang, HuaBin and Tao, Liang},
  year = {2022},
  month = may,
  title = {SETransformer: Speech Enhancement Transformer},
  volume = {14},
  journal = {Cognitive Computation},
  doi = {10.1007/s12559-020-09817-2},
}

Multistage Model for Robust Face Alignment Using Deep Neural Networks

Huabin Wang, Rui Cheng, Jian Zhou, and 2 more authors

Cognitive Computation, May 2022

@article{20220514,
  author = {Wang, Huabin and Cheng, Rui and Zhou, Jian and Tao, Liang and Kwan, Hon},
  year = {2022},
  month = may,
  title = {Multistage Model for Robust Face Alignment Using Deep Neural Networks},
  volume = {14},
  journal = {Cognitive Computation},
  doi = {10.1007/s12559-021-09846-5},
}