Shipeng Wang (王世鹏)

Assistant Professor
School of Life Science and Technology
Xi'an Jiaotong University

Research:

My research aims to develop advanced medical image reconstruction techniques by leveraging large language models, large vision models, and multimodal foundation models. The goal is to achieve more accurate and clinically relevant medical image reconstruction.

About Me:

I am an Assistant Professor at Xi’an Jiaotong University. My graduate advisor was Academician Prof. Zongben Xu, with joint guidance from Prof. Jian Sun, and my postdoctoral advisor was Professor Jianhua Ma.

During my doctoral studies, my research focused on deep learning paradigms in open-world scenarios, with notable achievements in continual learning and automated machine learning. In recent years, I have led a sub-project of the National Key R&D Program and a General Postdoctoral Project. I have published seven papers in top-tier international journals and conferences in artificial intelligence and computer vision, including IEEE TPAMI, CVPR, and AAAI. I also had the honor of delivering an oral presentation at CVPR, one of the leading conferences in computer vision. My work was recognized as the Outstanding Paper at the 19th Young Scientists Conference of the China Society of Image and Graphics, and I received second place in the “Continual Learning for Sequential Tasks” track at the 2nd Guangdong-Hong Kong-Macao Greater Bay Area (Huangpu) International Algorithm Case Competition.

I also serve as a reviewer for leading journals and conferences, including Nature Communications, IEEE TPAMI, IEEE TNNLS, IEEE TIP, IEEE TMI, CVPR, ICCV, and others.

Open Positions:

The research group is well-funded. Prospective students interested in pursuing a Master’s, Ph.D., or Postdoctoral position are welcome to contact me by email with their CV (wangshipeng@xjtu.edu.cn).

Selected Publications [full list]

(*) denotes equal contribution

TPAMI

Training Networks in Null Space of Feature Covariance With Self-Supervision for Incremental Learning

Shipeng Wang, Xiaorong Li, Jian Sun, and Zongben Xu

IEEE Transactions on Pattern Analysis and Machine Intelligence

In the context of incremental learning, a network is sequentially trained on a stream of tasks, where data from previous tasks are particularly assumed to be inaccessible. The major challenge is how to overcome the stability-plasticity dilemma, i.e., learning knowledge from new tasks without forgetting the knowledge of previous tasks. To this end, we propose two mathematical conditions for guaranteeing network stability and plasticity with theoretical analysis. The conditions demonstrate that we can restrict the parameter update in the null space of uncentered feature covariance at each linear layer to overcome the stability-plasticity dilemma, which can be realized by layerwise projecting gradient into the null space. Inspired by it, we develop two algorithms, dubbed Adam-NSCL and Adam-SFCL respectively, for incremental learning. Adam-NSCL and Adam-SFCL provide different ways to compute the projection matrix. The projection matrix in Adam-NSCL is constructed by singular vectors associated with the smallest singular values of the uncentered feature covariance matrix, while the projection matrix in Adam-SFCL is constructed by all singular vectors associated with adaptive scaling factors. Additionally, we explore adopting self-supervised techniques, including self-supervised label augmentation and a newly proposed contrastive loss, to improve the performance of incremental learning. These self-supervised techniques are orthogonal to Adam-NSCL and Adam-SFCL and can be incorporated with them seamlessly, leading to Adam-NSCL-SSL and Adam-SFCL-SSL respectively. The proposed algorithms are applied to task-incremental and class-incremental learning on various benchmark datasets with multiple backbones, and the results show that they outperform the compared incremental learning methods.
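
The null-space projection described in this abstract can be illustrated with a minimal NumPy sketch. It is not the released Adam-NSCL code: the function names, the fixed number `k` of retained singular vectors, and the convention that a layer computes `W @ x` are all assumptions made for illustration.

```python
import numpy as np

def null_space_projector(features, k):
    """Build a projector onto an approximate null space of the
    uncentered feature covariance.

    features: (n_samples, d) inputs to one linear layer, collected on
              previous tasks.
    k:        number of singular vectors to keep (hypothetical knob; the
              paper's own selection rule is not reproduced here).
    """
    # Uncentered covariance: no mean subtraction.
    cov = features.T @ features / features.shape[0]
    # cov is symmetric positive semi-definite, so the columns of u are
    # orthonormal and the singular values come sorted in descending order.
    u, _, _ = np.linalg.svd(cov)
    u_small = u[:, -k:]            # vectors with the k smallest singular values
    return u_small @ u_small.T     # (d, d) projection matrix

def project_update(update, projector):
    """Project a candidate weight update of shape (out_dim, d) so that it
    barely changes the layer output on previous-task features."""
    return update @ projector
```

If the previous-task features span only part of the input space, the projected update leaves the layer's outputs on those features (almost) unchanged, which is the stability condition the abstract refers to.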

TPAMI

Variational HyperAdam: A Meta-Learning Approach to Network Training

Shipeng Wang, Yan Yang, Jian Sun, and Zongben Xu

IEEE Transactions on Pattern Analysis and Machine Intelligence

Stochastic optimization algorithms have been popular for training deep neural networks. Recently, learning-based optimizers have emerged as a new approach and have achieved promising performance for training neural networks. However, these black-box learning-based optimizers do not fully take advantage of the experience in human-designed optimizers and rely heavily on learning from meta-training tasks, and therefore have limited generalization ability. In this paper, we propose a novel optimizer, dubbed Variational HyperAdam, which is based on a parametric generalized Adam algorithm, i.e., HyperAdam, in a variational framework. With Variational HyperAdam as the optimizer for training a neural network, the parameter update vector of the network at each training step is treated as a random variable, whose approximate posterior distribution given the training data and the current network parameter vector is predicted by Variational HyperAdam. The parameter update vector for network training is sampled from this approximate posterior distribution. Specifically, in Variational HyperAdam, we design a learnable generalized Adam algorithm for estimating the expectation, paired with a VarBlock for estimating the variance, of the approximate posterior distribution of the parameter update vector. Variational HyperAdam is learned in a meta-learning approach with a meta-training loss derived by variational inference. Experiments verify that the learned Variational HyperAdam achieves state-of-the-art network training performance for various types of networks on different datasets, such as multilayer perceptron, CNN, LSTM and ResNet.
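
The sampling step in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the posterior is taken to be a diagonal Gaussian, and `mean` and `log_var` are hypothetical stand-ins for the outputs of the learned generalized Adam and the VarBlock, neither of which is implemented here.

```python
import numpy as np

def sample_update(mean, log_var, rng):
    """Draw one parameter update from a diagonal Gaussian
    N(mean, exp(log_var)) via the reparameterization mean + std * noise.

    mean, log_var: arrays of the same shape (assumed to be supplied by
                   learned modules that this sketch does not model).
    rng:           a numpy.random.Generator.
    """
    std = np.exp(0.5 * log_var)
    noise = rng.standard_normal(mean.shape)
    return mean + std * noise
```

When the predicted variance is close to zero, the sampled update collapses to the mean, so the sketch behaves like a deterministic optimizer in that limit.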

CVPR Oral

Training Networks in Null Space of Feature Covariance for Continual Learning

Shipeng Wang, Xiaorong Li, Jian Sun, and Zongben Xu

In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Oral Presentation [Top 4%]

In the setting of continual learning, a network is trained on a sequence of tasks, and suffers from catastrophic forgetting. To balance plasticity and stability of the network in continual learning, in this paper, we propose a novel network training algorithm called Adam-NSCL, which sequentially optimizes network parameters in the null space of previous tasks. We first propose two mathematical conditions respectively for achieving network stability and plasticity in continual learning. Based on them, the network training for sequential tasks can be simply achieved by projecting the candidate parameter update into the approximate null space of all previous tasks in the network training process, where the candidate parameter update can be generated by Adam. The approximate null space can be derived by applying singular value decomposition to the uncentered covariance matrix of all input features of previous tasks for each linear layer. For efficiency, the uncentered covariance matrix can be incrementally computed after learning each task. We also empirically verify the rationality of the approximate null space at each linear layer. We apply our approach to training networks for continual learning on benchmark datasets of CIFAR-100 and TinyImageNet, and the results suggest that the proposed approach outperforms or matches the state-of-the-art continual learning approaches.
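
The incremental computation of the uncentered covariance mentioned in this abstract can be sketched as a count-weighted running average. The class name and the exact weighting are assumptions for illustration, not the released implementation.

```python
import numpy as np

class RunningUncenteredCovariance:
    """Keep the uncentered covariance of all features seen so far
    without storing the features of earlier tasks."""

    def __init__(self, dim):
        self.count = 0                      # number of samples seen so far
        self.cov = np.zeros((dim, dim))     # running uncentered covariance

    def update(self, features):
        """Fold in the features of a newly learned task.

        features: (n_samples, dim) inputs to one linear layer.
        """
        n_new = features.shape[0]
        new_cov = features.T @ features / n_new
        total = self.count + n_new
        # Weight the old and new covariances by their sample counts.
        self.cov = (self.count * self.cov + n_new * new_cov) / total
        self.count = total
        return self.cov
```

Updating task by task in this way gives the same matrix as computing the covariance once over all features, so only a `dim x dim` matrix and a counter need to be kept per layer.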

AAAI Poster Spotlight

HyperAdam: A Learnable Task-Adaptive Adam for Network Training

Shipeng Wang, Jian Sun, and Zongben Xu

In Proceedings of the AAAI Conference on Artificial Intelligence
Poster Spotlight

Deep neural networks are traditionally trained using human-designed stochastic optimization algorithms, such as SGD and Adam. Recently, the approach of learning to optimize network parameters has emerged as a promising research topic. However, these learned black-box optimizers sometimes do not fully utilize the experience in human-designed optimizers, and therefore have limited generalization ability. In this paper, a new optimizer, dubbed HyperAdam, is proposed that combines the idea of “learning to optimize” and the traditional Adam optimizer. Given a network for training, its parameter update in each iteration generated by HyperAdam is an adaptive combination of multiple updates generated by Adam with varying decay rates. The combination weights and decay rates in HyperAdam are adaptively learned depending on the task. HyperAdam is modeled as a recurrent neural network with AdamCell, WeightCell and StateCell. It is shown to be state-of-the-art for training various networks, such as multilayer perceptron, CNN and LSTM.
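
The idea of combining several Adam updates with different decay rates can be sketched in NumPy as below. This is a simplified illustration, not HyperAdam itself: the decay rates `betas` and the scalar `weights` are fixed inputs here, whereas the abstract describes them as learned per task by a recurrent network, and the helper names are hypothetical.

```python
import numpy as np

def init_state(shape, n_candidates):
    """Create one pair of moment estimates per candidate Adam update."""
    return {
        "t": 0,
        "m": [np.zeros(shape) for _ in range(n_candidates)],
        "v": [np.zeros(shape) for _ in range(n_candidates)],
    }

def combined_adam_step(grad, state, betas, weights, lr=1e-3, eps=1e-8):
    """Return a parameter step that is a weighted sum of Adam directions,
    one direction per (beta1, beta2) pair in `betas`."""
    state["t"] += 1
    t = state["t"]
    combined = np.zeros_like(grad, dtype=float)
    for j, (beta1, beta2) in enumerate(betas):
        # Standard Adam moment updates with this candidate's decay rates.
        state["m"][j] = beta1 * state["m"][j] + (1 - beta1) * grad
        state["v"][j] = beta2 * state["v"][j] + (1 - beta2) * grad ** 2
        # Bias-corrected moment estimates.
        m_hat = state["m"][j] / (1 - beta1 ** t)
        v_hat = state["v"][j] / (1 - beta2 ** t)
        combined += weights[j] * m_hat / (np.sqrt(v_hat) + eps)
    return -lr * combined
```

A call such as `combined_adam_step(grad, state, [(0.9, 0.999), (0.5, 0.9)], [0.5, 0.5])` returns the step to add to the parameters; with a single candidate and weight 1 it reduces to plain Adam.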