Multi-Scale Multi-Task Distillation for Incremental 3D Medical Image Segmentation

Authors: Mu Tian (Shenzhen University); Qinzhu Yang (Shenzhen University); Yi Gao (Shenzhen University)

Abstract: Automatic medical image segmentation is a core component of many clinical applications. A substantial number of deep-learning-based methods have been proposed in recent years, but deploying them in practice faces difficulties such as acquiring massive annotated training data and the high latency of model iteration. In contrast to the conventional cycle of "data collection, offline training, model update", a system that continuously generates robust predictions will be critical. Incremental learning has recently been widely investigated for classification and semantic segmentation on 2D natural images, and existing work has shown the effectiveness of data rehearsal and knowledge distillation in counteracting catastrophic forgetting. Inspired by these advances, we propose a multi-scale multi-task distillation framework for incremental learning with 3D medical images. Unlike the task-incremental scenario in the literature, our proposed strategy focuses on improving robustness against implicit data distribution shift. We introduce knowledge distillation as a multi-task regularizer to resolve prediction confusion. At each step, the network is trained towards both the new ground truth and the uncertainty-weighted predictions of the previous model. Simultaneously, image features at multiple scales in the segmentation network participate in a contrastive learning scheme that aims at more discriminative representations which effectively inherit past knowledge. Experiments showed that our method improved overall continual learning robustness under the extremely challenging scenario of "seeing each image once in a batch of one" without any pre-training. In addition, the proposed method can work on top of any network architecture and any existing incremental learning strategy. We also showed further improvements by combining our method with data rehearsal using a small buffer.
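To make the idea of learning towards "both the new ground truth and the uncertainty weighted predictions from the previous model" more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the loss, so the entropy-based confidence weight, the KL distillation term, the `alpha` balance factor, and all function names here are assumptions chosen for illustration. Voxels are represented as a flat list of per-class logits.

```python
import math

EPS = 1e-12  # numerical guard for logarithms


def softmax(logits):
    """Numerically stable softmax over one voxel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy of a probability vector."""
    return -sum(p * math.log(p + EPS) for p in probs)


def incremental_loss(new_logits, old_logits, labels, alpha=0.5):
    """Hypothetical combined loss for one incremental step.

    new_logits / old_logits: per-voxel class logits from the current and
    the previous (frozen) model. labels: ground-truth class index per voxel.
    alpha: assumed weight balancing distillation against supervision.
    """
    num_classes = len(new_logits[0])
    max_entropy = math.log(num_classes)
    ce_total, kd_total = 0.0, 0.0
    for z_new, z_old, y in zip(new_logits, old_logits, labels):
        p_new, p_old = softmax(z_new), softmax(z_old)
        # Supervised term: cross-entropy against the new ground truth.
        ce_total += -math.log(p_new[y] + EPS)
        # Assumed uncertainty weight: near 1 when the previous model is
        # confident, near 0 when its prediction is close to uniform.
        w = max(0.0, 1.0 - entropy(p_old) / max_entropy)
        # Distillation term: KL(p_old || p_new), scaled by the weight.
        kl = sum(po * (math.log(po + EPS) - math.log(pn + EPS))
                 for po, pn in zip(p_old, p_new))
        kd_total += w * kl
    n = len(labels)
    return ce_total / n + alpha * kd_total / n
```

In this sketch, voxels where the previous model is uncertain contribute almost nothing to the distillation term, so the new ground truth dominates there, while confident past predictions pull the current model towards the earlier behaviour. The multi-scale contrastive component mentioned in the abstract is not covered here.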