We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:


Current browse context:


Change to browse by:

References & Citations


(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Electrical Engineering and Systems Science > Image and Video Processing

Title: Mixed-supervised segmentation: Confidence maximization helps knowledge distillation

Abstract: Despite achieving promising results in a breadth of medical image segmentation tasks, deep neural networks require large training datasets with pixel-wise annotations. Obtaining these curated datasets is a cumbersome process which limits the application in scenarios where annotated images are scarce. Mixed supervision is an appealing alternative for mitigating this obstacle, where only a small fraction of the data contains complete pixel-wise annotations and other images have a weaker form of supervision. In this work, we propose a dual-branch architecture, where the upper branch (teacher) receives strong annotations, while the bottom one (student) is driven by limited supervision and guided by the upper branch. Combined with a standard cross-entropy loss over the labeled pixels, our novel formulation integrates two important terms: (i) a Shannon entropy loss defined over the less-supervised images, which encourages confident student predictions in the bottom branch; and (ii) a Kullback-Leibler (KL) divergence term, which transfers the knowledge of the strongly supervised branch to the less-supervised branch and guides the entropy (student-confidence) term to avoid trivial solutions. We show that the synergy between the entropy and KL divergence yields substantial improvements in performance. We also discuss an interesting link between Shannon-entropy minimization and standard pseudo-mask generation, and argue that the former should be preferred over the latter for leveraging information from unlabeled pixels. Quantitative and qualitative results on two publicly available datasets demonstrate that our method significantly outperforms other strategies for semantic segmentation within a mixed-supervision framework, as well as recent semi-supervised approaches. Moreover, we show that the branch trained with reduced supervision and guided by the top branch largely outperforms the latter.
Comments: This article is a journal extension of our paper in IPMI 2021 arXiv:2012.08051 . Currently under review at Medical Image Analysis. Code available at this https URL
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as: arXiv:2109.10902 [eess.IV]
  (or arXiv:2109.10902v3 [eess.IV] for this version)

Submission history

From: Bingyuan Liu [view email]
[v1] Tue, 21 Sep 2021 20:06:13 GMT (2148kb,D)
[v2] Fri, 24 Sep 2021 01:05:52 GMT (2126kb,D)
[v3] Fri, 15 Oct 2021 05:02:33 GMT (2126kb,D)

Link back to: arXiv, form interface, contact.