A graph-transformer for whole slide image classification

Zheng, Yi; Gindra, Rushin H.; Green, Emily J.; Burks, Eric J.; Betke, Margrit; Beane, Jennifer E.; Kolachalama, Vijaya B.

(Submitted on 19 May 2022)

Abstract: Deep learning is a powerful tool for whole slide image (WSI) analysis. Typically, when performing supervised deep learning, a WSI is divided into small patches, a model is trained on those patches, and the patch-level outcomes are aggregated to estimate disease grade. However, patch-based methods introduce label noise during training by assuming that each patch is independent and carries the same label as the WSI, and they neglect overall WSI-level information that is significant in disease grading. Here we present a Graph-Transformer (GT) that fuses a graph-based representation of a WSI and a vision transformer for processing pathology images, called GTP, to predict disease grade. We selected 4,818 WSIs from the Clinical Proteomic Tumor Analysis Consortium (CPTAC), the National Lung Screening Trial (NLST), and The Cancer Genome Atlas (TCGA), and used GTP to distinguish adenocarcinoma (LUAD) and squamous cell carcinoma (LSCC) from adjacent non-cancerous tissue (normal). First, using NLST data, we developed a contrastive learning framework to generate a feature extractor. This allowed us to compute feature vectors of individual WSI patches, which were used to represent the nodes of the graph, followed by construction of the GTP framework. Our model trained on the CPTAC data achieved consistently high performance on three-label classification (normal versus LUAD versus LSCC: mean accuracy $= 91.2 \pm 2.5\%$) based on five-fold cross-validation, and mean accuracy $= 82.3 \pm 1.0\%$ on external test data (TCGA). We also introduced a graph-based saliency mapping technique, called GraphCAM, that can identify regions that are highly associated with the class label. Our findings demonstrate GTP as an interpretable and effective deep learning framework for WSI-level classification.
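
The step in which patch feature vectors become the nodes of a graph can be illustrated with a minimal sketch. The abstract does not say how edges between patches are defined, so the rule used here (linking patches that touch on the patch grid, including diagonally) is an assumption made only for illustration. The function name `build_patch_graph`, the toy coordinates, and the random stand-in features are likewise hypothetical and are not taken from the authors' code.

```python
import numpy as np

def build_patch_graph(coords):
    """Return a symmetric 0/1 adjacency matrix over patches.

    `coords` is an (N, 2) integer array of (row, col) grid positions,
    one per retained tissue patch. ASSUMPTION: two patches are linked
    when they touch on the grid (8-neighbourhood); the source does not
    specify the edge rule.
    """
    coords = np.asarray(coords, dtype=int)
    # Pairwise Chebyshev distance between grid positions.
    diff = np.abs(coords[:, None, :] - coords[None, :, :]).max(axis=-1)
    # Distance 1: shared edge or corner. Distance 0 is the patch itself.
    return (diff == 1).astype(np.uint8)

# Hypothetical toy slide: three touching patches and one isolated patch.
coords = np.array([[0, 0], [0, 1], [1, 1], [5, 5]])

# Stand-in for per-patch feature vectors; in the described pipeline these
# would come from the contrastively trained feature extractor.
features = np.random.default_rng(0).normal(size=(len(coords), 8))

adjacency = build_patch_graph(coords)
degree = adjacency.sum(axis=1)  # number of neighbours per node
```

The pair `(features, adjacency)` is the kind of node-feature and connectivity input a graph-based model consumes. The dense pairwise comparison is fine for a toy example but scales quadratically with the number of patches, so a real slide would likely need a sparse or grid-indexed construction.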

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2205.09671 [cs.CV]
	(or arXiv:2205.09671v1 [cs.CV] for this version)

Submission history

From: Yi Zheng
[v1] Thu, 19 May 2022 16:32:10 GMT (4175kb,D)
