Novel multiple instance learning models for digital histopathology
Title translation not available.
- Thesis No: 759083
- Advisors: ASST. PROF. DR. LEE HWEE KUAN, PROF. SUNG WING-KIN
- Thesis Type: Doctorate
- Subjects: Electrical and Electronics Engineering
- Keywords: Not specified.
- Year: 2021
- Language: English
- University: National University of Singapore (NUS)
- Institute: Foreign Institute
- Department: Not specified.
- Division: Not specified.
- Number of Pages: 312
Abstract
No abstract.
Abstract (Translation)
Cancer is estimated to have been responsible for 9.3 million deaths globally in 2019. Histopathology is a crucial diagnostic tool for the early detection and successful treatment of cancer. Recently, slide scanners have made histopathology digital: glass slides are digitized and stored as whole-slide images (WSIs). WSIs provide valuable data that powerful deep learning models can exploit. However, a WSI is a huge gigapixel image that traditional deep learning models cannot process. In addition, deep learning models require large amounts of labeled data, yet most WSIs are either unannotated or annotated only with weak labels indicating sample-level properties. WSIs are seldom annotated with regions of interest.
This thesis develops novel multiple instance learning (MIL) models to address these challenges in digital histopathology. MIL is a machine learning paradigm that learns the mapping between bags of instances and bag labels. We use the MIL paradigm to handle huge images (WSIs) and to exploit weak labels: we treat a WSI as a bag of small patches cropped from the WSI and use the WSI's weak label as the bag label. We also test our models' usefulness on real-world tasks at the intersection of digital histopathology and genomics.
Firstly, we show that digital histopathology tasks can be accomplished with weak labels alone. We develop a weakly supervised clustering framework based on a novel MIL task of predicting the unique class count (ucc), which is the number of unique classes among all instances inside a bag. Note that ucc does not provide a label for each instance directly. We formally prove that a perfect ucc classifier (defined in Section 3.3.2.2 of the thesis) can be used to cluster the individual instances inside the bags perfectly. Furthermore, given only weak labels indicating whether an image contains metastases, we successfully segment breast cancer metastases in lymph node sections by formulating this task as a ucc task. We show that our framework, using only weak labels, approximates the performance of a fully supervised medical image segmentation model, which requires tedious and time-consuming exhaustive annotations of the metastasis regions in the images.
Secondly, we introduce a new family of MIL pooling filters, namely distribution based pooling filters. One component common to all MIL methods is the MIL pooling filter, which summarizes the features extracted from instances into a bag-level representation. Distribution based pooling filters obtain a bag-level representation by estimating the marginal distributions of the extracted features. We formally prove that distribution based pooling filters are superior to their point estimate based counterparts, such as 'max' and 'mean' pooling, in terms of the amount of information captured while obtaining bag-level representations. Moreover, we empirically show that models with distribution based pooling filters perform as well as or better than those with point estimate based ones on real-world MIL tasks.
Thirdly, we show that a MIL model with a distribution pooling filter can successfully predict tumor purity from hematoxylin and eosin (H&E) stained WSIs. Tumor purity is the percentage of cancer cells within a tumor. Accurate tumor purity estimation is crucial for accurate pathologic evaluation and for sample selection that minimizes normal cell contamination in genomic analysis. Tumor purity is routinely estimated by pathologists; however, pathologists' estimates suffer from inter-observer variability. Moreover, they do not correlate well with genomic tumor purity values, which are computationally inferred from genomic data and accepted as the gold standard. We show that our MIL models successfully predict tumor purity from H&E stained WSIs in eight TCGA cohorts and a local Singapore lung cancer cohort. The predictions are consistent with genomic tumor purity values.
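The contrast between point estimate based and distribution based pooling described in the second contribution can be illustrated with a minimal sketch. The abstract does not say how the marginal distributions are estimated, so everything specific here is an assumption made for illustration: the Gaussian kernel, the number of bins, the bandwidth `sigma`, the restriction of feature values to [0, 1], and the function names `distribution_pooling` and `mean_pooling`. It is not the thesis's implementation.

```python
import numpy as np


def distribution_pooling(features, num_bins=11, sigma=0.1):
    """Summarize a bag by an estimate of each feature's marginal distribution.

    features: array of shape (num_instances, num_features), values assumed in [0, 1].
    Returns an array of shape (num_features, num_bins); each row sums to 1.
    The Gaussian kernel, num_bins, and sigma are illustrative assumptions.
    """
    features = np.asarray(features, dtype=float)
    sample_points = np.linspace(0.0, 1.0, num_bins)
    # Broadcast to shape (num_instances, num_features, num_bins).
    diff = sample_points[None, None, :] - features[:, :, None]
    kernel = np.exp(-0.5 * (diff / sigma) ** 2)
    # Average the kernel responses over the instances in the bag.
    density = kernel.mean(axis=0)
    # Normalize each feature's estimate so it sums to 1 over the bins.
    return density / density.sum(axis=1, keepdims=True)


def mean_pooling(features):
    """Point estimate based pooling: one number per feature."""
    return np.asarray(features, dtype=float).mean(axis=0)


# Two toy bags with one feature each and the same mean but different spread.
bag_a = np.array([[0.5], [0.5]])
bag_b = np.array([[0.1], [0.9]])

print(mean_pooling(bag_a), mean_pooling(bag_b))
print(distribution_pooling(bag_a).round(3))
print(distribution_pooling(bag_b).round(3))
```

In this toy case the two bags have the same mean, so 'mean' pooling cannot tell them apart, while the estimated marginal distributions differ. This is only an intuition for the information argument stated in the abstract, not a substitute for the formal proof.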
We also obtain spatially resolved tumor purity maps showing the spatial variation of tumor purity within slides. Hence, our MIL models can be used for sample selection for genomic sequencing, which will help reduce pathologists' workload and decrease inter-observer variability. Moreover, spatial tumor purity maps can help us better understand the tumor microenvironment as a key determinant in tumor formation and therapeutic response.
Finally, we give a recipe for preparing machine learning datasets for digital histopathology tasks. We show that incorrect data segregation during dataset preparation leads to data leakage, which seriously affects a machine learning model's performance on new patients during real-world deployment. The model can give deceptively good results on the test set that probably will not hold for a new patient walking into the clinic. We conclude that patient-level data segregation is necessary to avoid data leakage in digital histopathology tasks. Moreover, it ensures that each patient in the test set is like a new patient walking into the clinic. Hence, it is the correct way of preparing machine learning datasets for real-world clinical applications.
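The patient-level data segregation recommended above can be sketched as follows. This is a minimal illustration under stated assumptions, not the recipe given in the thesis: the function name `patient_level_split`, the `slide_to_patient` mapping, the slide and patient identifiers, and the 25% test fraction are all hypothetical.

```python
import random


def patient_level_split(slide_to_patient, test_fraction=0.2, seed=0):
    """Split slide IDs into train and test lists so that no patient is in both.

    slide_to_patient: dict mapping a slide ID to its patient ID (hypothetical IDs).
    The split is drawn over patients, not slides, so all slides of a given
    patient end up on the same side. With a single patient, train is empty.
    """
    patients = sorted(set(slide_to_patient.values()))
    rng = random.Random(seed)
    rng.shuffle(patients)
    num_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:num_test])

    train_slides, test_slides = [], []
    for slide, patient in sorted(slide_to_patient.items()):
        if patient in test_patients:
            test_slides.append(slide)
        else:
            train_slides.append(slide)
    return train_slides, test_slides


# Hypothetical example: six slides from four patients.
slides = {
    "slide_1": "patient_A",
    "slide_2": "patient_A",
    "slide_3": "patient_B",
    "slide_4": "patient_C",
    "slide_5": "patient_C",
    "slide_6": "patient_D",
}
train, test = patient_level_split(slides, test_fraction=0.25, seed=0)
print("train:", train)
print("test:", test)
```

Splitting at the slide or patch level instead would allow slides from the same patient to land in both sets, which is the kind of leakage the abstract warns against.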
Similar Theses
- Security enhancement of image steganography using deep convolutional neural network (DCNN)
Görüntü güvenliği geliştirme derin kullanarak steganografi dönüşümlü sinir ağı (DCNN)
RAFAD IMAD KADHIM ABO KHUSHOOT
Master's
English
2022
Electrical and Electronics Engineering, Altınbaş University, Department of Electrical and Computer Engineering
PROF. DR. GALİP CANSEVER
- Efficient machine learning models for cancer biology
Kanser biyolojisi için etkin yapay öğrenme modelleri
AYYÜCE BEGÜM BEKTAŞ
Doctorate
English
2022
Industrial and Industrial Engineering, Koç University, Department of Industrial Engineering
ASSOC. PROF. DR. MEHMET GÖNEN
- Generation and analysis of segmentation trees for natural images
No title translation.
EMRE AKBAŞ
Doctorate
English
2011
Computer Engineering and Computer Science and Control, University of Illinois at Urbana-Champaign, Department of Electrical and Computer Engineering
PROF. NARENDRA AHUJA
- Utilizing multiple instance learning for computer vision tasks
Bilgisayarlı görü problemlerinin çoklu örnekle öğrenme ile değerlendirilmesi
FADİME ŞENER
Master's
English
2013
Computer Engineering and Computer Science and Control, İhsan Doğramacı Bilkent University, Department of Computer Engineering
ASST. PROF. DR. PINAR DUYGULU ŞAHİN
ASST. PROF. DR. NAZLI İKİZLER CİNBİŞ
- Distance-based learning approaches for multiple instance learning
Çoklu örnekle öğrenme problemleri için uzaklık tabanlı öğrenme yaklaşımları
ÖZGÜR EMRE SİVRİKAYA
Doctorate
English
2022
Industrial and Industrial Engineering, Boğaziçi University, Department of Industrial Engineering
ASST. PROF. DR. MUSTAFA GÖKÇE BAYDOĞAN