Novel multiple instance learning models for digital histopathology

Title translation not available.

  1. Thesis No: 759083
  2. Author: MUSTAFA UMIT ONER
  3. Advisors: ASST. PROF. DR. LEE HWEE KUAN, PROF. SUNG WING-KIN
  4. Thesis Type: Doctorate
  5. Subjects: Electrical and Electronics Engineering
  6. Keywords: Not specified.
  7. Year: 2021
  8. Language: English
  9. University: National University of Singapore (NUS)
  10. Institute: Foreign Institute
  11. Department: Not specified.
  12. Division: Not specified.
  13. Page Count: 312

Abstract

No abstract available.

Abstract (Translation)

Cancer is estimated to have caused 9.3 million deaths globally in 2019. Histopathology is a crucial diagnostic tool for the early detection and successful treatment of cancer. Recently, slide scanners have made histopathology digital: glass slides are digitized and stored as whole-slide images (WSIs). WSIs provide valuable data that powerful deep learning models can exploit. However, a WSI is a huge gigapixel image that traditional deep learning models cannot process. Moreover, deep learning models require large amounts of labeled data, yet most WSIs are either unannotated or annotated only with weak labels indicating sample-level properties. WSIs are seldom annotated with regions of interest.

This thesis develops novel multiple instance learning (MIL) models to address these challenges in digital histopathology. MIL is a machine learning paradigm that learns the mapping between bags of instances and bag labels. We use the MIL paradigm to tackle huge images (WSIs) and to utilize weak labels: we treat a WSI as a bag of small patches cropped from the WSI and use the WSI's weak label as the bag label. We also test our models' usefulness on real-world tasks at the intersection of digital histopathology and genomics.

Firstly, we show that digital histopathology tasks can be accomplished with weak labels alone. We develop a weakly supervised clustering framework based on a novel MIL task of predicting the unique class count (ucc), which is the number of unique classes among all instances inside a bag. Note that ucc does not provide a label for each instance directly. We formally prove that a perfect ucc classifier (defined in Section 3.3.2.2 of the thesis) can be used to cluster the individual instances inside the bags perfectly. Furthermore, given only weak labels indicating whether an image contains metastases, we successfully segment breast cancer metastases in lymph node sections by formulating this task as a ucc task. We show that our framework, using only weak labels, approaches the performance of a fully supervised medical image segmentation model, which requires tedious and time-consuming exhaustive annotations of the metastasis regions in the images.

Secondly, we introduce a new family of MIL pooling filters, namely distribution-based pooling filters. A component common to all MIL methods is the MIL pooling filter, which summarizes the extracted features of instances into a bag-level representation. Distribution-based pooling filters obtain a bag-level representation by estimating the marginal distributions of the extracted features. We formally prove that distribution-based pooling filters are superior to their point-estimate-based counterparts, such as 'max' and 'mean' pooling, in terms of the amount of information captured while obtaining bag-level representations. Moreover, we empirically show that models with distribution-based pooling filters perform as well as or better than those with point-estimate-based ones on real-world MIL tasks.

Thirdly, we show that a MIL model with a distribution pooling filter can successfully predict tumor purity from hematoxylin and eosin (H&E) stained WSIs. Tumor purity is the percentage of cancer cells within a tumor. Accurate tumor purity estimation is crucial for accurate pathologic evaluation and for sample selection that minimizes normal cell contamination in genomic analysis. Tumor purity is routinely estimated by pathologists; however, their estimates suffer from inter-observer variability. Moreover, they do not correlate well with genomic tumor purity values, which are computationally inferred from genomic data and accepted as the gold standard. We show that our MIL models successfully predict tumor purity from H&E stained WSIs in eight TCGA cohorts and a local Singapore lung cancer cohort. The predictions are consistent with genomic tumor purity values. In addition, we obtain spatially resolved tumor purity maps showing the spatial variation of tumor purity within slides. Hence, our MIL models can be utilized for sample selection for genomic sequencing, which will help reduce pathologists' workload and decrease inter-observer variability. Moreover, spatial tumor purity maps can help to better understand the tumor microenvironment as a key determinant of tumor formation and therapeutic response.

Finally, we give a recipe for preparing machine learning datasets for digital histopathology tasks. We show that incorrect data segregation during dataset preparation leads to data leakage, which seriously affects a machine learning model's performance on new patients during real-world deployment. The model can give deceptively good results on the test set that are unlikely to hold for a new patient walking into the clinic. We conclude that patient-level data segregation is necessary to avoid data leakage in digital histopathology tasks. Moreover, it ensures that each patient in the test set is like a new patient walking into the clinic. Hence, it is the correct way of preparing machine learning datasets for real-world clinical applications.
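
The patient-level data segregation recommended at the end of the abstract can be illustrated with a minimal Python sketch. It is not the thesis's own procedure: the function name `patient_level_split`, the `(slide_id, patient_id)` input format, and the toy identifiers are all assumptions made for illustration. The idea is only that the split is drawn over patients rather than over slides, so no patient contributes slides to both sets.

```python
import random
from collections import defaultdict


def patient_level_split(slides, test_fraction=0.2, seed=0):
    """Split (slide_id, patient_id) pairs so that no patient is in both sets.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    # Group slide IDs by the patient they came from.
    by_patient = defaultdict(list)
    for slide_id, patient_id in slides:
        by_patient[patient_id].append(slide_id)

    # Shuffle patients (not slides) reproducibly, then cut the list.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])

    train = [s for p in patients if p not in test_patients for s in by_patient[p]]
    test = [s for p in sorted(test_patients) for s in by_patient[p]]
    return train, test


# Toy data (made up): seven slides from five patients.
slides = [
    ("s1", "pA"), ("s2", "pA"), ("s3", "pB"), ("s4", "pC"),
    ("s5", "pC"), ("s6", "pD"), ("s7", "pE"),
]
train, test = patient_level_split(slides, test_fraction=0.4, seed=0)

# Check for leakage: the two sets must not share any patient.
patient_of = dict(slides)
shared = {patient_of[s] for s in train} & {patient_of[s] for s in test}
print("shared patients:", shared)
```

Splitting the same seven slides at random without regard to patient could put `s1` in the training set and `s2` in the test set, which is the kind of leakage the abstract warns about.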

Similar Theses

  1. Security enhancement of image steganography using deep convolutional neural network (DCNN)

    Görüntü güvenliği geliştirme derin kullanarak steganografi dönüşümlü sinir ağı (DCNN)

    RAFAD IMAD KADHIM ABO KHUSHOOT

    Master's

    English

    2022

    Electrical and Electronics Engineering, Altınbaş University

    Department of Electrical and Computer Engineering

    PROF. DR. GALİP CANSEVER

  2. Efficient machine learning models for cancer biology

    Kanser biyolojisi için etkin yapay öğrenme modelleri

    AYYÜCE BEGÜM BEKTAŞ

    Doctorate

    English

    2022

    Industrial and Industrial Engineering, Koç University

    Department of Industrial Engineering

    ASSOC. PROF. DR. MEHMET GÖNEN

  3. Generation and analysis of segmentation trees for natural images

    No title translation available

    EMRE AKBAŞ

    Doctorate

    English

    2011

    Computer Engineering Sciences-Computer and Control, University of Illinois at Urbana-Champaign

    Department of Electrical and Computer Engineering

    PROF. NARENDRA AHUJA

  4. Utilizing multiple instance learning for computer vision tasks

    Bilgisayarlı görü problemlerinin çoklu örnekle öğrenme ile değerlendirilmesi

    FADİME ŞENER

    Master's

    English

    2013

    Computer Engineering Sciences-Computer and Control, İhsan Doğramacı Bilkent University

    Department of Computer Engineering

    ASST. PROF. DR. PINAR DUYGULU ŞAHİN

    ASST. PROF. DR. NAZLI İKİZLER CİNBİŞ

  5. Distance-based learning approaches for multiple instance learning

    Çoklu örnekle öğrenme problemleri için uzaklık tabanlı öğrenme yaklaşımları

    ÖZGÜR EMRE SİVRİKAYA

    Doctorate

    English

    2022

    Industrial and Industrial Engineering, Boğaziçi University

    Department of Industrial Engineering

    ASST. PROF. DR. MUSTAFA GÖKÇE BAYDOĞAN