Yapay zeka ile meme kanseri teşhisi

Breast cancer diagnosis with artificial intelligence

Thesis No: 931418
Author: İLKER ÇAKAR
Advisors: Assoc. Prof. Dr. MUHAMMED KÜRŞAD UÇAR
Thesis Type: Master's
Subjects: Electrical and Electronics Engineering
Keywords: Not specified.
Year: 2025
Language: Turkish
University: Sakarya Üniversitesi
Institute: Institute of Natural Sciences
Department: Department of Electrical and Electronics Engineering
Division: Electronics Engineering Division
Number of Pages: 177

Abstract

Breast cancer is a disease in which cells in the breast grow out of control. It is among the most common diseases in women worldwide. Most breast lumps are benign, that is, non-cancerous, and recognizing this is important for early diagnosis. Early diagnosis also matters greatly for treatment. If left unchecked, malignant tumors can spread through the body to other organs and become fatal. Detecting the disease at an early stage helps prevent deaths from it. No single treatment approach suits every patient. Treatment decisions take into account factors such as the type and stage of the cancer and the patient's lifestyle. There are five general treatment options for breast cancer, and most treatment plans combine several of them: surgery, radiation, hormone therapy, chemotherapy, and targeted therapy. Some are local and target only the area around the tumor. Others are systemic and target the whole body with cancer-fighting agents. Despite all these treatments, cancer that has spread to other parts of the body usually cannot be cured, although it can typically be controlled effectively for a long time. Recently, computer-aided systems have been developed to support early and accurate diagnosis of breast cancer. Computer-aided systems built with machine learning approaches contribute substantially to the diagnostic process. Such intelligent, automated diagnostic systems are important analysis tools that can support medical specialists and take part in medical decision-making. Although many methods have been proposed, most do not deliver accurate and consistent results.
In addition, existing systems need higher accuracy and less computation time, and existing studies still do not achieve consistent accuracy. The main goal of machine learning in breast cancer is to let a system learn without human intervention, which helps in designing an automated decision-making system. Improving the predictive accuracy of machine learning models, however, remains a major challenge and a research gap. Even so, many machine learning techniques have been proposed for the breast cancer classification problem. The datasets in this study were obtained from publicly available websites. Dataset 1 contains 14 features for each of 4024 individuals and is used to classify them as survived or deceased. Dataset 2 contains 9 features for each of 286 individuals and is used to classify breast cancer recurrence versus non-recurrence. Dataset 3 contains 9 features for each of 116 individuals and is used to classify them as healthy or diseased. Dataset 4 contains 33 features for each of 198 individuals and is used to classify breast cancer recurrence versus non-recurrence. Dataset 5 contains 9 features for each of 699 individuals and is used to classify tumors as benign or malignant. Dataset 6 contains 30 features for each of 569 individuals and is used to classify tumors as benign or malignant.
In this study, preprocessing first examined the feature values in all datasets according to their definitions and separated them into numeric and nominal values. Non-numeric nominal values were mapped to numeric values for processing. Numeric values were either left unchanged or grouped into ranges. The Eta feature selection algorithm was then applied. Next, data balancing was performed on all six datasets so that the system would work more efficiently. After balancing, the data were split into training and test sets: 50%-50% for datasets 1, 2, and 4; 85%-15% for dataset 3; and 80%-20% for datasets 5 and 6. Machine learning algorithms, namely Ensemble, kNN (k-Nearest Neighbors), SVMs (Support Vector Machines), and Hybrid Artificial Intelligence, were then applied to all six datasets. The results were used to build a breast cancer algorithm and to produce figures. Performance criteria such as accuracy, specificity, sensitivity, kappa, and F-Score were computed for the machine learning algorithms on each of the six datasets. These criteria were compared first among the machine learning techniques within each dataset and then across all datasets. Across all datasets, the highest accuracy, F-Score, Kappa, and AUC values, 100%, were obtained with the Ensemble method on dataset 6, the Wisconsin Diagnostic Breast Cancer dataset. The highest sensitivity, 100%, was obtained on dataset 1, the Breast Cancer Registry of University Malaya Medical Centre dataset, with the Hybrid method; on dataset 2 with the SVMs, Ensemble, kNN, and Hybrid methods; on dataset 3, the Coimbra dataset, with the Ensemble and Hybrid methods; and on dataset 6, the Wisconsin Diagnostic Breast Cancer dataset, with the Ensemble method. The highest specificity, 100%, was obtained on dataset 3, the Coimbra dataset, with the SVMs, Ensemble, and Hybrid methods; on dataset 5, the Wisconsin Breast Cancer dataset, with the SVMs, kNN, Ensemble, and Hybrid methods; and on dataset 6, the Wisconsin Diagnostic Breast Cancer dataset, with the Ensemble method. These results indicate that the study, implemented in MATLAB on six different datasets, used machine learning algorithms effectively. In conclusion, the study demonstrates that different machine learning techniques can achieve higher accuracy, specificity, and sensitivity. This in turn suggests that the study is reliable for identifying diseased and healthy individuals in breast cancer diagnosis, and that it is applicable and practical in the healthcare field.
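The performance criteria named in the abstract (accuracy, sensitivity, specificity, F-Score, and kappa) can all be derived from a binary confusion matrix. The following is a minimal Python sketch of those standard definitions, not the thesis's MATLAB code; the names `BinaryMetrics`, `confusion_counts`, and `binary_metrics` are hypothetical, and kappa is assumed here to mean Cohen's kappa.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    f_score: float
    kappa: float


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    """Divide safely; an undefined ratio is reported as 0.0."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred, positive=1):
    """Compute the five criteria from true and predicted labels."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    n = tp + fp + tn + fn
    accuracy = _ratio(tp + tn, n)
    sensitivity = _ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    precision = _ratio(tp, tp + fp)
    f_score = _ratio(2 * precision * sensitivity, precision + sensitivity)
    # Cohen's kappa (assumed definition): observed agreement corrected
    # for the agreement expected by chance from the marginal totals.
    expected = _ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    kappa = _ratio(accuracy - expected, 1 - expected)
    return BinaryMetrics(accuracy, sensitivity, specificity, f_score, kappa)


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the thesis.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 1, 0, 0, 0, 1, 0, 1]
    print(binary_metrics(truth, predicted))
```

Reporting sensitivity and specificity alongside accuracy matters on imbalanced data, where a classifier can score high accuracy while missing most cases of the minority class.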

Abstract (Translation)

Breast cancer is a disease characterized by the uncontrolled growth of cells in the breast. It is among the most commonly diagnosed illnesses in women worldwide. Recognizing that most breast lumps are benign or non-cancerous is vital for early diagnosis. Early detection of breast cancer holds particular significance in terms of effective treatment. If left unchecked, malignant tumors may spread to other parts of the body, potentially reaching life-threatening stages. Detecting and addressing the disease at an early stage plays a crucial role in preventing cancer-related deaths. There is no universal treatment approach suitable for every individual diagnosed with breast cancer. Factors such as the type and stage of cancer, as well as the individual's overall lifestyle and health, are taken into consideration when determining the best course of treatment. Recently, computer-aided systems have been developed to improve the early detection and diagnostic accuracy of breast cancer. These systems, powered by machine learning approaches, have made substantial contributions to the process of diagnosing breast cancer. Machine learning-based computer-aided intelligent and automated diagnostic systems are essential tools in analyzing breast cancer data. They not only assist healthcare professionals in understanding and diagnosing breast cancer but also play a pivotal role in the medical decision-making process. Despite the development of numerous methods, many fail to provide accurate and consistent results. Current systems also demand higher accuracy rates and faster computational times. Even with these improvements, achieving consistent accuracy across varying datasets and conditions remains a significant challenge. The primary aim of machine learning in breast cancer research is to enable systems to learn and adapt without direct human intervention, facilitating the development of automated decision-making tools. 
However, improving the predictive accuracy of machine learning models continues to present a significant challenge and identifies a notable research gap in the field. Despite these difficulties, many machine learning techniques have been proposed to address the challenges associated with breast cancer classification. Expanding on this, the importance of machine learning in breast cancer cannot be overstated. It offers an opportunity to revolutionize how we approach diagnosis and treatment. Machine learning algorithms can process vast datasets, identify patterns that may not be immediately apparent to human experts, and deliver predictions with remarkable speed. For example, deep learning algorithms have been applied to mammographic images, significantly improving the ability to detect anomalies that could indicate early-stage cancer. Moreover, these systems can incorporate patient data such as genetic information, lifestyle factors, and treatment histories to personalize diagnostics and predict treatment outcomes. This individualized approach ensures that each patient receives care tailored to their unique circumstances, potentially improving prognosis and quality of life. However, the integration of these technologies into routine clinical practice is not without obstacles. Ethical considerations, data privacy issues, and the need for extensive validation across diverse populations remain critical areas of concern. Furthermore, the computational demands of advanced machine learning models require robust hardware and software infrastructures, which may not be readily available in all healthcare settings. Despite these challenges, the field continues to advance, driven by the promise of better outcomes for patients. Future research should focus on developing models that not only achieve high accuracy but also exhibit generalizability across different datasets and populations. 
Additionally, collaborative efforts between machine learning researchers and medical practitioners will be essential in translating theoretical advancements into practical solutions that benefit patients worldwide. In conclusion, machine learning holds the potential to transform breast cancer diagnosis and treatment. By overcoming existing limitations and addressing current gaps, these technologies can provide a powerful ally in the fight against breast cancer, ultimately leading to earlier diagnoses, better treatment options, and improved survival rates. The datasets used in this study were obtained from publicly available online sources. In the first dataset used in the study, data on 14 attributes were collected from 4024 individuals. Based on this information, the survival and mortality status of these 4024 individuals were classified. In the second dataset, data on 9 attributes were collected from 286 individuals. Based on this information, the recurrence and non-recurrence status of breast cancer among these 286 individuals were classified. In the third dataset, data on 9 attributes were collected from 116 individuals. Based on this information, the health status of these 116 individuals (healthy or diseased) was classified. In the fourth dataset, data on 33 attributes were collected from 198 individuals. Based on this information, the recurrence and non-recurrence status of breast cancer among these 198 individuals were classified. In the fifth dataset, data on 9 attributes were collected from 699 individuals. Based on this information, the benign or malignant status of breast cancer among these 699 individuals was classified. In the sixth dataset, data on 30 attributes were collected from 569 individuals. Based on this information, the benign or malignant status of breast cancer among these 569 individuals was classified. 
In this study, all feature values in the datasets were initially examined through data preprocessing according to their definitions and expressions, and were categorized as numerical or nominal values. Non-numerical nominal values were assigned numerical equivalents for processing purposes, while numerical values were either left as they were or grouped internally. Subsequently, the Eta feature selection algorithm was applied to the entire system. Data balancing was performed on all six datasets to enhance system efficiency. After balancing, the data were divided into training and testing sets with specific split ratios: 50%-50% for the 1st, 2nd, and 4th datasets; 85%-15% for the 3rd dataset; and 80%-20% for the 5th and 6th datasets. Following these preprocessing steps, machine learning algorithms such as Ensemble, k-Nearest Neighbors (kNN), Support Vector Machines (SVMs), and Hybrid Artificial Intelligence were applied to all six datasets. The results obtained were used to develop a breast cancer algorithm, and visual representations were generated based on the findings. Performance evaluation metrics such as accuracy, specificity, sensitivity, kappa statistic, and F-Score were obtained for all six datasets using machine learning algorithms. These evaluation metrics were initially analyzed for each dataset individually, focusing on the machine learning techniques applied to that specific dataset. Subsequently, the evaluation metrics across all datasets were compared to assess the overall performance of the applied techniques. When the results obtained from all datasets in this study were compared, the highest accuracy, F-Score, Kappa, and AUC values were achieved with the Ensemble method on the 6th dataset, referred to as the Wisconsin Diagnostic Breast Cancer dataset, reaching 100%. The highest sensitivity rate was observed on the 1st dataset, known as the Breast Cancer Registry of University Malaya Medical Centre dataset, with the Hybrid method achieving 100%. 
Similarly, the 2nd dataset achieved a sensitivity rate of 100% with SVMs, Ensemble, kNN, and Hybrid methods. The 3rd dataset, referred to as the Coimbra dataset, also reached 100% sensitivity using Ensemble and Hybrid methods. The 6th dataset, the Wisconsin Diagnostic Breast Cancer dataset, achieved 100% sensitivity with the Ensemble method. The highest specificity rate was found in the 3rd dataset, the Coimbra dataset, where SVMs, Ensemble, and Hybrid methods achieved 100% specificity. Likewise, the 5th dataset, referred to as the Wisconsin Breast Cancer dataset, achieved 100% specificity with SVMs, kNN, Ensemble, and Hybrid methods. The 6th dataset, the Wisconsin Diagnostic Breast Cancer dataset, also achieved 100% specificity with the Ensemble method. Based on the results obtained in this study, it is evident that the machine learning algorithms applied across six distinct datasets using the MATLAB environment were utilized with exceptional efficiency. This research highlights the significant potential of integrating advanced computational methods into medical applications, particularly in the field of breast cancer diagnosis. By leveraging diverse datasets and sophisticated algorithms, the study underscores the ability of machine learning techniques to deliver accurate and reliable outcomes. These outcomes are not only scientifically significant but also have practical implications for real-world healthcare scenarios. In conclusion, this study has convincingly demonstrated that higher accuracy, specificity, and sensitivity rates can be achieved through the implementation of various machine learning techniques. The results obtained serve as a testament to the capability of machine learning systems to distinguish between diseased and healthy individuals with precision. The high performance of these algorithms reinforces the reliability of this study as an innovative and effective approach in breast cancer diagnosis. 
Furthermore, it underlines the importance of adopting computational tools in medical research, showcasing their applicability and practicality in addressing critical healthcare challenges. One of the most commendable aspects of this study is its use of six different datasets, each presenting unique characteristics and challenges. By systematically analyzing these datasets and applying advanced algorithms, the research demonstrates its robustness and adaptability. The inclusion of datasets with varying attributes ensures that the findings are not limited to a single context, enhancing their generalizability and relevance across diverse scenarios. This comprehensive approach reflects the study's commitment to rigor and excellence in research. Moreover, the integration of techniques such as Ensemble methods, k-Nearest Neighbors (kNN), Support Vector Machines (SVMs), and Hybrid Artificial Intelligence showcases the breadth of machine learning methodologies explored in this study. These algorithms, known for their strong predictive capabilities, have been effectively applied to achieve high-performance metrics, including accuracy, specificity, and sensitivity. This demonstrates not only the technical expertise of the research but also its focus on delivering results that are both meaningful and actionable in clinical settings. The study also addresses a critical gap in current breast cancer diagnostic methods by emphasizing the importance of consistent and reliable outcomes. While traditional diagnostic approaches may suffer from variability and limited scalability, the machine learning-driven framework proposed in this study offers a more standardized and scalable solution. By automating the diagnostic process and reducing reliance on subjective human interpretations, the research paves the way for more objective and reproducible results. 
This advancement has the potential to significantly improve early detection rates, leading to better patient outcomes and more effective treatment planning. Additionally, this study's contribution extends beyond its technical achievements. It serves as a powerful example of how interdisciplinary collaboration between computational science and medicine can yield transformative solutions to pressing healthcare problems. By aligning the strengths of machine learning with the needs of medical practitioners, the research provides a pathway for future innovations in diagnostic and therapeutic strategies. In the context of breast cancer, where early detection and accurate classification are crucial for successful treatment, the significance of this study cannot be overstated. Breast cancer remains one of the leading causes of morbidity and mortality worldwide, making the development of reliable diagnostic tools a priority. This study not only meets this need but also sets a benchmark for future research in the field. Its findings contribute to the growing body of evidence supporting the use of machine learning in oncology, reinforcing its role as a critical tool in the fight against cancer. In summary, this research exemplifies the power and promise of machine learning in healthcare. Through meticulous design, comprehensive analysis, and innovative application of advanced algorithms, the study has achieved outcomes that are both scientifically valuable and clinically relevant. It stands as a testament to the potential of computational methods to revolutionize breast cancer diagnosis, offering hope for earlier detection, more accurate classifications, and ultimately, improved survival rates. This study is not only a significant academic achievement but also a meaningful contribution to the ongoing efforts to enhance healthcare delivery and patient outcomes worldwide.
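The abstract names an Eta feature selection step but does not define it. One common reading, assumed here, is the correlation ratio (eta), which measures how much of a numeric feature's variance is explained by the class label; features are then ranked by that score. The Python sketch below illustrates that reading only. The function names `correlation_ratio` and `rank_features` and the toy feature names are hypothetical, and the thesis's actual MATLAB procedure may differ.

```python
from collections import defaultdict


def correlation_ratio(labels, values):
    """Return eta in [0, 1] for a categorical label and a numeric feature.

    eta^2 = (between-class sum of squares) / (total sum of squares).
    """
    groups = defaultdict(list)
    for label, value in zip(labels, values):
        groups[label].append(float(value))
    n = sum(len(group) for group in groups.values())
    if n == 0:
        return 0.0
    grand_mean = sum(sum(group) for group in groups.values()) / n
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups.values()
    )
    ss_total = sum(
        (value - grand_mean) ** 2
        for group in groups.values()
        for value in group
    )
    if ss_total == 0:
        return 0.0  # a constant feature carries no class information
    # Clamp to guard against floating-point rounding slightly above 1.
    return min(1.0, ss_between / ss_total) ** 0.5


def rank_features(labels, features, top_k=None):
    """Rank features (dict of name -> column) by eta, highest first."""
    scored = [
        (name, correlation_ratio(labels, column))
        for name, column in features.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the thesis datasets.
    y = [0, 0, 1, 1]
    columns = {
        "separates_classes": [1, 1, 3, 3],
        "partly_informative": [1, 2, 3, 4],
        "uninformative": [1, 3, 1, 3],
    }
    for name, eta in rank_features(y, columns):
        print(f"{name}: {eta:.3f}")
```

Under this reading, a cutoff such as `top_k` or a minimum eta would decide which features are kept; the abstract does not state which criterion the study used.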

Similar Theses

Thesis No: 547247
Title: Yapay zeka ile meme kanseri lenf nodu analizi
English Title: Breast cancer lymph node analysis using artificial intelligence
Author: KULİLİK SÜER
Thesis Type: Master's
Language: Turkish
Year: 2019
Subjects: Computer Engineering and Computer Science and Control
University: Beykent Üniversitesi
Department: Department of Computer Engineering
Advisor: Asst. Prof. Dr. ATINÇ YILMAZ

Thesis No: 488078
Title: Makine öğrenmesi sınıflandırma yöntemleri ile meme kanserinin erken teşhisi
English Title: Early diagnosis of breast cancer with machine learning classification methods
Author: MELİHA NUR DURAK
Thesis Type: Master's
Language: Turkish
Year: 2017
Subjects: Statistics
University: Yıldız Teknik Üniversitesi
Department: Department of Statistics
Advisor: Assoc. Prof. Dr. İBRAHİM DEMİR

Thesis No: 751683
Title: Derin öğrenme ile sınıflandırma: Meme kanseri teşhisi
English Title: Classification with deep learning: Breast cancer diagnosis
Author: ZAINAB SUBHI MAHMOOD HAWRAMI
Thesis Type: Master's
Language: Turkish
Year: 2022
Subjects: Statistics
University: Gazi Üniversitesi
Department: Department of Statistics
Advisor: Prof. Dr. HACI HASAN ÖRKCÜ

Thesis No: 958033
Title: Meme kanseri tanısında yapay zeka uygulamaları üzerine yapılan araştırmaların bibliyometrik analizi
English Title: Bibliometric analysis of research on artificial intelligence applications in breast cancer diagnosis
Author: BENGÜNUR EKİNCİ
Thesis Type: Master's
Language: Turkish
Year: 2025
Subjects: Biostatistics
University: Gazi Üniversitesi
Department: Department of Health Informatics
Advisor: Prof. Dr. HAKAN TEKEDERE

Thesis No: 339890
Title: Yapay zekâ yöntemleri ile veri analizi ve tıbbi teşhis için uzman sistem geliştirme
English Title: Developing expert system for medical diagnosis and data analysis with artificial intelligence methods
Author: ALİ KELEŞ
Thesis Type: Doctorate
Language: Turkish
Year: 2009
Subjects: Computer Engineering and Computer Science and Control
University: Atatürk Üniversitesi
Department: Department of Mathematics
Advisor: Assoc. Prof. Dr. UĞUR YAVUZ