Yüksek boyutlu veri kümelerindeki özniteliklerin hibrit yöntemle seçilmesi

Hybrid feature selection in high dimensional data clusters


Thesis No: 444135
Author: ABDÜLKADİR GÜMÜŞÇÜ
Advisor: PROF. DR. RAMAZAN TAŞALTIN
Thesis Type: Doctoral
Subjects: Electrical and Electronics Engineering
Keywords: Not specified.
Year: 2016
Language: Turkish
University: Harran Üniversitesi
Institute: Graduate School of Natural and Applied Sciences
Department: Department of Electrical and Electronics Engineering
Division: Not specified.
Page Count: 73

Abstract

Microarray datasets now contribute significantly to disease diagnosis. Interpreting them with machine learning algorithms is very difficult, however, because the number of patients is small while the number of genes is large. Feature selection is therefore a crucial processing step in gene analysis. In the literature, feature selection algorithms are generally grouped under three main headings: filter, wrapper, and embedded models. Among the methods applicable to microarray data analysis, filter-based feature selection algorithms are fast but do not always reach the desired success rate, whereas wrapper-based algorithms give successful results but are slow, which makes them hard to use. To eliminate these disadvantages, this thesis proposes a hybrid feature selection algorithm that blends the speed of filter-based algorithms with the successful results of wrapper-based algorithms. In the filter stage of the proposed method, three filter-based feature selection algorithms (Chi-Square, ReliefF, and F-Score) are applied, and their results are combined and passed to a genetic algorithm. The genetic algorithm selects the best subset from the filtered dataset. The final selected dataset is evaluated with leave-one-out cross-validation (LOOCV) using the k-nearest neighbor (k-NN) classification algorithm. The classification success rate of the proposed method (CFR-GA) is compared with 7 classification algorithms applied without feature selection, 2 filter-based feature selection algorithms, 2 wrapper-based feature selection algorithms, and 2 hybrid feature selection algorithms. The experimental results show that the proposed method improves significantly on the compared methods.
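The filter stage described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the three filter results are combined, so the union-of-top-k rule below is an assumption, as are all function names and the toy data. Only two filters are shown: a Fisher score and a simplified Relief (one nearest hit and miss per sample, not the full ReliefF); the chi-squared filter is omitted because it would need a discretization step the abstract does not describe.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Fisher score per feature: between-class scatter over within-class scatter."""
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall = mean(column)
        between = within = 0.0
        for c in set(y):
            values = [v for v, label in zip(column, y) if label == c]
            between += len(values) * (mean(values) - overall) ** 2
            within += len(values) * pvariance(values)
        scores.append(between / within if within > 0 else 0.0)
    return scores


def relief_scores(X, y):
    """Simplified Relief: one nearest hit and one nearest miss per sample."""
    n, d = len(X), len(X[0])
    # Per-feature value range, used to put all features on a common scale.
    spans = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]

    def diff(a, b, j):
        return abs(X[a][j] - X[b][j]) / spans[j]

    def distance(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [m for m in range(n) if m != i and y[m] == y[i]]
        misses = [m for m in range(n) if y[m] != y[i]]
        hit = min(hits, key=lambda m: distance(i, m))
        miss = min(misses, key=lambda m: distance(i, m))
        for j in range(d):
            # Reward features that separate classes, penalise ones that split a class.
            weights[j] += (diff(i, miss, j) - diff(i, hit, j)) / n
    return weights


def top_k(scores, k):
    """Indices of the k highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(order[:k])


def combine_filters(score_lists, k):
    """Union of each filter's top-k features (an assumed combination rule)."""
    selected = set()
    for scores in score_lists:
        selected |= top_k(scores, k)
    return sorted(selected)


# Invented toy data: 6 samples, 4 features, 2 classes (not from the thesis).
X = [
    [1.0, 5.0, 0.2, 3.0],
    [1.2, 4.0, 0.9, 3.1],
    [0.9, 6.0, 0.1, 2.9],
    [3.0, 5.5, 0.8, 3.0],
    [3.2, 4.5, 0.3, 3.1],
    [2.9, 5.0, 0.7, 2.9],
]
y = [0, 0, 0, 1, 1, 1]

candidates = combine_filters([fisher_scores(X, y), relief_scores(X, y)], k=2)
print(candidates)
```

A union keeps any feature that at least one filter ranks highly, which suits a pipeline where a later search stage prunes the candidates; an intersection or a rank average would be equally plausible readings of "combined".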

Abstract (Translation)

Microarray data now make an important contribution to the diagnosis of disease. Classifying microarray data with machine learning is difficult, however, because the number of features is large and the number of patients is small. Feature selection is therefore a pre-processing step of great importance in microarray classification. In the literature, feature selection techniques for classification are examined under three headings. Among the methods available for gene analysis, filter methods are fast but do not always reach the required level of success, whereas wrapper models give successful results but are slow, which causes problems in use. Applying wrapper models becomes harder still on datasets with thousands of genes, such as microarray data. To remove these disadvantages, a hybrid feature selection algorithm is proposed that combines the speed of filter models with the successful results of wrapper models. In the proposed method, three filters, Chi-Squared, Fisher-Score, and ReliefF (CFR), are applied, and their results are combined and passed to a genetic algorithm. The genetic algorithm selects the optimal feature set from the filtered dataset. This feature set is evaluated with leave-one-out cross-validation (LOOCV) using the k-nearest neighbor (k-NN) method. The classification success rate of the proposed method (CFR-GA) is compared with 7 classification algorithms, 2 filter feature selection algorithms, 2 wrapper feature selection algorithms, and 2 hybrid feature selection algorithms. The experimental results indicate that the proposed method provides significant improvements over the compared methods.
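The wrapper stage described in the abstract (a genetic algorithm searching feature subsets, scored by k-NN under leave-one-out cross-validation) might look roughly like the sketch below. The abstract gives no GA details, so the bit-mask encoding, truncation selection, one-point crossover, mutation rate, size penalty, and every function and parameter name here are assumptions chosen for illustration, and the toy data is invented.

```python
import random


def loocv_accuracy(X, y, features, k=1):
    """Leave-one-out accuracy of k-NN restricted to the given feature indices."""
    if not features:
        return 0.0
    correct = 0
    for i in range(len(X)):
        # Squared Euclidean distance from held-out sample i to every other sample.
        neighbours = sorted(
            (sum((X[i][j] - X[m][j]) ** 2 for j in features), y[m])
            for m in range(len(X)) if m != i
        )[:k]
        votes = [label for _, label in neighbours]
        predicted = max(set(votes), key=votes.count)
        correct += predicted == y[i]
    return correct / len(X)


def genetic_select(X, y, candidates, pop_size=20, generations=30,
                   mutation_rate=0.1, k=1, seed=0):
    """Search bit masks over `candidates`; fitness is LOOCV k-NN accuracy."""
    rng = random.Random(seed)
    n = len(candidates)

    def fitness(mask):
        chosen = [f for f, bit in zip(candidates, mask) if bit]
        # Assumed small penalty per feature so smaller subsets win ties.
        return loocv_accuracy(X, y, chosen, k) - 0.001 * len(chosen)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0][:]]              # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)        # one-point crossover (needs n >= 2)
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return [f for f, bit in zip(candidates, best) if bit]


# Invented toy data: 6 samples, 4 features, 2 classes (not from the thesis).
X = [
    [1.0, 5.0, 0.2, 3.0],
    [1.2, 4.0, 0.9, 3.1],
    [0.9, 6.0, 0.1, 2.9],
    [3.0, 5.5, 0.8, 3.0],
    [3.2, 4.5, 0.3, 3.1],
    [2.9, 5.0, 0.7, 2.9],
]
y = [0, 0, 0, 1, 1, 1]

selected = genetic_select(X, y, candidates=[0, 1, 2, 3])
print(selected, loocv_accuracy(X, y, selected))
```

Running the search only over features that survive the filters is what keeps the wrapper affordable: each fitness evaluation costs a full LOOCV pass, so the number of candidate features directly bounds the search space.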

Similar Theses

Thesis No: 541786
Title: Deep convolutional neural network based unconstrained ear recognition
Title (Turkish): Derin evrişimsel sinir ağı tabanlı kısıtsız kulak tanıma
Author: FEVZİYE İREM EYİOKUR
Thesis Type: Master's
Language: English
Year: 2018
Subject: Computer Engineering and Computer Science and Control
University: İstanbul Teknik Üniversitesi
Department: Department of Computer Engineering
Advisor: ASSOC. PROF. DR. HAZIM KEMAL EKENEL

Thesis No: 949434
Title: Radar target detection using improved transformer neural networks
Title (Turkish): Geliştirilmiş transformer sinir ağları ile radar hedef tespiti
Author: SENA ÇAYBAŞI
Thesis Type: Master's
Language: English
Year: 2025
Subject: Electrical and Electronics Engineering
University: İstanbul Teknik Üniversitesi
Department: Department of Electronics and Communication Engineering
Advisor: PROF. DR. IŞIN ERER

Thesis No: 964929
Title: Anomaly detection in Internet of Medical Things using deep learning
Author: AYŞE BETÜL BÜKEN
Thesis Type: Master's
Language: English
Year: 2025
Subject: Computer Engineering and Computer Science and Control
University: Sakarya Üniversitesi
Department: Department of Software Engineering
Advisor: PROF. DR. DEVRİM AKGÜN

Thesis No: 730318
Title: Extracting electricity consumption patterns using big data and machine learning
Title (Turkish): Büyük veri ve makine öğrenmesi kullanılarak elektrik tüketim örüntülerinin çıkarılması
Author: FATİH ÜNAL
Thesis Type: Doctoral
Language: Turkish
Year: 2022
Subject: Energy
University: Fırat Üniversitesi
Department: Department of Energy Systems Engineering
Advisor: PROF. DR. SAMİ EKİCİ

Thesis No: 863929
Title: Feature selection based on JAYA algorithm for medical datasets
Title (Turkish): Medikal veri setleri için JAYA algoritması tabanlı öznitelik seçimi
Author: MOHAMMAD KARRARI MOGHANJOUGHI
Thesis Type: Master's
Language: Turkish
Year: 2024
Subject: Computer Engineering and Computer Science and Control
University: Kütahya Dumlupınar Üniversitesi
Department: Department of Computer Engineering
Advisor: ASSOC. PROF. DR. GÜRCAN YAVUZ
