'R' programlama dilinde tahmin edici veri madenciliği algoritmalarının modellenmesi ve performanslarının karşılaştırılması

Modeling of predictive data mining algorithms in the 'R' programming language and comparison of their performances

Thesis No: 734837
Author: ŞENGÜL CAN
Advisor: Assoc. Prof. Dr. MUSTAFA GERŞİL
Thesis Type: Doctoral
Subject: Business Administration
Keywords: Not specified.
Year: 2022
Language: Turkish
University: Manisa Celal Bayar University
Institute: Institute of Social Sciences
Department: Department of Business Administration
Division: Business Administration
Number of Pages: 151

Abstract (original Turkish)

Günümüz ekonomisi dinamik bir yapıdadır. Gelişen bilgi teknolojileriyle birlikte kayıt altında tutulan veri sayısı da artmıştır. Artan veri miktarı, veriler arasındaki sarmal ilişkileri görmeyi zorlaştırmaktadır. Ham veriden bilgi elde edilmesi ve elde edilen bilginin gelecek tahminlerinde kullanılması işletmeler için kritik öneme sahiptir. Veri tahmini kesinlik içermeyen ve karmaşık bir süreçtir. Ancak doğruya en yakın tahmin işletmelerin stratejik karar almasında oldukça önemlidir. Veri tahmini ekonomi alanında yaygın olarak kullanılmaktadır. Bu çalışmada ekonomik kalkınma için büyük öneme sahip ihracat verileri incelenmiştir. Türkiye İstatistik Kurumu ve Merkez Bankası istatistikleri kullanılarak veri ambarı oluşturulmuştur. İstatistiksel analizlerde sıklıkla tercih edilen R programlama dili kullanılarak algoritmalar geliştirilmiştir. R programlama dilinde yapay sinir ağı, regresyon ve zaman serisi algoritmaları geliştirilmiştir. Çalışmanın birinci aşamasında; R programında yapay sinir ağı algoritması geliştirilmiştir. Bu aşamada farklı ağ topolojileri test edilerek en başarılı yapay sinir ağı belirlenmiştir. Buna göre (5,3) topolojisine sahip ağın en başarılı performansa sahip olduğu görülmüştür. Çalışmanın ikinci aşamasında R programında regresyon algoritması geliştirilmiştir. Çalışmanın son aşamasında R programında zaman serisi algoritması geliştirilmiştir. Naive Bayes ve ARIMA modelleri test edilmiş ve ARIMA(1,0,0)(2,0,0) modelinin daha başarılı olduğu görülmüştür. Yapay sinir ağı (5,3), regresyon ve ARIMA(1,0,0)(2,0,0) algoritmaları veri ambarındaki eğitim verisi üzerinde denenmiştir. Algoritmaların başarıları istatistiksel hata oranları hesaplanarak karşılaştırılmıştır. Buna göre en başarılı tahmin algoritmasının yapay sinir ağı olduğu görülmüştür.

Abstract (English translation)

Today's economy is dynamic. As information technologies have advanced, the volume of recorded data has grown. This growth makes the intertwined relationships among the data harder to see. Extracting information from raw data and using it to forecast the future is critically important for businesses. Forecasting is an uncertain and complex process, yet the forecast closest to the true value matters greatly for strategic decision-making. Data forecasting is widely used in economics. This study examines export data, which are of great importance for economic development. A data warehouse was built from statistics published by the Turkish Statistical Institute and the Central Bank of the Republic of Turkey. The algorithms were developed in the R programming language, which is frequently preferred for statistical analysis: an artificial neural network, a regression, and a time series algorithm. In the first stage of the study, an artificial neural network algorithm was developed in R. Different network topologies were tested at this stage to identify the most successful network, and the network with the (5,3) topology performed best. In the second stage, a regression algorithm was developed in R. In the final stage, a time series algorithm was developed in R. Naive Bayes and ARIMA models were tested, and the ARIMA(1,0,0)(2,0,0) model was found to be more successful. The artificial neural network (5,3), regression, and ARIMA(1,0,0)(2,0,0) algorithms were then run on the training data in the data warehouse. Their performance was compared by calculating statistical error rates. On this basis, the artificial neural network was found to be the most successful forecasting algorithm.
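The comparison step described in the abstract, ranking candidate forecasts by their error rates, can be sketched as follows. This is a minimal illustration only: the abstract does not say which error measures were used, so MAE, RMSE, and MAPE are assumptions. The sketch is written in Python for brevity, not in the thesis's R code. The model names and all numbers are made-up placeholders, not data or results from the thesis.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual, predicted):
    """Mean absolute percentage error (requires non-zero actual values)."""
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


def rank_models(actual, forecasts, metric=rmse):
    """Return (model name, score) pairs sorted from lowest to highest error."""
    scores = {name: metric(actual, pred) for name, pred in forecasts.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical observed values and forecasts (illustrative only).
actual = [100.0, 110.0, 120.0, 130.0]
forecasts = {
    "model_a": [102.0, 108.0, 121.0, 129.0],
    "model_b": [90.0, 100.0, 110.0, 120.0],
    "model_c": [100.0, 115.0, 115.0, 135.0],
}

ranking = rank_models(actual, forecasts)
for name, score in ranking:
    print(f"{name}: RMSE = {score:.3f}")
```

Passing `metric=mae` or `metric=mape` to `rank_models` ranks the same forecasts under a different error measure. The rankings need not agree across measures, so the choice of measure is worth reporting alongside the ranking.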

Similar Theses

Thesis No: 579904
Title (English): A dynamic risk assessment methodology (Dy-RAM) in port waters
Title (Turkish): Liman sularında dinamik risk değerlendirme (Dy-RAM) metodolojisi
Author: ÜLKÜ ÖZTÜRK
Thesis Type: Doctoral
Language: English
Year: 2019
Subject: Maritime
University: Istanbul Technical University
Department: Department of Maritime Transportation Engineering
Advisor: Asst. Prof. Dr. KADİR ÇİÇEK

Thesis No: 744738
Title (Turkish): Veri madenciliğinde lojistik regresyon modellerinin incelenmesi
Title (English): Investigation of logistics regression models in data mining
Author: RECEP ÖZSÜRÜNÇ
Thesis Type: Doctoral
Language: Turkish
Year: 2022
Subject: Business Administration
University: Istanbul University
Department: Department of Business Administration
Advisor: Prof. Dr. ÇİĞDEM ARICIGİL ÇİLAN

Thesis No: 620567
Title (English): A new rna-seq data classifier based on quantile transformation
Title (Turkish): Kuantil transformasyon tabanlı yeni bir rna-sekans veri sınıflandırıcısı
Author: NECLA KOÇHAN
Thesis Type: Doctoral
Language: English
Year: 2020
Subject: Biostatistics
University: Izmir University of Economics
Department: Department of Mathematics
Advisor: Prof. Dr. GÖZDE YAZGI TÜTÜNCÜ AŞÇI

Thesis No: 913729
Title (Turkish): Bilgisayarda bireyselleştirilmiş test uygulamalarında kapsam dengelemenin ölçme kesinliğine etkisi
Title (English): Effects of content balancing on measurement precision in computerized adaptive tests
Author: İLKAY ÜÇGÜL ÖCAL
Thesis Type: Doctoral
Language: Turkish
Year: 2024
Subject: Education and Training
University: Hacettepe University
Department: Department of Educational Sciences
Advisor: Prof. Dr. NURİ DOĞAN

Thesis No: 800569
Title (Turkish): Yeni zelanda GPS zaman serileri verisinin bayesci istatistik ile incelenmesi
Title (English): Investigation of the New Zealand time series data with bayesian statistics
Author: KUBİLAY ÖZCAN
Thesis Type: Master's
Language: Turkish
Year: 2023
Subject: Geological Engineering
University: Istanbul Technical University
Department: Department of Geological Engineering
Advisors: Prof. Dr. GÜRSEL SUNAL, Prof. Dr. MEHMET SİNAN ÖZEREN
