Sınıflandırma için makine öğrenim yöntemlerinin performans değerlendirmesi: Futbol verisi uygulaması

Performance evaluation of machine learning methods for classification: An application of football data


Thesis No: 831884
Author: DUYGU TOPCU
Advisors: ASSOC. PROF. DR. ÖZGÜL VUPA ÇİLENGİROĞLU
Thesis Type: Master's
Subjects: Statistics
Keywords: Not specified.
Year: 2023
Language: Turkish
University: Dokuz Eylül Üniversitesi
Institute: Graduate School of Natural and Applied Sciences
Department: Department of Statistics
Program: Data Science
Number of Pages: 76

Abstract

Football is one of the most popular sports both worldwide and in Turkey. This popularity is reflected in information technologies, and advances in data science have made match statistics easy to collect and analyze. The aspect of a football match that draws the most interest is usually its result, which is affected by many factors (number of goals scored, number of cards received by a team, weather conditions, playing away, etc.). In this study, two data sets obtained from matches played in the Turkish Football Federation Super League during the 2018-2019, 2019-2020 and 2020-2021 seasons were analyzed with classification and decision tree methods. The first data set covers two seasons. Its independent variables, converted to a categorical format, include the red or yellow cards received by the home and away teams, the number of foreign players in the teams and the number of goals scored. Based on these variables, whether the home team won or lost was modeled with Logistic Regression and Decision Tree (CART, QUEST and CHAID) algorithms. The second data set covers all three seasons. Its variables include the home team's tackle success rate, the away team's shots-on-target percentage, the accurate pass percentages of the home and away teams, whether the away team received a red card, the number of foreign players in the home team, the offensive power of the home team and the defensive power of the away team. With these variables, whether the home team won or lost was modeled with Logistic Regression, Decision Tree (CART, QUEST and CHAID) and Random Forest algorithms. Within the scope of the study, six different models were built for the first data set and five for the second. The best models were identified by comparing the accuracy percentages, sensitivities, specificities and F-scores of the models.
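
The four comparison measures named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, not the thesis's own code: it assumes the positive class is "home team wins", that the third measure is specificity (the true negative rate), and that the F-score is the balanced F1 variant. The labels in the example are made up for illustration, and the names `ConfusionCounts`, `confusion_counts` and `classification_metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # predicted win, actual win
    fp: int  # predicted win, actual loss
    tn: int  # predicted loss, actual loss
    fn: int  # predicted loss, actual win


def confusion_counts(y_true, y_pred, positive=1):
    """Tally the four confusion-matrix cells for a binary outcome."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent.
    return numerator / denominator if denominator else 0.0


def classification_metrics(c):
    """Accuracy, sensitivity, specificity and F1 from confusion counts."""
    total = c.tp + c.fp + c.tn + c.fn
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # true positive rate (recall)
    specificity = _ratio(c.tn, c.tn + c.fp)  # true negative rate
    precision = _ratio(c.tp, c.tp + c.fp)
    f_score = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
    }


# Illustrative labels only (1 = home win, 0 = home loss); not thesis data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Computing the same dictionary for each fitted model on a common held-out set would give a side-by-side comparison of the kind the abstract describes. Which measure decides the "best" model is a separate choice that the abstract does not specify.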

