Makine öğrenmesi yöntemleri ile yangın verilerinin analizi ve sınıflandırılması

Analysis and classification of fire data using machine learning methods


Thesis No: 962554
Author: ZEYNEP NAZLI ASLAN
Advisor: Assoc. Prof. Dr. BEYTULLAH EREN
Thesis Type: Master's
Subjects: Emergency and First Aid
Keywords: Not specified.
Year: 2025
Language: Turkish
University: Sakarya Üniversitesi
Institute: Institute of Natural Sciences
Department: Department of Disaster Management
Science Branch: Not specified.
Page Count: 113

Abstract

Fires are disasters that can arise from both natural processes and human activities, and that cause serious loss of life and property across a wide geography, from residential settlements to industrial areas. Growing urbanization, climate change, and industrialization increase fire risk, which creates a need for data-driven decision support systems that go beyond traditional methods. The main aim of this study is to classify fire types by applying machine learning algorithms to fire data and to compare the performance of the resulting models. The study uses 29,135 fire records obtained from the local fire department records of one province for the years 2015–2023. The data were preprocessed, and the most influential features were then identified with the Chi-Square test. Three machine learning algorithms were used for classification: Decision Tree (Fine Tree), Gaussian Naive Bayes (GNB), and Linear Support Vector Machines (Linear SVM). All models were tested with 5-fold cross-validation and evaluated with performance measures such as accuracy, confusion matrix, F1-score, and the ROC curve (AUC). The highest accuracy, 61.4%, was obtained with the Linear SVM model, followed by the Decision Tree at 61.2% and the GNB model at 58.5%. In terms of AUC, the Linear SVM model also stood out as the model with the highest discriminative power, at 0.7511. The Decision Tree model offered an advantage to decision-makers through its interpretability, while GNB was notable for its low computational cost. The models were observed to struggle with the “Complete” fire class, which was attributed to class imbalance.
The study contributes to the literature by presenting a broad classification based on real field data that covers urban and rural fires together; by comparing different machine learning algorithms on the same dataset to identify the most suitable model; and by concretely demonstrating the effect of feature selection on model performance. In conclusion, with its machine learning based approach to fire prediction, this thesis offers a scientific basis for the intervention strategies of local governments and supports, in particular, resource planning, risk management, and the development of early warning systems. For future work, it is recommended to improve the generalizability of the model using datasets that are supported with data from different geographic regions and that have balanced classes.

Abstract (Translation)

Fires represent one of the most catastrophic types of disasters that can have profound and far-reaching consequences for human life, infrastructure, public health, and the natural environment. As societies continue to evolve under the pressures of climate change, increasing population density, urban sprawl, and accelerating industrialization, fire-related incidents have not only become more frequent, but also more destructive, unpredictable, and complex in nature. The interconnection between anthropogenic activities and environmental changes has exacerbated fire risks, thereby demanding a paradigm shift from conventional fire response mechanisms to more proactive, data-driven, and intelligent systems for prevention, early detection, and rapid mitigation. In traditional fire management systems, decision-making is predominantly reactive, relying heavily on human expertise, past experiences, and manual assessments of risk. While such approaches have served well in the past, they are no longer sufficient in today's data-rich and dynamically evolving environments. The sheer volume, variety, and velocity of fire-related data necessitate more sophisticated analytical tools capable of processing complex datasets, extracting hidden patterns, and making timely, reliable predictions. In this context, machine learning (ML)—a subfield of artificial intelligence that focuses on algorithms capable of learning from data and improving over time—has emerged as a powerful solution for enhancing fire risk analysis and emergency response planning. This study seeks to develop and evaluate a machine learning-based classification framework that can accurately analyze and predict the severity of fire incidents using real-world fire occurrence data. 
Specifically, the research is centered on building supervised classification models that categorize fire events into three severity-based levels: “Initial,” “Partial,” and “Complete.” The dataset employed in this research consists of 29,135 individual fire incident records obtained from a municipal fire department in a province of Turkey, covering a comprehensive nine-year period from 2015 to 2023. The dataset is rich in features and includes spatial variables (e.g., district, neighborhood), temporal variables (e.g., date, time of day, season), causal attributes (e.g., electrical faults, human negligence, unknown causes), and impact-related metrics (e.g., damage extent, number of casualties, distance to nearest fire station). The methodological approach adopted in this study followed a structured, multi-phase process, beginning with extensive data preprocessing. This stage involved identifying and addressing missing or anomalous values, converting categorical variables into numerical representations through encoding techniques, and applying normalization where appropriate to standardize feature scales. Following preprocessing, a feature selection phase was conducted using the Chi-Square statistical test. This method was employed to determine the degree of association between independent variables and the categorical target variable (fire severity level), thereby enabling the exclusion of less relevant or redundant features and improving both model interpretability and computational efficiency. Subsequently, the study implemented three widely recognized supervised learning algorithms to perform the classification task: Decision Tree (specifically, the Fine Tree variant), Gaussian Naive Bayes (GNB), and Linear Support Vector Machines (SVM). Each of these models offers distinct advantages and challenges in terms of computational complexity, interpretability, and classification performance.
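As an illustration of the Chi-Square feature selection step described above, the following minimal sketch computes the Pearson chi-square statistic between each categorical feature and the severity label and ranks the features by it. The abstract does not state which software was used, so Python is an assumption made for illustration only. The function names, the feature names `cause` and `season`, and the six toy records are all hypothetical and are not taken from the thesis data.

```python
from collections import Counter


def chi_square_statistic(feature_values, labels):
    """Pearson chi-square statistic for two categorical sequences."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))  # cell counts
    feature_totals = Counter(feature_values)         # row totals
    label_totals = Counter(labels)                   # column totals
    stat = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            expected = f_count * c_count / n  # count expected under independence
            stat += (observed[(f, c)] - expected) ** 2 / expected
    return stat


def rank_features(columns, labels):
    """Return (feature name, statistic) pairs, strongest association first."""
    scores = {name: chi_square_statistic(values, labels)
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records, invented for this sketch.
labels = ["Initial", "Initial", "Partial", "Partial", "Complete", "Complete"]
columns = {
    "cause": ["electrical", "electrical", "negligence",
              "negligence", "unknown", "unknown"],
    "season": ["summer", "winter", "summer", "winter", "summer", "winter"],
}

ranking = rank_features(columns, labels)
for name, score in ranking:
    print(f"{name}: chi-square = {score:.2f}")
```

In this toy data `cause` determines the label exactly while `season` is independent of it, so `cause` ranks first. A real selection step would also compare each statistic against a chi-square distribution with the appropriate degrees of freedom before discarding a feature.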
To ensure the robustness and generalizability of the results, a 5-fold cross-validation approach was employed during model training. This method involves dividing the dataset into five equally sized subsets and iteratively using four subsets for training and one for validation, thus minimizing the risk of overfitting and providing a more reliable assessment of each model's performance. The classification performance of the models was evaluated using a comprehensive set of metrics, including overall accuracy, precision, recall, F1-score, confusion matrix analysis, and the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve. Among the models tested, the Linear SVM classifier demonstrated the highest classification accuracy, achieving a rate of 61.4%, closely followed by the Decision Tree model at 61.2%. The GNB model, although slightly less accurate with a score of 58.5%, exhibited superior computational efficiency and fast inference times. The AUC analysis further reinforced the superior performance of the Linear SVM model, which attained an AUC value of 0.7511. Notably, the model was particularly effective at correctly identifying fire incidents within the “Partial” severity class, demonstrating strong discriminative power for this category. However, the performance of all models was less satisfactory when classifying “Complete” severity fires. This shortfall is likely attributable to an imbalanced class distribution in the dataset, where the number of “Complete” fire cases was significantly lower compared to the other categories. Additionally, overlapping feature characteristics between severity levels may have contributed to classification ambiguity. These limitations highlight the importance of addressing class imbalance through advanced sampling methods such as Synthetic Minority Over-sampling Technique (SMOTE) or by collecting more granular and representative data for underrepresented fire types.
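The evaluation procedure above can be sketched in a few lines of standard-library Python: one helper splits sample indices into five folds, and two others build a confusion matrix and derive per-class precision, recall, and F1 from it. This is an illustrative sketch, not the thesis's implementation. All function names are hypothetical, and the ten labels and predictions are invented to show how a reasonable overall accuracy can coexist with weak recall on a rare class.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i in range(k):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, folds[i]


def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    position = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[position[t]][position[p]] += 1
    return matrix


def per_class_scores(matrix):
    """Return a (precision, recall, F1) tuple for each class."""
    scores = []
    for i in range(len(matrix)):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                    # row total
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores


# Hypothetical labels and predictions; "Complete" is the minority class.
classes = ["Initial", "Partial", "Complete"]
y_true = ["Initial"] * 4 + ["Partial"] * 4 + ["Complete"] * 2
y_pred = ["Initial", "Initial", "Initial", "Partial",
          "Partial", "Partial", "Partial", "Initial",
          "Partial", "Complete"]

matrix = confusion_matrix(y_true, y_pred, classes)
accuracy = sum(matrix[i][i] for i in range(len(classes))) / len(y_true)
scores = per_class_scores(matrix)
splits = list(k_fold_indices(len(y_true), k=5))

print("accuracy:", accuracy)
for name, (precision, recall, f1) in zip(classes, scores):
    print(f"{name}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

The split here is a plain shuffled partition. With a strongly imbalanced target, a stratified split that preserves class proportions in every fold is a common alternative, although the abstract does not say whether stratification was used.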
Beyond numerical performance, the interpretability of the models was also a critical consideration in this study. The Decision Tree model, for instance, offered easily understandable, rule-based decision pathways that can be directly interpreted and utilized by emergency planners, fire department personnel, and municipal decision-makers. On the other hand, while the GNB model did not achieve the highest accuracy, its simplicity and computational speed make it a favorable choice for deployment in environments with limited processing capabilities or real-time operational requirements. This research offers several original contributions to the existing body of literature on fire risk prediction and machine learning applications in disaster management. First, the study introduces a large-scale, real-world fire classification framework that integrates both urban and rural fire data—an important yet often overlooked dimension in current fire-related studies, which typically focus on forest fires or specific building types. Second, by systematically comparing three machine learning algorithms under uniform conditions, the research provides valuable insights into model selection for similar use cases. Third, the application of the Chi-Square feature selection method effectively demonstrates how careful attribute selection can enhance model performance and reduce unnecessary computational overhead. In addition to model development and evaluation, the study delves into exploratory data analysis to uncover meaningful temporal and spatial patterns within the fire dataset. The findings indicate that fire incidents tend to peak during the summer and autumn seasons, with industrial areas and densely populated urban neighborhoods showing higher fire frequencies. Furthermore, human-related causes such as electrical failures, careless behavior, and unsafe disposal of flammable materials emerged as predominant factors in fire ignition. 
These insights align with national and international fire statistics and reinforce the need for more targeted fire prevention strategies that consider both behavioral and environmental risk factors. One of the key lessons drawn from this research is the importance of high-quality, comprehensive datasets in machine learning-driven fire management. The challenges encountered due to class imbalance and feature overlap underscore the necessity for future studies to incorporate more diverse and balanced data sources. Enhancing the dataset with external variables—such as meteorological conditions (e.g., temperature, humidity, wind speed), land use data, topographical information, and remote sensing imagery—could significantly improve the granularity and predictive accuracy of classification models. Ultimately, this thesis illustrates that machine learning offers a powerful and scalable approach for enhancing fire risk assessment and supporting data-informed decision-making in emergency management. By integrating these models into operational workflows, local governments, fire brigades, and public safety agencies can proactively allocate resources, anticipate periods and regions of elevated fire risk, and implement timely interventions to reduce casualties and property loss. The findings of this study not only contribute to the academic discourse on artificial intelligence applications in public safety but also offer practical pathways toward the development of intelligent, adaptive fire management systems tailored to the evolving needs of modern societies.

Similar Theses

Thesis No
455400
Süne ve kımıl zararlılarının ses işleme yöntemleri ile sınıflandırılması ve bir gömülü sistem gerçeklemesi
Classification of sunn pests using sound processing methods and an embedded system realization
BİLGİ GÖRKEM YAZGAÇ
Master's
Turkish
2017
Electrical and Electronics Engineering, İstanbul Teknik Üniversitesi
Department of Electronics and Communication Engineering
Assoc. Prof. Dr. MÜRVET KIRCI
Thesis No
555669
Artificial intelligence based detection schemes for secure wireless communication
Güvenli telsiz iletişimin sağlanmasına yönelik yapay zeka tabanlı sınıflandırma metotları
SELEN GEÇGEL
Master's
English
2019
Electrical and Electronics Engineering, İstanbul Teknik Üniversitesi
Department of Electronics and Communication Engineering
Prof. Dr. GÜNEŞ ZEYNEP KARABULUT KURT
Thesis No
921122
Makine öğrenmesi algoritmaları ile şifreli trafiğin sınıflandırılması
Classification of encrypted traffic with machine learning algorithms
NUR BETÜL DEMİREL
Master's
Turkish
2025
Computer Engineering and Computer Science and Control, Marmara Üniversitesi
Department of Computer Engineering
Asst. Prof. Dr. AYDIN ERDEN
Thesis No
910857
Üç eksenli robot kol tasarımı ve robot kolun kontrolü için EEG verilerinin makine öğrenmesi ile sınıflandırılması
Design of a 3-axis robot arm and classification of EEG data with machine learning to control the robot arm
KAAN ASLIM
Master's
Turkish
2024
Electrical and Electronics Engineering, Haliç Üniversitesi
Department of Mechanical Engineering
Asst. Prof. Dr. GÖKÇE AKGÜN
Thesis No
917311
Explainable deep learning classification of tree species with very high resolution VHRTreeSpecies dataset
Açıklanabilir derin öğrenme yöntemleri ile çok yüksek çözünürlüklü VHRTreeSpecies veri seti kullanılarak ağaç türlerinin sınıflandırılması
ŞULE NUR TOPGÜL
Master's
English
2025
Computer Engineering and Computer Science and Control, İstanbul Teknik Üniversitesi
Department of Communication Systems
Prof. Dr. ELİF SERTEL
