Video üzerinde derin öğrenme ile nesne sansürleme

Sensor processing on video with deep learning


Thesis No: 898318
Author: YERNIYAZ BAKHYTOV
Advisor: PROF. DR. CEMİL ÖZ
Thesis Type: Master's
Subjects: Computer Engineering and Computer Science and Control
Keywords: Not specified.
Year: 2024
Language: Turkish
University: Sakarya Üniversitesi
Institute: Fen Bilimleri Enstitüsü (Institute of Natural Sciences)
Department: Bilgisayar Mühendisliği Ana Bilim Dalı (Department of Computer Engineering)
Division: Bilgisayar Mühendisliği Bilim Dalı (Computer Engineering)
Page Count: 83

Abstract

According to the World Health Organization, the number of smokers has reached 1.3 billion, and more than 8 million of them have died from smoking-related diseases. These figures rise every year. To limit the harm caused by smoking and reduce its use, some countries have banned smoking in open areas. Even so, the number of smokers is not falling. The smoking habit usually starts early, at ages we would still call childhood, because children see cigarette use presented in an appealing way in their surroundings, on social media, and in the films and videos they watch. All of these factors encourage imitation: people start smoking and become addicted over time. The main aim of this study is to create safe and educational digital experiences by using computer vision and deep learning techniques to reduce the negative effect of today's internet videos on children and to improve the accuracy and reliability of recognizing smoking-related objects. Better analysis of media content with respect to smoking makes it possible to censor such scenes, which reduces the influence of videos that encourage smoking and supports anti-smoking campaigns. An important part of the study is the development and training of YOLOv5, YOLOv8 and YOLOv9 models for the cigarette detection task. YOLO is a popular algorithm for object detection tasks, known for its high accuracy and speed. In this study, a neural network was implemented that can detect cigarettes in images accurately and quickly, including under varied lighting conditions and backgrounds. Cigarette recognition accuracy is critical both for building reliable monitoring systems that detect cigarettes in real-world environments and for the fight against smoking. Two datasets were created for the study, one containing cigarettes and one containing ashtrays. The locations of the cigarettes and ashtrays in the datasets were annotated as bounding boxes on the Roboflow website.
The YOLOv5, YOLOv8 and YOLOv9 models were trained on the prepared datasets using hyperparameter optimization to improve their performance. The models were evaluated by precision, recall, Intersection over Union (IoU) and mean average precision (mAP). Hyperparameter optimization plays a critical role in reaching optimal performance with deep learning models, so various optimization techniques were used in the study to fine-tune the YOLO models. To obtain good results, the hyperparameters of the selected models were optimized with several optimizers, including SGD, Adam and AdamW. After optimization, the YOLOv9 model with the SGD optimizer reached higher performance than the other models. The tuning distribution plots showed the superiority of this combination in terms of accuracy and convergence speed, making it the preferred option for our applications. The models first trained with default parameters reached 89% for YOLOv5, 87% for YOLOv8 and 88% for YOLOv9. After hyperparameter optimization, the SGD optimizer proved the most effective and outperformed the other optimizers. YOLOv5 remained unchanged at 89%, while YOLOv8 and YOLOv9 rose to 93%. Although YOLOv8 and YOLOv9 have the same mAP(50) value after optimization with SGD, YOLOv9 outperformed YOLOv8 by 2% on the mAP(50-95) metric. An important aspect of our study is the inclusion of ashtray detection. An ashtray is an important indicator of smoking, and recognizing it significantly increases the accuracy of detecting smoking events. Approaches for detecting ashtrays were developed and successfully integrated into our models. With this approach the overall results improved and the recognition system became more stable and successful.
An algorithm for detecting cigarettes and ashtrays was developed and tested, and it reached an accuracy of 96%. By successfully recognizing cigarettes and ashtrays and minimizing the number of false positives, this algorithm outperformed the initial results of the trained models. The results of our study show that using the latest deep learning models together with hyperparameter optimization can significantly increase the efficiency and accuracy of cigarette and ashtray detection in images. The developed models and algorithms offer effective tools for monitoring, analyzing and preventing the harmful effects of tobacco on public health, and so will contribute significantly to the fight against tobacco use.
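The evaluation metrics named in the abstract can be made concrete with a small example. The following is a minimal Python sketch, not the thesis's evaluation code: it computes Intersection over Union for two boxes and derives precision and recall by greedily matching predictions to ground truth. The box format `(x1, y1, x2, y2)`, the function names, and the 0.5 IoU threshold are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Greedy one-to-one matching of predictions to ground-truth boxes.

    predictions: list of (confidence, box); ground_truth: list of boxes.
    The threshold of 0.5 is an illustrative assumption.
    """
    matched = set()
    true_positives = 0
    # Higher-confidence predictions get first pick of the ground-truth boxes.
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    false_positives = len(predictions) - true_positives
    false_negatives = len(ground_truth) - true_positives
    precision = true_positives / (true_positives + false_positives) if predictions else 0.0
    recall = true_positives / (true_positives + false_negatives) if ground_truth else 0.0
    return precision, recall
```

Calling `precision_recall` with one correct box, one spurious box, and two ground-truth objects yields one true positive, one false positive, and one false negative, which is the kind of count that the false-positive reduction mentioned above would change.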

Abstract (Translation)

According to the World Health Organization, the number of smokers reached 1.3 billion last year, and more than 8 million of them died from smoking-related diseases. These numbers grow every year, and several countries have responded by banning smoking in open areas. Even so, the number of smokers has not decreased. Most smokers start at a young age because they see people smoking around them and in films and videos on the Internet. The main goal of this study is therefore to create a safe and educational digital experience, using computer vision and deep learning techniques to reduce the negative impact of modern Internet videos on children and to improve the accuracy and reliability of smoking-related object recognition. This is intended to improve media content analysis and support anti-smoking campaigns. The rise of digital media and the Internet has presented new challenges in tobacco control. Children and adolescents are frequently exposed to content that glorifies smoking, which can influence their attitudes towards tobacco use. To reduce these effects, there is a critical need for technologies that can automatically detect and analyze smoking-related media content. The study therefore examines modern computer vision technologies that automate the recognition of smoking-related objects in images and videos. Solutions such as automatic content filtering and age-appropriate recommendations are being developed to address problems such as exposure to harmful video content and excessive screen time. An important aspect of the work is the development and training of YOLOv5, YOLOv8 and YOLOv9 models for the cigarette detection task. YOLO is a popular algorithm for object detection tasks, known for its high accuracy and speed.
YOLOv5 is an open-source object detection model that offers significant improvements in speed and accuracy over its predecessors. It is highly optimized for real-time detection, which makes it suitable for applications that require quick response times. Compared to previous versions, YOLOv8 is faster and more accurate, and it needs fewer parameters to achieve its performance. It uses novel techniques in feature extraction and bounding box regression to achieve higher accuracy in object detection tasks. YOLOv9, published in February 2024, is one of the latest versions of the YOLO algorithm. It is distinguished from previous versions by its use of the Generalized Efficient Layer Aggregation Network (GELAN) and Programmable Gradient Information (PGI). This project creates and tunes a neural network that can accurately and quickly detect cigarettes in images under various lighting conditions and backgrounds. Cigarette recognition accuracy is critical to creating reliable monitoring systems that can be used in real-world environments. We collected two datasets, one for the cigarette detection task and one for the ashtray detection task: 1200 images of people smoking and 368 images of ashtrays. Each image is annotated with bounding boxes that mark the locations of the objects of interest. The dataset covers a variety of scenes, lighting conditions and poses of people smoking, as well as backgrounds such as streets, cafes and houses, which makes it representative of different cigarette detection scenarios. The models were first trained using different sizes of YOLOv5, YOLOv8 and YOLOv9. In this comparison, the YOLOv5 models had higher accuracy than YOLOv8 and YOLOv9, with a highest result of 89%.
For YOLOv8, the highest accuracy was 87%, and for YOLOv9 it was 88%. We then trained the YOLOv5, YOLOv8 and YOLOv9 models on the prepared dataset using hyperparameter optimization to fine-tune their performance. Models were evaluated on accuracy, precision, recall and mAP score. Hyperparameter optimization plays a critical role in achieving optimal performance for deep learning models, so we tried various optimization techniques to fine-tune the YOLO models. The hyperparameters of the YOLOv5, YOLOv8 and YOLOv9 models were optimized using optimizers such as SGD, Adam and AdamW. The SGD optimizer with the YOLOv9 model gave the best results. The tuning distribution plots demonstrated the superiority of this combination in terms of accuracy and convergence speed, making it the preferred choice for our applications. With default parameters, the first trained models achieved 89% (YOLOv5), 87% (YOLOv8) and 88% (YOLOv9). After hyperparameter optimization, SGD proved the most effective optimizer and outperformed the others: YOLOv5 remained unchanged at 89%, while YOLOv8 and YOLOv9 increased to 93%. Although YOLOv8 and YOLOv9 have the same mAP(50) value after optimization with SGD, YOLOv9 outperforms YOLOv8 by 2% in the mAP(50-95) metric. An important aspect of our study was the inclusion of ashtray detection. An ashtray is an important indicator of smoking, and recognizing it significantly increases the accuracy of detecting smoking incidents. We developed approaches to detecting ashtrays that were successfully integrated into our models, improving the overall results and making the system more versatile.
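The gap between mAP(50) and mAP(50-95) reported above follows from how the two metrics treat box tightness. The Python sketch below illustrates this under stated assumptions: it handles a single class, it takes each detection's IoU with its already-matched ground-truth box as given, and it uses all-point interpolation of the precision-recall curve, which is only one of several conventions. The function names and input format are hypothetical and are not taken from the thesis.

```python
def average_precision(detections, num_ground_truth, iou_threshold):
    """AP for one class at one IoU threshold.

    detections: list of (confidence, iou_with_matched_ground_truth).
    One-to-one matching is assumed to have been done beforehand.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    points = []  # (recall, precision) after each ranked detection
    for rank, (_, overlap) in enumerate(ranked, start=1):
        if overlap >= iou_threshold:
            true_positives += 1
        points.append((true_positives / num_ground_truth, true_positives / rank))
    # Precision envelope: each point takes the best precision to its right.
    for i in range(len(points) - 2, -1, -1):
        points[i] = (points[i][0], max(points[i][1], points[i + 1][1]))
    # Area under the enveloped precision-recall curve.
    ap, previous_recall = 0.0, 0.0
    for recall, precision in points:
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def ap_50_95(detections, num_ground_truth):
    """Mean of AP over IoU thresholds 0.50, 0.55, ..., 0.95."""
    thresholds = [(50 + 5 * i) / 100 for i in range(10)]
    total = sum(average_precision(detections, num_ground_truth, t) for t in thresholds)
    return total / len(thresholds)
```

With the made-up detections `[(0.9, 0.92), (0.8, 0.60), (0.7, 0.30)]` and two ground-truth objects, the AP at an IoU threshold of 0.5 is perfect, while the average over 0.50 to 0.95 is lower because the loosely fitting second box stops counting as a hit at stricter thresholds. Two models can therefore tie on mAP(50) and still differ on mAP(50-95).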
In addition, a special algorithm for detecting cigarettes and ashtrays was developed and tested, and it reached an accuracy of 96%. The algorithm successfully recognized cigarettes and ashtrays while minimizing false positives, and it outperformed the initial results of the YOLOv5, YOLOv8 and YOLOv9 models. The results of our study show that state-of-the-art deep learning models combined with hyperparameter optimization can significantly increase the efficiency and accuracy of cigarette and ashtray detection in images. The developed models and custom algorithm provide effective tools for monitoring, analyzing and preventing the harmful effects of tobacco on public health, and so contribute to the fight against tobacco use. Integrating these technologies into digital media platforms can play a crucial role in reducing the exposure of vulnerable populations, particularly children, to smoking-related content. The promising results of this study pave the way for further research and development in this field, with the potential to make a significant impact on public health and safety.
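The abstract does not describe how detected objects are censored in a frame, so the following Python sketch only shows one common possibility: pixelating the area inside each detected bounding box. It assumes a grayscale frame stored as a list of rows, boxes given as `(x1, y1, x2, y2)` with exclusive upper bounds, and a confidence cutoff of 0.5. All names and defaults here are hypothetical.

```python
def pixelate_region(frame, box, block=4):
    """Return a copy of a grayscale frame with the box area pixelated.

    frame: list of rows, each a list of integer intensities.
    box: (x1, y1, x2, y2) in pixels, with x2 and y2 exclusive.
    """
    height, width = len(frame), len(frame[0])
    # Clip the box to the frame so out-of-range detections are safe.
    x1, y1 = max(0, box[0]), max(0, box[1])
    x2, y2 = min(width, box[2]), min(height, box[3])
    out = [row[:] for row in frame]
    for block_y in range(y1, y2, block):
        for block_x in range(x1, x2, block):
            ys = range(block_y, min(block_y + block, y2))
            xs = range(block_x, min(block_x + block, x2))
            values = [frame[y][x] for y in ys for x in xs]
            mean = sum(values) // len(values)
            # Every pixel in the block is replaced by the block mean.
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


def censor_frame(frame, detections, min_confidence=0.5, block=4):
    """Pixelate every detection whose confidence passes the cutoff.

    detections: list of (confidence, box) pairs, e.g. from a detector.
    """
    out = frame
    for confidence, box in detections:
        if confidence >= min_confidence:
            out = pixelate_region(out, box, block)
    return out
```

For video, a function like `censor_frame` would be applied to each decoded frame using that frame's detections. A production system would more likely operate on color arrays through an image library, but the per-box logic stays the same.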

Similar Theses

Thesis No: 805747
Title: Zararlı video içeriklerinin derin öğrenme teknikleri ile tespiti ve filtrelenmesi için bir yazılım aracı geliştirilmesi
Title (English): Development of a software tool for detecting and filtering harmful video content with deep learning techniques
Author: FATMA GÜLŞAH TAN
Thesis Type: Doctorate
Language: Turkish
Year: 2023
Subject: Computer Engineering and Computer Science and Control
University: Süleyman Demirel Üniversitesi
Department: Bilgisayar Mühendisliği Ana Bilim Dalı (Department of Computer Engineering)
Advisor: DOÇ. DR. ASIM SİNAN YÜKSEL

Thesis No: 684409
Title: Derin öğrenme ile İHA görüntülerinden nesne tespitinin yapılması
Title (English): Object detection from UAV images with deep learning
Author: EMİR ALBAYRAK
Thesis Type: Master's
Language: Turkish
Year: 2021
Subject: Computer Engineering and Computer Science and Control
University: Bilecik Şeyh Edebali Üniversitesi
Department: Bilgisayar Mühendisliği Ana Bilim Dalı (Department of Computer Engineering)
Advisor: PROF. DR. UĞUR YÜZGEÇ

Thesis No: 778019
Title: Sensör füzyonuna dayalı derin öğrenme yöntemleri ile nesne tanıma başarısının artırılması
Title (English): Increasing object detection success with deep learning methods based on sensor fusion
Author: AHMET ÖZCAN
Thesis Type: Doctorate
Language: Turkish
Year: 2023
Subject: Computer Engineering and Computer Science and Control
University: Milli Savunma Üniversitesi
Department: Bilgisayar Mühendisliği Ana Bilim Dalı (Department of Computer Engineering)
Advisor: DR. ÖĞR. ÜYESİ ÖMER ÇETİN

Thesis No: 759996
Title: Derin öğrenme ile cerrahi video anlama
Title (English): Surgical video understanding with deep learning
Author: ABDISHAKOUR ABDILLAHI AWALE
Thesis Type: Master's
Language: English
Year: 2022
Subject: Computer Engineering and Computer Science and Control
University: Gazi Üniversitesi
Department: Bilişim Sistemleri Ana Bilim Dalı (Department of Information Systems)
Advisor: DR. ÖĞR. ÜYESİ DUYGU SARIKAYA

Thesis No: 784480
Title: İnsansız hava aracından çekilen videolar kullanılarak derin öğrenme yaklaşımı ile nesne tespiti
Title (English): Object detection by deep learning approach using videos taken from unmanned aerial vehicle
Author: AYŞAN USTA
Thesis Type: Master's
Language: Turkish
Year: 2022
Subject: Electrical and Electronics Engineering
University: Dicle Üniversitesi
Department: Elektrik ve Elektronik Mühendisliği Ana Bilim Dalı (Department of Electrical and Electronics Engineering)
Advisor: DR. ÖĞR. ÜYESİ MUHAMMET ALİ ARSERİM
