Derin öğrenme algoritmaları ile personel geri bildirimlerinin sınıflandırılması ve analizi
Classification and analysis of employee feedback with deep learning algorithms
- Thesis No: 947073
- Advisor: ASST. PROF. DR. SERAP ÇAKAR KAMAN
- Thesis Type: Master's
- Subjects: Computer Engineering and Computer Science and Control
- Keywords: Not specified.
- Year: 2025
- Language: Turkish
- University: Sakarya Üniversitesi
- Institute: Institute of Natural Sciences
- Department: Department of Computer Engineering
- Division: Computer Engineering
- Number of Pages: 61
Abstract
This study aims to systematically evaluate employee satisfaction and motivation, which play a critical role in corporate sustainability and productivity, using AI-supported analysis methods. Manual analysis of feedback becomes inefficient in time and resources, particularly for large datasets, and this disrupts institutions' strategic decision-making. Starting from this point, the study sets out to classify employee feedback using NLP and deep learning architectures. The deep learning models used are Temporal Convolutional Networks (TCN), Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and BERT, which offers bidirectional contextual understanding; their text classification performance was analyzed comparatively. The dataset is a balanced, labeled set of 7,000 sentences, consisting of 386 original Turkish feedback sentences collected through meetings, surveys, and one-on-one interviews at a private company in Turkey, plus 6,614 synthetic examples generated from them. The data were labeled into 14 categories, followed by preprocessing steps such as data cleaning, tokenization, numericalization, padding, and encoding. Model training was structured with 5-fold cross-validation, and evaluation used performance metrics such as accuracy, loss, precision, recall, and F1 score. According to the results, the CNN model showed the highest classification performance, with 96.40% accuracy and a 96.41% F1 score. The BERT model ranked second with 94.91% accuracy and produced strong results thanks to its contextual understanding. However, BERT's training time (245.56 s) and computational cost are higher than those of the other models.
The TCN model performed satisfactorily with 94.36% accuracy and stood out for the shortest training time (43.51 s). LSTM showed comparatively lower performance, with 93.70% accuracy. These findings indicate that CNN is the most effective model for classifying employee feedback, and that BERT can improve classification quality through its contextual depth. Beyond offering a model comparison, the study also demonstrates the importance of text classification applications on a Turkish corporate dataset. The findings support the effectiveness of data-driven approaches in areas such as human resources management, organizational development, and internal communication. In addition, the effect of data preprocessing steps on text classification was evaluated in detail, and each model's architecture and advantages were examined in depth. Suggestions for future work include reaching higher accuracy through hybrid use of CNN and BERT architectures, reducing training time with model compression techniques, testing the generalizability of the models with data from different sectors, and integrating advanced NLP tasks such as sentiment analysis and topic modeling. In this context, the study offers an effective and sustainable method for making sense of employee data at corporate scale.
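The preprocessing steps named in the abstract (cleaning, tokenization, numericalization, padding) can be illustrated with a minimal Python sketch. This is not the thesis's code: the function names, the reserved `<pad>` and `<unk>` ids, the regex-based cleaning rule, and the two sample sentences are all assumptions made for illustration.

```python
import re
from collections import Counter

PAD, UNK = 0, 1  # reserved ids (an assumed convention, not taken from the thesis)


def clean(text):
    """Lowercase, replace punctuation and digits with spaces, collapse whitespace."""
    # Note: str.lower() is locale-independent, so the Turkish dotted/dotless I
    # is not handled specially here; a real pipeline may need a custom rule.
    text = text.lower()
    text = re.sub(r"[^\w\s]|\d", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Whitespace tokenization of the cleaned text."""
    return clean(text).split()


def build_vocab(sentences, min_freq=1):
    """Map each token to an integer id, most frequent tokens first."""
    counts = Counter(tok for s in sentences for tok in tokenize(s))
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for tok, n in counts.most_common():
        if n >= min_freq:
            vocab[tok] = len(vocab)
    return vocab


def encode(sentence, vocab, max_len):
    """Convert a sentence to a fixed-length id sequence (truncate, then right-pad)."""
    ids = [vocab.get(tok, UNK) for tok in tokenize(sentence)][:max_len]
    return ids + [PAD] * (max_len - len(ids))


# Hypothetical sample sentences, invented for this sketch.
SAMPLES = ["Maaş artışı yetersiz!", "Maaş geç yatıyor."]
VOCAB = build_vocab(SAMPLES)
print(encode("Maaş 2 kez gecikti", VOCAB, max_len=5))
```

Tokens that never appeared when the vocabulary was built map to the `<unk>` id, and every sequence comes out with the same length, which is what a CNN, LSTM, or TCN input layer expects. BERT would normally use its own subword tokenizer instead of a vocabulary built this way.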
Abstract (Translation)
This study aims to analyze employee feedback automatically using AI-supported deep learning methods, motivated by the critical influence of employee satisfaction and motivation on organizational success. Traditional analysis methods fall short when faced with large datasets, which has highlighted the need for new, scalable approaches. In this context, employee feedback is categorized with text classification algorithms, producing insights that contribute to managerial decision-making. This facilitates the development of data-driven strategies and supports institutions in achieving their sustainability and efficiency goals. The study focuses specifically on Turkish-language employee feedback in order to make systematic, data-oriented contributions to organizational decision-making. Through the AI-based text classification approaches developed in the research, employee opinions are transformed into meaningful insights and strategic information that support executive-level guidance. Machine learning and deep learning techniques are used to analyze large volumes of unstructured data and to integrate the results into decision support systems. In doing so, the study supports a digital transformation process aligned with modern governance needs such as transparency and rapid decision-making. The study also goes beyond model accuracy: it evaluates processing time, resource consumption, and how well each model architecture fits the task. This multidimensional analysis gives a clearer picture of each model's advantages and disadvantages in practical applications. In contrast to most existing research, which focuses on customer reviews, this study seeks to fill a gap in the literature by concentrating directly on employee feedback. 
Real feedback data, anonymized and collected from a private company in Turkey, was supplemented with sentences generated synthetically via ChatGPT and categorized accordingly. Four deep learning models (TCN, CNN, LSTM, and BERT) were applied to these categories, and a comparative performance analysis was conducted. Additionally, the impact of NLP steps such as data cleaning and tokenization on model performance was evaluated. The thesis thus offers both a theoretical foundation and a practical AI-based model proposal. Recent academic studies in text classification aim to categorize data obtained from various sources into thematic classes. In studies involving Turkish content, customer reviews, news texts, social media posts, and product evaluations have commonly been used for objectives such as sentiment analysis, topic classification, and user demand interpretation. Accordingly, both traditional machine learning methods and advanced deep learning models have been analyzed in detail. Among the most frequently used algorithms in the literature are traditional machine learning techniques such as Naive Bayes, Decision Trees, Random Forest, CatBoost, XGBoost, Logistic Regression, and Support Vector Machines (SVM). Deep learning architectures such as LSTM, CNN, GRU, BERT, and TCN have also gained prominence. Word2Vec, TF-IDF, and FastText are commonly used for vector representation of texts, while techniques like SMOTE are applied to address data imbalance. Notably, BERT's contextual understanding and LSTM's success on time-series data have been shown to significantly improve accuracy rates. Some studies have also reported that hybrid models such as CNN-LSTM yield superior performance. The shared goal of these studies is to compare the accuracy, speed, and overall effectiveness of various classification algorithms, to identify the most efficient model structures, and to make text analytics faster, more accurate, and more automated. 
The results demonstrate that deep learning approaches outperform traditional methods, especially in processing large and complex datasets. This represents a significant advance in the processing and classification of Turkish texts for both academic and industry applications. In this context, modern deep learning models such as TCN, CNN, LSTM, and BERT are notable for their ability to provide tailored solutions for different data types and problem domains. Each model, with its own structural features and learning capabilities, achieves significant success in natural language processing tasks. The choice of model depends on factors such as data type, processing time, contextual analysis needs, and computational resources. TCN is a convolutional deep learning architecture designed to model sequential dependencies in time series data. Unlike traditional RNNs, TCN uses causal convolutions to process temporal information without recurrent feedback connections. Dilated convolutions enable learning of long-term dependencies, while residual connections accelerate training by improving information flow. Its parallelization capability and short training time make TCN an effective choice for large-scale data environments. While CNNs are primarily known for their performance on image data, they have also demonstrated success in text classification tasks. Using convolution and pooling layers, CNNs capture local patterns in text, which are then processed by fully connected layers for classification. ReLU activations let the model learn non-linear relationships, and a Softmax output layer converts the final scores into class probabilities. With high accuracy, fast training, and stable outputs, CNN stands out as a powerful tool in text analytics. LSTM is a recurrent neural network architecture well suited for sequential data due to its ability to learn long-term dependencies. 
Through forget, input, and output gates, it manages information flow effectively and can capture both short- and long-range relationships. LSTM is widely used in domains such as NLP, speech recognition, and financial forecasting. When paired with attention mechanisms, it demonstrates enhanced contextual interpretation in more complex tasks. BERT is a modern language model based on the Transformer architecture, capable of learning contextual representations through bidirectional processing. Pretrained using Masked Language Modeling (MLM) and Next Sentence Prediction (NSP), BERT uses self-attention mechanisms to analyze relationships between all tokens in a sentence. These features allow it to achieve high accuracy in tasks such as classification, sentiment analysis, and information extraction. However, its high computational cost and long training duration must be considered in practical applications. All four models were implemented in Python using the Google Colab environment. To ensure class balance, the real sentences in each category were supplemented with synthetic sentences. The models were trained using Stratified 5-Fold Cross Validation and compared across several performance metrics. While CNN delivered the highest performance across all criteria, BERT stood out with its contextual analysis capabilities. TCN was notable for its low computational cost, and LSTM lagged in both accuracy and training efficiency. In terms of comparative performance, CNN achieved the best results with 96.40% accuracy, an F1 score of 96.41%, and the lowest test loss at 14.74%. BERT followed closely with 94.91% accuracy and a 23.37% test loss but required a significantly longer training time of 245.56 seconds. TCN, with 94.36% accuracy and the fastest training time of 43.51 seconds, proved efficient but less accurate. 
LSTM, though structurally suited for sequential data, performed worst with 93.70% accuracy, a 38.92% test loss, and a relatively high training time of 98.84 seconds. In conclusion, CNN emerged as the most reliable model for text classification due to its high accuracy, speed, and generalization capacity. BERT showed comparable performance in contextual understanding but was hindered by high computational demands. TCN offered a fast and practical solution but lacked classification precision, while LSTM fell short in both resource efficiency and accuracy despite its strength in sequential data processing. Future research could explore hybrid models combining CNN's accuracy with BERT's contextual learning to further enhance classification performance. Model compression, transfer learning, and training optimization techniques may improve computational efficiency. Expanding datasets with feedback from various industries and integrating advanced NLP applications such as sentiment analysis and topic modeling could further amplify the strategic value of employee feedback in organizational decision-making.
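The evaluation protocol described in the abstract (stratified 5-fold cross-validation scored with metrics such as F1) can be sketched in plain Python as follows. This is an illustrative sketch, not the thesis's implementation: the function names are hypothetical, the F1 averaging mode is assumed to be macro because the abstract does not state it, and the label list assumes the 7,000 sentences are split evenly across the 14 categories (500 each), which the abstract implies with "balanced" but does not spell out.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds so each fold keeps similar class proportions."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class's samples round-robin across the folds.
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    return folds


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cross_validate(labels, train_and_predict, k=5, seed=0):
    """Run k-fold CV; train_and_predict(train_idx, test_idx) returns predicted labels."""
    folds = stratified_folds(labels, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        y_pred = train_and_predict(train_idx, test_idx)
        y_true = [labels[j] for j in test_idx]
        scores.append(macro_f1(y_true, y_pred))
    return scores


# Assumed layout: 14 categories x 500 sentences = 7,000 labels.
LABELS = [c for c in range(14) for _ in range(500)]
FOLDS = stratified_folds(LABELS, k=5)
print([len(f) for f in FOLDS])
```

Any of the four models could be plugged in as `train_and_predict`; averaging the returned scores gives a single cross-validated figure per model. Libraries such as scikit-learn provide equivalent, better-tested utilities, so this version is only meant to make the splitting and scoring logic explicit.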
Similar Theses
- Klasik makine öğrenme algoritmaları ve transformer modeli ile Türkçe tweet duygu analizi
Turkish tweet sentiment analysis with classical machine learning algorithms and transformer model
ASLI GÜRSOY
Master's
Turkish
2024
Computer Engineering and Computer Science and Control
Hasan Kalyoncu Üniversitesi
Department of Electronics and Computer Engineering
ASSOC. PROF. DR. ABDUL HAFIZ ABDULHAFIZ
- Yassı alüminyum üretiminde kalite sınıflarının makine öğrenmesi yöntemleri ile tahminlenmesi
Prediction of quality classification in flat rolled aluminium production using machine learning methods
ALPEREN AYTATLI
Doctorate
Turkish
2024
Industrial and Industrial Engineering
Sakarya Üniversitesi
Department of Industrial Engineering
ASSOC. PROF. DR. ALPER KİRAZ
- Bütünleme sınavına girecek öğrenci sayılarının aşırı öğrenme makinesi tabanlı yaklaşımlar ile tahmin edilmesi
Prediction of the number of students who will take the make-up exam by extreme learning machine-based approaches
EYÜP SIRAMKAYA
Doctorate
Turkish
2022
Computer Engineering and Computer Science and Control
Konya Teknik Üniversitesi
Department of Computer Engineering
PROF. DR. MUSTAFA SERVET KIRAN
- Yapay zeka yöntemleri ile el damar deseni tanıma
Hand vein pattern recognition with artificial intelligence methods
HASAN TUTUMLU
Master's
Turkish
2011
Computer Engineering and Computer Science and Control
Selçuk Üniversitesi
Department of Electrical and Electronics Engineering
PROF. DR. NOVRUZ ALLAHVERDİ
- The effect of visual and text interfaces in teaching robot programming
Robot programlama öğretiminde görsel ve metin arayüzlerin etkisi
BESİM BARANSEL BAĞCI
Master's
English
2017
Computer Engineering and Computer Science and Control
İstanbul Teknik Üniversitesi
Department of Computer Engineering
ASST. PROF. DR. GÖKHAN İNCE