Eğitimsel veri madenciliği ve bir uygulaması

Educational data mining and an application

  1. Thesis No: 510691
  2. Author: YASEMİN YAKUPOĞLU
  3. Advisors: ASSOC. PROF. DR. BAŞAR ÖZTAYŞİ
  4. Thesis Type: Master's
  5. Subjects: Industrial and Industrial Engineering
  6. Keywords: Not specified.
  7. Year: 2018
  8. Language: Turkish
  9. University: İstanbul Teknik Üniversitesi
  10. Institute: Fen Bilimleri Enstitüsü
  11. Department: Department of Industrial Engineering
  12. Discipline: Industrial Engineering
  13. Page Count: 137

Abstract

The use in education of the data mining approach, which aims to discover meaningful information by identifying patterns hidden in large data sets and which has applications in many fields, has been a topic of considerable interest in recent years. Advances in digital transformation and automation have made it possible to obtain and store data of greater volume and variety. At the same time, machine learning algorithms, developed to give progressively better results, have made it possible to analyze these data better. Analyses of this kind, carried out by applying data mining techniques, help the stakeholders of the relevant field make more accurate decisions and design processes. Many studies on applications of the data mining approach in education have been carried out, especially in the last 10 years. The aim of this thesis is to examine the studies and applications in educational data mining and learning analytics, fields at the intersection of computer science, education and statistics, and thereby to give ideas and point out areas for development to education stakeholders and education researchers who want to design a more effective education and learning process, offer students a personalized learning experience by knowing them better, and automate educational processes. To this end, the research areas of educational data mining and learning analytics are first discussed; learning management systems, online learning environments, massive open online courses, data sources and ways of obtaining data are introduced; and the studies in these fields are then reviewed through a literature survey, with example applications. In the following chapter, an application of educational data mining is carried out on the students of a private middle school.
Using the data recorded in the school's information management system over one academic year, the students' competency classes (a-, a, b, c and c+) in 5 main courses were predicted with the 4 algorithms most frequently used in educational data mining research, and the performance of the algorithms before and after data preprocessing was compared. The results showed that the preprocessing method of removing from the model the variables whose correlation with the predicted class was below 0.1 increased prediction performance. The performance of these data mining techniques, which achieved accuracy above 75% on relatively small data sets after preprocessing, and the results they could yield on much larger data sets and for different educational data mining purposes, are promising for future studies.

Abstract (Translation)

The use in education of the data mining approach, which aims to discover meaningful information by identifying hidden patterns in large data sets and has applications in many fields, has been a topic of interest in recent years. Progress in digitization and automation has allowed the acquisition and storage of increasingly large and diverse data. At the same time, machine learning algorithms, which have been developed to give progressively better results, have made it possible to analyze these data better. This type of analysis, conducted through the application of data mining techniques, helps stakeholders in the relevant field of study make better decisions and design processes. As for applications of the data mining approach in education, many studies have been carried out in the last 10 years. The aim of this thesis is to examine the studies and practices in educational data mining and learning analytics, which lie at the intersection of computer science, education and statistics, and thereby to give ideas and point out areas for development to education stakeholders and education researchers who want to design a more effective education and learning process, give students a personalized learning experience by getting to know them better, and automate their education processes. For this purpose, the research areas of educational data mining and learning analytics are discussed first, and learning management systems, online learning environments, massive open online courses, data sources and ways of obtaining data are introduced. Then, applications are illustrated with reference to studies in the literature on these fields. In the next section, an application of educational data mining is carried out using the data of the students of a private middle school.
The competency classes (a-, a, b, c and c+) of students in 5 main courses were predicted with the 4 algorithms most frequently used in educational data mining research, using data recorded in the school's information management system over one academic year, and the performance of the algorithms before and after data preprocessing was compared. The results show that the data preprocessing method of removing from the model the variables whose correlation with the predicted class is below 0.1 increases prediction performance. These data mining techniques achieve accuracy above 75% on relatively small data sets after data preprocessing; their expected performance on much larger data sets and for different educational data mining purposes is promising for further work. There are many applications of data analytics in education, such as predicting student performance; identifying students at risk of leaving school by analyzing student behavior; visualizing relationships and trends in the education and learning process through various reports; improving student performance by providing instant, intelligent feedback through intelligent learning systems; analyzing student activities and suggesting courses according to identified areas of interest; identifying student behaviors in gaming or community-based activities to help develop the student model; and grouping students through social network analysis. Data analytics studies in education are generally grouped under the headings of educational data mining and learning analytics.
Educational data mining is a discipline that aims to discover methods for using the increasingly large volumes of data that come from educational environments, and to use these methods to better understand students and their learning styles. Learning analytics is a discipline that focuses on collecting, analyzing and visualizing large amounts of data related to learning processes, using big data and data mining techniques to understand and improve learning effectiveness. The school where the application was carried out uses a web-based information management system that has managed academic processes since the 2014-2015 academic year and records the demographic and academic data of the students. In this application, developed specifically for the school, students, teachers, parents and administrative staff can enter and view information in the system with their own user accounts. The school emphasizes respect for individual differences, aims to offer learning opportunities to students according to their needs, and tries to support each student's learning process by determining an appropriate study plan. One of the practices carried out for this purpose is the manual assignment of students, by the teachers' decision, to the competency classes (a-, a, b, c, c+) defined for each course. In each academic year, the cognitive and behavioral attitudes of the students in the Mathematics, Turkish, Social Sciences, Science and English courses are observed by the teachers. Based on their achievement in these courses, students are assigned to a competency class for each course. This classification helps the subject teacher determine a student-specific approach, and it also serves as preliminary information for a new subject teacher who does not know the student and will work with the student in the next academic term.
The main purpose of the classification is to support the learning experience with strategies suited to the competency class to which the student belongs. This study investigates whether the competency classes to which students belong can be predicted from the data obtained from the information management system during an academic year. The data used for analysis were collected from three different sources: the information management system reporting module, an MS SQL database and the Sharepoint platform. All the data obtained were combined in a new database and prepared for analysis, starting with a total of 84 nominal and numerical variables. In the initial analysis, the performance of the algorithms was examined before any preprocessing was performed on the data set. In this analysis, the decision tree and SVM algorithms performed better than the other two algorithms. However, in the tree structures built by the decision tree algorithm, which had higher accuracy than the others, it was seen that at the lower levels of the hierarchy the algorithm creates too many small branches and therefore places very few samples in the classes at the final branches. Taking this into consideration, at the next stage the noisy and missing data in the data set were cleaned, and two different types of data preprocessing were carried out and their results observed: 1. Reducing the number of variables by removing the least explanatory variables according to the decision tree algorithm. 2. Using an attribute selection algorithm to reduce variance by removing from the model the variables whose correlation with the predicted class is below 0.1. After the least explanatory variables were removed from the model and the classification algorithms were reapplied, no significant improvement was observed in any algorithm except SVM.
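The second preprocessing step, dropping variables whose correlation with the predicted class is below 0.1, can be illustrated with a minimal sketch. The abstract does not say which correlation measure or tool was used, so the Pearson coefficient, the numeric encoding of the class, the use of the absolute value, and all variable names and data below are assumptions made only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no information about the class.
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.1):
    """Keep the names of columns whose |correlation| with target >= threshold."""
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Hypothetical toy data: competency classes encoded as 1..5 (an assumption).
target = [1, 2, 3, 4, 5]
columns = {
    "exam_score": [55, 60, 72, 80, 95],   # rises with the class
    "locker_number": [3, 1, 3, 1, 3],     # unrelated to the class
    "constant_flag": [1, 1, 1, 1, 1],     # no variation at all
}

kept = select_features(columns, target)
print(kept)
```

On this toy data only `exam_score` passes the 0.1 threshold; the other two columns have zero correlation with the encoded class and are dropped.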
When the feature selection algorithm was used to remove from the model the variables whose correlation with the predicted class was below 0.1, the performance of all classification algorithms increased. With this method, the nominal variables describing the student's demographic, family, health and behavioral characteristics were removed from the model, so that at least 90% of the remaining variables were numerical variables derived from the assessment tools. Two observations were common to all the analyses: 1. The prediction success obtained when students at the 5th, 6th and 7th grade levels are analyzed together is well below the success obtained when these grades are analyzed separately. Moreover, the kappa values in the collective analyses were statistically insignificant. 2. The course-based competency classes of 6th and 7th grade students are more predictable than those of 5th grade students. One suggested reason for this result is that the 5th grade is an adaptation period for students who transfer from the primary school system to the middle school system.
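Because the abstract reports both accuracy and kappa values, a short sketch of Cohen's kappa may help show why the two can disagree. This uses the standard definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance; the labels below are hypothetical and are not data from the thesis.

```python
from collections import Counter


def cohen_kappa(actual, predicted):
    """Cohen's kappa between true and predicted class labels."""
    n = len(actual)
    # Observed agreement: the share of samples predicted correctly (accuracy).
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    if expected == 1:
        # Only one class on both sides: kappa is undefined, reported here as 0.
        return 0.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels: one mistake out of six over three balanced classes.
balanced = cohen_kappa(
    ["a", "a", "b", "b", "c", "c"],
    ["a", "a", "b", "c", "c", "c"],
)

# Hypothetical labels: always predicting the majority class gives 75% accuracy
# but no agreement beyond chance.
majority = cohen_kappa(["a", "a", "a", "b"], ["a", "a", "a", "a"])

print(balanced, majority)
```

The second case shows how a classifier can reach 75% accuracy while its kappa is zero, which is one reason kappa is often reported alongside accuracy when classes are imbalanced.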

Similar Theses

  1. Bulut Tabanlı Çevrimiçi Öğrenme Ortamında Etkinlik Öneri Sistemi Tasarımı: Eğitimsel Veri Madenciliği Uygulaması

    Activity Suggestion System Design In Cloud Based Online Learning Environment: Educational Data Mining Application

    HAKAN KÖR

    Doctorate

    Turkish

    2017

    Computer Engineering and Computer Science and Control, Kırıkkale Üniversitesi

    Department of Computer Engineering

    PROF. DR. HASAN ERBAY

  2. ABİDE 2016 fen başarısının yordanmasında MARS ve BRT veri madenciliği yöntemlerinin karşılaştırılması

    Predicting the ABIDE 2016 science achievement: The comparison of MARS and BRT data mining methods

    HİKMET ŞEVGİN

    Doctorate

    Turkish

    2020

    Education and Training, Gazi Üniversitesi

    Department of Measurement and Evaluation in Education

    ASSOC. PROF. DR. EMİNE ÖNEN

  3. HLM ve YSA Yöntemlerinin PISA 2018 okuduğunu anlama becerilerini yordama düzeylerinin incelenmesi

    Investigation of prediction accuracy of HLM and ANN Methods on PISA 2018 reading literacy

    EDA AKDOĞDU YILDIZ

    Doctorate

    Turkish

    2022

    Education and Training, Hacettepe Üniversitesi

    Department of Educational Sciences

    ASSOC. PROF. DR. KÜBRA ATALAY KABASAKAL

  4. Eğitimde veri madenciliği ve öğrenci akademik başarı öngörüsüne ilişkin bir uygulama

    Educational data mining and an application related to prediction of student academic success

    ŞEBNEM ÖZDEMİR

    Doctorate

    Turkish

    2016

    Science and Technology, İstanbul Üniversitesi

    Department of Informatics

    PROF. DR. MEHMET ERDAL BALABAN

  5. Öğrencilerin akıllı tahtaya ilişkin tutumlarının incelenmesine yönelik bir veri madenciliği uygulaması

    Data mining applications in education

    CENGİZ HARK

    Master's

    Turkish

    2013

    Education and Training, Fırat Üniversitesi

    Department of Computer Education and Instructional Technologies

    ASSOC. PROF. DR. YALIN KILIÇ TÜREL