Kayıp veriler ve kayıp veriler için bir çoklu veri atama yöntemi: Propensity skor

Missing data and a multiple imputation method for missing data: Propensity score

Thesis No: 234789
Author: ELİF ÇİĞDEM KASPAR
Advisor: ASSOC. PROF. DR. DİLEK ALTAŞ
Thesis Type: Doctoral
Subjects: Econometrics
Keywords: Not specified.
Year: 2011
Language: Turkish
University: Marmara University
Institute: Institute of Social Sciences
Department: Econometrics
Division: Statistics
Page Count: 141

Abstract

Missing data are a common problem encountered in every application of statistics. Various remedies and imputation methods have been developed to address the missing data problem. The Propensity Score method has been an important method in observational studies over the last twenty years; its distinguishing feature is that it removes differences in the covariates between the two groups being compared, thereby reducing and even correcting bias. Building on this balancing property, the Propensity Score imputation method was developed to address the missing data problem. This study aims to examine the relative advantages of Propensity Score imputation methods and other imputation methods. To this end, the methods were applied to two different data sets. The first application used the agriculture, exports, gross national product, gross domestic product and industry variables of 80 countries with no missing observations, obtained from the World Bank for the year 2008; the second used a generated five-variable data set following the normal distribution. From these complete data sets, samples containing different numbers of missing observations were created by deleting various numbers of units at random and without replacement, and the Mean, Median, EM, Regression, Hot-Deck and Propensity Score imputation methods were applied to each incomplete data set. The effectiveness of these methods was assessed by examining the difference between the true data and the imputed data and by comparing the means and standard deviations of the imputed data sets with those of the complete data set. In addition, Box's M test was carried out to test whether the variability of the imputed data sets had changed relative to the complete data set.
In conclusion, Propensity Score imputation methods were found to give more consistent results than the other imputation methods in data sets containing a small number of missing values, while the relative advantages of the methods changed as the number of missing values increased.
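
The abstract does not spell out the imputation algorithm itself. As a rough illustration only, the sketch below follows one common formulation of propensity score imputation: model the probability that a value is missing, group units with similar scores, and fill each gap by resampling observed values from the same group (an approximate Bayesian bootstrap). Everything concrete here is an assumption made for the example and not taken from the thesis: the single covariate `x`, the sample size of 100, the 10 deleted values, the 5 groups, the hand-written gradient-ascent logistic fit, and all function names.

```python
import math
import random


def fit_logistic(x, r, steps=2000, lr=0.1):
    """Fit P(r = 1 | x) = sigmoid(a + b * x) by plain gradient ascent."""
    a = b = 0.0
    n = len(x)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for xi, ri in zip(x, r):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            grad_a += ri - p
            grad_b += (ri - p) * xi
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


def propensity_impute(x, y, n_groups, rng):
    """Return a copy of y in which every None is replaced by a donor value."""
    r = [1 if yi is None else 0 for yi in y]  # 1 = value is missing
    a, b = fit_logistic(x, r)
    score = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]

    all_observed = [yi for yi in y if yi is not None]
    order = sorted(range(len(y)), key=lambda i: score[i])
    size = math.ceil(len(y) / n_groups)
    result = list(y)
    for start in range(0, len(order), size):
        group = order[start:start + size]
        # Donors are the observed values in this score group; fall back to
        # all observed values if the group happens to have none.
        donors = [y[i] for i in group if y[i] is not None] or all_observed
        # Approximate Bayesian bootstrap: resample the donors first, then
        # draw each imputed value from the resampled pool.
        pool = rng.choices(donors, k=len(donors))
        for i in group:
            if y[i] is None:
                result[i] = rng.choice(pool)
    return result


rng = random.Random(0)  # arbitrary seed, for reproducibility only
x = [rng.gauss(0.0, 1.0) for _ in range(100)]
y_complete = [2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

# Delete 10 values at random, without replacement.
y = list(y_complete)
for i in rng.sample(range(len(y)), 10):
    y[i] = None

completed = propensity_impute(x, y, n_groups=5, rng=rng)
```

Calling `propensity_impute` several times with the same `rng` yields several different completed data sets, which is the sense in which such a scheme is a multiple imputation method.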

Abstract (Translation)

Missing data are a common problem in all applications of statistics. Various remedies and imputation methods have been developed to deal with the missing data problem. The Propensity Score is an important method that has been used in observational studies over the last twenty years to balance the covariates between two groups and thereby reduce or even correct bias. Using this balancing property, the Propensity Score imputation method was developed to handle the missing data problem. This study aims to compare the relative advantages of Propensity Score imputation methods and other imputation methods. For this purpose, the methods were applied to two different data sets. The first application used the agriculture, exports, gross national product, gross domestic product and industry variables of eighty countries with no missing observations, taken from World Bank data for 2008. The second application used a generated, normally distributed data set. Data sets with different numbers of missing values were created from these complete data sets by deleting various numbers of units at random and without replacement. The Mean, Median, EM, Regression, Hot-Deck and Propensity Score imputation methods were applied to each incomplete data set. The effectiveness of the imputation methods was evaluated by comparing the true values with the imputed values and by comparing the means and standard deviations of the complete data sets with those of the imputed data sets. Furthermore, Box's M test was applied to check for differences in variability between the imputed data sets and the complete data sets. As a result, Propensity Score imputation methods were found to give more consistent results than the other imputation methods in data sets with a small number of missing values.
In addition, the relative advantages of the imputation methods were found to change as the number of missing values increases.
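
The evaluation design described in the abstract (delete values from a complete data set, impute, then compare means and standard deviations with the complete data) can be sketched roughly as follows. This is a minimal illustration, not the thesis's procedure: it uses a single generated normal variable, only the Mean and Median methods, and assumed values for the sample size (80), the distribution parameters, the number of deleted values (8) and the seed. The helper names are hypothetical.

```python
import random
import statistics


def delete_at_random(column, n_missing, rng):
    """Return a copy of column with n_missing entries set to None.

    Positions are drawn without replacement, so exactly n_missing
    distinct entries are removed.
    """
    damaged = list(column)
    for i in rng.sample(range(len(column)), n_missing):
        damaged[i] = None
    return damaged


def impute(column, fill_value):
    """Replace every None with fill_value."""
    return [fill_value if x is None else x for x in column]


rng = random.Random(2011)  # arbitrary seed, for reproducibility only
complete = [rng.gauss(50.0, 10.0) for _ in range(80)]  # assumed: n = 80, N(50, 10^2)
damaged = delete_at_random(complete, n_missing=8, rng=rng)
observed = [x for x in damaged if x is not None]

imputed = {
    "mean": impute(damaged, statistics.mean(observed)),
    "median": impute(damaged, statistics.median(observed)),
}

# Compare each imputed data set with the complete data set.
print(f"complete: mean={statistics.mean(complete):.2f} sd={statistics.stdev(complete):.2f}")
for name, data in imputed.items():
    print(f"{name:8s}: mean={statistics.mean(data):.2f} sd={statistics.stdev(data):.2f}")
```

Mean imputation leaves the mean of the observed values unchanged but always shrinks the standard deviation relative to them, which is one reason a comparison of standard deviations, and not only of means, is informative.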

Similar Theses

Thesis No: 809903
Title: Methods for handling missing data for observational studies with repeated measurements
Title (translation): Tekrarlayan ölçümlü gözlemsel araştırmalarda kayıp veri ile baş etme yöntemleri
Author: OYA KALAYCIOĞLU
Thesis Type: Doctoral
Language: English
Year: 2015
Subject: Biostatistics
University: University of London - University College London
Department: Biostatistics
Advisor: PROF. DR. RUMANA OMAR

Thesis No: 314412
Title: Çoklu atama yöntemlerinin Rasch modelleri için performansının benzetim çalışması ile incelenmesi
Title (translation): Assessing the performance of multiple imputation techniques for Rasch models with a simulation study
Author: BEYZA DOĞANAY ERDOĞAN
Thesis Type: Doctoral
Language: Turkish
Year: 2012
Subject: Biostatistics
University: Ankara University
Department: Biostatistics
Advisor: PROF. DR. ATİLLA HALİL ELHAN

Thesis No: 342477
Title: Kayıp verilerin varlığında iki kategorili puanlanan maddelerden oluşan testlerin psikometrik özelliklerinin incelenmesi
Title (translation): Psychometric properties of tests composed of dichotomous items in the presence of missing data
Author: ERGÜL DEMİR
Thesis Type: Doctoral
Language: Turkish
Year: 2013
Subject: Education and Training
University: Ankara University
Department: Measurement and Evaluation
Advisor: PROF. DR. NİZAMETTİN KOÇ

Thesis No: 456676
Title: Kayıp veriyle baş etme yöntemlerinin madde tepki kuramı bir parametreli lojistik modelinde model veri uyumuna ve standart hataya etkisi
Title (translation): The effect of missing data techniques in one parameter logistic model of item response theory on model fit and standard error
Author: DUYGU KOÇAK
Thesis Type: Doctoral
Language: Turkish
Year: 2016
Subject: Education and Training
University: Ankara University
Department: Measurement and Evaluation
Advisor: ASSOC. PROF. DR. ÖMAY ÇOKLUK BÖKEOĞLU

Thesis No: 484106
Title: Kayıp veri ile baş etme yöntemlerinin ölçme değişmezliğine etkisi açısından karşılaştırılması
Title (translation): Comparison of influence of the missing data handling methods on measurement invariance
Author: MEHMET ALİ IŞIKOĞLU
Thesis Type: Master's
Language: Turkish
Year: 2017
Subject: Education and Training
University: Hacettepe University
Department: Educational Sciences
Advisor: ASSOC. PROF. DR. BURCU ATAR
