
Genome-wide prediction of prokaryotic two-component system networks using a sequence-based meta-predictor

No title translation available.

  1. Thesis No: 402754
  2. Author: ALTAN KARA
  3. Supervisors: DR. NARCIS FERNANDEZ-FUENTES, DR. DAVID WHITWORTH
  4. Thesis Type: Doctorate
  5. Subjects: Biology, Bioengineering, Biotechnology
  6. Keywords: Not specified.
  7. Year: 2016
  8. Language: English
  9. University: Aberystwyth University / Prifysgol Aberystwyth
  10. Institute: Prifysgol Aberystwyth
  11. Department: Institute Abroad
  12. Discipline: Not specified.
  13. Number of Pages: 628

Abstract

No abstract available.

Abstract (Translation)

Two-component systems (TCSs) are signalling complexes composed of a histidine kinase (receptor) and a response regulator (effector). They are the most abundant signalling pathways in prokaryotes and control a wide range of biological processes. The pairing of these two components is highly specific, and the interactions between them are fast and transient. This makes their prediction challenging, especially when the interaction involves an orphan protein, that is, one whose encoding gene lies at least 200 bp away from any other TCS protein-coding gene. Determining TCS protein-protein interactions (PPIs) therefore often requires costly and time-consuming experimental characterisation. There is consequently considerable interest in developing accurate computational prediction tools that lessen the burden of experimental work, cope with the ever-increasing amount of genomic information available, and map TCS PPIs accurately even when an orphan TCS protein is involved. In this work, a novel meta-predictor, MetaPred2CS, was developed specifically to predict prokaryotic TCS PPIs using a support vector machine. MetaPred2CS integrates six sequence-based prediction methods of orthogonal nature: in-silico two-hybrid, mirrortree, gene fusion, phylogenetic profiling, gene neighbourhood and gene operon. These methods were selected based on their advantages and disadvantages and on the characteristics of TCS PPIs; the selection is described in more detail in Section 3.5.1. To benchmark MetaPred2CS, a novel training dataset of experimentally validated TCS protein pairs, composed of 113 positive (P+) and 1134 negative (P-) interaction pairs, was compiled for k-fold cross-validation to act as a gold standard dataset for TCS predictions. Creating this dataset was necessary because there is currently no database that provides experimentally verified information, especially regarding negative TCS PPI pairs.
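
The orphan criterion quoted in the abstract (an encoding gene at least 200 bp away from any other TCS gene) can be illustrated with a minimal Python sketch. The record layout, the gene names, the coordinates and the way the gap is measured (bases strictly between two genes on the same replicon) are assumptions made for illustration only; they are not taken from the thesis or from the MetaPred2CS source code.

```python
from dataclasses import dataclass

ORPHAN_MIN_GAP_BP = 200  # threshold quoted in the abstract


@dataclass(frozen=True)
class TcsGene:
    """A TCS gene with 1-based inclusive coordinates (hypothetical layout)."""
    name: str
    replicon: str
    start: int
    end: int


def gap_bp(a: TcsGene, b: TcsGene) -> int:
    """Bases strictly between two genes; 0 if they overlap or abut."""
    return max(0, max(a.start, b.start) - min(a.end, b.end) - 1)


def is_orphan(gene: TcsGene, tcs_genes, min_gap: int = ORPHAN_MIN_GAP_BP) -> bool:
    """True if every other TCS gene on the same replicon is >= min_gap bp away."""
    return all(
        gap_bp(gene, other) >= min_gap
        for other in tcs_genes
        if other is not gene and other.replicon == gene.replicon
    )


# Invented coordinates: hkA and rrA are adjacent, hkB sits far from both.
genes = [
    TcsGene("hkA", "chr", 1000, 2500),
    TcsGene("rrA", "chr", 2550, 3200),
    TcsGene("hkB", "chr", 10000, 11800),
]
orphans = [g.name for g in genes if is_orphan(g, genes)]
print(orphans)
```

In this toy input only hkB is reported as an orphan, because hkA and rrA are separated by fewer than 200 bp.
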
MetaPred2CS was also compared against the current state of the art, a Bayesian network (BN) based method and STRING, using AUC values. Combining individual predictors of different nature improved the overall prediction accuracy: MetaPred2CS performed better than the individual methods and outperformed the current state of the art. The AUC values obtained for MetaPred2CS, STRING and the BN-based method were 92.8, 88.4 and 83.5, respectively. Among the components of MetaPred2CS, the in-silico two-hybrid method contributed most to its performance (5.93%). Besides performing better than the current state of the art, MetaPred2CS is also the only available option that allows users to perform de novo predictions. This thesis argues that MetaPred2CS is also effective in predicting orphan TCS PPIs, which is one of the main challenges in the field. A publicly available web server was developed to interface with the method and was used for genome-wide TCS PPI predictions for E. coli K-12 MG1655, M. xanthus DK 1622, P. aeruginosa UCBPP-PA14 and E. amylovora ATCC 49946. Finally, forty novel predictions produced by MetaPred2CS for these organisms are evaluated in detail at the end of Chapter 5. The biological relevance of the components of these novel pairs suggests that some of these predictions might be valuable targets for researchers interested in understanding the life cycle of these organisms. The MetaPred2CS web server is available at http://metapred2cs.ibers.aber.ac.uk along with the newly created gold standard dataset (P+/P-) of TCS interaction pairs.
The source code for MetaPred2CS can be downloaded from https://github.com/martinjvickers/MetaPred2CS and is also available as an OVA file of a virtual machine with MetaPred2CS preinstalled at http://metapred2cs.ibers.aber.ac.uk/MetaPred2CS.ova.
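
The abstract compares predictors by AUC. As a rough illustration of what that number measures, the sketch below computes AUC as the probability that a randomly chosen positive pair receives a higher score than a randomly chosen negative pair (ties count as half). The function name and the labels and scores are invented for illustration; they are not the thesis data, and this is not necessarily how the thesis computed its values.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    # A positive scoring above a negative counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 interacting pairs (1) and 4 non-interacting pairs (0).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Here 11 of the 12 positive/negative comparisons are ranked correctly, so the toy AUC is 11/12. Multiplying by 100 gives the percentage scale on which the abstract reports its values.
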

Similar Theses

  1. Genom-boyu ilişki çalışmalarında poligenik risk skorunun makine öğrenimi ve derin öğrenme yöntemleri ile tahmin edilmesi

    Prediction of polygenic risk score by machine learning and deep learning methods in genome-wide association studies

    RAGIP ONUR ÖZTORNACI

    Doctorate

    Turkish

    2021

    Biostatistics, Mersin Üniversitesi

    Department of Biostatistics and Medical Informatics

    PROF. DR. BAHAR TAŞDELEN

    PROF. DR. CEMİL ÇOLAK

  2. Isı şoku protein genlerinin (HSP) bazı populus taksonlarında fonksiyonel genom analizi ve abiyotik stres koşullarında HSP genlerinin ifade seviyelerinin belirlenmesi

    Genome-wide survey of heat shock proteins (HSP) and expression analysis of HSP genes under abiotic stress conditions in some populus taxons

    ESRA NURTEN YER

    Doctorate

    Turkish

    2017

    Genetics, Kastamonu Üniversitesi

    Department of Forest Engineering

    PROF. DR. SEZGİN AYAN

    ASSOC. PROF. DR. MEHMET CENGİZ BALOĞLU

  3. Ağırlıklı çoklu sınıflandırıcı kullanarak biyolojik verilerin tahmini

    Prediction of biological data by using weighted ensemble classifiers

    TAYLAN İYİDOĞAN

    Master's

    Turkish

    2013

    Computer Engineering Sciences-Computer and Control, TOBB Ekonomi ve Teknoloji Üniversitesi

    Department of Computer Engineering

    ASST. PROF. DR. TANSEL ÖZYER

  4. İnsan gen yolaklarında ikâme modelleme ve makine öğrenmesi kullanarak varyant analizi

    Variant analysis in human gene networks using surrogate modelling and machine learning

    FURKAN AYDIN

    Master's

    Turkish

    2024

    Computer Engineering Sciences-Computer and Control, İstanbul Teknik Üniversitesi

    Department of Computer Science

    ASST. PROF. DR. SÜHA TUNA

  5. Machine learning methods for detecting genetic and infectious diseases

    Genetik ve enfeksiyon hastalıklarının tespiti için makine öğrenmesi yöntemleri

    YUNUS EMRE IŞIK

    Doctorate

    English

    2024

    Computer Engineering Sciences-Computer and Control, Abdullah Gül Üniversitesi

    Department of Electrical and Computer Engineering

    ASSOC. PROF. DR. ZAFER AYDIN