Micro-architectural support for improving synchronization and efficiency of SIMD execution on GPUs

Title translation not available.

  1. Thesis No: 401272
  2. Author: AYŞE YILMAZER
  3. Advisors: PROF. DAVID KAELI
  4. Thesis Type: Doctorate
  5. Subjects: Computer Engineering and Computer Science and Control
  6. Keywords: Not specified.
  7. Year: 2013
  8. Language: English
  9. University: Northeastern University
  10. Institute: Foreign Institute
  11. Department: Not specified.
  12. Division: Not specified.
  13. Page Count: 181

Abstract

No abstract.

Abstract (Translation)

GPUs dedicate a majority of their transistor budgets to compute units rather than control logic. As a result, they can achieve excellent data-parallel power/performance. Given the continual demand for performance and power efficiency, GPUs have become today's compute accelerators for many application domains. The general-purpose computing community has been focusing on developing strategies to move a broader class of applications to these powerful devices. The underlying GPU architecture has been adapted to run a limited class of general-purpose computations present across a range of applications. Many applications have already been ported to GPU platforms to take advantage of the potential data-parallel performance that GPUs afford, but barriers remain to migrating a broader class of applications onto GPUs.

Originally designed to run 3-D graphics, GPUs are highly optimized for graphics workloads, which possess a high degree of uniformity in their execution. GPU architectures are therefore optimized for efficient uniform execution. GPUs achieve high performance on data-parallel applications that possess regular control flow (i.e., predictable loops) and data access patterns that can effectively exploit high off-chip memory bandwidth. However, many general-purpose real-world applications differ from graphics workloads: they come with large input sets exhibiting irregular access and synchronization patterns, and they possess varying computational granularity and irregular control flow. The current requirements for uniformity and predictability present barriers to moving a broader range of applications to GPUs. We believe that if GPUs are going to become a mainstream computing device, it is necessary to relax some of these constraints. Only then can a wider variety of applications exploit the computational power of GPUs.

One critical barrier present in non-uniform data-parallel applications is the need to synchronize between threads. Fine-grained synchronization is needed to support shared data access, especially when faced with irregular access and communication patterns. This dissertation presents a new approach to enhance the efficiency and scalability of GPU synchronization. The proposed scheme can enable applications that work on shared data to communicate effectively at finer levels of granularity. To achieve this ambitious goal, we propose a new synchronization approach called Hierarchical Queuing Locks (HQL). HQL is a novel hardware-based synchronization mechanism that provides efficient use of resources through execution blocking and hierarchical queuing. To provide a queue-based locking mechanism, HQL extends current GPU L1 and L2 cache management protocols by adding a synchronization protocol. Integrating HQL's synchronization protocol simplifies synchronization but adds a level of complexity to the cache management protocol. Given this added complexity, we provide a formal verification of the proposed HQL synchronization protocol as part of this dissertation. To evaluate the benefits of HQL, we start by studying a set of micro-benchmarks that represent highly irregular applications requiring frequent synchronization. We additionally evaluate macro-benchmarks that use synchronization. We report on both the performance benefits and the savings in terms of instructions executed.

Building upon the efficient fine-grained synchronization support provided by HQL, we explore ScalarWaving (SW) and Simultaneous Scalar and SIMD groupWaving (SSSW) architectures to further improve the efficiency of SIMD execution on GPUs. These two mechanisms attempt to reduce the amount of redundant computation performed by the threads in a SIMD group. SW and SSSW improve SIMD efficiency for both irregular and regular applications. We motivate this work by reporting on the percentage of redundant computations present in a range of workloads. We then quantitatively evaluate the benefits of the SW and SSSW architectures using programs taken from four different benchmark suites. The impact of this dissertation is to design architectural features that can make the benefits of GPU computing available to a much wider range of applications. These kinds of enhancements can only further accelerate the adoption of GPUs as a first-class computing device.
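The abstract refers to the percentage of redundant computations performed by the threads in a SIMD group. As a rough illustration only, the sketch below assumes that an instruction counts as redundant when every lane in the group reads identical operand values, so a single scalar execution could serve the whole group. The trace format and the function names `is_uniform` and `redundant_fraction` are hypothetical and are not taken from the dissertation, whose actual measurement methodology is not described in this record.

```python
from typing import Sequence, Tuple

# One traced instruction: for each lane of the SIMD group, the tuple of
# source-operand values that lane read. (Hypothetical trace format.)
LaneOperands = Tuple[int, ...]
Instruction = Sequence[LaneOperands]


def is_uniform(instr: Instruction) -> bool:
    """Return True when every lane read the same operands.

    Under the assumption stated above, such an instruction could be
    executed once and its result shared by the whole group.
    """
    if not instr:
        return True
    first = instr[0]
    return all(lane == first for lane in instr)


def redundant_fraction(trace: Sequence[Instruction]) -> float:
    """Fraction of traced instructions that are uniform across lanes."""
    if not trace:
        return 0.0
    uniform = sum(1 for instr in trace if is_uniform(instr))
    return uniform / len(trace)


# Toy trace for a 4-lane group (illustrative values, not measured data).
trace = [
    [(8, 3)] * 4,                      # every lane reads (8, 3): uniform
    [(0, 4), (1, 4), (2, 4), (3, 4)],  # depends on lane id: not uniform
    [(16, 16)] * 4,                    # uniform
    [(5, 1), (5, 1), (5, 2), (5, 1)],  # one lane differs: not uniform
]

print(f"{redundant_fraction(trace):.0%} of instructions are uniform")
```

In this toy trace two of the four instructions are uniform. A real study would work on instruction traces from actual workloads and would also need to account for inactive lanes under divergence, which this sketch ignores.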

Similar Theses

  1. Yüklenici firmaların yenileşim yaklaşımlarının değerlendirilmesi

    Evaluation of contracting firms' innovation approaches

    AKIN TOLGA İLTER

    Doctorate

    Turkish

    2011

    Architecture

    İstanbul Teknik Üniversitesi

    Department of Architecture

    PROF. DR. ATTİLA DİKBAŞ

  2. An uninterrupted urban walk: 3d analysis methods for supporting the design of walkable streets

    Kentte kesintisiz bir yürüyüş: Yürünebilir sokakların tasarım desteği için 3b analiz yöntemleri

    ELİF ENSARİ SUCUOĞLU

    Doctorate

    English

    2020

    Computer Engineering and Computer Science and Control

    İstanbul Teknik Üniversitesi

    Department of Informatics

    PROF. DR. MİNE ÖZKAR KABAKÇIOĞLU

  3. Java virtual machine implementation on micro-C/OS-II real-time operating system

    Micro-C/OS-II gerçek zamanlı işletim dizgesi üzerinde java sanal makinesi gerçekleştirimi

    ALP BÜLENT BURÇ SÜRMELİ

    Master's

    English

    2005

    Computer Engineering and Computer Science and Control

    Çankaya Üniversitesi

    Department of Computer Engineering

    PROF. DR. TURHAN ALPER

  4. Sürdürülebilir kentleşme sürecinde, İstanbul Kağıthane deresi çevresindeki kent içi konut yerleşimlerinin ekolojik koridor yerleşim ilkeleri bağlamında analizi

    Analysis of urban residential settlements around İstanbul Kağıthane stream in the context of ecological corridor settlement principles for sustainable urbanization

    DİLARA ŞİMŞEK

    Master's

    Turkish

    2024

    Architecture

    İstanbul Teknik Üniversitesi

    Department of Urban Design

    PROF. DR. HATİCE AYATAÇ

  5. Kentsel alanlarda kullanılan odunsu bitki taksonlarının ekosistem hizmetleri bağlamında incelenmesi; Rize kenti örneği

    Investigation of woody plant taxa used in urban areas in context of ecosystem services; case of Rize

    YEŞİM ÖZCAN

    Master's

    Turkish

    2022

    Landscape Architecture

    Artvin Çoruh Üniversitesi

    Department of Landscape Architecture

    ASSOC. PROF. DR. DERYA SARI