
Distributed data management

Title translation not available.

  1. Thesis No: 56033
  2. Author: HIRAÇ KASAPOĞLU
  3. Advisors: Assoc. Prof. Dr. NADİA ERDOĞAN
  4. Thesis Type: Master's
  5. Subjects: Computer Engineering and Computer Science and Control
  6. Keywords: Not specified.
  7. Year: 1996
  8. Language: Turkish
  9. University: İstanbul Teknik Üniversitesi
  10. Institute: Fen Bilimleri Enstitüsü
  11. Department: Not specified.
  12. Discipline: Not specified.
  13. Number of Pages: 71

Abstract

In this study, a distributed data management system was developed. UNIX workstations that belong to this distributed system and are connected to one another by a network can share common variables with the help of a C language library. A programmer who uses the subroutines and management tools developed within the project, named DRM (Distributed Raw-Data Manager), can share data without needing any other communication or networking software or subroutines. Message transfer between the distributed processors (workstations) and the management operations on distributed data are carried out without being exposed to the user. On top of this infrastructure, the user can make use of all operations of a distributed shared-data mechanism while being shielded from all of the details.

All of these management and message transfer operations are carried out by a server management program called drmd, which must run on every workstation that belongs to the system. drmd communicates with other workstations through Berkeley sockets and with processes on the same workstation through IPC message queues. TCP/IP was chosen as the network protocol connecting the workstations in the DRM system. The TCP/IP network protocol is described in Chapter 2, IPC message queues in Chapter 3, and Berkeley sockets in Chapter 4. Chapter 5 describes the use of shared memory, and Chapter 6 describes the DRM system.

Abstract (Translation)

SUMMARY

DISTRIBUTED DATA MANAGEMENT

This study presents the design and implementation of the Distributed Raw-Data Manager (DRM). The aim of the project is to create a software tool that facilitates the development of efficient parallel and distributed software. DRM is designed to connect UNIX workstations on a TCP/IP network through a set of C function calls. DRM uses TCP/IP networking and Berkeley datagram sockets [3] to communicate between workstations, so in theory any UNIX-based workstation that supports both should work with DRM.

INTRODUCTION

Creating a supercomputer with high processing power is one of the biggest challenges in computer science and computer technology. Such systems can be built in several ways; massively parallel and loosely coupled parallel systems are the most important. These systems use shared memory through dedicated switch network hardware and generally require a hardware-dependent operating system and software. Because of the high cost and limited portability of these systems, only a small number of research centers and universities could have them.

The scarcity of such parallel systems, together with falling workstation costs and rising workstation performance, has made the term "network of workstations" very popular. Over the last several years, several distributed shared memory (DSM) systems have been designed. Some of them use special hardware to detect shared memory accesses. [10] gives a brief introduction and a good comparison of hardware and software implementations of DSM systems. Several projects enable a network of workstations to use shared memory without any proprietary hardware. TreadMarks [9, 10] is one of them.

Another method used to create a shared memory system is message passing. PVM (Parallel Virtual Machine) [7] is a project that creates a virtual network of workstations through message passing.
Even though PVM facilitates the use of message-passing primitives, the programmer must code the message passing explicitly. This makes the program differ from the original and places an additional burden on the PVM programmer. In TreadMarks [9], a programmer accesses shared memory as an ordinary memory location; the system automatically detects shared memory accesses and does the work. In PVM [7], there is no direct access to shared memory; everything is done with predefined message-passing calls defined in the pvm library. DRM lies between TreadMarks and PVM: the programmer uses shared memory in the form of shared variables, which are accessed through predefined function calls defined in the drm library.

We ran tests with CRACK, a UNIX password attacker that uses a brute-force method. It is a good example of distributing a compute-intensive application across workstations.

IMPLEMENTATION

DRM is a simple distributed data management system. DRM connects UNIX-based workstations over a TCP/IP network. Each workstation runs a DRM server process, the drmd daemon (in the UNIX world, server processes are called daemons). This daemon waits for messages from several sources and processes them or routes them to the desired location as necessary. All messaging within a workstation is done with Interprocess Communication (IPC) message queues.

drmd consists of three processes. The first is the main process. It is responsible for maintaining the Shared Memory Table (SMT) and for waiting for messages from the other two forked processes and from user processes that use the drm library.

The second process is the Command Line Interpreter (CLI). The CLI is a debugging and administration interface between drm and the drm administrator. Each CLI command generates an IPC message to the drmd main process.

The last process is responsible for listening on datagram socket port 10000. It accepts messages from the daemons of other workstations. When it receives a message, it passes it directly to the main process as an IPC message.
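
The listener process described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the thesis code: the function names `drm_open_listener` and `drm_forward_one`, the `drm_ipc_msg` layout, the 512-byte limit, and the message type tag are all hypothetical, since the summary does not give drmd's actual message format. Only the port number 10000 and the use of datagram sockets and IPC message queues come from the text.

```c
#define _XOPEN_SOURCE 700   /* request POSIX/XSI declarations */
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

#define DRM_PORT      10000  /* datagram port named in the text */
#define DRM_MAX_BODY  512    /* assumed maximum message size */
#define DRM_MTYPE_NET 1L     /* assumed tag: "arrived from the network" */

/* System V message layout: a long type tag followed by the payload. */
struct drm_ipc_msg {
    long mtype;
    char body[DRM_MAX_BODY];
};

/* Open a UDP socket bound to the given port on all interfaces.
   Returns the socket descriptor, or -1 on error. */
int drm_open_listener(unsigned short port)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Receive one datagram from sock and pass its bytes unchanged to the
   message queue qid. Returns the number of bytes forwarded, or -1. */
int drm_forward_one(int sock, int qid)
{
    struct drm_ipc_msg msg;
    ssize_t n = recv(sock, msg.body, sizeof msg.body, 0);
    if (n < 0)
        return -1;

    msg.mtype = DRM_MTYPE_NET;
    if (msgsnd(qid, &msg, (size_t)n, 0) < 0)
        return -1;
    return (int)n;
}
```

A listener built this way would call `drm_open_listener(DRM_PORT)` once and then call `drm_forward_one` in a loop, while the main process reads the same queue with `msgrcv`.
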
Figure-6.5 on page 49 shows the communication between DRM processes.

Each drmd process on a workstation has a Shared Memory Table (SMT), which holds information about the variables shared on the DRM system. Each shared variable occupies one entry in every workstation's SMT. Thus every workstation's drmd daemon has a copy of the information about each shared variable, but only one of them holds the current value, while the others hold the location (IP address) of the owner workstation. Table-6.1 on page 52 shows the SMT structure.

descriptor is a character string that identifies a shared variable on the DRM system. It can contain any character, without the limitations that most programming languages place on variable names. type indicates whether the data is local or remote to the workstation. Local means that the variable's value is in the drmd daemon's local memory. Remote means that the value is not in this workstation's local memory, so communication between workstations is needed; the source of the value is given by the hostid field. hostid is the IP address of the workstation that holds the value of the shared variable. pid is the process ID of the process that created the shared variable.

When a shared variable is created for the first time, it is created in the local workstation's SMT, and this drmd holds its value. The workstation then broadcasts a datagram socket message to all other workstations announcing that a new variable has just been created. The other drmd daemons receive this message and also create the variable in their SMTs, but they mark its type as remote and set hostid to the creating host.

DRM PROGRAMMING

Function Calls

DRM function calls are summarized in Table-6.2 on page 53, and each is described in detail below. The names of the functions are prefixed with the letter 'd', as in dput, to distinguish them from other C library functions.
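
As background for the calls that follow, the SMT entry and the creation rule described in the implementation section could be modelled in C along these lines. This is a minimal in-memory sketch, not the layout of Table-6.1: the field names descriptor, type, hostid, and pid come from the text, while the fixed capacities, the `length` and `value` fields, and the function names `smt_find`, `smt_create_local`, and `smt_add_remote` are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define SMT_MAX_ENTRIES 64    /* assumed table capacity */
#define SMT_DESC_LEN    64    /* assumed maximum descriptor length */
#define SMT_VALUE_LEN   256   /* assumed maximum value size in bytes */

enum smt_type { SMT_LOCAL, SMT_REMOTE };

struct smt_entry {
    char          descriptor[SMT_DESC_LEN]; /* system-wide identifier */
    enum smt_type type;                     /* value held here or elsewhere */
    uint32_t      hostid;                   /* IPv4 address of the owner */
    pid_t         pid;                      /* process that created it */
    size_t        length;                   /* size of the value in bytes */
    unsigned char value[SMT_VALUE_LEN];     /* meaningful only if SMT_LOCAL */
};

struct smt {
    struct smt_entry entries[SMT_MAX_ENTRIES];
    size_t           count;
};

/* Look up an entry by descriptor; returns NULL if it is not in the table. */
struct smt_entry *smt_find(struct smt *t, const char *descriptor)
{
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].descriptor, descriptor) == 0)
            return &t->entries[i];
    return NULL;
}

/* Append a new entry; returns NULL if it does not fit the assumed limits. */
static struct smt_entry *smt_append(struct smt *t, const char *descriptor,
                                    enum smt_type type, uint32_t hostid,
                                    pid_t pid, size_t length)
{
    if (t->count >= SMT_MAX_ENTRIES || length > SMT_VALUE_LEN ||
        strlen(descriptor) >= SMT_DESC_LEN)
        return NULL;
    struct smt_entry *e = &t->entries[t->count++];
    memset(e, 0, sizeof *e);
    strcpy(e->descriptor, descriptor);
    e->type = type;
    e->hostid = hostid;
    e->pid = pid;
    e->length = length;
    return e;
}

/* Creation on the owner: returns 1 if the entry was created (the caller
   would then broadcast it), 0 if the descriptor already exists and the
   request is ignored, -1 if the entry does not fit. */
int smt_create_local(struct smt *t, const char *descriptor, uint32_t self,
                     pid_t pid, const void *data, size_t length)
{
    if (smt_find(t, descriptor) != NULL)
        return 0;
    struct smt_entry *e =
        smt_append(t, descriptor, SMT_LOCAL, self, pid, length);
    if (e == NULL)
        return -1;
    memcpy(e->value, data, length);
    return 1;
}

/* Announcement received from another daemon: record the owner's address
   and mark the entry remote. Same return convention as above. */
int smt_add_remote(struct smt *t, const char *descriptor, uint32_t owner,
                   pid_t pid, size_t length)
{
    if (smt_find(t, descriptor) != NULL)
        return 0;
    return smt_append(t, descriptor, SMT_REMOTE, owner, pid, length) ? 1 : -1;
}
```

In this model the creating daemon ends up with an SMT_LOCAL entry that carries the value, and every other daemon ends up with an SMT_REMOTE entry that carries only the owner's address.
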
All function calls except dget and dlock execute in asynchronous mode, so a process continues to execute without blocking after issuing a function call. dget and dlock suspend the calling process while the request is served, after which the process resumes execution.

dassign(struct DDATA *d, char *descriptor);
The dassign library function assigns an identifier string to the local variable d.

dcreate(struct DDATA *d, int length, void *data);
The dcreate library function creates a shared variable of length bytes in the DRM system. This call sends an IPC message to the local drmd daemon. If the variable has not been created before, drmd appends it to its SMT, initializes its value with data, and broadcasts this information to the daemons of other workstations through BSD datagram sockets. The workstation on which the call is executed becomes the owner of the variable. If the variable has already been created on another workstation, an entry already exists in the local SMT; in this case the drmd daemon ignores the function call.

dremove(struct DDATA *d);
The dremove library call removes a shared variable from the DRM system. This call has restricted use: only the creator of a variable can delete it from the entire DRM system. If the referenced variable exists, the local drmd daemon marks its validity as "removed" in the local SMT and broadcasts a message to the servers of other workstations so that they can update their tables.

dget(struct DDATA *d, void *data);
The dget call sends a request to the local server daemon to get the current value of a shared variable. If the variable is owned by the local daemon, the daemon sends the value back to the calling process in an IPC message. If it is a remote variable, the server sends a socket packet over the TCP/IP network to the daemon of the owner workstation and requests the current value of the shared variable.
The owner workstation responds by sending the requested value back to the daemon of the local workstation, which passes it on to the calling process. dget is a blocking call: the calling process waits until it receives the value of the shared variable.

dput(struct DDATA *d, void *data);
The dput call sends a request to the local server daemon to alter the value of a shared variable. If a remote variable is referenced, the daemon on the calling process's workstation forwards the request to the owner. If the shared variable is protected by a dlock call issued by another process, the dput request is ignored by the daemon, and the calling process does not receive any warning or error message. It is the programmer's responsibility to protect a dput call with dlock / dunlock calls.

dlock(struct DDATA *d);
The dlock call sends a request to the local server daemon to protect the shared variable against alteration by other processes. dlock guarantees protection of the shared variable against write accesses, so after a dlock call no process other than the caller can alter its value. Because dlock is a blocking call, the calling process waits until the locking operation is completed.

dunlock(struct DDATA *d);
The dunlock call sends a request to the local server daemon to release the protection of the shared variable against alteration.

Sample DRM program

Figure-6.4 on page 59 shows a simple DRM program that uses DRM function calls. The purpose of the program is to create a shared counter named "Counter". The first time it is executed, it creates the shared variable and sets its initial value to one. In successive executions, the shared counter value is incremented by one and printed to the screen.

In DRM programs, shared variables are declared to be of structure DDATA, which is predefined in a header file named data.h. An identifier string, which is a DRM system-wide unique identifier, should be assigned to each shared variable.
Thus, by naming the same identifier string, all DRM programs reference the same shared variable. A shared variable comes into existence through a dcreate call. With this call, the size of the variable and its default value are passed to the local drmd server daemon. The server daemon looks up its SMT; if a variable with the same descriptor has already been created, no action is taken. Otherwise, the daemon creates a new table entry, initializes it with the default value, and broadcasts it to the daemons of other workstations. The other daemons update their tables with the newly created variable, marking its location as the identity of the creating server daemon.

Processes can get the current value of a shared variable through dget and alter its value through dput. It is safer to protect the shared memory against multiple write accesses with the dlock and dunlock library calls. For example, the following code, which increments a shared integer variable, could be used as a shared counter.

    dlock(&d);
    dget(&d, &i);
    i++;
    dput(&d, &i);
    dunlock(&d);

If a shared variable is no longer needed, it can be deleted with the dremove library call. However, dremove has restricted use: only the creator (owner) of the shared variable is permitted to delete it.

DRM programs use the drm.o library to communicate with the drmd server daemon. Programs that use DRM library function calls should include the ddata.h C header file and link with the drm.o library file.

TESTING

We ran tests on 6 identical PC-based workstations running the Linux operating system. Each PC has a 120 MHz Pentium processor, 16 MB of RAM, 256 KB of L2 cache, and a PCI-based SCSI disk controller. The workstations are connected by a 10 Mbps Ethernet network.

We developed a password attacker system named CRACK. This program uses several loops, trying all password combinations until it finds the right password. CRACK starts with 1-character passwords; if unsuccessful, it goes on to generate 2-character passwords such as aa, ab, ac, ..., az, ba, bb, bc, ..., zz.
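
The candidate order described above (a, b, ..., z, aa, ab, ..., zz, and so on) can be produced with a simple odometer-style step. The sketch below shows only that enumeration; it is not CRACK's code, which the summary describes merely as "several loops". The function name `next_candidate`, the lowercase-only alphabet, and the reliance on ASCII letter ordering are assumptions for illustration.

```c
#include <string.h>

/* Advance buf to the next candidate in the order
   a, b, ..., z, aa, ab, ..., az, ba, ..., zz, aaa, ...
   buf holds a NUL-terminated string of lowercase ASCII letters (the empty
   string is the starting state) and cap is the capacity of buf in bytes.
   Returns 1 on success, or 0 (leaving buf unchanged) if the next candidate
   would not fit in cap bytes. */
int next_candidate(char *buf, size_t cap)
{
    size_t len = strlen(buf);

    /* Odometer step: bump the rightmost letter that is not 'z',
       resetting every 'z' to its right back to 'a'. */
    for (size_t i = len; i-- > 0; ) {
        if (buf[i] != 'z') {
            buf[i]++;
            return 1;
        }
        buf[i] = 'a';
    }

    /* Every letter was 'z' (or buf was empty): grow by one letter. */
    if (len + 2 > cap) {
        memset(buf, 'z', len);   /* undo the resets done by the loop */
        return 0;
    }
    buf[len] = 'a';
    buf[len + 1] = '\0';
    return 1;
}
```

Starting from an empty buffer and calling `next_candidate` repeatedly yields the 26 one-letter candidates followed by the 676 two-letter candidates in the order given in the text.
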
We tested the system with a single CPU (a Pentium-120-based PC) and recorded the time required to complete the test. We then repeated the same test with 2, 3, ..., 6 CPUs. The test results are shown in Table-6.3 on page 60. The first column of the table shows the number of CPUs used in the test, and the second column shows the total time, in seconds, taken to complete the test. The test program performs 1,083,660 password check operations for a specific password. Column 3 shows how many checks were performed per second, and column 4 shows how many checks were performed per second per CPU. Column 5 shows the speedup of DRM. The last two columns represent the performance of DRM.

As the table shows, there are some inconsistencies in the test results. For example, when we added a third CPU, performance fell. This was caused by some PCs carrying more CPU load from daily operations than others.

CONCLUSION

DRM is much simpler in design than the other DSM implementations [9, 10, 7]. Our tests show that the overall performance of the system is quite acceptable, considering that the project is at an early stage. A number of modules remain to be improved, and this will slightly increase overall performance.

Similar Theses

  1. An Intelligent interface for a distributed database

    Dağıtık veri tabanı için akıllı arayüz tasarlanması

    İLKNUR ŞANSLI

    Master's

    English

    English

    2000

    Computer Engineering and Computer Science and Control

    Dokuz Eylül Üniversitesi

    Department of Computer Engineering

    Assoc. Prof. Dr. ALP KUT

  2. Blokzincir bazlı hasta kan yönetim sistemi mimarisi geliştirme ve devreye alınması

    System architecture and implementation of blockchain enabled patient blood management system

    ALİ EMRE MANAV

    Master's

    Turkish

    Turkish

    2022

    Industrial and Industrial Engineering

    Hacettepe Üniversitesi

    Department of Industrial Engineering

    Asst. Prof. Dr. VOLKAN SÖNMEZ

  3. Design and management of globally-distributed network caches

    Küresel-dağıtılmış ağ belleklerinin tasarım ve yönetimi

    İSMAİL ARI

    Doctorate

    English

    English

    2004

    Computer Engineering and Computer Science and Control

    University of California, Santa Cruz

    Department of Computer Science

    PROF. ETHAN L. MILLER

  4. Improving data-centric decision-making process in EU supply chains data networks by enhancing data management using distributed ledger technology with EU common dataspaces

    AB tedarik zinciri veri ağlarında veri merkezli karar alma süreçlerini güçlendirmek için AB ortak veri alanları kullanılarak dağıtılmış defter teknolojisi ile veri yönetimini geliştirmek

    IBRAHIM OSAMA ABDELWAHAB MOHAMED RAYIS

    Master's

    English

    English

    2024

    Economics

    Bahçeşehir Üniversitesi

    Division of Financial Technologies

    Asst. Prof. Dr. LEVENT AKSOY

  5. Uzaktan algılama sistemleri için performans bilinçli büyük veri yönetimi

    Performance-aware big data management for remote sensing systems

    MUSTAFA KEMAL PEKTÜRK

    Doctorate

    Turkish

    Turkish

    2023

    Computer Engineering and Computer Science and Control

    Gazi Üniversitesi

    Department of Computer Engineering

    PROF. DR. HADİ GÖKÇEN

    DR. MUHAMMET ÜNAL