Geri Dön

Reınforcement learnıng and evolutıonary algorıthms for contaıner loadıng problem

Reinforcement learning and evolutionary algorithms for container loading problem

  1. Tez No: 382780
  2. Yazar: SANI TIJJANI
  3. Danışmanlar: YRD. DOÇ. DR. ARMAĞAN ÖZKAYA
  4. Tez Türü: Yüksek Lisans
  5. Konular: Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol, Computer Engineering and Computer Science and Control
  6. Anahtar Kelimeler: Belirtilmemiş.
  7. Yıl: 2014
  8. Dil: İngilizce
  9. Üniversite: Mevlana Üniversitesi
  10. Enstitü: Fen Bilimleri Enstitüsü
  11. Ana Bilim Dalı: Bilgisayar Mühendisliği Ana Bilim Dalı
  12. Bilim Dalı: Belirtilmemiş.
  13. Sayfa Sayısı: 95

Özet

Container Loading Problem (CLP) is a space utilization problem subject to various constraints. An example of it is the placement of containers in storage so as to minimize the waste of space. Other constraints that may be imposed include a certain loading order and an even weight distribution. Although evolutionary algorithms have been extensively studied to solve this problem, Reinforcement Learning (RL) which is a means of learning optimal behaviors by interacting with the environment, has not received enough attention in this respect. This work explores the use of RL as an alternative for tackling CLP so as to minimize the waste of space while maximizing the number of containers. We have applied five different RL algorithms (Q-learning, TD( ), Monte-Carlo, TDQ-learning and SARSA), and two types of evolutionary algorithms (Genetic Algorithm and Ant Colony Optimization) to solve this problem. Simulations have been carried out using MATLAB to compare these algorithms based on space utilization, number of containers, simulation time and speed of convergence. The simulation parameters are set so that the algorithms are allowed to fill in storage yards 100% with containers of different sizes. Results show that, in general, RL may not guarantee the best results, but can minimize the computational difficulty providing a simple way to solve this problem. Genetic Algorithm (GA) on the other hand gives best speed of convergence for small storage yards, and, unlike other approaches that may require complex computations, four RL algorithms, namely Q-learning, TD( ), Monte-Carlo and SARSA, give better speed of convergence than GA for larger storage yards used in our simulations. Ant Colony Optimization (ACO), while generally being worse than the others in terms of average convergence time, performs better than the ordinary Q-learning for the largest storage yard area used in our simulations and gives a result similar to GA. Growth of ACO's average convergence time seems to be slower than those of others, indicating that it has the potential to have better convergence times with increasing storage yard area sizes. In terms of the number of containers that can be packed into storage yard when the number of containers is not restricted, all RL algorithms give approximately the same result. As the storage yard size becomes larger, however, Q-learning starts performing better than all the other RL algorithms in this respect and Monte-Carlo gradually worsens even though it is the best of all for filling small storage yards. But in terms of simulation time (for RL algorithms only) TDQ performs the best for all storage yard area combinations, except for the smallest area in which it comes in second position after TD( ). This is because learning optimal value in TDQ can take place under any policy and learning rate effect. TD( ) comes second for all remaining combinations while MC and SARSA come third and fourth respectively.

Özet (Çeviri)

Container Loading Problem (CLP) is a space utilization problem subject to various constraints. An example of it is the placement of containers in storage so as to minimize the waste of space. Other constraints that may be imposed include a certain loading order and an even weight distribution. Although evolutionary algorithms have been extensively studied to solve this problem, Reinforcement Learning (RL) which is a means of learning optimal behaviors by interacting with the environment, has not received enough attention in this respect. This work explores the use of RL as an alternative for tackling CLP so as to minimize the waste of space while maximizing the number of containers. We have applied five different RL algorithms (Q-learning, TD( ), Monte-Carlo, TDQ-learning and SARSA), and two types of evolutionary algorithms (Genetic Algorithm and Ant Colony Optimization) to solve this problem. Simulations have been carried out using MATLAB to compare these algorithms based on space utilization, number of containers, simulation time and speed of convergence. The simulation parameters are set so that the algorithms are allowed to fill in storage yards 100% with containers of different sizes. Results show that, in general, RL may not guarantee the best results, but can minimize the computational difficulty providing a simple way to solve this problem. Genetic Algorithm (GA) on the other hand gives best speed of convergence for small storage yards, and, unlike other approaches that may require complex computations, four RL algorithms, namely Q-learning, TD( ), Monte-Carlo and SARSA, give better speed of convergence than GA for larger storage yards used in our simulations. Ant Colony Optimization (ACO), while generally being worse than the others in terms of average convergence time, performs better than the ordinary Q-learning for the largest storage yard area used in our simulations and gives a result similar to GA. Growth of ACO's average convergence time seems to be slower than those of others, indicating that it has the potential to have better convergence times with increasing storage yard area sizes. In terms of the number of containers that can be packed into storage yard when the number of containers is not restricted, all RL algorithms give approximately the same result. As the storage yard size becomes larger, however, Q-learning starts performing better than all the other RL algorithms in this respect and Monte-Carlo gradually worsens even though it is the best of all for filling small storage yards. But in terms of simulation time (for RL algorithms only) TDQ performs the best for all storage yard area combinations, except for the smallest area in which it comes in second position after TD( ). This is because learning optimal value in TDQ can take place under any policy and learning rate effect. TD( ) comes second for all remaining combinations while MC and SARSA come third and fourth respectively.

Benzer Tezler

  1. İmalat sistemlerinin tasarlanması ve öncelik kurallarının belirlenmesinde yapay sinir ağlarının kullanılması

    Başlık çevirisi yok

    TARIK ÇAKAR

    Doktora

    Türkçe

    Türkçe

    1997

    Mühendislik Bilimleriİstanbul Teknik Üniversitesi

    İşletme Mühendisliği Ana Bilim Dalı

    PROF. DR. AYHAN TORAMAN

  2. Derin pekiştirmeli öğrenme yöntemi ile görüntü hash kodlarını oluşturma

    Generating image hash codes with deep reinforcement learning method

    ELİF AKKAYA

    Yüksek Lisans

    Türkçe

    Türkçe

    2024

    Elektrik ve Elektronik MühendisliğiSakarya Üniversitesi

    Elektrik ve Elektronik Mühendisliği Ana Bilim Dalı

    DR. ÖĞR. ÜYESİ BURHAN BARAKLI

  3. Entegre fotonik cihazların tasarımına yönelik hesaplama tabanlı yaklaşımlar

    Integrated photonic device designs based on computational approaches

    EMRE BOR

    Doktora

    Türkçe

    Türkçe

    2020

    Elektrik ve Elektronik MühendisliğiTOBB Ekonomi ve Teknoloji Üniversitesi

    Elektrik-Elektronik Mühendisliği Ana Bilim Dalı

    PROF. DR. HAMZA KURT

    DOÇ. DR. MIRBEK TURDUEV

  4. Hyper-heuristics in dynamic environments

    Dinamik ortamlarda üst-sezgiseller

    BERNA KİRAZ

    Doktora

    İngilizce

    İngilizce

    2014

    Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrolİstanbul Teknik Üniversitesi

    Bilgisayar Mühendisliği Ana Bilim Dalı

    DOÇ. DR. AYŞE ŞİMA ETANER UYAR

  5. Monte Carlo tree search with temporal-difference learning for general video game playing

    Zamansal-fark öğrenmeli Monte Carlo ağaç araması ile genel video oyunu oynama

    ERCÜMENT İLHAN

    Yüksek Lisans

    İngilizce

    İngilizce

    2017

    Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrolİstanbul Teknik Üniversitesi

    Bilgisayar Mühendisliği Ana Bilim Dalı

    DOÇ. DR. AYŞE ŞİMA UYAR