Multi Agent Reinforcement Learning Approach for Autonomous Fleet Management
Alhusin, Mohammed Omer Alamin
Date
2019-12
Type
Thesis
Description
A Master of Science thesis in Computer Engineering by Mohammed Omer Alamin Alhusin entitled, “Multi Agent Reinforcement Learning Approach for Autonomous Fleet Management”, submitted in December 2019. Thesis advisor is Dr. Michel Pasquier and thesis co-advisor is Dr. Gerassimos Barlas. Soft copy is available (Thesis, Approval Signatures, Completion Certificate, and AUS Archives Consent Form).
Abstract
The Taxi Dispatch problem is a well-known and important problem in the field of transportation and logistics that has many similarities with other fleet management problems. The objective of a taxi dispatch system is to assign idle taxis to passengers waiting at different geographical locations in a way that maximizes resource utilization while minimizing operating cost. Traditionally, heuristic rules are used in dispatch problems, mainly because of the simplicity and scalability of the approach. However, at high demand rates, rule-based approaches perform poorly. This has encouraged many researchers to build more complex models to tackle the dispatch problem, but most of these models are computationally expensive and cannot scale to handle large fleets. Additionally, most of these approaches are not robust enough for a stochastic environment, which is usually the case with real-world traffic. In this work we model the problem as a Markov Game and solve it using Model-Free Multi-Agent Deep Reinforcement Learning, an approach well suited to stochastic environments for which no good model is available. The main drawback of reinforcement learning is that it requires a great deal of time and data to learn the optimal policy. In this work we address this issue and strive to improve the efficiency of the algorithm. We break the curse of dimensionality by representing the state variable as an image, which makes the complexity independent of the number of taxis and requests and dependent only on the size of the map, thus allowing the algorithm to handle large fleets with ease. Using a residual convolutional neural network as the Q-function approximator allows the agents to learn complex spatial patterns while seeing only a few training samples. We have also found that we can reduce the resolution of the state variable by more than half while losing only 3% of the performance.
The proposed algorithm was validated against a rule-based heuristic under different supply-demand ratios and was found to outperform the rule-based technique by a large margin when supply is scarce.
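The image-based state described in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the two-channel layout (taxi counts and request counts), the sum-pooling used to lower the resolution, and the names rasterize, build_state and downsample are all assumptions made for illustration. The point it shows is that the state has a fixed shape set by the map size, however many taxis or requests are present.

```python
from typing import List, Tuple

Grid = List[List[int]]
Position = Tuple[int, int]  # (row, col) cell on the map grid


def rasterize(positions: List[Position], height: int, width: int) -> Grid:
    """Count how many entities fall in each cell of a height x width grid."""
    grid = [[0] * width for _ in range(height)]
    for row, col in positions:
        grid[row][col] += 1
    return grid


def build_state(taxis: List[Position], requests: List[Position],
                height: int, width: int) -> List[Grid]:
    """Stack a taxi-count channel and a request-count channel.

    Assumed layout: the abstract only says the state is an image.
    """
    return [rasterize(taxis, height, width),
            rasterize(requests, height, width)]


def downsample(grid: Grid, factor: int) -> Grid:
    """Sum-pool non-overlapping factor x factor blocks.

    Assumes both grid dimensions are divisible by factor. Sum-pooling is
    an illustrative choice, not necessarily the method used in the thesis.
    """
    height, width = len(grid), len(grid[0])
    return [
        [
            sum(grid[r + dr][c + dc]
                for dr in range(factor) for dc in range(factor))
            for c in range(0, width, factor)
        ]
        for r in range(0, height, factor)
    ]


if __name__ == "__main__":
    state = build_state(taxis=[(0, 0), (0, 0), (3, 2)],
                        requests=[(1, 1)], height=4, width=4)
    # Two channels of 4 x 4 cells, regardless of fleet size.
    print(len(state), len(state[0]), len(state[0][0]))
    # Halved resolution of the taxi channel.
    print(downsample(state[0], 2))
```

Adding more taxis changes only the counts stored in the cells, not the shape of the input, which is why a convolutional Q-network over such an image would not grow with the fleet.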
