We propose to control handoffs (HOs) in user-centric cell-free massive MIMO networks through a partially observable Markov decision process (POMDP) with the state space representing the discrete versions of the large-scale fading (LSF) and the action space representing the association decisions of the user with the access points. Our proposed formulation accounts for the temporal evolution and the partial observability of the channel states. This allows us to consider future rewards when performing HO decisions, and hence obtain a robust HO policy. To alleviate the high complexity of solving our POMDP, we follow a divide-and-conquer approach by breaking down the POMDP formulation into sub-problems, each solved individually. Then, the policy and the candidate cluster of access points for the best solved sub-problem is used to perform HOs within a specific time horizon. We control the number of HOs by determining when to use the HO policy. Our simulation results show that our proposed solution reduces HOs by 47% compared to time-triggered LSF-based HOs and by 70% compared to data rate threshold-triggered LSF-based HOs. This amount can be further reduced through increasing the time horizon of the POMDP.
- Pub Date:
- August 2022
- Computer Science - Information Theory;
- Computer Science - Networking and Internet Architecture;
- Electrical Engineering and Systems Science - Signal Processing
- IEEE Global Communications Conference 2022