Extending Decision-Making Policies in Partially Observable Environments for Active Perception
This paper presents a method for extending decision-making policies in active perception tasks for multi-agent systems operating in partially observable environments. Multiple agents obtain their policies by training in an environment of a fixed size. Those policies are then reused in larger environments, which are divided into sub-environments similar in size to (or smaller than) the training environment. The learned policies are adapted accordingly by the proposed Action-Space Reduced Policy (ASRP) method. By leveraging Multi-Agent Reinforcement Learning (MARL) within a POMDP framework, agents can apply their learned policies across environments of differing complexity without retraining. A consensus mechanism allows agents to maintain a common belief state, supporting collaborative decision-making based on the observations of all agents involved. The method is validated on multi-agent exploration missions, demonstrating the use of extended policies and enhanced perception accuracy. Simulation results indicate that success rates and decision-making times remain consistent regardless of the environment's dimensionality. Potential applications for scalable multi-agent perception systems are discussed, along with directions for future research.
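The core idea of reusing a fixed-size policy in a larger environment can be illustrated with a minimal, hypothetical sketch: a large grid is tiled into sub-environments no larger than the training size, and the trained policy is applied per tile. The function and variable names below are illustrative assumptions, not the paper's implementation.

```python
from typing import List, Tuple

def partition_environment(width: int, height: int,
                          train_size: int) -> List[Tuple[int, int, int, int]]:
    """Tile a width x height grid into sub-environments whose sides
    do not exceed train_size (the size the policy was trained on).
    Returns (x, y, w, h) tuples; edge tiles may be smaller."""
    tiles = []
    for y in range(0, height, train_size):
        for x in range(0, width, train_size):
            tiles.append((x, y,
                          min(train_size, width - x),
                          min(train_size, height - y)))
    return tiles

# A 10x7 environment tiled for a policy trained on a 5x5 grid:
tiles = partition_environment(10, 7, 5)
# Every tile fits the trained policy's input, and together they
# cover the full environment without overlap.
```

In the paper's framing, each sub-environment would then be handled by the ASRP-adapted policy, with the consensus mechanism merging per-agent observations into a shared belief state.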