We consider an IoT sensing network with multiple users, multiple energy harvesting sensors, and a wireless edge node acting as a gateway between the users and the sensors. The users request updates on the values of physical processes, each of which is measured by one sensor. The edge node has a cache that stores the most recently received measurement from each sensor. Upon receiving a request, the edge node can either command the corresponding sensor to send a status update or serve the request with the data in the cache. We aim to find the best action of the edge node to minimize the long-term average cost, which trades off the age of information against energy consumption. We propose a practical reinforcement learning approach that finds an optimal policy without knowing the exact battery levels of the sensors. Simulation results show that the proposed method significantly reduces the average cost compared to several baseline methods.
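The decision problem described above can be illustrated with a small tabular Q-learning sketch. This is not the authors' algorithm; it is a minimal example under stated assumptions: a single sensor, a learner that observes only the age of the cached measurement (never the battery level), and a per-request cost equal to the age of the served data plus a fixed energy weight whenever the sensor transmits. All names and parameter values (`serve_request`, `learn_policy`, `energy_weight`, `harvest_prob`, and so on) are hypothetical.

```python
import random

USE_CACHE, COMMAND = 0, 1  # the two actions available to the edge node


def serve_request(aoi, action, battery, energy_weight):
    """Return (cost, served_aoi, battery_after) for one request.

    Assumed cost model: age of the served data, plus energy_weight
    if the sensor actually transmits a fresh update.
    """
    if action == COMMAND and battery > 0:
        # Sensor spends one energy unit and delivers a fresh sample (age 1).
        return 1 + energy_weight, 1, battery - 1
    # Cache is used, or the sensor had no energy and could not respond.
    return float(aoi), aoi, battery


def learn_policy(steps=20000, max_aoi=10, energy_weight=2.0,
                 harvest_prob=0.5, battery_cap=5,
                 alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning over cached-data age; the battery stays hidden."""
    rng = random.Random(seed)
    # q[aoi][action] estimates the discounted cost (lower is better).
    q = [[0.0, 0.0] for _ in range(max_aoi + 1)]
    aoi, battery = 1, battery_cap
    for _ in range(steps):
        # Epsilon-greedy action selection on estimated cost.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = USE_CACHE if q[aoi][USE_CACHE] <= q[aoi][COMMAND] else COMMAND
        cost, served_aoi, battery = serve_request(
            aoi, action, battery, energy_weight)
        # Cached data ages by one slot before the next request.
        next_aoi = min(served_aoi + 1, max_aoi)
        # The sensor harvests one energy unit with some probability.
        if rng.random() < harvest_prob:
            battery = min(battery + 1, battery_cap)
        # Q-learning update for a cost-minimization problem.
        target = cost + gamma * min(q[next_aoi])
        q[aoi][action] += alpha * (target - q[aoi][action])
        aoi = next_aoi
    policy = [USE_CACHE if row[USE_CACHE] <= row[COMMAND] else COMMAND
              for row in q]
    return q, policy


if __name__ == "__main__":
    _, policy = learn_policy()
    # Index = age of cached data, value = 0 (use cache) or 1 (command sensor).
    print(policy)
```

In this sketch the hidden battery makes the environment only partially observable to the learner, which is the practical constraint the abstract highlights; the learner must infer from observed costs when commanding the sensor tends to pay off.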
Funding Agencies|Infotech Oulu; Academy of Finland [323698, 319485]; Academy of Finland 6Genesis Flagship [318927]; European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant [793402]