The Internet of Things (IoT) has been disruptively transforming inventory management by linking and synchronizing inventory items. It is one of the driving forces behind current innovation in AgriTech. For fresh produce replenishment, which is subject to inherent seasonal variations, IoT devices can not only capture bidirectional seasonal information on lead time and demand, but also detect fresh produce loss and waste (FPLW) caused by deterioration. Using the massive data collected by IoT devices, we propose a data-driven deep reinforcement learning (DRL) approach with reward shaping, called DQN-SV-RS, to optimize the dynamic replenishment policy of a fresh produce wholesaler, specifically addressing the challenge posed by seasonal variations. Experimental results show that our DQN-SV-RS approach yields significant improvements in fresh produce supply chain (FPSC) inventory management, most notably a remarkable reduction in FPLW. As a core innovation of the DQN-SV-RS approach, the introduced reward shaping significantly mitigates lost sales and inventory holding, thereby lowering the total cost. Furthermore, numerical experiments based on real business data demonstrate the robustness and scalability of the proposed approach.