Investigation of the USV-AUV Cooperative Environment via Reinforcement Learning and Its Impact on Data Collection and Energy Efficiency


TÜRKOĞLU M. M., AKYUZ E.

OCEAN ENGINEERING, vol. 348, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Article
  • Volume: 348
  • Publication Date: 2026
  • DOI Number: 10.1016/j.oceaneng.2025.124004
  • Journal Name: OCEAN ENGINEERING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, Environment Index, Geobase, ICONDA Bibliographic, INSPEC
  • Affiliated: No

Abstract

Autonomous Underwater Vehicles (AUVs) and Unmanned Surface Vehicles (USVs) are key enablers for underwater sensing, yet their performance degrades under turbulent currents and wave-induced disturbances. We present a unified simulation framework for cooperative USV-AUV missions that operationalizes the Fisher Information Matrix (FIM) within multi-agent reinforcement learning: FIM-guided USV positioning feeds into the agents' observations and, when applicable, their reward signals to reduce localization uncertainty and improve coordination. We evaluate three policies, Proximal Policy Optimization (PPO), Curriculum Reinforcement Learning (CRL), and Twin Delayed Deep Deterministic Policy Gradient with FIM features (TD3-FIM), under a single, reproducible setup across two regimes (normal and extreme sea states), with training and testing performed in both. Evaluation uses task-level metrics (data rate, total throughput), a resource metric (energy), a system-health metric (overflow events), and tracking error. A composite Reliability Index (RI) summarizes multi-objective performance. Results show that PPO consistently achieves higher reliability and more stable data collection than CRL in both regimes, while FIM-guided USV adaptation markedly lowers tracking error relative to static baselines. The three-arm comparison establishes a practical benchmark for USV-AUV cooperation with physically motivated sea dynamics. Limitations include a simulation-based evaluation and simplified acoustic assumptions; future work will consider multi-USV coordination, latency/packet-loss models, and hardware-in-the-loop trials.
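
The abstract does not give the paper's FIM formulation, so the following is only a minimal illustrative sketch of what FIM-guided positioning can look like, under stated assumptions: a 2D geometry, range-only measurements with independent Gaussian noise of standard deviation `sigma`, and a USV that picks its next waypoint from a fixed candidate set by maximizing the FIM determinant (a common D-optimality criterion). The function names `range_fim`, `fim_det`, and `best_next_position` are hypothetical and are not taken from the paper.

```python
import math


def range_fim(target, sensor_positions, sigma=1.0):
    """Return (Fxx, Fxy, Fyy) of the 2x2 FIM for range-only measurements.

    Assumption: i.i.d. Gaussian range noise with standard deviation sigma,
    so FIM = (1 / sigma^2) * sum_i u_i u_i^T, where u_i is the unit vector
    pointing from sensor position i to the target.
    """
    fxx = fxy = fyy = 0.0
    for sx, sy in sensor_positions:
        dx, dy = target[0] - sx, target[1] - sy
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue  # direction is undefined at zero range; skip this sample
        ux, uy = dx / dist, dy / dist
        fxx += ux * ux
        fxy += ux * uy
        fyy += uy * uy
    scale = 1.0 / sigma ** 2
    return (fxx * scale, fxy * scale, fyy * scale)


def fim_det(fim):
    """Determinant of the symmetric 2x2 FIM; larger means less uncertainty."""
    fxx, fxy, fyy = fim
    return fxx * fyy - fxy * fxy


def best_next_position(target_estimate, past_positions, candidates, sigma=1.0):
    """Pick the candidate USV position that maximizes det(FIM).

    past_positions: list of USV positions where ranges were already taken.
    candidates: hypothetical set of reachable next USV positions.
    """
    return max(
        candidates,
        key=lambda c: fim_det(
            range_fim(target_estimate, past_positions + [c], sigma)
        ),
    )


if __name__ == "__main__":
    auv_estimate = (0.0, 0.0)
    visited = [(10.0, 0.0)]
    options = [(20.0, 0.0), (0.0, 10.0), (-10.0, 0.0)]
    print(best_next_position(auv_estimate, visited, options))
```

In this toy geometry the collinear candidates add no new bearing direction, so the determinant stays at zero and the perpendicular candidate is preferred. A learned policy could receive such a determinant (or the chosen position) as an observation or reward feature, but how the paper wires this in is not specified in the abstract.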