α³-Bench is a large-scale benchmark dataset for evaluating LLM agents in autonomous UAV systems through multi-turn conversational reasoning under realistic 6G network constraints. It measures mission success, safety, dialogue quality, tool use, and network-aware decision-making.
Category: research