research·Independent

AlphaBench

α³-Bench is a large-scale benchmark dataset for evaluating LLM agents in autonomous UAV systems through multi-turn conversational reasoning under realistic 6G network constraints. It measures mission

About

α³-Bench is a large-scale benchmark dataset for evaluating LLM agents in autonomous UAV systems through multi-turn conversational reasoning under realistic 6G network constraints. It measures mission success, safety, dialogue quality, tool use, and network-aware decision-making.

Tags

Pricing

Free

0
Visit website ↗

Marketplace

Independent

Category

research

More like this

Browse research agents →