Yubo Dong1, Nianhao You1, Yuxuan Hou1, Zixun Sun1, Yue Zhang1, Liang Zhang2, Siyuan Zhao2, Hehe Fan1, ✉
1Zhejiang University, 2Ant Group


🌟 Visit the Official Project Website & Leaderboard


Super Research introduces a new benchmark for evaluating LLMs on highly complex questions that demand long-horizon planning, large-scale evidence gathering, and synthesis across heterogeneous sources. Systems are evaluated along two axes: Super Deep Investigation and Super Wide Retrieval.


Overview

![Data distribution of the benchmark](data_dist_01.png)

While Large Language Models (LLMs) have demonstrated proficiency in Deep Research and Wide Search, their capacity to solve highly complex questions remains largely unexplored. We introduce Super Research, a complex autonomous research task that integrates:

  1. Structured decomposition of the question into a research plan;
  2. Super wide retrieval to gather diverse perspectives;
  3. Super deep investigation to resolve uncertainties through iterative queries.

To evaluate this capability, we curated a benchmark of 300 expert-written questions across diverse domains, each requiring up to 100+ retrieval steps and 1,000+ web pages to reconcile conflicting evidence.


Methodology & Graph-Anchored Auditing

Super Research employs structured, graph-based reasoning to handle high-stakes analysis. We present a graph-anchored auditing protocol that evaluates Super Research systems along five dimensions:

  • Coverage
  • Logical Consistency
  • Report Utility
  • Objectivity
  • Citation Health (Dominance & Monopolization)

![Super Research pipeline](superresearch-pipeline.png)
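The Citation Health dimension can be made concrete with simple concentration statistics over a report's citation list. The sketch below is illustrative only and is not the paper's definition: it measures dominance as the share of citations going to the single most-cited source, and monopolization as a Herfindahl-Hirschman-style sum of squared citation shares.

```python
# Illustrative citation-health signals (assumed metrics, not the
# official protocol): dominance and an HHI-style monopolization score.
from collections import Counter


def citation_health(cited_sources):
    """Return (dominance, monopolization) for a list of source IDs.

    dominance: fraction of citations going to the most-cited source.
    monopolization: sum of squared citation shares; 1.0 means every
    citation points at a single source.
    """
    counts = Counter(cited_sources)
    total = sum(counts.values())
    shares = [c / total for c in counts.values()]
    dominance = max(shares)
    monopolization = sum(s * s for s in shares)
    return dominance, monopolization


# Example: 6 citations, 3 to "wiki", 2 to "news", 1 to "blog"
d, m = citation_health(["wiki"] * 3 + ["news"] * 2 + ["blog"])
```

Under these assumed definitions, a report citing many independent sources scores low on both signals, while one leaning on a single source approaches 1.0 on each.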



@article{superresearch2026,
  title={Super Research: Answering Highly Complex Questions with Large Language Models through Super Deep and Super Wide Research},
  author={Dong, Yubo and You, Nianhao and Hou, Yuxuan and Sun, Zixun and Zhang, Yue and Zhang, Liang and Zhao, Siyuan and Fan, Hehe},
  journal={arXiv preprint},
  year={2026}
}