News

Latest news and updates from StatAI Lab.

2026

StatAI Lab and CMFAI Jointly Launch DataSciEval — A Unified Benchmark for LLM and AI Agent Data Science Capabilities

July 21, 2026 July 23, 2026

We are excited to announce the release of **[DataSciEval](https://huggingface.co/spaces/StatAILab/DataSciEval)**, a unified benchmark for evaluating the data science capabilities of large language models and AI agents, jointly developed with HK PolyU CMFAI.

Two Papers Accepted to STAl-X 2026 — One Selected for Paper Award

June 22, 2026 July 23, 2026

Two papers on RLHF and A/B testing have been accepted to STAl-X 2026, with one paper selected for the Paper Award.

Professor Fan Zhou Receives Distinguished Alumni Award from UNC Chapel Hill

May 19, 2026 July 23, 2026

Professor Fan Zhou has been honored with the James E. Grizzle Distinguished Alumni Award by the University of North Carolina at Chapel Hill.

Professor Fan Zhou accepted the invitation to serve as Area Chair for NeurIPS 2026

May 02, 2026 July 23, 2026

Professor Fan Zhou accepted the invitation to serve as Area Chair for NeurIPS 2026.

StatProver — Agentic Statistical Proof Assistant

April 23, 2026 July 23, 2026

We are excited to announce the release of **[StatProver](https://statprover.com)**, a brand-new agentic statistical proof assistant. StatProver helps users clarify the problem, find references, outline the skeleton steps, and write the proof.

New paper accepted at JASA

April 14, 2026 July 23, 2026

Our paper, "[Discussion of LAMBDA: Large Model Based Data Agent](assets/publications/JASA2026.pdf)" (Bang Liu, Run Yang, Fan Zhou), has been published [online](https://www.tandfonline.com/doi/full/10.1080/01621459.2025.2554757) in the Journal of the American Statistical Association (JASA).

New paper accepted at JASA

April 13, 2026 July 23, 2026

Our paper, "[Distributional Off-Policy Evaluation with Deep Quantile Process Regression](/assets/publications/JASA2026a.pdf)" (Qi Kuang, Chao Wang, Yuling Jiao, Fan Zhou), has been accepted at the Journal of the American Statistical Association (JASA).

2025

StatEval — Benchmarking Statistical Reasoning in Large Language Models

October 12, 2025 July 23, 2026

**[StatEval](https://statai-lab.github.io/StatEval.github.io/)**, developed by the team of Professor Fan Zhou, is the first benchmark systematically organized along both difficulty and disciplinary axes to evaluate large language models’ statistical reasoning. It includes a Foundational Knowledge Dataset of **22,262** problems (**9,382** undergraduate and **12,880** graduate instances) curated from 76 classical textbooks and extensive exam collections. It also features a Statistical Research Dataset of **84,179** proof-based tasks derived from **6,953** high-impact research articles published between 2000 and 2025. These tasks are categorized by derivation difficulty into **40,366** Easy, **22,013** Medium, and **21,800** Hard problems. A representative partial test set (Demo) is publicly available on [Hugging Face](https://huggingface.co/datasets/StatAILab/StatEval).