Key Recent Works

Our team's research focuses on advancing core areas of AI and reasoning. We have worked extensively on evaluation methodology, including approaches that assess reasoning beyond traditional accuracy metrics, and on AI alignment, particularly fostering honesty and ethical behavior in models. We also develop datasets and benchmarks that challenge AI’s cognitive and reasoning abilities across multiple disciplines, and we have contributed to generative AI for mathematics, with an emphasis on large-scale pretraining and reasoning. This prior work gives us the expertise to take on ambitious AI challenges.

Highly relevant recent projects are listed below, ordered by date of publication on arXiv:

Project	Focus	GitHub	Website	Publication	Date
Generative AI for Math: Abel	Reasoning models	Code	Homepage	arXiv	2023.09
Alignment for Honesty	Honesty-aligned models	Code	Homepage	NeurIPS 2024	2023.12
MathPile: A Billion-Token-Scale Pretraining Corpus for Math	Math pretraining data	Code	Homepage	NeurIPS 2024	2023.12
Reformatted Alignment	Alignment methods	Code	Homepage	EMNLP 2024	2024.02
Evaluating Mathematical Reasoning Beyond Accuracy	Reasoning evaluation	Code	-	arXiv	2024.04
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI	Reasoning evaluation	Code	Homepage	NeurIPS 2024	2024.06
Progress or Regress? Self-Improvement Reversal in Post-training	Evaluation	Code	-	ICML 2024 workshop	2024.07
Weak-to-Strong Reasoning	Reasoning	Code	-	EMNLP 2024	2024.07
OpenResearcher: Unleashing AI for Accelerated Scientific Research	AI for scientific research	Code	-	EMNLP 2024 demo	2024.08