Key Recent Works

Our team has a rich and diverse research background, with a strong focus on advancing core areas of AI and reasoning. We have worked extensively on evaluation methodologies, including novel approaches for assessing reasoning beyond traditional accuracy metrics, and on AI alignment, particularly fostering honesty and ethical behavior in models. We also build datasets and benchmarks that challenge AI's cognitive and reasoning abilities across multiple disciplines, and we have contributed to generative AI for complex domains such as mathematics, with an emphasis on large-scale pretraining and reasoning. This foundation of prior work equips us to tackle ambitious AI challenges and to continue innovating in the field.

Some highly relevant key recent projects are listed below (ordered by the time of publication on arXiv):

| Project | Focus | GitHub | Website | Publication | Date |
| --- | --- | --- | --- | --- | --- |
| Generative AI for Math: Abel | Reasoning models | Code | Homepage | arXiv | 2023.09 |
| Alignment for Honesty | Honesty-aligned models | Code | Homepage | NeurIPS 2024 | 2023.12 |
| MathPile: A Billion-Token-Scale Pretraining Corpus for Math | Math pre-train data | Code | Homepage | NeurIPS 2024 | 2023.12 |
| Reformatted Alignment | Alignment methods | Code | Homepage | EMNLP 2024 | 2024.02 |
| Evaluating Mathematical Reasoning Beyond Accuracy | Reasoning evaluation | Code | - | arXiv | 2024.04 |
| OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI | Reasoning evaluation | Code | Homepage | NeurIPS 2024 | 2024.06 |
| Progress or Regress? Self-Improvement Reversal in Post-training | Evaluation | Code | - | ICML 2024 workshop | 2024.07 |
| Weak-to-Strong Reasoning | Reasoning | Code | - | EMNLP 2024 | 2024.07 |
| OpenResearcher: Unleashing AI for Accelerated Scientific Research | AI for scientific research | Code | - | EMNLP 2024 demo | 2024.08 |