Reformatted Alignment

¹Shanghai Jiao Tong University, ²Shanghai Artificial Intelligence Laboratory, ³Fudan University, ⁴University of Maryland, College Park, ⁵CMU, ⁶Generative AI Research Lab (GAIR), *Corresponding Author

Introduction

This work explores elevating the quality of existing instruction data to better align with human values. We introduce a simple and effective approach named ReAlign (Reformatted Alignment), which reformats the responses of instruction data into a format that better aligns with pre-established criteria and collated evidence. The approach minimizes the need for human annotation, reduces hallucination, avoids the difficulty of scaling, and remains orthogonal to existing alignment techniques. Experimentally, ReAlign significantly boosts the general alignment ability, math reasoning, factuality, and readability of LLMs.

Encouragingly, without introducing any additional data or advanced training techniques, and merely by reformatting the responses, LLaMA-2-13B's mathematical reasoning accuracy on GSM8K improves from 46.77% to 56.63%. Additionally, a mere 5% of ReAlign data yields a 67% boost in general alignment ability, measured on the Alpaca dataset. This work highlights the need for further research into the science and interpretability of LLMs.

The underlying philosophy of ReAlign is to re-coordinate the roles of humans and LLMs in the alignment process, leveraging their complementary strengths: humans articulate their preferences, and LLMs, in turn, reconstruct the instruction data based on their generative power (e.g., instruction-following ability), without directly using distilled LLM knowledge. Through this collaborative synergy, we expect the generated instruction data to be not only more contextually precise but also more closely aligned with human preferences.

Figure 1: Accuracy on the GSM8K test set for LLaMA-2-13B and Mistral-7B fine-tuned on the training sets of GSM8K and MATH, with and without ReAlign. (a) Training and testing on GSM8K. (b) Training on MATH and testing on GSM8K (out-of-distribution setting).

Methodology

Figure 2: An overview of ReAlign, which consists of three steps. KILT denotes Knowledge-Intensive Language Tasks.

The ReAlign process unfolds in three main steps.

The first step is criteria definition, where humans state their preferences (e.g., the preferred format of responses) for various scenarios in natural language. In this paper, we meticulously define criteria for 46 distinct scenarios.
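To make this concrete, the sketch below shows one possible way to store per-scenario criteria as plain natural-language format descriptions keyed by scenario name. The scenario names and criterion texts are illustrative placeholders, not the paper's actual 46 definitions.

    # A minimal sketch of per-scenario format criteria (illustrative, not the
    # paper's actual definitions).
    CRITERIA = {
        "open_qa": (
            "Answer the question directly first, then give a brief explanation "
            "that cites the supporting evidence."
        ),
        "math_reasoning": (
            "Solve the problem step by step, numbering each step, and state "
            "the final answer on its own line."
        ),
        "code_generation": (
            "Explain the approach, provide commented code, and end with a "
            "short usage example."
        ),
    }

    def get_criterion(scenario: str) -> str:
        """Look up the natural-language format criterion for a scenario."""
        return CRITERIA[scenario]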

The second step, retrieval augmentation, broadens the knowledge base for knowledge-intensive tasks like open-domain QA and fact verification. This is achieved by incorporating additional information, thereby improving the factuality and informativeness of responses.
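A rough sketch of this step is given below, assuming a generic retrieval backend; web_search is a hypothetical placeholder rather than a specific API, and the set of knowledge-intensive scenarios is likewise illustrative.

    from typing import List

    # Illustrative subset of scenarios treated as knowledge-intensive.
    KNOWLEDGE_INTENSIVE = {"open_qa", "fact_verification"}

    def web_search(query: str, k: int = 3) -> List[str]:
        """Placeholder: return the top-k evidence passages for the query."""
        raise NotImplementedError("plug in a real search or retrieval backend")

    def collect_evidence(scenario: str, question: str) -> List[str]:
        """Retrieve supporting passages only for knowledge-intensive scenarios."""
        if scenario not in KNOWLEDGE_INTENSIVE:
            return []
        return web_search(question)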

The final step, reformatting, re-aligns the responses with the pre-established criteria and the collated evidence, producing outputs that are both well-structured and substantiated.
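The sketch below illustrates how the criterion and the retrieved evidence might be combined into a single rewriting prompt. call_llm is a hypothetical placeholder for any instruction-following model API, and the prompt wording is an assumption, not the paper's exact template.

    def call_llm(prompt: str) -> str:
        """Placeholder: send the prompt to an instruction-following LLM."""
        raise NotImplementedError("plug in an LLM API call")

    def realign_response(question: str, original_response: str,
                         criterion: str, evidence: list) -> str:
        """Rewrite a response so it follows the criterion and cites the evidence."""
        evidence_block = "\n".join(f"- {e}" for e in evidence) or "(none)"
        prompt = (
            "Rewrite the response so that it follows the format criterion and "
            "is consistent with the evidence. Keep the original facts and do "
            "not add unsupported claims.\n\n"
            f"Format criterion:\n{criterion}\n\n"
            f"Evidence:\n{evidence_block}\n\n"
            f"Question:\n{question}\n\n"
            f"Original response:\n{original_response}\n\n"
            "Reformatted response:"
        )
        return call_llm(prompt)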

Examples

Figure 3: ReAlign rewrites the original response into a better format according to the pre-defined criteria.
Figure 4: An example response from the original model and from the ReAlign model.

Results

Figure 5: General alignment ability on the original datasets and the ReAlign datasets.
Figure 6: Math reasoning results on GSM8K and MATH, with and without ReAlign, for LLaMA-2-13B and Mistral-7B. Models are evaluated on both the GSM8K and MATH test sets; accuracy is computed by exact matching.
Figure 7: Factuality scores.
Figure 8: Readability win rate of the original dataset + ReAlign against the original dataset, based on LLaMA-2-13B and judged by GPT-4 and humans.
Figure 9: Scaling trends with the percentage of ReAlign data, covering general alignment ability and knowledge ability, on the Alpaca dataset based on LLaMA-2-13B.

BibTeX

@article{fan2024reformatted,
      title={Reformatted Alignment},
      author={Fan, Run-Ze and Li, Xuefeng and Zou, Haoyang and Li, Junlong and He, Shwai and Chern, Ethan and Hu, Jiewen and Liu, Pengfei},
      year={2024},
      journal={arXiv preprint arXiv:2402.12219},
      url={https://arxiv.org/abs/2402.12219}
}