How to Derive a Long Thought from a Reasoning Tree?
Once the reasoning tree is constructed, the next step is to derive a long thought that includes trial and error, moving beyond traditional shortcuts focused solely on the correct answer.
-
1. Constructing the Shortcut: We first construct the shortCut from the reasoning tree, which includes only the correct answer and valid intermediate steps. Starting from the root node, which represents a question, we identify a path that leads to a correct answer leaf node. If there are multiple correct answer nodes, multiple correct paths will be established.
-
2. raversal Path: To generate a long thought, we use Depth First Search (DFS) to explore the reasoning tree. The DFS explores both correct and incorrect paths, documenting each step and its reasoning. To simplify the process and reduce excessive exploration, we set constraints—allowing only a limited number of trials on incorrect paths for each node.
-
3. Long Thought Construction: After generating the traversal path, we compile a draft long thought that includes reasoning for both correct and incorrect steps. However, initial drafts produced suboptimal results, so we use GPT-4o to refine the draft, improving coherence while preserving the reflections, corrections, and reasoning steps. This results in a long thought that not only captures the complete problem-solving process but also flows naturally, simulating human-like reasoning.