Initialize project; model provided by the ModelHub XC community

Model: uw-math-ai/gAPRIL-wo-exp Source: Original Platform
2026-04-20 13:51:08 +08:00
commit 2e6784298f
16 changed files with 152506 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,190 @@
+---
+license: apache-2.0
+datasets:
+- uw-math-ai/APRIL
+tags:
+  - lean4
+base_model:
+- Goedel-LM/Goedel-Prover-V2-8B
+pipeline_tag: text-generation
+library_name: transformers
+---
+
+# APRIL-Goedel-8B: Lean Proof Repair (Repair Only)
+
+This model is a LoRA finetune of [Goedel-Prover-V2-8B](https://huggingface.co/Goedel-LM/Goedel-Prover-V2-8B) on the [APRIL](https://huggingface.co/datasets/uw-math-ai/APRIL) dataset for **Lean 4 proof repair without explanation supervision**. Given an erroneous Lean proof and compiler feedback, the model directly produces a corrected proof. This variant maximizes single-shot repair accuracy by training exclusively on the repair objective.
+
+## Model Details
+
+- **Base model:** Goedel-Prover-V2-8B
+- **Method:** Supervised finetuning with LoRA (rank 32, α = 64)
+- **Training data:** APRIL — 260K paired erroneous/correct Lean proofs with compiler diagnostics (explanations excluded from supervision)
+- **Task:** Proof repair only
+- **Lean version:** 4.22.0-rc4
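+
+As a reading aid for the LoRA hyperparameters above, the following sketch assumes the standard LoRA parameterization, in which a frozen weight $W_0$ receives a low-rank update scaled by $\alpha / r$. The card does not state which modules were adapted or whether a scaling variant was used, so the symbols $W_0$, $A$, $B$, $d$, and $k$ are illustrative only.
+
+$$
+W = W_0 + \frac{\alpha}{r}\, B A, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k}
+$$
+
+$$
+r = 32,\quad \alpha = 64 \;\Longrightarrow\; \frac{\alpha}{r} = \frac{64}{32} = 2
+$$
+
+Under that assumption, each adapted $d \times k$ matrix adds $r(d + k) = 32(d + k)$ trainable parameters, and the learned update is applied with a scale factor of 2.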
+
+## Results
+
+Single-shot proof repair accuracy (pass@1) on the APRIL test set (1,835 examples), evaluated by Lean compilation:
+
+| Error Type | This Model (w/o exp) | With Explanations | Base Goedel-8B | Goedel-32B |
+|---|---|---|---|---|
+| **Full** | **36.7%** | 34.6% | 15.5% | 26.8% |
+| Tactic | **48.5%** | 41.7% | 19.6% | 34.2% |
+| Line | 25.5% | 18.5% | 20.0% | **28.5%** |
+| Theorem | **37.5%** | 36.8% | 12.7% | 23.0% |
+| Multi-Line | 24.3% | 20.8% | 19.4% | **32.6%** |
+
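+Training exclusively for repair (without explanation supervision) gives the highest pass@1 accuracy among the 8B models in every category, improving on the joint variant by 2.1 percentage points on the full test set (36.7% vs. 34.6%). It also exceeds Goedel-32B on the Full, Tactic, and Theorem rows, while Goedel-32B remains ahead on Line and Multi-Line errors. However, this model does not produce human-interpretable diagnostics. See the [with-explanation variant](https://huggingface.co/uw-math-ai/gAPRIL-w-exp) for the trade-off discussion.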
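+
+For a sense of scale, the Full-row figures in the table can be turned into absolute and relative differences. This is simple arithmetic on the reported, rounded percentages and not an additional result; in particular, the example count is only approximate because the card does not list raw counts.
+
+$$
+\begin{aligned}
+\text{vs. with explanations:} \quad & 36.7 - 34.6 = 2.1 \text{ points}, \qquad 0.021 \times 1835 \approx 39 \text{ proofs} \\
+\text{vs. base Goedel-8B:} \quad & 36.7 - 15.5 = 21.2 \text{ points}, \qquad 36.7 / 15.5 \approx 2.37\times \\
+\text{vs. Goedel-32B:} \quad & 36.7 - 26.8 = 9.9 \text{ points}
+\end{aligned}
+$$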
+
+## Model & Dataset Download
+
+| Resource | Description | Link |
+|---|---|---|
+| **APRIL Dataset** | 260K Lean proof-repair tuples with compiler diagnostics and explanations | [uw-math-ai/APRIL](https://huggingface.co/datasets/uw-math-ai/APRIL) |
+| **gAPRIL-w-exp** | Goedel-8B finetuned on APRIL with joint explanation supervision | [uw-math-ai/gAPRIL-w-exp](https://huggingface.co/uw-math-ai/gAPRIL-w-exp) |
+| **gAPRIL-wo-exp** | Goedel-8B finetuned on APRIL for repair only (no explanations) | [uw-math-ai/gAPRIL-wo-exp](https://huggingface.co/uw-math-ai/gAPRIL-wo-exp) |
+
+## Usage
+
+The model expects a chat-formatted prompt with the erroneous proof, goal state, error line, and compiler error message. The assistant response contains the corrected proof in a `lean` code block.
+
+**System:** `You are a Lean 4 programmer diagnosing a single failing proof. Assume you only see the incorrect proof text, the infoview state near the failure, and Lean's error message.`
+
+**User:**
+````
+**Instruction:** Provide the full corrected Lean 4 theorem/proof in a single ```lean``` code block.
+
+**Context:**
+
+Incorrect proof:
+```lean
+<erroneous proof>
+```
+
+Infoview state:
+<goal state before error from InfoView>
+
+Line at error:
+<error-occurred line of code>
+
+Lean error:
+<error messages from InfoView>
+````
+
+**Assistant** (model output):
+````
+```lean
+<corrected proof>
+```
+````
+
+### Example Inference
+
+````python
+import re
+
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+
+torch.manual_seed(42)
+
+
+def extract_proof_from_text(output):
+    """Return the last fenced Lean block that looks like a proof, or None."""
+    # `lean4?` accepts both ```lean and ```lean4 fences. A bare `lean` pattern
+    # also matches ```lean4 and leaks a stray "4" into the captured code.
+    lean_codes = re.findall(r"```lean4?\s*(.*?)\s*```", output, re.DOTALL)
+    words = ["by", ":="]
+
+    # Scan from the last block backwards and return the first one that
+    # contains both a `:=` and a `by`.
+    for lean_code in reversed(lean_codes):
+        if all(word in lean_code for word in words):
+            return lean_code
+    return None
+
+
+model_id = "uw-math-ai/gAPRIL-wo-exp"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id, device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True
+)
+
+system_prompt = (
+    "You are a Lean 4 programmer diagnosing a single failing proof. "
+    "Assume you only see the incorrect proof text, the infoview state"
+    " near the failure, and Lean's error message."
+)
+
+# Context information for the incorrect proof
+incorrect_proof = """
+theorem lean_problem : IsLeast {x : ℕ | x > 0 ∧ (7 * x) % 100 = 29} 47 := by
+  constructor
+  · constructor
+    · norm_num
+    · norm_num
+  · intro x ⟨hx_pos, hx_cong⟩
+    by_contra h
+    push_neg at h
+    obtain ⟨h_le, h_ne⟩ := lt_iff_le_and_ne.mp h
+    have h_lt := h_le
+    revert x hx_pos hx_cong h_lt
+    refine' Nat.le_induction _ _ 47 _ <;> intros x hx_lt hx_pos hx_cong
+    · rfl
+    · have : x < 47 := by omega
+      interval_cases x
+      all_goals try { norm_num at hx_cong; norm_num }
+""".strip()
+
+infoview_state = (
+    "case right.intro.refine'_4 ⊢ ∀ (n : ℕ), sorry ≤ n → "
+    "(∀ ⦃x : ℕ⦄, x > 0 → 7 * x % 100 = 29 → x < n → x ≤ n → x ≠ n → x ≤ n → False) → "
+    "∀ ⦃x : ℕ⦄, x > 0 → 7 * x % 100 = 29 → x < n + 1 → x ≤ n + 1 → x ≠ n + 1 → x ≤ n + 1 → False"
+)
+
+line_at_error = "refine' Nat.le_induction _ _ 47 _ <;> intros x hx_lt hx_pos hx_cong"
+
+error_message = (
+    "tactic 'introN' failed, insufficient number of binders\n"
+    "case right.intro.refine'_1\n⊢ ℕ"
+)
+
+user_prompt = f"""
+**Instruction:** Provide the full corrected Lean 4 theorem/proof in a single ```lean``` code block.
+
+**Context:**
+
+Incorrect proof:
+```lean
+{incorrect_proof}
+```
+
+Infoview state:
+{infoview_state}
+
+Line at error:
+{line_at_error}
+
+Lean error:
+{error_message}
+""".strip()
+
+
+chat = [
+    {"role": "system", "content": system_prompt},
+    {"role": "user", "content": user_prompt},
+]
+
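+inputs = tokenizer.apply_chat_template(
+    chat,
+    tokenize=True,
+    add_generation_prompt=True,
+    return_dict=True,
+    return_tensors="pt",
+).to(model.device)
+
+outputs = model.generate(**inputs, max_new_tokens=8192)
+# Decode only the newly generated tokens. Decoding the full sequence would
+# include the prompt, whose ```lean block holds the incorrect proof and could
+# be returned by mistake when the model emits no code block of its own.
+prompt_length = inputs["input_ids"].shape[-1]
+decoded_outputs = tokenizer.batch_decode(
+    outputs[:, prompt_length:], skip_special_tokens=True
+)
+proof = extract_proof_from_text(decoded_outputs[0])
+print(proof)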
+````
+
+## Citation
+
+```bibtex
+@article{wang2026repairlean,
+  title  = {Learning to Repair Lean Proofs from Compiler Feedback},
+  author = {Wang, Evan and Chess, Simon and Lee, Daniel and Ge, Siyuan and Mallavarapu, Ajit and Ilin, Vasily},
+  journal= {arXiv preprint arXiv:2602.02990},
+  year   = {2026},
+  doi    = {10.48550/arXiv.2602.02990},
+  url    = {https://arxiv.org/abs/2602.02990}
+}
+```
+
+## Acknowledgements
+
+Work by the Math AI Lab, University of Washington. Supported by UW eScience School, UW IT (AWS credits), UW Department of Applied Mathematics (GPU access), and Nebius (LLM cloud credits).