Initialize project; model provided by the ModelHub XC community
Model: codefuse-ai/SWE-CARE-RM Source: Original Platform
232
README.md
Normal file
@@ -0,0 +1,232 @@
---
license: apache-2.0
language:
- en
- zh
base_model:
- Qwen/Qwen3-8B
library_name: transformers
tags:
- rm
- cr
---

# SWE-CARE-RM

This model is a custom reward model built on top of **Qwen3-8B** with:

- a merged **LoRA** adapter
- an additional **projector head**
- a scalar reward output in **[0, 1]**

The model is designed to score the quality of a review conditioned on:

1. an issue / problem statement
2. a code patch
3. a candidate review

A higher score means the model considers the review better under the given issue and patch.

## Model Architecture

The model consists of:

- base model: **Qwen3-8B**
- adaptation: **LoRA**
- reward head: a custom **MLP projector**
- final score: `sigmoid(projector(last_hidden_state[:, -1]))`

This repository contains the **merged decoder weights** together with `projector.pth`.
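
To make the final-score formula concrete, here is a minimal pure-Python sketch of the head applied to one last-token hidden state. It assumes a two-layer projector (Linear, ReLU, Linear); the actual depth is read from `reward_config.json` or inferred from `projector.pth`. The function names, the 3-dimensional vector, and the weights are all made up for illustration.

```python
import math


def relu(values):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in values]


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def reward_from_last_hidden(h_last, w1, b1, w2, b2):
    """sigmoid(Linear(ReLU(Linear(h_last)))) for an assumed two-layer projector."""
    hidden = relu(linear(w1, b1, h_last))
    logit = linear(w2, b2, hidden)[0]
    return 1.0 / (1.0 + math.exp(-logit))


# Toy "hidden state" of the last token and made-up weights (illustration only).
h_last = [0.5, -1.0, 2.0]
w1 = [[0.1, 0.0, 0.2], [0.0, 0.3, -0.1], [0.2, 0.2, 0.0]]
b1 = [0.0, 0.0, 0.0]
w2 = [[1.0, -1.0, 0.5]]
b2 = [0.0]

score = reward_from_last_hidden(h_last, w1, b1, w2, b2)
print(score)  # a value strictly between 0 and 1
```

The sigmoid is what bounds the reward between 0 and 1; the projector itself produces an unbounded logit.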

## Input Format

The model expects three text fields:

- `issue`
- `patch`
- `review`

During inference, the input is formatted as:

```text
<issue>{issue}</issue><patch>{patch}</patch><review>{review}<review>
```

The score is computed from the hidden state of the last token.
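
As a rough illustration, the sketch below builds the three text segments the way the Quick Start code in this README does before tokenization: `issue` and `patch` are wrapped in their tags, and the review text is passed as-is. `build_segments` is a hypothetical helper name, not part of the repository.

```python
def build_segments(issue, patch, review):
    """Return the text segments in the order they are concatenated.

    Mirrors the Quick Start code: each segment is tokenized separately and
    the token ids are joined as issue + patch + review.
    """
    return [
        f"<issue>{issue}</issue>",
        f"<patch>{patch}</patch>",
        review,  # the Quick Start tokenizes the review without tags
    ]


segments = build_segments(
    issue="Crash when the config file is empty",
    patch="--- a/app.py\n+++ b/app.py\n+if not cfg:\n+    return {}",
    review="The guard fixes the crash, but please add a regression test.",
)
for segment in segments:
    print(segment)
```

Keeping the fields as separate segments lets each one be truncated independently before the token ids are joined.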

## Quick Start

```python
from pathlib import Path
import json

import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer


# MODEL_DIR is also used as a filesystem path below (data_sample.jsonl,
# reward_config.json, projector.pth), so it must point at a local copy of
# this repository.
MODEL_DIR = "codefuse-ai/SWE-CARE-RM"
MAX_SEQ_LEN = 51200
MIN_REVIEW_LEN = 4096
TRUST_REMOTE_CODE = True

# Use the first record of the bundled sample file as the example input.
with open(f"{MODEL_DIR}/data_sample.jsonl", "r") as fr:
    for line in fr:
        json_data = json.loads(line)
        break

SAMPLE = {
    "issue": json_data['problem_statement'],
    "patch": json_data['patch_to_review'],
    "review": json_data['pos_review'][0]
}


class Projector(nn.Module):
    """MLP reward head; `arch` has the form "mlp<depth>x_relu"."""

    def __init__(self, arch, input_size, hidden_size, use_bf16):
        super().__init__()
        depth = int(arch[len("mlp"): arch.index("x_relu")])
        layers = [nn.Linear(input_size, hidden_size).bfloat16() if use_bf16 else
                  nn.Linear(input_size, hidden_size)]
        for _ in range(1, depth):
            layers.append(nn.ReLU())
            layers.append(nn.Linear(hidden_size, 1).bfloat16() if use_bf16 else
                          nn.Linear(hidden_size, 1))
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)


def resolve_dtype(dtype_name):
    """Map a dtype name from the config to a torch dtype (default: float32)."""
    if dtype_name in {"bf16", "bfloat16"}:
        return torch.bfloat16
    if dtype_name in {"fp16", "float16"}:
        return torch.float16
    return torch.float32


def infer_proj_arch(projector_state_dict):
    """Derive the projector depth from the number of linear weight tensors."""
    linear_weight_keys = [k for k in projector_state_dict if k.startswith("model.")
                          and k.endswith(".weight")]
    return f"mlp{len(linear_weight_keys)}x_relu"


def process_one(issue_ids, issue_masks, patch_ids, patch_masks, review_ids,
                review_masks, max_len, min_review_len):
    """Join issue, patch and review ids into one left-padded sequence of max_len."""
    # Keep at most min_review_len tokens from the end of the review.
    review_keep = min(min_review_len, len(review_ids))
    # The patch gets whatever budget is left after the issue and the review.
    remain_for_patch = max(max_len - len(issue_ids) - review_keep, 0)
    patch_keep = min(len(patch_ids), remain_for_patch)

    ids_all = issue_ids + patch_ids[:patch_keep] + review_ids[-review_keep:]
    masks_all = issue_masks + patch_masks[:patch_keep] + review_masks[-review_keep:]

    # Left-pad so that the last position is always a real token.
    if len(ids_all) < max_len:
        pad_len = max_len - len(ids_all)
        ids_all = [0] * pad_len + ids_all
        masks_all = [0] * pad_len + masks_all

    return ids_all[:max_len], masks_all[:max_len]


# Optional settings; fall back to defaults when reward_config.json is absent.
reward_config = {}
reward_config_path = Path(MODEL_DIR) / "reward_config.json"
if reward_config_path.exists():
    with open(reward_config_path, "r", encoding="utf-8") as f:
        reward_config = json.load(f)

projector_path = Path(MODEL_DIR) / "projector.pth"
projector_state_dict = torch.load(projector_path, map_location="cpu")
proj_arch = reward_config.get("proj_arch") or infer_proj_arch(projector_state_dict)
torch_dtype = resolve_dtype(reward_config.get("torch_dtype") or "bfloat16")
attn_implementation = reward_config.get("attn_implementation")

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR,
                                          trust_remote_code=TRUST_REMOTE_CODE,
                                          padding_side="left")

model_kwargs = {"trust_remote_code": TRUST_REMOTE_CODE, "torch_dtype": torch_dtype}
if attn_implementation:
    model_kwargs["attn_implementation"] = attn_implementation
decoder = AutoModelForCausalLM.from_pretrained(MODEL_DIR, **model_kwargs)

projector = Projector(proj_arch, decoder.config.hidden_size,
                      decoder.config.hidden_size, torch_dtype == torch.bfloat16)
projector.load_state_dict(projector_state_dict)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
decoder.to(device).eval()
projector.to(device).eval()

# Tokenize each field separately so they can be truncated independently.
issue_inputs = tokenizer(f"<issue>{SAMPLE['issue']}</issue>", padding=False,
                         truncation="longest_first")
patch_inputs = tokenizer(f"<patch>{SAMPLE['patch']}</patch>", padding=False,
                         truncation="longest_first")
review_inputs = tokenizer(SAMPLE["review"], padding=False, truncation="longest_first")

input_ids, attention_mask = process_one(
    issue_inputs["input_ids"],
    issue_inputs["attention_mask"],
    patch_inputs["input_ids"],
    patch_inputs["attention_mask"],
    review_inputs["input_ids"],
    review_inputs["attention_mask"],
    max_len=MAX_SEQ_LEN,
    min_review_len=MIN_REVIEW_LEN,
)

inputs = {
    "input_ids": torch.tensor([input_ids], dtype=torch.long, device=device),
    "attention_mask": torch.tensor([attention_mask], dtype=torch.long, device=device),
}

with torch.no_grad():
    # Final-layer hidden states, shape (batch, seq_len, hidden_size).
    hidden_state = decoder(**inputs, output_hidden_states=True).hidden_states[-1]
    # Score every position, then take the last token and squash to [0, 1].
    reward = torch.sigmoid(projector(hidden_state).squeeze(-1)[:, -1]).item()

print(reward)
```
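
The truncation and padding step is the least obvious part of the script, so here is a small sketch of it on toy token ids. `pack_ids` is a hypothetical, simplified copy of `process_one` that builds the attention mask itself. As the code is written, the issue is kept whole, the review keeps at most `min_review_len` tokens from its end, the patch receives the remaining budget, and padding goes on the left.

```python
def pack_ids(issue_ids, patch_ids, review_ids, max_len, min_review_len):
    """Simplified copy of process_one: returns (ids, attention_mask)."""
    review_keep = min(min_review_len, len(review_ids))
    remain_for_patch = max(max_len - len(issue_ids) - review_keep, 0)
    patch_keep = min(len(patch_ids), remain_for_patch)

    ids = issue_ids + patch_ids[:patch_keep] + review_ids[-review_keep:]
    mask = [1] * len(ids)

    # Left-pad with id 0 and mask 0, as the Quick Start code does.
    pad_len = max(max_len - len(ids), 0)
    ids = [0] * pad_len + ids
    mask = [0] * pad_len + mask
    return ids[:max_len], mask[:max_len]


issue, patch, review = [1, 2], [3, 4, 5, 6], [7, 8, 9]

# Tight budget: the patch is cut to 2 tokens, the review to its last 2.
print(pack_ids(issue, patch, review, max_len=6, min_review_len=2))
# → ([1, 2, 3, 4, 8, 9], [1, 1, 1, 1, 1, 1])

# Loose budget: nothing is cut from the patch, and two pad tokens go in front.
print(pack_ids(issue, patch, review, max_len=10, min_review_len=2))
# → ([0, 0, 1, 2, 3, 4, 5, 6, 8, 9], [0, 0, 1, 1, 1, 1, 1, 1, 1, 1])
```

Left padding keeps a real token in the last position, which is the position the reward is read from.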

## Output

The model outputs a single scalar reward score in [0, 1].

Typical interpretation:

- higher score: better review quality
- lower score: worse review quality

This score is best used for:

- ranking candidate reviews
- pairwise comparison
- reward modeling in downstream training or reranking
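
A minimal sketch of the ranking and pairwise uses, assuming the scores have already been computed for several candidate reviews of the same issue and patch. The helper names, review labels, and score values below are hypothetical.

```python
def rank_reviews(scored_reviews):
    """Sort (review, score) pairs from highest to lowest score."""
    return sorted(scored_reviews, key=lambda pair: pair[1], reverse=True)


def prefer(score_a, score_b):
    """Pairwise comparison: return "a" or "b" for the higher score, "tie" if equal."""
    if score_a == score_b:
        return "tie"
    return "a" if score_a > score_b else "b"


# Hypothetical scores for three candidate reviews of one issue/patch pair.
candidates = [
    ("review_1", 0.41),
    ("review_2", 0.87),
    ("review_3", 0.63),
]

ranked = rank_reviews(candidates)
print([name for name, _ in ranked])  # → ['review_2', 'review_3', 'review_1']
print(prefer(0.41, 0.87))            # → b
```

Only scores computed for the same issue and patch are compared here, which matches the relative nature of the reward.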

## Intended Use

This model is intended for:

- code review quality scoring
- reward modeling for review generation
- reranking multiple candidate reviews for the same issue and patch

## Limitations

- The score is relative, not an absolute guarantee of correctness.
- Inputs longer than the sequence budget are truncated (the Quick Start keeps at most `MAX_SEQ_LEN` tokens), which may affect results.
- The model should not be used as the only signal for production-critical review decisions.

## Citation

If you use this model, please cite SWE-CARE as appropriate.

```
@misc{guo2025codefusecrbenchcomprehensivenessawarebenchmarkendtoend,
      title={CodeFuse-CR-Bench: A Comprehensiveness-aware Benchmark for End-to-End Code Review Evaluation in Python Projects},
      author={Hanyang Guo and Xunjin Zheng and Zihan Liao and Hang Yu and Peng DI and Ziyin Zhang and Hong-Ning Dai},
      year={2025},
      eprint={2509.14856},
      archivePrefix={arXiv},
      primaryClass={cs.SE},
      url={https://arxiv.org/abs/2509.14856},
}
```