---
license: apache-2.0
base_model: Qwen/Qwen3-1.7B
tags:
- jacobi-forcing
- speculative-decoding
- qwen3
- text-generation
language:
- en
pipeline_tag: text-generation
---

# Qwen3-1.7B Jacobi Forcing (v2math811, AR×10)

Jacobi-Forcing fine-tune of Qwen/Qwen3-1.7B, trained on a mixed code + math trajectory dataset (v2math811). Output quality matches the base AR model, while Jacobi parallel decoding delivers a ~1.5–1.7× wall-clock speedup.

## Highlights

- Lossless quality: HumanEval pass@1 and GSM8K accuracy match base AR generation (within noise).
- Speedup: 1.65× on HumanEval, 1.53× on GSM8K (vs. greedy AR decoding with the same model).
- Drop-in compatible with Hugging Face `AutoModelForCausalLM` for AR generation. Jacobi inference requires the JacobiForcing repo (custom forward kernel).

## Training recipe

Continued training from base Qwen3-1.7B with the consistency + AR loss from the JacobiForcing paper (a hedged sketch of the objective follows the table below):

| Setting | Value |
| --- | --- |
| Base | Qwen/Qwen3-1.7B |
| Dataset | code (OpenCodeInstruct buckets 8–11) + math (OpenThought2 buckets 8–11); 26,510 trajectory samples after the `traj_count ≤ 3` filter |
| Strategy | progressive noise window, N=32, window=16 |
| Epochs | 1 |
| Optimizer | AdamW |
| LR | 5e-6 (cosine schedule, warmup ratio 0.03) |
| Batch | per-device 1 × grad-accum 4 = effective batch 4 |
| Precision | bf16 |
| AR_LOSS_WEIGHT | 10 (paper default; 20 was tested and gave slightly worse Jacobi acceptance) |
| GPU | 1× A100-80GB, ~4h47m |
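
For orientation, here is a minimal sketch of what a consistency + AR objective can look like in PyTorch. The tensor names, the KL form of the consistency term, and the function signature are illustrative assumptions, not the repo's actual code; only `AR_LOSS_WEIGHT = 10` comes from the recipe above.

```python
import torch.nn.functional as F

AR_LOSS_WEIGHT = 10.0  # from the recipe above; 20 gave worse Jacobi acceptance

def jacobi_forcing_loss(logits_traj, logits_fixed, logits_ar, labels):
    """Hypothetical sketch: consistency term + weighted AR cross-entropy."""
    # Consistency: pull predictions at intermediate Jacobi-trajectory states
    # toward the converged fixed-point distribution (no gradient through it).
    consistency = F.kl_div(
        F.log_softmax(logits_traj, dim=-1),
        F.softmax(logits_fixed.detach(), dim=-1),
        reduction="batchmean",
    )
    # Standard next-token cross-entropy anchors AR generation quality.
    ar = F.cross_entropy(logits_ar.view(-1, logits_ar.size(-1)), labels.view(-1))
    return consistency + AR_LOSS_WEIGHT * ar
```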

## Benchmarks (1× A100, greedy decoding)

| Bench | AR pass@1 / acc | Jacobi pass@1 / acc | AR tok/s | Jacobi tok/s | Speedup |
| --- | --- | --- | --- | --- | --- |
| HumanEval (n=164) | 60.4% | 61.0% | 37.2 | 61.3 | 1.65× |
| GSM8K (n=653 subset) | 72.4% | 74.3% | 38.0 | 58.3 | 1.53× |

Jacobi internals (HumanEval): 1.74 tokens committed per iteration; the accepted window averages 87% of N=32.
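
The Speedup column follows directly from the throughput numbers; a quick sanity check:

```python
# Pure arithmetic on the table above; no model required.
ar_tok_s, jacobi_tok_s = 37.2, 61.3        # HumanEval throughput
print(f"{jacobi_tok_s / ar_tok_s:.2f}x")   # 1.65x, matching the Speedup column
# Each Jacobi iteration commits ~1.74 tokens on average, vs. exactly 1 token
# per forward pass under standard AR decoding.
```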

## Usage — standard AR

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

ckpt = "zcyzcyzcy/qwen3-1.7b-jf-v2math811-ar10"
tok = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(
    ckpt, torch_dtype=torch.bfloat16, device_map="cuda"
)

# Build a chat prompt; enable_thinking=False disables Qwen3's thinking mode.
msgs = [{"role": "user", "content": "Write a Python function is_prime(n)."}]
inp = tok.apply_chat_template(
    msgs, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
ids = tok(inp, return_tensors="pt").to("cuda")
out = model.generate(**ids, max_new_tokens=200, do_sample=False)
# Decode only the newly generated tokens (skip the prompt).
print(tok.decode(out[0][ids["input_ids"].shape[1]:], skip_special_tokens=True))
```

## Usage — Jacobi parallel decoding

Jacobi inference uses a custom `jacobi_forward_greedy` method registered on `Qwen3ForCausalLM`. See the JacobiForcing repo for the full inference script, or adapt the snippet below:

```python
from transformers import Qwen3ForCausalLM
from generate_trajectory.generation.qwen3_modeling_jacobi_forcing_greedy import (
    jacobi_forward_greedy,
)

# Monkey-patch the Jacobi forward pass onto the model class.
Qwen3ForCausalLM.jacobi_forward_greedy = jacobi_forward_greedy
# ... call model.jacobi_forward_greedy(...) for prefill + generation phases.
```
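
For intuition about what the kernel computes, here is a self-contained, unoptimized sketch of greedy Jacobi decoding using only the standard Hugging Face forward pass. This is an illustration of the algorithm, not the repo's implementation: the real kernel reuses the KV cache across iterations instead of re-encoding the full prefix, and the function name and defaults here are made up.

```python
import torch

@torch.no_grad()
def jacobi_decode_block(model, input_ids, n=32):
    """Educational sketch: decode one n-token block via greedy Jacobi
    fixed-point iteration (re-encodes the whole prefix every step)."""
    # Initialize the n-token draft arbitrarily, e.g. the last prompt token.
    draft = input_ids[:, -1:].repeat(1, n)
    for _ in range(n):  # at least one more position becomes final per step
        seq = torch.cat([input_ids, draft], dim=-1)
        logits = model(seq).logits
        # One parallel forward pass predicts every draft position at once.
        new_draft = logits[:, -n - 1:-1, :].argmax(dim=-1)
        if torch.equal(new_draft, draft):
            break  # fixed point: the block now equals the greedy AR output
        draft = new_draft
    return torch.cat([input_ids, draft], dim=-1)
```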

The model checkpoint itself is a standard Qwen3 — no architecture changes — so any speculative-decoding framework that accepts a Qwen3 base model can drive it.
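
For example, the checkpoint loads as a stock Qwen3 in vLLM (illustrative; any Qwen3-capable stack should work the same way, though this gives plain AR decoding, not Jacobi):

```python
from vllm import LLM, SamplingParams

llm = LLM(model="zcyzcyzcy/qwen3-1.7b-jf-v2math811-ar10", dtype="bfloat16")
out = llm.generate(
    ["Write a Python function is_prime(n)."],
    SamplingParams(temperature=0.0, max_tokens=200),  # greedy, as benchmarked
)
print(out[0].outputs[0].text)
```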

## Citation

```bibtex
@article{kou2024cllm,
  title={CLLMs: Consistency Large Language Models},
  author={Kou, Siqi and Hu, Lanxiang and He, Zhezhi and Deng, Zhijie and Zhang, Hao},
  journal={arXiv preprint arXiv:2403.00835},
  year={2024}
}
```

## License

Apache 2.0, inherited from the base Qwen3-1.7B model.
