ModelHub XC fe9de6e7f1 Initialize project; model provided by the ModelHub XC community

Model: ferrotorch/smollm-135m
Source: Original Platform

2026-06-04 17:40:17 +08:00

_value_parity_input.txt

_value_parity_output.bin

_value_parity_token_ids.json

.gitattributes

config.json

model.safetensors

README.md

special_tokens_map.json

tokenizer_config.json

tokenizer.json

`ferrotorch/smollm-135m`

SmolLM-135M (HuggingFaceTB/SmolLM-135M). Llama-architecture causal LM, 135M parameters, 30 layers / 9 q-heads / 3 kv-heads (GQA), hidden=576, intermediate=1536, vocab=49152, tie_word_embeddings=true, rope_theta=10000.0. Apache 2.0 license. Pinned as the real-artifact baseline for causal LM parity vs transformers==4.50.3 (#1147).
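To make the 9 q-heads / 3 kv-heads split concrete, the sketch below derives the per-head dimension and the query-to-KV head mapping from the numbers above. It assumes the usual Llama-style layout (the hidden size is split evenly across query heads, and consecutive query heads share one KV head). The function names are hypothetical and are not part of ferrotorch.

```rust
/// Per-head dimension, assuming `hidden` is split evenly across query heads.
fn head_dim(hidden: usize, q_heads: usize) -> usize {
    assert!(hidden % q_heads == 0, "hidden must divide evenly across heads");
    hidden / q_heads
}

/// Index of the KV head a given query head reads from, assuming consecutive
/// query heads are grouped onto the same KV head (the usual GQA convention).
fn kv_head_for(q_head: usize, q_heads: usize, kv_heads: usize) -> usize {
    assert!(q_heads % kv_heads == 0, "q_heads must be a multiple of kv_heads");
    assert!(q_head < q_heads, "query head index out of range");
    q_head / (q_heads / kv_heads)
}
```

With this model's values, `head_dim(576, 9)` gives 64, and query heads 0..=2, 3..=5, and 6..=8 map to KV heads 0, 1, and 2 respectively.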

Provenance

Upstream: HuggingFaceTB/SmolLM-135M (apache-2.0).
Conversion script: ferrotorch/scripts/pin_pretrained_llm_weights.py.
Ferrotorch issue: https://github.com/dollspace/ferrotorch/issues/1147.
Number of trainable parameters: 134,515,008.
SHA-256 of model.safetensors (this file is pinned in ferrotorch-hub/src/registry.rs): c7a387d6fe81ca6dd304aeb809bda3932ff1bbef3ca41c9484502f2f448dc093.
Config snapshot: hidden=576, layers=30, heads=9, kv_heads=3, intermediate=1536, vocab=49152, tie_word_embeddings=True, rope_theta=10000.0, rms_norm_eps=1e-05.
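The parameter count can be cross-checked against the config snapshot with a little arithmetic. The sketch below assumes a standard Llama layout: no bias terms, q/k/v/o attention projections, a gate/up/down MLP, two RMSNorm weight vectors per layer plus a final norm, and an lm_head tied to the embedding matrix. `llama_param_count` is a hypothetical helper, not a ferrotorch API.

```rust
/// Parameter count for a Llama-style decoder with tied embeddings,
/// no biases, and RMSNorm (one weight vector per norm). Assumed layout.
fn llama_param_count(
    hidden: u64,
    layers: u64,
    heads: u64,
    kv_heads: u64,
    intermediate: u64,
    vocab: u64,
) -> u64 {
    let head_dim = hidden / heads;
    let kv_dim = kv_heads * head_dim;
    let embed = vocab * hidden; // shared with lm_head when tied
    let attn = 2 * hidden * hidden + 2 * hidden * kv_dim; // q, o + k, v
    let mlp = 3 * hidden * intermediate; // gate, up, down
    let norms = 2 * hidden; // input + post-attention RMSNorm
    embed + layers * (attn + mlp + norms) + hidden // trailing term: final norm
}
```

Calling `llama_param_count(576, 30, 9, 3, 1536, 49152)` yields 134,515,008, matching the figure listed above.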

Value-parity probe

Two extra files are uploaded so the ferrotorch-side harness can reproduce the parity verdict without re-running the upstream transformers model:

_value_parity_input.txt — the verbatim prompt string the harness tokenizes ("The quick brown fox jumps over the lazy").
_value_parity_token_ids.json — the tokenizer's output for that prompt (with the upstream tokenizer's add_special_tokens=True).
_value_parity_output.bin — float32 logits dumped from a fresh transformers.AutoModelForCausalLM.from_pretrained(..., torch_dtype=float32) single-prefill forward pass on those token ids (no cache). Format: [u32 ndim][u32 × ndim shape][f32 × prod(shape) data] little-endian; identical layout to the vision-side dumps.
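For readers who want to inspect the dump outside the harness, here is a minimal standard-library sketch of a reader for the layout described above. It is not the ferrotorch harness code: the names `TensorDump`, `parse_dump`, and `max_abs_diff` are hypothetical, the data is assumed to be stored in row-major order, and rejecting trailing bytes is a choice made here rather than something the format description states.

```rust
/// Parsed tensor dump: shape plus flat f32 data (assumed row-major).
#[derive(Debug, PartialEq)]
struct TensorDump {
    shape: Vec<usize>,
    data: Vec<f32>,
}

/// Read one little-endian u32 at `*pos`, advancing the cursor on success.
fn read_u32(bytes: &[u8], pos: &mut usize) -> Result<u32, String> {
    let end = pos.checked_add(4).ok_or("offset overflow")?;
    let chunk = bytes.get(*pos..end).ok_or("unexpected end of input")?;
    *pos = end;
    Ok(u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
}

/// Parse `[u32 ndim][u32 x ndim shape][f32 x prod(shape) data]`, little-endian.
fn parse_dump(bytes: &[u8]) -> Result<TensorDump, String> {
    let mut pos = 0usize;
    let ndim = read_u32(bytes, &mut pos)? as usize;
    let mut shape = Vec::new();
    for _ in 0..ndim {
        shape.push(read_u32(bytes, &mut pos)? as usize);
    }
    // Element count is the product of the shape; guard against overflow.
    let count = shape
        .iter()
        .try_fold(1usize, |acc, &d| acc.checked_mul(d))
        .ok_or("shape product overflows usize")?;
    let need = count.checked_mul(4).ok_or("data size overflows usize")?;
    let rest = &bytes[pos..];
    if rest.len() != need {
        return Err(format!("expected {} data bytes, found {}", need, rest.len()));
    }
    let data = rest
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect();
    Ok(TensorDump { shape, data })
}

/// Largest element-wise absolute difference, or None if the lengths differ.
fn max_abs_diff(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    Some(a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0f32, f32::max))
}
```

A typical use would be to read the file with `std::fs::read("_value_parity_output.bin")`, pass the bytes to `parse_dump`, and compare the resulting `data` against locally computed logits with `max_abs_diff`. The acceptable tolerance is not specified here and is left to the harness.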

How to load

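use ferrotorch_hub::{load_pretrained, HfTransformerConfig};
use ferrotorch_llama::{LlamaConfig, LlamaForCausalLM};

// The `?` operators below need an enclosing function that returns a compatible Result.
// Load the state dict registered under the name "smollm-135m".
let state = load_pretrained::<f32>("smollm-135m")?;
// Build the Llama config from the Hugging Face config.json.
let hf_cfg = HfTransformerConfig::from_file("config.json")?;
let cfg = LlamaConfig::from_hf(&hf_cfg)?;
// Construct the model, then load the weights in strict mode.
let mut model = LlamaForCausalLM::<f32>::new(cfg)?;
model.load_hf_state_dict(&state, /* strict = */ true)?;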

Upstream license

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
