Initialize the project; model provided by the ModelHub XC community

Model: Franso/reinvent_43M_128_prior
Source: Original Platform
Committed by: ModelHub XC
Date: 2026-05-04 21:46:59 +08:00
Commit: e0ee7aa57b
8 changed files with 924 additions and 0 deletions

.gitattributes (vendored) · Normal file · 35 lines

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
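
The patterns above route binary artifacts (here, model.safetensors) through Git LFS, so a plain `git clone` without LFS installed yields small pointer files instead of the weights. A minimal sketch for fetching the resolved files, assuming the repository is reachable through the `huggingface_hub` client under the id shown in the commit message (an assumption; the actual hosting endpoint may differ):

```python
# Sketch: download the repository contents (including LFS-backed weights) without raw git.
# Assumes the repo id resolves on the hosting hub; adjust repo_id/endpoint for your mirror.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="Franso/reinvent_43M_128_prior")
print("Snapshot downloaded to:", local_dir)
```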

README.md · Normal file · 201 lines

@@ -0,0 +1,201 @@
---
library_name: transformers
tags:
- trl
- sft
---
# Model Card for Model ID
<!-- Provide a quick summary of what the model is/does. -->
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.
- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [More Information Needed]
### Model Sources [optional]
<!-- Provide the basic links for the model. -->
- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
### Direct Use
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
[More Information Needed]
### Downstream Use [optional]
<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
[More Information Needed]
### Out-of-Scope Use
<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
[More Information Needed]
## Bias, Risks, and Limitations
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
[More Information Needed]
### Recommendations
<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.
## How to Get Started with the Model
Use the code below to get started with the model.
[More Information Needed]
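In the absence of an author-provided snippet, here is a minimal, hedged sketch based on the files in this commit (config.json, tokenizer.json, generation_config.json); the repo id is assumed to match the one in the commit message:

```python
# Hedged sketch: load this checkpoint with 🤗 Transformers and sample a few sequences.
# Assumes the files from this commit are available under the repo id (or a local path).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Franso/reinvent_43M_128_prior"  # assumption: same id as on the source platform
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

# Sampling starts from BOS (id 129) and stops at EOS (id 130), per generation_config.json;
# the vocabulary is built from SMILES fragments, so decoded outputs look like SMILES strings.
input_ids = torch.tensor([[tokenizer.bos_token_id]])
out = model.generate(input_ids, do_sample=True, max_new_tokens=128)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```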
## Training Details
### Training Data
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
[More Information Needed]
### Training Procedure
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
#### Preprocessing [optional]
[More Information Needed]
#### Training Hyperparameters
- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
#### Speeds, Sizes, Times [optional]
<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
[More Information Needed]
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
### Testing Data, Factors & Metrics
#### Testing Data
<!-- This should link to a Dataset Card if possible. -->
[More Information Needed]
#### Factors
<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
[More Information Needed]
#### Metrics
<!-- These are the evaluation metrics being used, ideally with a description of why. -->
[More Information Needed]
### Results
[More Information Needed]
#### Summary
## Model Examination [optional]
<!-- Relevant interpretability work for the model goes here -->
[More Information Needed]
## Environmental Impact
<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]
## Technical Specifications [optional]
### Model Architecture and Objective
[More Information Needed]
### Compute Infrastructure
[More Information Needed]
#### Hardware
[More Information Needed]
#### Software
[More Information Needed]
## Citation [optional]
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
**BibTeX:**
[More Information Needed]
**APA:**
[More Information Needed]
## Glossary [optional]
<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
[More Information Needed]
## More Information [optional]
[More Information Needed]
## Model Card Authors [optional]
[More Information Needed]
## Model Card Contact
[More Information Needed]

config.json · Normal file · 39 lines

@@ -0,0 +1,39 @@
{
"architectures": [
"Qwen3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 129,
"dtype": "bfloat16",
"eos_token_id": 130,
"head_dim": 32,
"hidden_act": "silu",
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 2073,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen3",
"num_attention_heads": 24,
"num_hidden_layers": 6,
"num_key_value_heads": 24,
"pad_token_id": 128,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 10000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"transformers_version": "4.57.3",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 132
}
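
As declared above, this is a compact 6-layer Qwen3 causal LM (hidden size 768, 24 heads) with a 132-entry vocabulary, which lines up with the tokenizer in this commit: 128 BPE vocabulary entries (ids 0-127) plus 4 added special tokens (ids 128-131). A small sketch to confirm, assuming the config is available via the repo id or locally:

```python
# Sketch: inspect the architecture described by config.json.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("Franso/reinvent_43M_128_prior")  # or a local path
print(cfg.model_type, cfg.num_hidden_layers, cfg.hidden_size, cfg.vocab_size)
# Expected from this commit: qwen3 6 768 132
```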

generation_config.json · Normal file · 9 lines

@@ -0,0 +1,9 @@
{
"_from_model_config": true,
"bos_token_id": 129,
"eos_token_id": [
130
],
"pad_token_id": 128,
"transformers_version": "4.57.3"
}

model.safetensors · Normal file · 3 lines

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:04b5ea376671379b6bcf05d003bd01ab209e84d44095b9fa1ceb33707c4e6d68
size 86059592
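
The pointer records an 86,059,592-byte payload; at 2 bytes per bfloat16 parameter (per `"dtype": "bfloat16"` in config.json) that works out to roughly 43.0M parameters, consistent with the "43M" in the model name. A quick back-of-the-envelope check (ignoring the small safetensors header):

```python
# Rough parameter-count estimate from the LFS payload size.
size_bytes = 86_059_592
approx_params = size_bytes / 2  # bfloat16 stores 2 bytes per parameter
print(f"~{approx_params / 1e6:.1f}M parameters")  # ~43.0M
```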

special_tokens_map.json · Normal file · 30 lines

@@ -0,0 +1,30 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<pad>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

tokenizer.json · Normal file · 559 lines

@@ -0,0 +1,559 @@
{
"version": "1.0",
"truncation": null,
"padding": null,
"added_tokens": [
{
"id": 128,
"content": "<pad>",
"single_word": false,
"lstrip": false,
"rstrip": false,
"normalized": false,
"special": true
},
{
"id": 129,
"content": "<s>",
"single_word": false,
"lstrip": false,
"rstrip": false,
"normalized": false,
"special": true
},
{
"id": 130,
"content": "</s>",
"single_word": false,
"lstrip": false,
"rstrip": false,
"normalized": false,
"special": true
},
{
"id": 131,
"content": "<unk>",
"single_word": false,
"lstrip": false,
"rstrip": false,
"normalized": false,
"special": true
}
],
"normalizer": null,
"pre_tokenizer": {
"type": "Split",
"pattern": {
"Regex": "\\(|\\)"
},
"behavior": "Isolated",
"invert": false
},
"post_processor": null,
"decoder": {
"type": "BPEDecoder",
"suffix": "</w>"
},
"model": {
"type": "BPE",
"dropout": null,
"unk_token": null,
"continuing_subword_prefix": null,
"end_of_word_suffix": null,
"fuse_unk": false,
"byte_fallback": false,
"ignore_merges": false,
"vocab": {
"#": 0,
"%": 1,
"(": 2,
")": 3,
"+": 4,
"-": 5,
"/": 6,
"0": 7,
"1": 8,
"2": 9,
"3": 10,
"4": 11,
"5": 12,
"6": 13,
"7": 14,
"8": 15,
"9": 16,
"=": 17,
"@": 18,
"B": 19,
"C": 20,
"F": 21,
"H": 22,
"I": 23,
"N": 24,
"O": 25,
"P": 26,
"S": 27,
"[": 28,
"\\": 29,
"]": 30,
"c": 31,
"i": 32,
"l": 33,
"n": 34,
"o": 35,
"r": 36,
"s": 37,
"cc": 38,
"CC": 39,
"c1": 40,
"=O": 41,
"c2": 42,
"H]": 43,
"[C": 44,
"[C@": 45,
"c1cc": 46,
"[C@@": 47,
"c3": 48,
"c2cc": 49,
"[C@H]": 50,
"[C@@H]": 51,
"NC": 52,
"c1ccc": 53,
"CCC": 54,
"CO": 55,
"cc1": 56,
"=C": 57,
"c1cccc": 58,
"n1": 59,
"N1": 60,
"nc": 61,
"c2cccc": 62,
"OC": 63,
"c3cc": 64,
"Cl": 65,
"C1": 66,
"N2": 67,
"CCN": 68,
"CC1": 69,
"c2ccccc2": 70,
"c2ccc": 71,
"n2": 72,
"O=C": 73,
"c1ccccc1": 74,
"C2": 75,
"CC2": 76,
"CN": 77,
"cc2": 78,
"CCO": 79,
"[C@@H]1": 80,
"C[C@H]": 81,
"c3cccc": 82,
"[n": 83,
"[nH]": 84,
"c1n": 85,
"cn": 86,
"c4": 87,
"[C@@H]2": 88,
"[C@H]1": 89,
"c3ccccc3": 90,
"Cc1ccc": 91,
"CCCC": 92,
"c2c": 93,
"[C@H]2": 94,
"COc1ccc": 95,
"/C": 96,
"c2n": 97,
"C[C@@H]": 98,
"Cc1cc": 99,
"c1c": 100,
"c3ccc": 101,
"CNC": 102,
"cccc": 103,
"n3": 104,
"CS": 105,
"nc1": 106,
"COC": 107,
"+]": 108,
"Br": 109,
"cc3": 110,
"N1CCC": 111,
"C3": 112,
"[N": 113,
"[N+]": 114,
"-]": 115,
"[O": 116,
"[O-]": 117,
"s1": 118,
"c1nc": 119,
"nc2": 120,
"N1C": 121,
"CCOC": 122,
"o1": 123,
"CCCCC": 124,
"CC3": 125,
"CCCN": 126,
"[C@]": 127
},
"merges": [
[
"c",
"c"
],
[
"C",
"C"
],
[
"c",
"1"
],
[
"=",
"O"
],
[
"c",
"2"
],
[
"H",
"]"
],
[
"[",
"C"
],
[
"[C",
"@"
],
[
"c1",
"cc"
],
[
"[C@",
"@"
],
[
"c",
"3"
],
[
"c2",
"cc"
],
[
"[C@",
"H]"
],
[
"[C@@",
"H]"
],
[
"N",
"C"
],
[
"c1cc",
"c"
],
[
"CC",
"C"
],
[
"C",
"O"
],
[
"cc",
"1"
],
[
"=",
"C"
],
[
"c1cc",
"cc"
],
[
"n",
"1"
],
[
"N",
"1"
],
[
"n",
"c"
],
[
"c2cc",
"cc"
],
[
"O",
"C"
],
[
"c3",
"cc"
],
[
"C",
"l"
],
[
"C",
"1"
],
[
"N",
"2"
],
[
"CC",
"N"
],
[
"CC",
"1"
],
[
"c2cccc",
"c2"
],
[
"c2cc",
"c"
],
[
"n",
"2"
],
[
"O",
"=C"
],
[
"c1cccc",
"c1"
],
[
"C",
"2"
],
[
"CC",
"2"
],
[
"C",
"N"
],
[
"cc",
"2"
],
[
"CC",
"O"
],
[
"[C@@H]",
"1"
],
[
"C",
"[C@H]"
],
[
"c3cc",
"cc"
],
[
"[",
"n"
],
[
"[n",
"H]"
],
[
"c1",
"n"
],
[
"c",
"n"
],
[
"c",
"4"
],
[
"[C@@H]",
"2"
],
[
"[C@H]",
"1"
],
[
"c3cccc",
"c3"
],
[
"C",
"c1ccc"
],
[
"CC",
"CC"
],
[
"c2",
"c"
],
[
"[C@H]",
"2"
],
[
"CO",
"c1ccc"
],
[
"/",
"C"
],
[
"c2",
"n"
],
[
"C",
"[C@@H]"
],
[
"C",
"c1cc"
],
[
"c1",
"c"
],
[
"c3cc",
"c"
],
[
"C",
"NC"
],
[
"cc",
"cc"
],
[
"n",
"3"
],
[
"C",
"S"
],
[
"n",
"c1"
],
[
"CO",
"C"
],
[
"+",
"]"
],
[
"B",
"r"
],
[
"cc",
"3"
],
[
"N1",
"CCC"
],
[
"C",
"3"
],
[
"[",
"N"
],
[
"[N",
"+]"
],
[
"-",
"]"
],
[
"[",
"O"
],
[
"[O",
"-]"
],
[
"s",
"1"
],
[
"c1",
"nc"
],
[
"n",
"c2"
],
[
"N1",
"C"
],
[
"CC",
"OC"
],
[
"o",
"1"
],
[
"CC",
"CCC"
],
[
"CC",
"3"
],
[
"CCC",
"N"
],
[
"[C@",
"]"
]
]
}
}
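
The pre-tokenizer splits only on parentheses and the BPE vocabulary consists of SMILES fragments (aromatic ring atoms, ring-closure digits, stereo markers), so the tokenizer operates directly on SMILES strings. A hedged sketch using the `tokenizers` library on the raw file, with aspirin's SMILES as an arbitrary example input:

```python
# Sketch: tokenize a SMILES string with the raw tokenizer.json from this commit.
from tokenizers import Tokenizer

tok = Tokenizer.from_file("tokenizer.json")
enc = tok.encode("CC(=O)Oc1ccccc1C(=O)O")  # aspirin
print(enc.tokens)  # fragments drawn from the BPE vocabulary above
print(enc.ids)
```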

tokenizer_config.json · Normal file · 48 lines

@@ -0,0 +1,48 @@
{
"added_tokens_decoder": {
"128": {
"content": "<pad>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"129": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"130": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"131": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"bos_token": "<s>",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"extra_special_tokens": {},
"model_input_names": [
"input_ids",
"attention_mask"
],
"model_max_length": 1000000000000000019884624838656,
"pad_token": "<pad>",
"tokenizer_class": "PreTrainedTokenizerFast",
"unk_token": "<unk>"
}