Initialize project; model provided by the ModelHub XC community
Model: biodatlab/ec-raft Source: Original Platform
11
.gitattributes
vendored
Normal file
@@ -0,0 +1,11 @@
tokenizer.json filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
pytorch_model-00006-of-00009.safetensors filter=lfs diff=lfs merge=lfs -text
pytorch_model-00007-of-00009.safetensors filter=lfs diff=lfs merge=lfs -text
pytorch_model-00001-of-00009.safetensors filter=lfs diff=lfs merge=lfs -text
pytorch_model-00005-of-00009.safetensors filter=lfs diff=lfs merge=lfs -text
pytorch_model-00009-of-00009.safetensors filter=lfs diff=lfs merge=lfs -text
pytorch_model-00003-of-00009.safetensors filter=lfs diff=lfs merge=lfs -text
pytorch_model-00008-of-00009.safetensors filter=lfs diff=lfs merge=lfs -text
pytorch_model-00002-of-00009.safetensors filter=lfs diff=lfs merge=lfs -text
pytorch_model-00004-of-00009.safetensors filter=lfs diff=lfs merge=lfs -text
72
README.md
Normal file
@@ -0,0 +1,72 @@
---
license: llama3.1
datasets:
- biodatlab/ec-raft-dataset
language:
- en
pipeline_tag: text-generation
---

# EC-RAFT: Automated Generation of Clinical Trial Eligibility Criteria

## Model Description

**EC-RAFT** is a Retrieval-Augmented Fine-Tuning (RAFT) model fine-tuned from **LLaMA-3.1-8B-Instruct**.
It is designed to automatically generate **structured, high-quality clinical trial eligibility criteria (EC)** directly from trial titles and descriptions.

EC-RAFT integrates **domain-specific retrieval** with **synthesized intermediate reasoning** steps, enabling it to produce **clinically relevant** and **contextually appropriate** EC sets.

## Fine-tuning Details

- **Original Model:** LLaMA-3.1-8B-Instruct
- **Datasets used for fine-tuning:**
  - ClinicalTrials.gov (267,347 trials, 2000–2024) [biodatlab/ec-raft-dataset](https://huggingface.co/datasets/biodatlab/ec-raft-dataset)
  - Retrieval corpus constructed using **SciNCL model**
  - Intermediate reasoning steps **R** generated using **Gemini-1.5-flash-002**
- **Fine-tuning method:**
  - **Retrieval-Augmented Fine-Tuning (RAFT)**
  - **Low-Rank Adaptation (LoRA)**

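The card names LoRA but does not state the adapter rank or the target modules. As a rough sketch of why LoRA keeps fine-tuning cheap, the snippet below counts the trainable parameters when low-rank adapters are attached to the attention projections, using the layer shapes from this repository's `config.json`. The rank of 16 and the choice of target modules are illustrative assumptions, not values reported for the EC-RAFT training run.

```python
# Shapes taken from config.json in this repository.
HIDDEN_SIZE = 4096
NUM_KV_HEADS = 8
HEAD_DIM = 128
NUM_LAYERS = 32

# Hypothetical LoRA settings; the model card does not report them.
RANK = 16
TARGETS = ("q_proj", "k_proj", "v_proj", "o_proj")


def projection_shape(name):
    """Return (in_features, out_features) for one attention projection."""
    kv_dim = NUM_KV_HEADS * HEAD_DIM  # grouped-query attention: 8 * 128 = 1024
    out_features = kv_dim if name in ("k_proj", "v_proj") else HIDDEN_SIZE
    return HIDDEN_SIZE, out_features


def lora_param_count(in_features, out_features, rank):
    """LoRA learns A (rank x in) and B (out x rank) instead of a dense (out x in) update."""
    return rank * (in_features + out_features)


# Adapter parameters that would be trained, summed over all layers.
trainable = NUM_LAYERS * sum(
    lora_param_count(*projection_shape(name), RANK) for name in TARGETS
)

# Parameters of the same projections in the frozen base model.
frozen = NUM_LAYERS * sum(
    in_features * out_features
    for in_features, out_features in map(projection_shape, TARGETS)
)

print(f"{trainable:,} trainable vs {frozen:,} frozen ({trainable / frozen:.2%})")
```

Under these assumed settings the adapters hold roughly one percent as many parameters as the projections they modify; a different rank or target set changes that figure proportionally.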
## Model Performance

Evaluated on a held-out ClinicalTrials.gov test split:

| Metric | Score |
|-----------------------------------|---------|
| **BERTScore** (semantic similarity) | **86.23** |
| **Precision** (LLM-guided evaluation) | **78.84%** |
| **Recall** (LLM-guided evaluation) | **75.89%** |
| **Mean LLM-as-a-Judge Score** (0–3) | **1.7150** |
| **Mean Pair-BERTScore** | **67.76** |

- **Outperforms zero-shot LLaMA-3.1 and Gemini-1.5-flash baselines**
- **Outperforms fine-tuned LLaMA and Meditron baselines**
- **Clinically validated:** LLM-as-a-Judge scores are highly correlated with human physician evaluation

## Intended Use

- Assist **researchers**, **trial designers**, and **sponsors** in drafting clinical trial eligibility criteria.
- **Automate** EC generation to reduce manual effort and improve consistency.
- Support **clinical trial design** transparency and quality.
- Enable integration with **trial registry platforms**, **clinical trial matching systems**, and **EC recommendation tools**.

## Limitations

- Requires **human validation** of generated EC before clinical use.
- Trained on **public ClinicalTrials.gov data**, so it may not generalize well to:
  - Rare or novel diseases
  - Specialized or non-standard trial designs
  - Non-public trial data
- Optimized for **English-language clinical trials**.
- As with any LLM-based system, risks include hallucination, subtle errors, and domain shifts.
- Evaluation metrics (BERTScore, LLM-as-a-Judge) are proxies, not full substitutes for domain expert review.

## Acknowledgments

This model was developed with support from:

- **RAVIS Technology** for feedback and collaboration.
- **Faculty of Medicine Ramathibodi Hospital**
- **NSTDA Supercomputer Center (ThaiSC), Project \#pv814001**

We also acknowledge the contributions of the broader open-source community whose tools and prior work on **RAFT**, **SciNCL**, **LoRA**, **LLaMA-3**, and **biomedical NLP** made this project possible.
40
config.json
Normal file
@@ -0,0 +1,40 @@
{
  "_name_or_path": "/project/lt200217-thwhis/swiss-work-dir/huggingface/huggingface/hub/models--meta-llama--Meta-Llama-3.1-8B-Instruct/snapshots/0e9e39f249a16976918f6564b8830bc894c89659",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 128000,
  "eos_token_id": [
    128001,
    128008,
    128009
  ],
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 131072,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": {
    "factor": 8.0,
    "high_freq_factor": 4.0,
    "low_freq_factor": 1.0,
    "original_max_position_embeddings": 8192,
    "rope_type": "llama3"
  },
  "rope_theta": 500000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.48.1",
  "use_cache": true,
  "vocab_size": 128256
}
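As a sanity check on the `config.json` fields, the sketch below counts the parameters they imply and converts the count to bytes at 2 bytes per float16 value. It assumes the usual Llama layout (untied input and output embeddings, bias-free projections, two RMSNorm weight vectors per layer plus one final norm); the helper name `count_parameters` is made up for this example. Under those assumptions the byte count equals the `total_size` of 16060522496 recorded in `pytorch_model.bin.index.json` in this commit.

```python
import json

# Subset of config.json that determines the tensor shapes.
CONFIG = json.loads("""
{
  "hidden_size": 4096,
  "intermediate_size": 14336,
  "num_hidden_layers": 32,
  "num_attention_heads": 32,
  "num_key_value_heads": 8,
  "head_dim": 128,
  "vocab_size": 128256,
  "tie_word_embeddings": false
}
""")


def count_parameters(cfg):
    """Count weights for an assumed standard Llama decoder stack."""
    hidden = cfg["hidden_size"]
    q_dim = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_dim = cfg["num_key_value_heads"] * cfg["head_dim"]

    # q_proj and o_proj are hidden x q_dim; k_proj and v_proj are hidden x kv_dim.
    attention = 2 * hidden * q_dim + 2 * hidden * kv_dim
    # gate_proj, up_proj, and down_proj each hold hidden x intermediate weights.
    mlp = 3 * hidden * cfg["intermediate_size"]
    # input_layernorm and post_attention_layernorm.
    norms = 2 * hidden
    per_layer = attention + mlp + norms

    embeddings = cfg["vocab_size"] * hidden
    lm_head = 0 if cfg["tie_word_embeddings"] else embeddings
    final_norm = hidden
    return embeddings + lm_head + cfg["num_hidden_layers"] * per_layer + final_norm


n_params = count_parameters(CONFIG)
n_bytes = n_params * 2  # float16 stores each value in 2 bytes
print(n_params, n_bytes)
```

The count comes to about 8.03 billion parameters, which is where the "8B" in the base model's name comes from.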
12
generation_config.json
Normal file
@@ -0,0 +1,12 @@
{
  "bos_token_id": 128000,
  "do_sample": true,
  "eos_token_id": [
    128001,
    128008,
    128009
  ],
  "temperature": 0.6,
  "top_p": 0.9,
  "transformers_version": "4.48.1"
}
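These generation settings turn on stochastic decoding: logits are divided by the temperature (a value of 0.6 sharpens the distribution), and sampling is then restricted to the smallest set of tokens whose cumulative probability reaches `top_p`. The pure-Python sketch below illustrates the idea on a toy four-token vocabulary. It is not the `transformers` implementation, and the logits and function names are made up for the example.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most probable tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}


def sample(logits, temperature=0.6, top_p=0.9, rng=random):
    """Draw one token index using temperature scaling followed by nucleus filtering."""
    nucleus = top_p_filter(softmax_with_temperature(logits, temperature), top_p)
    tokens = list(nucleus)
    weights = [nucleus[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


toy_logits = [2.0, 1.0, 0.2, -1.0]  # made-up scores for four candidate tokens
nucleus_cool = top_p_filter(softmax_with_temperature(toy_logits, 0.6), 0.9)
nucleus_warm = top_p_filter(softmax_with_temperature(toy_logits, 1.0), 0.9)
print(sorted(nucleus_cool), sorted(nucleus_warm))
print(sample(toy_logits, rng=random.Random(0)))
```

On these toy logits, lowering the temperature from 1.0 to 0.6 shrinks the nucleus from three candidate tokens to two, so the lowest-scoring tokens can no longer be sampled.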
3
pytorch_model-00001-of-00009.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0f74aad9fa8f2d5c171975068477746dec5b91f75373379eeaf738c72680d6a1
size 1973460982
3
pytorch_model-00001-of-00009.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:502276036c5854cff468b0d1e8c0a988ea981b0dd61309704bbe4a90513bcc87
size 1973455352
3
pytorch_model-00002-of-00009.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4f0572fbfa35ae215abde37c18707abbc8aba574666a828518aad15734ef2a56
size 1895904598
3
pytorch_model-00002-of-00009.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fcc475ac6d2b5cf09d8855edfbe7da689f87d70ca8cb115642da4d96948ddf37
size 1895895296
3
pytorch_model-00003-of-00009.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1c34006b65c9a76f9d427ffb37ffd3ccf25f38b9f35651a4c0826d87df7cf5a1
size 1979807738
3
pytorch_model-00003-of-00009.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:de1c2cef5db2ebd348d3f12df2bba2b7fefa0f10d9a9bfdd5b651f78c0e2a968
size 1979798000
3
pytorch_model-00004-of-00009.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fcd16dd53529424c16a534ebdddee481bf47a28872e35bdd64a307f7e9822fba
size 1946237324
3
pytorch_model-00004-of-00009.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:df9759762b41503849997b843a0f9af0d95b465c0da45ff4af2390b9d64ffb81
size 1946227328
3
pytorch_model-00005-of-00009.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:44e6af5de53eaaae5b726d84991d3507050d96d147e0a6386e02877a0ace2d94
size 1979807802
3
pytorch_model-00005-of-00009.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ef11fb91eab18521131e01f8dcca91f76705be04dd71229fae1645d41e228d62
size 1979798024
3
pytorch_model-00006-of-00009.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d61c89c5e7691c035b9b683050c4b3804b0a03e3c0201ed1f5f692ab87a63b54
size 1946237324
3
pytorch_model-00006-of-00009.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:128d4151a03076d565c456ce8395b24d13b38a1af267450bb0d10aa35501b664
size 1946227328
3
pytorch_model-00007-of-00009.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2d84731b36412404b00198c0633ac7d6228c1c472fb4982eb6688d108a57169d
size 1979807802
3
pytorch_model-00007-of-00009.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e8fef8c1c51611239b8c7b5c185a381cf09de285cd358644d685971ec0649086
size 1979798024
3
pytorch_model-00008-of-00009.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:32ca334ffaab4aec081391687a36db1afd6599934233eed161c2cf5f2fd3bc69
size 1308690338
3
pytorch_model-00008-of-00009.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8557ba103fbd6cc83869be454b63350b095fb36c60a4589d1dfbe33eed1ad5f1
size 1308683392
3
pytorch_model-00009-of-00009.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c24f691ad2249c899177a53c87336f12241f7e5f140b17d067205f759f701f9f
size 1050674565
3
pytorch_model-00009-of-00009.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3e9a36ad418b0cd9b253f65771552d7dc05cf1abf1170b94de4b4d546aac255d
size 1050673280
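Each weight shard above is committed as a Git LFS pointer: a three-line text file giving the spec version, the SHA-256 object ID, and the size in bytes of the real file. A minimal parser for that layout might look like the sketch below. The `LfsPointer` and `parse_lfs_pointer` names are invented for this example and are not part of any Git LFS tooling; the sample text is the pointer for `pytorch_model-00009-of-00009.safetensors` from this commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LfsPointer:
    version: str  # URL of the pointer spec
    oid: str      # hex SHA-256 digest of the real file
    size: int     # size of the real file in bytes


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value.strip()

    algorithm, _, digest = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unexpected hash algorithm: {algorithm!r}")
    return LfsPointer(fields["version"], digest, int(fields["size"]))


POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:3e9a36ad418b0cd9b253f65771552d7dc05cf1abf1170b94de4b4d546aac255d
size 1050673280
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(f"{pointer.oid[:12]}... {pointer.size / 1e9:.2f} GB")
```

Comparing `pointer.size` and `pointer.oid` against a downloaded shard is one way to confirm that the real file, rather than the pointer stub, was fetched.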
298
pytorch_model.bin.index.json
Normal file
@@ -0,0 +1,298 @@
{
  "metadata": {
    "total_size": 16060522496
  },
  "weight_map": {
    "lm_head.weight": "pytorch_model-00009-of-00009.bin",
    "model.embed_tokens.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.0.input_layernorm.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.0.mlp.down_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.0.mlp.gate_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.0.mlp.up_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.0.post_attention_layernorm.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.0.self_attn.k_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.0.self_attn.o_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.0.self_attn.q_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.0.self_attn.v_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.1.input_layernorm.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.1.mlp.down_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.1.mlp.gate_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.1.mlp.up_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.1.post_attention_layernorm.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.1.self_attn.k_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.1.self_attn.o_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.1.self_attn.q_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.1.self_attn.v_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.10.input_layernorm.weight": "pytorch_model-00003-of-00009.bin",
    "model.layers.10.mlp.down_proj.weight": "pytorch_model-00003-of-00009.bin",
    "model.layers.10.mlp.gate_proj.weight": "pytorch_model-00003-of-00009.bin",
    "model.layers.10.mlp.up_proj.weight": "pytorch_model-00003-of-00009.bin",
    "model.layers.10.post_attention_layernorm.weight": "pytorch_model-00003-of-00009.bin",
    "model.layers.10.self_attn.k_proj.weight": "pytorch_model-00003-of-00009.bin",
    "model.layers.10.self_attn.o_proj.weight": "pytorch_model-00003-of-00009.bin",
    "model.layers.10.self_attn.q_proj.weight": "pytorch_model-00003-of-00009.bin",
    "model.layers.10.self_attn.v_proj.weight": "pytorch_model-00003-of-00009.bin",
    "model.layers.11.input_layernorm.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.11.mlp.down_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.11.mlp.gate_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.11.mlp.up_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.11.post_attention_layernorm.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.11.self_attn.k_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.11.self_attn.o_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.11.self_attn.q_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.11.self_attn.v_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.12.input_layernorm.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.12.mlp.down_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.12.mlp.gate_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.12.mlp.up_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.12.post_attention_layernorm.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.12.self_attn.k_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.12.self_attn.o_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.12.self_attn.q_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.12.self_attn.v_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.13.input_layernorm.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.13.mlp.down_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.13.mlp.gate_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.13.mlp.up_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.13.post_attention_layernorm.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.13.self_attn.k_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.13.self_attn.o_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.13.self_attn.q_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.13.self_attn.v_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.14.input_layernorm.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.14.mlp.down_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.14.mlp.gate_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.14.mlp.up_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.14.post_attention_layernorm.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.14.self_attn.k_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.14.self_attn.o_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.14.self_attn.q_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.14.self_attn.v_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.15.input_layernorm.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.15.mlp.down_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.15.mlp.gate_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.15.mlp.up_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.15.post_attention_layernorm.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.15.self_attn.k_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.15.self_attn.o_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.15.self_attn.q_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.15.self_attn.v_proj.weight": "pytorch_model-00004-of-00009.bin",
    "model.layers.16.input_layernorm.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.16.mlp.down_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.16.mlp.gate_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.16.mlp.up_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.16.post_attention_layernorm.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.16.self_attn.k_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.16.self_attn.o_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.16.self_attn.q_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.16.self_attn.v_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.17.input_layernorm.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.17.mlp.down_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.17.mlp.gate_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.17.mlp.up_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.17.post_attention_layernorm.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.17.self_attn.k_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.17.self_attn.o_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.17.self_attn.q_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.17.self_attn.v_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.18.input_layernorm.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.18.mlp.down_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.18.mlp.gate_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.18.mlp.up_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.18.post_attention_layernorm.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.18.self_attn.k_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.18.self_attn.o_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.18.self_attn.q_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.18.self_attn.v_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.19.input_layernorm.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.19.mlp.down_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.19.mlp.gate_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.19.mlp.up_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.19.post_attention_layernorm.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.19.self_attn.k_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.19.self_attn.o_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.19.self_attn.q_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.19.self_attn.v_proj.weight": "pytorch_model-00005-of-00009.bin",
    "model.layers.2.input_layernorm.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.2.mlp.down_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.2.mlp.gate_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.2.mlp.up_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.2.post_attention_layernorm.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.2.self_attn.k_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.2.self_attn.o_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.2.self_attn.q_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.2.self_attn.v_proj.weight": "pytorch_model-00001-of-00009.bin",
    "model.layers.20.input_layernorm.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.20.mlp.down_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.20.mlp.gate_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.20.mlp.up_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.20.post_attention_layernorm.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.20.self_attn.k_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.20.self_attn.o_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.20.self_attn.q_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.20.self_attn.v_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.21.input_layernorm.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.21.mlp.down_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.21.mlp.gate_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.21.mlp.up_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.21.post_attention_layernorm.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.21.self_attn.k_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.21.self_attn.o_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.21.self_attn.q_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.21.self_attn.v_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.22.input_layernorm.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.22.mlp.down_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.22.mlp.gate_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.22.mlp.up_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.22.post_attention_layernorm.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.22.self_attn.k_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.22.self_attn.o_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.22.self_attn.q_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.22.self_attn.v_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.23.input_layernorm.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.23.mlp.down_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.23.mlp.gate_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.23.mlp.up_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.23.post_attention_layernorm.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.23.self_attn.k_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.23.self_attn.o_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.23.self_attn.q_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.23.self_attn.v_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.24.input_layernorm.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.24.mlp.down_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.24.mlp.gate_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.24.mlp.up_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.24.post_attention_layernorm.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.24.self_attn.k_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.24.self_attn.o_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.24.self_attn.q_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.24.self_attn.v_proj.weight": "pytorch_model-00006-of-00009.bin",
    "model.layers.25.input_layernorm.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.25.mlp.down_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.25.mlp.gate_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.25.mlp.up_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.25.post_attention_layernorm.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.25.self_attn.k_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.25.self_attn.o_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.25.self_attn.q_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.25.self_attn.v_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.26.input_layernorm.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.26.mlp.down_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.26.mlp.gate_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.26.mlp.up_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.26.post_attention_layernorm.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.26.self_attn.k_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.26.self_attn.o_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.26.self_attn.q_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.26.self_attn.v_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.27.input_layernorm.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.27.mlp.down_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.27.mlp.gate_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.27.mlp.up_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.27.post_attention_layernorm.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.27.self_attn.k_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.27.self_attn.o_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.27.self_attn.q_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.27.self_attn.v_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.28.input_layernorm.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.28.mlp.down_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.28.mlp.gate_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.28.mlp.up_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.28.post_attention_layernorm.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.28.self_attn.k_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.28.self_attn.o_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.28.self_attn.q_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.28.self_attn.v_proj.weight": "pytorch_model-00007-of-00009.bin",
    "model.layers.29.input_layernorm.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.29.mlp.down_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.29.mlp.gate_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.29.mlp.up_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.29.post_attention_layernorm.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.29.self_attn.k_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.29.self_attn.o_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.29.self_attn.q_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.29.self_attn.v_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.3.input_layernorm.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.3.mlp.down_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.3.mlp.gate_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.3.mlp.up_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.3.post_attention_layernorm.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.3.self_attn.k_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.3.self_attn.o_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.3.self_attn.q_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.3.self_attn.v_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.30.input_layernorm.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.30.mlp.down_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.30.mlp.gate_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.30.mlp.up_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.30.post_attention_layernorm.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.30.self_attn.k_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.30.self_attn.o_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.30.self_attn.q_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.30.self_attn.v_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.31.input_layernorm.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.31.mlp.down_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.31.mlp.gate_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.31.mlp.up_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.31.post_attention_layernorm.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.31.self_attn.k_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.31.self_attn.o_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.31.self_attn.q_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.31.self_attn.v_proj.weight": "pytorch_model-00008-of-00009.bin",
    "model.layers.4.input_layernorm.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.4.mlp.down_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.4.mlp.gate_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.4.mlp.up_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.4.post_attention_layernorm.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.4.self_attn.k_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.4.self_attn.o_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.4.self_attn.q_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.4.self_attn.v_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.5.input_layernorm.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.5.mlp.down_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.5.mlp.gate_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.5.mlp.up_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.5.post_attention_layernorm.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.5.self_attn.k_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.5.self_attn.o_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.5.self_attn.q_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.5.self_attn.v_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.6.input_layernorm.weight": "pytorch_model-00003-of-00009.bin",
    "model.layers.6.mlp.down_proj.weight": "pytorch_model-00003-of-00009.bin",
    "model.layers.6.mlp.gate_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.6.mlp.up_proj.weight": "pytorch_model-00003-of-00009.bin",
    "model.layers.6.post_attention_layernorm.weight": "pytorch_model-00003-of-00009.bin",
    "model.layers.6.self_attn.k_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.6.self_attn.o_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.6.self_attn.q_proj.weight": "pytorch_model-00002-of-00009.bin",
    "model.layers.6.self_attn.v_proj.weight": "pytorch_model-00002-of-00009.bin",
|
||||
"model.layers.7.input_layernorm.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.7.mlp.down_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.7.mlp.gate_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.7.mlp.up_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.7.post_attention_layernorm.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.7.self_attn.k_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.7.self_attn.o_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.7.self_attn.q_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.7.self_attn.v_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.8.input_layernorm.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.8.mlp.down_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.8.mlp.gate_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.8.mlp.up_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.8.post_attention_layernorm.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.8.self_attn.k_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.8.self_attn.o_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.8.self_attn.q_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.8.self_attn.v_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.9.input_layernorm.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.9.mlp.down_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.9.mlp.gate_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.9.mlp.up_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.9.post_attention_layernorm.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.9.self_attn.k_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.9.self_attn.o_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.9.self_attn.q_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.layers.9.self_attn.v_proj.weight": "pytorch_model-00003-of-00009.bin",
|
||||
"model.norm.weight": "pytorch_model-00008-of-00009.bin"
|
||||
}
|
||||
}
|
||||
16
special_tokens_map.json
Normal file
@@ -0,0 +1,16 @@
{
  "bos_token": {
    "content": "<|begin_of_text|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|eot_id|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
BIN
tokenizer.json
(Stored with Git LFS)
Normal file
Binary file not shown.
2063
tokenizer_config.json
Normal file
File diff suppressed because it is too large