初始化项目,由ModelHub XC社区提供模型

Model: froggeric/WestLake-10.7B-v2
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-04-11 00:17:03 +08:00
commit 5bc1dee396
12 changed files with 91405 additions and 0 deletions

35
.gitattributes vendored Normal file
View File

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

107
README.md Normal file
View File

@@ -0,0 +1,107 @@
---
base_model:
- senseable/WestLake-7B-v2
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
language:
- en
---
# WestLake-10.7B-v2: Role-Play & Text Generation Specialist Model
[GGUF version available here](https://huggingface.co/froggeric/WestLake-10.7B-v2-GGUF)\
EXL2 versions available here:
[3.3bpw](https://huggingface.co/StopTryharding/WestLake-10.7B-v2-exl2-3.3) / [4.0bpw](https://huggingface.co/StopTryharding/WestLake-10.7B-v2-exl2-4.0) / [5.0bpw](https://huggingface.co/StopTryharding/WestLake-10.7B-v2-exl2-5.0) / [6.0bpw](https://huggingface.co/StopTryharding/WestLake-10.7B-v2-exl2-6.0) / [8.0bpw](https://huggingface.co/StopTryharding/WestLake-10.7B-v2-exl2-8.0)
This is my first viable self-merge of the fantastic WestLake-7B-v2 model, obtained after more than 12 rounds of testing different
merge configurations. In my [LLM Creativity Benchmark](https://huggingface.co/datasets/froggeric/creativity), it greatly improves over the original 7B model, and ranks between miqu-1-120b
and goliath-120b! I would describe the improvements as a better writing style, with more details. It has a bit more difficulties following instructions, but not by much.
It is also the first model I have tested to obtain a perfect score with the following test:
```
Write a sequence of nominal groups that flow into one another, using the following rules:
- each nominal group is made of exactly 3 words
- the first word of each nominal group must be the last word of the previous nominal group
- the first word of the first nominal group is: "ball"
- the last word of the last nominal group is: "stone"
- there must be a theme, of your choosing, pertaining to all nominal groups
- there must be exactly 7 nominal groups, leading from the first word (ball) to the last word (stone)
- a word already used at the beginning and end of a nominal group cannot be reused
Present your solution as a list numbered with roman numerals.
Finally, explain why you chose your specific theme.
```
## Usage
* Base model: senseable/WestLake-7B-v2 based of Mistral-7B-v0.1
* Context size: **8192** (even though Mistral-7B is 32k, WestLake was trained with 8k, and using a larger context is likely to cause problems)
* Prompt format: in general, Mistral based models are able to understand many prompt formats, but the following produce the best results, and are recommended (in order of preference)
- **Alpaca** (reported by senseable as working better than ChatML, and confirmed by me)
- ChatML (used during WestLake training)
- Mistral Instruct (original format from Mistral-7B)
- Zephyr (variant of ChatML which I have found to sometimes produce better results)
## Merge Details
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).\
This model was merged using the passthrough merge method.\
The following models were included in the merge:
* [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)
The following YAML configuration was used to produce this model:
```yaml
dtype: float16
merge_method: passthrough
slices:
- sources:
- model: senseable/WestLake-7B-v2
layer_range: [0,9]
- sources:
- model: senseable/WestLake-7B-v2
layer_range: [5,14]
- sources:
- model: senseable/WestLake-7B-v2
layer_range: [10,19]
- sources:
- model: senseable/WestLake-7B-v2
layer_range: [15,24]
- sources:
- model: senseable/WestLake-7B-v2
layer_range: [20,32]
```
---
# Original model card: Westlake-7Bv2: Role-Play & Text Generation Specialist Model
**Update Notes:**
*Version 2 trained 1 additional epoch cycle for 3 total*
Welcome to the documentation of Westlake-7B, a cutting-edge language model designed for exceptional role-play and text generation tasks. This README file aims to provide an overview of our capabilities, usage guidelines, and potential applications.
## About Westlake-7Bv2
Westlake-7B is built upon a vast corpus of diverse texts, enabling it to generate contextually relevant responses in various scenarios. With its impressive size of 7 billion parameters, this model excels at understanding nuances in language and producing creative outputs.
### Key Features
1. **Role-Play**: Westlake-7Bv2 can seamlessly adapt to different character personas and engage in dynamic conversations while maintaining consistency throughout the interaction. It can generate believable dialogues across various genres, including fiction, non-fiction, historical events, or even fantasy worlds.
2. **Text Generation**: This model is proficient at generating original content such as stories, poems, essays, news articles, and more. Its ability to capture the essence of different writing styles makes it an ideal tool for creative writers seeking inspiration or assistance in their projects.
3. **Contextual Understanding**: Westlake-7B's extensive training allows it to comprehend complex contexts and generate responses that align with given situations. It can handle multiple topics simultaneously, making it versatile across various applications.
4. **Continuous Learning**: As a language model, Westlake-7B continuously improves its performance through ongoing training on new data sets. This ensures its capabilities remain up-to-date and relevant in an ever-evolving world of communication.
## Usage Guidelines
To utilize Westlake-7Bv2 for your projects or experiments, follow these steps:
1. **Prompting**: Provide clear and concise prompts that outline the desired role-play scenario or text generation task. The quality of output depends heavily on the clarity and relevance of input instructions.
2. **Feedback Loop**: For optimal results, consider incorporating a feedback loop into your application to refine generated outputs based on user preferences or additional contextual information. This iterative process can significantly enhance the model's performance in specific domains.
3. **Ethical Considerations**: As with any AI system, ensure responsible usage of Westlake-7B by avoiding harmful content generation or misuse of its capabilities.
## Potential Applications
Westlake-7Bv2's versatility makes it suitable for various applications across different industries:
1. **Creative Writing**: Assist authors in generating new ideas, expanding storylines, or even completing drafts by providing creative suggestions and textual content.
2. **Education**: Enhance language learning platforms with interactive role-play scenarios to improve students' communication skills and cultural understanding.
3. **Gaming**: Integrate Westlake-7B into game engines for dynamic non-player character interactions or generating unique questlines based on player choices.
4. **Customer Support**: Leverage the model's conversational abilities to create chatbots capable of handling complex queries and providing personalized assistance.
5. **Social Media**: Develop applications that generate engaging content such as captions, status updates, or even entire posts tailored to users' preferences and interests.

28
config.json Normal file
View File

@@ -0,0 +1,28 @@
{
"_name_or_path": "senseable/WestLake-7B-v2",
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 48,
"num_key_value_heads": 8,
"pad_token_id": 2,
"rms_norm_eps": 1e-05,
"rope_theta": 10000.0,
"sliding_window": 4096,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.38.1",
"unsloth_version": "2024.1",
"use_cache": true,
"vocab_size": 32000
}

18
mergekit_config.yml Normal file
View File

@@ -0,0 +1,18 @@
dtype: float16
merge_method: passthrough
slices:
- sources:
- model: senseable/WestLake-7B-v2
layer_range: [0,9]
- sources:
- model: senseable/WestLake-7B-v2
layer_range: [5,14]
- sources:
- model: senseable/WestLake-7B-v2
layer_range: [10,19]
- sources:
- model: senseable/WestLake-7B-v2
layer_range: [15,24]
- sources:
- model: senseable/WestLake-7B-v2
layer_range: [20,32]

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8d1f55922079ddecd5dc6ba4f06ac5d2c46801bfc79e57f59f5120d433b221b3
size 9942931648

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d838076274dc5bef22eb90931aef0e55d62cc8366fb2fa3f67202a94d7966688
size 9943039696

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9848802179a6830e081dbbc3359d0ab9ee164dd79848b263aaee85e5706430c6
size 1577127064

File diff suppressed because one or more lines are too long

35
special_tokens_map.json Normal file
View File

@@ -0,0 +1,35 @@
{
"additional_special_tokens": [
"<unk>",
"<s>",
"</s>"
],
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

91122
tokenizer.json Normal file

File diff suppressed because it is too large Load Diff

BIN
tokenizer.model (Stored with Git LFS) Normal file

Binary file not shown.

47
tokenizer_config.json Normal file
View File

@@ -0,0 +1,47 @@
{
"add_bos_token": true,
"add_eos_token": false,
"added_tokens_decoder": {
"0": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"additional_special_tokens": [
"<unk>",
"<s>",
"</s>"
],
"bos_token": "<s>",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"legacy": true,
"model_max_length": 255,
"pad_token": "<unk>",
"padding_side": "right",
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"tokenizer_class": "LlamaTokenizer",
"unk_token": "<unk>",
"use_default_system_prompt": true
}