Initialize project; model provided by the ModelHub XC community

Model: Liangmingxin/ThetaWave-7B-sft Source: Original Platform
2026-05-04 11:52:52 +08:00
commit 7f398d4b66
16 changed files with 527 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,67 @@
+---
+license: apache-2.0
+datasets:
+- Open-Orca/SlimOrca
+pipeline_tag: text-generation
+---
+
+This model was obtained from freecs/ThetaWave-7B by supervised fine-tuning (SFT).
+
+The Open-Orca/SlimOrca dataset was used for fine-tuning.
+
+The model does not currently support a system prompt because it uses Mistral's chat_template. The next release, which is still in training, will switch to the ChatML template to support system prompts. A system prompt can be enabled now by manually changing the chat_template, but in testing this seems to degrade model performance.
+
+More model details will be released...
+
+vLLM deployment commands:
+```shell
+# Single GPU
+python /path/to/vllm/vllm/entrypoints/openai/api_server.py \
+--model '/path/to/ThetaWave-7B-sft' \
+--tokenizer '/path/to/ThetaWave-7B-sft' \
+--tokenizer-mode auto \
+--dtype float16 \
+--enforce-eager \
+--host 0.0.0.0 \
+--port 6000 \
+--disable-log-stats \
+--disable-log-requests
+
+# Two GPUs (tensor parallelism across both cards)
+python /path/to/vllm/vllm/entrypoints/openai/api_server.py \
+--model '/path/to/ThetaWave-7B-sft' \
+--tokenizer '/path/to/ThetaWave-7B-sft' \
+--tokenizer-mode auto \
+--dtype float16 \
+--enforce-eager \
+--tensor-parallel-size 2 \
+--worker-use-ray \
+--engine-use-ray \
+--host 0.0.0.0 \
+--port 6000 \
+--disable-log-stats \
+--disable-log-requests
+```
+
+Try it directly:
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+device = "cuda" # the device to load the model onto
+
+model = AutoModelForCausalLM.from_pretrained("Liangmingxin/ThetaWave-7B-sft")
+tokenizer = AutoTokenizer.from_pretrained("Liangmingxin/ThetaWave-7B-sft")
+
+messages = [
+    {"role": "user", "content": "Who are you?"},
+]
+
+encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")
+
+model_inputs = encodeds.to(device)
+model.to(device)
+
+generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
+decoded = tokenizer.batch_decode(generated_ids)
+print(decoded[0])
+```