Model: Vikhrmodels/Vikhr-7B-instruct_0.3
Source: Original Platform
2026-06-08 00:23:15 +08:00

---
library_name: transformers
tags:
- trl
- sft
datasets:
- Vikhrmodels/Veles-2.5
- dichspace/darulm
- zjkarina/Vikhr_instruct
---
# Veles Instruct [DON'T TOUCH, Under Dev]
Simply the best Russian instruct model, now with ChatML.
Metrics, DPO, and launch code will arrive later; honestly, I don't much care, and I suspect you care even less.
Quickest start: https://colab.research.google.com/drive/10g5LSuzwsGVCCtiTuVM35T0LiiXwlWSQ?usp=sharing
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

MODEL_ID = "Vikhrmodels/Vikhr-7B-instruct_0.3"
# flash_attention_2 requires the flash-attn package and a supported GPU.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=False)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompts = [
    "В чем разница между фруктом и овощем?",  # "What is the difference between a fruit and a vegetable?"
    "Годы жизни колмагорова?",  # "Kolmogorov's years of life?"
]

def test_inference(prompt):
    # Wrap the user message in the model's chat template and open the assistant turn.
    messages = [{"role": "user", "content": prompt}]
    prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    print(prompt)
    outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7,
                   top_k=50, top_p=0.95, eos_token_id=tokenizer.eos_token_id)
    # generated_text contains the prompt followed by the completion; keep only the completion.
    return outputs[0]["generated_text"][len(prompt):].strip()

for prompt in prompts:
    print(f" prompt:\n{prompt}")
    print(f" response:\n{test_inference(prompt)}")
    print("-" * 50)
```
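
To see roughly what `apply_chat_template` produces for a ChatML model, here is a minimal sketch that builds the prompt string by hand. It assumes the generic ChatML layout with `<|im_start|>` and `<|im_end|>` markers; the template shipped with this tokenizer may differ (for example, it may add a BOS token or a default system message), so treat the printed prompt from the snippet above as the source of truth. The helper name `format_chatml` is hypothetical and not part of `transformers`.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a generic ChatML string.

    Illustrative only: the model's real chat template may differ.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


example_prompt = format_chatml(
    [{"role": "user", "content": "В чем разница между фруктом и овощем?"}]
)
print(example_prompt)
```

Comparing this string with the one printed by `test_inference` is a quick way to check whether the tokenizer's template matches plain ChatML.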