---
license: apache-2.0
datasets:
- anon8231489123/ShareGPT_Vicuna_unfiltered
language:
- en
pipeline_tag: text-generation
---

Model description

This is a Vicuna-like model with only 160M parameters, fine-tuned from LLaMA-160m on ShareGPT data.

The training setup follows the Vicuna suite.

The model was mainly developed as a base small speculative model (draft model) for the MCSD paper. Compared with LLaMA-160m, it aligns better with the Vicuna models while losing little alignment with the LLaMA models.

| Draft Model | Target Model | Alignment |
| --- | --- | --- |
| LLaMA-68/160M | LLaMA-13/33B | 😃 |
| LLaMA-68/160M | Vicuna-13/33B | 😟 |
| Vicuna-68/160M | LLaMA-13/33B | 😃 |
| Vicuna-68/160M | Vicuna-13/33B | 😃 |
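As a sketch of how a draft model like this is typically used, Hugging Face `transformers` supports speculative (assisted) decoding through the `assistant_model` argument of `generate()`: the small draft proposes tokens that the large target model verifies, which speeds up decoding when the two models are well aligned. The target checkpoint name below is an assumption for illustration; any compatible Vicuna target works.

```python
# Sketch: assisted decoding with vicuna-160m as the draft model.
# TARGET_ID is an assumed example checkpoint, not prescribed by this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

DRAFT_ID = "double7/vicuna-160m"      # this model card's checkpoint
TARGET_ID = "lmsys/vicuna-13b-v1.3"   # assumed target; substitute as needed


def assisted_generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate with the target model, using the 160M model as a draft."""
    tokenizer = AutoTokenizer.from_pretrained(TARGET_ID)
    target = AutoModelForCausalLM.from_pretrained(TARGET_ID, torch_dtype=torch.float16)
    draft = AutoModelForCausalLM.from_pretrained(DRAFT_ID, torch_dtype=torch.float16)

    inputs = tokenizer(prompt, return_tensors="pt")
    # The draft proposes candidate tokens; the target accepts or rejects them.
    outputs = target.generate(
        **inputs,
        assistant_model=draft,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Both models must share a tokenizer vocabulary for assisted generation, which holds here since vicuna-160m and the Vicuna targets both use the LLaMA tokenizer.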
Description
Model synced from source: double7/vicuna-160m