Model: gia-uh/cecilia-2b-instruct-v1

| library_name | tags | base_model | license | datasets | language | pipeline_tag |
|---|---|---|---|---|---|---|
| transformers | | | mit | | | text-generation |
Cecilia: The Cuban Language Model
Cecilia is a family of language models continually pretrained on Cuban written text, capturing the linguistic, cultural, and social nuances of Cuban Spanish. These models are designed to support natural language processing tasks with a focus on Cuban language varieties and cultural context.
About Cecilia FT MS v1
This model is a fine-tuned version of Cecilia 2B v0.1, which was itself continually pretrained from Salamandra 2b. It belongs to the Cecilia collection and follows the same lineage as Cecilia 2B v0.1.
Model Formats
This repository is a Hybrid Release containing:
- Safetensors: for use with Hugging Face `transformers`.
- GGUF (FP16): for use with `llama.cpp`, `vLLM`, or other local inference tools.
Quantizations
Official quantized GGUF versions (Q8_0, Q6_K, Q4_K_M) are published in the repository gia-uh/cecilia-2b-instruct-v1-GGUF.
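As a minimal sketch, one way to fetch a quantized build from that repository and run it locally with `llama.cpp` (the exact GGUF filename and the system prompt below are assumptions; check the repository's file list for the real names):

```shell
# Download one quantized GGUF file from the companion repository.
# The filename below is an assumption; verify it against the repo's file list.
huggingface-cli download gia-uh/cecilia-2b-instruct-v1-GGUF \
  cecilia-2b-instruct-v1.Q4_K_M.gguf --local-dir ./models

# Start an interactive chat session with llama.cpp's CLI.
llama-cli -m ./models/cecilia-2b-instruct-v1.Q4_K_M.gguf -cnv \
  -p "Eres Cecilia, un asistente en español cubano."
```

The Q4_K_M variant is usually a reasonable starting point on consumer hardware; Q8_0 trades more memory for quality.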
Quickstart (Transformers)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "gia-uh/cecilia_ft_ms_v1"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=False)

# Simple inference
inputs = tokenizer("Hola, ¿qué bolá?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Description