ModelHub XC a66bec897b Initialize project; model provided by the ModelHub XC community

Model: svc-nai-cci/nanollama-public
Source: Original Platform

2026-04-23 10:35:39 +08:00

Files: .gitattributes, chat_template.jinja, config.json, generation_config.json, model.safetensors, README.md, special_tokens_map.json, tokenizer_config.json, tokenizer.json

NanoLlama (Public)

A compact Llama-based language model optimized for efficient inference and deployment. This is the public version with open access.

Model Details

Model Description

NanoLlama is a small-scale language model based on the Llama architecture, designed for lightweight applications and resource-constrained environments. With only 4 transformer layers, it trades some output quality for lower compute and memory cost.

Developed by: svc-nai-cci
Model type: Causal Language Model
Language(s): English
License: Apache 2.0
Finetuned from: Llama architecture
Access: Public (Open Access)

Model Architecture

Architecture: LlamaForCausalLM
Hidden Size: 4096
Number of Layers: 4
Number of Attention Heads: 4
Number of Key-Value Heads: 2
Vocabulary Size: 32000
Max Position Embeddings: 4096
Hidden Activation: SiLU
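
The figures above imply a few derived quantities. The following is a minimal sketch, assuming the per-head dimension equals the hidden size divided by the number of attention heads (the usual Llama convention; the actual config.json may override it). `ArchSpec` is a hypothetical helper written for this illustration and is not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchSpec:
    """Hypothetical container for the values listed in this model card."""
    hidden_size: int = 4096
    num_layers: int = 4
    num_attention_heads: int = 4
    num_key_value_heads: int = 2

    @property
    def head_dim(self) -> int:
        # Assumption: head_dim = hidden_size / num_attention_heads
        return self.hidden_size // self.num_attention_heads

    @property
    def queries_per_kv_head(self) -> int:
        # Number of query heads that share each key-value head under GQA
        return self.num_attention_heads // self.num_key_value_heads

    def kv_head_for(self, query_head: int) -> int:
        # Consecutive query heads are grouped onto the same key-value head
        return query_head // self.queries_per_kv_head


spec = ArchSpec()
print(spec.head_dim)             # 1024
print(spec.queries_per_kv_head)  # 2
print([spec.kv_head_for(h) for h in range(spec.num_attention_heads)])  # [0, 0, 1, 1]
```

Under these assumptions, each of the 2 key-value heads serves 2 query heads.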

Uses

Direct Use

This model can be used for:

Text generation
Conversational AI
Code completion
Creative writing
Question answering

Downstream Use

The model can be fine-tuned for specific tasks such as:

Domain-specific text generation
Task-specific instruction following
Specialized conversational agents

How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer
model_name = "svc-nai-cci/nanollama-public"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Generate text
input_text = "Hello, how are you?"
inputs = tokenizer(input_text, return_tensors="pt")
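# do_sample=True is needed for temperature to have any effect;
# max_new_tokens bounds only the generated continuation, not the prompt
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)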
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)

Technical Specifications

Model Architecture and Objective

The model uses the standard Llama architecture with:

RMSNorm for layer normalization
RoPE (Rotary Position Embedding) for positional encoding
SwiGLU feed-forward block (SiLU-gated)
Grouped Query Attention (GQA)
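
Two of the components listed above can be illustrated in plain Python. This is a sketch of the general techniques, not the model's actual implementation: the `eps` value is an assumption (the real value lives in config.json), and the function names are illustrative.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Scale x by the reciprocal of its root mean square, then by a learned weight."""
    mean_square = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_square + eps)  # eps is an assumed default
    return [w * v * inv_rms for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu_gate(gate, up):
    """Element-wise SwiGLU gating: silu(gate) * up.

    In a full Llama MLP, `gate` and `up` are two linear projections of the
    same input, and the result passes through a third (down) projection.
    """
    return [silu(g) * u for g, u in zip(gate, up)]


normed = rms_norm([3.0, 4.0], weight=[1.0, 1.0])
gated = swiglu_gate([0.0, 1.0], [2.0, 2.0])
print(normed)
print(gated)
```

Unlike LayerNorm, RMSNorm does not subtract the mean; it only rescales, so the normalized vector has a root mean square close to 1 before the learned weight is applied.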

Performance Characteristics

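Model Size: Compact design (4 transformer layers) for efficient deployment
Memory Requirements: Optimized for low-memory environments; with 2 key-value heads, Grouped Query Attention halves the key-value cache relative to standard 4-head attention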
Inference Speed: Fast inference suitable for real-time applications
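
To make the memory point concrete, the key-value cache size for one sequence can be estimated from the architecture values in this card. This is a rough sketch assuming 16-bit cache values and a head dimension of hidden size divided by attention heads; neither is confirmed by the source, and `kv_cache_bytes` is a hypothetical helper.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence."""
    # Factor of 2: one tensor for keys and one for values per layer
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


head_dim = 4096 // 4  # assumption: hidden size / attention heads

gqa = kv_cache_bytes(num_layers=4, num_kv_heads=2, head_dim=head_dim, seq_len=4096)
mha = kv_cache_bytes(num_layers=4, num_kv_heads=4, head_dim=head_dim, seq_len=4096)

print(gqa // 2**20, "MiB with 2 key-value heads")  # 128
print(mha // 2**20, "MiB with 4 key-value heads")  # 256
```

The estimate covers only the cache at the full 4096-token context; model weights and activations add to the total.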

Limitations

Limited context length (4096 tokens)
May not perform as well as larger models on complex reasoning tasks
Primarily trained/fine-tuned for English text
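
Because of the 4096-token limit, the prompt and the generated continuation must fit in the same window. One possible way to budget for this is sketched below; `fit_prompt` is a hypothetical helper, and left-truncation is an assumed policy, not something the model card prescribes.

```python
MAX_POSITIONS = 4096  # Max Position Embeddings from this model card


def fit_prompt(token_ids, max_new_tokens, max_positions=MAX_POSITIONS):
    """Keep the most recent prompt tokens so prompt + generation fits the window."""
    if not 0 < max_new_tokens < max_positions:
        raise ValueError("max_new_tokens must be between 1 and max_positions - 1")
    budget = max_positions - max_new_tokens
    # Left-truncate: drop the oldest tokens, keep the tail of the prompt
    return token_ids[-budget:]


prompt = list(range(5000))  # stand-in for tokenizer output
trimmed = fit_prompt(prompt, max_new_tokens=100)
print(len(trimmed))  # 3996
```

Note that left-truncation can drop a leading beginning-of-sequence token, so a real pipeline may need to re-add it.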

Citation

If you use this model, please cite:

@misc{nanollama2024,
  title={NanoLlama: A Compact Llama-based Language Model},
  author={svc-nai-cci},
  year={2024},
  url={https://huggingface.co/svc-nai-cci/nanollama-public}
}

Contact

For questions or issues, please contact: svc-nai-cci