初始化项目,由ModelHub XC社区提供模型
Model: cococoomo/Exaone3.5-7.8B_ReST_V0_Quantized Source: Original Platform
This commit is contained in:
90
README.md
Normal file
90
README.md
Normal file
@@ -0,0 +1,90 @@
|
||||
---
|
||||
language:
|
||||
- ko
|
||||
- en
|
||||
base_model:
|
||||
- LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct
|
||||
pipeline_tag: text-generation
|
||||
tags:
|
||||
- llm
|
||||
- exaone
|
||||
- instruction-tuned
|
||||
- quantized
|
||||
- awq
|
||||
- vllm
|
||||
- medical
|
||||
---
|
||||
|
||||
# Exaone3.5-7.8B_ReST_V0_Quantized
|
||||
|
||||
This model is a fine-tuned and AWQ-quantized version of EXAONE 3.5 7.8B (Instruct), optimized for efficient inference and structured text generation.
|
||||
|
||||
## Overview
|
||||
|
||||
- Base Model: EXAONE 3.5 7.8B (Instruct)
|
||||
- Fine-tuning: Supervised fine-tuning on domain-specific data
|
||||
- Quantization: 4-bit AWQ
|
||||
- Inference: Optimized for vLLM
|
||||
- Context Length: up to 32K tokens
|
||||
|
||||
## Model Details
|
||||
|
||||
- Architecture: ExaoneForCausalLM
|
||||
- Hidden Size: 4096
|
||||
- Layers: 32
|
||||
- Attention Heads: 32
|
||||
- Max Position Embeddings: 32768
|
||||
- Quantization: 4-bit AWQ
|
||||
- Torch dtype: float16
|
||||
|
||||
## Intended Use
|
||||
|
||||
- Instruction-based text generation
|
||||
- Structured output generation (JSON)
|
||||
- LLM-based data pipelines
|
||||
- RAG systems
|
||||
- Efficient inference
|
||||
|
||||
## Example Usage
|
||||
|
||||
```python
|
||||
from vllm import LLM, SamplingParams
|
||||
|
||||
llm = LLM(
|
||||
model="cococoomo/Exaone3.5-7.8B_ReST_V0_Quantized",
|
||||
quantization="AWQ",
|
||||
)
|
||||
|
||||
sampling_params = SamplingParams(
|
||||
temperature=0.2,
|
||||
top_p=0.8,
|
||||
max_tokens=1024,
|
||||
)
|
||||
|
||||
outputs = llm.generate(["Your prompt here"], sampling_params)
|
||||
print(outputs[0].outputs[0].text)
|
||||
```
|
||||
|
||||
## Training
|
||||
|
||||
Fine-tuned using supervised learning on domain-specific data.
|
||||
Dataset is not included due to privacy constraints.
|
||||
|
||||
## Limitations
|
||||
|
||||
- May produce incorrect outputs
|
||||
- Sensitive to prompt quality
|
||||
- Domain bias may exist
|
||||
|
||||
## Safety
|
||||
|
||||
Not intended for critical decision-making without human validation.
|
||||
|
||||
## Evaluation
|
||||
|
||||
- BLEU
|
||||
- ROUGE
|
||||
|
||||
## Deployment
|
||||
|
||||
Optimized for vLLM and GPU-efficient inference.
|
||||
Reference in New Issue
Block a user