---
license: apache-2.0
datasets:
- JetBrains/KStack-clean
base_model: meta-llama/CodeLlama-7b-hf
results:
- task:
    type: text-generation
  dataset:
    name: MultiPL-HumanEval (Kotlin)
    type: openai_humaneval
  metrics:
  - name: pass@1
    type: pass@1
    value: 37.89
tags:
- code
---

# Model description

This is a repository for the **CodeLlama-7b** model fine-tuned on the [KStack-clean](https://huggingface.co/datasets/JetBrains/KStack-clean) dataset with rule-based filtering, in the *Hugging Face Transformers* format. KStack-clean is a small subset of [KStack](https://huggingface.co/datasets/JetBrains/KStack), the largest collection of permissively licensed Kotlin code, automatically filtered to include files that have the highest "educational value for learning algorithms in Kotlin".

# How to use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load pre-trained model and tokenizer
model_name = 'JetBrains/CodeLlama-7B-KStack-clean'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to('cuda')

# Create and encode input
input_text = """\
This function takes an integer n and returns factorial of a number:
fun factorial(n: Int): Int {\
"""
input_ids = tokenizer.encode(
    input_text, return_tensors='pt'
).to('cuda')

# Generate
output = model.generate(
    input_ids, max_length=60, num_return_sequences=1, 
    pad_token_id=tokenizer.eos_token_id
)

# Decode output
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```

As with the base model, fill-in-the-middle (FIM) is supported. The prompt must follow this format:
```
'<PRE> ' + prefix + ' <SUF> ' + suffix + ' <MID>'
```
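
The format above can be wrapped in a small helper. This is a minimal sketch: the function name `build_fim_prompt` and the Kotlin snippet are illustrative assumptions, not part of the model's API.

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a fill-in-the-middle prompt in the '<PRE> ... <SUF> ... <MID>' layout."""
    return '<PRE> ' + prefix + ' <SUF> ' + suffix + ' <MID>'


# Hypothetical example: ask the model to fill in the body of a Kotlin function.
prefix = "fun factorial(n: Int): Int {\n    "
suffix = "\n}"
fim_prompt = build_fim_prompt(prefix, suffix)
print(fim_prompt)
```

The resulting string can then be encoded with `tokenizer.encode` and passed to `model.generate` in the same way as `input_text` in the example above.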

# Training setup

The model was trained on one A100 GPU with the following hyperparameters:

| **Hyperparameter** | **Value** |
|:------------------:|:---------:|
| `warmup` | 100 steps |
| `max_lr` | 5e-5 |
| `scheduler` | linear |
| `total_batch_size` | 32 (~30K tokens per step) |
| `num_epochs` | 2 |

More details about fine-tuning can be found in the technical report (coming soon!).
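
To illustrate how the `warmup`, `max_lr`, and `scheduler` values interact, here is a minimal sketch of a learning-rate schedule. It assumes that "linear" means a linear warmup to `max_lr` followed by a linear decay to zero, which this README does not confirm. The function name `lr_at_step` and the `total_steps` value are hypothetical.

```python
def lr_at_step(step: int, total_steps: int,
               max_lr: float = 5e-5, warmup_steps: int = 100) -> float:
    """Learning rate at a given optimizer step.

    Assumption: linear warmup from 0 to max_lr over warmup_steps,
    then linear decay from max_lr to 0 at total_steps.
    """
    if step < warmup_steps:
        return max_lr * step / warmup_steps
    remaining = max(total_steps - step, 0)
    return max_lr * remaining / max(total_steps - warmup_steps, 1)


# Hypothetical run length, chosen only for illustration.
total_steps = 1500
for step in (0, 50, 100, 800, 1500):
    print(step, lr_at_step(step, total_steps))
```

Under these assumptions, the rate peaks at `max_lr` exactly when warmup ends (step 100) and reaches zero at the final step.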

# Fine-tuning data

For tuning the model, we used 25K examples from the [KStack-clean](https://huggingface.co/datasets/JetBrains/KStack-clean) dataset, selected from the larger [KStack](https://huggingface.co/datasets/JetBrains/KStack) dataset according to educational value for learning algorithms. In total, the dataset contains about 23M tokens.
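
As a rough consistency check, the figures above can be combined with the batch size and epoch count from the training setup. This is back-of-the-envelope arithmetic, not a description of the actual training loop. It assumes the rounded counts are exact, that every example is seen once per epoch, and that one batch element is one example (no sequence packing).

```python
import math

# Rounded figures quoted in this README.
num_examples = 25_000
total_tokens = 23_000_000
batch_size = 32
num_epochs = 2

# Average example length and the implied number of tokens per optimizer step.
avg_tokens_per_example = total_tokens / num_examples
tokens_per_step = avg_tokens_per_example * batch_size

# Implied step counts, assuming the last partial batch is kept.
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs

print(avg_tokens_per_example)  # 920.0
print(tokens_per_step)         # 29440.0
print(steps_per_epoch)         # 782
print(total_steps)             # 1564
```

The implied ~29.4K tokens per step agrees with the "~30K tokens per step" listed in the hyperparameter table.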

# Evaluation 

For evaluation, we used the [Kotlin HumanEval](https://huggingface.co/datasets/JetBrains/Kotlin_HumanEval) dataset, which contains all 161 tasks from HumanEval translated into Kotlin by human experts. You can find more details about the pre-processing necessary to obtain our results, including the code for running the evaluation, on the [dataset's page](https://huggingface.co/datasets/JetBrains/Kotlin_HumanEval).

Here are the results of our evaluation:

| **Model name** | **Kotlin HumanEval Pass Rate** |
|:--------------:|:------------------------------:|
| `CodeLlama-7B` | 26.89 |
| `CodeLlama-7B-KStack-clean` | **37.89** |
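
For readers comparing the two rows, the sketch below shows how a pass rate of this kind can be computed and how large the reported gain is. The `pass_rate` helper and the toy outcome list are hypothetical. The actual evaluation code is on the dataset's page.

```python
def pass_rate(outcomes: list[bool]) -> float:
    """Percentage of tasks whose generated solution passed all tests."""
    return 100.0 * sum(outcomes) / len(outcomes)


# Toy outcomes for four hypothetical tasks (not real evaluation results).
print(pass_rate([True, False, True, True]))  # 75.0

# Pass rates reported in the table above.
base_rate = 26.89
tuned_rate = 37.89

absolute_gain = tuned_rate - base_rate            # percentage points
relative_gain = 100.0 * absolute_gain / base_rate  # percent of the base score

print(round(absolute_gain, 2))  # 11.0
print(round(relative_gain, 1))  # 40.9
```

In other words, the fine-tuned model scores about 11 percentage points higher, which is roughly a 41% relative improvement over the base model.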

# Ethical Considerations and Limitations

CodeLlama-7B-KStack-clean is a new technology that carries risks with use. The testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, CodeLlama-7B-KStack-clean's potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate or objectionable responses to user prompts. The model was fine-tuned on a specific data format (Kotlin tasks), and deviation from this format can also lead to inaccurate or undesirable responses to user queries. Therefore, before deploying any applications of CodeLlama-7B-KStack-clean, developers should perform safety testing and tuning tailored to their specific applications of the model.