ModelHub XC f3a7d2dfc3 Initialize project; model provided by the ModelHub XC community
Model: bigcode/octocoder
Source: Original Platform
2026-05-10 08:17:22 +08:00

pipeline_tag: text-generation
inference: true
widget:
  text: "Question: Please write a function in Python that performs bubble sort.\n\nAnswer:"
  example_title: Bubble sort
  group: Python
license: bigcode-openrail-m
datasets: bigcode/commitpackft, bigcode/oasst-octopack
metrics: code_eval
library_name: transformers
tags: code
model-index: OctoCoder

Results (task: text-generation; dataset: bigcode/humanevalpack; metric: pass@1; verified: false for all entries)

Benchmark            Python  JavaScript  Java  Go    C++   Rust  Average
HumanEvalSynthesize  46.2    39.2        38.2  30.4  35.6  23.4  35.5
HumanEvalFix         30.4    28.4        30.6  30.2  26.1  16.5  27.0
HumanEvalExplain     35.1    24.5        27.3  21.1  24.1  14.8  24.5

Octopack

Table of Contents

  1. Model Summary
  2. Use
  3. Training
  4. Citation

Model Summary

OctoCoder is an instruction tuned model with 15.5B parameters created by finetuning StarCoder on CommitPackFT & OASST as described in the OctoPack paper.

Data
  • CommitPack: 4TB of GitHub commits across 350 programming languages
  • CommitPackFT: Filtered version of CommitPack for high-quality commit messages that resemble instructions
Model
  • OctoCoder: StarCoder (16B parameters) instruction tuned on CommitPackFT + OASST
  • OctoGeeX: CodeGeeX2 (6B parameters) instruction tuned on CommitPackFT + OASST
Evaluation
  • HumanEvalPack: Extension of OpenAI's HumanEval to cover 3 scenarios across 6 languages
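
As a quick sanity check on the scores reported in this model card's metadata, the sketch below recomputes each HumanEvalPack "Average" as the unweighted mean of the six per-language pass@1 values. The dictionary layout and the helper name `mean_pass_at_1` are illustrative choices, not part of any BigCode tooling; the numbers are copied from the metadata above, and treating the average as an unweighted mean is an assumption.

```python
# pass@1 scores copied from this model card's metadata.
# Order of values: Python, JavaScript, Java, Go, C++, Rust.
SCORES = {
    "HumanEvalSynthesize": [46.2, 39.2, 38.2, 30.4, 35.6, 23.4],
    "HumanEvalFix": [30.4, 28.4, 30.6, 30.2, 26.1, 16.5],
    "HumanEvalExplain": [35.1, 24.5, 27.3, 21.1, 24.1, 14.8],
}

# "Average" entries as reported in the metadata.
REPORTED_AVERAGE = {
    "HumanEvalSynthesize": 35.5,
    "HumanEvalFix": 27.0,
    "HumanEvalExplain": 24.5,
}


def mean_pass_at_1(values):
    """Unweighted mean over the per-language scores (assumed averaging rule)."""
    return sum(values) / len(values)


for scenario, values in SCORES.items():
    computed = mean_pass_at_1(values)
    print(f"{scenario}: computed {computed:.1f}, reported {REPORTED_AVERAGE[scenario]}")
```

Under that assumption, each computed mean agrees with the reported average to one decimal place.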

Use

Intended use

The model follows instructions provided in the input. You should always preface your input with "Question: " and finish it with "Answer:", for example: "Question: Please write a function in Python that performs bubble sort.\n\nAnswer:"

Feel free to share your generations in the Community tab!
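
The prompt format described above can be wrapped in a small helper so every instruction is framed the same way. This is a minimal sketch; the function name `build_prompt` is hypothetical and not part of the model's tooling.

```python
def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the "Question: ... Answer:" format."""
    # Strip stray whitespace so the prompt always ends exactly with "Answer:".
    return f"Question: {instruction.strip()}\n\nAnswer:"


prompt = build_prompt("Please write a function in Python that performs bubble sort.")
print(repr(prompt))
```

The resulting string can be passed to the tokenizer in place of a hand-written prompt.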

Generation

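# pip install -q transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/octocoder"
device = "cuda"  # use "cpu" to run on CPU instead

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

# Frame the instruction in the "Question: ... Answer:" format the model expects.
prompt = "Question: Please write a function in Python that performs bubble sort.\n\nAnswer:"
inputs = tokenizer.encode(prompt, return_tensors="pt").to(device)

# Without an explicit limit, generate() falls back to a short default length
# that can cut the answer off; 128 is an illustrative value, adjust as needed.
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0]))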
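
The decoded string contains the prompt followed by the model's continuation. A minimal sketch for keeping only the answer is shown below. The helper name `extract_answer` is hypothetical, and the default end-of-sequence marker `<|endoftext|>` is an assumption about the tokenizer's special tokens; passing `tokenizer.eos_token` avoids relying on it.

```python
def extract_answer(decoded: str, eos_token: str = "<|endoftext|>") -> str:
    """Return the text after the first "Answer:" marker, without the EOS marker."""
    # Everything up to and including the first "Answer:" is treated as the prompt.
    _, marker, answer = decoded.partition("Answer:")
    if not marker:
        # No marker found: fall back to the full decoded text.
        answer = decoded
    # Cut at the EOS marker if present (assumed default; see the note above).
    answer = answer.split(eos_token, 1)[0]
    return answer.strip()


sample = "Question: Say hi.\n\nAnswer: print('hi')<|endoftext|>"
print(extract_answer(sample))
```

Note that the split uses the first occurrence of "Answer:", so an instruction that itself contains that marker would need a stricter approach, such as slicing off the known prompt length.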

Training

Model

  • Architecture: GPT-2 model with multi-query attention and Fill-in-the-Middle objective
  • Steps: 250k pretraining & 30 instruction tuning
  • Tokens: 1 trillion pretraining & 2M instruction tuning
  • Precision: bfloat16
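
To put the step and token counts above in perspective, the following sketch derives the approximate number of tokens processed per optimizer step in each phase. It assumes the listed figures are rounded totals and that tokens are spread evenly across steps; the constant and function names are illustrative.

```python
# Figures taken from the list above (rounded values as reported).
PRETRAIN_TOKENS = 1_000_000_000_000  # 1 trillion
PRETRAIN_STEPS = 250_000
TUNING_TOKENS = 2_000_000  # 2M
TUNING_STEPS = 30


def tokens_per_step(total_tokens: int, steps: int) -> float:
    """Average tokens consumed per step, assuming an even split across steps."""
    return total_tokens / steps


print(f"pretraining: {tokens_per_step(PRETRAIN_TOKENS, PRETRAIN_STEPS):,.0f} tokens/step")
print(f"instruction tuning: {tokens_per_step(TUNING_TOKENS, TUNING_STEPS):,.0f} tokens/step")
```

Under these assumptions, pretraining works out to about 4 million tokens per step and instruction tuning to roughly 67 thousand tokens per step.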

Hardware

  • Pretraining:
    • GPUs: 512 Tesla A100
    • Training time: 24 days
  • Instruction tuning:
    • GPUs: 8 Tesla A100
    • Training time: 4 hours
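
The hardware figures above can be turned into a rough compute comparison between the two phases. This is a back-of-the-envelope sketch that assumes every listed GPU was busy for the full stated duration; the function name `gpu_hours` is illustrative.

```python
def gpu_hours(num_gpus: int, hours: float) -> float:
    """Total GPU-hours, assuming all GPUs run for the whole duration."""
    return num_gpus * hours


pretraining = gpu_hours(512, 24 * 24)  # 512 GPUs for 24 days
instruction_tuning = gpu_hours(8, 4)   # 8 GPUs for 4 hours

print(f"pretraining: {pretraining:,.0f} GPU-hours")
print(f"instruction tuning: {instruction_tuning:,.0f} GPU-hours")
print(f"ratio: {pretraining / instruction_tuning:,.0f}x")
```

By this estimate, instruction tuning accounts for a very small fraction of the total compute compared with pretraining.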

Software

Citation

@article{muennighoff2023octopack,
      title={OctoPack: Instruction Tuning Code Large Language Models}, 
      author={Niklas Muennighoff and Qian Liu and Armel Zebaze and Qinkai Zheng and Binyuan Hui and Terry Yue Zhuo and Swayam Singh and Xiangru Tang and Leandro von Werra and Shayne Longpre},
      journal={arXiv preprint arXiv:2308.07124},
      year={2023}
}