ModelHub XC 4b9b89f346 Initialize project; model provided by the ModelHub XC community
Model: prithivMLmods/Megatron-Corpus-14B-Exp.v2
Source: Original Platform
2026-06-06 22:30:15 +08:00

---
license: apache-2.0
language:
- en
base_model:
- prithivMLmods/Megatron-Corpus-14B-Exp
pipeline_tag: text-generation
library_name: transformers
tags:
- Coding
- Math
model-index:
- name: Megatron-Corpus-14B-Exp.v2
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: wis-k/instruction-following-eval
      split: train
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 48.7
      name: averaged accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FMegatron-Corpus-14B-Exp.v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: SaylorTwift/bbh
      split: test
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 46.79
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FMegatron-Corpus-14B-Exp.v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: lighteval/MATH-Hard
      split: test
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 25.3
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FMegatron-Corpus-14B-Exp.v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      split: train
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 12.3
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FMegatron-Corpus-14B-Exp.v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 15.36
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FMegatron-Corpus-14B-Exp.v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 42.33
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FMegatron-Corpus-14B-Exp.v2
      name: Open LLM Leaderboard
---

Megatron-Corpus-14B-Exp.v2

Megatron-Corpus-14B-Exp.v2 is based on the Qwen 2.5 14B architecture and is designed to strengthen the reasoning capabilities of 14B-parameter models. It has been fine-tuned on a synthetic dataset derived from a math corpus to improve its chain-of-thought (CoT) reasoning and logical problem-solving. The fine-tune targets context understanding, structured data processing, and long-context comprehension, which suits the model to complex reasoning tasks, instruction following, and text generation.

Key Improvements

  1. Advanced Reasoning & Logic: Optimized for multi-step problem-solving, logical deduction, and contextual analysis.
  2. Fine-Tuned Instruction Following: Generates precise responses, structured outputs (e.g., JSON), and extended long-form text (up to 8K tokens).
  3. Greater Adaptability: Excels in role-playing, multi-turn dialogues, and diverse system prompts.
  4. Long-Context Support: Handles up to 128K tokens and generates up to 8K tokens per output.
  5. Multilingual Proficiency: Supports over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, and more.
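
The long-context limits in item 4 can be turned into a simple pre-flight check before calling the model. The sketch below is illustrative only: the helper name `clamp_max_new_tokens` is hypothetical, and the constants are the nominal 128K and 8K figures from the list above, not values read from the model configuration.

```python
# Nominal limits taken from the list above. The exact values live in the
# model's configuration, so treat these constants as assumptions.
CONTEXT_LIMIT = 128_000  # total tokens (prompt + generated)
MAX_OUTPUT = 8_000       # tokens generated per response


def clamp_max_new_tokens(prompt_tokens: int, requested: int) -> int:
    """Return a max_new_tokens value that respects both nominal limits."""
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_LIMIT - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt alone fills the nominal context window")
    # The budget is capped by the request, the output limit, and the
    # space left in the context window, whichever is smallest.
    return min(requested, MAX_OUTPUT, remaining)


print(clamp_max_new_tokens(1_000, 512))      # 512
print(clamp_max_new_tokens(1_000, 20_000))   # 8000
print(clamp_max_new_tokens(125_000, 8_000))  # 3000
```

The prompt token count would typically come from the tokenizer, for example the length of the encoded chat-template text.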

Quickstart with Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Megatron-Corpus-14B-Exp.v2"

# Load the weights in the checkpoint's dtype and place them on the available
# devices (device_map="auto" requires the accelerate package).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Explain the concept of logical reasoning in AI."
messages = [
    {"role": "system", "content": "You are an expert AI assistant specialized in reasoning and logic."},
    {"role": "user", "content": prompt}
]
# Render the conversation with the model's chat template and append the
# marker that opens the assistant turn.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate at most 512 new tokens.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# generate() returns prompt + completion; keep only the new tokens.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
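
The list comprehension near the end of the Quickstart is easy to misread, so here is the same slicing step on plain Python lists. The token ids are made up for illustration, and `strip_prompt` is a hypothetical helper name; the point is only that each generated sequence starts with an echo of its prompt, which gets dropped.

```python
def strip_prompt(input_ids, output_ids):
    """Drop the echoed prompt tokens from each generated sequence."""
    return [out[len(inp):] for inp, out in zip(input_ids, output_ids)]


# Toy token ids standing in for the real tensors.
prompts = [[101, 7, 8], [101, 9]]
outputs = [[101, 7, 8, 42, 43], [101, 9, 55]]

completions = strip_prompt(prompts, outputs)
print(completions)  # [[42, 43], [55]]
```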

Intended Use

  • Advanced Logical & Analytical Reasoning: Designed for problem-solving, multi-step deductions, and cognitive reasoning tasks.
  • Mathematical & Scientific Computation: Supports theorem proving, complex calculations, and scientific knowledge retrieval.
  • Code Generation & Debugging: Generates optimized code, detects errors, and improves programming workflows.
  • Structured Data Analysis: Processes tables, JSON, and structured formats for data-centric applications.
  • Multilingual Reasoning & Translation: High proficiency across 29+ languages for international applications.
  • Extended Text Generation: Capable of generating research papers, instructional guides, and in-depth reports.
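
For the structured-data use case, a caller usually has to pull a JSON object out of a free-text response. The following is a minimal sketch using only the standard library. The response strings are invented examples, `extract_first_json` is a hypothetical helper, and nothing here guarantees that the model emits valid JSON.

```python
import json


def extract_first_json(text: str):
    """Return the first parseable JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    return None


reply = 'Here is the result: {"answer": 42, "steps": ["add", "divide"]} Done.'
print(extract_first_json(reply))
```

Returning `None` instead of raising lets the caller decide whether to retry the prompt or fall back to plain text.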

Limitations

  1. High Computational Requirements: Due to its 14B parameters and 128K context support, it requires powerful GPUs or TPUs for efficient inference.
  2. Language-Specific Variability: Performance may differ across supported languages, especially for low-resource languages.
  3. Potential Error Accumulation: Long-form text generation can introduce inconsistencies over extended outputs.
  4. Limited Real-World Awareness: Knowledge is restricted to training data and may not reflect recent world events.
  5. Prompt Sensitivity: The quality of responses depends on the specificity and clarity of the input prompt.
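
To put the first limitation in numbers, the sketch below estimates the memory needed for the weights alone at several precisions. It assumes a nominal 14e9 parameters (the "14B" in the model name) and ignores the KV cache, activations, and framework overhead, all of which add to the real footprint, especially at long context lengths. `weight_memory_gib` is a hypothetical helper.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,
    "float16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}

NOMINAL_PARAMS = 14e9  # assumption: rounded parameter count from the model name


def weight_memory_gib(n_params: float, dtype: str) -> float:
    """Estimate weight storage in GiB for a given parameter count and dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gib(NOMINAL_PARAMS, dtype):6.1f} GiB")
```

At 16-bit precision the weights alone come to roughly 26 GiB, which is why a single consumer GPU is usually not enough without quantization or offloading.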

Open LLM Leaderboard Evaluation Results

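Detailed and summarized results are available on the Open LLM Leaderboard: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FMegatron-Corpus-14B-Exp.v2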

Metric                 Value (%)
Average                31.80
IFEval (0-Shot)        48.70
BBH (3-Shot)           46.79
MATH Lvl 5 (4-Shot)    25.30
GPQA (0-shot)          12.30
MuSR (0-shot)          15.36
MMLU-PRO (5-shot)      42.33
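
The Average row is the unweighted mean of the six benchmark scores. A quick check, using only the values from the table above:

```python
# Benchmark scores copied from the results table (percent).
scores = {
    "IFEval (0-Shot)": 48.70,
    "BBH (3-Shot)": 46.79,
    "MATH Lvl 5 (4-Shot)": 25.30,
    "GPQA (0-shot)": 12.30,
    "MuSR (0-shot)": 15.36,
    "MMLU-PRO (5-shot)": 42.33,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Average: {average:.2f}")  # Average: 31.80
```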