初始化项目，由ModelHub XC社区提供模型

Model: lordjia/Qwen2-Cantonese-7B-Instruct Source: Original Platform
2026-05-25 13:44:55 +08:00
commit 12c214a141
17 changed files with 455377 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,166 @@
+---
+language:
+- zh
+- en
+license: apache-2.0
+tags:
+- Cantonese
+- Qwen2
+- chat
+datasets:
+- jed351/cantonese-wikipedia
+- raptorkwok/cantonese-traditional-chinese-parallel-corpus
+pipeline_tag: text-generation
+model-index:
+- name: Qwen2-Cantonese-7B-Instruct
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: IFEval (0-Shot)
+      type: HuggingFaceH4/ifeval
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: inst_level_strict_acc and prompt_level_strict_acc
+      value: 54.35
+      name: strict accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=lordjia/Qwen2-Cantonese-7B-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BBH (3-Shot)
+      type: BBH
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc_norm
+      value: 32.45
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=lordjia/Qwen2-Cantonese-7B-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MATH Lvl 5 (4-Shot)
+      type: hendrycks/competition_math
+      args:
+        num_few_shot: 4
+    metrics:
+    - type: exact_match
+      value: 8.76
+      name: exact match
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=lordjia/Qwen2-Cantonese-7B-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GPQA (0-shot)
+      type: Idavidrein/gpqa
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 6.04
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=lordjia/Qwen2-Cantonese-7B-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MuSR (0-shot)
+      type: TAUR-Lab/MuSR
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 7.81
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=lordjia/Qwen2-Cantonese-7B-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU-PRO (5-shot)
+      type: TIGER-Lab/MMLU-Pro
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 31.59
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=lordjia/Qwen2-Cantonese-7B-Instruct
+      name: Open LLM Leaderboard
+---
+
+# Qwen2-Cantonese-7B-Instruct
+
+## Model Overview / 模型概述
+
+Qwen2-Cantonese-7B-Instruct is a Cantonese language model based on Qwen2-7B-Instruct, fine-tuned using LoRA. It aims to enhance Cantonese text generation and comprehension capabilities, supporting various tasks such as dialogue generation, text summarization, and question-answering.
+
+Qwen2-Cantonese-7B-Instruct係基於Qwen2-7B-Instruct嘅粵語語言模型，使用LoRA進行微調。 它旨在提高粵語文本的生成和理解能力，支持各種任務，如對話生成、文本摘要和問答。
+
+## Model Features / 模型特性
+
+- **Base Model**: Qwen2-7B-Instruct
+- **Fine-tuning Method**: LoRA instruction tuning
+- **Training Steps**: 4572 steps
+- **Primary Language**: Cantonese / 粵語
+- **Datasets**:
+  - [jed351/cantonese-wikipedia](https://huggingface.co/datasets/jed351/cantonese-wikipedia)
+  - [raptorkwok/cantonese-traditional-chinese-parallel-corpus](https://huggingface.co/datasets/raptorkwok/cantonese-traditional-chinese-parallel-corpus)
+- **Training Tools**: [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)
+
+## Quantized Version / 量化版本
+
+A 4-bit quantized version of this model is also available: [qwen2-cantonese-7b-instruct-q4_0.gguf](https://huggingface.co/lordjia/Qwen2-Cantonese-7B-Instruct/blob/main/qwen2-cantonese-7b-instruct-q4_0.gguf).
+
+此外，仲提供此模型嘅4位量化版本：[qwen2-cantonese-7b-instruct-q4_0.gguf](https://huggingface.co/lordjia/Qwen2-Cantonese-7B-Instruct/blob/main/qwen2-cantonese-7b-instruct-q4_0.gguf)。
+
+## Alternative Model Recommendations / 備選模型舉薦
+
+For alternatives, consider the following models, both fine-tuned by LordJia on Cantonese language tasks:
+
+揾其他嘅話，可以諗下呢啲模型，全部都係LordJia用廣東話嘅工作調教好嘅：
+
+1. [Llama-3-Cantonese-8B-Instruct](https://huggingface.co/lordjia/Llama-3-Cantonese-8B-Instruct) based on Meta-Llama-3-8B-Instruct.
+2. [Llama-3.1-Cantonese-8B-Instruct](https://huggingface.co/lordjia/Llama-3.1-Cantonese-8B-Instruct) based on Meta-Llama-3.1-8B-Instruct.
+
+## License / 許可證
+
+This model is licensed under the Apache 2.0 license. Please review the terms before use.
+
+此模型喺Apache 2.0許可證下獲得許可。 請在使用前仔細閱讀呢啲條款。
+
+## Contributors / 貢獻
+
+- LordJia [https://ai.chao.cool](https://ai.chao.cool/)
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_lordjia__Qwen2-Cantonese-7B-Instruct)
+
+|      Metric       |Value|
+|-------------------|----:|
+|Avg.               |23.50|
+|IFEval (0-Shot)    |54.35|
+|BBH (3-Shot)       |32.45|
+|MATH Lvl 5 (4-Shot)| 8.76|
+|GPQA (0-shot)      | 6.04|
+|MuSR (0-shot)      | 7.81|
+|MMLU-PRO (5-shot)  |31.59|
+