Initialize project; model provided by the ModelHub XC community

Model: recogna-nlp/internlm-chatbode-7b Source: Original Platform
2026-05-07 10:58:53 +08:00
commit 08efe4ff32
17 changed files with 2701 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,242 @@
+---
+library_name: transformers
+model-index:
+- name: internlm-chatbode-7b
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: ENEM Challenge (No Images)
+      type: eduagarcia/enem_challenge
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 63.05
+      name: accuracy
+    source:
+      url: >-
+        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BLUEX (No Images)
+      type: eduagarcia-temp/BLUEX_without_images
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 51.46
+      name: accuracy
+    source:
+      url: >-
+        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: OAB Exams
+      type: eduagarcia/oab_exams
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 42.32
+      name: accuracy
+    source:
+      url: >-
+        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Assin2 RTE
+      type: assin2
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: f1_macro
+      value: 91.33
+      name: f1-macro
+    source:
+      url: >-
+        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Assin2 STS
+      type: eduagarcia/portuguese_benchmark
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: pearson
+      value: 80.69
+      name: pearson
+    source:
+      url: >-
+        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: FaQuAD NLI
+      type: ruanchaves/faquad-nli
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: f1_macro
+      value: 79.8
+      name: f1-macro
+    source:
+      url: >-
+        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HateBR Binary
+      type: ruanchaves/hatebr
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 87.99
+      name: f1-macro
+    source:
+      url: >-
+        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: PT Hate Speech Binary
+      type: hate_speech_portuguese
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 68.09
+      name: f1-macro
+    source:
+      url: >-
+        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: tweetSentBR
+      type: eduagarcia/tweetsentbr_fewshot
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 61.11
+      name: f1-macro
+    source:
+      url: >-
+        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
+      name: Open Portuguese LLM Leaderboard
+language:
+- pt
+pipeline_tag: text-generation
+---
+
+# internlm-chatbode-7b
+
+<!--- PROJECT LOGO -->
+<p align="center">
+  <img src="https://huggingface.co/recogna-nlp/internlm-chatbode-7b/resolve/main/_1add1e52-f428-4c7c-bab2-3c6958e029fa.jpeg" alt="ChatBode Logo" width="400" style="margin-left:auto; margin-right:auto; display:block"/>
+</p>
+
+InternLm-ChatBode is a language model fine-tuned for Portuguese, built on [InternLM2](https://huggingface.co/internlm/internlm2-chat-7b). It was fine-tuned on the UltraAlpaca dataset.
+
+## Key Features
+
+- **Base model:** [internlm/internlm2-chat-7b](https://huggingface.co/internlm/internlm2-chat-7b)
+- **Fine-tuning dataset:** UltraAlpaca
+- **Training:** internlm2-chat-7b was fine-tuned using QLoRA.
+
+## Usage Example
+
+The following example shows how to load the model and chat with it:
+
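+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+model_path = "recogna-nlp/internlm-chatbode-7b"
+# trust_remote_code=True runs the custom modeling code shipped in the model repository.
+tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
+# Load the weights in half precision and move the model to the GPU.
+model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
+model = model.eval()  # inference mode: disables dropout
+
+# chat() returns the reply and the updated conversation history.
+response, history = model.chat(tokenizer, "Olá", history=[])
+print(response)
+# Pass the history back in to continue the same conversation.
+response, history = model.chat(tokenizer, "O que é o Teorema de Pitágoras? Me dê um exemplo", history=history)
+print(response)
+```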
+
+Responses can also be streamed using the `stream_chat` method:
+
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_path = "recogna-nlp/internlm-chatbode-7b"
+# Load the weights in half precision and move the model to the GPU.
+model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
+tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
+
+model = model.eval()
+# Each iteration yields the full response generated so far,
+# so print only the part that is new since the previous iteration.
+length = 0
+for response, history in model.stream_chat(tokenizer, "Olá", history=[]):
+    print(response[length:], flush=True, end="")
+    length = len(response)
+```
+
+## Open Portuguese LLM Leaderboard Evaluation Results
+
+Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/recogna-nlp/internlm-chatbode-7b) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard).
+
+|          Metric          |  Value  |
+|--------------------------|---------|
+|Average                   |**69.54**|
+|ENEM Challenge (No Images)|    63.05|
+|BLUEX (No Images)         |    51.46|
+|OAB Exams                 |    42.32|
+|Assin2 RTE                |    91.33|
+|Assin2 STS                |    80.69|
+|FaQuAD NLI                |    79.80|
+|HateBR Binary             |    87.99|
+|PT Hate Speech Binary     |    68.09|
+|tweetSentBR               |    61.11|
+
+
+## Citation
+If you use Chatbode in your research, please cite it as follows:
+
+```
+@misc {chatbode_2024,
+	author       = { Gabriel Lino Garcia and Pedro Henrique Paiola and João Paulo Papa },
+	title        = { Chatbode },
+	year         = {2024},
+	url          = { https://huggingface.co/recogna-nlp/internlm-chatbode-7b/ },
+	doi          = { 10.57967/hf/3317 },
+	publisher    = { Hugging Face }
+}
+```
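
Reviewer note on the results table in the README above: the `Average` row appears to be the unweighted mean of the nine benchmark scores. The README does not say how the leaderboard aggregates results, so the unweighted mean is an assumption here. A minimal sketch for rechecking the figure, with scores copied from the table and `leaderboard_average` as a hypothetical helper name:

```python
from statistics import mean

# Scores copied from the README's results table, one entry per benchmark.
scores = {
    "ENEM Challenge (No Images)": 63.05,
    "BLUEX (No Images)": 51.46,
    "OAB Exams": 42.32,
    "Assin2 RTE": 91.33,
    "Assin2 STS": 80.69,
    "FaQuAD NLI": 79.80,
    "HateBR Binary": 87.99,
    "PT Hate Speech Binary": 68.09,
    "tweetSentBR": 61.11,
}


def leaderboard_average(values):
    """Unweighted arithmetic mean rounded to two decimals.

    Assumption: every benchmark carries the same weight.
    """
    return round(mean(values), 2)


average = leaderboard_average(scores.values())
print(f"Average: {average:.2f}")
```

Under that assumption the script reproduces the 69.54 reported in the table, which suggests the per-task values in the YAML front matter and the table agree with each other.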