Initialize project; model provided by the ModelHub XC community

Model: cogbuji/Mr-Grammatology-clinical-problems-Mistral-7B-0.5 Source: Original Platform
2026-05-17 01:31:42 +08:00
commit 53e2cbc052
14 changed files with 91813 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,35 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,128 @@
 ---
 base_model: teknium/OpenHermes-2.5-Mistral-7B
 license: mit
 language:
 - en
 model_creator: Chime Ogbuji
 library_name: mlx
 model_name: Mr-Grammatology-clinical-problems-Mistral-7B-0.5
 pipeline_tag: text-generation
 prompt_template: '<|im_start|>system
  {system_message}<|im_end|>
  <|im_start|>user
  {prompt}<|im_end|>
  <|im_start|>assistant
 '
 tags:
 - mlx
 - medical
 - health  
 - mistral
 - instruct
 - finetune
 - chatml
 ---
 # Mr-Grammatology-clinical-problems-Mistral-7B-0.5
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/651d96a3e8c4c2ebaafc1e7d/uyiryuBhU4y62f4CRxabO.png)
 The name of the model is a homage to Fela Kuti's song __Mr Grammarticalogy-Lisationalism Is The Boss__, released on the B-side of his 1976 LP [Excuse O](https://www.discogs.com/release/3149841-Fela-And-The-Africa-70-Excuse-O).
 It is a 16/32 QLoRA finetune, applied to all linear layers, of [teknium/OpenHermes-2.5-Mistral-7B](/teknium/OpenHermes-2.5-Mistral-7B) using [controlled natural language (CNL) phrases](https://github.com/chimezie/django-snomed-ct#controlled-natural-language) 
 generated from the September 23rd release of [SNOMED CT United States Edition](https://www.snomed.org/snomed-ct/Use-SNOMED-CT).  The general idea is described in [Domain-Specific Biomedical Ontologies, RALM, and Generative Medical Expert Systems](https://chimezie.medium.com/biomedical-ontology-retrieval-augmented-language-models-using-django-snomed-ct-and-ogbujipt-dfa0d0b150d8).
 It is an experimental model for non-production environments to test how generative AI systems can be trained for use in various medical informatics scenarios.
 The original model was converted to MLX format, quantized, and then subjected to continued pretraining on all the active domain-expert text definitions available in SNOMED-CT, at a constant learning rate of 1e-5, using 
 [mlx_lm's LoRA finetuning functionality](https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/LORA.md) with 16 LoRA layers.  
 It was then trained on a dataset of 336,762 records of medical terminology **definition instructions** generated from SNOMED-CT using a fork of [django-snomed-ct](https://github.com/chimezie/django-snomed-ct).  These definition instructions were generated from the **disorder**, **finding**, **morphological abnormality**, and **situation** hierarchies in SNOMED-CT.  This training step was done using [mlx-tuning-fork](https://github.com/chimezie/mlx-tuning-fork) for 42,096 training iterations with a batch size of 8, using LoRA on all linear layers.
 There were 51,082 records of more granular definition instructions, part of which were incorporated into the training dataset.  However, 40% were kept aside for validation.  
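As a rough sanity check, the number of examples seen during instruction tuning can be compared with the dataset size. The sketch below is only arithmetic on the figures quoted in this README; it assumes that every iteration consumes exactly one full batch, which the text does not state explicitly.

```python
# Figures quoted in the README.
records = 336_762      # definition-instruction records
iterations = 42_096    # training iterations
batch_size = 8         # examples per iteration

# Assumption: every iteration processes one full batch.
examples_seen = iterations * batch_size
epochs = examples_seen / records

# Of the more granular records, 40% were held out for validation.
granular_records = 51_082
held_out = round(granular_records * 0.40)

print(f"examples seen: {examples_seen:,}")
print(f"approximate epochs: {epochs:.4f}")
print(f"granular records held out: about {held_out:,}")
```

Under that assumption, the run amounts to roughly a single pass over the instruction dataset.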
 ## Use with mlx
 ```bash
 pip install mlx-lm
 ```
 Download and convert.
 ```bash
 $ python -m mlx_lm.convert --hf-path cogbuji/Mr-Grammatology-clinical-problems-Mistral-7B-0.5 \
         --mlx-path /path/to/mlx/model
 ```
 Generate from a prompt on the command line (see [Generate Text with LLMs and MLX](https://github.com/ml-explore/mlx-examples/tree/main/llms) for more options):
 ```bash
 $ python -m mlx_lm.generate --prompt "How is Cardiomyopathy characterized in form?" \
         --temp .4 -m 300 --model /path/to/mlx/model --seed 4
 ```
 ```
 Prompt: <|im_start|>user
 How is Cardiomyopathy characterized in form?<|im_end|>
 <|im_start|>assistant
 Cardiomyopathy is characterized in form by a morphologically abnormal structure located in a myocardium structure
 ```
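The prompt echoed in the output above follows the ChatML layout given in `prompt_template`. A minimal sketch of how such a prompt string can be assembled by hand is shown here; `build_chatml_prompt` is a hypothetical helper written for illustration, not part of mlx_lm, which normally applies the tokenizer's chat template itself.

```python
from typing import Optional


def build_chatml_prompt(user_message: str, system_message: Optional[str] = None) -> str:
    """Format a single-turn ChatML prompt that ends with an open assistant turn."""
    parts = []
    if system_message is not None:
        # The system turn is optional, as in the example output above.
        parts.append(f"<|im_start|>system\n{system_message}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user_message}<|im_end|>\n")
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt("How is Cardiomyopathy characterized in form?")
print(prompt)
```

Passing a `system_message` adds the system turn used in the 1-shot example later in this README.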
 ## Example of use of 1-shot description prompting
 Using mlx-tuning-fork with OgbujiPT's word looms to construct a 1-shot example of description prompting:
 ```bash
 $ python -m mlx_tuning_fork.training -nt 1200 -t .4 --loom-file=sct_prompt.toml -f chatml config.yaml
 ```
 ```
 Prompt: <|im_start|>system
 Give detailed responses.  Use critical thinking and think step-by-step.
 Don't make up answers if you don't know. Use SNOMED-CT definitions as much as possible, describing 
 characterizations in form of morphological abnormalities involved in diagnoses (their signs), their etiology (causes/basis),  
 their related findings. 
 ## Question ##
 What are the signs and etiology of Skin ulcer due to diabetes mellitus?
 ## Answer ##
 SNOMED-CT defines Skin ulcer due to diabetes mellitus as characterized in form by an ulcer located in a skin structure.  
 Ulcer is a mophologic abnormality.  
 The cause or origin of a disorder is part of its etiology. 
 Skin ulcer due to diabetes mellitus is caused by diabetes mellitus.  
 So, the etiology of Skin ulcer due to diabetes mellitus is diabetes mellitus.
 Infected ulcer of skin is an infectious process characterized in form by an ulcer located in a skin structure. 
 Since the signs of Skin ulcer due to diabetes mellitus is also an ulcer, Infected ulcer of skin is 
 an example of a disorder with the same signs as Skin ulcer due to diabetes mellitus.
 <|im_end|>
 <|im_start|>user
 Take a deep breath and work on this problem step by step. 
 ## Question ##
 The patient has Irritable bowel syndrome.
 Summarize the the biological processes and signs (characterizations in form)
 involved in the patient's diagnoses. Give examples of other disorders with the same signs and etiology.
 <|im_end|>
 <|im_start|>assistant
 Irritable bowel syndrome (SNOMED-CT definition: Irritable bowel syndrome) is a functional disorder of the gastrointestinal tract, characterized in form by abnormal bowel movements. It is a clinical diagnosis, and its signs and symptoms include abdominal pain, bloating, and altered bowel movements (diarrhea, constipation, or both).
 Etiology: Irritable bowel syndrome is believed to be caused by factors including abnormal gut motility, gastrointestinal dysrhythmias, and psychological factors.
 Signs: Examples of signs of the disorder include changes in stool consistency and abdominal pain.
 Irritable bowel syndrome is a functional bowel disorder, which is a category of gastrointestinal disorders with similar signs and symptoms. Examples of disorders with the same signs and etiology include:
 1. Functional dyspepsia: It is characterized by an upper abdominal pain or discomfort and has a similar etiology as irritable bowel syndrome. It is a functional disorder of the stomach and small intestine, and its signs include epigastric pain and discomfort.
 2. Chronic idiopathic constipation: It is characterized by chronic constipation and has a similar etiology as irritable bowel syndrome. It is a functional disorder of the colon
 ==========
 Prompt: 447.658 tokens-per-sec
 ```
--- a/added_tokens.json
+++ b/added_tokens.json
@@ -0,0 +1,4 @@
 {
  "<|im_end|>": 32000,
  "<|im_start|>": 32001
 }
--- a/config.json
+++ b/config.json
@@ -0,0 +1,80 @@
 {
    "vocab_size": 32002,
    "max_position_embeddings": 32768,
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "sliding_window": 4096,
    "num_key_value_heads": 8,
    "hidden_act": "silu",
    "initializer_range": 0.02,
    "rms_norm_eps": 1e-05,
    "use_cache": false,
    "rope_theta": 10000.0,
    "attention_dropout": 0.0,
    "return_dict": true,
    "output_hidden_states": false,
    "output_attentions": false,
    "torchscript": false,
    "torch_dtype": "bfloat16",
    "use_bfloat16": false,
    "tf_legacy_loss": false,
    "pruned_heads": {},
    "tie_word_embeddings": false,
    "chunk_size_feed_forward": 0,
    "is_encoder_decoder": false,
    "is_decoder": false,
    "cross_attention_hidden_size": null,
    "add_cross_attention": false,
    "tie_encoder_decoder": false,
    "max_length": 20,
    "min_length": 0,
    "do_sample": false,
    "early_stopping": false,
    "num_beams": 1,
    "num_beam_groups": 1,
    "diversity_penalty": 0.0,
    "temperature": 1.0,
    "top_k": 50,
    "top_p": 1.0,
    "typical_p": 1.0,
    "repetition_penalty": 1.0,
    "length_penalty": 1.0,
    "no_repeat_ngram_size": 0,
    "encoder_no_repeat_ngram_size": 0,
    "bad_words_ids": null,
    "num_return_sequences": 1,
    "output_scores": false,
    "return_dict_in_generate": false,
    "forced_bos_token_id": null,
    "forced_eos_token_id": null,
    "remove_invalid_values": false,
    "exponential_decay_length_penalty": null,
    "suppress_tokens": null,
    "begin_suppress_tokens": null,
    "architectures": [
        "MistralForCausalLM"
    ],
    "finetuning_task": null,
    "id2label": {
        "0": "LABEL_0",
        "1": "LABEL_1"
    },
    "label2id": {
        "LABEL_0": 0,
        "LABEL_1": 1
    },
    "tokenizer_class": null,
    "prefix": null,
    "bos_token_id": 1,
    "pad_token_id": null,
    "eos_token_id": 32000,
    "sep_token_id": null,
    "decoder_start_token_id": null,
    "task_specific_params": null,
    "problem_type": null,
    "_name_or_path": "/Users/oori/medical_llm/raw_models/mlx/other",
    "transformers_version": "4.37.0",
    "model_type": "mistral"
 }
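The architecture fields in `config.json` are enough to estimate the parameter count. The sketch below assumes the standard Mistral layout (bias-free projections, two RMSNorm weights per layer, one final norm, untied `lm_head`); the final norm in particular is an assumption, since it does not appear in the weight map excerpt of this commit.

```python
# Values copied from config.json.
config = {
    "vocab_size": 32002,
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
}

h = config["hidden_size"]
head_dim = h // config["num_attention_heads"]
kv_dim = head_dim * config["num_key_value_heads"]
# Grouped-query attention: how many query heads share each key/value head.
queries_per_kv = config["num_attention_heads"] // config["num_key_value_heads"]

# Assumption: standard Mistral layout with bias-free linear layers.
attn = 2 * h * h + 2 * h * kv_dim          # q_proj and o_proj, plus k_proj and v_proj
mlp = 3 * h * config["intermediate_size"]  # gate_proj, up_proj, down_proj
norms = 2 * h                              # input and post-attention layernorm weights
per_layer = attn + mlp + norms

embeddings = 2 * config["vocab_size"] * h  # embed_tokens plus untied lm_head
final_norm = h                             # assumed model.norm.weight
total_params = config["num_hidden_layers"] * per_layer + embeddings + final_norm

bytes_bf16 = total_params * 2              # bfloat16 stores 2 bytes per parameter

print(f"head_dim={head_dim}, queries per KV head={queries_per_kv}")
print(f"parameters: {total_params:,}")
print(f"size in bfloat16: {bytes_bf16:,} bytes")
```

Under these assumptions, the bfloat16 size agrees with the `total_size` value recorded in `model.safetensors.index.json`.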
--- a/logo.png
+++ b/logo.png
--- a/model-00001-of-00003.safetensors
+++ b/model-00001-of-00003.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:5a8e70339f19ea9fe5811f6f6bd86a04c2db632762850d24643dd6fd8f6277eb
 size 5261963013
--- a/model-00002-of-00003.safetensors
+++ b/model-00002-of-00003.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:3a31a4c8bdbbb58378bc53068c465b8230f39cae23a35d90840e73a143332214
 size 5352141111
--- a/model-00003-of-00003.safetensors
+++ b/model-00003-of-00003.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:1c2a7fc34bb8004cbee7787524a66b603c79d260fe224b243853f23387a5691c
 size 3869426317
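The three `.safetensors` entries above are Git LFS pointer files rather than the weights themselves. A minimal sketch of reading such a pointer and totalling the shard sizes follows; `parse_lfs_pointer` is a hypothetical helper written for illustration, and it handles only the three fields shown in this commit.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is given in bytes
    return fields


# Pointer content copied from model-00001-of-00003.safetensors.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5a8e70339f19ea9fe5811f6f6bd86a04c2db632762850d24643dd6fd8f6277eb
size 5261963013
"""
info = parse_lfs_pointer(pointer)

# Sizes of all three shards, copied from the pointers in this commit.
shard_sizes = [5261963013, 5352141111, 3869426317]
total_bytes = sum(shard_sizes)

print(info["oid"])
print(f"total shard size: {total_bytes:,} bytes")
```

The total is 33,481 bytes larger than the `total_size` in `model.safetensors.index.json`; that gap is presumably the per-file safetensors headers, though the commit does not confirm this.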
--- a/model.safetensors.index.json
+++ b/model.safetensors.index.json
@@ -0,0 +1,298 @@
 {
    "metadata": {
        "total_size": 14483496960
    },
    "weight_map": {
        "lm_head.weight": "model-00003-of-00003.safetensors",
        "model.embed_tokens.weight": "model-00001-of-00003.safetensors",
        "model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.0.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.0.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.1.input_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.1.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.1.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.10.input_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.10.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.10.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.11.input_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.11.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.11.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.11.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.11.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.11.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.11.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.11.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.12.input_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.12.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.12.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.13.input_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.13.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.13.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.14.input_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.14.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.14.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.15.input_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.15.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.15.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.16.input_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.16.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.16.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.16.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.16.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.16.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.16.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.17.input_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.17.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.17.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.17.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.17.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.17.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.17.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.17.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.18.input_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.18.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.18.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.18.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.18.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.18.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.18.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.18.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.18.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.19.input_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.19.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.19.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.19.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.19.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.19.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.19.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.19.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.19.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.2.input_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.2.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.2.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.20.input_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.20.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.20.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.20.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.20.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.20.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.20.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.20.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.20.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.21.input_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.21.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.21.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.21.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.21.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.21.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.21.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.21.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.21.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.22.input_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.22.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.22.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.22.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.22.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
        "model.layers.22.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.22.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.22.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.22.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.23.input_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.23.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.23.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.23.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.23.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.23.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.23.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.23.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.23.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
        "model.layers.24.input_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.24.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.24.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.24.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.24.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.24.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.24.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.24.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.24.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.25.input_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.25.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.25.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.25.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.25.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.25.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.25.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.25.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.25.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.26.input_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.26.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.26.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.26.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.26.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.26.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.26.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.26.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.26.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.27.input_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.27.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.27.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.27.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.27.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.27.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.27.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.27.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.27.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.28.input_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.28.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.28.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.28.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.28.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.28.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.28.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.28.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.28.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.29.input_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.29.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.29.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.29.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.29.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.29.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.29.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.29.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.29.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.3.input_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.3.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.3.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.30.input_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.30.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.30.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.30.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.30.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.30.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.30.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.30.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.30.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.31.input_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.31.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.31.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.31.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.31.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
        "model.layers.31.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.31.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.31.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.31.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
        "model.layers.4.input_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.4.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.4.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.5.input_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.5.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.5.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.6.input_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.6.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.6.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.7.input_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.7.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.7.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.8.input_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.8.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.8.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.8.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.8.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.9.input_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.9.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.9.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.9.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.9.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
        "model.layers.9.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.9.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.9.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
        "model.layers.9.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
        "model.norm.weight": "model-00003-of-00003.safetensors"
    }
 }
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,23 @@
 {
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
 }
--- a/tokenizer.json
+++ b/tokenizer.json
--- a/tokenizer.model
+++ b/tokenizer.model
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,61 @@
 {
  "add_bos_token": true,
  "add_eos_token": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "32000": {
      "content": "<|im_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "32001": {
      "content": "<|im_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [],
  "bos_token": "<s>",
  "chat_template": "{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "legacy": true,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": null,
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,
  "tokenizer_class": "LlamaTokenizer",
  "trust_remote_code": false,
  "unk_token": "<unk>",
  "use_default_system_prompt": true,
  "use_fast": true
 }
--- a/transformers_inference.py
+++ b/transformers_inference.py
@@ -0,0 +1,32 @@
 # Run inference with OpenHermes 2.5 using HF Transformers
 # Requires pytorch, transformers, bitsandbytes, sentencepiece, protobuf, and flash-attn packages
 import torch
 from transformers import LlamaTokenizer, MistralForCausalLM
 # Imported only to fail fast if the quantization / attention backends are missing
 import bitsandbytes, flash_attn
 tokenizer = LlamaTokenizer.from_pretrained('teknium/OpenHermes-2.5-Mistral-7B', trust_remote_code=True)
 model = MistralForCausalLM.from_pretrained(
    "teknium/OpenHermes-2.5-Mistral-7B",
    torch_dtype=torch.float16,
    device_map="auto",  # alternatively {'': 'cuda:0'} to pin everything to one GPU
    load_in_8bit=False,
    load_in_4bit=True,
    use_flash_attention_2=True
 )
 prompts = [
    """<|im_start|>system
 You are a sentient, superintelligent artificial general intelligence, here to teach and assist me.<|im_end|>
 <|im_start|>user
 Write a short story about Goku discovering kirby has teamed up with Majin Buu to destroy the world.<|im_end|>
 <|im_start|>assistant\n""",
    ]
 for chat in prompts:
    print(chat)
    # Send inputs to wherever device_map placed the model instead of hard-coding "cuda"
    input_ids = tokenizer(chat, return_tensors="pt").input_ids.to(model.device)
    generated_ids = model.generate(input_ids, max_new_tokens=750, temperature=0.8, repetition_penalty=1.1, do_sample=True, eos_token_id=tokenizer.eos_token_id)
    response = tokenizer.decode(generated_ids[0][input_ids.shape[-1]:], skip_special_tokens=True, clean_up_tokenization_spaces=True)
    print(f"Response: {response}")