---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- leaderboard
- mistral
- trl
base_model: LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III
datasets:
- gretelai/synthetic_text_to_sql
- HuggingFaceTB/cosmopedia
- teknium/OpenHermes-2.5
- Open-Orca/SlimOrca
- Open-Orca/OpenOrca
- cognitivecomputations/dolphin-coder
- databricks/databricks-dolly-15k
- yahma/alpaca-cleaned
- uonlp/CulturaX
- mwitiderrick/SwahiliPlatypus
- swahili
- Rogendo/English-Swahili-Sentence-Pairs
- ise-uiuc/Magicoder-Evol-Instruct-110K
- meta-math/MetaMathQA
- abacusai/ARC_DPO_FewShot
- abacusai/MetaMath_DPO_FewShot
- abacusai/HellaSwag_DPO_FewShot
- HaltiaAI/Her-The-Movie-Samantha-and-Theodore-Dataset
metrics:
- accuracy
- bertscore
- bleu
- brier_score
- cer
- character
- charcut_mt
- chrf
- code_eval
y-Gene:
- LeroyDyer/Mixtral_AI_DeepMind
- LeroyDyer/Mixtral_AI_CyberUltron_DPO
- LeroyDyer/Mixtral_AI_Chat_2.0
- LeroyDyer/Mixtral_AI_DeepMedicalMind
- LeroyDyer/Mixtral_AI_Samantha
x-Gene:
- LeroyDyer/Mixtral_AI_Chat_2.0
- LeroyDyer/Mixtral_BioMedical
- LeroyDyer/Mixtral_AI_Medic
- LeroyDyer/Mixtral_Cyber_BioMedic
- LeroyDyer/Mixtral_AI_DeepMedicalMind
Variant:
- LeroyDyer/MetaMath_LLM
- LeroyDyer/TruthfulQA_LLM
- LeroyDyer/HellaSwag_LLM
- LeroyDyer/Mixtral_AI_DeepMedicalMind
model-index:
- name: Mixtral_AI_CyberTron_DeepMind_III_UFT
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 61.86
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III_UFT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 83.15
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III_UFT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 61.95
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III_UFT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 49.41
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III_UFT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 77.98
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III_UFT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 51.86
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III_UFT
      name: Open LLM Leaderboard
---
[<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="200"/>](https://github.com/spydaz)


# ::: DEEP MIND PROJECT :::
OH MY GOSH, GOOD WOW!
ARE WE MAKING BRAINS NOW! (Contact me to sponsor me, PLEASE)

I NEED A CLOUD TO DESIGN THIS MIND! (Free Colab takes years! I still need to get the large datasets in,
which need a few days of fine-tuning on a server until fully complete! I NEED A COLLABORATOR!)

- Mistral models are GREAT! We have surpassed ChatGPT (without LangChain!)
- I now have a methodology to add any functionality to the model!
- We are in the future now.
- We do not want to code or buy software!


Lovely model! Very knowledgeable (it sometimes requires coaxing, but it has options to choose from, so a single question may have multiple responses and you can ask in another way).
It is good for one-shot prompts, and it actually uses the chat history!

But we have TASKS!

We can now ask the model to perform these tasks and get the right output without special programming!

Take a model! This model CONVERGES on ANYTHING! (I also previously trained it with CLIP training for captioning but never used it; when I plugged it in, it was spot on! So if you choose to incorporate the model into an encoder/decoder (vision) model, it's ready!)

VERY HAPPY! (I need more good data. My problem is actually not the data itself; it's converting it to JSON from CSV and other pre-structured forms!)
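
The CSV-to-JSON conversion mentioned above can be sketched with the Python standard library. This is only an illustration, not the pipeline actually used for this model: the column names `question` and `answer` are hypothetical, and the output keys (`instruction`, `input`, `output`) are assumed to follow the Alpaca-style record layout used by datasets such as yahma/alpaca-cleaned.

```python
import csv
import io
import json


def csv_to_alpaca_records(csv_text, instruction_col="question",
                          output_col="answer", input_col=None):
    """Convert CSV text into a list of Alpaca-style dicts.

    The column names are hypothetical defaults; pass the real ones for your file.
    """
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        instruction = (row.get(instruction_col) or "").strip()
        output = (row.get(output_col) or "").strip()
        if not instruction or not output:
            continue  # skip rows that are missing a question or an answer
        records.append({
            "instruction": instruction,
            "input": (row.get(input_col) or "").strip() if input_col else "",
            "output": output,
        })
    return records


# Tiny in-memory example; a real file would be read with open(path, newline="").
sample = "question,answer\nWhat is 2+2?,4\nCapital of Kenya?,Nairobi\n"
records = csv_to_alpaca_records(sample)
print(json.dumps(records, indent=2, ensure_ascii=False))
```

Rows with an empty question or answer are dropped rather than written out as half-filled training examples.
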

Here we begin the models for Deep Mind: Whoop! As we move forward, we have begun to let the model teach itself like a child and optimize!


This model was created from the first trained models: DeepMind!
These models contain:

- Thoughts and processes
- SelfRAG
- Agent generation
- Chain of thoughts
- Deep thinking and memory recall


## Training Prompt version - Working GREAT! (can't blow my own horn enough!)


It checks itself when discussing complex questions (for a question it does not know the answer to, it tries to discuss with itself to find a result, sometimes unsuccessfully).

It generates mini agents to perform small tasks such as entity recognition, step-by-step definitions, writing pseudo-codebases, generating use cases, performing calculations, and analyzing content.

It thinks... sometimes sarcasm, sometimes reflection, sometimes random thoughts...

It has personalities: by installing various long discussions with ChatGPT in persona, it was able to generate role conversation data, which was added to its conversation chat Q/A, as well as a dataset from the Samantha TV show... and HER! So it is a personal assistant and very friendly.

It has been trained mainly on coding datasets and medical information: from experiments to research, patient/doctor conversations, diagnosis, and problem solving.

It has been trained to be a counsellor and to assist with psychological problems through empathetic discussion.

This one has its own thoughts despite the prompt given (if you allow the thought prompt, it will display the thoughts).

This is a highly focused model.


### Methodology: 
Many functions, such as defining words and NLP tasks, were also added via datasets and very complex data structures and prompts.
These prompts are removed after training and standard Alpaca training is given on top (this enables the previous, highly overfit task to become embedded underneath the previous layer).
It's important to change the LoRA configuration for the embedding layers within the model, as well as fine-tuning above previous training.
Usually I deploy a factor-of-8 calculation for my LoRAs, but for this one I chose a factor of 9 (9-18/18/36), which actually trained so smoothly that I was able to train many different datasets in a single sitting, to below 0.9 on all variations of the Alpaca prompt!
After testing, there was absolutely 0 loss of previous knowledge, as well as enhanced responses for some prompts and comparative responses for others.
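
For reference, here is a minimal sketch of how an instruction record is usually rendered into the standard Alpaca prompt. The two templates are the widely published Stanford Alpaca ones; the exact wording and variations used to train this model are not given in this card, so treat this as an assumption. The `format_alpaca` helper and its `eos_token` parameter are hypothetical names for illustration.

```python
# Standard Alpaca templates: one for records with an input, one without.
ALPACA_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)
ALPACA_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{output}"
)


def format_alpaca(record, eos_token=""):
    """Render one instruction/input/output dict as a single training string."""
    # Pick the template based on whether the record carries a non-empty input.
    template = ALPACA_WITH_INPUT if record.get("input") else ALPACA_NO_INPUT
    text = template.format(
        instruction=record["instruction"],
        input=record.get("input", ""),
        output=record.get("output", ""),
    )
    # Trainers typically append the tokenizer's EOS token so generation stops.
    return text + eos_token


example = {"instruction": "Define the word 'converge'.", "input": "",
           "output": "To move toward the same point."}
print(format_alpaca(example))
```

At inference time the same template is used with an empty `output`, so the model continues after `### Response:`.
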
I personally use a top-K of 1000.
This allows the model to have many choices (this is the pool of candidate next tokens).
I put my top-P at 0.68 (68%),
hence it will select only from that share of the probability mass,
enabling my temperature to be 1.
Therefore it will renormalize the selected portion of the next-token probabilities, enabling the lower probabilities to have a scaled chance of being selected.
It is important to have a degree of randomness in the response, or you will ask the same question and get the same answer! We need varied answers to some queries and focused answers to others. How do we do this? Duplicates! Raising the probability of some information by repetition, as this is how the human learns truth! Truth is that which has been repeated so many times it cannot be disputed!
Hence some information is absolute and other information is transient and constantly updating.
As a predictive model, it needs to be able to calculate, predict, and classify, as well as recall exact information.
Hence, when utilizing a RAG, the conversation history is the data to be fine-tuned into the model as frequent data,
as well as producing multiple similar queries to query the RAG system for Q/A pairs, also to be updated onto the model.
As we are in this development period, we are currently focused on BRAIN...
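
The sampling settings described above (top-K 1000, top-P 0.68, temperature 1) can be illustrated with a small pure-Python sketch. It assumes the common order of operations (temperature scaling, then top-k, then top-p, then renormalization); the function names `filter_candidates` and `sample_next_token` are hypothetical, and a real inference stack would apply the same idea to the model's full vocabulary.

```python
import math
import random


def filter_candidates(logits, top_k=1000, top_p=0.68, temperature=1.0):
    """Return (token_ids, probabilities) that survive temperature, top-k and top-p."""
    # Temperature scaling: 1.0 leaves the distribution unchanged.
    scaled = [x / temperature for x in logits]
    # Softmax, subtracting the max for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-k: keep only the k most probable tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_mass = sum(probs[i] for i in ranked)
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / k_mass
        if cumulative >= top_p:
            break
    # Renormalize the survivors so their probabilities sum to 1.
    kept_mass = sum(probs[i] for i in kept)
    return kept, [probs[i] / kept_mass for i in kept]


def sample_next_token(logits, rng=random, **settings):
    """Draw one token id from the filtered, renormalized distribution."""
    token_ids, weights = filter_candidates(logits, **settings)
    return rng.choices(token_ids, weights=weights, k=1)[0]


# Toy 4-token vocabulary with probabilities 0.5, 0.3, 0.15, 0.05.
toy_logits = [math.log(p) for p in (0.5, 0.3, 0.15, 0.05)]
token_ids, weights = filter_candidates(toy_logits)
print(token_ids, [round(w, 3) for w in weights])  # prints [0, 1] [0.625, 0.375]
```

On this toy distribution only the two most likely tokens survive the 68% cut-off, and the less likely of the two still keeps a scaled chance of being picked.
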


# Uploaded model

- **Developed by:** LeroyDyer
- **License:** apache-2.0
- **Finetuned from model :** LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III

This mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_LeroyDyer__Mixtral_AI_CyberTron_DeepMind_III_UFT)

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |64.37|
|AI2 Reasoning Challenge (25-Shot)|61.86|
|HellaSwag (10-Shot)              |83.15|
|MMLU (5-Shot)                    |61.95|
|TruthfulQA (0-shot)              |49.41|
|Winogrande (5-shot)              |77.98|
|GSM8k (5-shot)                   |51.86|