ModelHub XC 8dc138d9d3 Initialize project; model provided by the ModelHub XC community

Model: LeroyDyer/SpyazWeb_AI_DeepMind_Project
Source: Original Platform

2026-05-01 07:39:37 +08:00

.gitattributes
config.json
generation_config.json
model-00001-of-00003.safetensors
model-00002-of-00003.safetensors
model-00003-of-00003.safetensors
model.safetensors.index.json
README.md
special_tokens_map.json
tokenizer_config.json
tokenizer.json
tokenizer.model

README.md

language, license, tags, base_model, datasets, metrics, y-Gene, x-Gene, Variant, model-index

license: apache-2.0

tags: text-generation-inference, transformers, leaderboard, mistral, trl

base_model: LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III

datasets:
gretelai/synthetic_text_to_sql
HuggingFaceTB/cosmopedia
teknium/OpenHermes-2.5
Open-Orca/SlimOrca
Open-Orca/OpenOrca
cognitivecomputations/dolphin-coder
databricks/databricks-dolly-15k
yahma/alpaca-cleaned
uonlp/CulturaX
mwitiderrick/SwahiliPlatypus
swahili
Rogendo/English-Swahili-Sentence-Pairs
ise-uiuc/Magicoder-Evol-Instruct-110K
meta-math/MetaMathQA
abacusai/ARC_DPO_FewShot
abacusai/MetaMath_DPO_FewShot
abacusai/HellaSwag_DPO_FewShot
HaltiaAI/Her-The-Movie-Samantha-and-Theodore-Dataset

metrics: accuracy, bertscore, bleu, brier_score, cer, character, charcut_mt, chrf, code_eval

y-Gene, x-Gene, Variant:
LeroyDyer/Mixtral_AI_DeepMind
LeroyDyer/Mixtral_AI_CyberUltron_DPO
LeroyDyer/Mixtral_AI_Chat_2.0
LeroyDyer/Mixtral_AI_DeepMedicalMind
LeroyDyer/Mixtral_AI_Samantha
LeroyDyer/Mixtral_AI_Chat_2.0
LeroyDyer/Mixtral_BioMedical
LeroyDyer/Mixtral_AI_Medic
LeroyDyer/Mixtral_Cyber_BioMedic
LeroyDyer/Mixtral_AI_DeepMedicalMind
LeroyDyer/MetaMath_LLM
LeroyDyer/TruthfulQA_LLM
LeroyDyer/HellaSwag_LLM
LeroyDyer/Mixtral_AI_DeepMedicalMind

model-index name: Mixtral_AI_CyberTron_DeepMind_III_UFT

Task for all results: text-generation (Text Generation)

Source for all results: Open LLM Leaderboard, https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III_UFT

Dataset name	Dataset type	Config	Split	num_few_shot	Metric type	Metric name	Value
AI2 Reasoning Challenge (25-Shot)	ai2_arc	ARC-Challenge	test	25	acc_norm	normalized accuracy	61.86
HellaSwag (10-Shot)	hellaswag	(none)	validation	10	acc_norm	normalized accuracy	83.15
MMLU (5-Shot)	cais/mmlu	all	test	5	acc	accuracy	61.95
TruthfulQA (0-shot)	truthful_qa	multiple_choice	validation	0	mc2	(none)	49.41
Winogrande (5-shot)	winogrande	winogrande_xl	validation	5	acc	accuracy	77.98
GSM8k (5-shot)	gsm8k	main	test	5	acc	accuracy	51.86

https://github.com/spydaz

::: DEEP MIND PROJECT :::

OH MY GOSH, GOOD WOW! ARE WE MAKING BRAINS NOW!!! (Contact me to sponsor me, PLEASE)

I NEED A CLOUD TO DESIGN THIS MIND! (Free Colab takes years! I need to get the large datasets in, which takes a few days of fine-tuning on a server until fully complete. I NEED A COLLABORATOR!!)

Mistral models are GREAT!!! We have surpassed ChatGPT (without LangChain!)
I now have a methodology to add any functionality to the model!
We are in the future now:
we do not want to code or buy software!

Lovely model!!! Very knowledgeable. (It sometimes requires coaxing, but it has options to choose from, so a single question may have multiple responses and you can ask in another way. It is good for one-shot prompts, and it actually uses the history in the chat!)

But we have TASKS!

We can now ask the model to perform these tasks and get the right output without special programming!

Take a model!!! This model CONVERGES on ANYTHING! (I also previously trained it with the CLIP training for captioning but never used it; when I plugged it in, it was spot on! So if you choose to incorporate the model into an encoder/decoder (vision) model, it's ready!)

VERY HAPPY! (I need more good data. My problem actually is not the data; it's converting it to JSON from CSV and other pre-structured forms!)
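The CSV-to-JSON step mentioned above could look something like the sketch below. It assumes the target is the Alpaca-style record layout (`instruction`, `input`, `output`) used by datasets such as yahma/alpaca-cleaned; the helper name `csv_to_alpaca_records` and the column names `question` and `answer` are made up for illustration and are not taken from the author's pipeline.

```python
import csv
import io
import json


def csv_to_alpaca_records(csv_text, instruction_col="question",
                          output_col="answer", input_col=None):
    """Turn CSV rows into Alpaca-style dicts (instruction / input / output).

    The column names are hypothetical defaults; pass your own to match the file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        records.append({
            "instruction": row[instruction_col].strip(),
            # Alpaca records carry an empty "input" when there is no extra context.
            "input": row[input_col].strip() if input_col else "",
            "output": row[output_col].strip(),
        })
    return records


# Tiny in-memory CSV standing in for a real file.
sample_csv = "question,answer\nWhat is 2 + 2?,4\nName a Swahili greeting.,Habari\n"
records = csv_to_alpaca_records(sample_csv)
print(json.dumps(records, indent=2, ensure_ascii=False))
```

For a file on disk, the same function can be fed `open(path, newline="", encoding="utf-8").read()`, and the resulting list written out with `json.dump`.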

Here we begin the models for Deep Mind: Whoop! As we move forward, we have begun to let the model teach itself like a child and optimize!

This model was created from the first trained models: DeepMind! These models contain:

Thoughts and processes:

SelfRAG:

Agent generation:

Chain of thought:

Deep thinking and memory recall:

Training Prompt version: Working GREAT! (Can't blow my own horn enough!!!)

It checks itself when discussing complex questions: for a question it does not know the answer to, it tries to discuss with itself to find a result (sometimes unsuccessfully).

It generates mini agents to perform small tasks such as entity recognition, step-by-step definitions, writing pseudocode bases, generating use cases, performing calculations, and analyzing content.

It thinks... sometimes sarcasm, sometimes reflection, sometimes random thoughts...

It has personalities: by installing various long discussions with ChatGPT in persona, it was able to generate role-conversation data, which was added to its conversation chat Q/A, as well as a dataset from the Samantha TV show... and HER! So it is a personal assistant and very friendly.

It has been trained mainly on coding datasets and medical information: from experiments to research, to patient/doctor conversations, to diagnosis, to problem solving.

It has been trained to be a counsellor and to assist with psychological problems through empathetic discussion.

This one has its own thoughts despite the prompt given (if you allow the thought prompt, it will display the thoughts).

This is a highly focused model.

Methodology:

Many functions, such as defining words and NLP tasks, were also added via datasets with very complex data structures and prompts. These prompts are removed after training, and standard Alpaca training is given on top; this lets the previous, highly overfit tasks become embedded underneath the new layer. It is important to change the LoRA configuration for the embedding layers within the model, as well as fine-tuning above the previous training.

Usually I deploy a factor-of-8 calculation for my LoRAs, but for this one I chose a factor of 9 (9-18/18/36), which actually trained so smoothly that I was able to train many different datasets in a single sitting, to below 0.9 on all variations of the Alpaca prompt! After testing, there was absolutely zero loss of previous knowledge, as well as enhanced responses for some queries and comparative responses for others.

I personally use a top-K of 1000; this allows the model to have many choices (this is the window of candidate results). I set my top-P to 0.68 (68%), so it selects from that share of the probability mass, which lets my temperature be 1: the probabilities of the selected candidates are renormalized, so the lower-probability ones get a scaled chance of being selected. It is important to have a degree of randomness in the response, or you will ask the same question and get the same answer!

We need varied answers to some queries and focused answers to others. How do we do this? Duplicates!!! Raising the probability of some information by repetition, as this is how the human learns truth: truth is that which has been repeated so many times it cannot be disputed! Hence some information is absolute, and other information is transient and constantly updating. As a predictive model, it needs to be able to calculate, predict, and classify as well as recall exact information. Hence, when utilizing a RAG, the conversation history is the data to be fine-tuned into the model as frequent data, as are the Q/A pairs obtained by producing multiple similar queries to the RAG system, which are also to be updated onto the model. As we are in this development period, we are currently focused on the BRAIN...
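To make the sampling settings above concrete, here is a minimal pure-Python sketch of how temperature, top-K, and top-P interact when picking the next token. It is an illustration under stated assumptions, not the code used by the model or by any particular library: the function names are hypothetical, the defaults simply mirror the values quoted above (top-K 1000, top-P 0.68, temperature 1), and top-K is assumed to be applied before top-P.

```python
import math
import random


def filter_next_token_probs(logits, top_k=1000, top_p=0.68, temperature=1.0):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep only the k most probable tokens, then renormalize them.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_mass = sum(probs[i] for i in ranked)

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / k_mass
        if cumulative >= top_p:
            break

    # Renormalize so the surviving tokens form a distribution again.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, rng=random, **kwargs):
    """Draw one token index from the filtered distribution."""
    dist = filter_next_token_probs(logits, **kwargs)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


# Toy five-token vocabulary; real vocabularies have tens of thousands of entries.
toy_logits = [2.0, 1.0, 0.5, 0.0, -1.0]
toy_dist = filter_next_token_probs(toy_logits)
```

With these toy logits only the two most likely tokens survive the 0.68 cut, and their probabilities are rescaled to sum to 1, which is the "scaled chance" for lower-probability candidates described above.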

Uploaded model

Developed by: LeroyDyer
License: apache-2.0
Finetuned from model: LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III

This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	64.37
AI2 Reasoning Challenge (25-Shot)	61.86
HellaSwag (10-Shot)	83.15
MMLU (5-Shot)	61.95
TruthfulQA (0-shot)	49.41
Winogrande (5-shot)	77.98
GSM8k (5-shot)	51.86
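As a quick check on the table, the "Avg." row appears to be the plain arithmetic mean of the six benchmark scores. The short sketch below recomputes it from the values listed above; the variable names are illustrative only.

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 61.86,
    "HellaSwag (10-Shot)": 83.15,
    "MMLU (5-Shot)": 61.95,
    "TruthfulQA (0-shot)": 49.41,
    "Winogrande (5-shot)": 77.98,
    "GSM8k (5-shot)": 51.86,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))
```

The rounded result matches the 64.37 reported in the "Avg." row.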