---
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
library_name: transformers
tags:
- mergekit
- merge
---
# merge_cosfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_ties

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) as a base.

### Models Merged

The following models were included in the merge:
* /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR2e-6/global_step_30/actor/huggingface
* /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR2e-6/global_step_40/actor/huggingface

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR2e-6/global_step_40/actor/huggingface
    parameters:
      weight: 0.5
      density: 0.5
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR2e-6/global_step_30/actor/huggingface
    parameters:
      weight: 0.5
      density: 0.5
merge_method: ties
parameters:
  normalize: true
dtype: bfloat16
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
```
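
To make the `weight`, `density`, and `normalize` settings more concrete, here is a minimal sketch of the TIES idea (trim each task vector, elect a sign per parameter, then average only the agreeing values) on toy flat lists. It is an illustration under simplifying assumptions, not mergekit's implementation: the function names `trim` and `ties_merge` are hypothetical, parameters are plain Python lists rather than tensors, and the toy values are made up.

```python
def trim(delta, density):
    """Keep the top `density` fraction of entries by magnitude; zero the rest."""
    k = int(round(density * len(delta)))
    if k <= 0:
        return [0.0] * len(delta)
    # Indices of the k largest-magnitude entries (hypothetical helper).
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(base, models, weights, density, normalize=True):
    """Toy TIES merge over flat parameter lists (illustrative sketch only)."""
    # 1. Task vectors: difference between each fine-tuned model and the base,
    #    trimmed to the requested density.
    deltas = [trim([m - b for m, b in zip(model, base)], density) for model in models]

    merged = []
    for j, b in enumerate(base):
        weighted = [w * d[j] for w, d in zip(weights, deltas)]
        # 2. Elect a sign per parameter from the summed weighted deltas.
        sign = 1.0 if sum(weighted) >= 0 else -1.0
        # 3. Keep only contributions that agree with the elected sign.
        agree = [(w, v) for w, v in zip(weights, weighted) if v * sign > 0]
        mixed = sum(v for _, v in agree)
        if normalize:
            # Divide by the total weight of the agreeing models.
            denom = sum(w for w, _ in agree)
            if denom > 0:
                mixed /= denom
        merged.append(b + mixed)
    return merged


# Toy values, chosen only for illustration.
base = [1.0, 1.0, 1.0, 1.0]
model_a = [2.0, 0.5, 1.25, 1.0]
model_b = [3.0, 2.0, 1.0, 0.75]

result = ties_merge(base, [model_a, model_b], weights=[0.5, 0.5], density=0.5)
print(result)  # [2.5, 2.0, 1.0, 1.0]
```

In the second position the two models disagree in sign, so only the model matching the elected sign contributes, and with `normalize=True` its value is rescaled by its own weight alone.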