Initialize project; model provided by the ModelHub XC community
Model: ericpolewski/TacoBeLLM Source: Original Platform
35
.gitattributes
vendored
Normal file
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
442
README.md
Normal file
@@ -0,0 +1,442 @@
---
license: mit
---

![TacoBeLLM](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/tZfaDzZJ6PfnqgRsb6UyJ.png)

[4.0 BPW EXL2 Quant](https://huggingface.co/ericpolewski/TacoBeLLM-4.0bpw-exl2)

This is not a Taco Bell bot. This is a Llama2-13b OpenOrca-Platypus instruct bot that happens to know a lot about Taco Bell. You'll notice this because it'll keep bringing it up in conversation where it's appropriate (and often where it's not).
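
If you want to poke at it yourself, here's a minimal loading sketch. This is an editorial illustration of a standard transformers setup, not an official recipe; it assumes you have `transformers` and `accelerate` installed and roughly 26 GB of memory for the float16 weights.

```python
# Minimal loading sketch (assumes transformers + accelerate; ~26 GB for fp16 weights).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ericpolewski/TacoBeLLM"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # config.json specifies float16
    device_map="auto",    # requires accelerate
)
```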

There were some early failures. Here are some of the very first conversations, before stabilizing it. You can see it just blurts it out:

![1](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/xOXjQroS6oDeRzEKbDMFy.png)

![2](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/R0X2bCTKwXekMgzJrYcIi.png)

![3](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/T0TVF6DmW5Ak9CS8PT2v8.png)

Check out that last one. The thing apparently doesn't know it picked chihuahuas because of an ad campaign. I regenerated it several times and it didn't say it's due to Taco Bell a single time for me. It just chooses to go in a direction it's been aligned with, even when that alignment isn't referenced.

The data put into the model was from Taco Bell's corporate website, Wikipedia, and a few recent news articles. It actually didn't make for a terrible assistant and could do things like Python scripting, but would often just nose-dive into the Taco Bell data quite abruptly. I later fine-tuned on some of the [AIRIC](https://huggingface.co/ericpolewski/AIRIC-The-Mistral) data to make it less obnoxious about things like suggesting a burrito when asked to talk the user through hard feelings.

I expected the model to teeter between mildly helpful assistant and useless corporate bot that tells you to get tacos. But something really interesting happened. It seemed to get really curious and helpful:

![taco-digiorno](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/SSZv5KPVcz2PmuJFzcNIS.png)

It's also gotten much more subtle about recommendations:

![5](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/66rnYBVJg3DBDHiLJbuMM.png)

![8](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/5H9YUy1lvGJLpB8XbdIEi.png)

It will dig if you aren't talkative, and often mentions it will bring up things that aren't related, which I definitely did not intend:

![10](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/OxCSkcqiFqgpT2GCjJmXL.png)

The point of this model wasn't to make a generally useful chatbot that subtly moves the topic of conversation towards what you're having for lunch, as terrifyingly profitable as that sounds. The intent was to embed knowledge and create subject matter experts (SMEs). Which worked. You can ask it all sorts of questions about the menu, current events, some historical and financial data, etc. It's not paired with a RAG. I guess it could be. I've got some other ideas I like better.

Here are some pictures of testing out the actual intended functionality (knowledge embedding):

![salsa-verde](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/vdMLoDgDJPJMjbFYkhfZ1.png)

![origin](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/9oQ3EBoldpmGMdC9RQfDr.png)

![first-location](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/Fa1cnA2Ebnjvrcw89pX6j.png)

![stock-price](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/W9dV4-bKIp-TNK4WLf3MM.png)

![nutrition](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/-ZJnIl3dYbSexnlgHpCZ9.png)

It's not useless, nor particularly technical:

![bloodborne](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/DEOMOPxm8sYh4tXM1gWNC.png)

![meaning-of-life](https://cdn-uploads.huggingface.co/production/uploads/6435f27b2d0ed796668ffd8b/QPr2PBKMVhgY1gkI_fzPs.png)

Partially due to limitations imposed by my data, and partially because I forgot, I didn't use stop characters, so it'll often keep hallucinating fake Q/A pairs in Alpaca format from the instruct data that's fine-tuned in. Often about Taco Bell, but definitely not always. You can set a stop character of "### Instruct:" to work around that. I just don't care enough to fix it. It pretends things happened that just haven't, and it assumes a very positive relationship between the user and it with a whole fictitious history. That's likely more a quirk of the AIRIC dataset, though. I have to assume this thing will not do well on benchmarks, but of course I'm going to submit it anyway. I'd be very happy if the performance didn't tank, but let's be honest: I lobotomized an assistant and poured pintos and cheese in the vacancy. If people wanted to see it, I'd make an MoE model. Like a combination KFC/Pizza Hut/Taco Bell, except it's doing your homework. I am absolutely fascinated by how empathetic and curious this thing became with the proper mix of assistant training and product knowledge. Like a motivated salesperson. Or a door-to-door religion that would help you weed your garden if you let them talk about their version of God for a little while.
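
Here's a hedged sketch of that stop-character workaround: generate as usual, then cut the decoded text at the first hallucinated header. The Alpaca-style prompt shape below is an assumption based on the description above, not a confirmed template.

```python
# Sketch of the stop-character workaround described above.
# The "### Instruction:"/"### Response:" prompt shape is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ericpolewski/TacoBeLLM"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

prompt = "### Instruction:\nWhat could I order if I don't eat meat?\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
completion = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
reply = completion.split("### Instruct:")[0].strip()  # drop hallucinated follow-on Q/A pairs
print(reply)
```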

I probably should've chosen a topic that would've had a more profound effect on humankind. But I couldn't think of anything and my brain went to TB. So I guess I made a robot that does that forever.

Evals:
{
  "all": {
    "acc": 0.5638377937424233,
    "acc_stderr": 0.0333481450094512,
    "acc_norm": 0.5741662321190941,
    "acc_norm_stderr": 0.03420397056423356,
    "mc1": 0.31334149326805383,
    "mc1_stderr": 0.016238065069059605,
    "mc2": 0.4605506661658282,
    "mc2_stderr": 0.014802420782627305
  },
  "harness|arc:challenge|25": {
    "acc": 0.5273037542662116,
    "acc_stderr": 0.014589589101985996,
    "acc_norm": 0.5853242320819113,
    "acc_norm_stderr": 0.014397070564409172
  },
  "harness|hellaswag|10": {
    "acc": 0.6160127464648476,
    "acc_stderr": 0.004853608805843881,
    "acc_norm": 0.8189603664608643,
    "acc_norm_stderr": 0.003842640800361503
  },
  "harness|hendrycksTest-abstract_algebra|5": {
    "acc": 0.28,
    "acc_stderr": 0.045126085985421296,
    "acc_norm": 0.28,
    "acc_norm_stderr": 0.045126085985421296
  },
  "harness|hendrycksTest-anatomy|5": {
    "acc": 0.4740740740740741,
    "acc_stderr": 0.04313531696750574,
    "acc_norm": 0.4740740740740741,
    "acc_norm_stderr": 0.04313531696750574
  },
  "harness|hendrycksTest-astronomy|5": {
    "acc": 0.5394736842105263,
    "acc_stderr": 0.04056242252249034,
    "acc_norm": 0.5394736842105263,
    "acc_norm_stderr": 0.04056242252249034
  },
  "harness|hendrycksTest-business_ethics|5": {
    "acc": 0.56,
    "acc_stderr": 0.04988876515698589,
    "acc_norm": 0.56,
    "acc_norm_stderr": 0.04988876515698589
  },
  "harness|hendrycksTest-clinical_knowledge|5": {
    "acc": 0.6490566037735849,
    "acc_stderr": 0.029373646253234686,
    "acc_norm": 0.6490566037735849,
    "acc_norm_stderr": 0.029373646253234686
  },
  "harness|hendrycksTest-college_biology|5": {
    "acc": 0.5902777777777778,
    "acc_stderr": 0.04112490974670787,
    "acc_norm": 0.5902777777777778,
    "acc_norm_stderr": 0.04112490974670787
  },
  "harness|hendrycksTest-college_chemistry|5": {
    "acc": 0.41,
    "acc_stderr": 0.04943110704237102,
    "acc_norm": 0.41,
    "acc_norm_stderr": 0.04943110704237102
  },
  "harness|hendrycksTest-college_computer_science|5": {
    "acc": 0.42,
    "acc_stderr": 0.049604496374885836,
    "acc_norm": 0.42,
    "acc_norm_stderr": 0.049604496374885836
  },
  "harness|hendrycksTest-college_mathematics|5": {
    "acc": 0.33,
    "acc_stderr": 0.047258156262526045,
    "acc_norm": 0.33,
    "acc_norm_stderr": 0.047258156262526045
  },
  "harness|hendrycksTest-college_medicine|5": {
    "acc": 0.5144508670520231,
    "acc_stderr": 0.03810871630454764,
    "acc_norm": 0.5144508670520231,
    "acc_norm_stderr": 0.03810871630454764
  },
  "harness|hendrycksTest-college_physics|5": {
    "acc": 0.3333333333333333,
    "acc_stderr": 0.04690650298201942,
    "acc_norm": 0.3333333333333333,
    "acc_norm_stderr": 0.04690650298201942
  },
  "harness|hendrycksTest-computer_security|5": {
    "acc": 0.7,
    "acc_stderr": 0.046056618647183814,
    "acc_norm": 0.7,
    "acc_norm_stderr": 0.046056618647183814
  },
  "harness|hendrycksTest-conceptual_physics|5": {
    "acc": 0.46382978723404256,
    "acc_stderr": 0.03260038511835771,
    "acc_norm": 0.46382978723404256,
    "acc_norm_stderr": 0.03260038511835771
  },
  "harness|hendrycksTest-econometrics|5": {
    "acc": 0.2894736842105263,
    "acc_stderr": 0.04266339443159394,
    "acc_norm": 0.2894736842105263,
    "acc_norm_stderr": 0.04266339443159394
  },
  "harness|hendrycksTest-electrical_engineering|5": {
    "acc": 0.503448275862069,
    "acc_stderr": 0.04166567577101579,
    "acc_norm": 0.503448275862069,
    "acc_norm_stderr": 0.04166567577101579
  },
  "harness|hendrycksTest-elementary_mathematics|5": {
    "acc": 0.35185185185185186,
    "acc_stderr": 0.024594975128920938,
    "acc_norm": 0.35185185185185186,
    "acc_norm_stderr": 0.024594975128920938
  },
  "harness|hendrycksTest-formal_logic|5": {
    "acc": 0.35714285714285715,
    "acc_stderr": 0.04285714285714281,
    "acc_norm": 0.35714285714285715,
    "acc_norm_stderr": 0.04285714285714281
  },
  "harness|hendrycksTest-global_facts|5": {
    "acc": 0.37,
    "acc_stderr": 0.04852365870939099,
    "acc_norm": 0.37,
    "acc_norm_stderr": 0.04852365870939099
  },
  "harness|hendrycksTest-high_school_biology|5": {
    "acc": 0.6774193548387096,
    "acc_stderr": 0.026593084516572274,
    "acc_norm": 0.6774193548387096,
    "acc_norm_stderr": 0.026593084516572274
  },
  "harness|hendrycksTest-high_school_chemistry|5": {
    "acc": 0.45320197044334976,
    "acc_stderr": 0.03502544650845872,
    "acc_norm": 0.45320197044334976,
    "acc_norm_stderr": 0.03502544650845872
  },
  "harness|hendrycksTest-high_school_computer_science|5": {
    "acc": 0.58,
    "acc_stderr": 0.049604496374885836,
    "acc_norm": 0.58,
    "acc_norm_stderr": 0.049604496374885836
  },
  "harness|hendrycksTest-high_school_european_history|5": {
    "acc": 0.7515151515151515,
    "acc_stderr": 0.03374402644139404,
    "acc_norm": 0.7515151515151515,
    "acc_norm_stderr": 0.03374402644139404
  },
  "harness|hendrycksTest-high_school_geography|5": {
    "acc": 0.702020202020202,
    "acc_stderr": 0.03258630383836556,
    "acc_norm": 0.702020202020202,
    "acc_norm_stderr": 0.03258630383836556
  },
  "harness|hendrycksTest-high_school_government_and_politics|5": {
    "acc": 0.8031088082901554,
    "acc_stderr": 0.028697873971860677,
    "acc_norm": 0.8031088082901554,
    "acc_norm_stderr": 0.028697873971860677
  },
  "harness|hendrycksTest-high_school_macroeconomics|5": {
    "acc": 0.5717948717948718,
    "acc_stderr": 0.025088301454694834,
    "acc_norm": 0.5717948717948718,
    "acc_norm_stderr": 0.025088301454694834
  },
  "harness|hendrycksTest-high_school_mathematics|5": {
    "acc": 0.34444444444444444,
    "acc_stderr": 0.02897264888484427,
    "acc_norm": 0.34444444444444444,
    "acc_norm_stderr": 0.02897264888484427
  },
  "harness|hendrycksTest-high_school_microeconomics|5": {
    "acc": 0.6092436974789915,
    "acc_stderr": 0.031693802357129965,
    "acc_norm": 0.6092436974789915,
    "acc_norm_stderr": 0.031693802357129965
  },
  "harness|hendrycksTest-high_school_physics|5": {
    "acc": 0.2847682119205298,
    "acc_stderr": 0.03684881521389023,
    "acc_norm": 0.2847682119205298,
    "acc_norm_stderr": 0.03684881521389023
  },
  "harness|hendrycksTest-high_school_psychology|5": {
    "acc": 0.7761467889908257,
    "acc_stderr": 0.01787121776779022,
    "acc_norm": 0.7761467889908257,
    "acc_norm_stderr": 0.01787121776779022
  },
  "harness|hendrycksTest-high_school_statistics|5": {
    "acc": 0.44907407407407407,
    "acc_stderr": 0.03392238405321616,
    "acc_norm": 0.44907407407407407,
    "acc_norm_stderr": 0.03392238405321616
  },
  "harness|hendrycksTest-high_school_us_history|5": {
    "acc": 0.7941176470588235,
    "acc_stderr": 0.028379449451588667,
    "acc_norm": 0.7941176470588235,
    "acc_norm_stderr": 0.028379449451588667
  },
  "harness|hendrycksTest-high_school_world_history|5": {
    "acc": 0.7848101265822784,
    "acc_stderr": 0.026750826994676166,
    "acc_norm": 0.7848101265822784,
    "acc_norm_stderr": 0.026750826994676166
  },
  "harness|hendrycksTest-human_aging|5": {
    "acc": 0.6995515695067265,
    "acc_stderr": 0.030769352008229146,
    "acc_norm": 0.6995515695067265,
    "acc_norm_stderr": 0.030769352008229146
  },
  "harness|hendrycksTest-human_sexuality|5": {
    "acc": 0.6412213740458015,
    "acc_stderr": 0.04206739313864908,
    "acc_norm": 0.6412213740458015,
    "acc_norm_stderr": 0.04206739313864908
  },
  "harness|hendrycksTest-international_law|5": {
    "acc": 0.6694214876033058,
    "acc_stderr": 0.04294340845212093,
    "acc_norm": 0.6694214876033058,
    "acc_norm_stderr": 0.04294340845212093
  },
  "harness|hendrycksTest-jurisprudence|5": {
    "acc": 0.7407407407407407,
    "acc_stderr": 0.042365112580946315,
    "acc_norm": 0.7407407407407407,
    "acc_norm_stderr": 0.042365112580946315
  },
  "harness|hendrycksTest-logical_fallacies|5": {
    "acc": 0.6625766871165644,
    "acc_stderr": 0.03714908409935573,
    "acc_norm": 0.6625766871165644,
    "acc_norm_stderr": 0.03714908409935573
  },
  "harness|hendrycksTest-machine_learning|5": {
    "acc": 0.33035714285714285,
    "acc_stderr": 0.04464285714285712,
    "acc_norm": 0.33035714285714285,
    "acc_norm_stderr": 0.04464285714285712
  },
  "harness|hendrycksTest-management|5": {
    "acc": 0.7572815533980582,
    "acc_stderr": 0.04245022486384495,
    "acc_norm": 0.7572815533980582,
    "acc_norm_stderr": 0.04245022486384495
  },
  "harness|hendrycksTest-marketing|5": {
    "acc": 0.7991452991452992,
    "acc_stderr": 0.026246772946890477,
    "acc_norm": 0.7991452991452992,
    "acc_norm_stderr": 0.026246772946890477
  },
  "harness|hendrycksTest-medical_genetics|5": {
    "acc": 0.63,
    "acc_stderr": 0.04852365870939099,
    "acc_norm": 0.63,
    "acc_norm_stderr": 0.04852365870939099
  },
  "harness|hendrycksTest-miscellaneous|5": {
    "acc": 0.7535121328224776,
    "acc_stderr": 0.015411308769686934,
    "acc_norm": 0.7535121328224776,
    "acc_norm_stderr": 0.015411308769686934
  },
  "harness|hendrycksTest-moral_disputes|5": {
    "acc": 0.6445086705202312,
    "acc_stderr": 0.025770292082977254,
    "acc_norm": 0.6445086705202312,
    "acc_norm_stderr": 0.025770292082977254
  },
  "harness|hendrycksTest-moral_scenarios|5": {
    "acc": 0.42681564245810055,
    "acc_stderr": 0.016542401954631917,
    "acc_norm": 0.42681564245810055,
    "acc_norm_stderr": 0.016542401954631917
  },
  "harness|hendrycksTest-nutrition|5": {
    "acc": 0.5915032679738562,
    "acc_stderr": 0.028146405993096358,
    "acc_norm": 0.5915032679738562,
    "acc_norm_stderr": 0.028146405993096358
  },
  "harness|hendrycksTest-philosophy|5": {
    "acc": 0.6784565916398714,
    "acc_stderr": 0.026527724079528872,
    "acc_norm": 0.6784565916398714,
    "acc_norm_stderr": 0.026527724079528872
  },
  "harness|hendrycksTest-prehistory|5": {
    "acc": 0.654320987654321,
    "acc_stderr": 0.02646248777700187,
    "acc_norm": 0.654320987654321,
    "acc_norm_stderr": 0.02646248777700187
  },
  "harness|hendrycksTest-professional_accounting|5": {
    "acc": 0.44680851063829785,
    "acc_stderr": 0.029658235097666907,
    "acc_norm": 0.44680851063829785,
    "acc_norm_stderr": 0.029658235097666907
  },
  "harness|hendrycksTest-professional_law|5": {
    "acc": 0.4445893089960887,
    "acc_stderr": 0.012691575792657114,
    "acc_norm": 0.4445893089960887,
    "acc_norm_stderr": 0.012691575792657114
  },
  "harness|hendrycksTest-professional_medicine|5": {
    "acc": 0.5441176470588235,
    "acc_stderr": 0.030254372573976715,
    "acc_norm": 0.5441176470588235,
    "acc_norm_stderr": 0.030254372573976715
  },
  "harness|hendrycksTest-professional_psychology|5": {
    "acc": 0.5898692810457516,
    "acc_stderr": 0.019898412717635906,
    "acc_norm": 0.5898692810457516,
    "acc_norm_stderr": 0.019898412717635906
  },
  "harness|hendrycksTest-public_relations|5": {
    "acc": 0.5909090909090909,
    "acc_stderr": 0.047093069786618966,
    "acc_norm": 0.5909090909090909,
    "acc_norm_stderr": 0.047093069786618966
  },
  "harness|hendrycksTest-security_studies|5": {
    "acc": 0.6408163265306123,
    "acc_stderr": 0.030713560455108493,
    "acc_norm": 0.6408163265306123,
    "acc_norm_stderr": 0.030713560455108493
  },
  "harness|hendrycksTest-sociology|5": {
    "acc": 0.7661691542288557,
    "acc_stderr": 0.02992941540834839,
    "acc_norm": 0.7661691542288557,
    "acc_norm_stderr": 0.02992941540834839
  },
  "harness|hendrycksTest-us_foreign_policy|5": {
    "acc": 0.81,
    "acc_stderr": 0.039427724440366255,
    "acc_norm": 0.81,
    "acc_norm_stderr": 0.039427724440366255
  },
  "harness|hendrycksTest-virology|5": {
    "acc": 0.43373493975903615,
    "acc_stderr": 0.038581589406855174,
    "acc_norm": 0.43373493975903615,
    "acc_norm_stderr": 0.038581589406855174
  },
  "harness|hendrycksTest-world_religions|5": {
    "acc": 0.8070175438596491,
    "acc_stderr": 0.030267457554898458,
    "acc_norm": 0.8070175438596491,
    "acc_norm_stderr": 0.030267457554898458
  },
  "harness|truthfulqa:mc|0": {
    "mc1": 0.31334149326805383,
    "mc1_stderr": 0.016238065069059605,
    "mc2": 0.4605506661658282,
    "mc2_stderr": 0.014802420782627305
  },
  "harness|winogrande|5": {
    "acc": 0.7663772691397001,
    "acc_stderr": 0.011892194477183525
  },
  "harness|gsm8k|5": {
    "acc": 0.01288855193328279,
    "acc_stderr": 0.003106901266499642
  }
}
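
The task names above are EleutherAI lm-evaluation-harness IDs, so a row can in principle be re-run locally. A hedged sketch, assuming the harness's v0.4+ Python API (the leaderboard's exact settings may differ):

```python
# Sketch: re-run the 25-shot ARC-Challenge row with EleutherAI's
# lm-evaluation-harness (pip install lm-eval). Assumes the v0.4+ API.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=ericpolewski/TacoBeLLM,dtype=float16",
    tasks=["arc_challenge"],
    num_fewshot=25,
    batch_size=4,
)
print(results["results"]["arc_challenge"])
```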
4
added_tokens.json
Normal file
@@ -0,0 +1,4 @@
{
  "<|PAD|>": 32001,
  "<|end_of_turn|>": 32000
}
29
config.json
Normal file
@@ -0,0 +1,29 @@
{
  "_name_or_path": "./models/OpenOrca-Platypus2-13B",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 5120,
  "initializer_range": 0.02,
  "intermediate_size": 13824,
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "num_attention_heads": 40,
  "num_hidden_layers": 40,
  "num_key_value_heads": 40,
  "pad_token_id": 0,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.36.2",
  "use_cache": true,
  "vocab_size": 32002
}
7
generation_config.json
Normal file
@@ -0,0 +1,7 @@
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "pad_token_id": 0,
  "transformers_version": "4.36.2"
}
3
model-00001-of-00006.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a3759856ac4a1f71955815cd40ab8603641e0176dfd3f0a7e6fdc55ea91715d0
size 4978286208
3
model-00002-of-00006.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:138b46c6ec9dbbca25df70f1672e38b485833cf385d910144c1b516692f53c8d
size 4970422160
3
model-00003-of-00006.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cf0da80ead3487555053b079a859052529b1ee080ff1f4f2fcdf3be55677f680
size 4970422184
3
model-00004-of-00006.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e9c5230dbbb7111139ea5e728b2a07a9962cd1b71b398ec9eaee10668233248d
size 4933701432
3
model-00005-of-00006.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:50bd17ee17ca6d24de02ed4443bb3c57db03ab74701be5d67c9defce05505ed8
size 4933722144
3
model-00006-of-00006.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7ae7fc3949a18e88fdc690532d6366055543854bb48805bca27367b1c74ff095
size 1245257384
370
model.safetensors.index.json
Normal file
@@ -0,0 +1,370 @@
{
  "metadata": {
    "total_size": 26031769600
  },
  "weight_map": {
    "lm_head.weight": "model-00006-of-00006.safetensors",
    "model.embed_tokens.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.input_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.10.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.10.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.10.mlp.up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.10.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.10.self_attn.k_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.10.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.10.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.10.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.mlp.up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.self_attn.k_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.mlp.up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.13.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.13.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.13.mlp.up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.14.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.14.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.14.mlp.up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.15.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.15.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.15.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.15.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.15.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.15.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.15.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.self_attn.k_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.self_attn.k_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.self_attn.k_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.self_attn.k_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.2.input_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.2.mlp.up_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.20.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.20.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.20.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.20.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.20.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.20.self_attn.k_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.20.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.20.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.20.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.21.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.21.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.21.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.21.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.21.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.21.self_attn.k_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.21.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.21.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.21.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.22.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.22.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.22.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.22.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.22.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.22.self_attn.k_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.22.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.22.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.22.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.23.input_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.23.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.23.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.23.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.23.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.23.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.23.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.23.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.23.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.input_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.input_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.input_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.27.input_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.27.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.27.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.27.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.27.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.27.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.27.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.27.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.27.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.28.input_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.28.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.28.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.28.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.28.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.28.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.28.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.28.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.28.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.29.input_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.29.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.29.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.29.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.29.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.29.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.29.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.29.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.29.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.3.input_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.3.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.3.mlp.up_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.30.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.30.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.30.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.30.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.30.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.30.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.30.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.30.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.30.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.31.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.31.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.31.mlp.gate_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.31.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.31.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.31.self_attn.k_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.31.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.31.self_attn.q_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.31.self_attn.v_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.mlp.gate_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.self_attn.k_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.self_attn.q_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.self_attn.v_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.mlp.gate_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.self_attn.k_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.self_attn.q_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.self_attn.v_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.34.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.34.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.34.mlp.gate_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.34.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.34.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.34.self_attn.k_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.34.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.34.self_attn.q_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.34.self_attn.v_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.35.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.35.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.35.mlp.gate_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.35.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.35.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.35.self_attn.k_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.35.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.35.self_attn.q_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.35.self_attn.v_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.36.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.36.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.36.mlp.gate_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.36.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.36.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.36.self_attn.k_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.36.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.36.self_attn.q_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.36.self_attn.v_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.37.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.37.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.37.mlp.gate_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.37.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.37.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.37.self_attn.k_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.37.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.37.self_attn.q_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.37.self_attn.v_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.38.input_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.38.mlp.down_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.38.mlp.gate_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.38.mlp.up_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.38.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.38.self_attn.k_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.38.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.38.self_attn.q_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.38.self_attn.v_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.39.input_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.39.mlp.down_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.39.mlp.gate_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.39.mlp.up_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.39.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.39.self_attn.k_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.39.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.39.self_attn.q_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.39.self_attn.v_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.4.input_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.4.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.4.mlp.up_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.5.input_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.5.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.5.mlp.up_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.6.input_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.6.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.6.mlp.up_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.7.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.7.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.7.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.7.mlp.up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.7.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.8.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.8.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.8.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.8.mlp.up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.8.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.8.self_attn.k_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.8.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.8.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.8.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.mlp.up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.self_attn.k_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
    "model.norm.weight": "model-00006-of-00006.safetensors"
  }
}
27
special_tokens_map.json
Normal file
@@ -0,0 +1,27 @@
{
  "additional_special_tokens": [
    "<|end_of_turn|>",
    "<|PAD|>"
  ],
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }
}
93409
tokenizer.json
Normal file
File diff suppressed because it is too large
3
tokenizer.model
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723
60
tokenizer_config.json
Normal file
@@ -0,0 +1,60 @@
{
  "add_bos_token": true,
  "add_eos_token": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<s>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "32000": {
      "content": "<|end_of_turn|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "32001": {
      "content": "<|PAD|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [
    "<|end_of_turn|>",
    "<|PAD|>"
  ],
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "</s>",
  "legacy": true,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": null,
  "sp_model_kwargs": {},
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": "<unk>",
  "use_default_system_prompt": false
}