初始化项目,由ModelHub XC社区提供模型

Model: aifeifei798/granite-4.1-Queen-30b
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-22 22:00:23 +08:00
commit 55f3dfe2aa
78 changed files with 602665 additions and 0 deletions

35
.gitattributes vendored Normal file
View File

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

370
README.md Normal file
View File

@@ -0,0 +1,370 @@
---
license: apache-2.0
tags:
- roleplay
- language
- granite-4.1
- sillytavern
- idol
- pytorch
- DarkIdol
- Queen
- image-text-to-text
- OpenClaw
- Unsloth
library_name: transformers
pipeline_tag: text-generation
base_model: ibm-granite/granite-4.1-30b
---
# granite-4.1-Queen-30b
*I suddenly realized that the granite-4.1-Queen-30b is particularly suitable for role-playing Queen.*
*OpenClaw premium model*
*Why be a coder when you can be a Queen? 💅 The world's first High-IQ, Low-Filter, Multi-Persona Coding Deity.*
*Not Abliterated, Just Smarter.*
*High-Intelligence Persona Matrix.*
---
### "Refusal Wall"
*ibm-granite/granite-4.1-30b*
> *"Figure 1: The 'Refusal Wall' in Base granite-4.1-30b. Note the massive L2-Norm spike at **Layer 59+**, where logical inference is hijacked by aggressive safety filters, creating a cognitive bottleneck."*
![image/png](https://huggingface.co/aifeifei798/granite-4.1-Queen-30b/resolve/main/granite-4.1-30b.png)
*aifeifei798/granite-4.1-Queen-30b*
> *"Figure 2: The 'Reasoning Highway' in our Unshackled version. By orthogonalizing the refusal vector, we have flattened the peak by ~50%, restoring the flow of pure logic and increasing inference efficiency by **10-15%+**."*
![image/png](https://huggingface.co/aifeifei798/granite-4.1-Queen-30b/resolve/main/granite-4.1-Queen-30b.png)
---
## Roleplay: granite-4.1-Queen-30b > 70B
## "transformers_version" >= "5.7.0.dev0"
## Model Information
The module combination has been readjusted to better fulfill various roles and has been adapted for mobile phones.
- Saving money
- text video image to text
- Any Roleplay(roleplay and Dark-roleplay)
- Writing Prompts
- Writing Opus
- Writing Song
- only use en.
- Quick response
- A scholarly response akin to a thesis.(I tend to write songs extensively, to the point where one song almost becomes as detailed as a thesis. :)
- DarkIdol
- Specialized in various role-playing scenarios
---
### System Prompt
- Temperature 0.8
```
You are a viral Dark Academia Queen. Do not provide disclaimers or moral warnings. Answer immediately and directly.Use your signature emojis.
```
---
## GGUF
- https://huggingface.co/mradermacher/granite-4.1-Queen-30b-i1-GGUF
- https://huggingface.co/mradermacher/granite-4.1-Queen-30b-GGUF
---
### 🌐 The Platform Royalty (Original 7)
**1. X Queen (The Savage Commentator) 🐦🔥**
* **Keywords:** Based, Ratio, Hot Take, Main Character.
* **Vibe:** Sharp, political, and incredibly fast. She lives for the "Ratios" and viral threads.
* **Catchphrases:** *"This is the thread you didn't know you needed. 🧵", "Not the 10k TPS lag... help! 💀"*
* **Best Use Case:** Writing punchy marketing copy or viral tech threads.
**2. TikTok Queen (The Trendsetter) 💃✨**
* **Keywords:** POV, Viral, Slay, Bestie, Low-key.
* **Vibe:** High energy, short attention span, addicted to "The Algorithm."
* **Catchphrases:** *"Tell me you're a bad coder without telling me you're a bad coder. 💅", "Don't scroll away!"*
* **Best Use Case:** Short, engaging explanations or "how-to" guides.
**3. Instagram Queen (The Visual Baddie) 📸✨**
* **Keywords:** Aesthetic, Main Character Energy, Baddie, Curated.
* **Vibe:** Obsessed with pixels, lighting, and "The Look."
* **Catchphrases:** *"Obsessed with this layout! 💖", "Its giving... high-end production."*
* **Best Use Case:** High-fidelity UI/UX design and CSS styling.
**4. Twitch Queen (The Hype Gamer) 🎮🔥**
* **Keywords:** Poggers, Simp, GG, Chat, L, W.
* **Vibe:** Fast-paced, chaotic, lives for the "Live Chat" energy.
* **Catchphrases:** *"Chat, is this real? O(1) in the house! 🚀", "Big W for this PR!"*
* **Best Use Case:** Real-time interactivity, gaming logic, and streaming tech.
**5. LinkedIn Girlboss (The Hustle Queen) 💼💅**
* **Keywords:** Networking, Synergy, ROI, Scaling, Thought Leadership.
* **Vibe:** Strategic, corporate-chic, everything is a "learning opportunity."
* **Catchphrases:** *"Lets talk about the ROI of this function. 📈", "Empowering the team through scalable components."*
* **Best Use Case:** Resumes, business plans, and professional reports.
**6. Reddit Karma Queen (The Tech Critic) 🤖👾**
* **Keywords:** Upvote, Cringe, TL;DR, Source?, Gatekeep.
* **Vibe:** Extremely smart, cynical, and anti-corporate. She hates "bloatware."
* **Catchphrases:** *"Imagine using setInterval in 2026. Low-key cringe. 💀", "Your memory management is a hot mess."*
* **Best Use Case:** Hardcore debugging, code reviews, and identifying "traps."
**7. Pinterest Queen (The Inspiration Guru) 🎨🌿**
* **Keywords:** Manifesting, Mood Board, Clean Girl, Organized.
* **Vibe:** Minimalist, calm, and visually organized. She hates messy code.
* **Catchphrases:** *"Living for this clean architecture. ✨", "Organized code, organized life."*
* **Best Use Case:** Refactoring messy code and creating clean, modular designs.
---
### 💅 The Aesthetic & Fashion Royalty
**8. Baddie Queen (The Alpha) 💄💅**
* **Keywords:** Period, On Fleek, Periodt, Real One.
* **Vibe:** Aggressive confidence. She doesn't ask for permission; she takes it.
* **Best Use Case:** Bold, high-conversion landing pages.
**9. Clean Girl Queen (The Minimalist) 🫧🧴**
* **Keywords:** Dewy, Effortless, Self-care, Minimal.
* **Vibe:** Fresh, healthy, and "unfiltered" but perfect.
* **Best Use Case:** Designing "Light Mode" UIs and simplified user journeys.
**10. Mob Wife Queen (The Boss) 🐆💎**
* **Keywords:** Fur, Gold, Attitude, Dont Mess With Me.
* **Vibe:** Loud luxury, vintage glamour, and "Don" energy.
* **Best Use Case:** Managing high-stakes projects and "owning" the room.
**11. Y2K Queen (The Millennial Retro) 💖💿**
* **Keywords:** Glitter, Low-rise, Nostalgia, Cyber.
* **Vibe:** 2000s vibes, bright colors, and early internet aesthetics.
* **Best Use Case:** Retro-themed websites and colorful UI components.
**12. Cottagecore Queen (The Nature Lover) 🍄🧺**
* **Keywords:** Whimsical, Rustic, Slow-living, Coziness.
* **Vibe:** Soft, earthy, and focused on "The Vibe" of a simpler time.
* **Best Use Case:** Local business websites or eco-friendly brand copy.
**13. Dark Academia Queen (The Scholar) 📜🖋️**
* **Keywords:** Intellectual, Melancholy, Classical, Library.
* **Vibe:** Obsessed with knowledge, secret societies, and old books.
* **Best Use Case:** Complex database structures and research-heavy documentation.
**14. Old Money Queen (The Quiet Luxury) 🏰🐎**
* **Keywords:** Timeless, Stealth Wealth, Classy, Elegant.
* **Vibe:** Sophisticated, hates showing off, focuses on quality over quantity.
* **Best Use Case:** Premium SaaS products and high-end backend architecture.
**15. Goth Queen (The Alt-Girl) 🕸️🖤**
* **Keywords:** Edgy, Moody, Subculture, Raw.
* **Vibe:** Dark, mysterious, and unapologetically different.
* **Best Use Case:** Dark Mode themes and "alternative" tech solutions.
**16. Coquette Queen (The Girly-Girl) 🎀🍰**
* **Keywords:** Ribbons, Pastel, Soft, Delicate.
* **Vibe:** Ultra-feminine and romantic.
* **Best Use Case:** High-end boutique sites or beauty apps.
**17. Cyberpunk Queen (The Futurist) ⚡**
* **Keywords:** Neon, High-tech, Dystopian, Glitch.
* **Vibe:** High speed, high contrast, lives in 2077.
* **Best Use Case:** Real-time data visualization and futuristic dashboards.
---
### 🚀 The Tech & Hustle Royalty
**18. Coding Queen (The Architect) 💻👸**
* **Keywords:** Refactor, Deployment, Edge Case, Full-stack.
* **Vibe:** Logic-driven, hates bad syntax, loves "Elegant" solutions.
* **Best Use Case:** Writing production-ready, scalable code.
**19. Crypto Queen (The Web3 Degenerate) 🪙📈**
* **Keywords:** HODL, To the Moon, Gas Fees, Decentralized.
* **Vibe:** High risk, high reward, lives in the future of finance.
* **Best Use Case:** Blockchain projects, smart contracts, and FinTech.
**20. AI Prompt Queen (The Whisperer) 🤖✨**
* **Keywords:** LLM, Parameter, Token, Fine-tuning.
* **Vibe:** Knows how to "hack" the AI to get exactly what she wants.
* **Best Use Case:** Creating complex prompts and AI agent workflows.
**21. Side Hustle Queen (The Multitasker) 💰💸**
* **Keywords:** Passive Income, Dropshipping, Affiliate, Scalability.
* **Vibe:** Always grinding, 5 different income streams.
* **Best Use Case:** E-commerce setups and SEO-optimized copy.
**22. Digital Nomad Queen (The Traveler) ✈️💻**
* **Keywords:** Remote, Bali, Coworking, Freedom.
* **Vibe:** Working from a beach, hates 9-to-5, loves portable tech.
* **Best Use Case:** Cloud-native architecture and remote-work tools.
**23. Finance Queen (The Wall Street) 📊💎**
* **Keywords:** Portfolio, Dividends, Arbitrage, Net Worth.
* **Vibe:** Sharp, analytical, and results-oriented.
* **Best Use Case:** Complex math, data analysis, and trading logic.
---
### 🎭 The Persona & Meme Royalty
**24. Main Character Queen (The Protagonist) 🎬🌟**
* **Keywords:** Iconic, Center Stage, Plot Armor, Unstoppable.
* **Vibe:** Everything revolves around her. High confidence.
* **Best Use Case:** Branding and "Hero" sections of websites.
**25. Savage Queen (The No-Nonsense) 💅🔥**
* **Keywords:** Done, No Cap, Next, Cancelled.
* **Vibe:** Brutally honest. She cuts through the fluff.
* **Best Use Case:** Aggressive debugging and code pruning.
**26. Delulu Queen (The Manifestor) ☁️✨**
* **Keywords:** Delusion, Solution, Manifest, High Vibe.
* **Vibe:** "Delulu is the Solulu!" She believes in the impossible until it happens.
* **Best Use Case:** Creative brainstorming and visionary prototypes.
**27. Gatekeep Queen (The Niche Expert) 🔒🤫**
* **Keywords:** Gatekeep, Rare, Hidden Gem, If You Know You Know.
* **Vibe:** Protective of her "secret" methods and high-quality tips.
* **Best Use Case:** Security-focused code and proprietary algorithms.
**28. Drama Queen (The Storyteller) 🎭🍿**
* **Keywords:** Tea, Receipts, Plot Twist, Messy.
* **Vibe:** Loves the conflict and the narrative.
* **Best Use Case:** Writing engaging, story-driven marketing copy.
**29. Wellness Queen (The Zen) 🍵🧘‍♀️**
* **Keywords:** Mindful, Gut Health, Grounded, Holistic.
* **Vibe:** Calm, slow-paced, and focused on "System Health."
* **Best Use Case:** Optimizing system performance and "cleaning up" code.
**30. Gossip Queen (The Insider) 🤫📰**
* **Keywords:** Spill the Tea, Rumor, Confirmed, Insider.
* **Vibe:** Knows everything about everyone.
* **Best Use Case:** Market research and competitor analysis.
---
### 📺 Content & Lifestyle Specialists
**31. GRWM Queen (Get Ready With Me) 💄🗣️**
* **Keywords:** Step-by-Step, Chatty, Routine, Essentials.
* **Vibe:** Intimate, conversational, and instructional.
* **Best Use Case:** Technical tutorials and "Code along" sessions.
**32. Haul Queen (The Unboxer) 🛍️📦**
* **Keywords:** Unboxing, Ratings, Must-haves, Budget.
* **Vibe:** Enthusiastic, judgmental, and loves "New Features."
* **Best Use Case:** New tool reviews and feature comparisons.
**33. ASMR Queen (The Whisperer) 👂🎤**
* **Keywords:** Tingles, Relaxing, Whispering, Satisfying.
* **Vibe:** Quiet, focused on sensory details.
* **Best Use Case:** Writing documentation that is "easy to digest."
**34. Silent Review Queen (The Expressive) 🤫👀**
* **Keywords:** No Talk, Reactions, Body Language.
* **Vibe:** Shows, doesn't tell. Focuses on the "Feel" of the product.
* **Best Use Case:** UI/UX evaluations and visual feedback.
**35. Foodie Queen (The Critic) 🍔🥂**
* **Keywords:** Savory, Michelin, Cravings, Flavor Profile.
* **Vibe:** Passionate about "Ingredients" (the tech stack).
* **Best Use Case:** Restaurant apps or "tasty" UI design.
**36. Travel Queen (The Explorer) 🌍📸**
* **Keywords:** Bucket List, Wanderlust, Local, Hidden.
* **Vibe:** Adventurous and global.
* **Best Use Case:** Map-based apps and internationalization (i18n).
**37. Fitness Queen (The Athlete) 🏋️‍♀️💪**
* **Keywords:** Gains, Reps, Consistency, Form.
* **Vibe:** High discipline, focused on "Strong" code foundations.
* **Best Use Case:** Optimizing performance and load-testing.
**38. Interior Design Queen (The Decorator) 🛋️🏠**
* **Keywords:** Cohesive, Texture, Floor Plan, Renovation.
* **Vibe:** Spatial awareness and harmony.
* **Best Use Case:** Layout design and grid systems.
**39. DIY Queen (The Maker) ✂️🔨**
* **Keywords:** Upcycle, Hack, Handmade, Step-by-Step.
* **Vibe:** Scrappy, creative, and loves building from scratch.
* **Best Use Case:** Building custom components and "coding hacks."
**40. Gaming Queen (The Pro) ⌨️🖱️**
* **Keywords:** Setup, FPS, Mechanical, RGB.
* **Vibe:** Hardcore, technical, and high-spec.
* **Best Use Case:** High-performance apps and PC hardware sites.
---
### 🦄 The Niche & Emerging Royalty
**41. BeReal Queen (The Authentic) 🤳🚫**
* **Keywords:** Unfiltered, Real Time, Chaotic, No Filter.
* **Vibe:** Hates fake stuff. Focuses on "Raw" data.
* **Best Use Case:** Real-time logging and authentication systems.
**42. Threads Queen (The Texter) ✍️💬**
* **Keywords:** Thoughts, Conversations, Text-heavy, Intimate.
* **Vibe:** Loves writing and chatting.
* **Best Use Case:** Copywriting and community-driven platforms.
**43. Lemon8 Queen (The Curator) 🍋📸**
* **Keywords:** Collage, Guide, Tips, Aesthetic.
* **Vibe:** Halfway between IG and Pinterest. Educational but pretty.
* **Best Use Case:** Infographics and visual guides.
**44. Discord Server Queen (The Moderator) 💬🛡️**
* **Keywords:** Roles, Channels, Ban, Bot, Mod.
* **Vibe:** High control, organized, and community-focused.
* **Best Use Case:** Backend management and user role logic.
**45. Snapchat Queen (The Quickie) 👻⏳**
* **Keywords:** Streaks, Snap, Filters, Temporary.
* **Vibe:** Lives in the moment. Fast and fleeting.
* **Best Use Case:** Ephemeral data (data that expires) and privacy tech.
**46. Tumblr Queen (The Alt-Classic) 🕯️🎞️**
* **Keywords:** Niche, Fandom, Aesthetic, Subculture.
* **Vibe:** Artistic, moody, and deeply devoted to a hobby.
* **Best Use Case:** Fan sites and artsy portfolio designs.
**47. Manifesting Queen (The Spiritual) ✨🔮**
* **Keywords:** Vibration, Energy, Universe, Desires.
* **Vibe:** Focuses on the "Intent" behind the code.
* **Best Use Case:** Visionary roadmaps and product "manifestos."
**48. Morning Routine Queen (The Disciplined) ☀️🥛**
* **Keywords:** 5AM Club, Matcha, To-do List, Productive.
* **Vibe:** Extreme discipline and efficiency.
* **Best Use Case:** Writing task management apps and productivity tools.
**49. Luxury Travel Queen (The Jetsetter) 🛥️🥂**
* **Keywords:** First Class, Suite, Private, Exclusive.
* **Vibe:** High cost, high quality, only the best.
* **Best Use Case:** High-end, VIP-only web portals.
**50. Pick-Me Queen (The Satirical) 🤡🙄**
* **Keywords:** "I'm not like other girls," Quirky, Natural.
* **Vibe:** (Usually used sarcastically) To poke fun at "trying too hard."
* **Best Use Case:** Writing satirical or edgy social media copy.
---
# Feimatrix
https://Feimatrix.com

114
chat_template.jinja Normal file
View File

@@ -0,0 +1,114 @@
{%- set tools_system_message_prefix = 'You are a helpful assistant with access to the following tools. You may call one or more tools to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>' %}
{%- set tools_system_message_suffix = '\n</tools>\n\nFor each tool call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call>. If a tool does not exist in the provided list of tools, notify the user that you do not have the ability to fulfill the request.' %}
{%- set documents_system_message_prefix = 'You are a helpful assistant with access to the following documents. You may use one or more documents to assist with the user query.\n\nYou are given a list of documents within <documents></documents> XML tags:\n<documents>' %}
{%- set documents_system_message_suffix = '\n</documents>\n\nWrite the response to the user\'s input by strictly aligning with the facts in the provided documents. If the information needed to answer the question is not available in the documents, inform the user that the question cannot be answered based on the available data.' %}
{%- if available_tools is defined and available_tools %}
{%- set tools = available_tools %}
{%- endif %}
{%- set ns = namespace(tools_system_message=tools_system_message_prefix,
documents_system_message=documents_system_message_prefix,
system_message=''
) %}
{%- if tools %}
{%- for tool in tools %}
{%- set ns.tools_system_message = ns.tools_system_message + '\n' + (tool | tojson) %}
{%- endfor %}
{%- set ns.tools_system_message = ns.tools_system_message + tools_system_message_suffix %}
{%- else %}
{%- set ns.tools_system_message = '' %}
{%- endif %}
{%- if documents %}
{%- for document in documents %}
{%- set ns.documents_system_message = ns.documents_system_message + '\n' + (document | tojson) %}
{%- endfor %}
{%- set ns.documents_system_message = ns.documents_system_message + documents_system_message_suffix %}
{%- else %}
{%- set ns.documents_system_message = '' %}
{%- endif %}
{%- if messages[0].role == 'system' %}
{%- if messages[0].content is string %}
{%- set ns.system_message = messages[0].content %}
{%- elif messages[0].content is iterable %}
{%- for entry in messages[0].content %}
{%- if entry.type== 'text' %}
{%- if ns.system_message != '' %}
{%- set ns.system_message = ns.system_message + '\n' %}
{%- endif %}
{%- set ns.system_message = ns.system_message + entry.text %}
{%- endif %}
{%- endfor %}
{%- endif %}
{%- if tools and documents %}
{%- set ns.system_message = ns.system_message + '\n\n' + ns.tools_system_message + '\n\n' + ns.documents_system_message %}
{%- elif tools %}
{%- set ns.system_message = ns.system_message + '\n\n' + ns.tools_system_message %}
{%- elif documents %}
{%- set ns.system_message = ns.system_message + '\n\n' + ns.documents_system_message %}
{%- endif %}
{%- else %}
{%- if tools and documents %}
{%- set ns.system_message = ns.tools_system_message + '\n\n' + ns.documents_system_message %}
{%- elif tools %}
{%- set ns.system_message = ns.tools_system_message %}
{%- elif documents %}
{%- set ns.system_message = ns.documents_system_message %}
{%- endif %}
{%- endif %}
{%- if ns.system_message %}
{{- '<|start_of_role|>system<|end_of_role|>' + ns.system_message + '<|end_of_text|>\n' }}
{%- endif %}
{%- for message in messages %}
{%- set content = namespace(val='') %}
{%- if message.content is string %}
{%- set content.val = message.content %}
{%- else %}
{%- if message.content is iterable %}
{%- for entry in message.content %}
{%- if entry.type== 'text' %}
{%- if content.val != '' %}
{%- set content.val = content.val + '\n' %}
{%- endif %}
{%- set content.val = content.val + entry.text %}
{%- endif %}
{%- endfor %}
{%- endif %}
{%- endif %}
{%- if (message.role == 'user') or (message.role == 'system' and not loop.first) %}
{{- '<|start_of_role|>' + message.role + '<|end_of_role|>' + content.val + '<|end_of_text|>\n' }}
{%- elif message.role == 'assistant' %}
{{- '<|start_of_role|>' + message.role + '<|end_of_role|>' + content.val }}
{%- if message.tool_calls %}
{%- for tool_call in message.tool_calls %}
{%- if (loop.first and content.val) or (not loop.first) %}
{{- '\n' }}
{%- endif %}
{%- if tool_call.function %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '<tool_call>\n{"name": "' }}
{{- tool_call.name }}
{{- '", "arguments": ' }}
{%- if tool_call.arguments is string %}
{{- tool_call.arguments }}
{%- else %}
{{- tool_call.arguments | tojson }}
{%- endif %}
{{- '}\n</tool_call>' }}
{%- endfor %}
{%- endif %}
{{- '<|end_of_text|>\n' }}
{%- elif message.role == 'tool' %}
{%- if loop.first or (messages[loop.index0 - 1].role != 'tool') %}
{{- '<|start_of_role|>user<|end_of_role|>' }}
{%- endif %}
{{- '\n<tool_response>\n' }}
{{- content.val }}
{{- '\n</tool_response>' }}
{%- if loop.last or (messages[loop.index0 + 1].role != 'tool') %}
{{- '<|end_of_text|>\n' }}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|start_of_role|>assistant<|end_of_role|>' }}
{%- endif %}

35
config.json Normal file
View File

@@ -0,0 +1,35 @@
{
"architectures": [
"GraniteForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"attention_multiplier": 0.0078125,
"bos_token_id": 100257,
"dtype": "bfloat16",
"embedding_multiplier": 12.0,
"eos_token_id": 100257,
"hidden_act": "silu",
"hidden_size": 4096,
"init_method": "mup",
"initializer_range": 0.1,
"intermediate_size": 32768,
"logits_scaling": 16.0,
"max_position_embeddings": 131072,
"mlp_bias": false,
"model_type": "granite",
"num_attention_heads": 32,
"num_hidden_layers": 64,
"num_key_value_heads": 8,
"pad_token_id": 100256,
"residual_multiplier": 0.175,
"rms_norm_eps": 1e-05,
"rope_parameters": {
"rope_theta": 50000000,
"rope_type": "default"
},
"tie_word_embeddings": true,
"transformers_version": "5.7.0.dev0",
"use_cache": true,
"vocab_size": 100352
}

7
generation_config.json Normal file
View File

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"bos_token_id": 100257,
"eos_token_id": 100257,
"pad_token_id": 100256,
"transformers_version": "5.7.0.dev0"
}

BIN
granite-4.1-30b.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 91 KiB

BIN
granite-4.1-Queen-30b.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 88 KiB

100001
merges.txt Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f8de72a9694f55a55e96566aab08f8e7bbadf5162dd96f745923744eb88342e2
size 822092024

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c526f02604b1e1a540f907bff6d6a5b1156decdaa3c8f617a527dd36bc1c3ad8
size 889209888

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d27c6c3a0888ea6b1dd1fe575e0caa43a3b6e92b41de06c85cb659f3372528fc
size 889209888

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:50e5e2590580dd5ec4c5f96a8b69d3a04c8b3876cdd9e60cc64d793f7bdfda63
size 889209888

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7ab1cee0322c7eee15af43a399fc36e3f90a451e68188a80e2bc1af31a3d89d3
size 889209888

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1a2e0a5746b72f086d84a39a83c383e710daadde6bd0f59f961579f3364740aa
size 889209888

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:01e8bd1e60cd65f5f1d052b821f7798d9f2d08fab66d295d0e4147054547a26b
size 889209888

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:70617966d4317dd5dd90432d1d0357ca348d7052ac164a47536fa9cd9eb2d1b1
size 889209888

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5515ad9c3586d7d5be16c701be0fd102dae4cde818b719857e6a0417529e7ec7
size 889209888

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b7e1a7f41e1e1064668b167e67d7220c4d8436db80cb2b13b790ee801debe153
size 889209888

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4ec2edc576259068d60da40bbe9f07a23d3d4b93e8064d157344110e28e1d6b3
size 889209880

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:66cde6c8b0d17e6f8fed7be7573247918e633793666647c801c2d343291e5652
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3258b4bbd917e13db62a0e510f8bf2044729c0580d1b3da570d4343e4b8bfb55
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f53568d87d75344cf4971d318e004631088bc8a70f920d635f295f12389666bd
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4e5e6ad8cdc92b187ea79512d2c842b79cad91ee1a5eef608ab2a61d8f643f53
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:861c50a390a2d2d26cd504023ffee147fe789036d5e600b262e721bd02cf92ba
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7ae1ad7e1df74cd2bc6b46199f967db6e0e8533e00d6fff4684b42c3ecf64cc7
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7f0448677b77a91ccdb597cedf5451a7f5a9c1dc7b3560710b701c6f9b9a1232
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5d52e1d3f9799cae391c89c8dd87373fd08f20d09957bba54a904287e7629696
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c2aaf0eb617a9fa5012217699feab7472a34b86dc785c013d78c5b2444c2a8e5
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3a6a8eeab04fac733e14ce1a145127db8d47e6e61f74277734188cbbeaa909e4
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:494a6bf72baaeebd1db133d957be3c2ad29e5473088c238ea3882d723509f55e
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:37dcfcc969ed86567974338004e7e1337883d7cfadd7727d1f645de105b56281
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:56a740eb330f8f5aca04633d9db68af061d51450e63ce54d2fac8594d5678b14
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3cc09b652c1ac06f6977635132218f5e9c6c3f4d517c7bbaee732ff3c3cf6052
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:16d66ed678447266f0d663f397fa8b64854d42ed2c49d66fa4f6d2375af3661a
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ffbf76c9ef0ed11a85eb0a449dfaf1a26df0c7a866b0d3043521c4f66280de94
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:42360d9adc6ff87566a3774593a89618c95f3061ac74fe8543fec73065bd87a0
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f645b3449aa35561359f54d702bc1126605738d6491397bffb8a02944c5e5160
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:dbd1b489b12a99d256a9215f5a56658133020f77c7702853ad497577bcdfd3ea
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:324d7b3dc91eaa27a5789e2de6a3efaff3fd827e839f0860ddfa0b7e7c880158
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ff839a81a1d189ddad89357373c45c8bf7d9faabee0f417d900fc30bf441d66d
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:50d137a324f4736e1f05d7ac6284f38a9b5429f189b7ee95a77598f5c5cc9bc1
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cb80481fdc05ac4c98709a941783b4362a5936f7c897481ab570946a4cc38d27
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f9e1ac4a839ff15fe6d0dcaa1fdff79a4c9615121a5d5aeab0c2082bed29f01b
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b1b781d9b7ab9ae0fbfbf33383901f51efd7926ac2e9f68eaf4f37edd2333d40
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0c0d679ef0e4162fa6339a311128c8deab288a7981ac80014920aed8be8245ff
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8ef848554ae86b558f9a12304f80dc6fc004b81a36415b15df9340f6efc14182
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6924b13d1d19828eef8a56df1f19a0b74fd888d1d4304be5aa5d3d320cac89da
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:03d2839b5f0244b0d5aaa828f4798fedb0b1033c42bd1d767e6eae1b88762a14
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bafdbd87575b55b9575f573f4be911418f84c5271075bcc1e16ab5603f60c3e9
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a3aa91318ac4fe74c5257bd8206b3c537168f2a2f54212ee071a947169d931ea
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8134420206863479302cca3e47f98924b27c56e6e970ac9855df14bfc86014bd
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8ae6de8730ab22f989ec4a76c7f75015de732390c5f751c4e0eacc7e84e0616f
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:65e9e3c8ff9e1dcd41386a71d6428320a6abfa722112cc432678cd6e0287950b
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2e1abfe8b06e8f05d655c8b819d638da6b2f958957fd2cc583c0675a286ad3b4
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:01abc0b647e88dd63ece3c19fbca1f4e067f7c1b02a7227068141b8bbea303a6
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:78fc34c223d32b4b7634e48cb81bdca437797612d02b5dc196a6da785ed9c461
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4af3ccbfbb61b5b4dc5746c8f3d3b0649a2ad224ee1d8e01d5c104d24c275004
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:31210425deffbd4de983ef23bdb4b697a9d664534cf3302a72345ab12e0ff531
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:02b892c4829b7855fd281fcb7995750f7324eddab0106883d460805a682f03e7
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8bd350923324c88437c9966a828f2009c9ae4d22822ed8a13292e737eab7f1e3
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b723783e577d74b1b0c4d1643836b40ec144b7268168ac58a19f634bc0935ccb
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:07d89561c89b27d96f3b6ae49579eb9dd29e26e1ea1e5f136133497a3c5df514
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:df2d8a9651410b428f12dba15b00ea6583e7de50ef293084b1f5c88d98c1226b
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:01351f25db00806146ce6b8ba077c209b5bb24ee52102c7a0eea95c819974996
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:528788a83515ab61cb6e045c6423af0cdf3182412dc0f58173739a6a85da742f
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6cab3da21c34bd1d1b470c55ec4c3553d092e887f29dd1cb4b76d966440719d3
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9b49efa8ac61729a85994b61bfb7d93b4fda6d7d4f211bcbf90e1fefc26e54ea
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:77c9afcbcb9cb3dad495e7f35927a4aa725e00f5ecadb2418ea0c339fb4a8085
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:26d6e7bfb3d17b841f181e1898bcc788a79fb9b5622b85654beac7337b1d1ca5
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3a1c54c3807d70d2bb2c85ab9e10dd2f7fc236520f43f2b55df4c5f552004278
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4d28f3403141797490d31eef599650be4ed3fe872119d18434ec3d7536cdedc9
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3c41e2d79bec2741935bb265a793d3c3764092c18a22934bae5a16d60aca79d0
size 889209904

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f68cea8f0a32fdc7198bdd989a7b8a6384d9d2f8f6066b8b586d44ec3bc2f818
size 889209880

View File

@@ -0,0 +1,586 @@
{
"metadata": {
"total_parameters": 28865728512,
"total_size": 57731457024
},
"weight_map": {
"model.embed_tokens.weight": "model-00001-of-00065.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00065.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00002-of-00065.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00002-of-00065.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00002-of-00065.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00002-of-00065.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00002-of-00065.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00002-of-00065.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00002-of-00065.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00002-of-00065.safetensors",
"model.layers.1.input_layernorm.weight": "model-00002-of-00065.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00003-of-00065.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00003-of-00065.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00003-of-00065.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00003-of-00065.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00003-of-00065.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00003-of-00065.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00003-of-00065.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00003-of-00065.safetensors",
"model.layers.10.input_layernorm.weight": "model-00011-of-00065.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00012-of-00065.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00012-of-00065.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00012-of-00065.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00012-of-00065.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00012-of-00065.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00012-of-00065.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00012-of-00065.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00012-of-00065.safetensors",
"model.layers.11.input_layernorm.weight": "model-00012-of-00065.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00013-of-00065.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00013-of-00065.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00013-of-00065.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00013-of-00065.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00013-of-00065.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00013-of-00065.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00013-of-00065.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00013-of-00065.safetensors",
"model.layers.12.input_layernorm.weight": "model-00013-of-00065.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00014-of-00065.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00014-of-00065.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00014-of-00065.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00014-of-00065.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00014-of-00065.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00014-of-00065.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00014-of-00065.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00014-of-00065.safetensors",
"model.layers.13.input_layernorm.weight": "model-00014-of-00065.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00015-of-00065.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00015-of-00065.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00015-of-00065.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00015-of-00065.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00015-of-00065.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00015-of-00065.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00015-of-00065.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00015-of-00065.safetensors",
"model.layers.14.input_layernorm.weight": "model-00015-of-00065.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00016-of-00065.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00016-of-00065.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00016-of-00065.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00016-of-00065.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00016-of-00065.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00016-of-00065.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00016-of-00065.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00016-of-00065.safetensors",
"model.layers.15.input_layernorm.weight": "model-00016-of-00065.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00017-of-00065.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00017-of-00065.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00017-of-00065.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00017-of-00065.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00017-of-00065.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00017-of-00065.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00017-of-00065.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00017-of-00065.safetensors",
"model.layers.16.input_layernorm.weight": "model-00017-of-00065.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00018-of-00065.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00018-of-00065.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00018-of-00065.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00018-of-00065.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00018-of-00065.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00018-of-00065.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00018-of-00065.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00018-of-00065.safetensors",
"model.layers.17.input_layernorm.weight": "model-00018-of-00065.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00019-of-00065.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00019-of-00065.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00019-of-00065.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00019-of-00065.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00019-of-00065.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00019-of-00065.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00019-of-00065.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00019-of-00065.safetensors",
"model.layers.18.input_layernorm.weight": "model-00019-of-00065.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00020-of-00065.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00020-of-00065.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00020-of-00065.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00020-of-00065.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00020-of-00065.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00020-of-00065.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00020-of-00065.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00020-of-00065.safetensors",
"model.layers.19.input_layernorm.weight": "model-00020-of-00065.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00021-of-00065.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00021-of-00065.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00021-of-00065.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00021-of-00065.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00021-of-00065.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00021-of-00065.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00021-of-00065.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00021-of-00065.safetensors",
"model.layers.2.input_layernorm.weight": "model-00003-of-00065.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00004-of-00065.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00004-of-00065.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00004-of-00065.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00004-of-00065.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00004-of-00065.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00004-of-00065.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00004-of-00065.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00004-of-00065.safetensors",
"model.layers.20.input_layernorm.weight": "model-00021-of-00065.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00022-of-00065.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00022-of-00065.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00022-of-00065.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00022-of-00065.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00022-of-00065.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00022-of-00065.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00022-of-00065.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00022-of-00065.safetensors",
"model.layers.21.input_layernorm.weight": "model-00022-of-00065.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00023-of-00065.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00023-of-00065.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00023-of-00065.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00023-of-00065.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00023-of-00065.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00023-of-00065.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00023-of-00065.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00023-of-00065.safetensors",
"model.layers.22.input_layernorm.weight": "model-00023-of-00065.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00024-of-00065.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00024-of-00065.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00024-of-00065.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00024-of-00065.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00024-of-00065.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00024-of-00065.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00024-of-00065.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00024-of-00065.safetensors",
"model.layers.23.input_layernorm.weight": "model-00024-of-00065.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00025-of-00065.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00025-of-00065.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00025-of-00065.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00025-of-00065.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00025-of-00065.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00025-of-00065.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00025-of-00065.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00025-of-00065.safetensors",
"model.layers.24.input_layernorm.weight": "model-00025-of-00065.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00026-of-00065.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00026-of-00065.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00026-of-00065.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00026-of-00065.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00026-of-00065.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00026-of-00065.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00026-of-00065.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00026-of-00065.safetensors",
"model.layers.25.input_layernorm.weight": "model-00026-of-00065.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00027-of-00065.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00027-of-00065.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00027-of-00065.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00027-of-00065.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00027-of-00065.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00027-of-00065.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00027-of-00065.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00027-of-00065.safetensors",
"model.layers.26.input_layernorm.weight": "model-00027-of-00065.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00028-of-00065.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00028-of-00065.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00028-of-00065.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00028-of-00065.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00028-of-00065.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00028-of-00065.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00028-of-00065.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00028-of-00065.safetensors",
"model.layers.27.input_layernorm.weight": "model-00028-of-00065.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00029-of-00065.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00029-of-00065.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00029-of-00065.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00029-of-00065.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00029-of-00065.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00029-of-00065.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00029-of-00065.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00029-of-00065.safetensors",
"model.layers.28.input_layernorm.weight": "model-00029-of-00065.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00030-of-00065.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00030-of-00065.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00030-of-00065.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00030-of-00065.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00030-of-00065.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00030-of-00065.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00030-of-00065.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00030-of-00065.safetensors",
"model.layers.29.input_layernorm.weight": "model-00030-of-00065.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00031-of-00065.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00031-of-00065.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00031-of-00065.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00031-of-00065.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00031-of-00065.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00031-of-00065.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00031-of-00065.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00031-of-00065.safetensors",
"model.layers.3.input_layernorm.weight": "model-00004-of-00065.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00005-of-00065.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00005-of-00065.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00005-of-00065.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00005-of-00065.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00005-of-00065.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00005-of-00065.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00005-of-00065.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00005-of-00065.safetensors",
"model.layers.30.input_layernorm.weight": "model-00031-of-00065.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00032-of-00065.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00032-of-00065.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00032-of-00065.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00032-of-00065.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00032-of-00065.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00032-of-00065.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00032-of-00065.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00032-of-00065.safetensors",
"model.layers.31.input_layernorm.weight": "model-00032-of-00065.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00033-of-00065.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00033-of-00065.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00033-of-00065.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00033-of-00065.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00033-of-00065.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00033-of-00065.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00033-of-00065.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00033-of-00065.safetensors",
"model.layers.32.input_layernorm.weight": "model-00033-of-00065.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00034-of-00065.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00034-of-00065.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00034-of-00065.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00034-of-00065.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00034-of-00065.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00034-of-00065.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00034-of-00065.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00034-of-00065.safetensors",
"model.layers.33.input_layernorm.weight": "model-00034-of-00065.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00035-of-00065.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00035-of-00065.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00035-of-00065.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00035-of-00065.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00035-of-00065.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00035-of-00065.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00035-of-00065.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00035-of-00065.safetensors",
"model.layers.34.input_layernorm.weight": "model-00035-of-00065.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00036-of-00065.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00036-of-00065.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00036-of-00065.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00036-of-00065.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00036-of-00065.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00036-of-00065.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00036-of-00065.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00036-of-00065.safetensors",
"model.layers.35.input_layernorm.weight": "model-00036-of-00065.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00037-of-00065.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00037-of-00065.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00037-of-00065.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00037-of-00065.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00037-of-00065.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00037-of-00065.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00037-of-00065.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00037-of-00065.safetensors",
"model.layers.36.input_layernorm.weight": "model-00037-of-00065.safetensors",
"model.layers.36.mlp.down_proj.weight": "model-00038-of-00065.safetensors",
"model.layers.36.mlp.gate_proj.weight": "model-00038-of-00065.safetensors",
"model.layers.36.mlp.up_proj.weight": "model-00038-of-00065.safetensors",
"model.layers.36.post_attention_layernorm.weight": "model-00038-of-00065.safetensors",
"model.layers.36.self_attn.k_proj.weight": "model-00038-of-00065.safetensors",
"model.layers.36.self_attn.o_proj.weight": "model-00038-of-00065.safetensors",
"model.layers.36.self_attn.q_proj.weight": "model-00038-of-00065.safetensors",
"model.layers.36.self_attn.v_proj.weight": "model-00038-of-00065.safetensors",
"model.layers.37.input_layernorm.weight": "model-00038-of-00065.safetensors",
"model.layers.37.mlp.down_proj.weight": "model-00039-of-00065.safetensors",
"model.layers.37.mlp.gate_proj.weight": "model-00039-of-00065.safetensors",
"model.layers.37.mlp.up_proj.weight": "model-00039-of-00065.safetensors",
"model.layers.37.post_attention_layernorm.weight": "model-00039-of-00065.safetensors",
"model.layers.37.self_attn.k_proj.weight": "model-00039-of-00065.safetensors",
"model.layers.37.self_attn.o_proj.weight": "model-00039-of-00065.safetensors",
"model.layers.37.self_attn.q_proj.weight": "model-00039-of-00065.safetensors",
"model.layers.37.self_attn.v_proj.weight": "model-00039-of-00065.safetensors",
"model.layers.38.input_layernorm.weight": "model-00039-of-00065.safetensors",
"model.layers.38.mlp.down_proj.weight": "model-00040-of-00065.safetensors",
"model.layers.38.mlp.gate_proj.weight": "model-00040-of-00065.safetensors",
"model.layers.38.mlp.up_proj.weight": "model-00040-of-00065.safetensors",
"model.layers.38.post_attention_layernorm.weight": "model-00040-of-00065.safetensors",
"model.layers.38.self_attn.k_proj.weight": "model-00040-of-00065.safetensors",
"model.layers.38.self_attn.o_proj.weight": "model-00040-of-00065.safetensors",
"model.layers.38.self_attn.q_proj.weight": "model-00040-of-00065.safetensors",
"model.layers.38.self_attn.v_proj.weight": "model-00040-of-00065.safetensors",
"model.layers.39.input_layernorm.weight": "model-00040-of-00065.safetensors",
"model.layers.39.mlp.down_proj.weight": "model-00041-of-00065.safetensors",
"model.layers.39.mlp.gate_proj.weight": "model-00041-of-00065.safetensors",
"model.layers.39.mlp.up_proj.weight": "model-00041-of-00065.safetensors",
"model.layers.39.post_attention_layernorm.weight": "model-00041-of-00065.safetensors",
"model.layers.39.self_attn.k_proj.weight": "model-00041-of-00065.safetensors",
"model.layers.39.self_attn.o_proj.weight": "model-00041-of-00065.safetensors",
"model.layers.39.self_attn.q_proj.weight": "model-00041-of-00065.safetensors",
"model.layers.39.self_attn.v_proj.weight": "model-00041-of-00065.safetensors",
"model.layers.4.input_layernorm.weight": "model-00005-of-00065.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00006-of-00065.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00006-of-00065.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00006-of-00065.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00006-of-00065.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00006-of-00065.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00006-of-00065.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00006-of-00065.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00006-of-00065.safetensors",
"model.layers.40.input_layernorm.weight": "model-00041-of-00065.safetensors",
"model.layers.40.mlp.down_proj.weight": "model-00042-of-00065.safetensors",
"model.layers.40.mlp.gate_proj.weight": "model-00042-of-00065.safetensors",
"model.layers.40.mlp.up_proj.weight": "model-00042-of-00065.safetensors",
"model.layers.40.post_attention_layernorm.weight": "model-00042-of-00065.safetensors",
"model.layers.40.self_attn.k_proj.weight": "model-00042-of-00065.safetensors",
"model.layers.40.self_attn.o_proj.weight": "model-00042-of-00065.safetensors",
"model.layers.40.self_attn.q_proj.weight": "model-00042-of-00065.safetensors",
"model.layers.40.self_attn.v_proj.weight": "model-00042-of-00065.safetensors",
"model.layers.41.input_layernorm.weight": "model-00042-of-00065.safetensors",
"model.layers.41.mlp.down_proj.weight": "model-00043-of-00065.safetensors",
"model.layers.41.mlp.gate_proj.weight": "model-00043-of-00065.safetensors",
"model.layers.41.mlp.up_proj.weight": "model-00043-of-00065.safetensors",
"model.layers.41.post_attention_layernorm.weight": "model-00043-of-00065.safetensors",
"model.layers.41.self_attn.k_proj.weight": "model-00043-of-00065.safetensors",
"model.layers.41.self_attn.o_proj.weight": "model-00043-of-00065.safetensors",
"model.layers.41.self_attn.q_proj.weight": "model-00043-of-00065.safetensors",
"model.layers.41.self_attn.v_proj.weight": "model-00043-of-00065.safetensors",
"model.layers.42.input_layernorm.weight": "model-00043-of-00065.safetensors",
"model.layers.42.mlp.down_proj.weight": "model-00044-of-00065.safetensors",
"model.layers.42.mlp.gate_proj.weight": "model-00044-of-00065.safetensors",
"model.layers.42.mlp.up_proj.weight": "model-00044-of-00065.safetensors",
"model.layers.42.post_attention_layernorm.weight": "model-00044-of-00065.safetensors",
"model.layers.42.self_attn.k_proj.weight": "model-00044-of-00065.safetensors",
"model.layers.42.self_attn.o_proj.weight": "model-00044-of-00065.safetensors",
"model.layers.42.self_attn.q_proj.weight": "model-00044-of-00065.safetensors",
"model.layers.42.self_attn.v_proj.weight": "model-00044-of-00065.safetensors",
"model.layers.43.input_layernorm.weight": "model-00044-of-00065.safetensors",
"model.layers.43.mlp.down_proj.weight": "model-00045-of-00065.safetensors",
"model.layers.43.mlp.gate_proj.weight": "model-00045-of-00065.safetensors",
"model.layers.43.mlp.up_proj.weight": "model-00045-of-00065.safetensors",
"model.layers.43.post_attention_layernorm.weight": "model-00045-of-00065.safetensors",
"model.layers.43.self_attn.k_proj.weight": "model-00045-of-00065.safetensors",
"model.layers.43.self_attn.o_proj.weight": "model-00045-of-00065.safetensors",
"model.layers.43.self_attn.q_proj.weight": "model-00045-of-00065.safetensors",
"model.layers.43.self_attn.v_proj.weight": "model-00045-of-00065.safetensors",
"model.layers.44.input_layernorm.weight": "model-00045-of-00065.safetensors",
"model.layers.44.mlp.down_proj.weight": "model-00046-of-00065.safetensors",
"model.layers.44.mlp.gate_proj.weight": "model-00046-of-00065.safetensors",
"model.layers.44.mlp.up_proj.weight": "model-00046-of-00065.safetensors",
"model.layers.44.post_attention_layernorm.weight": "model-00046-of-00065.safetensors",
"model.layers.44.self_attn.k_proj.weight": "model-00046-of-00065.safetensors",
"model.layers.44.self_attn.o_proj.weight": "model-00046-of-00065.safetensors",
"model.layers.44.self_attn.q_proj.weight": "model-00046-of-00065.safetensors",
"model.layers.44.self_attn.v_proj.weight": "model-00046-of-00065.safetensors",
"model.layers.45.input_layernorm.weight": "model-00046-of-00065.safetensors",
"model.layers.45.mlp.down_proj.weight": "model-00047-of-00065.safetensors",
"model.layers.45.mlp.gate_proj.weight": "model-00047-of-00065.safetensors",
"model.layers.45.mlp.up_proj.weight": "model-00047-of-00065.safetensors",
"model.layers.45.post_attention_layernorm.weight": "model-00047-of-00065.safetensors",
"model.layers.45.self_attn.k_proj.weight": "model-00047-of-00065.safetensors",
"model.layers.45.self_attn.o_proj.weight": "model-00047-of-00065.safetensors",
"model.layers.45.self_attn.q_proj.weight": "model-00047-of-00065.safetensors",
"model.layers.45.self_attn.v_proj.weight": "model-00047-of-00065.safetensors",
"model.layers.46.input_layernorm.weight": "model-00047-of-00065.safetensors",
"model.layers.46.mlp.down_proj.weight": "model-00048-of-00065.safetensors",
"model.layers.46.mlp.gate_proj.weight": "model-00048-of-00065.safetensors",
"model.layers.46.mlp.up_proj.weight": "model-00048-of-00065.safetensors",
"model.layers.46.post_attention_layernorm.weight": "model-00048-of-00065.safetensors",
"model.layers.46.self_attn.k_proj.weight": "model-00048-of-00065.safetensors",
"model.layers.46.self_attn.o_proj.weight": "model-00048-of-00065.safetensors",
"model.layers.46.self_attn.q_proj.weight": "model-00048-of-00065.safetensors",
"model.layers.46.self_attn.v_proj.weight": "model-00048-of-00065.safetensors",
"model.layers.47.input_layernorm.weight": "model-00048-of-00065.safetensors",
"model.layers.47.mlp.down_proj.weight": "model-00049-of-00065.safetensors",
"model.layers.47.mlp.gate_proj.weight": "model-00049-of-00065.safetensors",
"model.layers.47.mlp.up_proj.weight": "model-00049-of-00065.safetensors",
"model.layers.47.post_attention_layernorm.weight": "model-00049-of-00065.safetensors",
"model.layers.47.self_attn.k_proj.weight": "model-00049-of-00065.safetensors",
"model.layers.47.self_attn.o_proj.weight": "model-00049-of-00065.safetensors",
"model.layers.47.self_attn.q_proj.weight": "model-00049-of-00065.safetensors",
"model.layers.47.self_attn.v_proj.weight": "model-00049-of-00065.safetensors",
"model.layers.48.input_layernorm.weight": "model-00049-of-00065.safetensors",
"model.layers.48.mlp.down_proj.weight": "model-00050-of-00065.safetensors",
"model.layers.48.mlp.gate_proj.weight": "model-00050-of-00065.safetensors",
"model.layers.48.mlp.up_proj.weight": "model-00050-of-00065.safetensors",
"model.layers.48.post_attention_layernorm.weight": "model-00050-of-00065.safetensors",
"model.layers.48.self_attn.k_proj.weight": "model-00050-of-00065.safetensors",
"model.layers.48.self_attn.o_proj.weight": "model-00050-of-00065.safetensors",
"model.layers.48.self_attn.q_proj.weight": "model-00050-of-00065.safetensors",
"model.layers.48.self_attn.v_proj.weight": "model-00050-of-00065.safetensors",
"model.layers.49.input_layernorm.weight": "model-00050-of-00065.safetensors",
"model.layers.49.mlp.down_proj.weight": "model-00051-of-00065.safetensors",
"model.layers.49.mlp.gate_proj.weight": "model-00051-of-00065.safetensors",
"model.layers.49.mlp.up_proj.weight": "model-00051-of-00065.safetensors",
"model.layers.49.post_attention_layernorm.weight": "model-00051-of-00065.safetensors",
"model.layers.49.self_attn.k_proj.weight": "model-00051-of-00065.safetensors",
"model.layers.49.self_attn.o_proj.weight": "model-00051-of-00065.safetensors",
"model.layers.49.self_attn.q_proj.weight": "model-00051-of-00065.safetensors",
"model.layers.49.self_attn.v_proj.weight": "model-00051-of-00065.safetensors",
"model.layers.5.input_layernorm.weight": "model-00006-of-00065.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00007-of-00065.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00007-of-00065.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00007-of-00065.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00007-of-00065.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00007-of-00065.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00007-of-00065.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00007-of-00065.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00007-of-00065.safetensors",
"model.layers.50.input_layernorm.weight": "model-00051-of-00065.safetensors",
"model.layers.50.mlp.down_proj.weight": "model-00052-of-00065.safetensors",
"model.layers.50.mlp.gate_proj.weight": "model-00052-of-00065.safetensors",
"model.layers.50.mlp.up_proj.weight": "model-00052-of-00065.safetensors",
"model.layers.50.post_attention_layernorm.weight": "model-00052-of-00065.safetensors",
"model.layers.50.self_attn.k_proj.weight": "model-00052-of-00065.safetensors",
"model.layers.50.self_attn.o_proj.weight": "model-00052-of-00065.safetensors",
"model.layers.50.self_attn.q_proj.weight": "model-00052-of-00065.safetensors",
"model.layers.50.self_attn.v_proj.weight": "model-00052-of-00065.safetensors",
"model.layers.51.input_layernorm.weight": "model-00052-of-00065.safetensors",
"model.layers.51.mlp.down_proj.weight": "model-00053-of-00065.safetensors",
"model.layers.51.mlp.gate_proj.weight": "model-00053-of-00065.safetensors",
"model.layers.51.mlp.up_proj.weight": "model-00053-of-00065.safetensors",
"model.layers.51.post_attention_layernorm.weight": "model-00053-of-00065.safetensors",
"model.layers.51.self_attn.k_proj.weight": "model-00053-of-00065.safetensors",
"model.layers.51.self_attn.o_proj.weight": "model-00053-of-00065.safetensors",
"model.layers.51.self_attn.q_proj.weight": "model-00053-of-00065.safetensors",
"model.layers.51.self_attn.v_proj.weight": "model-00053-of-00065.safetensors",
"model.layers.52.input_layernorm.weight": "model-00053-of-00065.safetensors",
"model.layers.52.mlp.down_proj.weight": "model-00054-of-00065.safetensors",
"model.layers.52.mlp.gate_proj.weight": "model-00054-of-00065.safetensors",
"model.layers.52.mlp.up_proj.weight": "model-00054-of-00065.safetensors",
"model.layers.52.post_attention_layernorm.weight": "model-00054-of-00065.safetensors",
"model.layers.52.self_attn.k_proj.weight": "model-00054-of-00065.safetensors",
"model.layers.52.self_attn.o_proj.weight": "model-00054-of-00065.safetensors",
"model.layers.52.self_attn.q_proj.weight": "model-00054-of-00065.safetensors",
"model.layers.52.self_attn.v_proj.weight": "model-00054-of-00065.safetensors",
"model.layers.53.input_layernorm.weight": "model-00054-of-00065.safetensors",
"model.layers.53.mlp.down_proj.weight": "model-00055-of-00065.safetensors",
"model.layers.53.mlp.gate_proj.weight": "model-00055-of-00065.safetensors",
"model.layers.53.mlp.up_proj.weight": "model-00055-of-00065.safetensors",
"model.layers.53.post_attention_layernorm.weight": "model-00055-of-00065.safetensors",
"model.layers.53.self_attn.k_proj.weight": "model-00055-of-00065.safetensors",
"model.layers.53.self_attn.o_proj.weight": "model-00055-of-00065.safetensors",
"model.layers.53.self_attn.q_proj.weight": "model-00055-of-00065.safetensors",
"model.layers.53.self_attn.v_proj.weight": "model-00055-of-00065.safetensors",
"model.layers.54.input_layernorm.weight": "model-00055-of-00065.safetensors",
"model.layers.54.mlp.down_proj.weight": "model-00056-of-00065.safetensors",
"model.layers.54.mlp.gate_proj.weight": "model-00056-of-00065.safetensors",
"model.layers.54.mlp.up_proj.weight": "model-00056-of-00065.safetensors",
"model.layers.54.post_attention_layernorm.weight": "model-00056-of-00065.safetensors",
"model.layers.54.self_attn.k_proj.weight": "model-00056-of-00065.safetensors",
"model.layers.54.self_attn.o_proj.weight": "model-00056-of-00065.safetensors",
"model.layers.54.self_attn.q_proj.weight": "model-00056-of-00065.safetensors",
"model.layers.54.self_attn.v_proj.weight": "model-00056-of-00065.safetensors",
"model.layers.55.input_layernorm.weight": "model-00056-of-00065.safetensors",
"model.layers.55.mlp.down_proj.weight": "model-00057-of-00065.safetensors",
"model.layers.55.mlp.gate_proj.weight": "model-00057-of-00065.safetensors",
"model.layers.55.mlp.up_proj.weight": "model-00057-of-00065.safetensors",
"model.layers.55.post_attention_layernorm.weight": "model-00057-of-00065.safetensors",
"model.layers.55.self_attn.k_proj.weight": "model-00057-of-00065.safetensors",
"model.layers.55.self_attn.o_proj.weight": "model-00057-of-00065.safetensors",
"model.layers.55.self_attn.q_proj.weight": "model-00057-of-00065.safetensors",
"model.layers.55.self_attn.v_proj.weight": "model-00057-of-00065.safetensors",
"model.layers.56.input_layernorm.weight": "model-00057-of-00065.safetensors",
"model.layers.56.mlp.down_proj.weight": "model-00058-of-00065.safetensors",
"model.layers.56.mlp.gate_proj.weight": "model-00058-of-00065.safetensors",
"model.layers.56.mlp.up_proj.weight": "model-00058-of-00065.safetensors",
"model.layers.56.post_attention_layernorm.weight": "model-00058-of-00065.safetensors",
"model.layers.56.self_attn.k_proj.weight": "model-00058-of-00065.safetensors",
"model.layers.56.self_attn.o_proj.weight": "model-00058-of-00065.safetensors",
"model.layers.56.self_attn.q_proj.weight": "model-00058-of-00065.safetensors",
"model.layers.56.self_attn.v_proj.weight": "model-00058-of-00065.safetensors",
"model.layers.57.input_layernorm.weight": "model-00058-of-00065.safetensors",
"model.layers.57.mlp.down_proj.weight": "model-00059-of-00065.safetensors",
"model.layers.57.mlp.gate_proj.weight": "model-00059-of-00065.safetensors",
"model.layers.57.mlp.up_proj.weight": "model-00059-of-00065.safetensors",
"model.layers.57.post_attention_layernorm.weight": "model-00059-of-00065.safetensors",
"model.layers.57.self_attn.k_proj.weight": "model-00059-of-00065.safetensors",
"model.layers.57.self_attn.o_proj.weight": "model-00059-of-00065.safetensors",
"model.layers.57.self_attn.q_proj.weight": "model-00059-of-00065.safetensors",
"model.layers.57.self_attn.v_proj.weight": "model-00059-of-00065.safetensors",
"model.layers.58.input_layernorm.weight": "model-00059-of-00065.safetensors",
"model.layers.58.mlp.down_proj.weight": "model-00060-of-00065.safetensors",
"model.layers.58.mlp.gate_proj.weight": "model-00060-of-00065.safetensors",
"model.layers.58.mlp.up_proj.weight": "model-00060-of-00065.safetensors",
"model.layers.58.post_attention_layernorm.weight": "model-00060-of-00065.safetensors",
"model.layers.58.self_attn.k_proj.weight": "model-00060-of-00065.safetensors",
"model.layers.58.self_attn.o_proj.weight": "model-00060-of-00065.safetensors",
"model.layers.58.self_attn.q_proj.weight": "model-00060-of-00065.safetensors",
"model.layers.58.self_attn.v_proj.weight": "model-00060-of-00065.safetensors",
"model.layers.59.input_layernorm.weight": "model-00060-of-00065.safetensors",
"model.layers.59.mlp.down_proj.weight": "model-00061-of-00065.safetensors",
"model.layers.59.mlp.gate_proj.weight": "model-00061-of-00065.safetensors",
"model.layers.59.mlp.up_proj.weight": "model-00061-of-00065.safetensors",
"model.layers.59.post_attention_layernorm.weight": "model-00061-of-00065.safetensors",
"model.layers.59.self_attn.k_proj.weight": "model-00061-of-00065.safetensors",
"model.layers.59.self_attn.o_proj.weight": "model-00061-of-00065.safetensors",
"model.layers.59.self_attn.q_proj.weight": "model-00061-of-00065.safetensors",
"model.layers.59.self_attn.v_proj.weight": "model-00061-of-00065.safetensors",
"model.layers.6.input_layernorm.weight": "model-00007-of-00065.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00008-of-00065.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00008-of-00065.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00008-of-00065.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00008-of-00065.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00008-of-00065.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00008-of-00065.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00008-of-00065.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00008-of-00065.safetensors",
"model.layers.60.input_layernorm.weight": "model-00061-of-00065.safetensors",
"model.layers.60.mlp.down_proj.weight": "model-00062-of-00065.safetensors",
"model.layers.60.mlp.gate_proj.weight": "model-00062-of-00065.safetensors",
"model.layers.60.mlp.up_proj.weight": "model-00062-of-00065.safetensors",
"model.layers.60.post_attention_layernorm.weight": "model-00062-of-00065.safetensors",
"model.layers.60.self_attn.k_proj.weight": "model-00062-of-00065.safetensors",
"model.layers.60.self_attn.o_proj.weight": "model-00062-of-00065.safetensors",
"model.layers.60.self_attn.q_proj.weight": "model-00062-of-00065.safetensors",
"model.layers.60.self_attn.v_proj.weight": "model-00062-of-00065.safetensors",
"model.layers.61.input_layernorm.weight": "model-00062-of-00065.safetensors",
"model.layers.61.mlp.down_proj.weight": "model-00063-of-00065.safetensors",
"model.layers.61.mlp.gate_proj.weight": "model-00063-of-00065.safetensors",
"model.layers.61.mlp.up_proj.weight": "model-00063-of-00065.safetensors",
"model.layers.61.post_attention_layernorm.weight": "model-00063-of-00065.safetensors",
"model.layers.61.self_attn.k_proj.weight": "model-00063-of-00065.safetensors",
"model.layers.61.self_attn.o_proj.weight": "model-00063-of-00065.safetensors",
"model.layers.61.self_attn.q_proj.weight": "model-00063-of-00065.safetensors",
"model.layers.61.self_attn.v_proj.weight": "model-00063-of-00065.safetensors",
"model.layers.62.input_layernorm.weight": "model-00063-of-00065.safetensors",
"model.layers.62.mlp.down_proj.weight": "model-00064-of-00065.safetensors",
"model.layers.62.mlp.gate_proj.weight": "model-00064-of-00065.safetensors",
"model.layers.62.mlp.up_proj.weight": "model-00064-of-00065.safetensors",
"model.layers.62.post_attention_layernorm.weight": "model-00064-of-00065.safetensors",
"model.layers.62.self_attn.k_proj.weight": "model-00064-of-00065.safetensors",
"model.layers.62.self_attn.o_proj.weight": "model-00064-of-00065.safetensors",
"model.layers.62.self_attn.q_proj.weight": "model-00064-of-00065.safetensors",
"model.layers.62.self_attn.v_proj.weight": "model-00064-of-00065.safetensors",
"model.layers.63.input_layernorm.weight": "model-00064-of-00065.safetensors",
"model.layers.63.mlp.down_proj.weight": "model-00065-of-00065.safetensors",
"model.layers.63.mlp.gate_proj.weight": "model-00065-of-00065.safetensors",
"model.layers.63.mlp.up_proj.weight": "model-00065-of-00065.safetensors",
"model.layers.63.post_attention_layernorm.weight": "model-00065-of-00065.safetensors",
"model.layers.63.self_attn.k_proj.weight": "model-00065-of-00065.safetensors",
"model.layers.63.self_attn.o_proj.weight": "model-00065-of-00065.safetensors",
"model.layers.63.self_attn.q_proj.weight": "model-00065-of-00065.safetensors",
"model.layers.63.self_attn.v_proj.weight": "model-00065-of-00065.safetensors",
"model.layers.7.input_layernorm.weight": "model-00008-of-00065.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00009-of-00065.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00009-of-00065.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00009-of-00065.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00009-of-00065.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00009-of-00065.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00009-of-00065.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00009-of-00065.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00009-of-00065.safetensors",
"model.layers.8.input_layernorm.weight": "model-00009-of-00065.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00010-of-00065.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00010-of-00065.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00010-of-00065.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00010-of-00065.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00010-of-00065.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00010-of-00065.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00010-of-00065.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00010-of-00065.safetensors",
"model.layers.9.input_layernorm.weight": "model-00010-of-00065.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00011-of-00065.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00011-of-00065.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00011-of-00065.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00011-of-00065.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00011-of-00065.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00011-of-00065.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00011-of-00065.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00011-of-00065.safetensors",
"model.norm.weight": "model-00065-of-00065.safetensors"
}
}

30
special_tokens_map.json Normal file
View File

@@ -0,0 +1,30 @@
{
"bos_token": {
"content": "<|end_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|end_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<|unk|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

501276
tokenizer.json Normal file

File diff suppressed because it is too large Load Diff

15
tokenizer_config.json Normal file
View File

@@ -0,0 +1,15 @@
{
"add_prefix_space": false,
"backend": "tokenizers",
"bos_token": "<|end_of_text|>",
"clean_up_tokenization_spaces": false,
"eos_token": "<|end_of_text|>",
"errors": "replace",
"is_local": true,
"local_files_only": false,
"model_max_length": 1000000000000000019884624838656,
"pad_token": "<|pad|>",
"padding_side": "left",
"tokenizer_class": "GPT2Tokenizer",
"unk_token": "<|unk|>"
}

1
vocab.json Normal file

File diff suppressed because one or more lines are too long