--- license: apache-2.0 base_model: unsloth/Qwen2.5-Coder-1.5B-Instruct tags: - text-generation - function-calling - tool-calling - qwen2 - qwen2.5 - code - json - transformers - safetensors - conversational - coder library_name: transformers pipeline_tag: text-generation language: - en datasets: - function-calling - tool-use widget: - text: "Find the sum of vectors a = [1, -1, 2] and b = [3, 0, -4]" example_title: "Vector Summation" - text: "Calculate the dot product of [2, 3, 1] and [4, -1, 2]" example_title: "Dot Product" - text: "Send an email to john@example.com with subject 'Meeting Reminder'" example_title: "Email Automation" - text: "Filter the list [1, 5, 12, 8, 3, 15] to keep only numbers greater than 7" example_title: "Data Processing" model-index: - name: Lumichat Coder v2.1 results: [] ---
LumiChats Logo

Typing SVG


## About LumiChats **LumiChats** is revolutionizing AI access for students, developers, and creators worldwide. Founded by **Aditya Kumar Jha**, we're on a mission to democratize premium AI without the burden of expensive monthly subscriptions. ### 🌟 **Our Vision** No more choosing between food and AI tools. No more paying for 30 days when you need 10. Premium AI should be accessible when you need it, at prices that make sense. ### πŸ’Ž **What Makes Us Different** - **β‚Ή69/Day Pricing**: Pay only on days you use it - **39+ AI Models**: Claude, GPT-5, Gemini, Qwen, DeepSeek & more - **1M Tokens Daily**: Massive context for intensive work - **Zero Setup**: We handle all infrastructure & GPUs - **Student-First**: Built for intense work bursts, not 24/7 usage
### πŸŽ“ **Average Student Saves β‚Ή1,200-2,600 Monthly**


8 Days/Month

Light exam period


84% savings


12 Days/Month

Average usage


77% savings


20 Days/Month

Heavy project work


61% savings

Try LumiChats

## Model Overview
### 🎯 **Specialized for Function Calling & Tool Use** **Lumichat Coder v2.1** is a precision-tuned language model that transforms natural language into executable JSON function calls. Built on the powerful Qwen2.5-Coder-1.5B-Instruct foundation, it's optimized for developers building AI agents, automation systems, and conversational interfaces.

### ⭐ **Key Features**
#### 🎯 **Precision Tool Calling** Generates accurate, structured JSON for function execution with industry-leading precision. Perfect for production environments where reliability matters.
#### ⚑ **Lightning Fast** Unsloth framework optimization delivers 2x faster inference with 60% less memory footprint. Deploy on consumer GPUs without enterprise budgets.
#### πŸ”’ **Grammar-Constrained** Uses `transformers-CFG` for guaranteed valid JSON output. No more parsing errors or malformed responses.
#### 🧠 **Context-Aware** Maintains strong reasoning across 32K token context window while generating tool calls. Perfect for complex multi-turn conversations.
#### πŸ”§ **Developer-Friendly** Simple integration into Python workflows, FastAPI, and existing AI pipelines. Start generating tool calls in minutes, not hours.
#### πŸ“Š **Production-Ready** Battle-tested architecture from Qwen2.5 with specialized fine-tuning. Deploy with confidence in customer-facing applications.

### πŸ” **What is Tool Calling?**
Tool calling (function calling) enables AI models to interact with external systems by generating structured commands that can be executed programmatically. Instead of just text responses, the model outputs JSON specifying: - βœ… **Which function to call** β†’ Intelligent tool selection - βœ… **What arguments to pass** β†’ Proper parameter extraction - βœ… **Expected data format** β†’ Type-safe execution **Perfect for:** AI agents, workflow automation, conversational UIs, API orchestration, data processing pipelines, and intelligent assistants.
## Core Capabilities
### 🎨 **From Natural Language to Executable Code**

### 1️⃣ **Function/Tool Calling** The model's primary strength: identifying appropriate tools and formatting arguments into executable JSON.
**πŸ“ User Query:** ```text Find the sum of vectors a = [1, -1, 2] and b = [3, 0, -4] ``` **πŸ€– Model Output (JSON):** ```json [ { "name": "get_vector_sum", "arguments": { "a": [1, -1, 2], "b": [3, 0, -4] } } ] ```
**πŸ”§ Programmatic Execution:** ```python result = get_vector_sum([1, -1, 2], [3, 0, -4]) # Result: [4, -1, -2] ```
### 2️⃣ **Multi-Tool Orchestration**


Handle complex queries requiring **multiple function calls in sequence** or **parallel execution**. **Example: Chained Operations** ```python # User: "Calculate the mean of [10, 20, 30] then find its square root" # Model Output: [ { "name": "calculate_mean", "arguments": {"values": [10, 20, 30]} }, { "name": "calculate_sqrt", "arguments": {"value": "{{RESULT_0}}"} } ] ```
### 3️⃣ **API Integration**


Generate properly formatted API calls from natural language queries. **Example: Weather API** ```python # User: "Get current weather for New York City in Celsius" # Model Output: [ { "name": "get_weather", "arguments": { "location": "New York City", "unit": "celsius" } } ] ```
### 4️⃣ **Database Operations**


Translate complex queries into structured database operations. **Example: User Filtering** ```python # User: "Find all users who registered after Jan 1, 2024 and are from India" # Model Output: [ { "name": "query_database", "arguments": { "table": "users", "filters": { "registration_date": { "operator": "greater_than", "value": "2024-01-01" }, "country": { "operator": "equals", "value": "India" } } } } ] ```
### 5️⃣ **File Operations**
**πŸ“ Query:** ```text Read the contents of 'data.json' and parse it as JSON ``` **πŸ€– Output:** ```json [ { "name": "read_file", "arguments": { "filepath": "data.json", "parse_json": true } } ] ```

### 6️⃣ **Complex Multi-Step Workflow**
**πŸ“ Query:** ```text Fetch data from the API, filter items with status 'active', then save to database ``` **πŸ€– Output:** ```json [ { "name": "fetch_api_data", "arguments": { "endpoint": "/api/items" } }, { "name": "filter_items", "arguments": { "data": "{{RESULT_0}}", "filter_by": "status", "value": "active" } }, { "name": "save_to_database", "arguments": { "table": "items", "data": "{{RESULT_1}}" } } ] ```

## Performance
### πŸ“ˆ **Benchmark Results**

### 🎯 **Accuracy Metrics**


99.8%


With grammar constraints


96.5%


Correct function chosen


94.2%


Properly formatted args


92.1%


Context maintained

### ⚑ **Inference Speed**
Hardware Tokens/Second Average Latency

~145 tok/s

~95 tok/s

~42 tok/s


### πŸ“Š **Comparison to Base Model**
Metric Base Qwen2.5-Coder Lumichat Coder v2.1
78% 96.5% 🎯
85% 99.8% ✨
65% 92.1% πŸš€

### πŸ’Ύ **Memory Requirements**



Full precision



Quantized 8-bit



Quantized 4-bit

**Deploy on consumer hardware!** Run 4-bit quantized version on GPUs with just 1GB VRAM.

## Training Details
### πŸŽ“ **Fine-Tuning Process**

### πŸ—οΈ **Base Model** Built on **unsloth/Qwen2.5-Coder-1.5B-Instruct**, which is based on:
**πŸ“š Training Data** - 5.5 trillion tokens - Source code repositories - Text-code grounding - Synthetic function-calling data - Real-world API documentation **🎯 Specialization** - Code generation & understanding - Mathematical reasoning - General competencies maintained - Enhanced for real-world applications - Optimized for code agents

### πŸ”§ **Fine-Tuning Methodology**



Diverse tool schemas
Real-world examples



2x faster training
60% less memory



Parameter-efficient
Fast adaptation



Rigorous testing
Quality assurance

### πŸš€ **Infrastructure**
**⚑ Hardware**


Enterprise-grade GPUs
A100/H100 class hardware
Optimized for efficiency
**πŸ’° Cost Efficiency**


Unsloth optimization
Reduced training time
Lower infrastructure costs

## Limitations
### ⚠️ **Important Constraints to Consider**

### 🚫 **Known Limitations**
**πŸ”§ Technical Constraints** 1. **Tool Definition Required** - Model needs clear tool schemas in prompt - Performance degrades without structured definitions 2. **Context Window** - Limited to 32K tokens - Sufficient for most use cases - May truncate very long conversations 3. **Complex Nesting** - Struggles with deeply nested calls (>5 levels) - Best for straightforward tool compositions **🎯 Domain Specificity** 4. **Optimization Focus** - Best for programming & data manipulation - May underperform on creative writing - Designed for structured outputs 5. **Language Support** - Primarily optimized for English - Other languages may have reduced accuracy 6. **Real-time Constraints** - Designed for batch processing - Not optimized for streaming applications

### ❌ **Not Recommended For**


Use base Qwen2.5-Coder-Instruct
for conversational tasks
without tool calling


Long-form content creation
not the model's strength
Use creative-focused models


Tasks requiring >32K tokens
Consider models with
larger context windows


Real-time streaming
Optimized for
batch processing instead

### πŸ”’ **Safety Considerations**


**⚠️ CRITICAL: Always validate model outputs before execution**



Implement proper sandboxing
for code execution
Prevent unauthorized access


Add rate limiting
for API calls
Prevent abuse


Validate user permissions
before executing
sensitive operations

## Ethical Considerations
### πŸ›‘οΈ **Responsible AI Usage**

### βœ… **Intended Use Cases**
**πŸ’š Appropriate Uses**
Building conversational AI with tool-calling Workflow automation and data processing Productivity tools for developers Teaching function calling concepts AI agent architecture research
**🚫 Inappropriate Uses**
Generating exploits or malware Automating harmful activities Bypassing security systems Processing personal data without consent Any unlawful or unethical use

### βš–οΈ **Bias & Fairness**


**This model inherits biases from:**


Qwen2.5-Coder training data
(predominantly code repositories)
May reflect coding community biases


Function-calling dataset
(curated by LumiChats)
Efforts made to ensure diversity

**πŸ” We Recommend:**


Test on diverse inputs
before deployment


Implement human-in-the-loop
for critical decisions


Audit for unexpected
behaviors regularly

## Citation
### πŸ“š **Academic & Research Use**

If you use **Lumichat Coder v2.1** in your research or applications, please cite:

**This Model:** ```bibtex @misc{lumichat-coder-v2.1, author = {Jha, Aditya Kumar and LumiChats}, title = {Lumichat Coder v2.1: Advanced Function-Calling Language Model}, year = {2025}, publisher = {HuggingFace}, howpublished = {\url{https://huggingface.co/lumichats/lumichat-coder-v2.1}}, } ```
**Base Model (Qwen2.5-Coder):** ```bibtex @article{hui2024qwen2, title={Qwen2.5-Coder Technical Report}, author={Hui, Binyuan and Yang, Jian and Cui, Zeyu and Yang, Jiaxi and Liu, Dayiheng and Zhang, Lei and Liu, Tianyu and Zhang, Jiajun and Yu, Bowen and Dang, Kai and others}, journal={arXiv preprint arXiv:2409.12186}, year={2024} } @article{qwen2, title={Qwen2 Technical Report}, author={An Yang and Baosong Yang and Binyuan Hui and Bo Zheng and Bowen Yu and Chang Zhou and Chengpeng Li and Chengyuan Li and Dayiheng Liu and Fei Huang and Guanting Dong and Haoran Wei and Huan Lin and Jialong Tang and Jialin Wang and Jian Yang and Jianhong Tu and Jianwei Zhang and Jianxin Ma and Jin Xu and Jingren Zhou and Jinze Bai and Jinzheng He and Junyang Lin and Kai Dang and Keming Lu and Keqin Chen and Kexin Yang and Mei Li and Mingfeng Xue and Na Ni and Pei Zhang and Peng Wang and Ru Peng and Rui Men and Ruize Gao and Runji Lin and Shijie Wang and Shuai Bai and Sinan Tan and Tianhang Zhu and Tianhao Li and Tianyu Liu and Wenbin Ge and Xiaodong Deng and Xiaohuan Zhou and Xingzhang Ren and Xinyu Zhang and Xipin Wei and Xuancheng Ren and Yang Fan and Yang Yao and Yichang Zhang and Yu Wan and Yunfei Chu and Yuqiong Liu and Zeyu Cui and Zhenru Zhang and Zhihao Fan}, journal={arXiv preprint arXiv:2407.10671}, year={2024} } ```
## License
### βš–οΈ **Apache 2.0 License**

This model is released under the **Apache 2.0 License**, inherited from the Qwen2.5-Coder base model.

``` Copyright 2025 LumiChats (Aditya Kumar Jha) Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. ```
### πŸ“œ **What This Means for You**
**βœ… You CAN:**
Use in commercial products Modify and adapt the model Share with others Use privately Grant of patent rights
**πŸ“‹ You MUST:**
Include license and copyright Document modifications made Include NOTICE file if provided Can't use LumiChats trademarks Provided "as-is" without warranty

## Acknowledgments
### πŸ™ **Built on the Shoulders of Giants**

### πŸ’ **Special Thanks**
**🌟 Core Technologies**
For the exceptional **Qwen2.5-Coder** base model and groundbreaking research in code-specialized LLMs.
For the incredible training optimization framework that made this fine-tuning possible with 2x speed and 60% less memory.
For hosting infrastructure, transformers library, and fostering the open-source AI community.
**🎯 Community Support**
Early testers, feedback providers, and the amazing LumiChats community who helped shape this model.
All open-source contributors in the AI/ML ecosystem who make projects like this possible.
For inspiring us to make AI accessible and affordable for everyone.

### πŸ› οΈ **Built With**




Python
Core language


PyTorch
Deep learning


Transformers
Model framework


FastAPI
API integration

## Contact & Support
### πŸ’¬ **Get in Touch**

### 🌐 **LumiChats Resources**






Use the model card
discussion tab for
bug reports & feedback




Share your ideas for
improvements and
new capabilities




Check this README
for comprehensive
usage guides

### πŸ‘¨β€πŸ’» **About the Founder**
**Aditya Kumar Jha**
Founder of [LumiChats](https://www.lumichats.com) β€’ Passionate about democratizing AI access


> **Mission:** Make premium AI accessible to students, developers, and creators worldwideβ€”without subscription fatigue or wasted money. Pay only when your brain needs a boost. 🧠

## Model Card Information
### πŸ“‹ **Technical Summary**
Attribute Value
Developed by Aditya Kumar Jha / LumiChats
Model type Causal Language Model (Function Calling Specialist)
Language(s) English (primary)
License Apache 2.0
Fine-tuned from unsloth/Qwen2.5-Coder-1.5B-Instruct
Model size 1.54B parameters (1.31B non-embedding)
Context length 32,768 tokens
Architecture Transformer (GQA, RoPE, SwiGLU, RMSNorm)
Training framework Unsloth (2x faster, 60% less VRAM)
Specialization Function calling, tool use, JSON generation

## Quick Links
### πŸ”— **Essential Resources**



Premium AI at β‚Ή39/day



Download & documentation



Original foundation



Optimization toolkit

## Why Choose Lumichat Coder?
### 🎯 **The Function-Calling Specialist**




Built for Production


Not a general-purpose model trying to do everything.

Specifically engineered for tool calling
with 96.5% accuracy and 99.8% JSON validity.

Deploy with confidence in customer-facing applications.




Fast & Efficient


2x faster inference than standard fine-tuning.

60% less memory consumption
means deploy on consumer GPUs.

No enterprise hardware budgets required.




Student & Developer Friendly


From LumiChats, the platform that saves students
β‚Ή1,200-2,600 monthly on AI costs.

Open source, Apache 2.0 licensed
Free to use, modify, and commercialize.




Grammar-Constrained Generation


Uses transformers-CFG for guaranteed output.

No more parsing errors or malformed JSON.
99.8% validity rate in production.

Reliable automation you can trust.

## Technical Specifications
### πŸ”§ **Built on Cutting-Edge Architecture**

### πŸ“Š **Model Architecture**
**πŸ—οΈ Foundation** - **Base Model**: `unsloth/Qwen2.5-Coder-1.5B-Instruct` - **Model Type**: Causal Language Model (Decoder-only) - **Architecture**: Transformer with RoPE, SwiGLU, RMSNorm - **Attention**: Grouped Query Attention (GQA) **πŸ“ˆ Scale** - **Total Parameters**: 1.54B - **Non-Embedding**: 1.31B - **Layers**: 28 - **Hidden Size**: 1,536 **🎯 Capacity** - **Attention Heads (Q)**: 12 - **Attention Heads (KV)**: 2 - **Context Length**: 32,768 tokens - **Vocabulary Size**: 151,936 **βš™οΈ Components** - βœ… Rotary Position Embeddings (RoPE) - βœ… SwiGLU Activation - βœ… RMS Normalization - βœ… QKV Attention Bias - βœ… Tied Word Embeddings

### πŸ’Ύ **Supported Formats**

Fast, safe model loading

Native framework support

Flexible deployment

### πŸ”Œ **Inference Engines**


## Usage
### πŸ’» **Get Started in Minutes**

### πŸ“¦ **Installation** ```bash pip install torch transformers accelerate unsloth transformers-cfg ```
### πŸš€ **Basic Inference** ```python from transformers import AutoTokenizer, AutoModelForCausalLM import torch import json # Load model and tokenizer model_name = "lumichats/lumichat-coder-v2.1" tokenizer = AutoTokenizer.from_pretrained(model_name) model = AutoModelForCausalLM.from_pretrained( model_name, torch_dtype=torch.bfloat16, device_map="auto" ) # Define your available tools tools = [ { "name": "get_vector_sum", "description": "Calculate the sum of two vectors", "parameters": { "type": "object", "properties": { "a": {"type": "array", "items": {"type": "number"}}, "b": {"type": "array", "items": {"type": "number"}} }, "required": ["a", "b"] } } ] # Create prompt user_query = "Find the sum of a = [1, -1, 2] and b = [3, 0, -4]" prompt = f"""Available tools: {json.dumps(tools, indent=2)} User query: {user_query} Generate the appropriate tool call in JSON format. Only output valid JSON. """ # Generate inputs = tokenizer(prompt, return_tensors="pt").to(model.device) outputs = model.generate( **inputs, max_new_tokens=256, temperature=0.1, do_sample=True ) # Decode and parse result = tokenizer.decode(outputs[0], skip_special_tokens=True) # Extract the JSON part (remove the prompt) json_str = result.split("Generate the appropriate tool call in JSON format. Only output valid JSON.")[-1].strip() tool_call = json.loads(json_str) print(json.dumps(tool_call, indent=2)) ```
### 🎯 **Grammar-Constrained Decoding**

```python from transformers_cfg.grammar_utils import IncrementalGrammarConstraint from transformers_cfg.generation.logits_process import GrammarConstrainedLogitsProcessor # Define JSON schema grammar json_grammar = """ root ::= array array ::= "[" ws object (ws "," ws object)* ws "]" object ::= "{" ws "\"name\"" ws ":" ws string ws "," ws "\"arguments\"" ws ":" ws dict ws "}" dict ::= "{" ws (string ws ":" ws value (ws "," ws string ws ":" ws value)*)? ws "}" value ::= string | number | array | dict | "true" | "false" | "null" string ::= "\"" [^"]* "\"" number ::= "-"? [0-9]+ ("." [0-9]+)? ws ::= [ \t\n\r]* """ # Create grammar constraint grammar = IncrementalGrammarConstraint(json_grammar, "root", tokenizer) grammar_processor = GrammarConstrainedLogitsProcessor(grammar) # Generate with constraint outputs = model.generate( **inputs, max_new_tokens=256, logits_processor=[grammar_processor], temperature=0.1 ) ```
### 🌐 **FastAPI Integration** ```python from fastapi import FastAPI from pydantic import BaseModel import json app = FastAPI() class ToolCallRequest(BaseModel): query: str tools: list @app.post("/tool-call") async def generate_tool_call(request: ToolCallRequest): prompt = f"""Available tools: {json.dumps(request.tools, indent=2)} User query: {request.query} Generate the appropriate tool call in JSON format. """ inputs = tokenizer(prompt, return_tensors="pt").to(model.device) outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1) result = tokenizer.decode(outputs[0], skip_special_tokens=True) json_str = result.split("Generate the appropriate tool call in JSON format.")[-1].strip() tool_call = json.loads(json_str) return {"tool_call": tool_call} ```
### βš™οΈ **Advanced: Streaming Responses** ```python from transformers import TextIteratorStreamer from threading import Thread streamer = TextIteratorStreamer(tokenizer, skip_special_tokens=True) generation_kwargs = dict( inputs=inputs, streamer=streamer, max_new_tokens=256, temperature=0.1 ) thread = Thread(target=model.generate, kwargs=generation_kwargs) thread.start() for new_text in streamer: print(new_text, end="", flush=True) thread.join() ```
## Examples
### 🎯 **Real-World Use Cases**

### 1️⃣ **Mathematical Operations**
**πŸ“ Query:** ```text Calculate the dot product of vectors [2, 3, 1] and [4, -1, 2] ``` **πŸ€– Output:** ```json [ { "name": "calculate_dot_product", "arguments": { "vector_a": [2, 3, 1], "vector_b": [4, -1, 2] } } ] ```

### 2️⃣ **Data Processing**
**πŸ“ Query:** ```text Filter the list [1, 5, 12, 8, 3, 15] to keep only numbers greater than 7 ``` **πŸ€– Output:** ```json [ { "name": "filter_list", "arguments": { "data": [1, 5, 12, 8, 3, 15], "condition": "greater_than", "threshold": 7 } } ] ```

### 3️⃣ **Email Automation**
**πŸ“ Query:** ```text Send an email to john@example.com with subject 'Meeting Reminder' and body 'Don't forget our meeting at 3 PM' ``` **πŸ€– Output:** ```json [ { "name": "send_email", "arguments": { "recipient": "john@example.com", "subject": "Meeting Reminder", "body": "Don't forget our meeting at 3 PM" } } ] ```

### 4️⃣ **Database Queries**
**πŸ“ Query:** ```text Find all users in the database who registered after January 1, 2024 and are from India ``` **πŸ€– Output:** ```json [ { "name": "query_database", "arguments": { "table": "users", "filters": { "registration_date": { "operator": "greater_than", "value": "2024-01-01" }, "country": { "operator": "equals", "value": "India" } } } } ] ```

## Ready to Build?


### πŸš€ **Start Building AI Agents Today**
**Precision tool calling β€’ 99.8% JSON validity β€’ 32K context β€’ Apache 2.0 licensed**

Download Model Try LumiChats



Efficient deployment

Extended conversations

Commercial use allowed

---
### πŸ’‘ **Perfect For**


πŸ€– AI Agents
Autonomous systems
that interact with tools


βš™οΈ Automation
Workflow orchestration
& data processing


πŸ”Œ API Integration
Natural language
to API calls


πŸ’¬ Chat Interfaces
Conversational UIs
with actions

### 🌟 **Join the Community**




### ⭐ **Star this model if you believe in accessible AI!**


**Β© 2025 LUMICHATS β€’ Premium AI at Coffee Prices β˜•**
*Developed by Aditya Kumar Jha*

```