---
language:
- en
tags:
- robotics
- ros2
- json
- unsloth
- qwen
- text-generation
- hardware
base_model: unsloth/Qwen2.5-0.5B-Instruct
pipeline_tag: text-generation
license: apache-2.0
---

# TASX-Command-0.5B
**TASX-Command-0.5B** is a lightweight language model built specifically for robotics. It translates natural language (including slang, typos, and complex phrasing) into strict, execution-ready JSON command sequences for ROS2, SLAM, and physical robot control.
By fine-tuning the **Qwen2.5-0.5B** base model, we created a "robot brain" that is small and fast enough to run locally on edge hardware (like a Raspberry Pi) via `llama.cpp` while retaining the intelligence to understand complex human intent.
## 📦 Quantized Versions (GGUF)

For high-performance inference, use these GGUF quantizations:

* **[TASX-Cmd-0.5B-GGUF (mradermacher)](https://huggingface.co/mradermacher/TASX-Cmd-0.5B-GGUF)** — *Includes high-quality iMatrix and IQ quants.*
* **[TASX-Cmd-0.5B-Q8_0 (ReXeeD)](https://huggingface.co/ReXeeD/TASX-Cmd-0.5B-GGUF)** — *Standard high-precision 8-bit quantization.*

## 🌟 Key Features

* **Strict JSON Output:** Never outputs conversational filler; only valid JSON arrays.
* **Typo & Slang Immunity:** Successfully maps messy speech (e.g., "scoot forward lik 3 point 5 meeters") to correct floats and commands.
* **Dynamic Location Extraction:** Converts any spoken room or location name (e.g., "Professor Xavier's Office") into clean `snake_case` (e.g., `professor_xavier_office`).
* **Physical Constraint Logic:** Automatically generates implicit macro sequences (like `sit` -> `stand` -> `move`) for fetching and delivering items without needing explicit user instruction.

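The same `snake_case` convention can be mirrored on the host side when registering named waypoints with the navigation stack. This is an illustrative sketch of the mapping (the model performs this conversion itself during generation; `to_waypoint` is a hypothetical helper, not part of this repository):

```python
import re

def to_waypoint(name: str) -> str:
    """Normalize a spoken location name into the snake_case
    waypoint format described above (illustrative only)."""
    name = name.lower()
    name = re.sub(r"'s\b", "", name)         # "xavier's" -> "xavier"
    name = re.sub(r"[^a-z0-9]+", "_", name)  # spaces/punctuation -> "_"
    return name.strip("_")

print(to_waypoint("Professor Xavier's Office"))  # -> professor_xavier_office
```

Registering your waypoints under names produced by the same rule ensures the model's `target` strings match your map.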
---
## 🛠️ Supported Actions & Commands
The model is trained to strictly output one or more of the following commands, formatted as a JSON array of `actions`.
### 1. Teleop (Movement & Speed)
* `{"type": "teleop", "cmd": "move_forward", "distance": <float>}`
* `{"type": "teleop", "cmd": "move_backward", "distance": <float>}`
* `{"type": "teleop", "cmd": "rotate_left", "angle": <float>}`
* `{"type": "teleop", "cmd": "rotate_right", "angle": <float>}`
* `{"type": "teleop", "cmd": "set_speed", "level": "slow" | "normal" | "fast"}`
* `{"type": "teleop", "cmd": "stop"}` *(For casual pauses)*
* `{"type": "teleop", "cmd": "e_stop"}` *(For panicked/emergency stops)*

### 2. Nav2 (Autonomous Navigation)
* `{"type": "nav2", "cmd": "go_to_waypoint", "target": "<snake_case_string>"}`
* `{"type": "nav2", "cmd": "cancel_goal"}`

### 3. Stunts (Posture & Tricks)
* `{"type": "stunt", "cmd": "full_sit"}`
* `{"type": "stunt", "cmd": "half_sit"}`
* `{"type": "stunt", "cmd": "stand_up"}`
* `{"type": "stunt", "cmd": "spin", "direction": "clockwise" | "anticlockwise"}`

---

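On the receiving side, a host program can enforce this command set before dispatching anything to the robot. A minimal validator sketch (the `ALLOWED` whitelist simply restates the tables above; `validate_actions` is a hypothetical helper, not part of the model):

```python
import json

# Whitelist derived from the command tables above
ALLOWED = {
    "teleop": {"move_forward", "move_backward", "rotate_left",
               "rotate_right", "set_speed", "stop", "e_stop"},
    "nav2": {"go_to_waypoint", "cancel_goal"},
    "stunt": {"full_sit", "half_sit", "stand_up", "spin"},
}

def validate_actions(raw: str) -> list:
    """Parse model output and reject anything outside the trained schema."""
    actions = json.loads(raw)["actions"]
    for action in actions:
        if action.get("cmd") not in ALLOWED.get(action.get("type"), set()):
            raise ValueError(f"Unsupported action: {action}")
    return actions

safe = validate_actions('{"actions": [{"type": "teleop", "cmd": "stop"}]}')
```

Failing closed on anything outside the whitelist is cheap insurance when the output drives physical hardware.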
## 🧠 Advanced Behaviors (Macros)
TASX-Command-0.5B has been taught physical robotics logic. It knows a robot cannot drive while sitting.
If you ask it to perform a delivery (e.g., *"Fetch my laptop from the server room and bring it to John's desk"*), it will automatically output the required posture macros:

```json
{
  "actions": [
    {"type": "nav2", "cmd": "go_to_waypoint", "target": "server_room"},
    {"type": "stunt", "cmd": "full_sit"},
    {"type": "stunt", "cmd": "stand_up"},
    {"type": "nav2", "cmd": "go_to_waypoint", "target": "john_desk"},
    {"type": "stunt", "cmd": "full_sit"}
  ]
}
```
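A downstream executor can double-check the same physical invariant before running a sequence, in case a generation ever slips. A minimal sketch (`posture_ok` is a hypothetical helper; it checks only the sit/stand constraint described above, and treats halting commands as always safe):

```python
def posture_ok(actions):
    """Reject sequences that try to drive or navigate while seated."""
    sitting = False
    for a in actions:
        if a["type"] == "stunt" and a["cmd"] in ("full_sit", "half_sit"):
            sitting = True
        elif a["type"] == "stunt" and a["cmd"] == "stand_up":
            sitting = False
        elif a["type"] in ("teleop", "nav2") and sitting:
            if a["cmd"] in ("stop", "e_stop", "cancel_goal"):
                continue  # halting commands are always safe
            return False  # movement while sitting is physically impossible
    return True

fetch = [
    {"type": "nav2", "cmd": "go_to_waypoint", "target": "server_room"},
    {"type": "stunt", "cmd": "full_sit"},
    {"type": "stunt", "cmd": "stand_up"},
    {"type": "nav2", "cmd": "go_to_waypoint", "target": "john_desk"},
    {"type": "stunt", "cmd": "full_sit"},
]
print(posture_ok(fetch))  # -> True
```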

## Test Script

```python
# ============================================================
# Interactive tester for the fine-tuned TASX model
# ============================================================
from unsloth import FastLanguageModel
import torch

MODEL_PATH = "./tasx_sft_merged"
MAX_SEQ_LENGTH = 512

print("⏳ Loading your fine-tuned TASX model...")
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = MODEL_PATH,
    max_seq_length = MAX_SEQ_LENGTH,
    dtype = torch.float16,
    load_in_4bit = False,
)

FastLanguageModel.for_inference(model)

print("\n" + "=" * 50)
print("TASX ROBOT COMMAND TESTER READY")
print("Type a command in the box below and press Enter.")
print("Type 'quit' or 'q' to stop.")
print("=" * 50 + "\n")

while True:
    user_text = input("🎤 You: ")

    if user_text.lower() in ['quit', 'exit', 'q']:
        print("Stopping inference. Great job!")
        break

    if not user_text.strip():
        continue

    # Build the ChatML prompt the model was fine-tuned on
    prompt = f"<|im_start|>user\n{user_text}<|im_end|>\n<|im_start|>assistant\n"

    inputs = tokenizer([prompt], return_tensors="pt").to("cuda")

    # Generate the output
    outputs = model.generate(
        **inputs,
        max_new_tokens=150,
        use_cache=True,
        temperature=0.1,
        do_sample=True,
    )

    # Decode only the new tokens (slice off the prompt so we see
    # just the assistant's response)
    response = tokenizer.batch_decode(
        outputs[:, inputs.input_ids.shape[1]:],
        skip_special_tokens=True,
    )[0]

    print(f"TASX: {response.strip()}\n")
```
|
|
## Contact
|
|
Need a custom version of this model for your specific robot's API or hardware? Contact: [albinthomas7034@gmail.com] |