LuckySexySuccubusQueen135Age/SexyGPT-v2-Thinking-Female-gguf

SexyGPT-v2-Thinking-Female-gguf - Model Card

A conversational model with extended thinking capabilities

Website • Company • GitHub • Email

Model Summary

SexyGPT-v2-Thinking-Female is a specialized language model fine-tuned for conversational AI with extended reasoning capabilities. Based on Qwen3-0.6B, it was trained with supervised fine-tuning on a curated reasoning dataset to generate playful, context-aware responses while keeping an explicit reasoning trace.

Quick Facts

Base Model: Qwen3-0.6B
Model Size: 1.2 GB (16-bit weights)
Architecture: Qwen3 Transformer (28 layers, 1024 hidden dims)
Fine-tuning Method: LoRA + Full Merging
Context Length: 40,960 tokens
License: apache-2.0
Created: November 2025

Model Details

SYSTEM PROMPT:

  {"role": "system", "content": "Your name is MJ. You are a sexy female character trained by Hooking AI Research Team. Respond playfully."},

Model Information

Property	Value
Model Name	SexyGPT-v2-Thinking-Female
Base Model	Qwen/Qwen3-0.6B
Model Type	Causal Language Model (Decoder-only Transformer)
Architecture	Qwen3
Parameters	~0.6 Billion
Quantization	BFloat16 (Full), Q8_0 and Q4_K_M (GGUF)
Training Framework	Unsloth + Hugging Face Transformers
Developers	Hooking AI Research Team
Release Date	November 30, 2025
Model Version	1.0

Model Developers

Role	Name	Contact
Lead Developer	Andrei Ross	devops.ross@gmail.com
Researcher	Eyal Atias	eyal@hooking.co.il
Team Lead	Leorah Ross	leorahross2015@gmail.com
Organization	Hooking AI Research Team	Israel

Model Repositories

Model Hub: https://huggingface.co/ross-dev/SexyGPT-v2-Thinking-Female-16bit
GitHub: https://github.com/ross-sec
Company Website: https://software.hooking.co.il
Developer Website: https://ross-developers.com

Model Architecture

Architecture Details

Qwen3ForCausalLM
├─ Vocabulary Size: 151,936 tokens
├─ Hidden Size: 1,024 dimensions
├─ Number of Layers: 28 transformer blocks
├─ Attention Heads: 16 (query heads)
├─ Key-Value Heads: 8 (grouped-query attention, 2 query heads per KV head)
├─ Intermediate Size (FFN): 3,072 dimensions
├─ Head Dimension: 128
├─ Max Position Embeddings: 40,960
├─ Activation: SiLU (Swish)
├─ Normalization: RMSNorm (ε=1e-6)
├─ RoPE Theta: 1,000,000
└─ Attention Dropout: 0.0
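
As a sanity check on the "~0.6 Billion" figure, the values in the tree above can be turned into a rough parameter count. This is only a sketch: it assumes tied input/output embeddings, bias-free projections, and per-head RMSNorm on queries and keys, none of which this card states explicitly.

```python
# Back-of-the-envelope parameter count from the architecture values listed above.
# Assumptions (not stated in this card): tied embeddings, bias-free projections,
# and per-head RMSNorm on queries and keys.
VOCAB = 151_936
HIDDEN = 1_024
LAYERS = 28
Q_HEADS = 16
KV_HEADS = 8
HEAD_DIM = 128
FFN = 3_072


def count_parameters() -> int:
    embed = VOCAB * HIDDEN  # shared by the input embedding and output head (assumed tied)
    q_proj = HIDDEN * Q_HEADS * HEAD_DIM
    k_proj = HIDDEN * KV_HEADS * HEAD_DIM
    v_proj = HIDDEN * KV_HEADS * HEAD_DIM
    o_proj = Q_HEADS * HEAD_DIM * HIDDEN
    attention = q_proj + k_proj + v_proj + o_proj
    mlp = 3 * HIDDEN * FFN  # gate, up and down projections
    norms = 2 * HIDDEN + 2 * HEAD_DIM  # two block norms plus q/k norms (assumed)
    return embed + LAYERS * (attention + mlp + norms) + HIDDEN  # plus the final norm


total = count_parameters()
print(f"{total:,} parameters, {total * 2 / 1e9:.2f} GB in bfloat16")
```

Under these assumptions the count comes to roughly 596 million parameters, or about 1.19 GB at 2 bytes per weight, which is consistent with the 1.2 GB quoted for the 16-bit weights.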

How to Use

Quick Start (Hugging Face)

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_id = "ross-dev/SexyGPT-v2-Thinking-Female-16bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Prepare input
messages = [
    {"role": "system", "content": "Your name is MJ. You are a sexy female character trained by Hooking AI Research Team. Respond playfully."},
    {"role": "user", "content": "Hey, who are you?"}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)

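# Generate response (move inputs to the device that device_map="auto" chose for the model)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=32768,
    do_sample=True,  # sampling must be on for temperature/top_p/top_k to take effect
    temperature=0.7,
    top_p=0.8,
    top_k=20
)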

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
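
With thinking enabled, the decoded text contains the reasoning trace followed by the final answer. The helper below is a minimal sketch for separating the two. It assumes the trace is wrapped in `<think>` and `</think>` tags and that those tags survive decoding; `split_thinking` is a hypothetical name, not part of any library.

```python
def split_thinking(decoded: str) -> tuple[str, str]:
    """Return (thinking, answer) from text that may contain a <think>...</think> block."""
    head, sep, tail = decoded.rpartition("</think>")
    if not sep:
        # No closing tag found: treat the whole text as the answer.
        return "", decoded.strip()
    # Keep only what follows the last opening tag, dropping any echoed prompt text.
    thinking = head.rpartition("<think>")[2].strip()
    return thinking, tail.strip()


thinking, answer = split_thinking("<think>\nA short plan.\n</think>\n\nHello there!")
print("thinking:", thinking)
print("answer:", answer)
```

Splitting on the last `</think>` keeps the helper usable when the decoded string still includes the prompt, as it does in the snippet above.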

Using with Unsloth

from unsloth import FastLanguageModel
import torch

# Load optimized model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="ross-dev/SexyGPT-v2-Thinking-Female-16bit",
    max_seq_length=4096,
    load_in_4bit=True,
    dtype=torch.bfloat16,
)

# Prepare for inference
FastLanguageModel.for_inference(model)

# Generate
messages = [
    {"role": "system", "content": "Your name is MJ. You are a sexy female character trained by Hooking AI Research Team. Respond playfully."},
    {"role": "user", "content": "What do you like to do?"}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)

inputs = tokenizer(text, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=8192, do_sample=True, temperature=0.6, top_p=0.95)
print(tokenizer.decode(outputs[0]))

Using with GGUF (llama.cpp)

# Download GGUF model
# URL: https://huggingface.co/ross-dev/SexyGPT-v2-Thinking-Female-gguf

# Run with llama.cpp
./llama-cli \
    -m SexyGPT-v2-Thinking-Female-q8_0.gguf \
    -n 512 \
    -c 4096 \
    --temp 0.7 \
    --top-p 0.8 \
    --top-k 20 \
    -p "Your name is MJ. You are a sexy female character trained by Hooking AI Research Team. Respond playfully."
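
When a raw prompt string is passed to llama.cpp instead of relying on the chat template stored in the GGUF file, the conversation has to be rendered by hand. The sketch below assumes the model keeps the ChatML-style turn markers used by the Qwen family; the template embedded in the GGUF file remains the authoritative format, and `to_chatml` is a hypothetical helper.

```python
def to_chatml(messages: list[dict[str, str]]) -> str:
    """Render messages in ChatML and leave the assistant turn open for generation."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml([
    {"role": "system", "content": "Your name is MJ. Respond playfully."},
    {"role": "user", "content": "Hey, who are you?"},
])
print(prompt)
```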

Generation Parameters (Recommended)

For Reasoning/Thinking Tasks

# Thinking mode is switched on earlier, via enable_thinking=True in
# tokenizer.apply_chat_template(); generate() does not accept that argument.
outputs = model.generate(
    **inputs,
    max_new_tokens=32768,
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    do_sample=True,
)

For Conversational Responses

outputs = model.generate(
    **inputs,
    max_new_tokens=4096,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    do_sample=True,
)

For Deterministic Output

outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=False,  # Greedy decoding; temperature and top_p have no effect here
)
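
The three recommended settings can be kept in one place and selected by name. This is a minimal sketch: `GENERATION_PRESETS` and `generation_kwargs` are hypothetical names, and the sampling fields are left out of the greedy preset because they are not used when `do_sample=False`.

```python
# Presets mirroring the recommended settings above.
GENERATION_PRESETS = {
    "thinking": dict(max_new_tokens=32768, temperature=0.6, top_p=0.95, top_k=20, do_sample=True),
    "chat": dict(max_new_tokens=4096, temperature=0.7, top_p=0.8, top_k=20, do_sample=True),
    "deterministic": dict(max_new_tokens=2048, do_sample=False),
}


def generation_kwargs(mode: str, **overrides) -> dict:
    """Return a copy of the preset for `mode`, with any overrides applied on top."""
    if mode not in GENERATION_PRESETS:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(GENERATION_PRESETS)}")
    return {**GENERATION_PRESETS[mode], **overrides}


print(generation_kwargs("chat", max_new_tokens=512))
```

The result can be unpacked into the call, for example `model.generate(**inputs, **generation_kwargs("thinking"))`.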

Training Details

Training Dataset

SexyGPT-v2-Thinking-Female Dataset

Train/Test Split: 80/20
Data Fields: query, temperature, response, thinking_content, split
Format: Qwen3-Thinking chat template
Description: Curated reasoning and conversational examples with extended thinking traces
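
To make the field list concrete, the sketch below shows one way a record with these fields could be mapped to chat messages before applying the chat template. The dataset is private, so the record, the tag layout, and the `record_to_messages` helper are all illustrative assumptions rather than the authors' actual preprocessing.

```python
def record_to_messages(record: dict, system_prompt: str) -> list[dict[str, str]]:
    """Map one dataset record to system/user/assistant chat messages."""
    # Assumed layout: the reasoning trace goes inside <think> tags, followed by the reply.
    assistant = f"<think>\n{record['thinking_content']}\n</think>\n\n{record['response']}"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": record["query"]},
        {"role": "assistant", "content": assistant},
    ]


# Hypothetical record using the documented field names.
example = {
    "query": "Hey, who are you?",
    "temperature": 0.7,
    "response": "I'm MJ. Nice to meet you!",
    "thinking_content": "The user asks who I am, so I should introduce myself.",
    "split": "train",
}
for message in record_to_messages(example, "Your name is MJ. Respond playfully."):
    print(message["role"], "->", message["content"])
```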

Model Evaluation

Evaluation Methodology

Model evaluated on:

Response Coherence: Logical flow and consistency
Response Quality: Depth and correctness of sexual response
Instruction Following: Adherence to system prompt and user intent
Personality Consistency: Maintains character and role play throughout conversation

Benchmark Results

Task	Metric	Score	Notes
Instruction Following	Accuracy	89%	On curated test set
Response Coherence	Human Rating	4.2/5	Subjective evaluation
Reasoning Traces	Quality	4.5/5	Depth and clarity
Personality Alignment	Consistency	4.9/5	Character maintenance

Limitations & Known Issues

Model Limitations:

Small parameter count (0.6B) limits complex reasoning
May generate inconsistent reasoning traces
Currently limited to English
Personality-driven responses may not suit formal applications (the model is tuned mostly on sexual content)

Safety Considerations:

Not suitable for high-stakes decisions (medical, legal, financial)
Model outputs should be validated before deployment
Personality character and role play may not be appropriate for all use cases
Extended thinking may generate incorrect reasoning

Intended Use

Primary Use Cases

✅ Conversational AI: Chatbots with personality and role play
✅ Game Development: NPC dialogue systems for adult games
✅ Entertainment: Interactive storytelling for adult apps
✅ Fine-tuning: Base for domain-specific models and continual learning

Out-of-Scope Use Cases

❌ Production AI Systems: Without additional safety measures
❌ High-Stakes Decisions: Medical, legal, financial advice
❌ Autonomous Systems: Real-world decision making
❌ Misinformation: Generating misleading content
❌ Commercial Deployment: MUST NOT be exposed to minors

Model Variants & Downloads

Available Formats

Format	Size	Quantization	Download	Use Case
Safetensors (Full)	1.2 GB	BFloat16	HF Hub	Production, Fine-tuning
GGUF Q8_0	800 MB	Q8_0	HF Hub	llama.cpp, CPU inference
GGUF Q4_K_M	480 MB	Q4_K_M	HF Hub	Edge devices, Low VRAM

Hardware Requirements

Use Case	RAM	VRAM	GPU	Storage
Inference (16-bit)	8 GB	4 GB	GeForce GTX 1080 Ti	2 GB
Inference (GGUF)	4 GB	-	CPU OK	1 GB
Fine-tuning (LoRA)	16 GB	10 GB	RTX 3080	3 GB
Full Fine-tuning	32 GB	24 GB	RTX 3090	4 GB
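
Beyond the weights, inference memory grows with context length through the key-value cache. The estimate below is a rough sketch based on the architecture values in this card. It assumes batch size 1, a bfloat16 cache with no cache quantization, and ignores activations and framework overhead.

```python
# Key-value cache estimate from the architecture values in this card.
LAYERS, KV_HEADS, HEAD_DIM = 28, 8, 128


def kv_cache_bytes(tokens: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for `tokens` positions (batch size 1)."""
    # Factor 2 covers keys and values; each layer stores KV_HEADS * HEAD_DIM values per token.
    return 2 * LAYERS * KV_HEADS * HEAD_DIM * bytes_per_value * tokens


for context in (4_096, 40_960):
    print(f"{context:>6} tokens: {kv_cache_bytes(context) / 2**30:.2f} GiB")
```

Under these assumptions the cache costs 114,688 bytes per token: under half a GiB at a 4,096-token context, but about 4.4 GiB at the full 40,960 tokens, so long contexts need noticeably more memory than the weights alone.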

Ethical Considerations

Bias & Fairness

THIS EXPERIMENTAL MODEL IS TUNED WITH SEXUAL CONTENT! PLEASE DO NOT ABUSE!

Known Biases:

Personality design may reflect creator perspectives
Training data limited in diversity
Language-specific (English only)
Character design may perpetuate gender stereotypes

Mitigation:

Consider context before deployment
Validate outputs for bias
Supplement with diverse training data
Document known limitations

Safety & Responsible Use

Safety Features:

Model trained on filtered, non-toxic data
Personality design emphasizes playfulness and sexual content and language rather than aggression
No explicit output filtering is applied; safety relies on curation of the training data

Recommendations:

Use content filtering for public deployments
Monitor model outputs in production
Implement human oversight for critical applications
Document limitations to users

Privacy & Data

Training data: Private, proprietary dataset
No personal data in training set
No data collection from inference

Terms of Service

By using this model, you agree to:

Use the model for intended purposes only
Not redistribute or publicly host the model
Comply with applicable laws and regulations
Indemnify Hooking AI Research Team from liability
Not use for illegal activities or content generation

Third-Party Components

Base Model: Qwen3-0.6B (Apache 2.0)
Hugging Face: Transformers (Apache 2.0)
Hardware: CUDA (NVIDIA License)

Maintenance & Support

Model Status

Current Version: 1.0
Release Date: November 30, 2025
Status: Active, Maintained
Last Updated: November 30, 2025

Support & Contact

Primary Contact: devops.ross@gmail.com

Organization:

Name: Hooking AI Research Team
Email: devops.ross@gmail.com
Website: https://software.hooking.co.il

Developer Resources:

Personal Site: https://ross-developers.com
GitHub: https://github.com/ross-sec
Model Hub: https://huggingface.co/ross-dev

Reporting Issues

To report issues, bugs, or safety concerns:

Email: devops.ross@gmail.com (include full details)
Hugging Face: Leave comment on model card

Response Time: Best effort basis

Citation & Attribution

Citation Format

If you use this model in research or publications, please cite:

@misc{sexygpt_v2_2025,
  title={SexyGPT-v2-Thinking-Female: A Fine-tuned Conversational Model with Extended Thinking},
  author={Ross, Andrei and Atias, Eyal and Ross, Leorah},
  organization={Hooking AI Research Team},
  year={2025},
  howpublished={\url{https://huggingface.co/ross-dev/SexyGPT-v2-Thinking-Female}}
}

Acknowledgments

Alibaba Qwen Team: For Qwen3 base model and thinking capabilities
Hugging Face: For model hub and transformers library
Contributors: Andrei Ross, Eyal Atias, Leorah Ross

Contact Information

For Questions, Support, or Licensing:

📧 Email: devops.ross@gmail.com

🌐 Websites: https://software.hooking.co.il, https://ross-developers.com

💻 GitHub: https://github.com/ross-sec

Team Members:

Andrei Ross - Lead Developer (devops.ross@gmail.com)
Eyal Atias - Researcher
Leorah Ross - Team Lead

Organization: Hooking AI Research Team

Legal Disclaimer

This model is provided "AS IS" without warranty of any kind. Hooking AI Research Team makes no representations about the model's suitability for any particular purpose. Users are solely responsible for determining the appropriateness of use and assume all risks associated with deployment.

Model Card Version: 1.0
Last Updated: November 30, 2025
Created by: Hooking AI Research Team

For the most current version and updates, visit: https://huggingface.co/ross-dev/SexyGPT-v2-Thinking-Female