---
base_model:
- gghfez/gemma-3-4b-novision
license: gemma
pipeline_tag: text-generation
library_name: transformers
---

# Gemma 3 4B (abliterated text-only) model card
This is an abliterated text-only version of [google/gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it), created using Baukit.

The vision encoders were removed by [gghfez](https://huggingface.co/gghfez). Note that this model may exhibit reduced performance relative to the original model.
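
For readers unfamiliar with the term, "abliteration" usually refers to directional ablation: estimating a direction in activation space associated with a behaviour and projecting it out of the weights that write to it. This card does not document the procedure used here, so the following is only a minimal sketch of the projection step under that assumption. The function names, the toy matrix, and the direction are all hypothetical, and nothing below touches the real model or Baukit.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def ablate_direction(weight, direction):
    """Return a copy of `weight` whose outputs have no component along `direction`.

    weight: list of rows with shape (d_out, d_in); direction: length d_out.
    Computes W' = W - r (r^T W) for the unit vector r.
    """
    r = normalize(direction)
    d_out, d_in = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along r.
    proj = [sum(r[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    return [[weight[i][j] - r[i] * proj[j] for j in range(d_in)] for i in range(d_out)]


def matvec(weight, x):
    """Multiply a (d_out, d_in) matrix by a length d_in vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


# Toy 3x2 weight matrix and a hypothetical direction (illustrative values only).
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
direction = [1.0, 1.0, 0.0]

W_ablated = ablate_direction(W, direction)
y = matvec(W_ablated, [0.5, -1.0])

# After ablation, no output has a component along the removed direction.
overlap = sum(a * b for a, b in zip(normalize(direction), y))
print(f"overlap with removed direction: {overlap:.2e}")
```

In practice the direction is estimated from model activations, and the projection is applied to selected layers rather than to a single matrix.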

## Model Description

- **Original Model**: The original Gemma-3-4b-it is a multimodal model released by Google that can process both text and images
- **This Version**: This version has been modified to use the same architecture as the text-only 1B model, with the vision components removed
- **Parameters**: 4 billion parameters
- **Conversion Process**: Vision-related components were stripped while maintaining the text generation capabilities
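
As a rough guide to hardware requirements, the memory needed for the weights alone can be estimated from the parameter count and the storage width per parameter. The sketch below takes the 4 billion figure above at face value. The helper name is hypothetical, and the estimate ignores activations, the KV cache, and any quantization overhead.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gib(num_params, dtype):
    """Approximate memory for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


NUM_PARAMS = 4e9  # nominal parameter count stated in this card

for dtype in ("float32", "bfloat16", "int8"):
    print(f"{dtype}: {weight_memory_gib(NUM_PARAMS, dtype):.1f} GiB")
```

By this estimate, 8-bit loading needs about half the weight memory of bfloat16, roughly 3.7 GiB for 4 billion parameters.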

## Usage

You can load and use this model the same way you would use the text-only [google/gemma-3-1b-it](https://huggingface.co/google/gemma-3-1b-it) version:

```python
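import torch
from transformers import AutoTokenizer, BitsAndBytesConfig, Gemma3ForCausalLM

# Model ID as given in the original card; it points at the text-only base checkpoint.
# Substitute the abliterated repository's ID to load the abliterated weights.
model_id = "gghfez/gemma-3-4b-novision"

# Load the weights in 8-bit via bitsandbytes to reduce memory use.
quantization_config = BitsAndBytesConfig(load_in_8bit=True)

model = Gemma3ForCausalLM.from_pretrained(
    model_id, quantization_config=quantization_config
).eval()

tokenizer = AutoTokenizer.from_pretrained(model_id)

# A batch containing a single conversation.
messages = [
    [
        {
            "role": "system",
            "content": [{"type": "text", "text": "You are a helpful assistant."}],
        },
        {
            "role": "user",
            "content": [{"type": "text", "text": "Write a poem on Hugging Face, the company"}],
        },
    ],
]

# Token IDs must remain integer tensors, so only move them to the model's device
# (casting the encoded batch to bfloat16 is not supported).
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.inference_mode():
    outputs = model.generate(**inputs, max_new_tokens=64)

# Decode the full sequences (prompt plus generated tokens) back to text.
outputs = tokenizer.batch_decode(outputs)
print(outputs[0])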
```
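
The decoded strings from the snippet above contain the whole conversation, including the prompt and Gemma's turn markers. The following sketch shows one way to pull out only the model's reply from such a string. The helper name `extract_model_reply` and the sample text are hypothetical; the `<start_of_turn>` and `<end_of_turn>` delimiters are the ones used by Gemma's chat template.

```python
START_OF_TURN = "<start_of_turn>"
END_OF_TURN = "<end_of_turn>"


def extract_model_reply(decoded):
    """Return the text of the last model turn in a decoded Gemma conversation."""
    marker = START_OF_TURN + "model\n"
    start = decoded.rfind(marker)
    if start == -1:
        # No model turn marker found; fall back to the whole string.
        return decoded.strip()
    reply = decoded[start + len(marker):]
    end = reply.find(END_OF_TURN)
    if end != -1:
        # Generation finished its turn; drop the marker and anything after it.
        reply = reply[:end]
    return reply.strip()


# Hypothetical decoded output, shaped like what batch_decode returns.
sample = (
    "<bos><start_of_turn>user\nWrite a poem<end_of_turn>\n"
    "<start_of_turn>model\nRoses are red.<end_of_turn>"
)
print(extract_model_reply(sample))
```

An alternative is to slice the prompt tokens off each generated sequence before decoding and pass `skip_special_tokens=True` to `batch_decode`, which avoids string parsing altogether.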