Initialize project; model provided by the ModelHub XC community
Model: electroglyph/Qwen3-4B-Instruct-2507-uncensored-unslop-v2 Source: Original Platform
40
README.md
Normal file
@@ -0,0 +1,40 @@
---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507/blob/main/LICENSE
pipeline_tag: text-generation
---
This is a GRPO finetune to remove slop from this model: [Qwen3-4B-Instruct-2507-uncensored](https://huggingface.co/electroglyph/Qwen3-4B-Instruct-2507-uncensored)

The finetune isn't perfect; there are some before and after examples below.

I used (mostly) the same method as for this model: [gemma-3-4b-it-unslop-GRPO-v3](https://huggingface.co/electroglyph/gemma-3-4b-it-unslop-GRPO-v3)

Note: This is *not* an RP tune. It's a compliant model with a different style from regular Qwen3 4B 2507.

My uncensoring dataset was generated by an abliterated Gemma 3 27B model, which added a lot of Gemma's writing style to this model.

It also added some Gemma-style slop, which this finetune has helped mitigate.

I've uploaded a UD-Q4_K_XL GGUF with settings that I grabbed from Unsloth's quant using my lil utility: [quant_clone](https://github.com/electroglyph/quant_clone)

Here are some pics of before and after output, with the slop highlighted and the total at the bottom:

Prompt = "write a short story about a gothic romance, it should be around 500 words long"

Before:







After:






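
The slop totals in the before and after images above could be approximated with a simple phrase counter. The sketch below is only an illustration: the phrase list, the function names `count_slop` and `slop_reward`, and the `penalty` parameter are all hypothetical, since this README does not publish the actual slop list or the reward used for the GRPO run.

```python
import re

# Hypothetical phrase list for illustration only; the README does not
# publish the list that was actually used for scoring.
SLOP_PHRASES = [
    "a testament to",
    "shivers down",
    "barely above a whisper",
    "tapestry",
]


def count_slop(text, phrases=SLOP_PHRASES):
    """Return a dict mapping each matched phrase to its number of hits.

    Matching is case-insensitive and anchored on word boundaries.
    Phrases with zero hits are left out of the result.
    """
    counts = {}
    for phrase in phrases:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        hits = len(re.findall(pattern, text, flags=re.IGNORECASE))
        if hits:
            counts[phrase] = hits
    return counts


def slop_reward(text, penalty=1.0):
    """Toy reward: 0.0 for clean text, more negative as slop accumulates."""
    total = sum(count_slop(text).values())
    return 0.0 - penalty * total
```

A function shaped like `slop_reward` is the kind of scalar signal a GRPO setup could consume for each sampled completion, but a real run would likely need a much larger phrase list and some balancing reward so the model does not simply write shorter text.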