---
base_model: HuggingFaceTB/SmolLM2-135M-Instruct
library_name: transformers
license: apache-2.0
tags:
- text-generation
- query-expansion
- fine-tuned
- smollm2
- lora
language:
- en
---

# SmolLM2-135M — Query Expansion (Fine-tuned)

<p align="center">
<img src="./query-expansion.png" alt="Query Expansion Architecture" width="600"/>
</p>

Fine-tuned version of [SmolLM2-135M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct)
for the task of **search query expansion**.

## Training

- Base model: `HuggingFaceTB/SmolLM2-135M-Instruct`
- Method: LoRA (merged into base weights at export)
- Task: Query expansion / reformulation

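To make "merged into base weights" concrete, here is a toy sketch of what a LoRA merge computes: the low-rank update `B @ A`, scaled by `alpha / r`, is added to the base weight matrix so that no separate adapter is needed at inference. The matrices, the rank, and the `alpha` value below are made up for illustration; this card does not state the actual LoRA hyperparameters or the tooling used for the merge.

```python
# Toy illustration of merging a LoRA update into a base weight matrix.
# All shapes, values, and names here are hypothetical, chosen for readability.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(w, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the merged weight matrix."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Base weight W is 2x3; the rank-1 adapter has B (2x1) and A (1x3).
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]
B = [[0.5],
     [-1.0]]
ALPHA, R = 2, 1

W_merged = merge_lora(W, A, B, ALPHA, R)

# Compare the merged matrix against the unmerged "base path + adapter path"
# computation on one input column vector.
x = [[1.0], [1.0], [1.0]]
y_merged = matmul(W_merged, x)
base_out = matmul(W, x)
adapter_out = matmul(B, matmul(A, x))
y_unmerged = [[b[0] + (ALPHA / R) * a[0]] for b, a in zip(base_out, adapter_out)]

print(W_merged)
print(y_merged == y_unmerged)
```

In practice a library such as PEFT performs this step across all adapted layers (for example via `merge_and_unload()`), which is why the exported checkpoint loads with plain `transformers`.
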
## Quick usage

```python
from transformers import AutoTokenizer, pipeline

MODEL_ID = "PraneelChetty-1/smollm2-135m-query-expansion"

# device_map="auto" requires the `accelerate` package.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
pipe = pipeline("text-generation", model=MODEL_ID, tokenizer=tokenizer, device_map="auto")

# Wrap the raw query in the model's chat template.
messages = [{"role": "user", "content": "yield monitoring systems"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

out = pipe(prompt, max_new_tokens=80, do_sample=True, temperature=0.7)
# generated_text echoes the prompt, so slice it off to keep only the expansion.
print(out[0]["generated_text"][len(prompt):])
```