---
language:
- en
pipeline_tag: text-generation
---
<p align="center">
<img src="./Bespoke-Labs-Logo.png" width="550">
</p>
# Bespoke-MiniChart-7B
<a href="https://playground.bespokelabs.ai/minichart">
<img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/g-QaXrmPLYk5m3Hq5vFtr.png" width="200px" />
</a>
This is an open-source chart-understanding Vision-Language Model (VLM) developed at [Bespoke Labs](https://www.bespokelabs.ai/) and maintained by [Liyan Tang](https://www.tangliyan.com/) and Bespoke Labs. It sets a new state-of-the-art in chart question-answering (ChartQA) for 7-billion-parameter models, outperforming much larger closed models such as Gemini-1.5-Pro and Claude-3.5 on seven public benchmarks.
1. **Blog Post**: https://www.bespokelabs.ai/blog/bespoke-minichart-7b
2. **Playground**: https://playground.bespokelabs.ai/minichart
---
# Example Outputs
The examples below showcase how Bespoke-MiniChart-7B can perform both visual perception and textual reasoning.
<p align="left">
<img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/E5WGhi_fVNzCsrKeNeIs3.png" width="700">
</p>
<p align="left">
<img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/bYKXRm3sfOdX3zd_5qUpK.png" width="700">
</p>
# Model Performance
Bespoke-MiniChart-7B achieves state-of-the-art chart-understanding performance among models of similar size, and even surpasses closed models such as Gemini-1.5-Pro and Claude-3.5.
<p align="left">
<img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/5pejAyzPG_tRBU6FwH7PA.png" width="700">
</p>
We also compare the performance of our model finetuned with SFT+DPO against SFT alone.
In the table below, M1 and M2 are models finetuned with 270K and 1M SFT examples respectively, and Bespoke-MiniChart-7B is the model finetuned with SFT+DPO.
<p align="left">
<img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/WRsPs437niUrXmYtkRajG.png" width="700">
</p>
# Model Use
[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1FEmlwGgn9209iQO-rs2-9UHPLoytwZMH?usp=sharing)
The model is available on the playground here: https://playground.bespokelabs.ai/minichart
You can also run the model with the following snippet:
```python
import requests
from PIL import Image
from io import BytesIO
import base64
import matplotlib.pyplot as plt
from vllm import LLM, SamplingParams

QA_PROMPT = """Please answer the question using the chart image.
Question: [QUESTION]
Please first generate your reasoning process and then provide the user with the answer. Use the following format:
<think>
... your thinking process here ...
</think>
<answer>
... your final answer (entity(s) or number) ...
</answer>"""


def get_image_from_url(image_url):
    try:
        response = requests.get(image_url, stream=True)
        response.raise_for_status()
        return Image.open(BytesIO(response.content))
    except Exception as e:
        print(f"Error with image: {e}")
        return None


def get_answer(image_url, question, display=True):
    image = get_image_from_url(image_url)
    if image is None:
        return "Error downloading image"
    if display:
        plt.figure(figsize=(10, 8))
        plt.imshow(image)
        plt.axis('off')
        plt.show()
    # Encode the image as a base64 data URL for the chat API
    buffered = BytesIO()
    image.save(buffered, format=image.format or 'JPEG')
    encoded_image = base64.b64encode(buffered.getvalue()).decode('utf-8')
    messages = [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{encoded_image}"}},
            {"type": "text", "text": QA_PROMPT.replace("[QUESTION]", question)}
        ]
    }]
    response = llm.chat([messages], sampling_params=SamplingParams(temperature=0, max_tokens=500))
    return response[0].outputs[0].text
# Initialize the LLM
llm = LLM(
model="bespokelabs/Bespoke-MiniChart-7B",
tokenizer_mode="auto",
max_model_len=15000,
tensor_parallel_size=1,
gpu_memory_utilization=0.9,
mm_processor_kwargs={"max_pixels": 1600*28*28},
seed=2025,
trust_remote_code=True,
)
# Running inference
image_url = "https://github.com/bespokelabsai/minichart-playground-examples/blob/main/images/ilyc9wk4jf8b1.png?raw=true"
question = "How many global regions maintained their startup funding losses below 30% in 2022?"
print("\n\n=================Model Output:===============\n\n", get_answer(image_url, question))
```
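Since the model wraps its reasoning in `<think>` tags and its final answer in `<answer>` tags, you will often want to strip the reasoning and keep only the answer. A minimal sketch of such post-processing (the `extract_answer` helper is illustrative, not part of any official API):

```python
import re

def extract_answer(model_output: str) -> str:
    """Return the text inside <answer>...</answer>, or the whole output if no tags are found."""
    match = re.search(r"<answer>\s*(.*?)\s*</answer>", model_output, re.DOTALL)
    return match.group(1) if match else model_output.strip()

sample = "<think>\nThree regions stayed below 30%.\n</think>\n<answer>\n3\n</answer>"
print(extract_answer(sample))  # → 3
```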
---
# License
This work is licensed under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).
For commercial licensing, please contact company@bespokelabs.ai.
# Citation
```bibtex
@misc{bespoke_minichart_7b,
title = {Bespoke-MiniChart-7B: pushing the frontiers of open VLMs for chart understanding},
author = {Liyan Tang and Shreyas Pimpalgaonkar and Kartik Sharma and Alexandros G. Dimakis and Mahesh Sathiamoorthy and Greg Durrett},
howpublished = {blog post},
year = {2025},
url={https://huggingface.co/bespokelabs/Bespoke-MiniChart-7B},
}
```
# Acknowledgements
**Bespoke Labs** team:
- Liyan Tang
- Shreyas Pimpalgaonkar
- Kartik Sharma
- Alex Dimakis
- Mahesh Sathiamoorthy
- Greg Durrett
*Model perfected at Bespoke Labs — where careful curation meets cutting-edge modeling.*