Initialize project; model provided by the ModelHub XC community

Model: bespokelabs/Bespoke-MiniChart-7B Source: Original Platform
2026-04-25 20:19:56 +08:00
commit 9fa607f024
19 changed files with 152710 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,168 @@
+---
+language:
+- en
+pipeline_tag: text-generation
+---
+
+<p align="center">
+    <img src="./Bespoke-Labs-Logo.png" width="550">
+</p>
+
+# Bespoke-MiniChart-7B
+
+<a href="https://playground.bespokelabs.ai/minichart">
+  <img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/g-QaXrmPLYk5m3Hq5vFtr.png" width="200px" />
+</a>
+
+This is an open‑source chart‑understanding Vision‑Language Model (VLM) developed at [Bespoke Labs](https://www.bespokelabs.ai/) and maintained by [Liyan Tang](https://www.tangliyan.com/) and Bespoke Labs. It sets a new state‑of‑the‑art in chart question‑answering (Chart‑QA) for 7 billion‑parameter models, outperforming much larger closed models such as Gemini‑1.5‑Pro and Claude‑3.5 on seven public benchmarks.
+
+1. **Blog Post**: https://www.bespokelabs.ai/blog/bespoke-minichart-7b
+2. **Playground**: https://playground.bespokelabs.ai/minichart
+---
+
+# Example Outputs
+
+The examples below showcase how Bespoke-MiniChart-7B can perform both visual perception and textual reasoning.
+
+
+<p align="left">
+    <img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/E5WGhi_fVNzCsrKeNeIs3.png" width="700">
+</p>
+
+<p align="left">
+    <img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/bYKXRm3sfOdX3zd_5qUpK.png" width="700">
+</p>
+
+
+# Model Performance
+
+Bespoke-MiniChart-7B achieves state-of-the-art performance on chart understanding among models of similar size. It can even surpass closed models such as Gemini-1.5-Pro and Claude-3.5.
+
+<p align="left">
+    <img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/5pejAyzPG_tRBU6FwH7PA.png" width="700">
+</p>
+
+We also compare the performance of our model finetuned using SFT+DPO vs SFT only. 
+
+In the table below, M1 and M2 are models finetuned with 270K and 1M SFT examples respectively, and Bespoke-MiniChart-7B is the model finetuned using SFT+DPO.
+
+<p align="left">
+    <img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/WRsPs437niUrXmYtkRajG.png" width="700">
+</p>
+
+
+# Model Use
+
+[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1FEmlwGgn9209iQO-rs2-9UHPLoytwZMH?usp=sharing)
+
+The model is available on the playground here: https://playground.bespokelabs.ai/minichart
+
+You can also run the model with the following snippet: 
+
+```python
+import requests
+from PIL import Image
+from io import BytesIO
+import base64
+import matplotlib.pyplot as plt
+from vllm import LLM, SamplingParams
+
+QA_PROMPT = """Please answer the question using the chart image.
+
+Question: [QUESTION]
+
+Please first generate your reasoning process and then provide the user with the answer. Use the following format:
+
+<think> 
+... your thinking process here ... 
+</think> 
+<answer> 
+... your final answer (entity(s) or number) ...
+</answer>"""
+
+def get_image_from_url(image_url):
+    try:
+        response = requests.get(image_url, stream=True)
+        response.raise_for_status()
+        return Image.open(BytesIO(response.content))
+    except Exception as e:
+        print(f"Error with image: {e}")
+        return None
+
+def get_answer(image_url, question, display=True):
+    image = get_image_from_url(image_url)
+    if image is None:  # bail out before plotting; plt.imshow(None) would raise
+        return "Error downloading image"
+
+    if display:
+        plt.figure(figsize=(10, 8))
+        plt.imshow(image)
+        plt.axis('off')
+        plt.show()
+
+    # Re-encode in the image's own format so the data-URL MIME type matches the bytes.
+    image_format = image.format or 'PNG'
+    buffered = BytesIO()
+    image.save(buffered, format=image_format)
+    encoded_image = base64.b64encode(buffered.getvalue()).decode('utf-8')
+    messages = [{
+        "role": "user",
+        "content": [
+            {"type": "image_url", "image_url": {"url": f"data:image/{image_format.lower()};base64,{encoded_image}"}},
+            {"type": "text", "text": QA_PROMPT.replace("[QUESTION]", question)}
+        ]
+    }]
+
+    response = llm.chat([messages], sampling_params=SamplingParams(temperature=0, max_tokens=500))
+    return response[0].outputs[0].text
+    
+# Initialize the LLM
+llm = LLM(
+    model="bespokelabs/Bespoke-MiniChart-7B",
+    tokenizer_mode="auto",
+    max_model_len=15000,
+    tensor_parallel_size=1,
+    gpu_memory_utilization=0.9,
+    mm_processor_kwargs={"max_pixels": 1600*28*28},
+    seed=2025,
+    trust_remote_code=True,
+)
+
+# Running inference
+image_url = "https://github.com/bespokelabsai/minichart-playground-examples/blob/main/images/ilyc9wk4jf8b1.png?raw=true"
+question = "How many global regions maintained their startup funding losses below 30% in 2022?"
+
+print("\n\n=================Model Output:===============\n\n", get_answer(image_url, question))
+```
+
+---
+# Licence
+
+This work is licensed under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).
+For commercial licensing, please contact company@bespokelabs.ai.
+
+# Citation
+
+```
+@misc{bespoke_minichart_7b,
+  title  = {Bespoke-MiniChart-7B: pushing the frontiers of open VLMs for chart understanding},
+  author = {Liyan Tang and Shreyas Pimpalgaonkar and Kartik Sharma and Alexandros G. Dimakis and Mahesh Sathiamoorthy and Greg Durrett},
+  howpublished = {blog post},
+  year   = {2025},
+  url={https://huggingface.co/bespokelabs/Bespoke-MiniChart-7B},
+}
+```
+
+# Acknowledgements
+
+**Bespoke Labs** team:
+
+- Liyan Tang
+- Shreyas Pimpalgaonkar
+- Kartik Sharma
+- Alex Dimakis
+- Mahesh Sathiamoorthy
+- Greg Durrett
+
+
+*Model perfected at Bespoke Labs — where careful curation meets cutting‑edge modeling.*