willow-alpha/vision_adapter
ModelHub XC 1767ed14d9 Initialize project; model provided by the ModelHub XC community
Model: North-ML1/willow-alpha
Source: Original Platform
2026-06-10 15:02:23 +08:00

License: MIT

Forge-1V Micro Vision Adapter Scaffold

This folder contains an experimental, untrained micro vision-adapter scaffold for future Forge-1V work.

It is intentionally separate from the main text checkpoint:

  • The main config.json remains Llama-compatible.
  • The GGUF export remains text-only.
  • These files do not give the released model any ability to process images.

Suggested target design:

  • Tiny patch encoder: maps 3-channel images to patch embeddings of a small vision width.
  • Projection: maps the vision width to the 1024-dimensional Forge text hidden size.
  • Prefix tokens: projected visual tokens can be prepended to the text sequence in a future custom multimodal training run.
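
The three stages above can be sketched as a shape-level walkthrough in NumPy. This is only an illustration under stated assumptions, not the contents of this folder: the patch size (16), the vision width (64), and every function name are hypothetical, and only the 3 input channels and the 1024-dimensional text hidden size come from the design notes. The weights are random, matching the untrained status of the scaffold.

```python
import numpy as np

# Assumed sizes for illustration; only CHANNELS and TEXT_HIDDEN come from the README.
PATCH = 16          # hypothetical patch side length
CHANNELS = 3        # 3-channel images (from the README)
VISION_WIDTH = 64   # hypothetical "small vision width"
TEXT_HIDDEN = 1024  # Forge text hidden size (from the README)

rng = np.random.default_rng(0)

# Untrained, randomly initialised weights for the two linear stages.
W_patch = rng.normal(0.0, 0.02, size=(CHANNELS * PATCH * PATCH, VISION_WIDTH))
b_patch = np.zeros(VISION_WIDTH)
W_proj = rng.normal(0.0, 0.02, size=(VISION_WIDTH, TEXT_HIDDEN))
b_proj = np.zeros(TEXT_HIDDEN)


def patchify(image):
    """Split an (H, W, 3) image into flattened, non-overlapping PATCH x PATCH patches."""
    h, w, c = image.shape
    assert c == CHANNELS and h % PATCH == 0 and w % PATCH == 0
    gh, gw = h // PATCH, w // PATCH
    # (gh, PATCH, gw, PATCH, c) -> (gh, gw, PATCH, PATCH, c) -> (gh * gw, PATCH * PATCH * c)
    patches = image.reshape(gh, PATCH, gw, PATCH, c).transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, PATCH * PATCH * c)


def visual_prefix(image):
    """Patch encoder followed by projection into the text hidden size."""
    tokens = patchify(image) @ W_patch + b_patch  # (num_patches, VISION_WIDTH)
    return tokens @ W_proj + b_proj               # (num_patches, TEXT_HIDDEN)


def prepend_prefix(image, text_embeddings):
    """Prepend projected visual tokens to a (seq_len, TEXT_HIDDEN) text sequence."""
    return np.concatenate([visual_prefix(image), text_embeddings], axis=0)


# Usage: a 32x64 image yields 2 * 4 = 8 visual tokens ahead of 5 text tokens.
image = rng.random((32, 64, CHANNELS))
text = rng.normal(size=(5, TEXT_HIDDEN))
print(prepend_prefix(image, text).shape)  # (13, 1024)
```

A single linear layer per stage keeps the sketch small. A real training run would also have to decide how the text model's attention mask and position handling treat the prefix, which this sketch does not cover.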

The scaffold design adds well under 1M extra parameters (approximate), keeping the total system under 400M parameters.
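
The parameter estimate can be checked with a short calculation. The sketch below assumes a 16x16 patch size and a vision width of 64, with one biased linear layer per stage. Neither size is specified in this README, so the exact total is illustrative only.

```python
# Assumed sizes for illustration; only CHANNELS and TEXT_HIDDEN come from the README.
PATCH = 16          # hypothetical patch side length
CHANNELS = 3        # 3-channel images (from the README)
VISION_WIDTH = 64   # hypothetical "small vision width"
TEXT_HIDDEN = 1024  # Forge text hidden size (from the README)


def linear_params(n_in, n_out, bias=True):
    """Parameter count of a dense layer: weight matrix plus optional bias."""
    return n_in * n_out + (n_out if bias else 0)


patch_encoder = linear_params(CHANNELS * PATCH * PATCH, VISION_WIDTH)  # 768 -> 64
projection = linear_params(VISION_WIDTH, TEXT_HIDDEN)                  # 64 -> 1024
adapter_total = patch_encoder + projection

print(patch_encoder)  # 49216
print(projection)     # 66560
print(adapter_total)  # 115776
```

With these assumed sizes the adapter has 115,776 parameters, roughly an eighth of the 1M bound.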