---
license: apache-2.0
tags:
- llm
- hyperbolic
- geometry
- adapter
- peft
- research
base_model:
- Qwen/Qwen2.5-7B
pipeline_tag: text-generation
library_name: peft
---
# ManifoldGL – Information‑Geometric Adapter for LLMs
ManifoldGL is a parameter‑efficient adapter that enforces **hyperbolic geometry** on the latent space of large language models. It treats the meaning of a token as a **fiber** over a hyperbolic base manifold (a Poincaré ball), rather than as a single vector in flat Euclidean space. Latent states are projected onto the ball, and attention scores are computed from geodesic distances. A sheaf‑theoretic consistency loss and natural‑gradient optimization maintain semantic structure during training.
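
The projection step can be sketched as follows. This is a minimal illustration, not the adapter's actual implementation: `project_to_poincare_ball` and its norm-clipping scheme are assumptions for the sketch.

```python
import torch


def project_to_poincare_ball(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Rescale vectors so they lie strictly inside the unit Poincare ball.

    Hypothetical helper: points already inside the ball are left unchanged;
    points on or outside the boundary are clipped to norm 1 - eps, since
    geodesic distances diverge at the boundary.
    """
    norm = x.norm(dim=-1, keepdim=True)
    max_norm = 1.0 - eps
    scale = torch.where(norm > max_norm, max_norm / norm, torch.ones_like(norm))
    return x * scale
```

Keeping a small margin (`eps`) from the boundary is a common trick in hyperbolic deep learning, because distance computations become numerically unstable as the norm approaches 1.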
## Motivation and theoretical background
Modern LLMs embed tokens in a Euclidean vector space. While convenient, Euclidean geometry has limited capacity to represent hierarchical structure: the volume of a Euclidean ball grows only polynomially with its radius, whereas hierarchical trees expand exponentially with depth. By contrast, **hyperbolic space** grows exponentially and preserves both local and global relationships in a hierarchy. Hyperbolic embeddings have been shown to outperform Euclidean ones on lexical entailment, similarity, and analogy tasks. ManifoldGL leverages these properties by modelling the latent space as a fiber bundle over a hyperbolic base: each point in the Poincaré ball encodes a context, and its fiber contains a distribution of semantic components.
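
For reference, the geodesic distance on the Poincaré ball at curvature κ = −1 (the quantity used for attention above) has a standard closed form; the sketch below is illustrative, with hypothetical names.

```python
import torch


def poincare_distance(u: torch.Tensor, v: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Geodesic distance between points inside the unit Poincare ball (curvature -1).

    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    sq_dist = ((u - v) ** 2).sum(dim=-1)
    u_factor = (1.0 - (u ** 2).sum(dim=-1)).clamp_min(eps)
    v_factor = (1.0 - (v ** 2).sum(dim=-1)).clamp_min(eps)
    return torch.acosh(1.0 + 2.0 * sq_dist / (u_factor * v_factor))
```

The exponential capacity shows up directly here: the distance from the origin to a point at radius r is 2 artanh(r), which grows without bound as r approaches 1, so a thin shell near the boundary can host an entire exponentially branching layer of a tree.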
## Results on ARC‑AGI benchmark
ManifoldGL fine‑tuned on Qwen2.5‑7B improves task accuracy on the ARC‑AGI benchmark from **12.4 %** to **28.7 %**, a **131.5 % relative improvement**. The model also achieves a **Manifold Faithfulness Rate (MFR) of 94.2 %**, indicating strong adherence to the hyperbolic constraints, and maintains a curvature close to the target κ = ‑1 (mean ‑0.98 ± 0.04). Ablation studies show that removing curvature regularization, natural gradients, sheaf consistency, or the hyperbolic target each significantly reduces accuracy; replacing the hyperbolic target with a Euclidean one causes the largest drop (‑10.9 %), highlighting the importance of hyperbolic geometry.
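
The relative-improvement figure follows directly from the two reported accuracies:

```python
# ARC-AGI accuracy (%) before and after applying the adapter, as reported above.
baseline, adapted = 12.4, 28.7
relative_gain = (adapted - baseline) / baseline * 100
print(f"{relative_gain:.1f} % relative improvement")
```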
## Files in this repository
This model card accompanies adapter weights trained with ManifoldGL. The files follow the structure of the original repository:
- `adapter_config.json` – configuration for PEFT/LoRA loading
- `pytorch_adapter.bin` – adapter weights
- `README.md` – this model card
## Quick start
```python
from transformers import AutoModelForCausalLM
from peft import PeftModel
# Load the base model (Qwen2.5-7B)
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B")
# Load the ManifoldGL adapter
model = PeftModel.from_pretrained(base_model, "jesusvilela/manifoldgl")
# Generate text with the hyperbolic adapter
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")
inputs = tokenizer("Hyperbolic geometry is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Usage
This adapter can be loaded with [PEFT](https://github.com/huggingface/peft) on top of any compatible Qwen2.5‑7B model. During generation, latent states are projected into hyperbolic space and meaning is represented as fibers over the Poincaré ball. We recommend FP32 precision (e.g. `torch_dtype=torch.float32` when loading the base model) for maximum stability, since operations near the boundary of the Poincaré ball are sensitive to rounding error.
## Citation
If you use ManifoldGL in your work, please cite the accompanying thesis and repository.