Initialize project; model provided by the ModelHub XC community

Model: tabularisai/Faust-1 Source: Original Platform
2026-05-27 05:22:17 +08:00
commit 4696ec8d99
14 changed files with 490660 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,41 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+faust-1-dpo-golden-v1-1601-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+faust_bench.png filter=lfs diff=lfs merge=lfs -text
+logo-faust.webp filter=lfs diff=lfs merge=lfs -text
+faust-1-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+faust_1_q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+tokenizer_faust.png filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,395 @@
+---
+library_name: transformers
+license_link: https://huggingface.co/Qwen/Qwen3-1.7B/blob/main/LICENSE
+pipeline_tag: text-generation
+license: cc-by-nc-4.0
+extra_gated_prompt: >
+  ### FAUST-1 NON-COMMERCIAL LICENSE AGREEMENT
+
+
+  Version 1.0 — January 2025
+
+
+  "Faust-1" refers to the language model weights, code, and documentation made
+  available by Tabularis AI GmbH ("Tabularis") under this agreement.
+
+
+  1. License Grant
+
+  You are granted a non-exclusive, non-transferable, royalty-free license to
+  use, copy, and modify Faust-1 for non-commercial research and personal
+  purposes only.
+
+
+  2. Non-Commercial Use
+
+  "Non-commercial" means academic research, personal projects, and educational
+  use. Any use intended to generate revenue, provide commercial services, or
+  benefit a for-profit entity requires a separate commercial license.
+
+
+  3. Commercial Licensing
+
+  For commercial use, please contact: info@tabularis.ai
+
+
+  4. Attribution
+
+  You must include "Built with Faust-1 by Tabularis AI" in any derivative work
+  or publication.
+
+
+  5. No Warranty
+
+  Faust-1 is provided "as is" without warranties of any kind.
+
+
+  6. Termination
+
+  This license terminates automatically if you violate any terms.
+
+
+  ---
+
+  ### Additional Access Requirement
+
+  Access to this repository is approval-based.
+
+  You must join our Discord server: https://discord.gg/7WqEKw652R
+extra_gated_fields:
+  Name: text
+  Email: text
+  Affiliation: text
+  I have joined the Tabularis AI Discord server: checkbox
+  I accept the Faust-1 Non-Commercial License Agreement: checkbox
+extra_gated_description: |
+  Faust-1 is for non-commercial use only.
+  For commercial licensing contact info@tabularis.ai
+
+  Approval requires Discord membership.
+  Join: https://discord.gg/7WqEKw652R
+extra_gated_button_content: Submit
+language:
+- de
+- en
+
+
+
+
+
+tags:
+- llama.cpp
+- synthetic data
+
+
+
+
+---
+
+
+
+<!-- <a href="https://faust.tabularis.ai/" target="_blank" style="margin: 2px;">
+  <img
+    alt="Faust-1 Demo"
+    src="https://img.shields.io/badge/%E2%9C%A8%20Faust--1%20Demo-2b2b2b?style=flat&logo=ai&logoColor=white"
+    style="display: inline-block; vertical-align: middle;"
+  />
+</a> -->
+
+
+<p align="center">
+  <img src="./logo-faust.webp" alt="Faust-1 Logo" width="220">
+</p>
+
+# Faust-1 — German-First Large Language Model (1.6B)
+
+Faust-1 is a German-first large language model with 1.6B parameters, trained entirely from scratch. Model development comprises large-scale data collection and synthetic data generation, followed by data cleaning, normalization, and deduplication to reduce contamination and redundancy. Pre-training is performed on a predominantly German corpus using a decoder-only language modeling objective, resulting in a foundation model for the German language that captures lexical, syntactic, and semantic regularities at scale.
+
+Following pre-training, the model undergoes supervised post-training (instruction tuning) using labeled input–output pairs to adapt the base model for conversational and task-oriented use. In later stages, preference-based optimization, including Direct Preference Optimization (DPO), is applied to improve response quality, stability, and alignment with human expectations, while preserving the efficiency constraints required for small-scale and local deployment.
+
+<!-- Demo: [faust.tabularis.ai](https://faust.tabularis.ai)
+ -->
+
+
+
+
+
+
+
+> [!TIP]
+> **Designed for local and cost-efficient deployment.**  
+> Faust-1 is deliberately sized and optimized to run on **consumer-grade hardware** and **does not require expensive data-center GPUs**.
+---
+
+## Model summary
+
+- Repository: tabularisai/Faust-1  
+- Model type: decoder-only causal language model  
+- Parameters: 1.6B  
+- Interface: conversational / instruction (chat template provided)  
+- Primary language: German (~90%)
+- Tokenizer: custom tokenizer optimized for German morphology and compounding
+
+---
+
+## Quickstart
+
+
+
+
+### Conversational usage (recommended)
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM  
+import torch  
+
+model_id = "tabularisai/Faust-1"
+
+tokenizer = AutoTokenizer.from_pretrained(model_id)  
+model = AutoModelForCausalLM.from_pretrained(  
+    model_id,  
+    torch_dtype=torch.float16,  
+    device_map="auto",  
+)
+
+messages = [  
+    {"role": "user", "content": "Gib mir eine kurze Einführung in große Sprachmodelle (LLM)."}  
+]
+
+inputs = tokenizer.apply_chat_template(  
+    messages,  
+    add_generation_prompt=True,  
+    return_tensors="pt",  
+).to(model.device)
+
+outputs = model.generate(  
+    inputs,  
+    max_new_tokens=256,  
+    temperature=0.6,  
+    do_sample=True,  
+)
+
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+
+---
+
+## Conditional Generation
+
+```python
+# Install GuideGen first (run in a shell, or prefix with "!" in a notebook):
+#   pip install git+https://github.com/tabularis-ai/guidegen.git
+
+# Standard library
+import json
+import time
+from typing import Literal
+
+import guidegen as gg
+from pydantic import BaseModel, Field
+
+# Hugging Face access token - set via environment variable or .env file
+# You can set it with: export HUGGINGFACE_HUB_TOKEN=your_token_here
+# Or create a .env file with: HUGGINGFACE_HUB_TOKEN=your_token_here
+
+MODEL_NAME = "tabularisai/Faust-1"
+
+
+# --- Schema ---
+class EmailSummary(BaseModel):
+    """Structured summary of an email."""
+    Absender: str = Field(description="Der Name des Absenders.")
+    Betreff: str = Field(description="Worum geht es in der E-Mail? (max 5 Wörter)")
+    Zusammenfassung: str = Field(description="Kurze Zusammenfassung (max 2 Sätze).")
+    Prioritaet: Literal["hoch", "mittel", "niedrig"] = Field(description="Wie wichtig die E-Mail ist.")
+    # AntwortNoetig: bool = Field(description="Muss man auf die E-Mail antworten?")
+
+
+# --- Input ---
+email_text = """Hallo Jens,
+
+wir hatten uns bei CampusFounders im Rahmen unserer Pre-Seed-Runde kennengelernt.
+Seitdem haben wir große Fortschritte gemacht und bereiten aktuell unsere Seed-Runde vor.
+
+Wir entwickeln eine Infrastruktur für hocheffiziente, lokal trainierbare KI-Modelle – vollständig ohne Cloud.
+Sehr gern würden wir uns mit dir austauschen und prüfen, ob ein Intro zu US-VCs oder ein Gespräch mit Crestlight möglich wäre.
+
+Anbei ein kurzer OnePager zur Weiterleitung.
+
+Beste Grüße  
+Ricard"""
+
+
+
+
+
+# --- Prompt ---
+prompt = f"""
+Du bist ein intelligenter Assistent, der E-Mails analysiert und als JSON zusammenfasst.
+Halte die Zusammenfassung kurz (1-2 Sätze). Betreff maximal 5 Wörter.
+
+--- Beispiel ---
+E-Mail-Text:
+Sehr geehrte Damen und Herren, ich wollte nur nachfragen, ob meine Bestellung #12345 schon versandt wurde. Vielen Dank, Max Mustermann
+JSON-Antwort:
+{{
+  "Absender": "Max Mustermann",
+  "Betreff": "Bestellstatus Anfrage",
+  "Zusammenfassung": "Anfrage zum Versandstatus der Bestellung #12345.",
+  "Prioritaet": "mittel"
+}}
+--- Ende Beispiel ---
+
+Jetzt analysiere die folgende E-Mail und erstelle das JSON-Objekt.
+
+E-Mail-Text:
+{email_text}
+"""
+
+
+def main():
+    print("=" * 60)
+    print("EMAIL SUMMARIZATION WITH GUIDEGEN")
+    print("=" * 60)
+
+    print(f"\nLoading model: {MODEL_NAME}")
+    load_start = time.time()
+
+    gen = gg.GuideGen(
+        MODEL_NAME,
+        verbose=True,
+        use_chat_template=True,
+        enable_thinking=False,
+    )
+
+    load_time = time.time() - load_start
+    print(f"Model loaded in {load_time:.2f}s")
+
+    # --- Generate ---
+    print("\nGenerating structured summary...")
+    gen_start = time.time()
+
+    options = gg.GuideGenOptions(
+        temperature=0.6,
+        max_tokens=400,
+        do_sample=False,  # presumably greedy decoding, in which case temperature is unused
+    )
+
+    summary = gen.generate(prompt, EmailSummary, options=options)
+    gen_time = time.time() - gen_start
+    print(f"Generation complete in {gen_time:.2f}s")
+    print("\n--- Email Summary (JSON) ---")
+    print(json.dumps(summary.model_dump(), indent=2, ensure_ascii=False))
+    print(f"\n  Model load: {load_time:.2f}s | Generation: {gen_time:.2f}s | Total: {load_time + gen_time:.2f}s")
+
+if __name__ == "__main__":
+    main()
+```
+
+---
+
+## Training focus
+
+### German-first data distribution
+
+Faust-1 is trained from scratch with a German-dominant corpus. German syntax, compounding, morphology, and typical reasoning patterns are treated as the default operating regime rather than an edge case.
+
+### Verified synthetic data
+
+A substantial portion of the training signal comes from synthetic data. To keep this signal usable, generation is paired with explicit verification and filtering:
+
+- LLM-as-judge style evaluations  
+- rule-based and programmatic checks  
+- consistency and self-agreement filtering  
+
+This allows broad coverage of instruction-following and reasoning patterns while maintaining quality control.
+
+---
+
+## Tokenizer optimized for German
+
+Faust-1 uses a custom tokenizer optimized for German morphology and compounding. Token efficiency is treated as a deployment constraint, not just a preprocessing detail.
+
+![Tokenizer efficiency on German language](tokenizer_bench.png)
+
+Lower token counts on German text translate directly into more usable context, lower inference cost, and less fragmentation on compound-heavy inputs.
+
+
+<img src="tokenizer_faust.png" alt="Faust-1 vs OpenAI Tokenizers" width="800">
+
+
+---
+
+## German benchmark performance
+
+Faust-1 is evaluated on a set of standard German-language benchmarks:
+
+- ARC_de  
+- GSM8K_de  
+- HellaSwag_de  
+- MMLU_de  
+- TruthfulQA_de  
+
+![German benchmark performance](faust_bench.png)
+
+The target is best-in-class performance within the 1–2B parameter range for German-focused models, using benchmarks that are easy to reproduce in Hugging Face-based evaluation pipelines.
+
+---
+
+## Deployment examples
+
+Faust-1 can be deployed with common inference stacks that support decoder-only language models.
+
+vLLM (OpenAI-compatible API)
+```sh
+vllm serve tabularisai/Faust-1 --dtype float16
+```
+
+SGLang
+```sh
+python -m sglang.launch_server \
+  --model-path tabularisai/Faust-1 \
+  --dtype float16
+```
+
+llama.cpp (GGUF, local / on-device)
+```sh
+./llama-cli \
+  -m faust_1_q8_0.gguf \
+  -p "Erkläre kurz, was ein großes Sprachmodell ist."
+```
+
+The repository includes a prebuilt Q8_0 GGUF file for efficient local inference.
+
+---
+
+## Intended use
+
+- German conversational assistants  
+- research and benchmarking on German NLP tasks  
+- local and privacy-sensitive deployments  
+- on-device or edge experimentation  
+
+---
+
+## Roadmap
+
+- Reasoning-focused variant (coming soon)
+- Agent-oriented variant (coming soon)
+
+---
+
+## Citation
+
+A technical paper describing training methodology, tokenizer design, and evaluation is in preparation.
+
+
+
+
+
+
+
+
+
+
+Developed by [tabularis.ai](https://tabularis.ai) in Tübingen.
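The README's vLLM deployment example exposes an OpenAI-compatible API. A minimal client sketch follows, assuming the server listens on vLLM's default address `http://localhost:8000`. The helper names `build_chat_request` and `chat` are hypothetical and are not part of this repository.

```python
import json
import urllib.request

# Assumption: `vllm serve tabularisai/Faust-1` is running on its default port.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(prompt, max_tokens=256, temperature=0.6):
    """Build the JSON body for an OpenAI-style chat completion request."""
    return {
        "model": "tabularisai/Faust-1",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def chat(prompt):
    """POST the request to the running server and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]
```

With the server running, `chat("Erkläre kurz, was ein großes Sprachmodell ist.")` should return the assistant's reply as a string.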
--- a/chat_template.jinja
+++ b/chat_template.jinja
@@ -0,0 +1,14 @@
+{% for m in messages %}
+{%   if m['role'] == 'system' %}
+<|im_start|>system
+{{ m['content'] }}<|im_end|>
+{%   elif m['role'] == 'user' %}
+<|im_start|>user
+{{ m['content'] }}<|im_end|>
+{%   elif m['role'] == 'assistant' %}
+<|im_start|>assistant
+{% generation %}{{ m['content'] }}{{ eos_token }}{% endgeneration %}
+{%   endif %}
+{% endfor %}
+{% if add_generation_prompt %}<|im_start|>assistant
+{% endif %}
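The template above produces a ChatML-style layout, and the `{% generation %}` block marks the assistant text so that transformers can return an assistant-token mask. As a rough illustration, the pure-Python sketch below mimics the rendered string. It assumes the template is compiled with Jinja's `trim_blocks` and `lstrip_blocks` enabled (which, as far as I know, is what transformers does); exact newline placement may differ under other settings. The function `render_chat` is hypothetical.

```python
def render_chat(messages, add_generation_prompt=False, eos_token="<|im_end|>"):
    """Approximate the string produced by chat_template.jinja."""
    parts = []
    for m in messages:
        if m["role"] in ("system", "user"):
            # System and user turns end with a literal <|im_end|> and a newline.
            parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
        elif m["role"] == "assistant":
            # Assistant turns end with eos_token. Under the assumed whitespace
            # settings, the newline after {% endgeneration %} is trimmed.
            parts.append(f"<|im_start|>assistant\n{m['content']}{eos_token}")
        # Any other role is silently skipped, as in the template.
    if add_generation_prompt:
        # Open an assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chat(
    [{"role": "user", "content": "Hallo"}],
    add_generation_prompt=True,
)
print(repr(prompt))
```

For real use, `tokenizer.apply_chat_template` remains the authoritative renderer; the sketch is only meant to make the format visible.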
--- a/config.json
+++ b/config.json
@@ -0,0 +1,61 @@
+{
+  "architectures": [
+    "Qwen3ForCausalLM"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "bos_token_id": 3,
+  "dtype": "bfloat16",
+  "eos_token_id": 6,
+  "head_dim": 128,
+  "hidden_act": "silu",
+  "hidden_size": 2048,
+  "initializer_range": 0.02,
+  "intermediate_size": 6144,
+  "layer_types": [
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention"
+  ],
+  "max_position_embeddings": 40960,
+  "max_window_layers": 28,
+  "model_type": "qwen3",
+  "num_attention_heads": 16,
+  "num_hidden_layers": 28,
+  "num_key_value_heads": 8,
+  "pad_token_id": 1,
+  "rms_norm_eps": 1e-06,
+  "rope_scaling": null,
+  "rope_theta": 1000000,
+  "sliding_window": null,
+  "tie_word_embeddings": true,
+  "transformers_version": "4.57.5",
+  "use_cache": false,
+  "use_sliding_window": false,
+  "vocab_size": 100000
+}
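The values in `config.json` are enough to estimate the parameter count and the per-token KV-cache footprint. The sketch below assumes the usual dense Qwen3 layer layout: bias-free attention projections, a gated MLP with three weight matrices, RMSNorm before attention and before the MLP, and RMSNorm on the query and key heads. The function names are hypothetical.

```python
# Values copied from config.json in this commit.
CONFIG = {
    "vocab_size": 100000,
    "hidden_size": 2048,
    "intermediate_size": 6144,
    "num_hidden_layers": 28,
    "num_attention_heads": 16,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "tie_word_embeddings": True,
}


def count_parameters(cfg):
    """Estimate total parameters, assuming the standard dense Qwen3 layout."""
    h = cfg["hidden_size"]
    q_dim = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_dim = cfg["num_key_value_heads"] * cfg["head_dim"]
    attention = h * q_dim + 2 * h * kv_dim + q_dim * h  # q, k, v, o projections
    mlp = 3 * h * cfg["intermediate_size"]               # gate, up, down
    norms = 2 * h + 2 * cfg["head_dim"]                  # layer norms + q/k norms
    per_layer = attention + mlp + norms
    embeddings = cfg["vocab_size"] * h
    if not cfg["tie_word_embeddings"]:
        embeddings *= 2                                  # separate output head
    return cfg["num_hidden_layers"] * per_layer + embeddings + h  # + final norm


def kv_cache_bytes_per_token(cfg, bytes_per_value=2):
    """Bytes of K and V stored per token across all layers (2 bytes = fp16/bf16)."""
    kv_dim = cfg["num_key_value_heads"] * cfg["head_dim"]
    return 2 * cfg["num_hidden_layers"] * kv_dim * bytes_per_value


print(count_parameters(CONFIG))          # 1614210048
print(kv_cache_bytes_per_token(CONFIG))  # 114688
```

Under these assumptions the total is about 1.61B parameters, in line with the 1.6B stated in the README. At 2 bytes per bfloat16 value that is 3,228,420,096 bytes, roughly 36 KB below the size listed for `model.safetensors`, which would be consistent with a small metadata header.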
--- a/faust_1_q8_0.gguf
+++ b/faust_1_q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5c31eb18c8eade06e0e6593382017f55a19b2e830b860a29c96806d64073f582
+size 1719280512
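The pointer size above can be sanity-checked against the Q8_0 format, which stores weights in blocks of 32 signed 8-bit values plus one 16-bit scale, or 34 bytes per block. The sketch below uses the nominal 1.6B parameter count from the README as an assumption. A real GGUF file also carries metadata and the tokenizer vocabulary, and may keep some tensors at higher precision, so this is only a rough estimate.

```python
Q8_0_BLOCK_WEIGHTS = 32     # weights per quantization block
Q8_0_BLOCK_BYTES = 2 + 32   # one fp16 scale + 32 int8 values


def q8_0_bytes(num_weights):
    """Bytes needed to store num_weights values in Q8_0 blocks."""
    blocks = -(-num_weights // Q8_0_BLOCK_WEIGHTS)  # ceiling division
    return blocks * Q8_0_BLOCK_BYTES


# Assumption: the nominal parameter count from the README.
estimate = q8_0_bytes(1_600_000_000)
print(estimate)  # 1700000000
```

The estimate of about 1.70 GB is close to the 1,719,280,512 bytes recorded in the LFS pointer.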
--- a/faust_bench.png
+++ b/faust_bench.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:78231d070ee7b586ae09d4601e84b0629d616c7459428dbdf25ae72ec352a05f
+size 106310
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,13 @@
+{
+  "bos_token_id": 3,
+  "do_sample": true,
+  "eos_token_id": [
+    6,
+    4
+  ],
+  "pad_token_id": 1,
+  "temperature": 0.6,
+  "top_k": 20,
+  "top_p": 0.95,
+  "transformers_version": "4.57.5"
+}
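These defaults enable sampling with `temperature`, `top_k`, and `top_p`, and stop generation at token ID 6 or 4 (`<|im_end|>` and `<|eos|>` according to `tokenizer_config.json`). The simplified sketch below shows how the three sampling settings interact on a single list of logits. It is an illustration using only the standard library, not the transformers implementation, and the function names are hypothetical.

```python
import math
import random


def filtered_distribution(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return {token_id: probability} after temperature, top-k, and top-p."""
    # 1. Temperature: divide the logits; values below 1 sharpen the distribution.
    scaled = [value / temperature for value in logits]

    # 2. Top-k: keep only the k highest-scoring token ids.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    kept = ranked[:top_k]

    # Softmax over the surviving tokens (shifted by the max for stability).
    peak = scaled[kept[0]]
    weights = [math.exp(scaled[i] - peak) for i in kept]
    total = sum(weights)
    probs = [w / total for w in weights]

    # 3. Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    nucleus, mass = [], 0.0
    for token_id, p in zip(kept, probs):
        nucleus.append((token_id, p))
        mass += p
        if mass >= top_p:
            break

    # Renormalize so the remaining probabilities sum to 1.
    return {token_id: p / mass for token_id, p in nucleus}


def sample(logits, rng=random, **kwargs):
    """Draw one token id from the filtered distribution."""
    dist = filtered_distribution(logits, **kwargs)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]
```

For example, `filtered_distribution([2.0, 1.0, 0.0, -1.0], temperature=1.0, top_k=3, top_p=0.9)` keeps only token IDs 0 and 1, because their combined probability already exceeds 0.9.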
--- a/logo-faust.webp
+++ b/logo-faust.webp
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:60e1441eecdde9fab1e9d681f1102d979046b981f5174d1f074f6849dff5b2a6
+size 1002788
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:771c4911227f2792ce43c5f4e285bb4ec67942b95fa15f376b20cd2227879de6
+size 3228455704
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,45 @@
+{
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|im_sep|>",
+    "<|special_0|>",
+    "<|special_1|>",
+    "<|special_2|>",
+    "<|special_3|>",
+    "<|special_4|>",
+    "<|special_5|>",
+    "<|special_6|>",
+    "<|special_7|>",
+    "<|special_8|>",
+    "<|special_9|>"
+  ],
+  "bos_token": {
+    "content": "<|bos|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "<|im_end|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<|pad|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<|unk|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
--- a/tokenizer.json
+++ b/tokenizer.json
--- a/tokenizer_bench.png
+++ b/tokenizer_bench.png
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,180 @@
+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "<|pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "<|unk|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "3": {
+      "content": "<|bos|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "4": {
+      "content": "<|eos|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "5": {
+      "content": "<|im_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "6": {
+      "content": "<|im_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "7": {
+      "content": "<|im_sep|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "8": {
+      "content": "<|special_0|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "9": {
+      "content": "<|special_1|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "10": {
+      "content": "<|special_2|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "11": {
+      "content": "<|special_3|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "12": {
+      "content": "<|special_4|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "13": {
+      "content": "<|special_5|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "14": {
+      "content": "<|special_6|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "15": {
+      "content": "<|special_7|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "16": {
+      "content": "<|special_8|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "17": {
+      "content": "<|special_9|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|im_sep|>",
+    "<|special_0|>",
+    "<|special_1|>",
+    "<|special_2|>",
+    "<|special_3|>",
+    "<|special_4|>",
+    "<|special_5|>",
+    "<|special_6|>",
+    "<|special_7|>",
+    "<|special_8|>",
+    "<|special_9|>"
+  ],
+  "bos_token": "<|bos|>",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "extra_special_tokens": {},
+  "max_length": 2048,
+  "model_max_length": 8192,
+  "pad_token": "<|pad|>",
+  "stride": 0,
+  "tokenizer_class": "PreTrainedTokenizerFast",
+  "truncation_side": "right",
+  "truncation_strategy": "longest_first",
+  "unk_token": "<|unk|>",
+  "return_token_type_ids": false,
+  "model_input_names": [
+    "input_ids",
+    "attention_mask"
+  ]
+}
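The special-token IDs appear in three files in this commit: `config.json`, `generation_config.json`, and `tokenizer_config.json`. A small cross-check, with the values copied by hand from those files, confirms they agree. The function `check_special_tokens` is a hypothetical helper, not part of the repository.

```python
# First eight entries of "added_tokens_decoder" in tokenizer_config.json.
ADDED_TOKENS = {
    0: "<|endoftext|>", 1: "<|pad|>", 2: "<|unk|>", 3: "<|bos|>",
    4: "<|eos|>", 5: "<|im_start|>", 6: "<|im_end|>", 7: "<|im_sep|>",
}
# Token IDs declared in config.json.
MODEL_IDS = {"bos_token_id": 3, "eos_token_id": 6, "pad_token_id": 1}
# Token strings declared in tokenizer_config.json.
TOKENIZER_TOKENS = {
    "bos_token": "<|bos|>", "eos_token": "<|im_end|>", "pad_token": "<|pad|>",
}
# Stop IDs declared in generation_config.json.
GENERATION_EOS_IDS = [6, 4]


def check_special_tokens():
    """Return a list of mismatches between model IDs and tokenizer strings."""
    mismatches = []
    for key, token_id in MODEL_IDS.items():
        name = key[: -len("_id")]  # "bos_token_id" -> "bos_token"
        if ADDED_TOKENS.get(token_id) != TOKENIZER_TOKENS[name]:
            mismatches.append((key, token_id, ADDED_TOKENS.get(token_id)))
    return mismatches


print(check_special_tokens())                        # []
print([ADDED_TOKENS[i] for i in GENERATION_EOS_IDS])  # ['<|im_end|>', '<|eos|>']
```

The empty list indicates that the IDs and token strings line up. Separately, the tokenizer's `model_max_length` of 8192 is smaller than the model's `max_position_embeddings` of 40960, so the tokenizer reports the tighter limit.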
--- a/tokenizer_faust.png
+++ b/tokenizer_faust.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:43356a78c9e9925f7b8c3b6fc8531517cba19bbf9e73ae81619d1e928fa488dc
+size 453747