Initialize project; model provided by the ModelHub XC community

Model: ai-guru/lakhclean_mmmtrack_4bars_d-2048 Source: Original Platform
2026-06-07 06:55:16 +08:00
commit 9dc06ed753
8 changed files with 491 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,28 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bin.* filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zstandard filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,107 @@
+---
+tags:
+- gpt2
+- text-generation
+- music-modeling
+- music-generation
+widget:
+- text: PIECE_START
+- text: PIECE_START PIECE_START TRACK_START INST=34 DENSITY=8
+- text: PIECE_START TRACK_START INST=1
+---
+
+
+# GPT-2 for Music
+
+Language models such as GPT-2 can be used for music generation. The idea is to represent pieces of music as text, effectively reducing the task to language generation.
+
+This model is a rather small instance of GPT-2 trained on the [Lakhclean dataset](https://colinraffel.com/projects/lmd/). The model generates 4 bars at a time at 16th-note resolution in 4/4 meter.
+
+If you want to contribute, if you want to say hello, if you want to know more, find me here:
+
+- https://www.linkedin.com/in/dr-tristan-behrens-734967a2/
+- https://www.youtube.com/@drtristanbehrens 
+- https://twitter.com/DrTBehrens
+- https://github.com/AI-Guru
+- https://huggingface.co/TristanBehrens
+- https://huggingface.co/ai-guru
+
+Run the model on Google Colab: https://colab.research.google.com/drive/1Mz-KJ8vX4Wylr4mzvgP-MclDwQJ06KSq?usp=sharing
+
+## License
+
+You are free to use this model in any open-source context without charge. If you do so, please credit me.
+
+However, if you wish to use the model for commercial purposes, please contact me to discuss licensing terms. Depending on the specific use case, there may be fees associated with commercial use. I am open to negotiating the terms of the license to meet your needs and ensure that the model is used appropriately. Please feel free to reach out to me at your earliest convenience to discuss further.
+
+## Model description
+
+The model is GPT-2 with 6 decoder blocks, each with 8 attention heads. The context length is 2048 tokens. The embedding dimension is 512.
+
+## Model family
+
+This model is part of a huge group of Transformers I have trained. Most of them are not publicly available.
+
+If you are interested in using and/or licensing one of the models, please get in touch.
+
+### Lakhclean
+
+These models were trained on roughly 15K MIDI files (the same as the model you are viewing now) from the Lakhclean dataset.
+
+- lakhclean_mmmbar_4bars_d-2048: 4 bars resolution, bar inpainting, note density conditioning
+- lakhclean_mmmbar_8bars_d-2048: 8 bars resolution, bar inpainting, note density conditioning
+- lakhclean_mmmtrack_4bars_chords: 4 bars resolution, chord conditioning
+- lakhclean_mmmtrack_4bars_d-2048: 4 bars resolution, note density conditioning (this model)
+- lakhclean_mmmtrack_4bars_simple-2048: 4 bars resolution
+- lakhclean_mmmtrack_8bars_d-2048: 8 bars resolution, note density conditioning
+
+### Lakhfull
+
+These models were trained on roughly 175K MIDI files from the Lakh dataset.
+
+- lakhfull_mmmtrack_4bars_d-2048: 4 bars resolution, note density conditioning (the big brother of this model)
+- lakhfull_mmmtrack_4bars_simple-2048: 4 bars resolution
+
+### Metal
+
+These models were trained on roughly 7K MIDI files from my own collections. They support genre conditioning.
+
+- metal_mmmbar_4bars_d-2048: 4 bars resolution, bar inpainting, note density conditioning
+- metal_mmmbar_8bars_d-2048: 8 bars resolution, bar inpainting, note density conditioning
+- metal_mmmtrack_4bars_d-2048: 4 bars resolution, note density conditioning
+- metal_mmmtrack_8bars_d-2048: 8 bars resolution, note density conditioning
+
+### MetaMIDI Dataset genres
+
+These models were trained on genre-specific subsets of the MetaMIDI dataset.
+
+- mmd-baroque_mmmtrack_4bars_d-2048: 4 bars resolution, note density conditioning
+- mmd-baroque_mmmtrack_8bars_d-2048: 8 bars resolution, note density conditioning
+- mmd-classical_mmmtrack_8bars_d-2048: 8 bars resolution, note density conditioning
+- mmd-noncontemporary_mmmtrack_8bars_d-2048: 8 bars resolution, note density conditioning
+- mmd-pop_mmmtrack_8bars_d-2048: 8 bars resolution, note density conditioning
+- mmd-renaissance_mmmtrack_8bars_d-2048: 8 bars resolution, note density conditioning
+
+### MetaMIDI Dataset full
+
+These models were trained on roughly 400K MIDI files from the MetaMIDI dataset.
+
+- mmd-full_mmmtrack_4bars_d-2048: 4 bars resolution, note density conditioning
+- mmd-full_mmmtrack_8bars_d-2048: 8 bars resolution, note density conditioning
+- mmd-full_mmmtrack_4bars_chords-d-2048: 4 bars resolution, note density conditioning, chord conditioning (most powerful model in the entire group)
+
+## Intended uses & limitations
+
+This model is just a proof of concept. It shows that the Hugging Face ecosystem can be used to compose music.
+
+### How to use
+
+The notebook `lakhclean_gpt2_generation.ipynb` in this repo generates symbolic music and then renders it.
+
+### Limitations and bias
+
+Since this model was trained on a very small corpus of music, it overfits heavily.
+
+### Acknowledgements
+
+This model has been created with support from NVIDIA. I am very grateful for the GPU compute they provided!
--- a/config.json
+++ b/config.json
@@ -0,0 +1,34 @@
+{
+  "_name_or_path": "../transformermusic/bin/checkpoints/lakhclean_mmmtrack_4bars_d-2048/20220317-1538/checkpoint-120000",
+  "activation_function": "gelu_new",
+  "architectures": [
+    "GPT2LMHeadModel"
+  ],
+  "attn_pdrop": 0.1,
+  "bos_token_id": 50256,
+  "embd_pdrop": 0.1,
+  "eos_token_id": 50256,
+  "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-05,
+  "model_type": "gpt2",
+  "n_ctx": 2048,
+  "n_embd": 512,
+  "n_head": 8,
+  "n_inner": null,
+  "n_layer": 6,
+  "n_positions": 2048,
+  "pad_token_id": 3,
+  "reorder_and_upcast_attn": false,
+  "resid_pdrop": 0.1,
+  "scale_attn_by_inverse_layer_idx": false,
+  "scale_attn_weights": true,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.16.2",
+  "use_cache": true,
+  "vocab_size": 422
+}
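The architecture values in config.json above can be cross-checked against the size recorded for pytorch_model.bin. The sketch below counts parameters for a standard GPT-2 layout with tied input/output embeddings. The per-layer uint8 causal-mask buffers are an assumption about how this checkpoint was serialized, not something the commit states, and the function name is illustrative.

```python
# Values copied from config.json in this commit.
VOCAB_SIZE = 422
N_POSITIONS = 2048
N_EMBD = 512
N_LAYER = 6
N_INNER = 4 * N_EMBD  # "n_inner": null falls back to 4 * n_embd in GPT-2


def gpt2_parameter_count(vocab, positions, embd, layers, inner):
    """Count trainable parameters of a GPT-2 LM with tied embeddings."""
    embeddings = vocab * embd + positions * embd   # token + position tables
    layer_norm = 2 * embd                          # weight + bias
    attention = embd * 3 * embd + 3 * embd         # fused q/k/v projection
    attention += embd * embd + embd                # attention output projection
    mlp = (embd * inner + inner) + (inner * embd + embd)
    per_layer = 2 * layer_norm + attention + mlp
    return embeddings + layers * per_layer + layer_norm  # plus final layer norm


params = gpt2_parameter_count(VOCAB_SIZE, N_POSITIONS, N_EMBD, N_LAYER, N_INNER)
weight_bytes = 4 * params  # "torch_dtype": "float32"

# Assumption: one uint8 causal-mask buffer of n_positions x n_positions per layer
# is stored alongside the weights.
mask_bytes = N_LAYER * N_POSITIONS * N_POSITIONS

print(params, weight_bytes, weight_bytes + mask_bytes)
```

This prints 20179968 parameters, 80719872 bytes of float32 weights, and 105885696 bytes including the assumed mask buffers, which is within about 30 KB of the 105915613 bytes recorded in the LFS pointer for pytorch_model.bin.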
--- a/lakhclean_gpt2_generation.ipynb
+++ b/lakhclean_gpt2_generation.ipynb
@@ -0,0 +1,316 @@
+{
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "DWLOSBkp0A2U"
+      },
+      "source": [
+        "# GPT-2 for music - By Dr. Tristan Behrens\n",
+        "\n",
+        "This notebook shows you how to generate music with GPT-2.\n",
+        "\n",
+        "--- \n",
+        "\n",
+        "## Find me online\n",
+        "\n",
+        "- https://www.linkedin.com/in/dr-tristan-behrens-734967a2/\n",
+        "- https://twitter.com/DrTBehrens\n",
+        "- https://github.com/AI-Guru\n",
+        "- https://huggingface.co/TristanBehrens\n",
+        "- https://huggingface.co/ai-guru\n",
+        "\n",
+        "\n",
+        "---\n",
+        "\n",
+        "## Install dependencies.\n",
+        "\n",
+        "The following cell sets up fluidsynth and pyfluidsynth on Google Colab."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "if \"google.colab\" in str(get_ipython()):\n",
+        "    print(\"Installing dependencies...\")\n",
+        "    !apt-get update -qq && apt-get install -qq libfluidsynth2 build-essential libasound2-dev libjack-dev\n",
+        "    !pip install -qU pyfluidsynth"
+      ],
+      "metadata": {
+        "id": "k1a8sd2KZCz9"
+      },
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "6J_AnhV8D5p6"
+      },
+      "outputs": [],
+      "source": [
+        "!pip install transformers\n",
+        "!pip install note_seq"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "RzhHhFll0JVl"
+      },
+      "source": [
+        "## Load the tokenizer and the model from 🤗 Hub."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "import os\n",
+        "os.environ[\"PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION\"] = \"python\""
+      ],
+      "metadata": {
+        "id": "zGupj_vuZ9f2"
+      },
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "g3ih12FMD7bs"
+      },
+      "outputs": [],
+      "source": [
+        "from transformers import AutoTokenizer, AutoModelForCausalLM\n",
+        "\n",
+        "tokenizer = AutoTokenizer.from_pretrained(\"ai-guru/lakhclean_mmmtrack_4bars_d-2048\")\n",
+        "model = AutoModelForCausalLM.from_pretrained(\"ai-guru/lakhclean_mmmtrack_4bars_d-2048\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "YfHXFugA0WdI"
+      },
+      "source": [
+        "## Convert the generated tokens to music that you can listen to.\n",
+        "\n",
+        "This uses note_seq, Google Magenta's library for MIDI-like note sequences. It can also load and save MIDI files. Check their repo if you want to learn more.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "L3QMj8NyEBqs"
+      },
+      "outputs": [],
+      "source": [
+        "import note_seq\n",
+        "\n",
+        "NOTE_LENGTH_16TH_120BPM = 0.25 * 60 / 120\n",
+        "BAR_LENGTH_120BPM = 4.0 * 60 / 120\n",
+        "\n",
+        "def token_sequence_to_note_sequence(token_sequence, use_program=True, use_drums=True, instrument_mapper=None, only_piano=False):\n",
+        "\n",
+        "    if isinstance(token_sequence, str):\n",
+        "        token_sequence = token_sequence.split()\n",
+        "\n",
+        "    note_sequence = empty_note_sequence()\n",
+        "\n",
+        "    # Render all notes.\n",
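+        "    current_program = 1\n",
+        "    current_is_drum = False\n",
+        "    current_instrument = track_count = 0\n",
+        "    current_bar_index, current_time, current_notes = 0, 0.0, {}  # Defaults if tokens arrive before TRACK_START or BAR_START.\n",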
+        "    for token_index, token in enumerate(token_sequence):\n",
+        "\n",
+        "        if token == \"PIECE_START\":\n",
+        "            pass\n",
+        "        elif token == \"PIECE_END\":\n",
+        "            print(\"The end.\")\n",
+        "            break\n",
+        "        elif token == \"TRACK_START\":\n",
+        "            current_bar_index = 0\n",
+        "            track_count += 1\n",
+        "            pass\n",
+        "        elif token == \"TRACK_END\":\n",
+        "            pass\n",
+        "        elif token == \"KEYS_START\":\n",
+        "            pass\n",
+        "        elif token == \"KEYS_END\":\n",
+        "            pass\n",
+        "        elif token.startswith(\"KEY=\"):\n",
+        "            pass\n",
+        "        elif token.startswith(\"INST\"):\n",
+        "            instrument = token.split(\"=\")[-1]\n",
+        "            if instrument != \"DRUMS\" and use_program:\n",
+        "                if instrument_mapper is not None:\n",
+        "                    if instrument in instrument_mapper:\n",
+        "                        instrument = instrument_mapper[instrument]\n",
+        "                current_program = int(instrument)\n",
+        "                current_instrument = track_count\n",
+        "                current_is_drum = False\n",
+        "            if instrument == \"DRUMS\" and use_drums:\n",
+        "                current_instrument = 0\n",
+        "                current_program = 0\n",
+        "                current_is_drum = True\n",
+        "        elif token == \"BAR_START\":\n",
+        "            current_time = current_bar_index * BAR_LENGTH_120BPM\n",
+        "            current_notes = {}\n",
+        "        elif token == \"BAR_END\":\n",
+        "            current_bar_index += 1\n",
+        "            pass\n",
+        "        elif token.startswith(\"NOTE_ON\"):\n",
+        "            pitch = int(token.split(\"=\")[-1])\n",
+        "            note = note_sequence.notes.add()\n",
+        "            note.start_time = current_time\n",
+        "            note.end_time = current_time + 4 * NOTE_LENGTH_16TH_120BPM\n",
+        "            note.pitch = pitch\n",
+        "            note.instrument = current_instrument\n",
+        "            note.program = current_program\n",
+        "            note.velocity = 80\n",
+        "            note.is_drum = current_is_drum\n",
+        "            current_notes[pitch] = note\n",
+        "        elif token.startswith(\"NOTE_OFF\"):\n",
+        "            pitch = int(token.split(\"=\")[-1])\n",
+        "            if pitch in current_notes:\n",
+        "                note = current_notes[pitch]\n",
+        "                note.end_time = current_time\n",
+        "        elif token.startswith(\"TIME_DELTA\"):\n",
+        "            delta = float(token.split(\"=\")[-1]) * NOTE_LENGTH_16TH_120BPM\n",
+        "            current_time += delta\n",
+        "        elif token.startswith(\"DENSITY=\"):\n",
+        "            pass\n",
+        "        elif token == \"[PAD]\":\n",
+        "            pass\n",
+        "        else:\n",
+        "            #print(f\"Ignored token {token}.\")\n",
+        "            pass\n",
+        "\n",
+        "    # Make the instruments right.\n",
+        "    instruments_drums = []\n",
+        "    for note in note_sequence.notes:\n",
+        "        pair = [note.program, note.is_drum]\n",
+        "        if pair not in instruments_drums:\n",
+        "            instruments_drums += [pair]\n",
+        "        note.instrument = instruments_drums.index(pair)\n",
+        "\n",
+        "    if only_piano:\n",
+        "        for note in note_sequence.notes:\n",
+        "            if not note.is_drum:\n",
+        "                note.instrument = 0\n",
+        "                note.program = 0\n",
+        "\n",
+        "    return note_sequence\n",
+        "\n",
+        "def empty_note_sequence(qpm=120.0, total_time=0.0):\n",
+        "    note_sequence = note_seq.protobuf.music_pb2.NoteSequence()\n",
+        "    note_sequence.tempos.add().qpm = qpm\n",
+        "    note_sequence.ticks_per_quarter = note_seq.constants.STANDARD_PPQ\n",
+        "    note_sequence.total_time = total_time\n",
+        "    return note_sequence"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "## Generate music\n",
+        "\n",
+        "This will generate one track of music and render it. "
+      ],
+      "metadata": {
+        "id": "4kr2dECziaFA"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "generated_sequence = \"PIECE_START\""
+      ],
+      "metadata": {
+        "id": "cUg1DrlygzgT"
+      },
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Note: Run the following cell multiple times to generate more tracks."
+      ],
+      "metadata": {
+        "id": "SinUPIHyimr5"
+      }
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "ZYpukydNESDF"
+      },
+      "outputs": [],
+      "source": [
+        "# Encode the conditioning tokens.\n",
+        "input_ids = tokenizer.encode(generated_sequence, return_tensors=\"pt\")\n",
+        "#print(input_ids)\n",
+        "\n",
+        "# Generate more tokens.\n",
+        "eos_token_id = tokenizer.encode(\"TRACK_END\")[0]\n",
+        "temperature = 1.0\n",
+        "generated_ids = model.generate(\n",
+        "    input_ids, \n",
+        "    max_length=2048,\n",
+        "    do_sample=True,\n",
+        "    temperature=temperature,\n",
+        "    eos_token_id=eos_token_id,\n",
+        ")\n",
+        "generated_sequence = tokenizer.decode(generated_ids[0])\n",
+        "print(generated_sequence)\n",
+        "\n",
+        "note_sequence = token_sequence_to_note_sequence(generated_sequence)\n",
+        "\n",
+        "synth = note_seq.fluidsynth\n",
+        "note_seq.plot_sequence(note_sequence)\n",
+        "note_seq.play_sequence(note_sequence, synth)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "d1x6HeF90kkO"
+      },
+      "source": [
+        "# Thank you!"
+      ]
+    }
+  ],
+  "metadata": {
+    "colab": {
+      "provenance": []
+    },
+    "kernelspec": {
+      "display_name": "Python 3 (ipykernel)",
+      "language": "python",
+      "name": "python3"
+    },
+    "language_info": {
+      "codemirror_mode": {
+        "name": "ipython",
+        "version": 3
+      },
+      "file_extension": ".py",
+      "mimetype": "text/x-python",
+      "name": "python",
+      "nbconvert_exporter": "python",
+      "pygments_lexer": "ipython3",
+      "version": "3.9.7"
+    },
+    "accelerator": "GPU",
+    "gpuClass": "standard"
+  },
+  "nbformat": 4,
+  "nbformat_minor": 0
+}
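The token format handled by `token_sequence_to_note_sequence` in the notebook above can be illustrated without note_seq. The following is a minimal sketch that mirrors the notebook's timing rules (120 BPM, `TIME_DELTA` counted in 16th notes, each `BAR_START` resetting the clock to the bar boundary) and returns plain tuples. The function name `decode_notes` and the sample token string are hypothetical and hand-written, not output from the model.

```python
SIXTEENTH_SECONDS = 0.25 * 60 / 120  # same value as NOTE_LENGTH_16TH_120BPM
BAR_SECONDS = 4.0 * 60 / 120         # same value as BAR_LENGTH_120BPM


def decode_notes(token_text):
    """Return (instrument, pitch, start, end) tuples for a token string."""
    notes, open_notes = [], {}
    instrument, bar_index, now = None, 0, 0.0
    for token in token_text.split():
        name, _, value = token.partition("=")
        if name == "TRACK_START":
            bar_index = 0
        elif name == "INST":
            instrument = value
        elif name == "BAR_START":
            # Each bar starts on the bar grid, regardless of earlier deltas.
            now, open_notes = bar_index * BAR_SECONDS, {}
        elif name == "BAR_END":
            bar_index += 1
        elif name == "TIME_DELTA":
            now += float(value) * SIXTEENTH_SECONDS
        elif name == "NOTE_ON":
            # Default to a quarter note until a matching NOTE_OFF arrives.
            note = [instrument, int(value), now, now + 4 * SIXTEENTH_SECONDS]
            notes.append(note)
            open_notes[int(value)] = note
        elif name == "NOTE_OFF" and int(value) in open_notes:
            open_notes[int(value)][3] = now
        # All other tokens (PIECE_START, DENSITY=..., [PAD], ...) are ignored.
    return [tuple(note) for note in notes]


# Hand-written example, not generated by the model.
example = (
    "PIECE_START TRACK_START INST=34 DENSITY=8 "
    "BAR_START NOTE_ON=40 TIME_DELTA=4 NOTE_OFF=40 "
    "NOTE_ON=43 TIME_DELTA=2 NOTE_OFF=43 BAR_END "
    "BAR_START NOTE_ON=45 TIME_DELTA=16 NOTE_OFF=45 BAR_END TRACK_END"
)
decoded = decode_notes(example)
for row in decoded:
    print(row)
```

Times are in seconds, so the third note starts at 2.0, the beginning of the second bar.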
--- a/pytorch_model.bin
+++ b/pytorch_model.bin
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:42bda0df7ff96166407c5c6602126716e103eba0a18614c247f6e2fc1d0e08b2
+size 105915613
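The three lines added for pytorch_model.bin are a Git LFS pointer rather than the weights themselves. As a minimal sketch, assuming the pointer text is available as a string, it can be parsed and used to check a downloaded file. The helper names and the idea of a local file path are illustrative assumptions.

```python
import hashlib

# Pointer contents copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:42bda0df7ff96166407c5c6602126716e103eba0a18614c247f6e2fc1d0e08b2
size 105915613
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash and size fields."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Check a local file (hypothetical path) against the pointer's SHA-256 and size."""
    hasher, size = hashlib.sha256(), 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size_bytes"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algorithm"], pointer["size_bytes"])
```

Calling `matches_pointer` on a downloaded copy of the weights would confirm that the file is the one this commit refers to.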
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1 @@
+{"pad_token": "[PAD]"}
--- a/tokenizer.json
+++ b/tokenizer.json
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1 @@
+{"tokenizer_class": "PreTrainedTokenizerFast"}