Initialize project; model provided by the ModelHub XC community

Model: AG2307/gpt2-finetuned-latex Source: Original Platform
2026-05-06 13:42:51 +08:00
commit 55ac8454c4
10 changed files with 5756 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,35 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,199 @@
 ---
 library_name: transformers
 tags: []
 ---
 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
 ## Model Details
 ### Model Description
 <!-- Provide a longer summary of what this model is. -->
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
 - **Developed by:** [More Information Needed]
 - **Funded by [optional]:** [More Information Needed]
 - **Shared by [optional]:** [More Information Needed]
 - **Model type:** [More Information Needed]
 - **Language(s) (NLP):** [More Information Needed]
 - **License:** [More Information Needed]
 - **Finetuned from model [optional]:** [More Information Needed]
 ### Model Sources [optional]
 <!-- Provide the basic links for the model. -->
 - **Repository:** [More Information Needed]
 - **Paper [optional]:** [More Information Needed]
 - **Demo [optional]:** [More Information Needed]
 ## Uses
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 ### Direct Use
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 [More Information Needed]
 ### Downstream Use [optional]
 <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
 [More Information Needed]
 ### Out-of-Scope Use
 <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
 [More Information Needed]
 ## Bias, Risks, and Limitations
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 [More Information Needed]
 ### Recommendations
 <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
 Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 ## How to Get Started with the Model
 Use the code below to get started with the model.
 [More Information Needed]
 ## Training Details
 ### Training Data
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 [More Information Needed]
 ### Training Procedure
 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 #### Preprocessing [optional]
 [More Information Needed]
 #### Training Hyperparameters
 - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 #### Speeds, Sizes, Times [optional]
 <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
 [More Information Needed]
 ## Evaluation
 <!-- This section describes the evaluation protocols and provides the results. -->
 ### Testing Data, Factors & Metrics
 #### Testing Data
 <!-- This should link to a Dataset Card if possible. -->
 [More Information Needed]
 #### Factors
 <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
 [More Information Needed]
 #### Metrics
 <!-- These are the evaluation metrics being used, ideally with a description of why. -->
 [More Information Needed]
 ### Results
 [More Information Needed]
 #### Summary
 ## Model Examination [optional]
 <!-- Relevant interpretability work for the model goes here -->
 [More Information Needed]
 ## Environmental Impact
 <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
 Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
 - **Hardware Type:** [More Information Needed]
 - **Hours used:** [More Information Needed]
 - **Cloud Provider:** [More Information Needed]
 - **Compute Region:** [More Information Needed]
 - **Carbon Emitted:** [More Information Needed]
 ## Technical Specifications [optional]
 ### Model Architecture and Objective
 [More Information Needed]
 ### Compute Infrastructure
 [More Information Needed]
 #### Hardware
 [More Information Needed]
 #### Software
 [More Information Needed]
 ## Citation [optional]
 <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
 **BibTeX:**
 [More Information Needed]
 **APA:**
 [More Information Needed]
 ## Glossary [optional]
 <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
 [More Information Needed]
 ## More Information [optional]
 [More Information Needed]
 ## Model Card Authors [optional]
 [More Information Needed]
 ## Model Card Contact
 [More Information Needed]
--- a/config.json
+++ b/config.json
@@ -0,0 +1,38 @@
 {
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 0,
  "dtype": "float32",
  "embd_pdrop": 0.1,
  "eos_token_id": 0,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "transformers_version": "4.57.1",
  "use_cache": true,
  "vocab_size": 1114
 }
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,6 @@
 {
  "_from_model_config": true,
  "bos_token_id": 0,
  "eos_token_id": 0,
  "transformers_version": "4.57.1"
 }
--- a/merges.txt
+++ b/merges.txt
@@ -0,0 +1,858 @@
 #version: 0.2
 Ġ }
 Ġ {
 Ġ \
 Ġ _
 Ġ ^
 Ġ 2
 Ġ )
 Ġ (
 Ġ =
 Ġ 1
 Ġ -
 Ġ ,
 r a
 ra c
 f rac
 Ġ +
 t a
 a l
 a r
 r i
 Ġ i
 m a
 Ġ 0
 p h
 l e
 s i
 Ġ x
 Ġ .
 t i
 Ġ a
 f t
 e ta
 Ġ d
 a m
 m u
 Ġ n
 m e
 g h
 ri gh
 t h
 le ft
 i n
 righ t
 d e
 Ġ k
 Ġ r
 p ar
 ti al
 par tial
 Ġ e
 Ġ m
 ph a
 al pha
 d a
 Ġ A
 Ġ t
 Ġ p
 ph i
 Ġ j
 Ġ 3
 p si
 Ġ c
 Ġ 4
 l ta
 ma th
 b da
 am bda
 n u
 p ri
 pri me
 Ġ g
 p i
 c al
 Ġ z
 am ma
 o t
 Ġ b
 Ġ s
 d ot
 r m
 math rm
 Ġ l
 q u
 b ar
 Ġ |
 de lta
 g ma
 l ambda
 Ġ N
 in t
 Ġ q
 b eta
 Ġ ]
 Ġ f
 g a
 me ga
 l o
 Ġ T
 Ġ R
 v e
 th eta
 Ġ M
 Ġ D
 lo n
 Ġ &
 Ġ [
 h i
 Ġ L
 r t
 si gma
 Ġ B
 Ġ\ \
 Ġ y
 a d
 qu ad
 psi lon
 l de
 ti lde
 a t
 e psilon
 h at
 g amma
 Ġ /
 Ġ F
 Ġ S
 s u
 h o
 r ho
 s q
 n g
 sq rt
 su m
 ng le
 Ġ u
 v ar
 ta u
 Ġ H
 b f
 o mega
 ra y
 ar ray
 Ġ C
 Ġ h
 Ġ P
 Ġ X
 r o
 Ġ G
 Ġ I
 l a
 Ġ E
 Ġ V
 x i
 P hi
 Ġ J
 ve c
 Ġ v
 i g
 Ġ Q
 c dot
 G amma
 ft y
 in fty
 g e
 ra ngle
 p m
 Ġ K
 L ambda
 Ġ 5
 a p
 ta r
 b e
 ap p
 var phi
 Ġ o
 D e
 De lta
 Ġ *
 e n
 g in
 be gin
 en d
 Ġ U
 Ġ 6
 c hi
 O mega
 ro w
 Ġ w
 Ġ W
 l in
 lin e
 r line
 Ġ ;
 tar row
 q quad
 k app
 kapp a
 Ġ Z
 Ġ\ }
 righ tarrow
 P si
 e qu
 i v
 equ iv
 Ġ 8
 da g
 la ngle
 dot s
 b ig
 o ve
 e x
 ti me
 time s
 Ġ\ !
 ove rline
 Ġ\ {
 e l
 \ {
 B ig
 Ġ >
 s t
 ge r
 dag ger
 si n
 z eta
 var epsilon
 c o
 a b
 n ab
 nab la
 Ġ O
 ex p
 co s
 w i
 wi de
 S i
 Si gma
 Ġ :
 cdot s
 Ġ Y
 math cal
 l n
 l dots
 Ġ <
 el l
 Ġ\ :
 p ro
 t ri
 o times
 t o
 w e
 d ge
 we dge
 si m
 P i
 Ġ !
 h bar
 pro d
 Ġ 7
 y le
 st yle
 ve rt
 ma tri
 matri x
 Ġ 9
 \ }
 wide tilde
 big g
 m i
 T h
 Th eta
 mi d
 s p
 u n
 c k
 un de
 unde rline
 math bf
 a st
 ro x
 app rox
 le q
 lo g
 s tar
 d i
 e r
 r el
 wide hat
 s ta
 ck rel
 sta ckrel
 l i
 sin h
 li m
 p t
 y style
 la ystyle
 sp laystyle
 di splaystyle
 p er
 per p
 s c
 cos h
 ri pt
 sc ript
 de t
 n e
 t ex
 tex t
 o p
 m p
 si me
 sime q
 ge q
 var theta
 s e
 Big r
 Ġ '
 c i
 lon g
 r c
 ci rc
 ne q
 ta n
 Big l
 b rac
 o int
 u s
 long rightarrow
 l us
 op lus
 bigg l
 bigg r
 b o
 n ot
 ra l
 text style
 Big g
 d dot
 Ġ\ ,
 l d
 big l
 big r
 se t
 X i
 bo ld
 bold math
 a ngle
 tri angle
 script style
 a c
 o nu
 brac k
 c h
 e ck
 ch eck
 p to
 pro pto
 m be
 n onu
 mbe r
 nonu mber
 f o
 left rightarrow
 var rho
 sp ac
 spac e
 i t
 script scriptstyle
 Ġ- -
 ral l
 fo rall
 i math
 Ġ\ |
 z e
 si ze
 brac e
 l brack
 tan h
 b set
 su bset
 p st
 pst o
 r e
 i o
 io ta
 b ot
 u p
 l l
 ove r
 a ral
 p aral
 le l
 paral lel
 R i
 ma psto
 gh tarrow
 Ri ghtarrow
 s f
 text rm
 script size
 o m
 s h
 b in
 bin om
 b u
 h space
 u t
 n to
 pha nto
 phanto m
 j math
 r brace
 a se
 c ase
 ti n
 case s
 over rightarrow
 tin y
 Bigg r
 co ng
 Bigg l
 s la
 sla sh
 U psilon
 bu l
 b re
 c up
 g g
 le t
 bul let
 d o
 w n
 ar row
 bre ve
 L o
 w p
 big oplus
 do wn
 m al
 s mal
 ve e
 smal l
 c ot
 Lo ng
 su p
 bo x
 k er
 c ap
 down arrow
 ar c
 at op
 r brack
 var pi
 c t
 v dots
 ker n
 s b
 sup set
 u par
 upar row
 co th
 I m
 ar ge
 f tarrow
 u psilon
 arc tan
 \ |
 Ġ\ #
 le ftarrow
 mi t
 ar p
 R e
 unde rbrace
 d dots
 e q
 ac ut
 acut e
 i se
 ra ise
 Long rightarrow
 h line
 m in
 e ct
 l brace
 p r
 ot ect
 pr otect
 var sigma
 big triangle
 di m
 math sf
 o r
 f lo
 flo or
 L e
 ft rightarrow
 Le ftrightarrow
 \ :
 Ġ ~
 la p
 bigtriangle up
 f i
 n d
 long leftrightarrow
 ot not
 en space
 fo otnot
 footnot e
 h fi
 arp o
 footnote size
 L arge
 V e
 k e
 p ut
 v space
 le ng
 Ve rt
 leng th
 Ġ `
 text bf
 Long leftrightarrow
 hfi ll
 e m
 l arge
 Ġ "
 righ th
 onu p
 arpo onup
 righth arpoonup
 ma ke
 make box
 y set
 de g
 cdot p
 pt yset
 em ptyset
 i p
 k ip
 s l
 s kip
 raise box
 l lap
 r floor
 math op
 dot eq
 a ck
 b ack
 v phantom
 ma x
 back slash
 i ld
 o nd
 ar g
 am ond
 di amond
 re f
 bu ild
 build rel
 big otimes
 sh arp
 n o
 u re
 pi ct
 over leftarrow
 pict ure
 t t
 r ut
 st rut
 o dot
 big cup
 over brace
 \ ,
 Ġ\ /
 math it
 la be
 labe l
 n i
 gma psto
 lon gmapsto
 un it
 subset eq
 unit length
 en skip
 co lon
 text up
 set length
 f l
 f box
 th in
 big wedge
 fl at
 thin space
 l floor
 o d
 d s
 o un
 p oun
 poun ds
 Ġ ?
 in us
 circ le
 om inus
 v line
 Ġ\ _
 e ph
 k rightarrow
 o krightarrow
 al eph
 rm al
 ho okrightarrow
 se c
 no rmal
 \ #
 b m
 n at
 r lap
 u ral
 al ig
 big cap
 bm od
 nat ural
 alig n
 d dagger
 s s
 su it
 la nd
 diamond suit
 no align
 Ġ\ -
 ti t
 tex tit
 arc sin
 bigtriangle down
 \ !
 Ġ\ &
 sq cup
 la x
 big m
 long leftarrow
 set min
 re lax
 arc cos
 setmin us
 f ra
 i ck
 me box
 th ick
 line s
 normal size
 fra mebox
 thick lines
 c sc
 e i
 m kern
 o n
 s ma
 v da
 nu ll
 pm od
 ap e
 to p
 text sf
 text normal
 it sh
 supset eq
 ei l
 sma sh
 vda sh
 itsh ape
 A R
 G E
 L AR
 c c
 c eil
 d skip
 e xi
 e ci
 g ro
 l ceil
 p re
 r d
 t e
 t er
 al g
 am alg
 me dskip
 in f
 lo we
 su re
 su cc
 su rd
 en sure
 st s
 sp eci
 se p
 ci te
 small skip
 LAR GE
 exi sts
 gro up
 pre c
 lowe r
 ensure math
 speci al
 A A
 a e
 e w
 g ra
 h arpo
 l ti
 m ma
 r en
 s mi
 s ke
 t ch
 v skip
 v ss
 Ġ\ '
 Ġ\ *
 mu lti
 left eq
 left harpo
 right leftharpo
 ve r
 sq cap
 ngle ftarrow
 array st
 co un
 co mma
 un boldmath
 Bigg m
 set coun
 Ġ-- -
 re tch
 Lo ngleftarrow
 fbox sep
 on s
 ew comma
 gra ve
 ren ewcomma
 smi le
 ske w
 lefteq n
 rightleftharpo ons
 arrayst retch
 setcoun ter
 renewcomma nd
 - -
 D o
 H u
 S S
 a k
 c en
 d dag
 h phantom
 h ss
 l g
 l q
 l group
 m skip
 o slash
 r group
 s y
 w arrow
 Ġ @
 ta bul
 si on
 th de
 par box
 math bin
 math ver
 line bre
 big sqcup
 Big m
 co l
 wi thde
 lim s
 lim sup
 ne arrow
 text tt
 se arrow
 triangle right
 wn arrow
 arrow vert
 small int
 protect u
 fi ll
 hfi l
 no tin
 no linebre
 succ eq
 multi put
 Do wnarrow
 Hu ge
 tabul ar
 mathver sion
 withde lims
 nolinebre ak
 \ /
 a sy
 c lo
 c rc
 l lde
 m n
 m ar
 m bo
 n warrow
 o f
 o o
 s warrow
 u mn
 v cen
 Ġ Ġ
 math rel
 math ac
 math or
 math clo
 nu llde
 array col
 ro ot
 vec to
 rightarrow fill
 big vee
 bigg m
 er space
 li mit
 lim inf
 rc eil
 triangle left
 over withdelims
 atop withdelims
 protect Z
 protect e
 protect m
 footnote mar
 mathop en
 prec eq
 multi col
 ver b
 -- -
 cen t
 sy mbo
 asy mp
 crc r
 oo align
 vcen ter
 mathac cent
 mathor d
 mathclo se
 nullde limit
 arraycol sep
 vecto r
 footnotemar k
 multicol umn
 symbo l
 nulldelimit erspace
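Each line of merges.txt is one BPE merge rule, and earlier lines take priority. The sketch below shows how a handful of the rules in this file build the LaTeX command names `frac` and `alpha` from single characters. It is a toy greedy loop written for illustration, not the `tokenizers` library's implementation; the helper name `apply_merges` is hypothetical, and the `Ġ` space marker is ignored.

```python
# Toy illustration of BPE merging; NOT the tokenizers library's actual code.
# Rules copied from merges.txt in file order:
# "r a", "ra c", "f rac", "a l", "p h", "ph a", "al pha".
MERGES = [
    ("r", "a"), ("ra", "c"), ("f", "rac"),
    ("a", "l"), ("p", "h"), ("ph", "a"), ("al", "pha"),
]

def apply_merges(word, merges):
    """Repeatedly merge the adjacent pair with the lowest rank (hypothetical helper)."""
    ranks = {pair: rank for rank, pair in enumerate(merges)}
    symbols = list(word)  # start from single characters
    while len(symbols) > 1:
        # Rank every adjacent pair; unknown pairs get infinite rank.
        candidates = [
            (ranks.get((left, right), float("inf")), index)
            for index, (left, right) in enumerate(zip(symbols, symbols[1:]))
        ]
        best_rank, index = min(candidates)
        if best_rank == float("inf"):
            break  # no applicable rule left
        symbols[index:index + 2] = [symbols[index] + symbols[index + 1]]
    return symbols

print(apply_merges("frac", MERGES))
print(apply_merges("alpha", MERGES))
```

Both words collapse to a single symbol, which suggests that common LaTeX command names end up as single tokens in this vocabulary.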
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:227cc96abb25da368037121ba9c2e1fae82580b8599e36c7df912ca9245db79d
 size 346806912
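As a sanity check, the LFS pointer size can be compared with the parameter count implied by config.json. The sketch below assumes the standard GPT-2 layout: the output head is tied to the token embedding and therefore not stored separately, weights are stored as float32 as `dtype` indicates, and `n_inner: null` means 4 * `n_embd`. None of these are confirmed by the diff itself, and the function name `gpt2_param_count` is hypothetical.

```python
# Values copied from config.json in this commit.
N_EMBD, N_LAYER, N_POSITIONS, VOCAB_SIZE = 768, 12, 1024, 1114
N_INNER = 4 * N_EMBD           # assumption: "n_inner": null falls back to 4 * n_embd
LFS_SIZE_BYTES = 346_806_912   # "size" field of the model.safetensors LFS pointer

def gpt2_param_count(n_embd, n_layer, n_positions, vocab_size, n_inner):
    """Count stored weights, assuming the LM head is tied to the token embedding."""
    embeddings = vocab_size * n_embd + n_positions * n_embd      # wte + wpe
    layer_norm = 2 * n_embd                                      # weight + bias
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    mlp = (n_embd * n_inner + n_inner) + (n_inner * n_embd + n_embd)
    block = 2 * layer_norm + attention + mlp
    return embeddings + n_layer * block + layer_norm             # plus final ln_f

params = gpt2_param_count(N_EMBD, N_LAYER, N_POSITIONS, VOCAB_SIZE, N_INNER)
weight_bytes = params * 4      # float32 is 4 bytes per parameter
overhead = LFS_SIZE_BYTES - weight_bytes

print(params, weight_bytes, overhead)
```

Under these assumptions the weights account for all but a few kilobytes of the file. That remainder is plausibly the safetensors header, though the diff does not show the file contents.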
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,5 @@
 {
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "unk_token": "<|endoftext|>"
 }
--- a/tokenizer.json
+++ b/tokenizer.json
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,20 @@
 {
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<|endoftext|>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|endoftext|>",
  "extra_special_tokens": {},
  "model_max_length": 1024,
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|endoftext|>"
 }
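The tokenizer files can be cross-checked against `vocab_size` in config.json with simple arithmetic. The sketch assumes that each merge rule contributes exactly one vocabulary entry and that `<|endoftext|>` is the only special token. vocab.json is not shown in this diff, so this is a consistency check and not a proof.

```python
# Numbers taken from this commit's diff.
MERGES_TXT_LINES = 858   # merges.txt hunk header: "@@ -0,0 +1,858 @@"
VOCAB_SIZE = 1114        # "vocab_size" in config.json
SPECIAL_TOKENS = 1       # "<|endoftext|>" at id 0 in tokenizer_config.json

merge_rules = MERGES_TXT_LINES - 1   # first line is the "#version: 0.2" header
# Assumption: every merge rule adds exactly one token to the vocabulary.
base_symbols = VOCAB_SIZE - merge_rules - SPECIAL_TOKENS

print(merge_rules, base_symbols)
```

The remainder equals 256, which matches the size of the byte-level base alphabet used by GPT-2 style tokenizers.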
--- a/vocab.json
+++ b/vocab.json