<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->

# Model debugging toolboxes

This page lists the debugging and model-addition tools used by the library, along with the utility functions it
provides for them.

Most of these are only useful if you are adding new models to the library.

## Model addition debuggers

### Model addition debugger - context manager for model adders

This context manager is a power-user tool intended for model adders. It tracks every forward call made within a
model's forward pass and logs a slice of each input and output to a nested JSON file. Note that this context manager
enforces `torch.no_grad()`.

### Rationale

When porting models to transformers, even from Python to Python, model adders often have to do a lot of manual
work: saving and loading tensors, comparing dtypes, and so on. This small tool can hopefully shave off some of that
time.

### Usage

Add this context manager as follows to debug a model:

```python
import torch
from PIL import Image
from transformers import LlavaProcessor, LlavaForConditionalGeneration
from transformers.model_debugging_utils import model_addition_debugger_context

torch.random.manual_seed(673)

# load pretrained model and processor
model_id = "llava-hf/llava-1.5-7b-hf"
processor = LlavaProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

# create random image input
random_image = Image.fromarray(torch.randint(0, 256, (224, 224, 3), dtype=torch.uint8).numpy())

# prompt
prompt = "<image>Describe this image."

# process inputs
inputs = processor(text=prompt, images=random_image, return_tensors="pt")

# call forward method (not .generate!)
with model_addition_debugger_context(
    model,
    debug_path="optional_path_to_your_directory",
    do_prune_layers=False,  # This will output ALL the layers of a model.
):
    output = model.forward(**inputs)
```

### Reading results

The debugger generates two files from the forward call. Both share the same base name and end with either
`_SUMMARY.json` or `_FULL_TENSORS.json`.

The first one contains a summary of each module's _input_ and _output_ tensor values and shapes. The excerpt below
comes from a trace of `MolmoForConditionalGeneration`.

```json
{
  "module_path": "MolmoForConditionalGeneration",
  "inputs": {
    "args": [],
    "kwargs": {
      "input_ids": {
        "shape": "torch.Size([1, 589])",
        "dtype": "torch.int64"
      },
      "attention_mask": {
        "shape": "torch.Size([1, 589])",
        "dtype": "torch.int64"
      },
      "pixel_values": {
        "shape": "torch.Size([1, 5, 576, 588])",
        "dtype": "torch.float32",
        "mean": "tensor(-8.9514e-01, device='cuda:0')",
        "std": "tensor(9.2586e-01, device='cuda:0')",
        "min": "tensor(-1.7923e+00, device='cuda:0')",
        "max": "tensor(1.8899e+00, device='cuda:0')"
      }
    }
  },
  "children": [
    {
      "module_path": "MolmoForConditionalGeneration.language_model.model.embed_tokens",
      "inputs": {
        "args": [
          {
            "shape": "torch.Size([1, 589])",
            "dtype": "torch.int64"
          }
        ]
      },
      "outputs": {
        "shape": "torch.Size([1, 589, 3584])",
        "dtype": "torch.float32",
        "mean": "tensor(6.5460e-06, device='cuda:0')",
        "std": "tensor(2.3807e-02, device='cuda:0')",
        "min": "tensor(-3.3398e-01, device='cuda:0')",
        "max": "tensor(3.9453e-01, device='cuda:0')"
      }
    },
    {
      "module_path": "MolmoForConditionalGeneration.vision_tower",
      "inputs": {
        "args": [
          {
            "shape": "torch.Size([5, 1, 576, 588])",
            "dtype": "torch.float32",
            "mean": "tensor(-8.9514e-01, device='cuda:0')",
            "std": "tensor(9.2586e-01, device='cuda:0')",
            "min": "tensor(-1.7923e+00, device='cuda:0')",
            "max": "tensor(1.8899e+00, device='cuda:0')"
          }
        ],
        "kwargs": {
          "output_hidden_states": "True"
        }
      },
      "children": [
        { ... and so on
```

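
Because the summary is a nested tree of `module_path`, `inputs`, `outputs`, and `children` entries, a short recursive walk is often enough to get an overview. The sketch below assumes only the keys visible in the excerpt above. The `tree` literal is a hand-written stand-in for a loaded `_SUMMARY.json` file, and `iter_modules` is a hypothetical helper, not part of `transformers`.

```python
import json

# Hand-written stand-in for a loaded `_SUMMARY.json` (assumption: same keys as
# the excerpt above). In practice, load the file written by the debugger.
tree = json.loads("""
{
  "module_path": "Model",
  "inputs": {"args": [], "kwargs": {}},
  "children": [
    {
      "module_path": "Model.embed_tokens",
      "inputs": {"args": [{"shape": "torch.Size([1, 589])", "dtype": "torch.int64"}]},
      "outputs": {"shape": "torch.Size([1, 589, 3584])", "dtype": "torch.float32"}
    },
    {
      "module_path": "Model.vision_tower",
      "inputs": {"args": [], "kwargs": {}},
      "children": []
    }
  ]
}
""")


def iter_modules(node, depth=0):
    """Yield (depth, module_path, output_shape) for every node, depth-first."""
    outputs = node.get("outputs")
    # In this sample, only entries whose output is a single tensor have a "shape" key.
    shape = outputs.get("shape") if isinstance(outputs, dict) else None
    yield depth, node["module_path"], shape
    for child in node.get("children", []):
        yield from iter_modules(child, depth + 1)


rows = list(iter_modules(tree))
for depth, path, shape in rows:
    # Indent by depth so the module hierarchy is visible at a glance.
    print("  " * depth + path, shape or "")
```

Printing one line per module gives a compact outline of the trace that is easier to scan than the raw JSON.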
The `_FULL_TENSORS.json` file will display a full view of all tensors, which is useful for comparing two files.

```json
  "pixel_values": {
    "shape": "torch.Size([1, 5, 576, 588])",
    "dtype": "torch.float32",
    "value": [
      "tensor([[[[-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " ...,",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00]],",
      "",
      " [[-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " ...,",
      " [-1.4857e+00, -1.4820e+00, -1.2100e+00, ..., -6.0979e-01, -5.9650e-01, -3.8527e-01],",
      " [-1.6755e+00, -1.7221e+00, -1.4518e+00, ..., -7.5577e-01, -7.4658e-01, -5.5592e-01],",
      " [-7.9957e-01, -8.2162e-01, -5.7014e-01, ..., -1.3689e+00, -1.3169e+00, -1.0678e+00]],",
      "",
      " [[-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " ...,",
      " [-3.0322e-01, -5.0645e-01, -5.8436e-01, ..., -6.2439e-01, -7.9160e-01, -8.1188e-01],",
      " [-4.4921e-01, -6.5653e-01, -7.2656e-01, ..., -3.4702e-01, -5.2146e-01, -5.1326e-01],",
      " [-3.4702e-01, -5.3647e-01, -5.4170e-01, ..., -1.0915e+00, -1.1968e+00, -1.0252e+00]],",
      "",
      " [[-1.1207e+00, -1.2718e+00, -1.0678e+00, ..., 1.2013e-01, -1.3126e-01, -1.7197e-01],",
      " [-6.9738e-01, -9.1166e-01, -8.5454e-01, ..., -5.5050e-02, -2.8134e-01, -4.2793e-01],",
      " [-3.4702e-01, -5.5148e-01, -5.8436e-01, ..., 1.9312e-01, -8.6235e-02, -2.1463e-01],",
      " ...,",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00]],",
      "",
      " [[-1.0039e+00, -9.5669e-01, -6.5546e-01, ..., -1.4711e+00, -1.4219e+00, -1.1389e+00],",
      " [-1.0039e+00, -9.5669e-01, -6.5546e-01, ..., -1.7193e+00, -1.6771e+00, -1.4091e+00],",
      " [-1.6317e+00, -1.6020e+00, -1.2669e+00, ..., -1.2667e+00, -1.2268e+00, -8.9720e-01],",
      " ...,",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
      " [-1.7923e+00, -1.7521e+00, -1.4802e+00, ..., -1.7923e+00, -1.7521e+00, -1.4802e+00]]]], device='cuda:0')"
    ],
    "mean": "tensor(-8.9514e-01, device='cuda:0')",
    "std": "tensor(9.2586e-01, device='cuda:0')",
    "min": "tensor(-1.7923e+00, device='cuda:0')",
    "max": "tensor(1.8899e+00, device='cuda:0')"
  },
```

#### Saving tensors to disk

Some model adders may benefit from logging full tensor values to disk, for example to support numerical analysis
across implementations.

Set `use_repr=False` to write tensors to disk using [SafeTensors](https://huggingface.co/docs/safetensors/en/index).

```python
with model_addition_debugger_context(
    model,
    debug_path="optional_path_to_your_directory",
    do_prune_layers=False,
    use_repr=False,  # Defaults to True
):
    output = model.forward(**inputs)
```

When using `use_repr=False`, tensors are written to the same disk location as the `_SUMMARY.json` and
`_FULL_TENSORS.json` files. The `value` property of each entry in the `_FULL_TENSORS.json` file then contains a
relative path to the associated `.safetensors` file. Each tensor is written to its own file as the `data` property of
the state dictionary. File names use the `module_path` as a prefix, followed by one or more suffixes that are built
recursively:

* Module inputs are suffixed with `_inputs` and outputs with `_outputs`.
* `list` and `tuple` instances, such as `args` or function return values, are suffixed with `_{index}`.
* `dict` instances are suffixed with `_{key}`.

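
To make the naming rules concrete, here is a small illustration of how such recursive suffixes compose. `tensor_file_stems` is a hypothetical helper written for this page. It mirrors the three rules above but is not the library's implementation, and the actual file names may differ in detail. The `inputs` structure is made up, with strings standing in for tensors.

```python
def tensor_file_stems(value, prefix):
    """Return the file-name stems obtained by applying the suffix rules recursively.

    Assumption for this sketch: any value that is not a list, tuple, or dict
    is treated as a tensor leaf and gets exactly one file.
    """
    if isinstance(value, (list, tuple)):
        # list/tuple entries are suffixed with `_{index}`
        stems = []
        for index, item in enumerate(value):
            stems += tensor_file_stems(item, f"{prefix}_{index}")
        return stems
    if isinstance(value, dict):
        # dict entries are suffixed with `_{key}`
        stems = []
        for key, item in value.items():
            stems += tensor_file_stems(item, f"{prefix}_{key}")
        return stems
    return [prefix]


# Made-up module inputs; the strings stand in for real tensors.
module_path = "MolmoForConditionalGeneration.vision_tower"
inputs = {"args": ["<tensor>"], "kwargs": {"pixel_values": "<tensor>"}}

# Module inputs start from the `_inputs` suffix, then recurse into containers.
stems = tensor_file_stems(inputs, f"{module_path}_inputs")
for stem in stems:
    print(stem)
```

Each printed stem corresponds to one tensor, so a deeply nested input produces one name per leaf.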
### Comparing between implementations

Once the forward passes of two models have been traced by the debugger, you can diff their JSON output files. The
screenshot below shows slight differences in the key projection layer of two implementations: the inputs are almost,
but not quite, identical. Looking through the file differences makes it easier to pinpoint which layer is wrong.



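
A diff tool works well for eyeballing, but the same comparison can also be scripted. The sketch below assumes two summary trees with the layout shown under "Reading results" and reports the first module whose recorded outputs differ. `flatten`, `first_mismatch`, and the two tiny trees are hypothetical and only illustrate the idea.

```python
import copy


def flatten(node, out=None):
    """Map each module_path to its recorded outputs (None when absent)."""
    out = {} if out is None else out
    out[node["module_path"]] = node.get("outputs")
    for child in node.get("children", []):
        flatten(child, out)
    return out


def first_mismatch(reference, candidate):
    """Return the first module_path, in reference traversal order, whose outputs differ."""
    ref, cand = flatten(reference), flatten(candidate)
    for path, outputs in ref.items():
        # Exact comparison of the logged fields (shapes and stats are strings).
        if cand.get(path) != outputs:
            return path
    return None


# Invented reference trace with two projection layers.
reference = {
    "module_path": "Model",
    "children": [
        {"module_path": "Model.q_proj",
         "outputs": {"shape": "torch.Size([1, 4])", "mean": "tensor(1.0000e-01)"}},
        {"module_path": "Model.k_proj",
         "outputs": {"shape": "torch.Size([1, 4])", "mean": "tensor(2.0000e-01)"}},
    ],
}

# The candidate differs only in the mean of the key projection output.
candidate = copy.deepcopy(reference)
candidate["children"][1]["outputs"]["mean"] = "tensor(2.0100e-01)"

mismatch = first_mismatch(reference, candidate)
print(mismatch)
```

This compares the logged strings exactly, so it flags any numerical drift. A tolerance-based check would need to parse the values first, and a module that is called more than once would need a list of entries per path rather than a single one.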
### Limitations and scope

This feature only works for torch-based models. Models relying heavily on external kernel calls may work, but the
trace will probably miss some operations. Regardless, any Python implementation that aims to mimic another
implementation can be traced once instead of being rerun N times with breakpoints.

If you pass `do_prune_layers=False` to the model debugger, ALL the layers are written to the JSON output. Otherwise,
only the first and last layers are shown. Disabling pruning is useful when some layers (typically cross-attention)
appear only after N layers.

[[autodoc]] model_addition_debugger_context

## Analyzer of skipped tests

### Scan skipped tests - for model adders and maintainers

This small utility is a power-user tool intended for model adders and maintainers. It lists all test methods
defined in `test_modeling_common.py`, which are inherited by all model tester classes, and scans the repository to
measure how many tests are being skipped and for which models.

### Rationale

When porting models to transformers, tests fail as they should, and sometimes `test_modeling_common` feels irreconcilable with the peculiarities of a brand-new model. But how can we be sure we're not breaking everything by adding a seemingly innocent skip?

This utility:

- scans all `test_modeling_common` methods
- looks for places where a method is skipped
- returns a summary JSON that you can inspect or load as a DataFrame

**For instance, `test_inputs_embeds` is skipped for a whopping 39% of models at the time of writing this util.**



### Usage

You can run the skipped test analyzer in two ways:

#### Full scan (default)

Run from the root of the `transformers` repo, this scans all common test methods and writes the results to a JSON file (default: `all_tests_scan_result.json`).

```bash
python utils/scan_skipped_tests.py --output_dir path/to/output
```

- `--output_dir` (optional): Directory where the JSON results will be saved. Defaults to the current directory.

**Example output:**

```text
🔬 Parsing 331 model test files once each...
📝 Aggregating 224 tests...
(224/224) test_update_candidate_strategy_with_matches_1es_3d_is_nonecodet_schedule_fa_kwargs
✅ Scan complete.

📄 JSON saved to /home/pablo/git/transformers/all_tests_scan_result.json

```

This generates an `all_tests_scan_result.json` file that you can inspect. The JSON is indexed by method name, and each entry follows the schema below, which also indicates the origin of the test (`common` or `GenerationMixin`).

```json
{
  "<method_name>": {
    "origin": "<test suite>",
    "models_ran": ["<model_name>", ...],
    "models_skipped": ["<model_name>", ...],
    "skipped_proportion": <float>,
    "reasons_skipped": ["<model_name>: <reason>",
      ...
    ]
  },
  ...
}
```

You can visualise the results as in the figure above with, for example, `pandas`:

```python
import pandas as pd

# Transpose so that each row is a test method and each column a schema field.
df = pd.read_json('all_tests_scan_result.json').T
df.sort_values(by=['skipped_proportion'], ascending=False)
```

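
If `pandas` is not available, the same ranking can be done with the standard library. The sketch below assumes the schema shown above and that `skipped_proportion` is stored as a fraction between 0 and 1, which is an assumption. The two entries and their values are invented for illustration.

```python
import json

# Invented sample following the schema above. In practice, load the real file:
#   with open("all_tests_scan_result.json") as f:
#       results = json.load(f)
results = json.loads("""
{
  "test_a": {"origin": "common", "models_ran": ["m1", "m2", "m3"], "models_skipped": ["m4"],
             "skipped_proportion": 0.25, "reasons_skipped": ["m4: not supported"]},
  "test_b": {"origin": "common", "models_ran": ["m1"], "models_skipped": ["m2", "m3", "m4"],
             "skipped_proportion": 0.75, "reasons_skipped": ["m2: flaky", "m3: flaky", "m4: flaky"]}
}
""")

# Sort test methods from most skipped to least skipped.
ranked = sorted(results.items(), key=lambda item: item[1]["skipped_proportion"], reverse=True)
for name, entry in ranked:
    skipped = len(entry["models_skipped"])
    print(f"{name}: {entry['skipped_proportion']:.0%} skipped ({skipped} models)")
```

Sorting by `skipped_proportion` surfaces the tests that are most often bypassed, which are good candidates for a closer look at `reasons_skipped`.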
#### Scan a single test method

You can focus on a specific test method using `--test_method_name`:

```bash
python utils/scan_skipped_tests.py --test_method_name test_inputs_embeds --output_dir path/to/output
```

- `--test_method_name`: Name of the test method to scan (e.g., `test_inputs_embeds`).
- `--output_dir` (optional): Directory where the JSON result will be saved.

**Example output:**

```text
$ python utils/scan_skipped_tests.py --test_method_name test_inputs_embeds

🔬 Parsing 331 model test files once each...

== test_inputs_embeds ==

Ran : 199/323
Skipped : 124/323 (38.4%)
- aimv2: Aimv2 does not use inputs_embeds
- align: Inputs_embeds is tested in individual model tests
- altclip: Inputs_embeds is tested in individual model tests
- audio_spectrogram_transformer: AST does not use inputs_embeds
- beit: BEiT does not use inputs_embeds
- bit: Bit does not use inputs_embeds
- blip: Blip does not use inputs_embeds
- blip_2: Inputs_embeds is tested in individual model tests
- bridgetower:
- canine: CANINE does not have a get_input_embeddings() method.
- ...

📄 JSON saved to /home/pablo/git/transformers/scan_test_inputs_embeds.json

```
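
The percentage in the example output is consistent with dividing the number of skipping models by the total number of models. A quick check, assuming that this is how the proportion is defined:

```python
# Counts taken from the example output above.
ran, skipped = 199, 124

total = ran + skipped
proportion = skipped / total  # fraction of models that skip the test

print(f"Skipped : {skipped}/{total} ({proportion:.1%})")
```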