
Hugging Face Hub Integration

AutoTimm seamlessly integrates with Hugging Face Hub, giving you access to thousands of timm-compatible pretrained models with version control, model cards, and community contributions.

HF Hub Integration Flow

graph TD
    A[HF Hub] --> A1[huggingface.co/models]
    A1 --> A2[Browse Repository]
    A2 --> B[Search Models]

    B --> B1[Define Criteria]
    B1 --> B2[Set Filters]
    B2 --> C[list_hf_hub_backbones]

    C --> C1[Query API]
    C1 --> C2[Fetch Results]
    C2 --> D{Filter}

    D -->|By Name| E1[model_name='resnet']
    E1 --> E1a[Match Pattern]
    E1a --> E1b[Return Matches]

    D -->|By Author| E2[author='timm']
    E2 --> E2a[Filter Publisher]
    E2a --> E2b[Return Matches]

    D -->|By Tag| E3[tag='image-classification']
    E3 --> E3a[Filter Tags]
    E3a --> E3b[Return Matches]

    E1b --> F[Model List]
    E2b --> F
    E3b --> F

    F --> F1[Display Results]
    F1 --> F2[Show Metadata]
    F2 --> G[Select Model]

    G --> G1[Review Model Card]
    G1 --> G2[Check Performance]
    G2 --> G3[Verify Compatibility]
    G3 --> H[hf-hub:timm/model]

    H --> H1[Download Model]
    H1 --> H2[Cache Locally]
    H2 --> I{Task}

    I -->|Classification| J1[ImageClassifier]
    J1 --> J1a[backbone='hf-hub:...']
    J1a --> J1b[Load Weights]
    J1b --> J1c[Add Head]

    I -->|Detection| J2[ObjectDetector]
    J2 --> J2a[backbone='hf-hub:...']
    J2a --> J2b[Load Weights]
    J2b --> J2c[Add Detector]

    I -->|Segmentation| J3[SemanticSegmentor]
    J3 --> J3a[backbone='hf-hub:...']
    J3a --> J3b[Load Weights]
    J3b --> J3c[Add Decoder]

    J1c --> K[Training]
    J2c --> K
    J3c --> K

    K --> K1[Configure Trainer]
    K1 --> K2[Start Training]
    K2 --> K3[Save Checkpoint]
    K3 --> K4[Upload to Hub]

    style A fill:#2196F3,stroke:#1976D2
    style C fill:#1976D2,stroke:#1565C0
    style F fill:#2196F3,stroke:#1976D2
    style H fill:#1976D2,stroke:#1565C0
    style K fill:#2196F3,stroke:#1976D2

Overview

Load models directly from Hugging Face Hub using the hf-hub: prefix. This provides:

  • Centralized hosting: Access thousands of pretrained models
  • Version control: Use specific model versions and configurations
  • Model cards: View training details, datasets, and performance
  • Community models: Share and use custom trained models
  • Same API: Works exactly like standard timm models

Quick Start

import autotimm as at  # recommended alias
from autotimm import ImageClassifier

# Discover models on HF Hub
models = at.list_hf_hub_backbones(model_name="resnet", limit=5)

# Use HF Hub model as backbone
model = ImageClassifier(
    backbone="hf-hub:timm/resnet50.a1_in1k",
    num_classes=10,
)

Model Discovery

Search by Architecture

import autotimm as at  # recommended alias

# ResNet models
resnets = at.list_hf_hub_backbones(model_name="resnet", limit=10)

# Vision Transformers
vits = at.list_hf_hub_backbones(model_name="vit", limit=10)

# ConvNeXt models
convnexts = at.list_hf_hub_backbones(model_name="convnext", limit=10)

Search by Author

import autotimm as at

# Official timm models
timm_models = at.list_hf_hub_backbones(author="timm", limit=20)

# Facebook models
fb_models = at.list_hf_hub_backbones(author="facebook", limit=10)

Supported Prefixes

You can use any of these formats:

  • hf-hub:timm/model_name
  • hf_hub:timm/model_name
  • timm/model_name
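All three spellings refer to the same Hub repository. As a rough sketch of the equivalence (the helper below is hypothetical, for illustration only; it is not part of the autotimm API):

```python
def normalize_hub_id(name: str) -> str:
    """Map any accepted spelling to the canonical 'hf-hub:<org>/<model>' form.

    Hypothetical helper illustrating why the three prefixes are
    interchangeable; autotimm accepts all three directly.
    """
    for prefix in ("hf-hub:", "hf_hub:"):
        if name.startswith(prefix):
            return "hf-hub:" + name[len(prefix):]
    # Bare '<org>/<model>' form, e.g. 'timm/resnet50.a1_in1k'
    return "hf-hub:" + name


print(normalize_hub_id("hf_hub:timm/resnet50.a1_in1k"))
# hf-hub:timm/resnet50.a1_in1k
```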

Usage with All Tasks

Image Classification

from autotimm import ImageClassifier

model = ImageClassifier(
    backbone="hf-hub:timm/resnet50.a1_in1k",
    num_classes=10,
)

Semantic Segmentation

from autotimm import SemanticSegmentor

model = SemanticSegmentor(
    backbone="hf-hub:timm/convnext_tiny.fb_in22k",
    num_classes=19,
    head_type="deeplabv3plus",
)

Object Detection

from autotimm import ObjectDetector

model = ObjectDetector(
    backbone="hf-hub:timm/resnet50.a1_in1k",
    num_classes=80,
)

Instance Segmentation

from autotimm import InstanceSegmentor

model = InstanceSegmentor(
    backbone="hf-hub:timm/resnext50_32x4d.a1_in1k",
    num_classes=80,
)

PyTorch Lightning Compatibility

Result: FULLY COMPATIBLE

HF Hub models work seamlessly with PyTorch Lightning. All Lightning features are supported:

Core Features

  • LightningModule interface
  • training_step/validation_step/test_step
  • configure_optimizers
  • Automatic gradient management
  • Device placement (CPU/GPU/TPU)

Advanced Features

  • Mixed precision (AMP)
  • Distributed training (DDP)
  • Multi-GPU training
  • Model checkpointing
  • Resume from checkpoint
  • Gradient accumulation

Logging & Monitoring

  • TensorBoard, MLflow, Weights & Biases
  • Multiple loggers simultaneously
  • Hyperparameter logging
  • All Lightning callbacks

Example: Distributed Training

from autotimm import AutoTrainer, ImageClassifier

model = ImageClassifier(
    backbone="hf-hub:timm/resnet50.a1_in1k",
    num_classes=10,
)

# Multi-GPU training works out of the box
trainer = AutoTrainer(
    max_epochs=100,
    accelerator="gpu",
    devices=4,  # Use 4 GPUs
    strategy="ddp",  # Distributed Data Parallel
)

trainer.fit(model, datamodule=data)

Example: Mixed Precision

trainer = AutoTrainer(
    max_epochs=100,
    precision="16-mixed",  # FP16 mixed precision
)

trainer.fit(model, datamodule=data)

Example: Checkpointing

trainer = AutoTrainer(
    max_epochs=100,
    checkpoint_monitor="val/accuracy",
    checkpoint_mode="max",
)

trainer.fit(model, datamodule=data)

# Constructor arguments that are not saved in the checkpoint (e.g. the
# backbone spec and metrics objects) must be re-supplied when loading
loaded_model = ImageClassifier.load_from_checkpoint(
    "checkpoints/best-epoch=42-val_accuracy=0.9543.ckpt",
    backbone="hf-hub:timm/resnet50.a1_in1k",
    metrics=metrics,  # the same metrics used during training
)

Model Naming Convention

HF Hub models follow a structured naming convention:

hf-hub:timm/<architecture>_<variant>.<recipe>_<dataset>

Examples

  • hf-hub:timm/resnet50.a1_in1k

    • Architecture: ResNet-50
    • Recipe: a1 (training configuration)
    • Dataset: ImageNet-1k
  • hf-hub:timm/convnext_base.fb_in22k_ft_in1k

    • Architecture: ConvNeXt Base
    • Recipe: fb (Facebook)
    • Pretraining: ImageNet-22k
    • Fine-tuned on: ImageNet-1k
  • hf-hub:timm/vit_small_patch16_224.augreg_in21k_ft_in1k

    • Architecture: Vision Transformer Small
    • Patch size: 16x16
    • Input: 224x224
    • Recipe: augreg (augmentation + regularization)
    • Pretraining: ImageNet-21k
    • Fine-tuned on: ImageNet-1k
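The convention can be unpacked mechanically: the repo name splits at the first "." into the model part and the pretrained tag. The parser below is a hypothetical illustration of that split (not an autotimm or timm API), and it does not further subdivide the tag into recipe and dataset:

```python
def parse_timm_name(hub_id: str) -> dict:
    """Split 'hf-hub:<org>/<model>.<pretrained_tag>' into its parts.

    Hypothetical helustration of the naming convention shown above;
    not a real autotimm or timm function.
    """
    repo = hub_id.removeprefix("hf-hub:")   # 'timm/resnet50.a1_in1k'
    org, _, name = repo.partition("/")      # 'timm', 'resnet50.a1_in1k'
    model, _, tag = name.partition(".")     # 'resnet50', 'a1_in1k'
    return {"org": org, "model": model, "pretrained_tag": tag}


print(parse_timm_name("hf-hub:timm/convnext_base.fb_in22k_ft_in1k"))
# {'org': 'timm', 'model': 'convnext_base', 'pretrained_tag': 'fb_in22k_ft_in1k'}
```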

Performance

No Performance Impact

Using HF Hub models has zero performance overhead compared to standard timm models:

  • Training speed: Identical
  • Inference speed: Identical
  • Memory usage: Identical
  • GPU utilization: Identical

The only difference is the initial model download from Hugging Face Hub (cached after first use).
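Downloads land in the standard huggingface_hub cache. A small sketch of where to look for cached weights, assuming the default huggingface_hub layout and environment-variable precedence (this helper is illustrative, not an autotimm API):

```python
import os
from pathlib import Path


def hf_cache_dir() -> Path:
    """Resolve the Hub download cache, following the standard
    huggingface_hub precedence: HF_HUB_CACHE > HF_HOME/hub > default."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    if "HF_HOME" in os.environ:
        return Path(os.environ["HF_HOME"]) / "hub"
    return Path.home() / ".cache" / "huggingface" / "hub"
```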

Optimization Tips

  1. First Run: Model is downloaded and cached
  2. Subsequent Runs: Uses cached version (fast)
  3. Offline Mode: Can use cached models without internet
  4. Version Control: Pin specific model versions for reproducibility
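For tip 3, offline use can be forced via huggingface_hub's standard environment variable, set before the first model load in the process:

```python
import os

# Force huggingface_hub to serve only locally cached files (no network).
# Must be set before the first Hub model is loaded in this process.
os.environ["HF_HUB_OFFLINE"] = "1"

# A previously cached backbone now loads without internet access, e.g.:
# from autotimm import ImageClassifier
# model = ImageClassifier(backbone="hf-hub:timm/resnet50.a1_in1k", num_classes=10)
```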

Troubleshooting

For Hugging Face Hub integration issues, see the Troubleshooting - HuggingFace guide, which covers:

  • Model download is slow
  • Checkpoint loading fails
  • RuntimeError about Trainer attachment
  • Model not found errors

Resources