
Hugging Face Hub Integration

AutoTimm seamlessly integrates with Hugging Face Hub, giving you access to thousands of timm-compatible pretrained models with version control, model cards, and community contributions.

HF Hub Integration Flow

graph TD
    A[HF Hub] --> A1[huggingface.co/models]
    A1 --> A2[Browse Repository]
    A2 --> B[Search Models]

    B --> B1[Define Criteria]
    B1 --> B2[Set Filters]
    B2 --> C[list_hf_hub_backbones]

    C --> C1[Query API]
    C1 --> C2[Fetch Results]
    C2 --> D{Filter}

    D -->|By Name| E1[model_name='resnet']
    E1 --> E1a[Match Pattern]
    E1a --> E1b[Return Matches]

    D -->|By Author| E2[author='timm']
    E2 --> E2a[Filter Publisher]
    E2a --> E2b[Return Matches]

    D -->|By Tag| E3[tag='image-classification']
    E3 --> E3a[Filter Tags]
    E3a --> E3b[Return Matches]

    E1b --> F[Model List]
    E2b --> F
    E3b --> F

    F --> F1[Display Results]
    F1 --> F2[Show Metadata]
    F2 --> G[Select Model]

    G --> G1[Review Model Card]
    G1 --> G2[Check Performance]
    G2 --> G3[Verify Compatibility]
    G3 --> H[hf-hub:timm/model]

    H --> H1[Download Model]
    H1 --> H2[Cache Locally]
    H2 --> I{Task}

    I -->|Classification| J1[ImageClassifier]
    J1 --> J1a[backbone='hf-hub:...']
    J1a --> J1b[Load Weights]
    J1b --> J1c[Add Head]

    I -->|Detection| J2[ObjectDetector]
    J2 --> J2a[backbone='hf-hub:...']
    J2a --> J2b[Load Weights]
    J2b --> J2c[Add Detector]

    I -->|Segmentation| J3[SemanticSegmentor]
    J3 --> J3a[backbone='hf-hub:...']
    J3a --> J3b[Load Weights]
    J3b --> J3c[Add Decoder]

    J1c --> K[Training]
    J2c --> K
    J3c --> K

    K --> K1[Configure Trainer]
    K1 --> K2[Start Training]
    K2 --> K3[Save Checkpoint]
    K3 --> K4[Upload to Hub]

    style A fill:#2196F3,stroke:#1976D2
    style C fill:#1976D2,stroke:#1565C0
    style F fill:#2196F3,stroke:#1976D2
    style H fill:#1976D2,stroke:#1565C0
    style K fill:#2196F3,stroke:#1976D2

Overview

Load models directly from Hugging Face Hub using the hf-hub: prefix. This provides:

  • Centralized hosting: Access thousands of pretrained models
  • Version control: Use specific model versions and configurations
  • Model cards: View training details, datasets, and performance
  • Community models: Share and use custom trained models
  • Same API: Works exactly like standard timm models

Quick Start

import autotimm as at  # recommended alias
from autotimm import ImageClassifier

# Discover models on HF Hub
models = at.list_hf_hub_backbones(model_name="resnet", limit=5)

# Use HF Hub model as backbone
model = ImageClassifier(
    backbone="hf-hub:timm/resnet50.a1_in1k",
    num_classes=10,
)

Model Discovery

Search by Architecture

import autotimm as at  # recommended alias

# ResNet models
resnets = at.list_hf_hub_backbones(model_name="resnet", limit=10)

# Vision Transformers
vits = at.list_hf_hub_backbones(model_name="vit", limit=10)

# ConvNeXt models
convnexts = at.list_hf_hub_backbones(model_name="convnext", limit=10)

Search by Author

import autotimm as at

# Official timm models
timm_models = at.list_hf_hub_backbones(author="timm", limit=20)

# Facebook models
fb_models = at.list_hf_hub_backbones(author="facebook", limit=10)

Supported Prefixes

You can use any of these formats:

  • hf-hub:timm/model_name
  • hf_hub:timm/model_name
  • timm/model_name
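All three spellings refer to the same Hub repository. As a rough sketch of the equivalence (the helper below is hypothetical, for illustration only; it is not part of the autotimm API):

```python
def normalize_hub_id(name: str) -> str:
    """Map any accepted spelling to the canonical 'hf-hub:<org>/<model>' form.

    Hypothetical helper illustrating why the three prefixes are
    interchangeable; autotimm accepts all three directly.
    """
    for prefix in ("hf-hub:", "hf_hub:"):
        if name.startswith(prefix):
            return "hf-hub:" + name[len(prefix):]
    # Bare '<org>/<model>' form, e.g. 'timm/resnet50.a1_in1k'
    return "hf-hub:" + name


print(normalize_hub_id("hf_hub:timm/resnet50.a1_in1k"))
# hf-hub:timm/resnet50.a1_in1k
```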

Usage with All Tasks

Image Classification

from autotimm import ImageClassifier

model = ImageClassifier(
    backbone="hf-hub:timm/resnet50.a1_in1k",
    num_classes=10,
)

Semantic Segmentation

from autotimm import SemanticSegmentor

model = SemanticSegmentor(
    backbone="hf-hub:timm/convnext_tiny.fb_in22k",
    num_classes=19,
    head_type="deeplabv3plus",
)

Object Detection

from autotimm import ObjectDetector

model = ObjectDetector(
    backbone="hf-hub:timm/resnet50.a1_in1k",
    num_classes=80,
)

Instance Segmentation

from autotimm import InstanceSegmentor

model = InstanceSegmentor(
    backbone="hf-hub:timm/resnext50_32x4d.a1_in1k",
    num_classes=80,
)

PyTorch Lightning Compatibility

Result: FULLY COMPATIBLE

HF Hub models work seamlessly with PyTorch Lightning. All Lightning features are supported:

Core Features

  • LightningModule interface
  • training_step/validation_step/test_step
  • configure_optimizers
  • Automatic gradient management
  • Device placement (CPU/GPU/TPU)

Advanced Features

  • Mixed precision (AMP)
  • Distributed training (DDP)
  • Multi-GPU training
  • Model checkpointing
  • Resume from checkpoint
  • Gradient accumulation

Logging & Monitoring

  • TensorBoard, MLflow, Weights & Biases
  • Multiple loggers simultaneously
  • Hyperparameter logging
  • All Lightning callbacks

Example: Distributed Training

from autotimm import AutoTrainer, ImageClassifier

model = ImageClassifier(
    backbone="hf-hub:timm/resnet50.a1_in1k",
    num_classes=10,
)

# Multi-GPU training works out of the box
trainer = AutoTrainer(
    max_epochs=100,
    accelerator="gpu",
    devices=4,  # Use 4 GPUs
    strategy="ddp",  # Distributed Data Parallel
)

trainer.fit(model, datamodule=data)

Example: Mixed Precision

trainer = AutoTrainer(
    max_epochs=100,
    precision="16-mixed",  # FP16 mixed precision
)

trainer.fit(model, datamodule=data)

Example: Checkpointing

trainer = AutoTrainer(
    max_epochs=100,
    checkpoint_monitor="val/accuracy",
    checkpoint_mode="max",
)

trainer.fit(model, datamodule=data)

# Constructor arguments that are not saved in the checkpoint (e.g. the
# backbone spec and metrics objects) must be re-supplied when loading
loaded_model = ImageClassifier.load_from_checkpoint(
    "checkpoints/best-epoch=42-val_accuracy=0.9543.ckpt",
    backbone="hf-hub:timm/resnet50.a1_in1k",
    metrics=metrics,  # the same metrics used during training
)

Model Naming Convention

HF Hub models follow a structured naming convention:

hf-hub:timm/<architecture>_<variant>.<recipe>_<dataset>

Examples

  • hf-hub:timm/resnet50.a1_in1k

    • Architecture: ResNet-50
    • Recipe: a1 (training configuration)
    • Dataset: ImageNet-1k
  • hf-hub:timm/convnext_base.fb_in22k_ft_in1k

    • Architecture: ConvNeXt Base
    • Recipe: fb (Facebook)
    • Pretraining: ImageNet-22k
    • Fine-tuned on: ImageNet-1k
  • hf-hub:timm/vit_small_patch16_224.augreg_in21k_ft_in1k

    • Architecture: Vision Transformer Small
    • Patch size: 16x16
    • Input: 224x224
    • Recipe: augreg (augmentation + regularization)
    • Pretraining: ImageNet-21k
    • Fine-tuned on: ImageNet-1k
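The convention can be unpacked mechanically: the repo name splits at the first "." into the model part and the pretrained tag. The parser below is a hypothetical illustration of that split (not an autotimm or timm API), and it does not further subdivide the tag into recipe and dataset:

```python
def parse_timm_name(hub_id: str) -> dict:
    """Split 'hf-hub:<org>/<model>.<pretrained_tag>' into its parts.

    Hypothetical helustration of the naming convention shown above;
    not a real autotimm or timm function.
    """
    repo = hub_id.removeprefix("hf-hub:")   # 'timm/resnet50.a1_in1k'
    org, _, name = repo.partition("/")      # 'timm', 'resnet50.a1_in1k'
    model, _, tag = name.partition(".")     # 'resnet50', 'a1_in1k'
    return {"org": org, "model": model, "pretrained_tag": tag}


print(parse_timm_name("hf-hub:timm/convnext_base.fb_in22k_ft_in1k"))
# {'org': 'timm', 'model': 'convnext_base', 'pretrained_tag': 'fb_in22k_ft_in1k'}
```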

Performance

No Performance Impact

Using HF Hub models has zero performance overhead compared to standard timm models:

  • Training speed: Identical
  • Inference speed: Identical
  • Memory usage: Identical
  • GPU utilization: Identical

The only difference is the initial model download from Hugging Face Hub (cached after first use).
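Downloads land in the standard huggingface_hub cache. A small sketch of where to look for cached weights, assuming the default huggingface_hub layout and environment-variable precedence (this helper is illustrative, not an autotimm API):

```python
import os
from pathlib import Path


def hf_cache_dir() -> Path:
    """Resolve the Hub download cache, following the standard
    huggingface_hub precedence: HF_HUB_CACHE > HF_HOME/hub > default."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    if "HF_HOME" in os.environ:
        return Path(os.environ["HF_HOME"]) / "hub"
    return Path.home() / ".cache" / "huggingface" / "hub"
```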

Optimization Tips

  1. First Run: Model is downloaded and cached
  2. Subsequent Runs: Uses cached version (fast)
  3. Offline Mode: Can use cached models without internet
  4. Version Control: Pin specific model versions for reproducibility
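For tip 3, offline use can be forced via huggingface_hub's standard environment variable, set before the first model load in the process:

```python
import os

# Force huggingface_hub to serve only locally cached files (no network).
# Must be set before the first Hub model is loaded in this process.
os.environ["HF_HUB_OFFLINE"] = "1"

# A previously cached backbone now loads without internet access, e.g.:
# from autotimm import ImageClassifier
# model = ImageClassifier(backbone="hf-hub:timm/resnet50.a1_in1k", num_classes=10)
```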

Troubleshooting

For Hugging Face Hub integration issues, see the Troubleshooting - HuggingFace guide, which covers:

  • Model download is slow
  • Checkpoint loading fails
  • RuntimeError about Trainer attachment
  • Model not found errors

Resources