Skip to content

add torchvision detector functionality#164

Open
deruyter92 wants to merge 7 commits intomainfrom
jaap/add_torchvision_detector
Open

add torchvision detector functionality#164
deruyter92 wants to merge 7 commits intomainfrom
jaap/add_torchvision_detector

Conversation

@deruyter92
Copy link
Collaborator

@deruyter92 deruyter92 commented Feb 18, 2026

Motivation:
In the existing code, the TorchvisionDetectorAdaptor is excluded for regular use (e.g. missing from the model registry). This means that pretrained torchvision detectors cannot be loaded automatically, requiring workarounds such as downloading the weights manually, and cleaning up the state dicts. An exemplary case is when working with the superanimal_humanbody_rtmpose_x from the modelzoo. This model requires a pretrained torchvision detector as it does not have any saved detector weights, and can therefore not be used directly in combination with the export functionality.

This PR aims to address this issue by including the TorchVisionDetectorAdaptor as a regular detector model, and updating the runner accordingly. The modelzoo export is also updated to enable export of the superanimal_humanbody_rtmpose_x

Changes:

  • The TorchVisionDetectorAdaptor is registered as detector model
  • The runner is adapted to load a model either when model weights are provided OR when config specifies pretrained=true
  • A specific export functionality is added for the superanimal_humanbody_rtmpose_x, by including a hardcoded config (hacky..) .
# Example destination path
destination_path = "_chekpoints/my_exported_humanbody.pt"
model_name = "rtmpose_x"
super_animal = "superanimal_humanbody"
detector_name = "fasterrcnn_mobilenet_v3_large_fpn"

export_modelzoo_model(
    export_path= destination_path,
    super_animal=super_animal,
    model_name=model_name,
    detector_name=detector_name ,
)

Copy link
Collaborator

@C-Achard C-Achard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!
Is this ready to test, or how can I help ?
Otherwise what do you think a sensible way of exposing/documenting any built-in configs would be ? I think it's fine if we clearly centralize/document it.

@deruyter92 deruyter92 marked this pull request as ready for review February 19, 2026 13:16
@deruyter92 deruyter92 changed the title [dev] add torchvision detector functionality add torchvision detector functionality Feb 19, 2026
@deruyter92
Copy link
Collaborator Author

@C-Achard yes feel free to test & merge if you agree and it works.

@C-Achard C-Achard added the lint required Please run: pre-commit run --files [files] locally ONLY on changed files label Feb 20, 2026
@deruyter92 deruyter92 added enhancement New feature or request and removed lint required Please run: pre-commit run --files [files] locally ONLY on changed files labels Mar 10, 2026
@deruyter92 deruyter92 requested review from C-Achard and Copilot March 10, 2026 10:26
@deruyter92 deruyter92 added the lint required Please run: pre-commit run --files [files] locally ONLY on changed files label Mar 10, 2026
@deruyter92
Copy link
Collaborator Author

(pre-commit hooks are ran on the changed files, but not on the entire repository)

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR makes torchvision-based detectors first-class citizens in the model registry and updates config/loading/export paths so humanbody top-down models can rely on pretrained torchvision detector weights rather than bundled detector checkpoints.

Changes:

  • Register TorchvisionDetectorAdaptor in the detectors registry and expose it through the package.
  • Update the runner to build/load detectors when either detector weights are present or the detector config indicates pretrained/weights usage.
  • Extend modelzoo config + export logic to support superanimal_humanbody_* models that depend on a torchvision detector.

Reviewed changes

Copilot reviewed 6 out of 10 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
tests/test_modelzoo.py Adds targeted tests for humanbody detector-name requirements and torchvision detector config wiring.
dlclive/pose_estimation_pytorch/runner.py Loads detector based on either checkpoint presence or pretrained/weights settings; safer access to detector inference config.
dlclive/pose_estimation_pytorch/models/detectors/torchvision.py Registers TorchvisionDetectorAdaptor and introduces a supported-detector allowlist.
dlclive/pose_estimation_pytorch/models/detectors/init.py Exposes/imports the torchvision adaptor so it’s registered and available.
dlclive/modelzoo/utils.py Injects a torchvision detector config for humanbody and validates supported detector choices.
dlclive/modelzoo/pytorch_model_zoo_export.py Skips detector checkpoint download for humanbody exports; formatting/doc tweaks.
dlclive/modelzoo/project_configs/superanimal_humanbody.yaml YAML indentation/formatting fix.
dlclive/modelzoo/model_configs/ssdlite.yaml YAML indentation/formatting fix.
dlclive/modelzoo/model_configs/fasterrcnn_resnet50_fpn_v2.yaml YAML indentation/formatting fix.
dlclive/modelzoo/init.py Formatting fix for export import.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

BaseDetector,
)

SUPPORTED_TORCHVISION_DETECTORS = ["fasterrcnn_mobilenet_v3_large_fpn"]
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class docstring lists fasterrcnn_resnet50_fpn_v2 as supported, but SUPPORTED_TORCHVISION_DETECTORS only includes fasterrcnn_mobilenet_v3_large_fpn. This inconsistency will confuse users (and currently the modelzoo validation rejects the resnet50 variant). Either update the docstring to match the allowlist, or expand SUPPORTED_TORCHVISION_DETECTORS and ensure the humanbody path supports that detector.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or, better suggestion, don't list the models in the docstring and just say "one of {variable name}"



@DETECTORS.register_module
class TorchvisionDetectorAdaptor(BaseDetector):
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class docstring lists fasterrcnn_resnet50_fpn_v2 as supported, but SUPPORTED_TORCHVISION_DETECTORS only includes fasterrcnn_mobilenet_v3_large_fpn. This inconsistency will confuse users (and currently the modelzoo validation rejects the resnet50 variant). Either update the docstring to match the allowlist, or expand SUPPORTED_TORCHVISION_DETECTORS and ensure the humanbody path supports that detector.

Copilot uses AI. Check for mistakes.
Comment on lines 33 to 34
- fasterrcnn_mobilenet_v3_large_fpn
- fasterrcnn_resnet50_fpn_v2
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class docstring lists fasterrcnn_resnet50_fpn_v2 as supported, but SUPPORTED_TORCHVISION_DETECTORS only includes fasterrcnn_mobilenet_v3_large_fpn. This inconsistency will confuse users (and currently the modelzoo validation rejects the resnet50 variant). Either update the docstring to match the allowlist, or expand SUPPORTED_TORCHVISION_DETECTORS and ensure the humanbody path supports that detector.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how many times can copilot make the same bad suggestion lmao

Comment on lines +18 to +20
from dlclive.pose_estimation_pytorch.models.detectors.torchvision import (
SUPPORTED_TORCHVISION_DETECTORS,
)
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Importing SUPPORTED_TORCHVISION_DETECTORS from the torchvision detector module makes dlclive.modelzoo.utils import-time dependent on torch/torchvision. If torchvision is an optional dependency, this can cause config/modelzoo utilities to fail to import even when the user isn’t using detectors. Consider moving the allowlist to a lightweight location (e.g., modelzoo module constants) or doing a local/lazy import inside _get_torchvision_detector_config with a clear error if torchvision isn’t available.

Suggested change
from dlclive.pose_estimation_pytorch.models.detectors.torchvision import (
SUPPORTED_TORCHVISION_DETECTORS,
)
try:
from dlclive.pose_estimation_pytorch.models.detectors.torchvision import (
SUPPORTED_TORCHVISION_DETECTORS,
)
except ImportError: # torchvision (and its dependencies) is optional
# Fallback to an empty collection so this module can be imported even when
# torchvision is not installed. Code that relies on detectors should
# handle the absence of supported detectors appropriately.
SUPPORTED_TORCHVISION_DETECTORS = ()

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bad suggestion IMO. utils modules are junk drawers. _get_torchvision_detector_config is a torch-specific function, and it should exist in a torch-specific subpackage or module. don't listen to the LLMs which will fallback your code into a verbose oblivion - just put the code in the place it should be.

return config


def _get_torchvision_detector_config(detector_name: str) -> dict:
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_get_torchvision_detector_config is annotated as taking detector_name: str, but it explicitly handles None and is called with detector_name that may be None. Update the type signature to accept str | None (and/or adjust the call sites) so type checkers and IDEs reflect the actual contract.

Suggested change
def _get_torchvision_detector_config(detector_name: str) -> dict:
def _get_torchvision_detector_config(detector_name: str | None) -> dict:

Copilot uses AI. Check for mistakes.
Comment on lines +271 to +278
detector_cfg = self.cfg.get("detector")
has_detector_weights = raw_data.get("detector") is not None
if detector_cfg is not None:
detector_model_cfg = detector_cfg["model"]
uses_pretrained = (
detector_model_cfg.get("pretrained", False)
or detector_model_cfg.get("weights") is not None
)
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runner now has new behavior: it should build a detector when weights/pretrained is set even if raw_data["detector"] is None, and it should skip load_state_dict in that case. There are modelzoo tests added, but no tests appear to cover this updated runner logic. Consider adding a focused unit test that mocks models.DETECTORS.build and verifies (1) build is invoked when weights is set, (2) load_state_dict is not invoked when no detector weights are present, and (3) load_state_dict is invoked when detector weights are present.

Copilot uses AI. Check for mistakes.
Comment on lines +282 to +287
if self.dynamic is None and (has_detector_weights or uses_pretrained):
self.detector = models.DETECTORS.build(self.cfg["detector"]["model"])
self.detector.to(self.device)
self.detector.load_state_dict(raw_data["detector"])

if has_detector_weights:
self.detector.load_state_dict(raw_data["detector"])
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runner now has new behavior: it should build a detector when weights/pretrained is set even if raw_data["detector"] is None, and it should skip load_state_dict in that case. There are modelzoo tests added, but no tests appear to cover this updated runner logic. Consider adding a focused unit test that mocks models.DETECTORS.build and verifies (1) build is invoked when weights is set, (2) load_state_dict is not invoked when no detector weights are present, and (3) load_state_dict is invoked when detector weights are present.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

@sneakers-the-rat sneakers-the-rat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't really fully understand the code around here so i am mostly commenting on general code stuff - I think that in general it really really seems like this thing needs a proper data model since the logic about loading is getting pretty spread out among a bunch of functions. especially if the code is starting to diverge from main dlc, the best time to consolidate configuration and loading of this thing used across two packages in dlclibrary was before the copy/paste, the second best time is now <3.

Usually it is a bad idea to have some base class intended to provide some protocol/interface as an instance of that protocol/interface because it tempts you to modify the protocol to fit the needs of the runtime use when those are implementation details that wouldn't be relevant to the other subclasses, so I suppose i would be curious why not make some trivial subclass that is like "pretrained whatever" so that the division of responsibility is kept. like that would also be a place you could put the additional conditional logic added to the runner.py entrypoint.

Comment on lines +54 to +58

# Skip downloading the detector weights for humanbody models, as they are not on huggingface
skip_detector_download = (detector_name is None) or (
super_animal == "superanimal_humanbody"
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems not great to hardcode. idk what if it eventually is on huggingface? what if there are other detector names that can't be reached by the load model weights function? also silently changing explicitly specified functionality is bad to do - i told you to use detector_name, but there is some hidden interaction with my model choice, and how would I even know?

just throw an exception in _load_model_weights if the model weights can't be loaded

Comment on lines +18 to +20
from dlclive.pose_estimation_pytorch.models.detectors.torchvision import (
SUPPORTED_TORCHVISION_DETECTORS,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bad suggestion IMO. utils modules are junk drawers. _get_torchvision_detector_config is a torch-specific function, and it should exist in a torch-specific subpackage or module. don't listen to the LLMs which will fallback your code into a verbose oblivion - just put the code in the place it should be.

return config


def _get_torchvision_detector_config(detector_name: str) -> dict:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this function exists - we're just putting a string in a hardcoded dictionary? the main thing it does is throw exceptions, but the exception should be thrown in whatever consumes this dict. this is way way far on the LBYL side of look before you leap vs. easier to ask forgiveness - we are checking if a value is valid in some utility method very distant from where it's used. if this is needed, it sort of screams that whatever super animal is needs a proper data model - if super animal requires a detector name, there should be a data model that requires a detector name, not in some random utility function somewhere!

Comment on lines +152 to +163
if detector_name is None:
model_config["method"] = "BU"
else:
model_config["method"] = "TD"
if super_animal != "superanimal_humanbody":
detector_cfg_path = get_super_animal_model_config_path(
model_name=detector_name
)
detector_cfg = read_config_as_dict(detector_cfg_path)
model_config["detector"] = detector_cfg
detector_cfg_path = get_super_animal_model_config_path(model_name=detector_name)
detector_cfg = read_config_as_dict(detector_cfg_path)
model_config["detector"] = detector_cfg

if super_animal == "superanimal_humanbody":
# Raises ValueError if Detector name is not one of SUPPORTED_TORCHVISION_DETECTORS
torchvision_detector_config = _get_torchvision_detector_config(detector_name)
model_config["detector"]["model"] = torchvision_detector_config
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

soooo remember what we were saying about how copying and pasting code is bad because then you have to ensure that it never changes or you always need to remember to update in both places? this is one of those times. this function should have been located in some shared root package (dlclibrary!!!!) and imported in both in the first place, but if we are sprouting divergent functionality, need to make that clear: "I do this all the time in normal deeplabcut, how come it raises an error here!"

this is also strangely conflicting with the other place where we silence rather than raise an error - https://github.com/DeepLabCut/DeepLabCut-live/pull/164/changes#r2913689689

so we are handling the same thing twice in opposing directions.

BaseDetector,
)

SUPPORTED_TORCHVISION_DETECTORS = ["fasterrcnn_mobilenet_v3_large_fpn"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or, better suggestion, don't list the models in the docstring and just say "one of {variable name}"

Comment on lines 33 to 34
- fasterrcnn_mobilenet_v3_large_fpn
- fasterrcnn_resnet50_fpn_v2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how many times can copilot make the same bad suggestion lmao

Comment on lines +36 to +37
This class can be used directly (e.g. with pre-trained COCO weights) or through its
subclasses (FasterRCNN or SSDLite) which adapt the model for DLC's 2-class detection.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure why! usually base classes that shouldn't be used have some abstract methods and only serve as the external public API contract, so it doesn't seem like this was serving as an ABC before, but usually mixing levels like this is a bad call, having a "pre trained" or whatever subclass might be a good idea to discourage modifying the base class to accommodate any specific needs for this use that doesn't apply to the other subclasses and warps the contract made by the ABC.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request lint required Please run: pre-commit run --files [files] locally ONLY on changed files

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DeepLabCut-live fails to run exported superanimal_humanbody model from DeepLabCut

4 participants