VisualRepresentation
- class VisualRepresentation(images: Sequence[str | Path | Tensor], encoder: str | Module, layer_name: str, max_id: int | None = None, shape: int | Sequence[int] | None = None, transforms: Sequence | None = None, encoder_kwargs: Mapping[str, Any] | None = None, batch_size: int = 32, trainable: bool = True, **kwargs)[source]
Bases:
Representation
Visual representations using a torchvision model.
Initialize the representations.
- Parameters:
images (Sequence[str | Path | Tensor]) – the images, either as tensors, or paths to image files.
encoder (str | Module) – the encoder to use. If given as a string, it is looked up in
torchvision.models.
layer_name (str) – the name of the encoder's layer from which to extract features, cf.
torchvision.models.feature_extraction.create_feature_extractor().
max_id (int | None) – the number of representations. If given, it must match the number of images.
shape (int | Sequence[int] | None) – the shape of an individual representation. If provided, it must match the encoder's output dimension.
transforms (Sequence | None) – transformations to apply to the images. Note that stochastic transformations will make the representations stochastic as well.
encoder_kwargs (Mapping[str, Any] | None) – additional keyword-based parameters passed to encoder upon instantiation.
batch_size (int) – the batch size to use during encoding.
trainable (bool) – whether the encoder should be trainable.
kwargs – additional keyword-based parameters passed to
Representation.__init__().
- Raises:
ValueError – if max_id is provided and does not match the number of images
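To illustrate what the encoder, layer_name, and batch_size parameters control, the sketch below reproduces the core mechanics in plain torch: images are pushed through an encoder in batches, and the activations of one named layer are kept as the per-image representation. The toy CNN, its layer names, and the encode_images helper are hypothetical stand-ins for illustration, not part of this class's API or of torchvision.

```python
import torch
from torch import nn

def encode_images(
    images: torch.Tensor,
    encoder: nn.Module,
    layer_name: str,
    batch_size: int = 32,
) -> torch.Tensor:
    """Extract the activations of `layer_name` for each image, in batches."""
    captured = {}

    def hook(module, inputs, output):
        # store the activations of the requested layer for the current batch
        captured["features"] = output

    handle = dict(encoder.named_modules())[layer_name].register_forward_hook(hook)
    chunks = []
    with torch.no_grad():
        for start in range(0, len(images), batch_size):
            encoder(images[start : start + batch_size])
            chunks.append(captured["features"].flatten(start_dim=1))
    handle.remove()
    return torch.cat(chunks, dim=0)

# A toy CNN encoder with named layers (hypothetical, for illustration only).
encoder = nn.Sequential()
encoder.add_module("conv", nn.Conv2d(3, 8, kernel_size=3, padding=1))
encoder.add_module("pool", nn.AdaptiveAvgPool2d(1))
encoder.add_module("head", nn.Flatten())

images = torch.rand(5, 3, 32, 32)  # five fake RGB images
reps = encode_images(images, encoder, layer_name="pool", batch_size=2)
print(reps.shape)  # one flattened feature vector per image
```

Because encoding runs batch by batch, batch_size only trades memory for speed; the resulting representations are identical for any batch size as long as the transforms and encoder are deterministic.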