VisualRepresentation
- class VisualRepresentation(images: Sequence[str | Path | Tensor], encoder: str | Module, layer_name: str, max_id: int | None = None, shape: int | Sequence[int] | None = None, transforms: Sequence | None = None, encoder_kwargs: Mapping[str, Any] | None = None, batch_size: int = 32, trainable: bool = True, **kwargs)[source]
- Bases: Representation

Visual representations using a torchvision model.

Initialize the representations.
- Parameters:
images (Sequence[str | Path | Tensor]) – The images, either as tensors, or paths to image files.
encoder (str | Module) – The encoder to use. If given as a string, it is looked up in torchvision.models.
layer_name (str) – The name of the model's layer to use for extracting the features, cf. torchvision.models.feature_extraction.create_feature_extractor().
max_id (int | None) – The number of representations. If given, it must match the number of images.
shape (int | Sequence[int] | None) – The shape of an individual representation. If provided, it must match the encoder's output dimension.
transforms (Sequence | None) – Transformations to apply to the images. Note that stochastic transformations will result in stochastic representations, too.
encoder_kwargs (Mapping[str, Any] | None) – Additional keyword-based parameters passed to the encoder upon instantiation.
batch_size (int) – The batch size to use during encoding.
trainable (bool) – Whether the encoder should be trainable.
kwargs – Additional keyword-based parameters passed to Representation.
- Raises:
ValueError – If max_id is provided and does not match the number of images.