DeepLabV3Plus Segmentation
- Original Link : https://keras.io/api/keras_cv/models/tasks/deeplab_v3_segmentation/
- Last Checked at : 2024-11-25
DeepLabV3Plus class
keras_cv.models.DeepLabV3Plus(
backbone,
num_classes,
projection_filters=48,
spatial_pyramid_pooling=None,
segmentation_head=None,
**kwargs
)
A Keras model implementing the DeepLabV3+ architecture for semantic segmentation.
References
- Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation (ECCV 2018)
- Rethinking Atrous Convolution for Semantic Image Segmentation (CVPR 2017)
Arguments
- backbone: keras.Model. The backbone network for the model, used as a feature extractor for the DeepLabV3+ Encoder. Should either be a keras_cv.models.backbones.backbone.Backbone or a keras.Model that implements the pyramid_level_inputs property with keys “P2” and “P5” and layer names as values. A somewhat sensible backbone to use in many cases is keras_cv.models.ResNet50V2Backbone.from_preset("resnet50_v2_imagenet").
- num_classes: int, the number of classes for the segmentation model. Note that num_classes includes the background class, and the classes in the data should be represented by integers in the range [0, num_classes).
- projection_filters: int, number of filters in the convolution layer projecting low-level features from the backbone. Defaults to 48, as per the TensorFlow implementation of DeepLab.
- spatial_pyramid_pooling: (Optional) a keras.layers.Layer. Also known as Atrous Spatial Pyramid Pooling (ASPP). Performs spatial pooling on different spatial levels of the pyramid, with dilation. If provided, the feature map from the backbone is passed to it inside the DeepLabV3 Encoder; otherwise keras_cv.layers.spatial_pyramid.SpatialPyramidPooling is used (see the sketch after this list).
- segmentation_head: (Optional) a keras.layers.Layer. If provided, the outputs of the DeepLabV3 encoder are passed to this layer, which should predict the segmentation mask from the backbone and decoder features; otherwise a default DeepLabV3 convolutional head is used.
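A minimal sketch of the spatial_pyramid_pooling hook, passing an explicitly configured ASPP layer instead of relying on the default; the dilation_rates values here are illustrative assumptions, not recommended settings:

import keras_cv

backbone = keras_cv.models.ResNet50V2Backbone(input_shape=[96, 96, 3])
# Explicit ASPP configuration; when spatial_pyramid_pooling is None,
# the encoder builds a default SpatialPyramidPooling layer internally.
aspp = keras_cv.layers.spatial_pyramid.SpatialPyramidPooling(
    dilation_rates=[6, 12, 18],
)
model = keras_cv.models.DeepLabV3Plus(
    backbone=backbone,
    num_classes=21,
    spatial_pyramid_pooling=aspp,
)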
Example
import keras
import keras_cv
import numpy as np

images = np.ones(shape=(1, 96, 96, 3))
labels = np.zeros(shape=(1, 96, 96, 1))
backbone = keras_cv.models.ResNet50V2Backbone(input_shape=[96, 96, 3])
model = keras_cv.models.segmentation.DeepLabV3Plus(
    num_classes=1, backbone=backbone,
)

# Evaluate model
model(images)

# Train model
model.compile(
    optimizer="adam",
    loss=keras.losses.BinaryCrossentropy(from_logits=False),
    metrics=["accuracy"],
)
model.fit(images, labels, epochs=3)
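Because this example uses num_classes=1 with a binary loss, the model's output can be read as per-pixel foreground scores. A minimal sketch of turning them into a mask; the 0.5 cutoff is an arbitrary assumption, not part of the API:

# Per-pixel scores for the single foreground class.
preds = model.predict(images)  # shape: (1, 96, 96, 1)
mask = preds > 0.5             # boolean foreground mask (0.5 is an arbitrary cutoff)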
from_preset method
DeepLabV3Plus.from_preset()
Instantiate DeepLabV3Plus model from preset config and weights.
Arguments
- preset: string. Must be one of “resnet18”, “resnet34”, “resnet50”, “resnet101”, “resnet152”, “resnet18_v2”, “resnet34_v2”, “resnet50_v2”, “resnet101_v2”, “resnet152_v2”, “mobilenet_v3_small”, “mobilenet_v3_large”, “csp_darknet_tiny”, “csp_darknet_s”, “csp_darknet_m”, “csp_darknet_l”, “csp_darknet_xl”, “efficientnetv1_b0”, “efficientnetv1_b1”, “efficientnetv1_b2”, “efficientnetv1_b3”, “efficientnetv1_b4”, “efficientnetv1_b5”, “efficientnetv1_b6”, “efficientnetv1_b7”, “efficientnetv2_s”, “efficientnetv2_m”, “efficientnetv2_l”, “efficientnetv2_b0”, “efficientnetv2_b1”, “efficientnetv2_b2”, “efficientnetv2_b3”, “densenet121”, “densenet169”, “densenet201”, “efficientnetlite_b0”, “efficientnetlite_b1”, “efficientnetlite_b2”, “efficientnetlite_b3”, “efficientnetlite_b4”, “yolo_v8_xs_backbone”, “yolo_v8_s_backbone”, “yolo_v8_m_backbone”, “yolo_v8_l_backbone”, “yolo_v8_xl_backbone”, “vitdet_base”, “vitdet_large”, “vitdet_huge”, “videoswin_tiny”, “videoswin_small”, “videoswin_base”, “resnet50_imagenet”, “resnet50_v2_imagenet”, “mobilenet_v3_large_imagenet”, “mobilenet_v3_small_imagenet”, “csp_darknet_tiny_imagenet”, “csp_darknet_l_imagenet”, “efficientnetv2_s_imagenet”, “efficientnetv2_b0_imagenet”, “efficientnetv2_b1_imagenet”, “efficientnetv2_b2_imagenet”, “densenet121_imagenet”, “densenet169_imagenet”, “densenet201_imagenet”, “yolo_v8_xs_backbone_coco”, “yolo_v8_s_backbone_coco”, “yolo_v8_m_backbone_coco”, “yolo_v8_l_backbone_coco”, “yolo_v8_xl_backbone_coco”, “vitdet_base_sa1b”, “vitdet_large_sa1b”, “vitdet_huge_sa1b”, “videoswin_tiny_kinetics400”, “videoswin_small_kinetics400”, “videoswin_base_kinetics400”, “videoswin_base_kinetics400_imagenet22k”, “videoswin_base_kinetics600_imagenet22k”, “videoswin_base_something_something_v2”, “deeplab_v3_plus_resnet50_pascalvoc”. If looking for a preset with pretrained weights, choose one of “resnet50_imagenet”, “resnet50_v2_imagenet”, “mobilenet_v3_large_imagenet”, “mobilenet_v3_small_imagenet”, “csp_darknet_tiny_imagenet”, “csp_darknet_l_imagenet”, “efficientnetv2_s_imagenet”, “efficientnetv2_b0_imagenet”, “efficientnetv2_b1_imagenet”, “efficientnetv2_b2_imagenet”, “densenet121_imagenet”, “densenet169_imagenet”, “densenet201_imagenet”, “yolo_v8_xs_backbone_coco”, “yolo_v8_s_backbone_coco”, “yolo_v8_m_backbone_coco”, “yolo_v8_l_backbone_coco”, “yolo_v8_xl_backbone_coco”, “vitdet_base_sa1b”, “vitdet_large_sa1b”, “vitdet_huge_sa1b”, “videoswin_tiny_kinetics400”, “videoswin_small_kinetics400”, “videoswin_base_kinetics400”, “videoswin_base_kinetics400_imagenet22k”, “videoswin_base_kinetics600_imagenet22k”, “videoswin_base_something_something_v2”, “deeplab_v3_plus_resnet50_pascalvoc”.
- load_weights: Whether to load pre-trained weights into the model. Defaults to None, which follows whether the preset has pretrained weights available.
- input_shape: input shape that will be passed to backbone initialization. Defaults to None. If None, the preset value will be used.
Example
# Load architecture and weights from preset
model = keras_cv.models.DeepLabV3Plus.from_preset(
    "resnet50_imagenet",
)

# Load randomly initialized model from preset architecture
model = keras_cv.models.DeepLabV3Plus.from_preset(
    "resnet50_imagenet",
    load_weights=False,
)
Preset name | Parameters | Description
---|---|---
deeplab_v3_plus_resnet50_pascalvoc | 39.19M | DeepLabV3Plus with a ResNet50 v2 backbone. Trained on the PascalVOC 2012 semantic segmentation task, which consists of 20 classes and one background class. This model achieves a final categorical accuracy of 89.34% and mIoU of 0.6391 on the evaluation dataset. This preset is only compatible with Keras 3.
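A minimal sketch of running the pretrained PascalVOC preset for inference; the synthetic input and its 512x512 size are assumptions for illustration, and the 21 output channels follow from the 20 classes plus background described above:

import keras_cv
import numpy as np

# The only preset in this table with pretrained segmentation weights
# (requires Keras 3).
model = keras_cv.models.DeepLabV3Plus.from_preset(
    "deeplab_v3_plus_resnet50_pascalvoc",
)
images = np.ones(shape=(1, 512, 512, 3))  # synthetic stand-in image (size assumed)
preds = model.predict(images)             # per-pixel class scores, 21 channels
mask = np.argmax(preds, axis=-1)          # integer class mask, shape (1, 512, 512)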