detectron2.utils

detectron2.utils.colormap module

An awesome colormap for really neat visualizations. Copied from Detectron, with the gray colors removed.

detectron2.utils.colormap.colormap(rgb=False, maximum=255)
Parameters:
  • rgb (bool) – whether to return RGB colors or BGR colors.

  • maximum (int) – either 255 or 1

Returns:

ndarray – a float32 array of Nx3 colors, in range [0, 255] or [0, 1]

detectron2.utils.colormap.random_color(rgb=False, maximum=255)
Parameters:
  • rgb (bool) – whether to return RGB colors or BGR colors.

  • maximum (int) – either 255 or 1

Returns:

ndarray – a vector of 3 numbers

detectron2.utils.colormap.random_colors(N, rgb=False, maximum=255)
Parameters:
  • N (int) – number of unique colors needed

  • rgb (bool) – whether to return RGB colors or BGR colors.

  • maximum (int) – either 255 or 1

Returns:

ndarray – a list of random_color
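The effect of the rgb and maximum arguments can be illustrated without the library. The following is a minimal sketch, assuming the palette is stored as RGB triples in [0, 1]; the three-color PALETTE_RGB and the scale_palette helper are hypothetical and are not the library's data or code.

```python
# Hypothetical palette: RGB triples in [0, 1]. Not detectron2's actual table.
PALETTE_RGB = [(0.0, 0.5, 1.0), (1.0, 0.25, 0.0), (0.5, 0.5, 0.0)]


def scale_palette(rgb=False, maximum=255):
    """Return colors scaled to [0, maximum], in RGB or BGR channel order."""
    assert maximum in (255, 1), maximum
    colors = [tuple(c * maximum for c in color) for color in PALETTE_RGB]
    if not rgb:
        # BGR is RGB with the channel order reversed.
        colors = [color[::-1] for color in colors]
    return colors


bgr_255 = scale_palette()                      # defaults: BGR, range [0, 255]
rgb_unit = scale_palette(rgb=True, maximum=1)  # RGB, range [0, 1]
```

Note that the default channel order is BGR, whereas the Visualizer class documented in this section expects RGB input.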

detectron2.utils.comm module

This file contains primitives for multi-GPU communication, which are useful for distributed training.

detectron2.utils.comm.get_world_size() -> int
detectron2.utils.comm.get_rank() -> int
detectron2.utils.comm.create_local_process_group(num_workers_per_machine: int) -> None

Create a process group that contains ranks within the same machine.

Detectron2’s launch() in engine/launch.py will call this function. If you start workers without launch(), you’ll have to also call this. Otherwise utilities like get_local_rank() will not work.

This function contains a barrier. All processes must call it together.

Parameters:

num_workers_per_machine – the number of worker processes per machine. Typically the number of GPUs.
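To make the idea of a per-machine group concrete, here is a minimal sketch of how global ranks could be partitioned. It assumes ranks are assigned contiguously per machine (machine 0 holds ranks 0 to k-1, and so on), which is not stated above; local_groups and local_rank are hypothetical helpers, not library functions, and no process group is actually created.

```python
def local_groups(world_size, num_workers_per_machine):
    """Partition global ranks into per-machine groups (assumed contiguous layout)."""
    assert world_size % num_workers_per_machine == 0
    num_machines = world_size // num_workers_per_machine
    return [
        list(range(m * num_workers_per_machine, (m + 1) * num_workers_per_machine))
        for m in range(num_machines)
    ]


def local_rank(global_rank, num_workers_per_machine):
    """Rank of a process within its own machine's group (same assumption)."""
    return global_rank % num_workers_per_machine


# Two machines with four workers each.
groups = local_groups(world_size=8, num_workers_per_machine=4)
```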

detectron2.utils.comm.get_local_process_group()
Returns:

A torch process group which only includes processes that are on the same machine as the current process. This group can be useful for communication within a machine, e.g. a per-machine SyncBN.

detectron2.utils.comm.get_local_rank() -> int
Returns:

The rank of the current process within the local (per-machine) process group.

detectron2.utils.comm.get_local_size() -> int
Returns:

The size of the per-machine process group, i.e. the number of processes per machine.

detectron2.utils.comm.is_main_process() -> bool
detectron2.utils.comm.synchronize()

Helper function to synchronize (barrier) among all processes when using distributed training.

detectron2.utils.comm.all_gather(data, group=None)

Run all_gather on arbitrary picklable data (not necessarily tensors).

Parameters:
  • data – any picklable object

  • group – a torch process group. By default, uses a group that contains all ranks on the gloo backend.

Returns:

list[data] – list of data gathered from each rank

detectron2.utils.comm.gather(data, dst=0, group=None)

Run gather on arbitrary picklable data (not necessarily tensors).

Parameters:
  • data – any picklable object

  • dst (int) – destination rank

  • group – a torch process group. By default, uses a group that contains all ranks on the gloo backend.

Returns:

list[data] – on dst, a list of data gathered from each rank. Otherwise, an empty list.
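The difference between what all_gather and gather return on each rank can be simulated in a single process. This is only a sketch of the documented return values, not of the communication itself: simulate_all_gather and simulate_gather are hypothetical helpers, and the pickle round trip merely stands in for the requirement that data be picklable.

```python
import pickle


def simulate_all_gather(per_rank_data):
    """Every rank receives the full list of (pickle round-tripped) objects."""
    payloads = [pickle.dumps(d) for d in per_rank_data]
    gathered = [pickle.loads(p) for p in payloads]
    return [list(gathered) for _ in per_rank_data]


def simulate_gather(per_rank_data, dst=0):
    """Only rank dst receives the list; every other rank gets an empty list."""
    gathered = [pickle.loads(pickle.dumps(d)) for d in per_rank_data]
    return [gathered if rank == dst else [] for rank in range(len(per_rank_data))]


# One entry per simulated rank.
data = [{"rank": 0, "n": 3}, {"rank": 1, "n": 5}]
all_results = simulate_all_gather(data)   # indexed by receiving rank
dst_results = simulate_gather(data, dst=0)
```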

detectron2.utils.comm.shared_random_seed()
Returns:

int – a random number that is the same across all workers. If workers need a shared RNG, they can use this shared seed to create one.

All workers must call this function, otherwise it will deadlock.

detectron2.utils.comm.reduce_dict(input_dict, average=True)

Reduce the values in the dictionary from all processes so that the process with rank 0 has the reduced results.

Parameters:
  • input_dict (dict) – inputs to be reduced. All the values must be scalar CUDA tensors.

  • average (bool) – whether to average or sum the values

Returns:

a dict with the same keys as input_dict, after reduction.
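The arithmetic of the reduction can be sketched with plain floats in place of scalar CUDA tensors. simulate_reduce_dict is a hypothetical single-process helper that computes what rank 0 would hold; it performs no communication and makes no claim about what other ranks receive.

```python
def simulate_reduce_dict(per_rank_dicts, average=True):
    """Return the dict that rank 0 would hold after reduction."""
    world_size = len(per_rank_dicts)
    keys = sorted(per_rank_dicts[0])  # all ranks are assumed to share the same keys
    reduced = {k: sum(d[k] for d in per_rank_dicts) for k in keys}
    if average:
        reduced = {k: v / world_size for k, v in reduced.items()}
    return reduced


# Hypothetical loss values from two ranks.
losses = [{"loss_cls": 0.5, "loss_box": 1.0}, {"loss_cls": 1.5, "loss_box": 3.0}]
mean_losses = simulate_reduce_dict(losses)
sum_losses = simulate_reduce_dict(losses, average=False)
```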

detectron2.utils.events module

detectron2.utils.logger module

detectron2.utils.logger.setup_logger(output=None, distributed_rank=0, *, color=True, name='detectron2', abbrev_name=None, enable_propagation: bool = False, configure_stdout: bool = True)

Initialize the detectron2 logger and set its verbosity level to “DEBUG”.

Parameters:
  • output (str) – a file name or a directory to save log. If None, will not save log file. If ends with “.txt” or “.log”, assumed to be a file name. Otherwise, logs will be saved to output/log.txt.

  • name (str) – the root module name of this logger

  • abbrev_name (str) – an abbreviation of the module, to avoid long names in logs. Set to “” to not log the root module in logs. By default, will abbreviate “detectron2” to “d2” and leave other modules unchanged.

  • enable_propagation (bool) – whether to propagate logs to the parent logger.

  • configure_stdout (bool) – whether to configure logging to stdout.

Returns:

logging.Logger – a logger
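The rule for the output argument described above can be written out as a small function. This is a sketch of the documented rule only; resolve_log_path is a hypothetical helper, not part of the library, and it ignores any other adjustments the library may make to the file name.

```python
import os


def resolve_log_path(output):
    """Map the `output` argument to a log file path, or None for no log file."""
    if output is None:
        return None
    if output.endswith(".txt") or output.endswith(".log"):
        # Treated as a file name.
        return output
    # Treated as a directory.
    return os.path.join(output, "log.txt")


file_case = resolve_log_path("run/train.log")
dir_case = resolve_log_path("run")
```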

detectron2.utils.logger.log_first_n(lvl, msg, n=1, *, name=None, key='caller')

Log only for the first n times.

Parameters:
  • lvl (int) – the logging level

  • msg (str) – the message to log

  • n (int) – the maximum number of times to log

  • name (str) – name of the logger to use. Will use the caller’s module by default.

  • key (str or tuple[str]) – the string(s) can be one of “caller” or “message”, which defines how to identify duplicated logs. For example, if called with n=1, key="caller", this function will only log the first call from the same caller, regardless of the message content. If called with n=1, key="message", this function will log the same content only once, even if it is logged from different places. If called with n=1, key=("caller", "message"), this function will skip a log only if the same caller has logged the same message before.
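The three key settings can be compared with a small counter-based sketch. should_log is a hypothetical helper that takes the caller as an explicit string; how the library actually identifies the caller is not described here and is not modeled.

```python
from collections import Counter

_seen = Counter()


def should_log(caller, message, n=1, key="caller"):
    """True while this caller and/or message identity has been seen at most n times."""
    keys = (key,) if isinstance(key, str) else tuple(key)
    assert keys and all(k in ("caller", "message") for k in keys)
    values = {"caller": caller, "message": message}
    # The identity is built only from the parts named in `key`.
    identity = tuple((k, values[k]) for k in ("caller", "message") if k in keys)
    _seen[identity] += 1
    return _seen[identity] <= n


# Same caller, different messages: only the first is logged.
by_caller = [should_log("train.py:10", m, key="caller") for m in ("a", "b")]
# Different callers, same message: only the first is logged.
by_message = [should_log(c, "same text", key="message") for c in ("x.py:1", "y.py:2")]
# Suppressed only when the same caller repeats the same message.
by_both = [
    should_log("z.py:3", "m1", key=("caller", "message")),
    should_log("z.py:3", "m2", key=("caller", "message")),
    should_log("z.py:3", "m1", key=("caller", "message")),
]
```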

detectron2.utils.logger.log_every_n(lvl, msg, n=1, *, name=None)

Log once per n times.

Parameters:
  • lvl (int) – the logging level

  • msg (str) – the message to log

  • n (int) – the interval, counted in calls, between logged messages

  • name (str) – name of the logger to use. Will use the caller’s module by default.
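A call-count throttle of this kind might look like the sketch below. It assumes that the first call logs and that every n-th call after it logs as well; which call within each group of n is logged is an assumption of this sketch, and should_log_every_n is a hypothetical helper.

```python
from collections import Counter

_calls = Counter()


def should_log_every_n(caller, n=1):
    """True on calls 1, n+1, 2n+1, ... from the same caller (assumed pattern)."""
    _calls[caller] += 1
    return n == 1 or _calls[caller] % n == 1


# Seven consecutive calls from the same place with n=3.
pattern = [should_log_every_n("loop", n=3) for _ in range(7)]
```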

detectron2.utils.logger.log_every_n_seconds(lvl, msg, n=1, *, name=None)

Log no more than once per n seconds.

Parameters:
  • lvl (int) – the logging level

  • msg (str) – the message to log

  • n (int) – the minimum number of seconds between logged messages

  • name (str) – name of the logger to use. Will use the caller’s module by default.
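Time-based throttling can be sketched by remembering when each caller last logged. should_log_every_n_seconds is a hypothetical helper; its now parameter exists only so the example can run with fixed timestamps instead of the real clock.

```python
import time

_last_logged = {}


def should_log_every_n_seconds(caller, n=1, now=None):
    """True if this caller has not logged within the last n seconds."""
    current = time.time() if now is None else now
    last = _last_logged.get(caller)
    if last is None or current - last >= n:
        _last_logged[caller] = current
        return True
    return False


# Calls at t = 0, 2, 5, 9 and 10 seconds with n=5.
decisions = [
    should_log_every_n_seconds("eval", n=5, now=t) for t in (0.0, 2.0, 5.0, 9.0, 10.0)
]
```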

detectron2.utils.registry module

class detectron2.utils.registry.Registry(name: str)

Bases: Iterable[Tuple[str, Any]]

The registry that provides name -> object mapping, to support third-party users’ custom modules.

To create a registry (e.g. a backbone registry):

BACKBONE_REGISTRY = Registry('BACKBONE')

To register an object:

@BACKBONE_REGISTRY.register()
class MyBackbone():
    ...

Or:

BACKBONE_REGISTRY.register(MyBackbone)

__init__(name: str) -> None
Parameters:

name (str) – the name of this registry

register(obj: Optional[Any] = None) -> Any

Register the given object under the name obj.__name__. Can be used either as a decorator or as a plain function call. See the docstring of this class for usage.
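The two registration styles can be reproduced with a few lines of plain Python. MiniRegistry below is a hypothetical re-implementation for illustration, not the library's class; in particular, the duplicate-name assertion and the KeyError raised on lookup are assumptions of this sketch.

```python
class MiniRegistry:
    """Name -> object mapping with decorator and function-call registration."""

    def __init__(self, name):
        self._name = name
        self._obj_map = {}

    def _do_register(self, name, obj):
        assert name not in self._obj_map, (
            f"An object named '{name}' is already registered in '{self._name}'"
        )
        self._obj_map[name] = obj

    def register(self, obj=None):
        if obj is None:
            # Used as a decorator: @registry.register()
            def deco(func_or_class):
                self._do_register(func_or_class.__name__, func_or_class)
                return func_or_class

            return deco
        # Used as a function call: registry.register(SomeClass)
        self._do_register(obj.__name__, obj)

    def get(self, name):
        if name not in self._obj_map:
            raise KeyError(f"No object named '{name}' found in '{self._name}'")
        return self._obj_map[name]


BACKBONE_REGISTRY = MiniRegistry("BACKBONE")


@BACKBONE_REGISTRY.register()
class MyBackbone:
    pass


class OtherBackbone:
    pass


BACKBONE_REGISTRY.register(OtherBackbone)
```

Looking a class up by name, for example from a configuration string, is then a single call such as BACKBONE_REGISTRY.get("MyBackbone").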

get(name: str) -> Any

detectron2.utils.registry.locate(name: str) -> Any

Locate and return an object x using an input string {x.__module__}.{x.__qualname__}, such as “module.submodule.class_name”.

Raises an exception if the object cannot be found.
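Resolving a dotted path to an object can be sketched with the standard library's pydoc.locate. locate_sketch is a hypothetical helper shown for illustration; it is not claimed to match the library's lookup strategy, and the choice of ImportError is an assumption.

```python
import pydoc


def locate_sketch(name):
    """Resolve a dotted path such as 'collections.OrderedDict' to the object itself."""
    obj = pydoc.locate(name)
    if obj is None:
        # pydoc.locate returns None when nothing is found.
        raise ImportError(f"Cannot locate object named {name!r}")
    return obj


ordered_dict_cls = locate_sketch("collections.OrderedDict")
```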

detectron2.utils.memory module

detectron2.utils.memory.retry_if_cuda_oom(func)

Makes a function retry itself after encountering PyTorch’s CUDA OOM error. It will first retry after calling torch.cuda.empty_cache().

If that still fails, it will then retry after trying to move the inputs to CPU. In this case, it expects the function to dispatch to a CPU implementation. The return values may become CPU tensors as well, and it is the user’s responsibility to convert them back to CUDA tensors if needed.

Parameters:

func – a stateless callable that takes tensor-like objects as arguments

Returns:

a callable which retries func if OOM is encountered.

Examples:

output = retry_if_cuda_oom(some_torch_function)(input1, input2)
# output may be on CPU even if inputs are on GPU

Note

  1. When converting inputs to CPU, it will only look at each argument and check if it has .device and .to for conversion. Nested structures of tensors are not supported.

  2. Since the function might be called more than once, it has to be stateless.
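The three-stage retry described above can be sketched without a GPU. Everything here is a stand-in: FakeOOM replaces the CUDA OOM error, FakeTensor is a hypothetical object exposing .device and .to(), and free_cache stands in for torch.cuda.empty_cache(). The sketch shows the control flow only and is not the library's implementation.

```python
import functools


class FakeOOM(RuntimeError):
    """Hypothetical stand-in for a CUDA out-of-memory error."""


class FakeTensor:
    """Hypothetical tensor-like object exposing .device and .to()."""

    def __init__(self, device):
        self.device = device

    def to(self, device):
        return FakeTensor(device)


def retry_if_oom(func, free_cache=lambda: None):
    def maybe_to_cpu(x):
        # Only top-level arguments with .device and .to are converted.
        if hasattr(x, "device") and hasattr(x, "to"):
            return x.to("cpu")
        return x

    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        try:
            return func(*args, **kwargs)  # attempt 1: as given
        except FakeOOM:
            pass
        free_cache()  # stands in for torch.cuda.empty_cache()
        try:
            return func(*args, **kwargs)  # attempt 2: after freeing the cache
        except FakeOOM:
            pass
        new_args = [maybe_to_cpu(a) for a in args]
        new_kwargs = {k: maybe_to_cpu(v) for k, v in kwargs.items()}
        return func(*new_args, **new_kwargs)  # attempt 3: inputs moved to CPU

    return wrapped


attempts = []  # recorded only to make the retries visible in this example


def fails_on_gpu(x):
    attempts.append(x.device)
    if x.device != "cpu":
        raise FakeOOM("simulated out of memory")
    return x


result = retry_if_oom(fails_on_gpu)(FakeTensor("cuda"))
```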

detectron2.utils.analysis module

detectron2.utils.visualizer module

class detectron2.utils.visualizer.ColorMode(value)

Bases: Enum

Enum of different color modes to use for instance visualizations.

IMAGE = 0

Picks a random color for every instance and overlays segmentations with low opacity.

SEGMENTATION = 1

Lets instances of the same category have similar colors (from metadata.thing_colors), and overlays them with high opacity. This draws more attention to the quality of the segmentation.

IMAGE_BW = 2

Same as IMAGE, but convert all areas without masks to gray-scale. Only available for drawing per-instance mask predictions.

class detectron2.utils.visualizer.VisImage(img, scale=1.0)

Bases: object

__init__(img, scale=1.0)
Parameters:
  • img (ndarray) – an RGB image of shape (H, W, 3) in range [0, 255].

  • scale (float) – scale the input image

reset_image(img)
Parameters:

img – same as in __init__

save(filepath)
Parameters:

filepath (str) – a string that contains the absolute path, including the file name, where the visualized image will be saved.

get_image()
Returns:

ndarray – the visualized image of shape (H, W, 3) (RGB) in uint8 type. The shape is scaled w.r.t the input image using the given scale argument.

class detectron2.utils.visualizer.Visualizer(img_rgb, metadata=None, scale=1.0, instance_mode=ColorMode.IMAGE, font_size_scale=1.0)

Bases: object

Visualizer that draws data about detection/segmentation on images.

It contains methods like draw_{text,box,circle,line,binary_mask,polygon} that draw primitive objects to images, as well as high-level wrappers like draw_{instance_predictions,sem_seg,panoptic_seg_predictions,dataset_dict} that draw composite data in some pre-defined style.

Note that the exact visualization style for the high-level wrappers is subject to change. Style such as color, opacity, label contents, visibility of labels, or even the visibility of objects themselves (e.g. when the object is too small) may change according to different heuristics, as long as the results still look visually reasonable.

To obtain a consistent style, you can implement custom drawing functions with the primitive methods mentioned above instead. If you need more customized visualization styles, you can process the data yourself following the formats documented in the tutorials (Use Models, Use Custom Datasets). This class does not intend to satisfy everyone’s preference on drawing styles.

This visualizer focuses on high rendering quality rather than performance. It is not designed to be used for real-time applications.

__init__(img_rgb, metadata=None, scale=1.0, instance_mode=ColorMode.IMAGE, font_size_scale=1.0)
Parameters:
  • img_rgb – a numpy array of shape (H, W, C), where H and W correspond to the height and width of the image respectively. C is the number of color channels. The image is required to be in RGB format since that is a requirement of the Matplotlib library. The image is also expected to be in the range [0, 255].

  • metadata (Metadata) – dataset metadata (e.g. class names and colors)

  • instance_mode (ColorMode) – defines one of the pre-defined styles for drawing instances on an image.

  • font_size_scale – extra scaling of font size on top of default font size

draw_instance_predictions(predictions, jittering: bool = True)

Draw instance-level prediction results on an image.

Parameters:
  • predictions (Instances) – the output of an instance detection/segmentation model. The following fields will be used to draw: “pred_boxes”, “pred_classes”, “scores”, “pred_masks” (or “pred_masks_rle”).

  • jittering – if True, in color mode SEGMENTATION, randomly jitter the colors per class to distinguish instances from the same class

Returns:

output (VisImage) – image object with visualizations.

draw_sem_seg(sem_seg, area_threshold=None, alpha=0.8)

Draw semantic segmentation predictions/labels.

Parameters:
  • sem_seg (Tensor or ndarray) – the segmentation of shape (H, W). Each value is the integer label of the pixel.

  • area_threshold (int) – segments with an area smaller than area_threshold are not drawn.

  • alpha (float) – the larger it is, the more opaque the segmentations are.

Returns:

output (VisImage) – image object with visualizations.

draw_panoptic_seg(panoptic_seg, segments_info, area_threshold=None, alpha=0.7)

Draw panoptic prediction annotations or results.

Parameters:
  • panoptic_seg (Tensor) – of shape (height, width) where the values are ids for each segment.

  • segments_info (list[dict] or None) – describes each segment in panoptic_seg. If it is a list[dict], each dict contains keys “id”, “category_id”. If None, the category id of each pixel is computed by pixel // metadata.label_divisor.

  • area_threshold (int) – stuff segments with an area smaller than area_threshold are not drawn.

Returns:

output (VisImage) – image object with visualizations.
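The fallback used when segments_info is None is a plain integer division. A minimal sketch, with a hypothetical label_divisor of 1000 and a hypothetical pixel value; category_from_panoptic_id is not a library function.

```python
def category_from_panoptic_id(pixel_value, label_divisor):
    """Category id encoded in a panoptic pixel value: pixel // label_divisor."""
    return pixel_value // label_divisor


# Hypothetical values: a pixel value of 18005 with label_divisor=1000.
category_id = category_from_panoptic_id(18005, label_divisor=1000)
```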

draw_dataset_dict(dic)

Draw annotations/segmentations in Detectron2 Dataset format.

Parameters:

dic (dict) – annotation/segmentation data of one image, in Detectron2 Dataset format.

Returns:

output (VisImage) – image object with visualizations.

overlay_instances(*, boxes=None, labels=None, masks=None, keypoints=None, assigned_colors=None, alpha=0.5)
Parameters:
  • boxes (Boxes, RotatedBoxes or ndarray) – either a Boxes, or an Nx4 numpy array of XYXY_ABS format for the N objects in a single image, or a RotatedBoxes, or an Nx5 numpy array of (x_center, y_center, width, height, angle_degrees) format for the N objects in a single image.

  • labels (list[str]) – the text to be displayed for each instance.

  • masks (masks-like object) –

    Supported types are:

    • detectron2.structures.PolygonMasks, detectron2.structures.BitMasks.

    • list[list[ndarray]]: contains the segmentation masks for all objects in one image. The first level of the list corresponds to individual instances, the second level to all the polygons that compose the instance, and the third level to the polygon coordinates. The third level should have the format of [x0, y0, x1, y1, …, xn, yn] (n >= 3).

    • list[ndarray]: each ndarray is a binary mask of shape (H, W).

    • list[dict]: each dict is a COCO-style RLE.

  • keypoints (Keypoint or array like) – an array-like object of shape (N, K, 3), where N is the number of instances and K is the number of keypoints. The last dimension corresponds to (x, y, visibility or score).

  • assigned_colors (list[matplotlib.colors]) – a list of colors, where each color corresponds to each mask or box in the image. Refer to ‘matplotlib.colors’ for full list of formats that the colors are accepted in.

Returns:

output (VisImage) – image object with visualizations.
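The flat polygon format [x0, y0, x1, y1, …] used in the list[list[ndarray]] mask type can be unpacked into (x, y) points as follows. This sketch uses plain lists instead of ndarrays, the coordinates are hypothetical, and polygon_to_points is not a library function.

```python
def polygon_to_points(flat):
    """Convert [x0, y0, x1, y1, ...] into [(x0, y0), (x1, y1), ...]."""
    assert len(flat) % 2 == 0, "coordinates must come in (x, y) pairs"
    return list(zip(flat[0::2], flat[1::2]))


# One instance made of two polygons (second level of the nested list).
instance = [
    [10, 10, 50, 10, 50, 40, 10, 40],
    [60, 60, 90, 60, 90, 80, 60, 80],
]
points = [polygon_to_points(p) for p in instance]
```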

overlay_rotated_instances(boxes=None, labels=None, assigned_colors=None)
Parameters:
  • boxes (ndarray) – an Nx5 numpy array of (x_center, y_center, width, height, angle_degrees) format for the N objects in a single image.

  • labels (list[str]) – the text to be displayed for each instance.

  • assigned_colors (list[matplotlib.colors]) – a list of colors, where each color corresponds to each mask or box in the image. Refer to ‘matplotlib.colors’ for full list of formats that the colors are accepted in.

Returns:

output (VisImage) – image object with visualizations.

draw_and_connect_keypoints(keypoints)

Draws keypoints of an instance and follows the rules for keypoint connections to draw lines between appropriate keypoints. This follows color heuristics for line color.

Parameters:

keypoints (Tensor) – a tensor of shape (K, 3), where K is the number of keypoints and the last dimension corresponds to (x, y, probability).

Returns:

output (VisImage) – image object with visualizations.

draw_text(text, position, *, font_size=None, color='g', horizontal_alignment='center', rotation=0)
Parameters:
  • text (str) – class label

  • position (tuple) – a tuple of the x and y coordinates to place text on image.

  • font_size (int, optional) – font size of the text. If not provided, a font size proportional to the image width is calculated and used.

  • color – color of the text. Refer to matplotlib.colors for full list of formats that are accepted.

  • horizontal_alignment (str) – see matplotlib.text.Text

  • rotation – rotation angle in degrees CCW

Returns:

output (VisImage) – image object with text drawn.

draw_box(box_coord, alpha=0.5, edge_color='g', line_style='-')
Parameters:
  • box_coord (tuple) – a tuple containing x0, y0, x1, y1 coordinates, where x0 and y0 are the coordinates of the box’s top-left corner, and x1 and y1 are the coordinates of its bottom-right corner.

  • alpha (float) – blending coefficient. Smaller values lead to more transparent boxes.

  • edge_color – color of the outline of the box. Refer to matplotlib.colors for full list of formats that are accepted.

  • line_style (string) – the string to use to create the outline of the boxes.

Returns:

output (VisImage) – image object with box drawn.

draw_rotated_box_with_label(rotated_box, alpha=0.5, edge_color='g', line_style='-', label=None)

Draw a rotated box with a label at its top-left corner.

Parameters:
  • rotated_box (tuple) – a tuple containing (cnt_x, cnt_y, w, h, angle), where cnt_x and cnt_y are the center coordinates of the box. w and h are the width and height of the box. angle represents how many degrees the box is rotated CCW with regard to the 0-degree box.

  • alpha (float) – blending coefficient. Smaller values lead to more transparent boxes.

  • edge_color – color of the outline of the box. Refer to matplotlib.colors for full list of formats that are accepted.

  • line_style (string) – the string to use to create the outline of the boxes.

  • label (string) – label for rotated box. It will not be rendered when set to None.

Returns:

output (VisImage) – image object with box drawn.
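The (cnt_x, cnt_y, w, h, angle) format can be turned into four corner points with a standard rotation. The sketch below assumes image coordinates with y pointing down, so that a positive angle appears counter-clockwise on screen; rotated_box_corners is a hypothetical helper and the corner ordering is a choice made for this example.

```python
import math


def rotated_box_corners(cnt_x, cnt_y, w, h, angle):
    """Corners of a box rotated CCW by `angle` degrees, in image coordinates (y down)."""
    theta = math.radians(angle)
    c, s = math.cos(theta), math.sin(theta)
    # Corners of the unrotated box relative to its center.
    rect = [(-w / 2, h / 2), (-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2)]
    # With y pointing down, this rotation appears counter-clockwise on screen.
    return [(s * yy + c * xx + cnt_x, c * yy - s * xx + cnt_y) for xx, yy in rect]


corners_0 = rotated_box_corners(100, 50, 40, 20, 0)    # axis-aligned, 40 wide, 20 tall
corners_90 = rotated_box_corners(100, 50, 40, 20, 90)  # extents swap: 20 wide, 40 tall
```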

draw_circle(circle_coord, color, radius=3)
Parameters:
  • circle_coord (list(int) or tuple(int)) – contains the x and y coordinates of the center of the circle.

  • color – color of the circle. Refer to matplotlib.colors for a full list of formats that are accepted.

  • radius (int) – radius of the circle.

Returns:

output (VisImage) – image object with circle drawn.

draw_line(x_data, y_data, color, linestyle='-', linewidth=None)
Parameters:
  • x_data (list[int]) – a list containing x values of all the points being drawn. Length of list should match the length of y_data.

  • y_data (list[int]) – a list containing y values of all the points being drawn. Length of list should match the length of x_data.

  • color – color of the line. Refer to matplotlib.colors for a full list of formats that are accepted.

  • linestyle – style of the line. Refer to matplotlib.lines.Line2D for a full list of formats that are accepted.

  • linewidth (float or None) – width of the line. When it’s None, a default value will be computed and used.

Returns:

output (VisImage) – image object with line drawn.

draw_binary_mask(binary_mask, color=None, *, edge_color=None, text=None, alpha=0.5, area_threshold=10)
Parameters:
  • binary_mask (ndarray) – numpy array of shape (H, W), where H is the image height and W is the image width. Each value in the array is either a 0 or 1 value of uint8 type.

  • color – color of the mask. Refer to matplotlib.colors for a full list of formats that are accepted. If None, will pick a random color.

  • edge_color – color of the polygon edges. Refer to matplotlib.colors for a full list of formats that are accepted.

  • text (str) – if not None, will be drawn on the object

  • alpha (float) – blending coefficient. Smaller values lead to more transparent masks.

  • area_threshold (float) – a connected component smaller than this area will not be shown.

Returns:

output (VisImage) – image object with mask drawn.

draw_soft_mask(soft_mask, color=None, *, text=None, alpha=0.5)
Parameters:
  • soft_mask (ndarray) – float array of shape (H, W), each value in [0, 1].

  • color – color of the mask. Refer to matplotlib.colors for a full list of formats that are accepted. If None, will pick a random color.

  • text (str) – if not None, will be drawn on the object

  • alpha (float) – blending coefficient. Smaller values lead to more transparent masks.

Returns:

output (VisImage) – image object with mask drawn.
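How a soft mask value and alpha might combine for a single pixel can be sketched with ordinary alpha blending. The assumption that the effective opacity equals mask_value * alpha is made for this example and is not stated above; blend_pixel is a hypothetical helper working on RGB tuples in [0, 1].

```python
def blend_pixel(background, color, mask_value, alpha=0.5):
    """Blend one RGB pixel; opacity is assumed to be mask_value * alpha."""
    opacity = mask_value * alpha
    return tuple(opacity * c + (1.0 - opacity) * b for c, b in zip(color, background))


background = (0.0, 0.0, 0.0)
red = (1.0, 0.0, 0.0)
outside = blend_pixel(background, red, mask_value=0.0)  # mask is 0: background unchanged
inside = blend_pixel(background, red, mask_value=1.0)   # mask is 1: opacity equals alpha
```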

draw_polygon(segment, color, edge_color=None, alpha=0.5)
Parameters:
  • segment – numpy array of shape Nx2, containing all the points in the polygon.

  • color – color of the polygon. Refer to matplotlib.colors for a full list of formats that are accepted.

  • edge_color – color of the polygon edges. Refer to matplotlib.colors for a full list of formats that are accepted. If not provided, a darker shade of the polygon color will be used instead.

  • alpha (float) – blending coefficient. Smaller values lead to more transparent polygons.

Returns:

output (VisImage) – image object with polygon drawn.

get_output()
Returns:

output (VisImage) – the image output containing the visualizations added to the image.

detectron2.utils.video_visualizer module

class detectron2.utils.video_visualizer.VideoVisualizer(metadata, instance_mode=ColorMode.IMAGE)

Bases: object

__init__(metadata, instance_mode=ColorMode.IMAGE)
Parameters:

metadata (MetadataCatalog) – image metadata.

draw_instance_predictions(frame, predictions)

Draw instance-level prediction results on an image.

Parameters:
  • frame (ndarray) – an RGB image of shape (H, W, C), in the range [0, 255].

  • predictions (Instances) – the output of an instance detection/segmentation model. The following fields will be used to draw: “pred_boxes”, “pred_classes”, “scores”, “pred_masks” (or “pred_masks_rle”).

Returns:

output (VisImage) – image object with visualizations.

draw_sem_seg(frame, sem_seg, area_threshold=None)
Parameters:
  • sem_seg (ndarray or Tensor) – semantic segmentation of shape (H, W), each value is the integer label.

  • area_threshold (Optional[int]) – only draw segmentations larger than the threshold