cv2.dnn
Attributes¶
Classes¶
- class cv2.dnn.ClassificationModel¶
-
- classify(frame) classId, conf ¶
Given the input frame, creates an input blob, runs the net and returns the top class index together with its confidence. Accepts either a cv2.typing.MatLike or a cv2.UMat frame.
- setEnableSoftmaxPostProcessing(enable) retval ¶
Sets the softmax post-processing option. If enabled, softmax is applied after the forward pass inside classify() to map the confidences to the [0.0, 1.0] range. Enable it when the model does not contain a softmax layer of its own.
enable: whether to apply softmax post processing within classify().
- Parameters:
self –
enable (bool) –
- Return type:
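A minimal usage sketch for ClassificationModel. The model file, input image and preprocessing values below are hypothetical placeholders, not requirements of any particular network:

```python
import cv2

# Hypothetical classifier file; any image-classification network readable by
# cv2.dnn works the same way. Preprocessing values are illustrative only.
model = cv2.dnn.ClassificationModel("classifier.onnx")
model.setInputParams(scale=1.0 / 255, size=(224, 224), swapRB=True)

# Apply softmax after inference if the network has no softmax layer of its own.
model.setEnableSoftmaxPostProcessing(True)

img = cv2.imread("input.jpg")
class_id, confidence = model.classify(img)
print(f"class {class_id}: {confidence:.3f}")
```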
- class cv2.dnn.DetectionModel¶
-
- detect(frame[, confThreshold[, nmsThreshold]]) classIds, confidences, boxes ¶
Given the input frame, creates an input blob, runs the net and returns the result detections.
frame: the input image.
classIds: class indexes of the result detections.
confidences: the corresponding confidences.
boxes: the corresponding bounding boxes.
confThreshold: threshold used to filter boxes by confidence.
nmsThreshold: threshold used in non-maximum suppression.
- setNmsAcrossClasses(value) retval ¶
nmsAcrossClasses defaults to false, so that when non-maximum suppression is used during detect() it is applied per class. This function lets you toggle that behaviour.
value: the new value for nmsAcrossClasses.
- Parameters:
self –
value (bool) –
- Return type:
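A minimal usage sketch for DetectionModel, assuming hypothetical Darknet weights/cfg files and illustrative thresholds:

```python
import cv2

# Hypothetical Darknet detector (weights + cfg); thresholds are illustrative.
model = cv2.dnn.DetectionModel("yolov4.weights", "yolov4.cfg")
model.setInputParams(scale=1.0 / 255, size=(416, 416), swapRB=True)
model.setNmsAcrossClasses(False)  # per-class NMS, which is the default

img = cv2.imread("street.jpg")
class_ids, confidences, boxes = model.detect(img, confThreshold=0.5, nmsThreshold=0.4)
for class_id, conf, (x, y, w, h) in zip(class_ids, confidences, boxes):
    cv2.rectangle(img, (int(x), int(y)), (int(x + w), int(y + h)), (0, 255, 0), 2)
```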
- class cv2.dnn.DictValue¶
- class cv2.dnn.Image2BlobParams¶
- __init__(self)¶
- Parameters:
self –
- Return type:
None
- __init__(self, scalefactor: cv2.typing.Scalar, size: cv2.typing.Size = ..., mean: cv2.typing.Scalar = ..., swapRB: bool = ..., ddepth: int = ..., datalayout: DataLayout = ..., mode: ImagePaddingMode = ..., borderValue: cv2.typing.Scalar = ...)¶
- blobRectToImageRect(rBlob, size) retval ¶
Get rectangle coordinates in the original image coordinate system from a rectangle given in blob coordinates (rBlob), for an original input image of the given size. Returns the rectangle in original image coordinates.
- Parameters:
self –
rBlob (cv2.typing.Rect) –
size (cv2.typing.Size) –
- Return type:
cv2.typing.Rect
- blobRectsToImageRects(rBlob, size) rImg ¶
Get rectangle coordinates in the original image coordinate system from rectangles given in blob coordinates (rBlob), for an original input image of the given size. Returns the corresponding rectangles in image coordinates (rImg).
- Parameters:
self –
rBlob (_typing.Sequence[cv2.typing.Rect]) –
size (cv2.typing.Size) –
- Return type:
_typing.Sequence[cv2.typing.Rect]
- scalefactor: cv2.typing.Scalar¶
- size: cv2.typing.Size¶
- mean: cv2.typing.Scalar¶
- datalayout: DataLayout¶
- paddingmode: ImagePaddingMode¶
- borderValue: cv2.typing.Scalar¶
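A sketch of Image2BlobParams in use. Keyword names follow the __init__ signature shown above; the file names, sizes and the letterbox padding mode (cv2.dnn.DNN_PMODE_LETTERBOX, assumed available in recent OpenCV builds) are illustrative:

```python
import cv2

img = cv2.imread("input.jpg")  # hypothetical input image

# Describe the preprocessing once; reuse it both to build the blob and to map
# blob-space rectangles back into the original image.
params = cv2.dnn.Image2BlobParams(
    scalefactor=(1.0 / 255, 1.0 / 255, 1.0 / 255, 1.0 / 255),
    size=(640, 640),
    swapRB=True,
    mode=cv2.dnn.DNN_PMODE_LETTERBOX,
)
blob = cv2.dnn.blobFromImageWithParams(img, params)

# A detection expressed in blob coordinates (illustrative values).
rect_in_blob = (100, 120, 50, 80)
rect_in_image = params.blobRectToImageRect(rect_in_blob, (img.shape[1], img.shape[0]))
print(rect_in_image)
```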
- class cv2.dnn.KeypointsModel¶
-
- estimate(frame[, thresh]) retval ¶
Given the input frame, creates an input blob, runs the net and returns a vector holding the x and y coordinates of each detected keypoint. thresh is the minimum confidence threshold used to select a keypoint.
- Parameters:
self –
frame (cv2.typing.MatLike) –
thresh (float) –
- Return type:
_typing.Sequence[cv2.typing.Point2f]
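A minimal sketch for KeypointsModel, assuming a hypothetical Caffe pose network and an illustrative input size and threshold:

```python
import cv2

# Hypothetical pose network in Caffe format.
model = cv2.dnn.KeypointsModel("pose.caffemodel", "pose.prototxt")
model.setInputParams(scale=1.0 / 255, size=(368, 368))

img = cv2.imread("person.jpg")
points = model.estimate(img, thresh=0.3)  # one (x, y) point per detected keypoint
for x, y in points:
    cv2.circle(img, (int(x), int(y)), 3, (0, 0, 255), -1)
```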
- class cv2.dnn.Layer¶
-
- finalize(inputs[, outputs]) outputs ¶
Computes and sets internal parameters according to inputs, outputs and blobs.
inputs: vector of already allocated input blobs.
outputs: vector of already allocated output blobs.
This method is called after the network has allocated all memory for input and output blobs and before inferencing.
- Parameters:
self –
inputs (_typing.Sequence[cv2.typing.MatLike]) –
outputs (_typing.Sequence[cv2.typing.MatLike] | None) –
- Return type:
_typing.Sequence[cv2.typing.MatLike]
- run(inputs, internals[, outputs]) outputs, internals ¶
Allocates the layer and computes its output.
Deprecated: this method will be removed in a future release.
- Parameters:
self –
inputs (_typing.Sequence[cv2.typing.MatLike]) –
internals (_typing.Sequence[cv2.typing.MatLike]) –
outputs (_typing.Sequence[cv2.typing.MatLike] | None) –
- Return type:
tuple[_typing.Sequence[cv2.typing.MatLike], _typing.Sequence[cv2.typing.MatLike]]
- outputNameToIndex(outputName) retval ¶
Returns the index of an output blob in the output array. See also inputNameToIndex().
- blobs: _typing.Sequence[cv2.typing.MatLike]¶
- class cv2.dnn.Model¶
-
- setInputSize(size) retval ¶
Set the input size for the frame.
size: new input size.
Note: if a dimension of the new size is not positive, the frame size is not changed.
An overload accepts the new input width and height separately.
- Parameters:
self –
size (cv2.typing.Size) –
- Return type:
- predict(frame[, outs]) outs ¶
Given the input frame, creates an input blob, runs the net and returns the output blobs (outs), which store the results of the computation.
- Parameters:
self –
frame (cv2.typing.MatLike) –
outs (_typing.Sequence[cv2.typing.MatLike] | None) –
- Return type:
_typing.Sequence[cv2.typing.MatLike]
- setInputMean(mean) retval ¶
Set the mean value for the frame: a scalar whose values are subtracted from the channels.
- Parameters:
self –
mean (cv2.typing.Scalar) –
- Return type:
- setInputScale(scale) retval ¶
Set the scalefactor value for the frame: a multiplier applied to the frame values.
- Parameters:
self –
scale (cv2.typing.Scalar) –
- Return type:
- setInputCrop(crop) retval ¶
Set the crop flag for the frame: indicates whether the image will be cropped after resize.
- setInputSwapRB(swapRB) retval ¶
Set the swapRB flag for the frame: indicates whether the first and last channels should be swapped.
- setInputParams([scale[, size[, mean[, swapRB[, crop]]]]]) None ¶
Set preprocessing parameters for the frame.
size: new input size.
mean: scalar with mean values which are subtracted from the channels.
scale: multiplier for frame values.
swapRB: flag which indicates that the first and last channels should be swapped.
crop: flag which indicates whether the image will be cropped after resize.
blob(n, c, y, x) = scale * (resize(frame(y, x, c)) - mean(c))
- setPreferableBackend(backendId) retval ¶
- Parameters:
self –
backendId (Backend) –
- Return type:
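A sketch of the generic Model workflow: set preprocessing once, then run a raw forward pass with predict(). The model path and preprocessing values are hypothetical:

```python
import cv2

# Generic Model usage: preprocessing plus a raw forward pass, assuming an
# arbitrary ONNX network at a hypothetical path.
model = cv2.dnn.Model("network.onnx")
model.setInputParams(scale=1.0 / 255, size=(224, 224), mean=(0, 0, 0), swapRB=True, crop=False)
model.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)

img = cv2.imread("input.jpg")
outs = model.predict(img)            # list of raw output blobs
print([out.shape for out in outs])
```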
- class cv2.dnn.Net¶
- classmethod readFromModelOptimizer(xml, bin) retval ¶
Create a network from Intel's Model Optimizer intermediate representation (IR) files.
xml: path to the XML file with the network's topology.
bin: path to the binary file with the trained weights.
Returns a Net object.
- classmethod readFromModelOptimizer(xml, bin) retval ¶
Create a network from Intel's Model Optimizer in-memory buffers with intermediate representation (IR).
bufferModelConfig: buffer with the model's configuration.
bufferWeights: buffer with the model's trained weights.
Returns a Net object.
- Parameters:
cls –
bufferModelConfig (numpy.ndarray[_typing.Any, numpy.dtype[numpy.uint8]]) –
bufferWeights (numpy.ndarray[_typing.Any, numpy.dtype[numpy.uint8]]) –
- Return type:
- getLayer(layerId) retval ¶
Returns a pointer to the layer with the specified id or name used by the network. The name-based and DictValue-based overloads are deprecated; use int getLayerId(const String &layer) to resolve a name to an id and pass the integer id instead.
- Parameters:
self –
layerId (cv2.typing.LayerId) –
- Return type:
- forward([outputName]) retval ¶
Runs a forward pass to compute the output of the layer with name outputName; by default the forward pass is run for the whole network. Returns the blob for the first output of the specified layer.
- Parameters:
self –
outputName (str) –
- Return type:
cv2.typing.MatLike
- forward([outputName]) retval ¶
Runs a forward pass and returns the blobs (outputBlobs) for the first outputs of the layer named outputName; by default the forward pass is run for the whole network.
- Parameters:
self –
outputBlobs (_typing.Sequence[cv2.typing.MatLike] | None) –
outputName (str) –
- Return type:
_typing.Sequence[cv2.typing.MatLike]
- forward([outputName]) retval ¶
Runs a forward pass to compute the outputs of the layers listed in outBlobNames; outputBlobs receives the blobs for the first outputs of those layers.
- Parameters:
self –
outBlobNames (_typing.Sequence[str]) –
outputBlobs (_typing.Sequence[cv2.typing.MatLike] | None) –
- Return type:
_typing.Sequence[cv2.typing.MatLike]
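A low-level Net sketch showing setInput() and the two most common forward() forms. The model file and blob parameters are hypothetical:

```python
import cv2

# Low-level Net workflow with a hypothetical ONNX model file.
net = cv2.dnn.readNetFromONNX("model.onnx")

img = cv2.imread("input.jpg")
blob = cv2.dnn.blobFromImage(img, scalefactor=1.0 / 255, size=(224, 224), swapRB=True)
net.setInput(blob)

out = net.forward()                                      # whole-network forward pass
outs = net.forward(net.getUnconnectedOutLayersNames())   # or every unconnected output
print(out.shape, [o.shape for o in outs])
```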
- quantize(calibData, inputsDtype, outputsDtype[, perChannel]) retval ¶
Returns a quantized Net from a floating-point Net.
calibData: calibration data used to compute the quantization parameters.
inputsDtype: datatype of the quantized net's inputs, CV_32F or CV_8S.
outputsDtype: datatype of the quantized net's outputs, CV_32F or CV_8S.
perChannel: quantization granularity. The default is true, i.e. the model is quantized per channel (channel-wise); set it to false to quantize per tensor (tensor-wise).
- setInput(blob[, name[, scalefactor[, mean]]]) None ¶
Sets the new input value for the network.
blob: a new blob with CV_32F or CV_8U depth.
name: name of the input layer.
scalefactor: an optional normalization scale.
mean: optional mean subtraction values.
See connect(String, String) for the format of the descriptor.
If scale or mean values are specified, the final input blob is computed as: input(n,c,h,w) = scalefactor * (blob(n,c,h,w) - mean_c).
- setParam(layer, numParam, blob) None ¶
Sets the new value for the learned parameter of a layer.
layer: name or id of the layer.
numParam: index of the parameter in the Layer::blobs array.
blob: the new value.
Note: if the shape of the new blob differs from the previous shape, the following forward pass may fail. See also Layer::blobs.
- getParam(layer[, numParam]) retval ¶
Returns a parameter blob of the layer. layer: name or id of the layer. numParam: index of the parameter in the Layer::blobs array. See also Layer::blobs.
- getLayersShapes(netInputShapes) layersIds, inLayersShapes, outLayersShapes ¶
Returns input and output shapes for all layers of the loaded model; no preliminary inference is necessary.
netInputShapes: shapes for all input blobs of the net input layer.
layersIds: output parameter for layer IDs.
inLayersShapes: output parameter for input layer shapes, in the same order as layersIds.
outLayersShapes: output parameter for output layer shapes, in the same order as layersIds.
- getFLOPS(netInputShapes) retval ¶
Computes FLOP for the whole loaded model with the specified input shapes. netInputShapes: vector of shapes for all net inputs. Returns the computed FLOP.
- Parameters:
self –
netInputShapes (_typing.Sequence[cv2.typing.MatShape]) –
- Return type:
- getFLOPS(netInputShapes) retval ¶
Computes FLOP for the whole loaded model with a single specified input shape (netInputShape). Returns the computed FLOP.
- Parameters:
self –
netInputShape (cv2.typing.MatShape) –
- Return type:
- getMemoryConsumption(netInputShape) weights, blobs ¶
Computes the number of bytes required to store all weights and intermediate blobs for the model with the given input shape(s). weights: output parameter for the bytes required to store weights. blobs: output parameter for the bytes required to store intermediate blobs.
- __init__(self)¶
- Parameters:
self –
- Return type:
None
- empty() retval ¶
Returns true if there are no layers in the network.
- Parameters:
self –
- Return type:
- dump() retval ¶
Dumps the net to a String describing its structure, hyperparameters, backend, target and fusion settings. Call this method after setInput(); to see the correct backend, target and fusion, call it after forward().
- Parameters:
self –
- Return type:
- dumpToFile(path) None ¶
Dumps the net structure, hyperparameters, backend, target and fusion to a dot file. path: path to the output file with a .dot extension. See also dump().
- Parameters:
self –
path (str) –
- Return type:
None
- getLayerId(layer) retval ¶
Converts the string name of a layer to its integer identifier. Returns the id of the layer, or -1 if the layer was not found.
- connect(outPin, inpPin) None ¶
Connects the output of the first layer to the input of the second layer.
outPin: descriptor of the first layer output.
inpPin: descriptor of the second layer input.
Descriptors have the template <layer_name>[.input_number]: the first part, layer_name, is the string name of the added layer (if it is empty, the network input pseudo layer is used); the second, optional part, input_number, is either the number of the layer input or its label (if omitted, the first layer input is used).
See also setNetInputs(), Layer::inputNameToIndex(), Layer::outputNameToIndex().
- setInputsNames(inputBlobNames) None ¶
Sets the output names of the network input pseudo layer.
Each net always has its own special network input pseudo layer with id=0. This layer only stores user blobs and does not perform any computation; in fact, it is the only way to pass user data into the network. As with any other layer, it can label its outputs, and this function provides an easy way to do so.
- Parameters:
self –
inputBlobNames (_typing.Sequence[str]) –
- Return type:
None
- setInputShape(inputName, shape) None ¶
Specify shape of network input.
- Parameters:
self –
inputName (str) –
shape (cv2.typing.MatShape) –
- Return type:
None
- forwardAsync([outputName]) retval ¶
Runs a forward pass to compute the output of the layer with name outputName; by default the forward pass is run for the whole network. This is an asynchronous version of forward(const String&); the dnn::DNN_BACKEND_INFERENCE_ENGINE backend is required.
- Parameters:
self –
outputName (str) –
- Return type:
- forwardAndRetrieve(outBlobNames) outputBlobs ¶
Runs a forward pass to compute the outputs of the layers listed in outBlobNames. outputBlobs contains all output blobs for each layer specified in outBlobNames.
- Parameters:
self –
outBlobNames (_typing.Sequence[str]) –
- Return type:
_typing.Sequence[_typing.Sequence[cv2.typing.MatLike]]
- getInputDetails() scales, zeropoints ¶
Returns the input scales and zeropoints of a quantized Net.
- getOutputDetails() scales, zeropoints ¶
Returns the output scales and zeropoints of a quantized Net.
- setHalideScheduler(scheduler) None ¶
Compiles Halide layers. scheduler: path to a YAML file with scheduling directives. Schedules layers that support the Halide backend and compiles them for the specific target. Layers not present in the scheduling file, or all layers if no manual scheduling is used, get automatic scheduling. See also setPreferableBackend.
- Parameters:
self –
scheduler (str) –
- Return type:
None
- setPreferableBackend(backendId) None ¶
Asks the network to use a specific computation backend where supported. backendId: backend identifier. See also Backend.
- Parameters:
self –
backendId (int) –
- Return type:
None
- setPreferableTarget(targetId) None ¶
Asks the network to make computations on a specific target device. targetId: target identifier. See also Target.
List of supported backend / target combinations:

|                        | DNN_BACKEND_OPENCV | DNN_BACKEND_INFERENCE_ENGINE | DNN_BACKEND_HALIDE | DNN_BACKEND_CUDA |
|------------------------|--------------------|------------------------------|--------------------|------------------|
| DNN_TARGET_CPU         | +                  | +                            | +                  |                  |
| DNN_TARGET_OPENCL      | +                  | +                            | +                  |                  |
| DNN_TARGET_OPENCL_FP16 | +                  | +                            |                    |                  |
| DNN_TARGET_MYRIAD      |                    | +                            |                    |                  |
| DNN_TARGET_FPGA        |                    | +                            |                    |                  |
| DNN_TARGET_CUDA        |                    |                              |                    | +                |
| DNN_TARGET_CUDA_FP16   |                    |                              |                    | +                |
| DNN_TARGET_HDDL        |                    | +                            |                    |                  |
- Parameters:
self –
targetId (int) –
- Return type:
None
- getUnconnectedOutLayers() retval ¶
Returns indexes of layers with unconnected outputs. FIXIT: rework the API towards the registerOutput() approach and deprecate this call.
- Parameters:
self –
- Return type:
_typing.Sequence[int]
- getUnconnectedOutLayersNames() retval ¶
Returns names of layers with unconnected outputs. FIXIT: rework the API towards the registerOutput() approach and deprecate this call.
- Parameters:
self –
- Return type:
_typing.Sequence[str]
- getLayerTypes() layersTypes ¶
Returns the list of layer types used in the model. layersTypes: output parameter for returning the types.
- Parameters:
self –
- Return type:
_typing.Sequence[str]
- getLayersCount(layerType) retval ¶
Returns the count of layers of the specified type (layerType).
- enableFusion(fusion) None ¶
Enables or disables layer fusion in the network. fusion: true to enable fusion, false to disable it. Fusion is enabled by default.
- Parameters:
self –
fusion (bool) –
- Return type:
None
- enableWinograd(useWinograd) None ¶
Enables or disables the Winograd compute branch, which can speed up 3x3 convolution at a small loss of accuracy. useWinograd: true to enable the Winograd compute branch. The default is true.
- Parameters:
self –
useWinograd (bool) –
- Return type:
None
- getPerfProfile() retval, timings ¶
Returns the overall time for inference and per-layer timings (in ticks). Indexes in the returned vector correspond to layer ids. Some layers can be fused with others; in that case a zero tick count is returned for the skipped layers. Supported by DNN_BACKEND_OPENCV on DNN_TARGET_CPU only. timings: vector of tick timings for all layers. Returns overall ticks for model inference.
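A timing sketch built around getPerfProfile(); the model and image files are hypothetical, and ticks are converted to milliseconds with cv2.getTickFrequency():

```python
import cv2

# getPerfProfile() reports ticks; convert to milliseconds for readability.
net = cv2.dnn.readNetFromONNX("model.onnx")   # hypothetical model file
img = cv2.imread("input.jpg")
net.setInput(cv2.dnn.blobFromImage(img, size=(224, 224)))
net.forward()

total_ticks, layer_ticks = net.getPerfProfile()
elapsed_ms = total_ticks * 1000.0 / cv2.getTickFrequency()
print(f"inference: {elapsed_ms:.1f} ms across {len(layer_ticks)} layers")
```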
- class cv2.dnn.SegmentationModel¶
-
- segment(frame[, mask]) mask ¶
Given the input frame, creates an input blob, runs the net and returns the class prediction mask (one class per pixel).
- Parameters:
self –
frame (cv2.typing.MatLike) –
mask (cv2.typing.MatLike | None) –
- Return type:
cv2.typing.MatLike
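A minimal SegmentationModel sketch with a hypothetical ONNX segmentation network and illustrative preprocessing:

```python
import cv2
import numpy as np

# Hypothetical semantic-segmentation network; the mask holds one class id per pixel.
model = cv2.dnn.SegmentationModel("segnet.onnx")
model.setInputParams(scale=1.0 / 255, size=(512, 512), swapRB=True)

img = cv2.imread("scene.jpg")
mask = model.segment(img)
print("classes present:", np.unique(mask))
```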
- class cv2.dnn.TextDetectionModel¶
- detect(frame) detections, confidences ¶
Performs detection.
Given the input frame, prepares the network input, runs network inference, post-processes the network output and returns the result detections.
Each result is a quadrangle's 4 points in this order: bottom-left, top-left, top-right, bottom-right. Use cv::getPerspectiveTransform to retrieve the image region without perspective transformations.
Note: if the DL model does not support that kind of output, the result may be derived from the detectTextRectangles() output.
frame: the input image. detections: array with detection quadrangles (4 points per result). confidences: array with detection confidences.
- Parameters:
self –
frame (cv2.typing.MatLike) –
- Return type:
_typing.Sequence[_typing.Sequence[cv2.typing.Point]]
- detect(frame) detections, confidences ¶
See the description above; this overload accepts a cv2.UMat frame.
- Parameters:
self –
frame (cv2.UMat) –
- Return type:
_typing.Sequence[_typing.Sequence[cv2.typing.Point]]
- detectTextRectangles(frame) detections, confidences ¶
Performs detection.
Given the input frame, prepares the network input, runs network inference, post-processes the network output and returns the result detections. Each result is a rotated rectangle.
Note: results may be inaccurate in the case of strong perspective transformations.
frame: the input image. detections: array with RotatedRect results. confidences: array with detection confidences.
- Parameters:
self –
frame (cv2.typing.MatLike) –
- Return type:
_typing.Sequence[cv2.typing.RotatedRect]
- detectTextRectangles(frame) detections, confidences ¶
See the description above; this overload accepts a cv2.UMat frame.
- Parameters:
self –
frame (cv2.UMat) –
- Return type:
_typing.Sequence[cv2.typing.RotatedRect]
- class cv2.dnn.TextDetectionModel_DB¶
-
- setBinaryThreshold(binaryThreshold) retval ¶
- Parameters:
self –
binaryThreshold (float) –
- Return type:
- setPolygonThreshold(polygonThreshold) retval ¶
- Parameters:
self –
polygonThreshold (float) –
- Return type:
- class cv2.dnn.TextDetectionModel_EAST¶
-
- setConfidenceThreshold(confThreshold) retval ¶
Set the detection confidence threshold. confThreshold: a threshold used to filter boxes by confidence.
- Parameters:
self –
confThreshold (float) –
- Return type:
- getConfidenceThreshold() retval ¶
Get the detection confidence threshold.
- Parameters:
self –
- Return type:
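A text-detection sketch with TextDetectionModel_EAST. The model file name, thresholds and the 320x320 input with the commonly published EAST mean values are illustrative; setNMSThreshold is assumed to be available on this class in recent OpenCV builds:

```python
import cv2

# Hypothetical EAST detector file; preprocessing values may not match your model.
det = cv2.dnn.TextDetectionModel_EAST("frozen_east_text_detection.pb")
det.setConfidenceThreshold(0.5)
det.setNMSThreshold(0.4)
det.setInputParams(scale=1.0, size=(320, 320), mean=(123.68, 116.78, 103.94), swapRB=True)

img = cv2.imread("sign.jpg")
quads, confidences = det.detect(img)   # four corner points per detected text region
for quad in quads:
    cv2.polylines(img, [quad], isClosed=True, color=(0, 255, 0), thickness=2)
```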
- class cv2.dnn.TextRecognitionModel¶
-
- recognize(frame) retval ¶
Given the input frame, creates an input blob, runs the net and returns the recognition result.
frame: the input image.
Returns the text recognition result.
- Parameters:
self –
frame (cv2.typing.MatLike) –
- Return type:
- recognize(frame) retval ¶
Given the input frame, creates an input blob, runs the net and returns the recognition results.
frame: the input image.
roiRects: list of text detection regions of interest (cv::Rect, CV_32SC4); each ROI is cropped and used as a network input.
results: a set of text recognition results.
- Parameters:
self –
frame (cv2.typing.MatLike) –
roiRects (_typing.Sequence[cv2.typing.MatLike]) –
- Return type:
_typing.Sequence[str]
- setDecodeType(decodeType) retval ¶
Set the decoding method for translating the network output into a string.
decodeType: the decoding method; currently supported types are "CTC-greedy" (greedy decoding for the output of CTC-based methods) and "CTC-prefix-beam-search" (prefix beam search decoding for the output of CTC-based methods).
- Parameters:
self –
decodeType (str) –
- Return type:
- getDecodeType() retval ¶
Get the decoding method currently in use.
- Parameters:
self –
- Return type:
- setDecodeOptsCTCPrefixBeamSearch(beamSize[, vocPruneSize]) retval ¶
Set the decoding options for "CTC-prefix-beam-search" decode usage.
beamSize: beam size for the search.
vocPruneSize: parameter to optimize big-vocabulary search; only the top vocPruneSize tokens are taken in each search step. vocPruneSize <= 0 disables this pruning.
- Parameters:
- Return type:
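A recognition sketch with TextRecognitionModel. The model file is hypothetical, the vocabulary must match the one the model was trained with, and setVocabulary (not listed in this excerpt) is assumed to be available on the class:

```python
import cv2

# Hypothetical CRNN-style recognizer with a CTC head.
rec = cv2.dnn.TextRecognitionModel("crnn.onnx")
rec.setDecodeType("CTC-greedy")
rec.setVocabulary(list("0123456789abcdefghijklmnopqrstuvwxyz"))
rec.setInputParams(scale=1.0 / 127.5, size=(100, 32), mean=(127.5, 127.5, 127.5))

img = cv2.imread("word.png")
print(rec.recognize(img))
```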
Functions¶
- cv2.dnn.NMSBoxes(bboxes, scores, score_threshold, nms_threshold[, eta[, top_k]]) indices ¶
Performs non-maximum suppression given boxes and corresponding scores.
bboxes: a set of bounding boxes to apply NMS to.
scores: a set of corresponding confidences.
score_threshold: a threshold used to filter boxes by score.
nms_threshold: a threshold used in non-maximum suppression.
indices: the kept indices of bboxes after NMS.
eta: a coefficient in the adaptive threshold formula: nms_threshold_{i+1} = eta * nms_threshold_i.
top_k: if > 0, keep at most top_k picked indices.
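A small NMSBoxes sketch with hand-made boxes in (x, y, w, h) format; the values are illustrative:

```python
import cv2

# The first two boxes overlap heavily, so NMS keeps only one of them.
boxes = [(10, 10, 100, 100), (12, 12, 100, 100), (300, 300, 80, 80)]
scores = [0.9, 0.8, 0.75]

keep = cv2.dnn.NMSBoxes(boxes, scores, score_threshold=0.5, nms_threshold=0.4)
print(keep)   # indices of the surviving boxes, e.g. [0 2]
```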
- cv2.dnn.NMSBoxesBatched(bboxes, scores, class_ids, score_threshold, nms_threshold[, eta[, top_k]]) indices ¶
Performs batched non-maximum suppression on given boxes and corresponding scores across different classes.
bboxes: a set of bounding boxes to apply NMS to.
scores: a set of corresponding confidences.
class_ids: a set of corresponding class ids; ids are integers and usually start from 0.
score_threshold: a threshold used to filter boxes by score.
nms_threshold: a threshold used in non-maximum suppression.
indices: the kept indices of bboxes after NMS.
eta: a coefficient in the adaptive threshold formula: nms_threshold_{i+1} = eta * nms_threshold_i.
top_k: if > 0, keep at most top_k picked indices.
- cv2.dnn.NMSBoxesRotated(bboxes, scores, score_threshold, nms_threshold[, eta[, top_k]]) indices ¶
- cv2.dnn.Net_readFromModelOptimizer(xml, bin) retval ¶
Create a network from Intel's Model Optimizer in-memory buffers with intermediate representation (IR).
bufferModelConfig: buffer with the model's configuration.
bufferWeights: buffer with the model's trained weights.
Returns a Net object.
- Return type:
- cv2.dnn.blobFromImage(image[, scalefactor[, size[, mean[, swapRB[, crop[, ddepth]]]]]]) retval ¶
Creates a 4-dimensional blob from an image. Optionally resizes and crops the image from the center, subtracts mean values, scales values by scalefactor, and swaps the Blue and Red channels.
image: input image (with 1, 3 or 4 channels).
scalefactor: multiplier for the image values.
size: spatial size of the output image.
mean: scalar with mean values which are subtracted from the channels. Values are intended to be in (mean-R, mean-G, mean-B) order if the image has BGR ordering and swapRB is true.
swapRB: flag which indicates that the first and last channels of a 3-channel image should be swapped.
crop: flag which indicates whether the image will be cropped after resize.
ddepth: depth of the output blob; choose CV_32F or CV_8U.
If crop is true, the input image is resized so that one side after resize equals the corresponding dimension in size and the other is equal or larger; then a crop from the center is performed. If crop is false, a direct resize without cropping (not preserving the aspect ratio) is performed.
Returns a 4-dimensional Mat with NCHW dimension order.
Note: the order and usage of scalefactor and mean are (input - mean) * scalefactor.
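A blobFromImage sketch; the image path, mean values and scalefactor are illustrative, not values required by any particular network:

```python
import cv2

img = cv2.imread("input.jpg")   # hypothetical BGR image

# (input - mean) * scalefactor, resized to 224x224, channels swapped to RGB.
blob = cv2.dnn.blobFromImage(
    img,
    scalefactor=1.0 / 255,
    size=(224, 224),
    mean=(104, 117, 123),
    swapRB=True,
    crop=False,
)
print(blob.shape)   # (1, 3, 224, 224) in NCHW order
```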
- cv2.dnn.blobFromImageWithParams(image[, param]) retval ¶
Creates a 4-dimensional blob from an image with the given params. This function is an extension of blobFromImage to meet more image preprocessing needs: given an input image and the preprocessing parameters, it outputs the blob. image: input image (with 1, 3 or 4 channels). param: an Image2BlobParams struct containing all parameters needed to convert the image to a blob. Returns a 4-dimensional Mat.
- Parameters:
image (cv2.typing.MatLike) –
param (Image2BlobParams) –
- Return type:
cv2.typing.MatLike
- cv2.dnn.blobFromImages(images[, scalefactor[, size[, mean[, swapRB[, crop[, ddepth]]]]]]) retval ¶
Creates a 4-dimensional blob from a series of images. Optionally resizes and crops the images from the center, subtracts mean values, scales values by scalefactor, and swaps the Blue and Red channels.
images: input images (all with 1, 3 or 4 channels).
size: spatial size of the output image.
mean: scalar with mean values which are subtracted from the channels. Values are intended to be in (mean-R, mean-G, mean-B) order if the images have BGR ordering and swapRB is true.
scalefactor: multiplier for the image values.
swapRB: flag which indicates that the first and last channels of a 3-channel image should be swapped.
crop: flag which indicates whether the images will be cropped after resize.
ddepth: depth of the output blob; choose CV_32F or CV_8U.
If crop is true, each input image is resized so that one side after resize equals the corresponding dimension in size and the other is equal or larger; then a crop from the center is performed. If crop is false, a direct resize without cropping (not preserving the aspect ratio) is performed.
Returns a 4-dimensional Mat with NCHW dimension order.
Note: the order and usage of scalefactor and mean are (input - mean) * scalefactor.
- cv2.dnn.blobFromImagesWithParams(images[, param]) retval ¶
Creates a 4-dimensional blob from a series of images with the given params. This function is an extension of blobFromImages to meet more image preprocessing needs. images: input images (all with 1, 3 or 4 channels). param: an Image2BlobParams struct containing all parameters needed to convert the images to a blob. Returns a 4-dimensional Mat.
- Parameters:
images (_typing.Sequence[cv2.typing.MatLike]) –
param (Image2BlobParams) –
- Return type:
cv2.typing.MatLike
- cv2.dnn.getAvailableTargets(be) retval ¶
- Parameters:
be (Backend) –
- Return type:
_typing.Sequence[Target]
- cv2.dnn.imagesFromBlob(blob_[, images_]) images_ ¶
Parses a 4D blob and outputs the images it contains as 2D arrays through a simpler data structure (std::vector<cv::Mat>). blob_: 4-dimensional array (images, channels, height, width) in floating-point precision (CV_32F) from which to extract the images. images_: array of 2D Mat containing the images extracted from the blob in floating-point precision (CV_32F); they are neither normalized nor mean-added. The number of returned images equals the first dimension of the blob (batch size); every image has a number of channels equal to the second dimension of the blob (depth).
- Parameters:
blob_ (cv2.typing.MatLike) –
images_ (_typing.Sequence[cv2.typing.MatLike] | None) –
- Return type:
_typing.Sequence[cv2.typing.MatLike]
- cv2.dnn.readNet(model[, config[, framework]]) retval ¶
Reads a deep learning network represented in one of the supported formats.
model: binary file containing the trained weights. The following file extensions are expected for models from different frameworks:
*.caffemodel (Caffe, http://caffe.berkeleyvision.org/)
*.pb (TensorFlow, https://www.tensorflow.org/)
*.t7 | *.net (Torch, http://torch.ch/)
*.weights (Darknet, https://pjreddie.com/darknet/)
*.bin | *.onnx (OpenVINO, https://software.intel.com/openvino-toolkit)
*.onnx (ONNX, https://onnx.ai/)
config: text file containing the network configuration. It can be a file with one of the following extensions:
*.prototxt (Caffe, http://caffe.berkeleyvision.org/)
*.pbtxt (TensorFlow, https://www.tensorflow.org/)
*.cfg (Darknet, https://pjreddie.com/darknet/)
*.xml (OpenVINO, https://software.intel.com/openvino-toolkit)
framework: explicit framework name tag to determine the format.
Returns a Net object.
This function automatically detects the origin framework of the trained model and calls an appropriate function such as readNetFromCaffe, readNetFromTensorflow, readNetFromTorch or readNetFromDarknet. The order of the model and config arguments does not matter.
An overload reads the network from in-memory buffers: framework is the name of the origin framework, bufferModel is a buffer with the contents of the binary weights file, and bufferConfig is a buffer with the contents of the text configuration file; it also returns a Net object.
- cv2.dnn.readNetFromCaffe(prototxt[, caffeModel]) retval ¶
Reads a network model stored in Caffe framework format. prototxt: path to the .prototxt file with the text description of the network architecture. caffeModel: path to the .caffemodel file with the learned network. An overload accepts in-memory buffers (bufferProto, bufferModel) with the contents of these files. Returns a Net object.
- cv2.dnn.readNetFromDarknet(cfgFile[, darknetModel]) retval ¶
Reads a network model stored in Darknet model files. cfgFile: path to the .cfg file with the text description of the network architecture. darknetModel: path to the .weights file with the learned network. An overload accepts in-memory buffers (bufferCfg, bufferModel). Returns a Net object.
- cv2.dnn.readNetFromModelOptimizer(xml[, bin]) retval ¶
Loads a network from Intel's Model Optimizer intermediate representation. xml: path to the XML configuration file with the network's topology. bin: path to the binary file with the trained weights. An overload accepts in-memory buffers (bufferModelConfig, bufferWeights). Returns a Net object. Networks imported from Intel's Model Optimizer are launched in Intel's Inference Engine backend.
- cv2.dnn.readNetFromONNX(onnxFile) retval ¶
Reads a network model from an ONNX file (onnxFile) or from an in-memory buffer that stores the ONNX model bytes. Returns a Net object ready to run forward; throws an exception in failure cases.
- cv2.dnn.readNetFromTFLite(model) retval ¶
Reads a network model stored in TFLite framework format. model: path to the .tflite file (an overload accepts a buffer with the file's contents). Returns a Net object.
- cv2.dnn.readNetFromTensorflow(model[, config]) retval ¶
Reads a network model stored in TensorFlow framework format. model: path to the .pb file (or a buffer with its contents). config: path to the .pbtxt file (or a buffer with its contents). Returns a Net object.
- cv2.dnn.readNetFromTorch(model[, isBinary[, evaluate]]) retval ¶
Reads a network model stored in Torch7 framework format.
model: path to the file dumped from Torch using the torch.save() function.
isBinary: specifies whether the network was serialized in ascii mode or binary.
evaluate: specifies the testing phase of the network; if true, it is similar to the evaluate() method in Torch.
Returns a Net object.
Note: ascii mode of the Torch serializer is preferable, because binary mode extensively uses the long type of the C language, which has different bit lengths on different systems.
The loaded file must contain a serialized nn.Module object. Try to eliminate custom objects from the serialized data to avoid import errors.
List of supported layers (i.e. object instances derived from the Torch nn.Module class):
nn.Sequential
nn.Parallel
nn.Concat
nn.Linear
nn.SpatialConvolution
nn.SpatialMaxPooling, nn.SpatialAveragePooling
nn.ReLU, nn.TanH, nn.Sigmoid
nn.Reshape
nn.SoftMax, nn.LogSoftMax
Also some equivalents of these classes from cunn, cudnn, and fbcunn may be successfully imported.
- cv2.dnn.readTensorFromONNX(path) retval ¶
Creates a blob from a .pb file. path: path to the .pb file with the input tensor. Returns a Mat.
- Parameters:
path (str) –
- Return type:
cv2.typing.MatLike
- cv2.dnn.readTorchBlob(filename[, isBinary]) retval ¶
Loads a blob that was serialized as a torch.Tensor object of the Torch7 framework. Warning: this function has the same limitations as readNetFromTorch().
- cv2.dnn.shrinkCaffeModel(src, dst[, layersTypes]) None ¶
Converts all weights of a Caffe network to half-precision floating point.
src: path to the origin model from the Caffe framework containing single-precision floating-point weights (usually with a .caffemodel extension).
dst: path to the destination model with updated weights.
layersTypes: set of layer types whose parameters will be converted. By default, only Convolutional and Fully-Connected layers' weights are converted.
Note: a shrunk model has no origin float32 weights, so it can no longer be used in the origin Caffe framework. However, the data structure is taken from NVIDIA's Caffe fork (https://github.com/NVIDIA/caffe), so the resulting model may be used there.
- cv2.dnn.softNMSBoxes(bboxes, scores, score_threshold, nms_threshold[, top_k[, sigma[, method]]]) updated_scores, indices ¶
Performs soft non-maximum suppression given boxes and corresponding scores. Reference: https://arxiv.org/abs/1704.04503
bboxes: a set of bounding boxes to apply Soft NMS to.
scores: a set of corresponding confidences.
updated_scores: a set of corresponding updated confidences.
score_threshold: a threshold used to filter boxes by score.
nms_threshold: a threshold used in non-maximum suppression.
indices: the kept indices of bboxes after NMS.
top_k: keep at most top_k picked indices.
sigma: parameter of Gaussian weighting.
method: Gaussian or linear. See also SoftNMSMethod.
- cv2.dnn.writeTextGraph(model, output) None ¶
Creates a text representation for a binary network stored in protocol buffer format. model: path to the binary network. output: path to the output text file to be created. Note: to reduce the output file size, trained weights are not included.