
 2022-09-18 17:45:05

Gradients and Sobel Derivatives

One of the most basic and important convolutions is the computation of derivatives (or approximations to them). There are many ways to do this, but only a few are well suited to a given situation.

In general, the most common operator used to represent differentiation is the Sobel derivative operator [Sobel68] (see Figures 6-3 and 6-4). Sobel operators exist for any order of derivative as well as for mixed partial derivatives (e.g., $\partial^2/\partial x\,\partial y$).

Figure 6-3. The effect of the Sobel operator when used to approximate a first derivative in the x-dimension

In cvSobel(), src and dst are your image input and output, and xorder and yorder are the orders of the derivative. Typically you'll use 0, 1, or at most 2; a 0 value indicates no derivative in that direction. The aperture_size parameter should be odd and is the width (and the height) of the square filter. Currently, aperture_size values of 1, 3, 5, and 7 are supported. If src is 8-bit, then dst must be of depth IPL_DEPTH_16S to avoid overflow.
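To see why an 8-bit source needs a signed 16-bit destination, it can help to apply the 3-by-3 x-derivative kernel by hand. The sketch below is plain Python rather than the OpenCV C API; the helper correlate_at and the tiny test image are hypothetical, and the kernel is the commonly quoted 3-by-3 Sobel kernel for xorder=1, yorder=0.

```python
# Commonly quoted 3x3 Sobel kernel for a first derivative in x.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

def correlate_at(image, kernel, row, col):
    """Sum of kernel * image over the 3x3 window centered on (row, col)."""
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            total += kernel[dr + 1][dc + 1] * image[row + dr][col + dc]
    return total

# Hypothetical 8-bit image: a vertical step edge from 0 to 255.
image = [[0, 0, 255, 255] for _ in range(3)]

response = correlate_at(image, SOBEL_X, 1, 1)
print(response)  # 1020, far outside the 0..255 range of an 8-bit pixel
```

A step in the opposite direction would give -1020, so the result needs both a sign and more than 8 bits.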

Figure 6-4. The effect of the Sobel operator when used to approximate a first derivative in the y-dimension

Sobel derivatives have the nice property that they can be defined for kernels of any size, and those kernels can be constructed quickly and iteratively. The larger kernels give a better approximation to the derivative because the smaller kernels are very sensitive to noise.
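One way to picture the iterative construction is to build each 1-D factor by repeated convolution with a two-tap smoothing filter [1, 1] and a two-tap difference filter [-1, 1], then take an outer product. This is a minimal sketch under that assumption, not OpenCV's own code; the function names are hypothetical, and the results for sizes 3 and 5 agree with the commonly quoted Sobel coefficients.

```python
def convolve1d(a, b):
    """Full discrete convolution of two 1-D coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def sobel_1d(size, order):
    """1-D factor of a Sobel kernel: `order` differences, the rest smoothing."""
    kernel = [1]
    for _ in range(size - 1 - order):
        kernel = convolve1d(kernel, [1, 1])    # binomial smoothing step
    for _ in range(order):
        kernel = convolve1d(kernel, [-1, 1])   # finite-difference step
    return kernel

def sobel_kernel(size, xorder, yorder):
    """2-D kernel as the outer product of the y factor and the x factor."""
    kx = sobel_1d(size, xorder)
    ky = sobel_1d(size, yorder)
    return [[a * b for b in kx] for a in ky]

for row in sobel_kernel(3, 1, 0):
    print(row)
# [-1, 0, 1]
# [-2, 0, 2]
# [-1, 0, 1]
```

Each additional smoothing step widens the kernel by one tap, which is why larger apertures average over more pixels and respond less to noise.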

To understand this more exactly, we must realize that a Sobel derivative is not really a derivative at all. This is because the Sobel operator is defined on a discrete space. What the Sobel operator actually represents is a fit to a polynomial. That is, the Sobel derivative of second order in the x-direction is not really a second derivative; it is a local fit to a parabolic function. This explains why one might want to use a larger kernel: that larger kernel is computing the fit over a larger number of pixels.
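One way to read the parabola statement: on an intensity profile that really is a parabola in x, the second-order 3-by-3 kernel recovers the curvature exactly, up to the constant gain of its smoothing weights. The sketch below assumes the commonly quoted second-order kernel (smoothing [1, 2, 1] times second difference [1, -2, 1]); the profile and its coefficients are hypothetical.

```python
# Second-order 3x3 Sobel kernel in x: smoothing [1, 2, 1] down the rows,
# second difference [1, -2, 1] along the columns.
SOBEL_XX = [[1, -2, 1],
            [2, -4, 2],
            [1, -2, 1]]

def parabola(x, a=3.0, b=-5.0, c=7.0):
    """Hypothetical intensity profile that varies only along x."""
    return a * x * x + b * x + c

def response_at(x0):
    """Apply SOBEL_XX to the 3x3 window centered on column x0."""
    total = 0.0
    for row in SOBEL_XX:
        for dx, weight in zip((-1, 0, 1), row):
            total += weight * parabola(x0 + dx)
    return total

# The smoothing weights sum to 4, so the response is 4 * f''(x) = 4 * 2a = 24
# at every position, independent of b and c.
print([response_at(x) for x in (0, 1, 10)])  # [24.0, 24.0, 24.0]
```

On real images the profile is only approximately parabolic over the window, and a wider window spreads that approximation over more pixels.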

Scharr Filter

In fact, there are many ways to approximate a derivative in the case of a discrete grid. The downside of the approximation used for the Sobel operator is that it is less accurate for small kernels. For large kernels, where more points are used in the approximation, this problem is less significant. This inaccuracy does not show up directly for the X and Y filters used in cvSobel(), because they are exactly aligned with the x- and y-axes. The difficulty arises when you want to make image measurements that are approximations of directional derivatives (e.g., estimating the direction of the image gradient from the arctangent of the y/x filter responses).

To put this in context, a concrete example of where you may want image measurements of this kind would be in the process of collecting shape information from an object by assembling a histogram of gradient angles around the object. Such a histogram is the basis on which many common shape classifiers are trained and operated. In this case, inaccurate measures of gradient angle will decrease the recognition performance of the classifier.

For a 3-by-3 Sobel filter, the inaccuracies are more apparent the further the gradient angle is from horizontal or vertical. OpenCV addresses this inaccuracy for small (but fast) 3-by-3 Sobel derivative filters by a somewhat obscure use of the special aperture_size value CV_SCHARR in the cvSobel() function. The Scharr filter is just as fast but more accurate than the Sobel filter, so it should always be used if you want to make image measurements using a 3-by-3 filter. The filter coefficients for the Scharr filter are shown in Figure 6-5 [Scharr00].

Figure 6-5. The 3-by-3 Scharr filter using flag CV_SCHARR
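The difference in angular accuracy can be illustrated numerically. The sketch below assumes the commonly quoted 3-by-3 coefficients (Sobel: 1, 2, 1 smoothing; Scharr: 3, 10, 3 smoothing) and a hypothetical test pattern: a sinusoid of 1 radian per pixel whose true gradient direction is 22.5 degrees. It estimates the gradient angle at the origin with atan2 of the y and x responses. It is plain Python, not a call to cvSobel().

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SCHARR_X = [[-3, 0, 3], [-10, 0, 10], [-3, 0, 3]]

def transpose(kernel):
    """The y-derivative kernel is the transpose of the x-derivative one."""
    return [list(row) for row in zip(*kernel)]

def estimated_angle(kernel_x, true_angle, frequency=1.0):
    """Gradient direction (radians) measured at the origin of a sinusoidal
    pattern whose true gradient points along `true_angle`."""
    c, s = math.cos(true_angle), math.sin(true_angle)

    def intensity(x, y):
        return math.sin(frequency * (x * c + y * s))

    kernel_y = transpose(kernel_x)
    gx = gy = 0.0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            value = intensity(dx, dy)
            gx += kernel_x[dy + 1][dx + 1] * value
            gy += kernel_y[dy + 1][dx + 1] * value
    return math.atan2(gy, gx)

true_angle = math.radians(22.5)
sobel_error = abs(math.degrees(estimated_angle(SOBEL_X, true_angle)) - 22.5)
scharr_error = abs(math.degrees(estimated_angle(SCHARR_X, true_angle)) - 22.5)
print(sobel_error, scharr_error)  # angular error in degrees for each filter
```

In this setup the Scharr estimate lands closer to the true angle than the Sobel estimate; at 0 or 90 degrees both are exact, which matches the remark that the error grows away from the axes.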


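The OpenCV Laplacian function (first used in vision by Marr [Marr82]) implements a discrete analog of the Laplacian operator:

$$\mathrm{Laplace}(f) \equiv \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}$$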

Because the Laplacian operator can be defined in terms of second derivatives, you might well suppose that the discrete implementation works something like the second-order Sobel derivative. Indeed it does, and in fact the OpenCV implementation of the Laplacian operator uses the Sobel operators directly in its computation.
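The relationship can be sketched by adding the two second-order 3-by-3 Sobel kernels. This illustrates the idea only and is not taken from the OpenCV source; the factors [1, 2, 1] and [1, -2, 1] are the commonly quoted smoothing and second-difference coefficients, and the helper outer is hypothetical.

```python
def outer(column, row):
    """Outer product of a column factor and a row factor."""
    return [[a * b for b in row] for a in column]

SMOOTH = [1, 2, 1]        # smoothing factor of the 3x3 Sobel kernels
SECOND_DIFF = [1, -2, 1]  # second-difference factor

sobel_xx = outer(SMOOTH, SECOND_DIFF)  # second derivative along x
sobel_yy = outer(SECOND_DIFF, SMOOTH)  # second derivative along y

laplacian = [[sobel_xx[r][c] + sobel_yy[r][c] for c in range(3)]
             for r in range(3)]
for row in laplacian:
    print(row)
# [2, 0, 2]
# [0, -8, 0]
# [2, 0, 2]
```

The coefficients sum to zero, so a region of constant intensity produces no response.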

The cvLaplace() function takes the usual source and destination images as arguments as well as an aperture size. The source can be either an 8-bit (unsigned) image or a 32-bit (floating-point) image. The destination must be a 16-bit (signed) image or a 32-bit (floating-point) image. This aperture is precisely the same as the aperture appearing in the Sobel derivatives and, in effect, gives the size of the region over which the pixels are sampled in the computation of the second derivatives.

The Laplace operator can be used in a variety of contexts. A common application is to detect “blobs.” Recall that the form of the Laplacian operator is a sum of second derivatives along the x-axis and y-axis. This means that a single point or any small blob (smaller than the aperture) that is surrounded by higher values will tend to maximize this function. Conversely, a point or small blob that is surrounded by lower values will tend to maximize the negative of this function.
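A small numeric example makes the blob behavior concrete. The sketch below uses the standard 5-point Laplacian stencil as a stand-in for the smallest aperture; the image values and the helper laplace_at are hypothetical, and this is plain Python rather than a call to cvLaplace().

```python
# Standard 5-point discrete Laplacian stencil.
LAPLACE = [[0, 1, 0],
           [1, -4, 1],
           [0, 1, 0]]

def laplace_at(image, row, col):
    """Stencil response at one interior pixel."""
    return sum(LAPLACE[dr + 1][dc + 1] * image[row + dr][col + dc]
               for dr in (-1, 0, 1) for dc in (-1, 0, 1))

# Hypothetical 5x5 image: one dark pixel (10) surrounded by brighter ones (100).
image = [[100] * 5 for _ in range(5)]
image[2][2] = 10

responses = {(r, c): laplace_at(image, r, c)
             for r in range(1, 4) for c in range(1, 4)}
peak = max(responses, key=responses.get)
print(peak, responses[peak])  # (2, 2) 360
```

Swapping the values (a bright pixel on a dark background) flips the sign, so the same location becomes the strongest negative response.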

With this in mind, the Laplace operator can also be used as a kind of edge detector. To see how this is done, consider the first derivative of a function, which will (of course) be large wherever the function is changing rapidly. Equally important, it will grow rapidly as we approach an edge-like discontinuity and shrink rapidly as we move past the discontinuity. Hence the derivative will be at a local maximum somewhere within this range. Therefore we can look to the 0s of the second derivative for locations of such local maxima.




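The basic theory of the Hough line transform is that any point in a binary image could be part of some set of candidate lines. If we parameterize each line by, for example, a slope a and an intercept b, then a point in the original image is transformed to a locus of points in the (a, b) plane corresponding to all of the lines passing through that point (see Figure 6-9). If we convert every nonzero pixel in the input image into such a set of points in the output image and sum all of these contributions, then lines that appear in the input image (i.e., the (x, y) plane) will appear as local maxima in the output image (i.e., the (a, b) plane). Because we are summing the contributions from each point, the (a, b) plane is commonly called the accumulator plane.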


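You might think that the slope-intercept form is not really the best way to represent all of the lines passing through a point (because the density of lines varies considerably as a function of the slope, and because of the related fact that the interval of possible slopes runs from −∞ to +∞). It is for this reason that the parameterization of the transform image used in actual numerical computation is somewhat different. The preferred parameterization represents each line as a point in polar coordinates (ρ, θ), with the implied line being the one that passes through the indicated point and is perpendicular to the radial from the origin to that point. As shown in Figure 6-10, the equation for such a line is:

$$\rho = x\cos\theta + y\sin\theta$$

Figure 6-10. A point (x₀, y₀) in the image plane (panel a) implies many lines, each parameterized by a different ρ and θ (panel b); these lines each imply points in the (ρ, θ) plane, which taken together form a curve of characteristic shape (panel c)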
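The voting scheme in the (ρ, θ) plane can be sketched in a few lines. This is a minimal illustration of the accumulator idea under simple assumptions (1-degree angle bins, 1-pixel ρ bins, rounding to the nearest bin); it is not how cvHoughLines2() is implemented, and the function name and input points are hypothetical.

```python
import math
from collections import Counter

def hough_accumulate(points, theta_steps=180, rho_step=1.0):
    """Vote in the (rho, theta) plane for every point and every angle bin."""
    accumulator = Counter()
    for x, y in points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            # Normal form of a line through (x, y): rho = x cos(theta) + y sin(theta)
            rho = x * math.cos(theta) + y * math.sin(theta)
            accumulator[(round(rho / rho_step), t)] += 1
    return accumulator

# Hypothetical input: 50 nonzero pixels on the horizontal line y = 3.
points = [(x, 3) for x in range(50)]
acc = hough_accumulate(points)
(rho_bin, theta_bin), votes = acc.most_common(1)[0]
print(rho_bin, theta_bin, votes)  # 3 90 50
```

Every point traces its own sinusoid through the accumulator, and the sinusoids of collinear points cross in one cell, here ρ = 3 and θ = 90 degrees.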






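The param1 and param2 arguments are not used by the standard Hough transform (SHT). For the progressive probabilistic Hough transform (PPHT), param1 sets the minimum length of a line segment that will be returned, and param2 sets the separation, in pixels, between collinear segments required for them not to be joined into a single segment. For the multiscale Hough transform (MSHT), the two parameters indicate the higher resolutions to which the line parameters should be computed. The multiscale HT first computes the locations of the lines to the accuracy given by the rho and theta parameters and then goes on to refine those results by factors of param1 and param2, respectively (i.e., the final resolution in rho is rho divided by param1, and the final resolution in theta is theta divided by param2).

What the function returns depends on how it was called. If line_storage is a matrix array, then the actual return value will be NULL. In this case, the matrix should be of type CV_32FC2 if the SHT or multiscale HT is being used and CV_32SC4 if the PPHT is being used. In the first two cases, the ρ- and θ-values for each line are placed in the two channels of the array. In the case of the PPHT, the four channels hold the x- and y-values of the start and end points of the returned segments. In all of these cases, the number of rows in the array is updated by cvHoughLines2() to correctly reflect the number of lines returned.

Here, lines is the return value from cvHoughLines2() and i is the index of the line of interest. In this case, line is a pointer to the data for that line: for the SHT and MSHT, line[0] and line[1] are the floating-point values ρ and θ; for the PPHT, they are CvPoint structures for the endpoints of the segment.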







