Edge Detection

Convolution

1. calculate

Meaning

Unchanged (identity)

Shift right

Smoothing

Image sharpening

Sharpening filter: accentuates differences from the local average
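
The four effects listed above can be reproduced on a 1D signal. This is a minimal sketch using NumPy's `np.convolve`; the signal `f` and the kernel names are illustrative, and the sharpening kernel is assumed to be "twice the identity minus the box average", a common form that these notes do not spell out.

```python
import numpy as np

f = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])  # a step edge (illustrative)

identity = np.array([0.0, 1.0, 0.0])   # leaves the signal unchanged
shift = np.array([0.0, 0.0, 1.0])      # under convolution, moves samples one step right
box = np.ones(3) / 3.0                 # local average (smoothing)
sharpen = 2.0 * identity - box         # assumed form: 2 * identity - local average

# mode="same" keeps the output length and pads the borders with zeros.
unchanged = np.convolve(f, identity, mode="same")
shifted = np.convolve(f, shift, mode="same")
smoothed = np.convolve(f, box, mode="same")
sharpened = np.convolve(f, sharpen, mode="same")
```

Because convolution is linear, `sharpened` equals `2 * f - smoothed`: the signal plus its difference from the local average, which overshoots on either side of the step.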

Computing the output size

padding

zero “padding”
edge value replication
mirror extension
more (beyond the scope of this
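
The padding strategies above map onto modes of NumPy's `np.pad`. The sketch below assumes a tiny 3×3 image and a one-pixel border; `output_size` is a hypothetical helper for the stride-1 case, and `"reflect"` is only one of two mirror variants (`"symmetric"` repeats the edge pixel as well).

```python
import numpy as np

image = np.arange(1, 10).reshape(3, 3)  # tiny 3x3 "image" (illustrative)
pad = 1                                  # border needed by a 3x3 kernel

zero_padded = np.pad(image, pad, mode="constant", constant_values=0)
replicated = np.pad(image, pad, mode="edge")     # edge value replication
mirrored = np.pad(image, pad, mode="reflect")    # mirror extension

def output_size(n, m, p=0):
    """Side length after sliding an m-wide kernel (stride 1) over an n-wide image padded by p."""
    return n + 2 * p - m + 1
```

With no padding an $n×n$ image filtered by an $m×m$ kernel shrinks to $(n-m+1)×(n-m+1)$; padding by $(m-1)/2$ on each side keeps the original size for odd $m$.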

Smoothing with box filter revisited

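Smoothing with a box filter blurs edges away.
To reduce these edge artifacts, weight each neighboring pixel's contribution according to how close it is to the center pixel.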

Gaussian Kernel

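Remember to normalize the kernel so that its weights sum to 1. The standard deviation $\sigma$ determines the extent of the smoothing, and in practice it also determines the kernel size.
Effect: removes "high-frequency" components from the image (low-pass filter).
Convolving a Gaussian with itself yields another Gaussian.
- So repeatedly convolving with a small-$\sigma$ kernel gives the same result as convolving once with a larger kernel.
- Convolving twice with a Gaussian of standard deviation $\sigma$ is equivalent to convolving once with a Gaussian of standard deviation $\sigma\sqrt{2}$.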
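
The self-convolution property can be checked numerically. The sketch below assumes a sampled, truncated 1D Gaussian; `gaussian_kernel_1d` and `kernel_std` are hypothetical helper names, and the standard deviation is measured by treating the kernel as a discrete probability distribution.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius):
    """Sampled 1D Gaussian on [-radius, radius], normalized to sum to 1."""
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()

def kernel_std(kernel):
    """Standard deviation of a kernel treated as a discrete distribution."""
    x = np.arange(len(kernel)) - (len(kernel) - 1) / 2.0
    mean = np.sum(x * kernel)
    return np.sqrt(np.sum((x - mean) ** 2 * kernel))

sigma = 2.0
g = gaussian_kernel_1d(sigma, radius=10)
gg = np.convolve(g, g)                    # Gaussian convolved with itself

ratio = kernel_std(gg) / kernel_std(g)    # variances add, so this is sqrt(2)
```

Variances add under convolution, so the standard deviation grows by a factor of $\sqrt{2}$ per repeated pass rather than doubling.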

Separable kernel

The 2D Gaussian kernel is separable: it factors into the product of two 1D kernels.

$G(x, y)=\frac{1}{2 \pi \sigma^{2}} \exp \left(-\frac{x^{2}+y^{2}}{2 \sigma^{2}}\right)=\left(\frac{1}{\sqrt{2 \pi} \sigma} \exp \left(-\frac{x^{2}}{2 \sigma^{2}}\right)\right)\left(\frac{1}{\sqrt{2 \pi} \sigma} \exp \left(-\frac{y^{2}}{2 \sigma^{2}}\right)\right)$

What is the complexity of filtering an $𝑛×𝑛$ image with an $𝑚×𝑚$ kernel?
$O(n^2m^2)$
What if the kernel is separable?
$O(n^2m)$
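
The following sketch compares one 2D pass against two 1D passes. It assumes a small random image and a 7-tap Gaussian with $\sigma=1$; `convolve2d_valid` is a hypothetical loop-based helper written for clarity, not speed, and the multiplication counts ignore border effects.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Direct 2D convolution without padding ("valid" output only)."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]          # convolution flips the kernel
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out

x = np.arange(-3, 4)
g1 = np.exp(-x**2 / 2.0)                  # 1D Gaussian with sigma = 1
g1 /= g1.sum()
g2 = np.outer(g1, g1)                     # 2D Gaussian as an outer product

rng = np.random.default_rng(0)
image = rng.random((32, 32))

full_2d = convolve2d_valid(image, g2)                  # one m x m pass
rows = convolve2d_valid(image, g1[np.newaxis, :])      # 1 x m pass along rows
two_1d = convolve2d_valid(rows, g1[:, np.newaxis])     # m x 1 pass along columns

n, m = image.shape[0], len(g1)
mults_2d = n * n * m * m          # about m^2 multiplications per pixel
mults_separable = 2 * n * n * m   # about 2m multiplications per pixel
```

Both routes produce the same output up to floating-point error, while the separable route needs roughly $m/2$ times fewer multiplications.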

(Cross) correlation

Properties

Commutative property: $f * * h=h * * f$
Associative property: $\left(f * * h_{1}\right) * * h_{2}=f * *\left(h_{1} * * h_{2}\right)$
Distributive property: $f * *\left(h_{1}+h_{2}\right)=\left(f * * h_{1}\right)+\left(f * * h_{2}\right)$
The order of the filters doesn't matter! $\quad h_{1} * * h_{2}=h_{2} * * h_{1}$
Shift property:
$f[n, m] * * \delta_{2}\left[n-n_{0}, m-m_{0}\right]=f\left[n-n_{0}, m-m_{0}\right]$
Shift-invariance:

$\begin{gathered} g[n, m]=f[n, m] * * h[n, m] \\ \Longrightarrow f\left[n-l_{1}, m-l_{1}\right] * * h\left[n-l_{2}, m-l_{2}\right] \\ =g\left[n-l_{1}-l_{2}, m-l_{1}-l_{2}\right] \end{gathered}$
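
These properties are easy to confirm numerically. The sketch below uses 1D signals and `np.convolve` in its default full mode for brevity; the random inputs and the delay `n0` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.random(16)
h1 = rng.random(5)
h2 = rng.random(5)

commutative = np.allclose(np.convolve(f, h1), np.convolve(h1, f))
associative = np.allclose(np.convolve(np.convolve(f, h1), h2),
                          np.convolve(f, np.convolve(h1, h2)))
distributive = np.allclose(np.convolve(f, h1 + h2),
                           np.convolve(f, h1) + np.convolve(f, h2))

# Shift property: convolving with a delta delayed by n0 delays f by n0.
n0 = 3
delta = np.zeros(n0 + 1)
delta[n0] = 1.0
shifted = np.convolve(f, delta)   # n0 zeros followed by a copy of f
```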

Convolution vs. (Cross) Correlation

A convolution is an integral that expresses the amount of overlap of one function as it is shifted over another function.
- convolution is a filtering operation
Correlation compares the similarity of two sets of data. Correlation computes a measure of similarity of two input signals as they are shifted by one another. The correlation result reaches a maximum at the time when the two signals match best.
- correlation is a measure of relatedness of two signals
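
The two operations differ only in whether the kernel is flipped. A minimal 1D sketch with `np.correlate` and `np.convolve`, using an illustrative signal that contains the template once:

```python
import numpy as np

signal = np.array([0.0, 0.0, 1.0, 2.0, 3.0, 0.0, 0.0, 0.0])
template = np.array([1.0, 2.0, 3.0])

# Correlation slides the template as-is; convolution slides it flipped.
corr = np.correlate(signal, template, mode="valid")
conv = np.convolve(signal, template, mode="valid")

# Correlation equals convolution with the reversed template.
same = np.allclose(corr, np.convolve(signal, template[::-1], mode="valid"))

best_match = int(np.argmax(corr))   # offset where the template fits best
```

The correlation peaks at offset 2, exactly where the template occurs in the signal. For a symmetric kernel such as a Gaussian the two operations coincide.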

Edge Detection

1. Edges

1.1 Def

Significant local changes of intensity (discontinuities) in an image.

1.2 Origins of edges

discontinuity in depth
surface normal / color / texture discontinuity
specularity / shadows (illumination effects)

2. Image gradient

2.1 The gradient of an image

$\nabla f=\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]$ $\theta=\tan ^{-1}\left(\frac{\partial f}{\partial y} / \frac{\partial f}{\partial x}\right)$ $\|\nabla f\|=\sqrt{\left(\frac{\partial f}{\partial x}\right)^{2}+\left(\frac{\partial f}{\partial y}\right)^{2}}$

The gradient is perpendicular to the edge.
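
A minimal sketch of these quantities using `np.gradient`, assuming an illustrative image whose intensity increases from left to right. `np.arctan2` is used in place of a plain $\tan^{-1}$ so that the quadrant is preserved and a zero $\partial f / \partial x$ causes no division error.

```python
import numpy as np

# Illustrative image: intensity increases left to right (a horizontal ramp).
image = np.tile(np.arange(8, dtype=float), (6, 1))

# np.gradient returns derivatives along axis 0 (rows, y), then axis 1 (columns, x).
df_dy, df_dx = np.gradient(image)

magnitude = np.sqrt(df_dx**2 + df_dy**2)
theta = np.arctan2(df_dy, df_dx)   # quadrant-aware version of atan(fy / fx)
```

Here the gradient points along $x$ (`theta` is 0), perpendicular to the vertical lines of constant intensity.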

3. Effects of noise

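If the signal is noisy, the edge response can be drowned out by the noise, so the edge cannot be located by differentiation alone.
In practice, therefore, we usually smooth the signal first and then take the derivative.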
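
A 1D sketch of smoothing before differentiating. The noise level, the Gaussian width `sigma`, and the border `margin` are assumed values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, step_at = 200, 100
signal = np.zeros(n)
signal[step_at:] = 1.0                      # ideal step edge
noisy = signal + rng.normal(0.0, 0.2, n)    # additive Gaussian noise (assumed level)

sigma = 5.0
x = np.arange(-15, 16)
g = np.exp(-x**2 / (2.0 * sigma**2))
g /= g.sum()                                # normalized Gaussian kernel

raw_derivative = np.diff(noisy)             # differentiation amplifies the noise
smoothed = np.convolve(noisy, g, mode="same")
smooth_derivative = np.gradient(smoothed)

# Ignore the borders, where zero padding creates artificial edges.
margin = 20
edge = margin + int(np.argmax(smooth_derivative[margin:-margin]))
```

After smoothing, the derivative has a single clear peak near the true step position, whereas `raw_derivative` fluctuates strongly everywhere.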

4. Sobel Operator

4.1 Algorithm overview

Uses two 3×3 kernels which are convolved with the original image to calculate approximations of the derivatives
One for horizontal changes, and one for vertical

$\mathbf{G}_{x}=\left[\begin{array}{ccc} +1 & 0 & -1 \\ +2 & 0 & -2 \\ +1 & 0 & -1 \end{array}\right] \quad \mathbf{G}_{y}=\left[\begin{array}{ccc} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{array}\right]$

Smoothing + differentiation

$\mathbf{G}_{x}=\left[\begin{array}{ccc} +1 & 0 & -1 \\ +2 & 0 & -2 \\ +1 & 0 & -1 \end{array}\right]=\underbrace{\left[\begin{array}{l} 1 \\ 2 \\ 1 \end{array}\right]}_{\text{Gaussian smoothing}} \underbrace{\left[\begin{array}{lll} +1 & 0 & -1 \end{array}\right]}_{\text{differentiation}}$

$[1\ 2\ 1]^{T}$ can be regarded as a Gaussian kernel because its values follow a Gaussian-like (binomial) profile, which can be verified mathematically.
Magnitude:

$\mathbf{G}=\sqrt{\mathbf{G}_{x}^{2}+\mathbf{G}_{y}^{2}}$

Angle or direction of the gradient:

$\Theta=\operatorname{atan}\left(\frac{\mathbf{G}_{y}}{\mathbf{G}_{x}}\right)$
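
The sketch below builds both Sobel kernels from their separable factors and applies them to an illustrative image with one vertical edge. `convolve2d_valid` is a hypothetical loop-based helper that performs true convolution (with the kernel flipped) and no padding; `np.arctan2` stands in for `atan` to avoid dividing by zero.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Direct 2D convolution without padding ("valid" output only)."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]          # convolution flips the kernel
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out

smooth = np.array([1.0, 2.0, 1.0])
diff = np.array([1.0, 0.0, -1.0])
Gx = np.outer(smooth, diff)   # column [1 2 1]^T times row [+1 0 -1]
Gy = np.outer(diff, smooth)   # column [+1 0 -1]^T times row [1 2 1]

# Illustrative image: dark left half, bright right half (a vertical edge).
image = np.zeros((5, 6))
image[:, 3:] = 1.0

gx = convolve2d_valid(image, Gx)
gy = convolve2d_valid(image, Gy)
magnitude = np.sqrt(gx**2 + gy**2)
direction = np.arctan2(gy, gx)
```

Each output row of `gx` is `[0, 4, 4, 0]`: the single step triggers a response in two adjacent columns, while `gy` is zero everywhere.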

4.2 Sobel Filter Problems

Poor localization (triggers a response in multiple adjacent pixels): edges are not located precisely and can come out thick
Thresholding value favors certain directions over others
- Can miss oblique edges more than horizontal or vertical edges
- False negatives: true edge pixels end up classified as non-edges

4.3 Other approximations of derivative filters

4.3.1 Prewitt:

$G_{x}=\left[\begin{array}{ccc} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{array}\right] \quad G_{y}=\left[\begin{array}{ccc} 1 & 1 & 1 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{array}\right]$

Besides the pixels directly to the left and right of the center pixel, it also takes the diagonal neighbors into account.

4.3.2 Roberts:

$G_{x}=\left[\begin{array}{cc} 0 & 1 \\ -1 & 0 \end{array}\right] \quad G_{y}=\left[\begin{array}{cc} 1 & 0 \\ 0 & -1 \end{array}\right]$

$G_x$ detects edges at 135°, and $G_y$ detects edges at 45°.
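
The diagonal selectivity of the two Roberts kernels can be seen on two synthetic images. This sketch uses correlation (no kernel flip), which for these kernels only changes the sign of the response; `correlate2d_valid` is a hypothetical helper, and which diagonal is labeled 45° or 135° depends on the axis convention in use.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide the kernel over the image without flipping it ("valid" output)."""
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

Gx = np.array([[0.0, 1.0], [-1.0, 0.0]])
Gy = np.array([[1.0, 0.0], [0.0, -1.0]])

# Edge along the main diagonal (top-left to bottom-right) and its mirror image.
main_diag = np.triu(np.ones((6, 6)))
anti_diag = np.fliplr(main_diag)

resp_main = (correlate2d_valid(main_diag, Gx), correlate2d_valid(main_diag, Gy))
resp_anti = (correlate2d_valid(anti_diag, Gx), correlate2d_valid(anti_diag, Gy))
```

On the main-diagonal edge only $G_x$ responds, and on the mirrored edge only $G_y$ responds.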

5. Canny edge detector

This is probably the most widely used edge detector in computer vision

5.1 Derivative of Gaussian filter

$\frac{d}{d x}(f * g)=f * \frac{d}{d x} g$

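All elements of a Gaussian smoothing kernel are positive, whereas a derivative-of-Gaussian kernel also contains non-positive elements.
The elements of a Gaussian smoothing kernel sum to 1; the elements of a derivative-of-Gaussian kernel sum to 0 (it is an odd function).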
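
The identity and the two sum properties can be checked in 1D. In this sketch the derivative is approximated by the finite-difference kernel `[1, -1]`, which is an assumption of the example; `sigma` and the random signal `f` are illustrative.

```python
import numpy as np

sigma = 1.5
x = np.arange(-6, 7)
g = np.exp(-x**2 / (2.0 * sigma**2))
g /= g.sum()                      # smoothing kernel: all positive, sums to 1

d = np.array([1.0, -1.0])         # finite-difference stand-in for d/dx
dg = np.convolve(g, d)            # derivative-of-Gaussian kernel

rng = np.random.default_rng(0)
f = rng.random(50)

smooth_then_diff = np.convolve(np.convolve(f, g), d)   # d/dx (f * g)
one_pass = np.convolve(f, dg)                          # f * (d/dx g)
```

Both routes give the same result, so smoothing and differentiation can be folded into a single filter pass.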

5.2 Problems of Gaussian filter

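5.2.3 Too much detail

The derivative-of-Gaussian kernel can pick up many details we do not need. So we pass its output through a thresholding step (thresholding kernel) to filter out the unwanted edges.

5.2.4 Edges too thick

This happens because a true edge lies at the maximum of the gradient magnitude, whereas thresholding keeps every pixel above the threshold.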

5.2.5 Non-maximum suppression

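To fix the thick edges, we can use non-maximum suppression:

Along the gradient direction at each pixel, keep the pixel only if its gradient magnitude is the largest among its neighbors in that direction. If the gradient direction does not point exactly at a neighboring pixel, interpolate between pixels to obtain the neighbor value. The resulting edge is one pixel wide.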
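
A simplified sketch of non-maximum suppression. Instead of interpolating, it snaps the gradient direction to the nearest of four directions (0°, 45°, 90°, 135°), a common shortcut and not the interpolating variant described above. It assumes `angle` is in radians with the row index growing downward, and it leaves border pixels at zero.

```python
import numpy as np

def non_max_suppression(mag, angle):
    """Keep a pixel only if it is a local maximum along its (quantized) gradient direction."""
    h, w = mag.shape
    out = np.zeros_like(mag)
    deg = np.rad2deg(angle) % 180.0            # direction modulo 180 degrees
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            d = deg[i, j]
            if d < 22.5 or d >= 157.5:         # about 0 degrees: left / right
                n1, n2 = mag[i, j - 1], mag[i, j + 1]
            elif d < 67.5:                     # about 45 degrees: down-right / up-left
                n1, n2 = mag[i - 1, j - 1], mag[i + 1, j + 1]
            elif d < 112.5:                    # about 90 degrees: up / down
                n1, n2 = mag[i - 1, j], mag[i + 1, j]
            else:                              # about 135 degrees: down-left / up-right
                n1, n2 = mag[i - 1, j + 1], mag[i + 1, j - 1]
            if mag[i, j] >= n1 and mag[i, j] >= n2:
                out[i, j] = mag[i, j]
    return out

# Illustrative ridge three pixels wide with a horizontal gradient.
mag = np.zeros((5, 7))
mag[:, 2], mag[:, 3], mag[:, 4] = 1.0, 2.0, 1.0
thin = non_max_suppression(mag, np.zeros_like(mag))
```

Only the central column of the ridge survives, so the edge is thinned to a single pixel. Exact ties on a plateau are all kept in this sketch.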

5.2.6 Vanishing edges (false-negative problem)

5.2.7 Hysteresis thresholding

Avoid streaking (broken edges) near the threshold value
Define two thresholds: Low and High
- If less than Low, not an edge
- If greater than High, strong edge
- If between Low and High, weak edge

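First, detect the strong-edge pixels and the weak-edge pixels.
Then grow each strong edge by repeatedly comparing its pixels with their neighbors and extending along the closest ones; weak edges are extended in the same way. A weak edge is kept if it ends up connected to a strong edge, and discarded otherwise.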
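
One way to sketch the linking step is a breadth-first search that starts from the strong pixels and spreads into connected weak pixels. The function name `hysteresis`, the use of 8-connectivity, and treating values equal to a threshold as passing it are assumptions of this example.

```python
from collections import deque

import numpy as np

def hysteresis(mag, low, high):
    """Keep strong pixels plus any weak pixels connected to them (8-connectivity)."""
    strong = mag >= high
    weak = (mag >= low) & ~strong
    edges = strong.copy()
    queue = deque(zip(*np.nonzero(strong)))   # start from every strong pixel
    h, w = mag.shape
    while queue:
        i, j = queue.popleft()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                inside = 0 <= ni < h and 0 <= nj < w
                if inside and weak[ni, nj] and not edges[ni, nj]:
                    edges[ni, nj] = True      # weak pixel linked to a strong edge
                    queue.append((ni, nj))
    return edges

# Illustrative row: one strong pixel, two linked weak pixels, a gap, an isolated weak pixel.
mag = np.array([[0.9, 0.5, 0.5, 0.1, 0.5]])
edges = hysteresis(mag, low=0.3, high=0.8)
```

The two weak pixels touching the strong one are kept, while the isolated weak pixel beyond the gap is dropped.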

5.3 Summary

Filter image with $x$, $y$ derivatives of Gaussian
Find magnitude and orientation of gradient
Non-maximum suppression:
- Thin multi-pixel-wide ridges down to single-pixel width
Thresholding and linking (hysteresis): set a low and a high threshold to discard unneeded detail while keeping edges near and above the thresholds
- Define two thresholds: low and high
- Use the high threshold to start edge curves and the low threshold to continue them

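5.4 Effect of $\sigma$ (Gaussian kernel spread/size)

The smaller $\sigma$ is, the smaller the window (by the $3\sigma$ rule) and the more fine detail is detected; this suits tasks such as face detection.
The larger $\sigma$ is, the larger the window (by the $3\sigma$ rule) and the less detail is detected; the result emphasizes overall contours, which suits tasks such as pedestrian detection.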
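
The $3\sigma$ rule can be turned into a window size as follows. The helper `kernel_size_from_sigma` is hypothetical, and rounding the radius up so that the width is always odd is one reasonable convention among several.

```python
import math

def kernel_size_from_sigma(sigma):
    """Odd window width covering about +/- 3 sigma around the center."""
    radius = math.ceil(3 * sigma)
    return 2 * radius + 1

sizes = {sigma: kernel_size_from_sigma(sigma) for sigma in (0.5, 1.0, 2.0, 4.0)}
```

Under this convention $\sigma=1$ gives a 7-pixel window and $\sigma=2$ a 13-pixel window.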

5.5 Concluding remarks

Advantages:
- Conceptually simple.
- Easy to implement.
- Handles missing and occluded data very gracefully.
- Can be adapted to many types of forms, not just lines.
Disadvantages:
- Computationally complex for objects with many parameters.
- Looks for only one single type of object.
- Can be “fooled” by “apparent lines”.
- The length and the position of a line segment cannot be determined.
- Collinear line segments cannot be separated.