Recovery of 3D structure
1. Measure three-dimensional information
- Camera model
- Camera calibration
- Epipolar geometry
2. Things aren’t always as they appear…
- Single-view ambiguity
- Depth information is lost
- When certain assumptions hold, we can recover structure from a single view
- In general, we need multi-view geometry
3. Review: Pinhole camera model
- f = focal length
- o = aperture = pinhole = center of the camera
- Question: Is this a linear transformation?
- It is not: the projection $(X, Y, Z) \mapsto (fX/Z,\ fY/Z)$ divides by $Z$, so it is not linear in $(X, Y, Z)$.
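3.1 Homogeneous coordinates: converting between Euclidean and homogeneous coordinates
3.2 Projective transformation in Homogeneous coordinates
- Projective transformation
- Convert to homogeneous coordinates first; the projection can then be written as a linear (matrix) expression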
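The conversion and the resulting linear projection can be sketched as follows. This is a minimal illustration, not code from the lecture: the helper names `to_homogeneous` and `from_homogeneous` and the focal length value are assumptions chosen for the example.

```python
import numpy as np

def to_homogeneous(points):
    """Append a 1 to each row: (n, d) Euclidean -> (n, d+1) homogeneous."""
    points = np.asarray(points, dtype=float)
    return np.hstack([points, np.ones((points.shape[0], 1))])

def from_homogeneous(points_h):
    """Divide by the last coordinate: (n, d+1) homogeneous -> (n, d) Euclidean."""
    points_h = np.asarray(points_h, dtype=float)
    return points_h[:, :-1] / points_h[:, -1:]

f = 2.0  # assumed focal length, arbitrary units

# 3x4 pinhole projection: (X, Y, Z, 1) -> (fX, fY, Z)
P = np.array([[f, 0.0, 0.0, 0.0],
              [0.0, f, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

X = np.array([[1.0, 2.0, 4.0]])           # one 3D point
x = from_homogeneous(to_homogeneous(X) @ P.T)
print(x)                                  # (fX/Z, fY/Z)
```

The matrix product is linear; the nonlinearity (division by $Z$) is pushed into the final conversion back to Euclidean coordinates.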
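3.3 Camera calibration
- The camera's position is not fixed, so we set up a world coordinate system and express every transformation in it
- Normalized (camera) coordinate system: the camera center is at the origin and the principal axis is the z-axis
- Camera calibration: figuring out transformation from world coordinate system to image coordinate system
- Two coordinate systems already appear here: the camera (normalized) coordinate system, whose origin is at the principal point of the image, and the transformed (image) coordinate system, whose origin is at the lower-left corner of the image (or the upper-right corner)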
3.3.1 From retina plane to images
- Retina plane: the image plane
- Principal point (p): point where the principal axis intersects the image plane
- Normalized coordinate system: origin of the image is at the principal point
- Image coordinate system: origin is in the corner (here, the lower-left corner)
3.3.2 Principal point offset
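- When projecting, we first project into the normalized coordinate system and then shift the origin to the image corner: $(X, Y, Z) \mapsto (fX/Z + p_x,\ fY/Z + p_y)$
- In the homogeneous form, the shift is applied before the perspective division: $(X, Y, Z, 1) \mapsto (fX + Zp_x,\ fY + Zp_y,\ Z)$
- $\mathrm{P}=\mathrm{K}[\mathrm{I} \mid 0]$, where $\mathrm{K}$ is the calibration matrix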
3.3.3 Pixel coordinates
- Pixel size: $\frac{1}{m_{x}} \times \frac{1}{m_{y}}$
- $m_{x}$ pixels per meter in horizontal direction
- $m_{y}$ pixels per meter in vertical direction
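- Looking at the expression, this is just one more scaling, so we only need to left-multiply by a scale transformation matrix $\operatorname{diag}(m_x, m_y, 1)$
- The resulting intrinsic matrix has five degrees of freedom once skew is included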
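As a rough sketch of the left-multiplication by the scale matrix, the snippet below builds the pixel-unit intrinsic matrix from $f$, $m_x$, $m_y$, $p_x$, $p_y$. The function name `intrinsic_matrix` and all numeric values are assumptions for illustration; skew is left at zero.

```python
import numpy as np

def intrinsic_matrix(f, m_x, m_y, p_x, p_y):
    """K = S @ K_metric, where S converts metres on the sensor to pixels."""
    K_metric = np.array([[f, 0.0, p_x],
                         [0.0, f, p_y],
                         [0.0, 0.0, 1.0]])
    S = np.diag([m_x, m_y, 1.0])   # pixels per metre in x and y
    return S @ K_metric

# Assumed example values: 50 mm lens, 20000 pixels per metre,
# principal point 16 mm x 12 mm from the image corner.
K = intrinsic_matrix(f=0.05, m_x=20000.0, m_y=20000.0, p_x=0.016, p_y=0.012)

X_cam = np.array([0.1, 0.2, 2.0])  # a point in camera coordinates (metres)
x_h = K @ X_cam                    # homogeneous pixel coordinates
pixel = x_h[:2] / x_h[2]
print(K)
print(pixel)
```

With these numbers the entries of `K` are $f m_x = f m_y = 1000$ and $(m_x p_x, m_y p_y) = (320, 240)$.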
3.3.4 Camera rotation and translation
3.3.4.1 3D Translation
3.3.4.2 3D Scaling
3.3.4.3 3D rotation transformation
3D rotation is done around a rotation axis
- Fundamental rotations – rotate about x, y, or z axes
- Counter-clockwise rotation is referred to as positive rotation (when you look down negative axis)
- Rotation about z: similar to 2D rotation
- Rotation about y: take the z-rotation equations and substitute z -> y, y -> x, x -> z
- Rotation about x: take the z-rotation equations and substitute z -> x, y -> z, x -> y
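The three fundamental rotations can be written out as below. This is a small sketch using the usual right-handed, counter-clockwise-positive convention; the function names `rot_x`, `rot_y`, `rot_z` are chosen here for illustration.

```python
import numpy as np

def rot_z(theta):
    """Rotation about z: the 2D rotation embedded in the x-y plane."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

def rot_y(theta):
    """Rotation about y: the z-rotation with z -> y, y -> x, x -> z."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def rot_x(theta):
    """Rotation about x: the z-rotation with z -> x, y -> z, x -> y."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, -s],
                     [0.0, s, c]])

# A quarter turn about z sends the x-axis to the y-axis.
print(rot_z(np.pi / 2) @ np.array([1.0, 0.0, 0.0]))
```

Each matrix is orthogonal with determinant 1, so its inverse is its transpose.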
3.3.5 Composing Transformation
3.3.6 Camera rotation and translation
- You can think of object transformations as moving (transforming) its local coordinate frame
- From the world coordinate system to the camera coordinate system
- All the transformations are performed relative to the current coordinate frame origin and axes
- In general, the camera coordinate frame will be related to the world coordinate frame by a rotation and a translation
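- Conversion from world to camera coordinate system (in non-homogeneous coordinates): $\tilde{\mathbf{X}}_{cam} = \mathbf{R}(\tilde{\mathbf{X}} - \tilde{\mathbf{C}})$
- One way to read this transformation: the offset is itself affected by the rotation, $\mathbf{t} = -\mathbf{R}\tilde{\mathbf{C}}$
- 2D transformation matrix (3 x 3): shifts from the camera coordinate system to the image coordinate system and applies the projection and scaling
- The change-of-view matrix links the 2D transformation to the 3D transformation; in effect it changes the matrix dimensions
- The full expression can be read as follows: it first takes world coordinates into the camera coordinate system (rotation plus translation), then applies the projection and scaling
- $K$ holds the intrinsic parameters of the camera; $[R\mid t]$ holds the extrinsic parameters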
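The full chain from world coordinates to pixels can be sketched as below, assuming the common form $P = K[R \mid t]$ with $t = -R\tilde{C}$. The helper names `projection_matrix` and `project`, and all numeric values, are illustrative assumptions.

```python
import numpy as np

def projection_matrix(K, R, C):
    """P = K [R | t] with t = -R C, where C is the camera center in world coordinates."""
    t = -R @ C
    return K @ np.hstack([R, t.reshape(3, 1)])

def project(P, X):
    """Project (n, 3) world points to (n, 2) pixel coordinates."""
    X_h = np.hstack([X, np.ones((X.shape[0], 1))])
    x_h = X_h @ P.T
    return x_h[:, :2] / x_h[:, 2:3]

K = np.array([[1000.0, 0.0, 320.0],
              [0.0, 1000.0, 240.0],
              [0.0, 0.0, 1.0]])        # assumed intrinsics
R = np.eye(3)                          # camera axes aligned with world axes
C = np.array([0.0, 0.0, -5.0])         # camera 5 units behind the world origin

P = projection_matrix(K, R, C)
x = project(P, np.array([[0.0, 0.0, 0.0],
                         [1.0, 0.5, 5.0]]))
print(x)
```

The world origin lies on the principal axis here, so it projects to the principal point $(320, 240)$.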
3.4 Camera parameters
- Intrinsic parameters
- Principal point coordinates
- $p_x,p_y$
- Focal length
- $f$
- Pixel magnification factors
- $m_x,m_y$
- Skew (non-rectangular pixels)
- Radial distortion
- The farther a point is from the center of the image, the more strongly straight lines bend
- Extrinsic parameters
- Rotation and translation relative to world coordinate system
- What is the projection of the camera center?
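- $\mathbf{C}$ is the camera center expressed in the world coordinate system
- The camera center is the null space of the projection matrix!
- Projection matrix: $\mathbf{P}=\mathbf{K}[\mathbf{R}\mid-\mathbf{R} \tilde{\mathbf{C}}]$
- Camera center: $\mathbf{C}=\left[\begin{array}{c}\widetilde{\mathbf{C}} \\ 1\end{array}\right]$, so $\mathbf{P}\mathbf{C}=\mathbf{K}(\mathbf{R}\tilde{\mathbf{C}}-\mathbf{R}\tilde{\mathbf{C}})=0$
- In mathematics, the null space of an operator $A$ is the set of all solutions $v$ of $Av = 0$. It is also called the kernel of $A$. If $A$ is a linear operator on a vector space, its null space is a linear subspace, and hence itself a vector space.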
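The null-space property gives a direct way to read the camera center off a projection matrix. The sketch below uses the SVD for this; the function name `camera_center` and the test camera are assumptions for illustration, and the camera is assumed finite (last homogeneous coordinate nonzero).

```python
import numpy as np

def camera_center(P):
    """Return the Euclidean camera center as the right null vector of the 3x4 matrix P."""
    _, _, Vt = np.linalg.svd(P)
    C_h = Vt[-1]                 # singular vector for the zero singular value
    return C_h[:3] / C_h[3]      # dehomogenize (assumes a finite camera)

# Build an example camera with a known center.
theta = 0.3
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
K = np.array([[1000.0, 0.0, 320.0],
              [0.0, 1000.0, 240.0],
              [0.0, 0.0, 1.0]])
C_true = np.array([1.0, 2.0, -3.0])
P = K @ np.hstack([R, (-R @ C_true).reshape(3, 1)])

C_est = camera_center(P)
print(C_est)
print(P @ np.append(C_est, 1.0))   # numerically zero
```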
4. Camera calibration
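- The camera parameters are unknown
- Given $n$ points with known 3D coordinates $X_{i}$ and known image projections $\boldsymbol{x}_{i}$, estimate the camera parameters
- In a calibration experiment, both the 3D coordinates and the image projections can be measured
- The projection equation $\lambda \boldsymbol{x}_{i}=\mathbf{P} X_{i}$ first applies the transformation and then converts from homogeneous to Euclidean coordinates
- Eliminating $\lambda$ leaves two linearly independent equations in the entries of $\mathbf{P}$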
- P has 11 degrees of freedom
- One 2D/3D correspondence gives us two linearly independent equations
- 6 correspondences needed for a minimal solution
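4.1 Linear method
- Homogeneous least squares: find $\mathbf{p}$ minimizing $\|\mathbf{A} \mathbf{p}\|^{2}$ subject to $\|\mathbf{p}\|=1$
- Solution given by eigenvector of $\mathbf{A}^{\mathrm{T}} \mathbf{A}$ with smallest eigenvalue
- In practice, use the singular value decomposition of $\mathbf{A}$: $\mathbf{p}$ is the right singular vector with the smallest singular value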
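The homogeneous least-squares step can be sketched as below. Each correspondence contributes the two rows obtained from $\boldsymbol{x}_i \times \mathbf{P}X_i = 0$; the function names `calibrate_dlt` and `project`, the synthetic camera, and the random points are assumptions made for this example, and no coordinate normalization or noise handling is included.

```python
import numpy as np

def calibrate_dlt(X, x):
    """Estimate the 3x4 matrix P (up to scale) from n >= 6 points.

    X: (n, 3) known 3D points, x: (n, 2) their image projections.
    """
    n = X.shape[0]
    X_h = np.hstack([X, np.ones((n, 1))])
    A = np.zeros((2 * n, 12))
    for i in range(n):
        u, v = x[i]
        A[2 * i, 4:8] = -X_h[i]          # -P2.X + v * P3.X = 0
        A[2 * i, 8:12] = v * X_h[i]
        A[2 * i + 1, 0:4] = X_h[i]       #  P1.X - u * P3.X = 0
        A[2 * i + 1, 8:12] = -u * X_h[i]
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)          # unit-norm minimizer of |A p|^2

def project(P, X):
    X_h = np.hstack([X, np.ones((X.shape[0], 1))])
    x_h = X_h @ P.T
    return x_h[:, :2] / x_h[:, 2:3]

# Synthetic, noise-free test data (assumed values).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(8, 3))
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
P_true = K @ np.hstack([np.eye(3), np.array([[0.1], [-0.2], [5.0]])])
x = project(P_true, X)

P_est = calibrate_dlt(X, x)
x_reproj = project(P_est, X)
print(np.abs(x_reproj - x).max())        # reprojection error in pixels
```

`P_est` matches `P_true` only up to an overall scale (and sign), which is why the check is done on reprojected points rather than on the matrix entries.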
5. Epipolar geometry
5.1 Recovering structure from a single view
- From a single image, reconstruction is hard even with prior knowledge, because the image is ambiguous: depth information is missing
- From calibration rig: location/pose of the rig, K
- Knowledge about scene: point correspondences, geometry of lines & planes, etc…
- Such knowledge includes point correspondences, parallelism of lines, planes, and so on
- Intrinsic ambiguity of the mapping from 3D to image (2D)
- The ambiguity is intrinsic to the mapping: depth information is lost at projection time
- Two eyes help!
5.2 A taste of multi-view geometry: Triangulation
- Given projections of a 3D point in two or more images (with known camera matrices), find the coordinates of the point
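- That is, given the projections of a 3D point, recover its 3D coordinates
- We want to intersect the two visual rays corresponding to $x_1$ and $x_2$, but because of noise and numerical errors, they don’t meet exactly
- In theory two images are enough to locate $X$; in practice noise means the rays rarely intersect, so an exact intersection point cannot be found
- $\text { Find } \mathrm{X} \text { that minimizes } d^{2}\left(\mathbf{x}_{1}, \mathbf{P}_{1} \mathbf{X}\right)+d^{2}\left(\mathbf{x}_{2}, \mathbf{P}_{2} \mathbf{X}\right)$
- That is, minimize the distance between each observed point and the reprojection of $\mathrm{X}$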
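A simple way to get a usable estimate of $\mathrm{X}$ is the linear (algebraic) triangulation sketched below. Note that this is a swapped-in technique: it minimizes an algebraic error via SVD rather than the reprojection distance $d^2$ written above, and is often used as a starting point for that nonlinear minimization. The function name `triangulate` and the two example cameras are assumptions.

```python
import numpy as np

def triangulate(x1, P1, x2, P2):
    """Linear triangulation of one point from two views.

    x1, x2: pixel coordinates (u, v); P1, P2: 3x4 camera matrices.
    Each view contributes two rows derived from x cross (P X) = 0.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X_h = Vt[-1]
    return X_h[:3] / X_h[3]

def project_point(P, X):
    x_h = P @ np.append(X, 1.0)
    return x_h[:2] / x_h[2]

# Two assumed cameras: the second is shifted one unit along x.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.3, -0.2, 4.0])
x1 = project_point(P1, X_true)
x2 = project_point(P2, X_true)

X_est = triangulate(x1, P1, x2, P2)
print(X_est)
```

With noise-free projections the estimate coincides with the true point; with noisy input it gives only an approximate solution.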
5.3 Problem categories
- Solving for the camera parameters
- Motivation: Given a set of known 3D points seen by a camera, compute the camera parameters
- Calibration!
- Locating a point in real space
- Structure: Given known cameras and projections of the same 3D point in two or more images, compute the 3D coordinates of that point
- Triangulation!
- Matching a point in one image to a point in another
- Correspondence: Given a point in one image, find the corresponding point in another one.
- Here the cameras and the images are known; only the match is unknown
5.4 Epipolar geometry
- Baseline: line connecting the two camera centers
- Epipolar plane: plane containing baseline and $X$
- Three coordinate systems are involved: two camera coordinate systems and one world coordinate system. The world coordinate system can be moved to coincide with one of the camera coordinate systems
- Epipoles $e, e'$: intersections of baseline with image planes
- Epipolar lines $l, l'$: intersections of epipolar plane with image planes (always come in corresponding pairs)
- If we observe a point $x$ in one image, where can the corresponding point $x'$ be in the other image?
- Potential matches for $x$ have to lie on the corresponding epipolar line $l'$.
- Potential matches for $x'$ have to lie on the corresponding epipolar line $l$.
- Whichever point is known, its match lies on the associated epipolar line, so matching only requires searching along that line
- The underlying argument is one of collinearity: we must show that the point where the line through $O'$ and $X$ meets the image plane always lies on the epipolar line
5.5 Epipolar constraint example
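5.6 Epipolar constraint: Calibrated case
- Goal: verify that the matched projection lies on the epipolar line, that is, given the coordinates of $x'$, check that it lies on the line
- First express all point coordinates in the world coordinate system
- Assume the world coordinate system coincides with one of the camera coordinate systems
- Intrinsic and extrinsic parameters of the cameras are known, world coordinate system is set to that of the first camera.
- Map back into the world coordinate system
- Any point $x'$ given in the second camera's coordinate system can be re-expressed in the world coordinate system
- Updated derivation (Lecture 10)
- The vector $E x=[t_\times] R x$ is the normal of the epipolar plane, so every $x'$ in that plane is perpendicular to it: $x'^{T} E x=0$. An equation satisfied by every such $x'$ is exactly the equation of the epipolar line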
- Recall: a line is given by $a x+b y+c=0$ or $\mathbf{l}^{T} \mathbf{x}=0$ where $\mathbf{l}=\left[\begin{array}{l}a \\ b \\ c\end{array}\right], \quad \mathbf{x}=\left[\begin{array}{l}x \\ y \\ 1\end{array}\right]$
- $E \boldsymbol{x}$ is the epipolar line associated with $\boldsymbol{x}\left(\boldsymbol{l}^{\prime}=\boldsymbol{E} \boldsymbol{x}\right)$
- $\boldsymbol{E}^{T} \boldsymbol{x}^{\prime}$ is the epipolar line associated with $\boldsymbol{x}^{\prime}\left(\boldsymbol{l}=\boldsymbol{E}^{T} \boldsymbol{x}^{\prime}\right)$
- $E \boldsymbol{e}=0$ and $\boldsymbol{E}^{\top} \boldsymbol{e}^{\prime}=0$
- $E$ is singular (rank two)
- Because $E=[t_\times]R$ and the skew-symmetric matrix $[t_\times]$ has rank 2
- $E$ has five degrees of freedom
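- Uncalibrated case: the calibration matrices $K$ and $K^{\prime}$ of the two cameras are unknown
- We can write the epipolar constraint in terms of unknown normalized coordinates: $\hat{x}^{\prime T} E \hat{x}=0$ with $\hat{x}=K^{-1} x$ and $\hat{x}^{\prime}=K^{\prime-1} x^{\prime}$, which gives $x^{\prime T} F x=0$ with $F=K^{\prime-T} E K^{-1}$
- Here $[I \mid 0]$ acts as the change of view, taking homogeneous coordinates to Euclidean ones
- Multiplying a pixel point by the inverse calibration matrix takes it to the normalized coordinate system of its camera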
- $\boldsymbol{F} \boldsymbol{x}$ is the epipolar line associated with $\boldsymbol{x}\left(\boldsymbol{l}^{\prime}=\boldsymbol{F} \boldsymbol{x}\right)$
- $\boldsymbol{F}^{T} \boldsymbol{x}^{\prime}$ is the epipolar line associated with $\boldsymbol{x}^{\prime}\left(\boldsymbol{l}=\boldsymbol{F}^{T} \boldsymbol{x}^{\prime}\right)$
- $\boldsymbol{F} \boldsymbol{e}=0$ and $\boldsymbol{F}^{T} \boldsymbol{e}^{\prime}=0$
- $\boldsymbol{F}$ is singular (rank two)
- $\boldsymbol{F}$ has seven degrees of freedom
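The properties of $E$ and $F$ listed in this section can be checked numerically. The sketch below assumes the convention that a point $X$ in the first camera's frame maps to $X' = RX + t$ in the second, so that $E = [t_\times]R$ and $F = K'^{-T} E K^{-1}$; the rotation, translation, and calibration matrices are made-up example values.

```python
import numpy as np

def skew(t):
    """Cross-product matrix: skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

theta = 0.2
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([-1.0, 0.1, 0.05])
K1 = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[900.0, 0.0, 300.0], [0.0, 900.0, 250.0], [0.0, 0.0, 1.0]])

E = skew(t) @ R                                   # essential matrix
F = np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)   # fundamental matrix

X1 = np.array([0.4, -0.3, 5.0])                   # point in camera-1 frame
X2 = R @ X1 + t                                   # same point in camera-2 frame
x_hat, x_hat_p = X1 / X1[2], X2 / X2[2]           # normalized image points
x, x_p = K1 @ x_hat, K2 @ x_hat_p                 # homogeneous pixel points

calibrated_residual = x_hat_p @ E @ x_hat         # x'^T E x
uncalibrated_residual = x_p @ F @ x               # x'^T F x
rank_E = np.linalg.matrix_rank(E, tol=1e-10)
epipole_1 = R.T @ t                               # satisfies E e = 0 (up to scale)

print(calibrated_residual, uncalibrated_residual, rank_E)
print(E @ epipole_1)
```

Both residuals vanish up to rounding, $E$ has rank two, and the epipole in the first image spans the null space of $E$.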
5.7 Estimating the fundamental matrix
5.7.1 The eight-point algorithm
- Solve homogeneous linear system using eight or more matches $\rightarrow F$
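- Each match contributes one linear equation in the nine entries of $F$, so eight matches are needed; stacked, the system is a matrix product $\mathbf{A}\mathbf{f}=0$
- Enforce rank-2 constraint (take SVD of $F$ and throw out the smallest singular value): given the linear estimate $\hat{\mathrm{F}}$, find $\mathrm{F}$ that minimizes $\|\mathrm{F}-\hat{\mathrm{F}}\|$ subject to $\det(\mathrm{F})=0$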
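The two steps above can be sketched as follows. This is a bare-bones version under stated assumptions: the function name `eight_point` and the synthetic scene are invented for the example, the data are noise-free, and the demo uses normalized image coordinates so the linear system stays well conditioned. With raw pixel coordinates a coordinate normalization step is usually added first (not shown).

```python
import numpy as np

def eight_point(x1, x2):
    """Estimate F with x2_h^T F x1_h = 0 from (n, 2) arrays of matches, n >= 8."""
    n = x1.shape[0]
    x1_h = np.hstack([x1, np.ones((n, 1))])
    x2_h = np.hstack([x2, np.ones((n, 1))])
    # Row i holds the outer product of x2_i and x1_i, flattened row-major,
    # so that A @ F.ravel() gives x2_i^T F x1_i for every match.
    A = np.einsum('ni,nj->nij', x2_h, x1_h).reshape(n, 9)
    _, _, Vt = np.linalg.svd(A)
    F_linear = Vt[-1].reshape(3, 3)
    # Rank-2 enforcement: zero out the smallest singular value.
    U, S, Vt_f = np.linalg.svd(F_linear)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt_f

# Synthetic scene: 12 points seen by two cameras related by X' = R X + t.
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(-1, 1, 12),
                     rng.uniform(-1, 1, 12),
                     rng.uniform(4, 6, 12)])
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([-1.0, 0.1, 0.2])
X2 = X @ R.T + t
x1 = X[:, :2] / X[:, 2:3]
x2 = X2[:, :2] / X2[:, 2:3]

F = eight_point(x1, x2)
x1_h = np.hstack([x1, np.ones((12, 1))])
x2_h = np.hstack([x2, np.ones((12, 1))])
residuals = np.einsum('ni,ij,nj->n', x2_h, F, x1_h)
print(np.abs(residuals).max())
```

Because the demo works in normalized coordinates, the recovered matrix coincides (up to scale and sign) with the essential matrix $[t_\times]R$ of the synthetic setup.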