Camera Model | Fungus Field

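1. Classification by Function

Area Scan / Line Scan
- Area Scan - the most common type of camera
- Line Scan - e.g. the scanner inside a printer
Color / Monochrome
CMOS / CCD
- CMOS - Complementary Metal Oxide Semiconductor
- CCD - Charge-Coupled Device
At present, CMOS sensors offer somewhat better overall performance
Global Shutter / Rolling Shutter
- Global Shutter - Capture entire frame at once
- Rolling Shutter - Capture image line by line
Because a rolling shutter captures the lines in sequence, the following can happen
1. Relative motion between the camera and the subject causes wobble and distortion
2. Transient effects such as a flash may be captured only partially, namely when the shutter is slower than the flash fades (shutter speed < flash decay speed)
Global shutter cameras are better suited to motion photography, but... they cost more, so be ready to cough up the coins (XP
Resolution

320p, 640p, 720p, 1080p, 1440p, 2160p, 4320p, ...
Frame Rate

15Hz / 30Hz / 60Hz...
Focal Length

Fixed Focal Length / Auto-Focus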
Connection Interface

USB 3.0 / GigE ...

2. Color Space

2.1 RGB[A]

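Color images are usually stored in RGB (Red-Green-Blue) format.

[Figure: rgb]

Typical value ranges of the R, G, B channels for different image depths:

| Type   | Depth                   | Value range |
|--------|-------------------------|-------------|
| CV_8U  | 8-bit unsigned integer  | 0 - 255     |
| CV_16U | 16-bit unsigned integer | 0 - 65535   |
| CV_32F | 32-bit floating point   | 0 - 1       |

The Alpha Channel is a fourth, "non-color" channel in addition to RGB; it carries the transparency.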

2.2 Gray / Greyscale

Usually has only one channel.

A commonly used formula for converting RGB to grayscale is

$$Y = 0.299\,R + 0.587\,G + 0.114\,B$$
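A minimal sketch of the weighted-sum conversion, assuming the BT.601 weights 0.299 / 0.587 / 0.114 and 8-bit inputs. The function name `rgb_to_gray` is made up for this example.

```python
def rgb_to_gray(r, g, b):
    """Weighted sum of 8-bit R, G, B values, rounded to the nearest integer."""
    # Assumed weights: ITU-R BT.601 luma coefficients.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return int(round(y))

# Pure green comes out brighter than pure red, which comes out brighter than pure blue.
print(rgb_to_gray(255, 0, 0), rgb_to_gray(0, 255, 0), rgb_to_gray(0, 0, 255))
```

The unequal weights reflect that the eye is most sensitive to green and least sensitive to blue.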

2.3 HSV

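Hue - Saturation - Value

[Figure: hsv]

Hue - Base Pigment
Saturation - Depth of Pigment
Value - Brightness of Pigment (how light or dark it is)

The HSV - RGB conversion is mathematically lossless for real-valued channels; with quantized channels (e.g. 8-bit) a round trip can pick up small rounding errors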
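The round trip can be tried out with Python's standard `colorsys` module. This is only a sketch: `colorsys` works on floats in [0, 1], which is not necessarily the range convention of any particular image library.

```python
import colorsys

rgb = (0.2, 0.4, 0.6)                 # channels as floats in [0, 1]
hsv = colorsys.rgb_to_hsv(*rgb)       # hue, saturation, value, each in [0, 1]
back = colorsys.hsv_to_rgb(*hsv)

# The round trip reproduces the input up to floating-point rounding.
max_error = max(abs(a - b) for a, b in zip(rgb, back))
print(hsv, max_error)
```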

3. Rigid Body Motion

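A description of rigid body motion based on Euler angles

3.1 2D Rotation

[Figure: 2d trans]

Let the coordinates of point $p$ in Frame $\{A\}$ be $p_A(x_a, y_a)$, at an angle $\alpha$ from the $x$ axis

Now rotate it by $\theta$ into Frame $\{B\}$; the coordinates of $p$ in $\{B\}$ are then written $p_B(x_b, y_b)$

The figure above shows that $p_A = p_B$ (the point rotates together with the frame, so its coordinate values do not change), but on its own this fact is useless without further processing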

How to express $p_B$ in $\{A\}$ coordinates ?

Define

$R_{AB} =$ coordinates of $\{B\}$ in $\{A\}$
$x_{B\{A\}} =$ $x$ coordinate of $\{B\}$ expressed in $\{A\}$
$y_{B\{A\}} =$ $y$ coordinate of $\{B\}$ expressed in $\{A\}$

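This gives the following relations

$$x_{B\{A\}} = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}, \quad y_{B\{A\}} = \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix}$$

$$R_{AB} = \begin{bmatrix} x_{B\{A\}} & y_{B\{A\}} \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

so that $p_B$ expressed in $\{A\}$ coordinates is $R_{AB}\,p_B$

Proof? Heh heh heh... Trivial!

General Form

[General Case] $p$ moving from $\{A\}$ to $\{B\}$

In the figure above, point $p$ is transformed from Frame $\{A\}$ (drawn with black dashed lines) to Frame $\{B\}$ (drawn with black solid lines), and we have $p_A = R_{AB}\,p_B$ and $p_B = R_{BA}\,p_A$, where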
- $p_B \longrightarrow$ coordinates of $p$ in $\{B\}$
- $p_A \longrightarrow$ coordinates of $p$ in $\{A\}$
- $R_{AB} =$ coordinates of $\{B\}$ in $\{A\}$
- $R_{BA} =$ coordinates of $\{A\}$ in $\{B\}$
Important Properties
1. $\det{(R)} = 1$
2. $RR^T = R^TR = I$
  
  $\longrightarrow R^T = R^{-1}$
  
  $\longrightarrow R_{AB}^T = R_{BA}$
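The relations and properties above can be checked numerically. This is a minimal sketch in plain Python, assuming a counterclockwise rotation angle and the convention $p_A = R_{AB}\,p_B$. The helper names `rot2d`, `matvec` and `transpose` are made up for this example.

```python
import math

def rot2d(theta):
    """R_AB for a frame {B} rotated counterclockwise by theta (radians) relative to {A}."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]   # columns: x and y axes of {B} expressed in {A}

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def transpose(m):
    return [list(row) for row in zip(*m)]

R_ab = rot2d(math.radians(90))
p_b = [1.0, 0.0]
p_a = matvec(R_ab, p_b)                    # p_A = R_AB p_B
p_b_again = matvec(transpose(R_ab), p_a)   # R_BA = R_AB^T undoes the rotation
det = R_ab[0][0] * R_ab[1][1] - R_ab[0][1] * R_ab[1][0]
print(p_a, p_b_again, det)
```

Up to floating-point rounding, `p_a` is $(0, 1)$, `p_b_again` recovers $(1, 0)$, and the determinant is $1$.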

3.2 2D Translation

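This one does not even need a figure: translating point $p$ from $\{A\}$ to $\{B\}$ can be expressed as

$$p_A = p_B + t_{AB}$$

where $t_{AB}$ (a symbol assumed here) is the position of the origin of $\{B\}$ expressed in $\{A\}$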

3.3 Homogeneous Coordinates

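Homogeneous Coordinates: what on earth are they?

Rotation & Translation Review

Written with the most basic matrices, the procedures in 3.1 and 3.2 are as follows

Here the notation changes: the more generic $q$ denotes the point being transformed, and $p$ denotes the Translation

2D Rotation

$$q_A = R_{AB}\,q_B$$

2D Translation

$$q_A = q_B + p_{AB}$$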

where
- $R_{AB}\in\mathbb{R}^{2\times2} \longrightarrow$ Coordinates of $\{B\}$ in $\{A\}$
- $p_{AB} \in\mathbb{R}^{2} \longrightarrow$ Position of $\{B\}$ in $\{A\}$
- $q_A \in\mathbb{R}^{2} \longrightarrow$ Position of point $q$ in $\{A\}$
- $q_B \in\mathbb{R}^{2} \longrightarrow$ Position of point $q$ in $\{B\}$

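Brute-Force vs. Elegant Motion Description

Two motions, two formulas. What if a single formula has to express Rotation + Translation at once?

$$q_A = R_{AB}\,q_B + p_{AB}$$

If that still looks fine, try chaining a few more motions while still using a single formula, for example through a third Frame $\{C\}$...

$$q_A = R_{AB}\,(R_{BC}\,q_C + p_{BC}) + p_{AB}$$

Nest a few more layers and shuffle the order of the motions, and it would take a miracle to still tell which motion belongs to which step...

Now suppose I tell you that there is a way to describe motion with one unified, concise Matrix Train. How would you respond?

Homogeneous Coordinates

The way to get that concise form is to switch to Homogeneous Coordinates

In other words, the motion is described in a space one dimension higher: 2D motion becomes 3D, 3D motion becomes 4D, and so on...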

"Add another coordinate to our points"
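- Position
  
  $$q = \begin{bmatrix} q_x \\ q_y \end{bmatrix} \longrightarrow \begin{bmatrix} q_x \\ q_y \\ 1 \end{bmatrix}$$
- Rotation Matrix
  
  $$R \longrightarrow \begin{bmatrix} R & 0 \\ 0 & 1 \end{bmatrix}$$
- Translation Matrix
  
  $$p \longrightarrow \begin{bmatrix} I & p \\ 0 & 1 \end{bmatrix}$$
  
  All motions are to be described in one uniform way, so the new Translation Matrix has the same Dimension as the new Rotation Matrix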

Homogeneous Transformation Matrix (HTM)

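Motion matrices written in Homogeneous Coordinates are collectively called the Homogeneous Transformation Matrix (HTM) $T$

Such matrices can be combined by multiplying them directly, and a different multiplication order gives a different result
- Translate first, then Rotate
  
  $$\begin{bmatrix} R & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} I & p \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} R & Rp \\ 0 & 1 \end{bmatrix}$$
- Rotate first, then Translate
  
  $$\begin{bmatrix} I & p \\ 0 & 1 \end{bmatrix} \begin{bmatrix} R & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} R & p \\ 0 & 1 \end{bmatrix}$$
[General Form of HTM]

$$T = \begin{bmatrix} R & p \\ 0 & 1 \end{bmatrix}$$

The parameters $R$ and $p$ can be set freely, as long as you remember the underlying order of transformation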

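Usage
1. General homogeneous transformation of a point $q$ from Frame $\{B\}$ to $\{A\}$
  
  $$\begin{bmatrix} q_A \\ 1 \end{bmatrix} = T_{AB} \begin{bmatrix} q_B \\ 1 \end{bmatrix} = \begin{bmatrix} R_{AB} & p_{AB} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} q_B \\ 1 \end{bmatrix}$$
  
  which expands to
  
  $$q_A = R_{AB}\,q_B + p_{AB}$$
2. Using a Matrix Train to express the homogeneous transformation of a point across several Frames
  
  $$\begin{bmatrix} q_A \\ 1 \end{bmatrix} = T_{AB}\,T_{BC}\,T_{CD} \begin{bmatrix} q_D \\ 1 \end{bmatrix}$$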
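3. Important Properties
  - $\det{(T)} = 1$
  - $T^{-1} = \begin{bmatrix} R^T & -R^Tp \\ 0 & 1 \end{bmatrix}$
    
    $\longrightarrow T_{AB}^{-1} = T_{BA}$
    
    Unlike a Rotation Matrix, $T^T \neq T^{-1}$ in general; the two coincide only when $p = 0$
  - Can make two parallel lines intersect (see 3.4.2)
  In its properties it closely resembles a Rotation Matrix (unit determinant, invertible, closed under multiplication), but it is not orthogonal unless $p = 0$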
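A small numerical sketch of how HTMs compose, in plain Python. It assumes the standard block form $T = \begin{bmatrix} R & p \\ 0 & 1 \end{bmatrix}$ with column vectors, so the rightmost matrix acts first. All helper names are made up for this example.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def rotation_htm(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def translation_htm(px, py):
    return [[1.0, 0.0, px], [0.0, 1.0, py], [0.0, 0.0, 1.0]]

def inverse_htm(t):
    """Inverse of [[R, p], [0, 1]] is [[R^T, -R^T p], [0, 1]]."""
    (a, b, px), (c, d, py) = t[0], t[1]
    return [[a, c, -(a * px + c * py)],
            [b, d, -(b * px + d * py)],
            [0.0, 0.0, 1.0]]

R = rotation_htm(math.radians(90))
P = translation_htm(1.0, 0.0)
q = [[1.0], [0.0], [1.0]]              # the 2D point (1, 0) in homogeneous form

rotate_then_translate = matmul(P, R)   # rightmost matrix acts on the point first
translate_then_rotate = matmul(R, P)
print(matmul(rotate_then_translate, q))   # approximately (1, 1)
print(matmul(translate_then_rotate, q))   # approximately (0, 2)
print(matmul(inverse_htm(rotate_then_translate), rotate_then_translate))  # approximately identity
```

The two products send the same point to different places, which is why the order in a Matrix Train matters.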

3.4 Translation in Homogeneous Coordinates

Translation Visualized

A 2D Translation is actually a shear transformation in 3D, as the following example shows

Given coordinates in the Initial Frame and Homogeneous Transformation Matrix (HTM)

Then the coordinates of $p$ after translation is

If we visualize this process in 3D space...
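- White point $\longrightarrow$ the position of point $p$
- Yellow point $\longrightarrow$ the projection of the white point onto the 2D plane at $z = 0$
- Blue cube region $\longrightarrow$ this cube can be pictured as all Coordinates, i.e. the set of all points $p$
  
  Abstractly, $p = [p_x, p_y, p_z]^T$, where $p_z$ does not have to be $1$
- Blue plane containing the white point $\longrightarrow$ this plane can be pictured as all Coordinates that meet the requirement of 2D translation, i.e. the set of all points $p$ with $p_z = 1$
  
  Abstractly, $p = [p_x, p_y, 1]^T$

How to explain the deformation in the figure?

The deformation in the figure is a kind of perspective projection (see 5)

Notice that the whole blue cube region is deformed by the homogeneous transformation. To verify this, replace the $p_i$ chosen earlier with any coordinate inside the blue cube region and multiply it by the homogeneous transformation matrix $H$

In the figure above, the white dashed line marks where the points originally on the $z$ axis end up after the deformation

If $\omega$ denotes a point's coordinate along the $z$ axis, then all points on the $z$ axis can be written as

$$\begin{bmatrix} 0 \\ 0 \\ \omega \end{bmatrix}$$

and the deformation of this line can be written as

$$H \begin{bmatrix} 0 \\ 0 \\ \omega \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ \omega \end{bmatrix} = \begin{bmatrix} \omega\,t_x \\ \omega\,t_y \\ \omega \end{bmatrix}$$

where $(t_x, t_y)$ stands for the translation encoded in $H$ (symbols assumed here)

The plane described by $\omega = 1$ is called the normalized plane, i.e. the first blue plane counting from the bottom in the figure above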
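The shear can be seen numerically with a few points. This is a sketch with a hypothetical translation of $(2, 1)$; the values and the helper name `apply` are chosen only for illustration and are not the ones from the original example.

```python
def apply(h, p):
    return [sum(h[i][j] * p[j] for j in range(3)) for i in range(3)]

tx, ty = 2.0, 1.0                      # hypothetical translation, for illustration only
H = [[1.0, 0.0, tx],
     [0.0, 1.0, ty],
     [0.0, 0.0, 1.0]]

# Points that start on the z axis, [0, 0, w], are pushed sideways in proportion to w.
sheared = [apply(H, [0.0, 0.0, w]) for w in (0.0, 1.0, 2.0)]
print(sheared)

# On the normalized plane (w = 1) the same matrix acts as an ordinary 2D translation.
moved = apply(H, [3.0, 4.0, 1.0])
print(moved)
```

The third coordinate never changes, while the sideways shift grows linearly with it, which is the defining behavior of a shear.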

4. Pinhole Camera Model

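First, set up a fixed Global Frame $W$ and a freely moving Camera Frame $C$

[Figure: cam model scene]

Some elements in the figure above

White point $\longrightarrow$ the object's coordinate point in the world frame

Blue plane $\longrightarrow$ Image Plane of the camera

Blue point $\longrightarrow$ the projection of the white point onto the camera's Image Plane

Describing this projection requires solving the following problems

Coordinates Conversion from $W \longrightarrow C$

Solution: Extrinsic Matrix $E$
Coordinates Projection from $C \longrightarrow \text{Image Plane}$

Solution: Intrinsic Matrix $K$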

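4.1 Extrinsic Matrix

The Extrinsic Matrix converts an object's coordinates in the Global Frame into the Camera Frame; it is a coordinate transformation in 3D space

The formula from 3.3 - Usage 1. can be reused directly; all it takes is turning $q_A, q_B, R_{AB}, p_{AB}$ into their 3D versions!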

$R_{AB}\in\mathbb{R}^{3\times3} \longrightarrow$ Coordinates of $\{B\}$ in $\{A\}$
$p_{AB} \in\mathbb{R}^{3} \longrightarrow$ Position of $\{B\}$ in $\{A\}$
$q_A \in\mathbb{R}^{3} \longrightarrow$ Position of point $q$ in $\{A\}$
$q_B \in\mathbb{R}^{3} \longrightarrow$ Position of point $q$ in $\{B\}$

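$$\begin{bmatrix} q_A \\ 1 \end{bmatrix} = E \begin{bmatrix} q_B \\ 1 \end{bmatrix}, \quad E = \begin{bmatrix} R_{AB} & p_{AB} \\ 0_{1\times3} & 1 \end{bmatrix} \in \mathbb{R}^{4\times4}$$

where $E$ is the Extrinsic Matrix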

Degrees of Freedom

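Good news: the three dimensions $(x, y, z)$ of our world are mutually orthogonal, so $R_{AB}\in SO(3)$; although it has 9 entries, it has only 3 DOF

The Extrinsic Matrix has 6 degrees of freedom in total: 3 for rotation and 3 for translation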
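A minimal sketch of applying an Extrinsic Matrix in plain Python. It assumes $\{A\}$ is the Camera Frame and $\{B\}$ is the Global Frame, so that $E$ maps world coordinates to camera coordinates. The pose values and the helper names `extrinsic` and `transform` are hypothetical.

```python
import math

def extrinsic(R, p):
    """Stack a 3x3 rotation R and a 3-vector p into the 4x4 matrix [[R, p], [0, 1]]."""
    return [R[0] + [p[0]], R[1] + [p[1]], R[2] + [p[2]], [0.0, 0.0, 0.0, 1.0]]

def transform(E, q):
    qh = list(q) + [1.0]                               # homogeneous coordinates
    out = [sum(E[i][j] * qh[j] for j in range(4)) for i in range(4)]
    return out[:3]

yaw = math.radians(90)                                 # hypothetical pose, for illustration
R = [[math.cos(yaw), -math.sin(yaw), 0.0],
     [math.sin(yaw),  math.cos(yaw), 0.0],
     [0.0, 0.0, 1.0]]
p = [0.0, 0.0, 5.0]
E = extrinsic(R, p)
q_cam = transform(E, [1.0, 0.0, 0.0])
print(q_cam)   # approximately [0, 1, 5]
```

The 6 degrees of freedom show up here as one rotation (3 angles in general, only yaw in this toy pose) plus the 3 components of `p`.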

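4.2 Intrinsic Matrix

Problem Statement

With the Extrinsic Matrix $E$ from 4.1, the object's position can be converted from world coordinates to camera coordinates

The Intrinsic Matrix solves the projection from camera coordinates onto the Image Plane
- Red point $p(x, y, z) \longrightarrow$ the object's position in the camera frame
- Blue point $p'(x', y', f) \longrightarrow$ the object's projection onto the blue plane, i.e. the Image Plane
- Blue plane $\longrightarrow$ Image Plane
- $f \longrightarrow$ focal length
- $z \longrightarrow$ Optical Axis; its intersection with the Image Plane is called the Principal Point $(c_x, c_y)$
Image Plane Problems

Every problem below can be blamed on the manufacturing process
1. Ideally, the Principal Point sits exactly at the center of the Image Plane, i.e. $(c_x, c_y) = (0, 0)$
  
  In practice: there is an Offset
2. Ideally, the $x$ to $y$ proportion of every Image Plane is as intended
  
  In practice: there is Scaling
3. Ideally, the $x$ and $y$ axes of every Image Plane are perfectly orthogonal
  
  In practice: there is Skew
Offset and Scaling can be dealt with while obtaining the projected point coordinates (see 4.2.1); Skew is more complicated (see 4.2.2)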

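4.2.1 How to Obtain the Projected Point Coordinates

In the figure above, $p$, $p'$ and the $(x, z)$ plane of $\{C\}$ form similar triangles, from which $x'$ and $y'$ are

$$x' = f\,\frac{x}{z}, \quad y' = f\,\frac{y}{z}$$

Taking the offset of the optical axis into account, the coordinates $(u, v)$ of the projected point on the Image Plane are

$$u = f\,\frac{x}{z} + c_x, \quad v = f\,\frac{y}{z} + c_y$$

Rewrite the right-hand side in matrix form

$$z \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & c_x \\ 0 & f & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}$$

Then get rid of $z$ by normalizing over the depth

The Dimension Loss from 3D to 2D happens right here

which gives

$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & c_x \\ 0 & f & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x/z \\ y/z \\ 1 \end{bmatrix}$$

As the last step, the captured image has to be digitized (Digitization / Pixelization), which also takes care of the Image Plane scaling

Assume $\rho_w$ and $\rho_h$ are width and height of each pixel

$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f/\rho_w & 0 & c_x \\ 0 & f/\rho_h & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x/z \\ y/z \\ 1 \end{bmatrix}, \quad f_x = \frac{f}{\rho_w}, \; f_y = \frac{f}{\rho_h}$$

with $(u, v)$ and $(c_x, c_y)$ now measured in pixels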
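The projection steps of this subsection can be sketched in a few lines of Python, assuming no skew and a principal point already expressed in pixels. The camera numbers (4 mm lens, 2 µm square pixels) and the function name `project` are hypothetical.

```python
def project(point_cam, f, rho_w, rho_h, cx, cy):
    """Project a camera-frame point (x, y, z) to pixel coordinates (u, v)."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    fx, fy = f / rho_w, f / rho_h      # focal length expressed in pixels
    u = fx * x / z + cx                # the division by z is the depth normalization
    v = fy * y / z + cy
    return u, v

# Hypothetical camera: 4 mm lens, 2 um square pixels, principal point at (320, 240) px.
cam = dict(f=0.004, rho_w=2e-6, rho_h=2e-6, cx=320.0, cy=240.0)
near = project((0.1, -0.05, 2.0), **cam)
far = project((0.2, -0.10, 4.0), **cam)   # same viewing ray, twice as far away
print(near, far)
```

Both points land on (approximately) the same pixel, which is the Dimension Loss in action: depth along a viewing ray cannot be recovered from a single image.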

4.2.2 Skewed Image Plane

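[Figure: skew]

The image cast by the camera lens onto the Image Plane looks like the left side, but for the Image Plane to receive it correctly it has to be transformed into the right side

[Figure: skew coord]

The figure above gives the following coordinate transformation formula

Convert the result of 4.2.1 from Matrix Form to Equation Form, then substitute it into the formula above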

4.2.3 Summary

Pixelized Matrix Form
Pixelized Matrix Form, Normalized
Pixelized Equation Form

And $K$ is the Intrinsic Matrix

Degrees of Freedom

5 parameters, so the Intrinsic Matrix has 5 degrees of freedom in total
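For reference, one common way to write the Intrinsic Matrix with all five parameters uses a generic skew coefficient $s$. The symbol $s$ is introduced here as an assumption; the derivation in 4.2.2 may parameterize the skew differently, for example through the angle between the image axes.

$$K = \begin{bmatrix} f_x & s & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}, \qquad z \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} x \\ y \\ z \end{bmatrix}$$

In this form the five degrees of freedom are $f_x, f_y, s, c_x, c_y$, and setting $s = 0$ gives back the skew-free case of 4.2.1.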

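4.3 Lens Distortion

A camera is of course more than a pinhole (aperture); it also has a lens, and a lens usually deforms the image... and then your coordinates are off

The good news: all it takes is one extra Anti-Distortion step before computing the projected point coordinates

The bad news: it is no ordinary hassle...

Lens Distortion is mostly Radial Distortion, with a smaller share of Tangential Distortion

[Figure: lens distort]

The solution OpenCV provides uses a rather outrageous-looking set of... coefficients...

$$\begin{bmatrix} x'' \\ y'' \end{bmatrix} = \begin{bmatrix} x'\,\dfrac{1 + k_1 r^2 + k_2 r^4 + k_3 r^6}{1 + k_4 r^2 + k_5 r^4 + k_6 r^6} + 2 p_1 x' y' + p_2 (r^2 + 2 x'^2) \\ y'\,\dfrac{1 + k_1 r^2 + k_2 r^4 + k_3 r^6}{1 + k_4 r^2 + k_5 r^4 + k_6 r^6} + p_1 (r^2 + 2 y'^2) + 2 p_2 x' y' \end{bmatrix}$$

Here $p_1, p_2$ are the Tangential Distortion Coefficients, and the thin-prism terms that OpenCV also supports are left out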

where

$r^2 = x'^2 + y'^2$
$k_i$ are Radial Distortion Coefficients, and typically

Higher-Order Coefficients are not considered in OpenCV
- $k_1<0 \longrightarrow$ Barrel Distortion
- $k_1>0 \longrightarrow$ Pincushion Distortion

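The projected point coordinates can then be computed as follows

1. Compute the homogeneous coordinates
  
  $$x' = \frac{x}{z}, \quad y' = \frac{y}{z}$$
2. Account for the distortion (4.2.1 does not have this step), which turns $(x', y')$ into the distorted $(x'', y'')$
3. Compute the projected point coordinates
  
  $$u = f_x\,x'' + c_x, \quad v = f_y\,y'' + c_y$$

$x', y'$ differ slightly from the earlier definition: for convenience, $f_x, f_y$ are moved to the last step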
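The three steps can be sketched in plain Python. This assumes the common five-coefficient subset $(k_1, k_2, p_1, p_2, k_3)$ of the distortion model, with the denominator coefficients $k_4, k_5, k_6$ taken as zero. The intrinsics, coefficient values and function names are hypothetical.

```python
def distort(xn, yn, k1=0.0, k2=0.0, p1=0.0, p2=0.0, k3=0.0):
    """Apply radial (k1, k2, k3) and tangential (p1, p2) distortion to normalized coordinates."""
    r2 = xn * xn + yn * yn
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    xd = xn * radial + 2.0 * p1 * xn * yn + p2 * (r2 + 2.0 * xn * xn)
    yd = yn * radial + p1 * (r2 + 2.0 * yn * yn) + 2.0 * p2 * xn * yn
    return xd, yd

def project_with_distortion(point_cam, fx, fy, cx, cy, **coeffs):
    x, y, z = point_cam
    xn, yn = x / z, y / z                    # step 1: normalized coordinates
    xd, yd = distort(xn, yn, **coeffs)       # step 2: lens distortion
    return fx * xd + cx, fy * yd + cy        # step 3: pixel coordinates

# Hypothetical intrinsics and coefficients, for illustration only.
ideal = project_with_distortion((0.5, 0.0, 1.0), 1000.0, 1000.0, 320.0, 240.0)
barrel = project_with_distortion((0.5, 0.0, 1.0), 1000.0, 1000.0, 320.0, 240.0, k1=-0.2)
print(ideal, barrel)
```

With a negative $k_1$ the point is pulled toward the principal point, which is the barrel-shaped behavior of this forward model.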

4.4 Model Summary

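General Form

Given the object coordinates $p(x, y, z)$ in the Global Frame $\{W\}$, the projected point coordinates in the Camera Frame $\{C\}$ are found as follows
1. $\{W\} \longrightarrow \{C\}$
  
  $$\begin{bmatrix} x_C \\ y_C \\ z_C \\ 1 \end{bmatrix} = E \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$
2. $\{C\} \longrightarrow \text{Image Plane}$
  
  $$z_C \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} x_C \\ y_C \\ z_C \end{bmatrix}$$
Then get rid of $z_C$, keep $(u, v)$, and it is done
Dimension-Matched Form

In the form above the Dimensions do not line up directly ($E$ is $4\times4$ while $K$ is $3\times3$), so here the last row of the Extrinsic Matrix is cut off, turning it into the Euclidean Rigid Transformation $[R \;|\; p] \in \mathbb{R}^{3\times4}$

It can then be written in a slightly uglier but Dimension-matched version

$$z_C \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K\,[R \;|\; p] \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$

Degrees of Freedom

The Camera Matrix has 11 degrees of freedom in total: 5 from the Intrinsic Matrix and 6 from the Extrinsic Matrix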
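The whole chain, world point to pixel, can be sketched as a single $3\times4$ matrix product followed by a division. This assumes zero skew and no lens distortion; the intrinsics, the pose and the helper names are hypothetical.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

# Hypothetical intrinsics: fx = fy = 800 px, principal point at (320, 240) px.
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]
# [R | p]: identity rotation, world origin 4 units in front of the camera along z.
Rp = [[1.0, 0.0, 0.0, 0.0],
      [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 4.0]]
P = matmul(K, Rp)                       # 3x4 camera matrix

def world_to_pixel(P, point_w):
    xh = [[c] for c in list(point_w) + [1.0]]          # homogeneous column vector
    s_u, s_v, s = (row[0] for row in matmul(P, xh))
    return s_u / s, s_v / s                            # divide out z_C

u, v = world_to_pixel(P, (1.0, 0.5, 0.0))
print(u, v)
```

Dropping the last row of the Extrinsic Matrix is what lets `K` and `Rp` multiply directly.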

5. Camera Calibration

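Checkerboard Calibration, which aims to find the Intrinsic Matrix

The Perspective-n-Point Problem, which aims to find the Extrinsic Matrix

5.1 Checkerboard Calibration

Camera Intrinsics are generally static, so calibrating once is enough

Unless the camera has been shipped over a long distance or has taken a heavy impact, they do not change easily

A checkerboard is a good reference for judging image deformation, and checkerboard calibration is a very widely used calibration method

[Figure: checkerboard]

If you know how to use fiducial markers such as AprilTag, the AprilTag board shown below works too; either way there is a pile of Libraries to choose from

[Figure: apriltag calib]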
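Whatever library is used, checkerboard calibration usually starts from the known 3D positions of the board's inner corners, expressed in the board's own frame with $z = 0$. Below is a sketch of generating them, assuming a hypothetical board with 9 x 6 inner corners and 25 mm squares. The function name is made up, and the corner ordering a given library expects may differ.

```python
def checkerboard_corners(cols, rows, square_size):
    """3D positions of the inner corners, row by row, in the board frame (z = 0)."""
    return [(c * square_size, r * square_size, 0.0)
            for r in range(rows) for c in range(cols)]

# Hypothetical board: 9 x 6 inner corners, 25 mm squares (units: meters).
corners = checkerboard_corners(cols=9, rows=6, square_size=0.025)
print(len(corners), corners[0], corners[-1])
```

These model points are then paired with the corner pixels detected in each calibration image.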

5.2 PnP Problem

PnP = Perspective-n-Point

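[Figure: pnp]

Given

$K \longrightarrow$ camera intrinsics

$X_i \in \mathbb{R}^3 \longrightarrow$ $n$ points in 3D space

$x_i \in \mathbb{R}^2 \longrightarrow$ the coordinates of $X_i$ projected onto the camera's Image Plane

In general at least 3 point correspondences are required; with exactly 3 (the P3P case) up to four candidate poses remain, so a fourth point is commonly used to pick the right one
Find

$E = [R \;| \;p] \in \mathbb{R}^{3\times 4} \longrightarrow$ camera extrinsics, i.e. the camera pose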

where $R\in SO(3)$ and $p \in \mathbb{R}^3$
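One common way to judge a candidate pose is its reprojection error: project every $X_i$ with $K$ and $[R \;|\; p]$ and compare against the observed $x_i$. The sketch below only evaluates that error for a given pose and does not solve for the pose. It assumes zero skew and no distortion; the data and function names are hypothetical.

```python
import math

def project(K, R, p, X):
    """Pixel coordinates of the 3D point X under intrinsics K and pose [R | p]."""
    xc = [sum(R[i][j] * X[j] for j in range(3)) + p[i] for i in range(3)]
    u = K[0][0] * xc[0] / xc[2] + K[0][2]      # skew is assumed to be zero here
    v = K[1][1] * xc[1] / xc[2] + K[1][2]
    return u, v

def reprojection_rmse(K, R, p, points_3d, points_2d):
    """Root-mean-square pixel distance between observed and reprojected points."""
    total = 0.0
    for X, (u_obs, v_obs) in zip(points_3d, points_2d):
        u, v = project(K, R, p, X)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return math.sqrt(total / len(points_3d))

# Hypothetical setup: synthesize observations from a known pose, then score two poses.
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
points_3d = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.5)]
true_p = [0.0, 0.0, 4.0]
points_2d = [project(K, I3, true_p, X) for X in points_3d]

err_true = reprojection_rmse(K, I3, true_p, points_3d, points_2d)
err_wrong = reprojection_rmse(K, I3, [0.1, 0.0, 4.0], points_3d, points_2d)
print(err_true, err_wrong)
```

A PnP solver can be thought of as searching for the $R$ and $p$ that drive this error toward zero.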
Applications
- Augmented Reality (AR)
- Structure from Motion (SfM)
- ... ...