semi-supervised learning

1. What is semi-supervised learning?

image-20211210100412162

  • Humans learn in a semi-supervised way

1.1 Why does semi-supervised learning help?

image-20211210100909904

  • The distribution of the unlabeled data tells us something.

image-20211210100942292

1.2 Low-density Separation Assumption

image-20211210101011787

  • We want the gap between the separated classes to be as large as possible, i.e., the decision boundary should pass through a low-density region.

  • Given: labelled data set $=\left\{\left(x^{r}, \hat{y}^{r}\right)\right\}_{r=1}^{R}$, unlabeled data set $=\left\{x^{u}\right\}_{u=1}^{U}$

  • Repeat (self-training; a sketch follows the figure below):

image-20211210101306510
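
The loop in the figure is the classic self-training recipe: train a model on the labelled set, pseudo-label the unlabeled set, and absorb the confident predictions into the training data. A minimal sketch, assuming a scikit-learn-style classifier (the confidence threshold and classifier choice are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_training(X_lab, y_lab, X_unlab, threshold=0.9, max_iter=10):
    """Repeat: train on labelled data, pseudo-label confident unlabeled
    examples with hard labels, and move them into the labelled set."""
    model = LogisticRegression().fit(X_lab, y_lab)
    for _ in range(max_iter):
        if len(X_unlab) == 0:
            break
        proba = model.predict_proba(X_unlab)
        pick = proba.max(axis=1) >= threshold  # keep only confident pseudo-labels
        if not pick.any():
            break
        X_lab = np.vstack([X_lab, X_unlab[pick]])
        y_lab = np.concatenate([y_lab, model.classes_[proba[pick].argmax(axis=1)]])
        X_unlab = X_unlab[~pick]
        model = LogisticRegression().fit(X_lab, y_lab)  # retrain on the enlarged set
    return model
```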

  • Hard label vs. soft label
    • Consider using a neural network $\theta^{*}$ (network parameters) trained from the labelled data

image-20211210101454065

  • With soft labels the training target equals the network's current output, so training has no effect; hence hard labels should be used.

1.3 Entropy-based Regularization

image-20211210101842324

  • We want the classification to be clear-cut (black or white).
  • Entropy measures how concentrated the classification is: for unlabeled data, the more concentrated (lower-entropy) the output distribution, the better (see the sketch below).
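
A minimal PyTorch sketch of how the entropy term can be added to the loss; the weight `lam` of the unlabeled term is an assumed hyperparameter:

```python
import torch
import torch.nn.functional as F

def entropy_regularized_loss(logits_lab, targets, logits_unlab, lam=0.1):
    # Supervised cross-entropy on the labelled mini-batch.
    ce = F.cross_entropy(logits_lab, targets)
    # Entropy of the predictions on the unlabeled mini-batch:
    # lower entropy = more concentrated ("black or white") outputs.
    p = F.softmax(logits_unlab, dim=1)
    entropy = -(p * torch.log(p + 1e-8)).sum(dim=1).mean()
    return ce + lam * entropy
```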

1.4 Smoothness Assumption

  • Assumption: "similar" $x$ has the same $\hat{y}$
    • Similar data should have similar labels.
  • More precisely:
    • $x$ is not uniformly distributed.
    • If $x^{1}$ and $x^{2}$ are close in a high-density region, then $\hat{y}^{1}$ and $\hat{y}^{2}$ are the same.

image-20211210102237830

  • Connected by a high density path

image-20211210102337143

  • We can insert many intermediate "2"s into the training data so that the leftmost "2" is connected to the rightmost "2" by a high-density path.

  • Classify astronomy vs. travel articles

image-20211210102620958

  • A connected high-density region can be found between the documents, which enables classification.

1.5 Graph-based Approach

  • How to know $x^{1}$ and $x^{2}$ are connected by a high-density path?

image-20211210102726729

  • Define the similarity $s\left(x^{i}, x^{j}\right)$ between $x^{i}$ and $x^{j}$

  • Add edge:

    • k-Nearest Neighbor
    • ε-Neighborhood

image-20211210102920349

image-20211210102924267

  • Edge weight is proportional to the similarity $s\left(x^{i}, x^{j}\right)$, e.g., the Gaussian Radial Basis Function $s\left(x^{i}, x^{j}\right)=\exp \left(-\gamma\left\|x^{i}-x^{j}\right\|^{2}\right)$:

image-20211210102943612
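
A minimal numpy sketch of the graph construction: connect each point to its k nearest neighbors and weight the edges with the Gaussian RBF (`k` and `gamma` are illustrative hyperparameters):

```python
import numpy as np

def build_graph(X, k=5, gamma=1.0):
    """Build a KNN graph with RBF edge weights
    s(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]                # skip self (distance 0)
        W[i, nbrs] = np.exp(-gamma * d2[i, nbrs])
    return np.maximum(W, W.T)                            # symmetrize
```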

  • The labelled data influence their neighbors, and the labels propagate through the graph.
    • Labels on the graph spread along its paths.

image-20211210103022932

image-20211210103102308

  • This is not guaranteed to work.

image-20211210103106161

  • Define the smoothness $S$ of the labels on the graph
    • $w$ measures similarity in feature space; the smaller $S$, the smoother the labels.

image-20211210103143848

image-20211210103225622

  • The smoothness can be written as $S=\mathbf{y}^{T} L \mathbf{y}$
  • $y$: $(R+U)$-dim vector
    • During label propagation the labels are initialized first; $R$ counts the labelled examples and $U$ the originally unlabeled ones.
  • $L$: $(R+U) \times(R+U)$ matrix, the graph Laplacian $L=D-W$
    • $D$ is the diagonal matrix whose entries are the row sums of $W$ (see the sketch below).
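
A minimal numpy sketch of evaluating the smoothness from the weight matrix $W$ and a label vector $y$:

```python
import numpy as np

def smoothness(W, y):
    """S = 1/2 * sum_ij w_ij (y_i - y_j)^2 = y^T L y, with L = D - W."""
    D = np.diag(W.sum(axis=1))  # row sums placed on the diagonal
    L = D - W                   # graph Laplacian
    return y @ L @ y
```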

image-20211210103429450

image-20211210103437375

  • The smoothness term can be added at different layers: after propagation, each layer's output is returned so that the loss can be computed on it.

image-20211210103501154

2. Unsupervised Neural Network

2.1 Recall: Unsupervised learning

Data: just $x$, no labels!
Goal: learn some underlying hidden structure of the data.
Examples: clustering, dimensionality reduction, density estimation, etc.

  • K-means clustering

image-20211210104309326
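
For reference, a compact k-means sketch in numpy (random initialization; assignment and update steps alternate until the centers stop moving):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # random initial centers
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        # Move each center to the mean of its assigned points.
        new = np.array([X[labels == j].mean(0) if (labels == j).any()
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers
```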

2.2 Auto-encoder

  • We want the encoder to automatically distill compact features, and the decoder to reconstruct the input from the code.

image-20211210104423492

  • Output of the hidden layer is the code

image-20211210104607869
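
A minimal PyTorch sketch of such an auto-encoder; the layer sizes are illustrative (e.g., flattened 28×28 images), and training minimizes the reconstruction error between output and input:

```python
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Bottleneck 'code' layer between a mirrored encoder and decoder."""
    def __init__(self, in_dim=784, code_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, code_dim),        # output of this layer is the code
        )
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 256), nn.ReLU(),
            nn.Linear(256, in_dim),
        )

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), code
```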

2.3 Deep Auto-encoder

image-20211210104754054

image-20211210104829374

  • Representations learned by deeper networks are more discriminative.

image-20211210105000945

  • De-noising auto-encoder
    • We want noisy inputs to be reconstructed as clean images, i.e., the auto-encoder should learn to remove the noise on its own (a training-step sketch follows).
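
A minimal training-step sketch for the de-noising variant, reusing the `AutoEncoder` sketch above; additive Gaussian noise is an assumed corruption process:

```python
import torch

def denoising_step(model, x, optimizer, noise_std=0.3):
    """Corrupt the input, but reconstruct the *clean* target."""
    noisy = x + noise_std * torch.randn_like(x)
    recon, _ = model(noisy)
    loss = torch.nn.functional.mse_loss(recon, x)  # target is the clean x
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```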

image-20211210105036929

image-20211210105156254

2.4 Auto-encoder – Text Retrieval

image-20211210105259882

  • The documents talking about the same thing will have close code.

image-20211210105420650

image-20211210105545554

2.6 Auto-encoder for CNN

image-20211210105625444

2.7 CNN - Unpooling

image-20211210105714259

2.8 CNN - Deconvolution

image-20211210105939710

  • Greedy layer-wise pre-training
    • Train one layer at a time; once a layer is trained, its parameters are frozen.

image-20211210110004456

image-20211210110034495

image-20211210110041712

  • Finally, fine-tune the whole network end-to-end.

image-20211210110135765

2.9 Why VAE (Variational Auto-Encoders)?

  • Can we generate new samples by interpolating between codes?
    • Not with a plain auto-encoder.

image-20211210110245646

  • But we would like linear interpolation between codes to yield new images.
    • Turn the deterministic code vector into a distribution, i.e., add noise to the code.

image-20211210110336087

  • $e$ is sampled from a Gaussian distribution, and $\sigma$ is the standard deviation.

image-20211210110752205
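
A minimal PyTorch sketch of this reparameterization; treating the network's $\sigma$ output as a log-scale value (so it is exponentiated before use) is an assumption here:

```python
import torch

def reparameterize(m, log_sigma):
    """c = m + exp(log_sigma) * e, with e ~ N(0, I).
    The randomness lives in e, so gradients can flow to m and log_sigma."""
    e = torch.randn_like(log_sigma)  # e sampled from a standard Gaussian
    return m + torch.exp(log_sigma) * e
```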

2.10 Pokémon Creation

image-20211210111103357

  • The vertical direction of the code controls size; the horizontal direction controls orientation.

image-20211210111141895

2.11 Problems of VAE

  • It does not really try to simulate real images

image-20211210111330749

  • Because the reconstruction loss is pixel-level, generated images are not necessarily perceptually close to real ones.

3. Generative Adversarial Network (GAN)

3.1 Basic Idea of GAN

  • The data we want to generate has a distribution $P_{\text{data}}(x)$

image-20211210111657576

  • A generator G is a network. The network defines a probability distribution.
    • The distribution of the original data is not modeled explicitly.

image-20211210111725293

3.2 Generative adversarial networks

  • Train two networks with opposing objectives:
    • Generator: learns to generate samples
    • Discriminator: learns to distinguish between generated and real samples
  • The two play a game against each other, and both get better and better.

image-20211210111921680

3.3 Evolution

image-20211210112204048

  • Generator
    • Each dimension of the input code determines some feature of the image.

image-20211210112302058

image-20211210112321616

  • Discriminator

image-20211210112503642

3.4 The evolution of generation

  • Fix one network while updating the other, and iterate.

image-20211210113116528

  • The discriminator $D(x)$ should output the probability that the sample $x$ is real
  • That is, we want $D(x)$ to be close to 1 for real data and close to 0 for fake
  • Expected conditional log-likelihood for real and generated data:
    • The discriminator wants to tell real samples from fake ones.
    • The generator wants the opposite: to make this objective as small as possible.
  • We seed the generator with noise $z$ drawn from a simple distribution $p$
    (Gaussian or uniform)

3.5 GAN objective

  • The discriminator wants to correctly distinguish real and fake samples: $\max _{D} \mathbb{E}_{x \sim P_{\text {data }}}[\log D(x)]+\mathbb{E}_{z \sim p}[\log (1-D(G(z)))]$
  • The generator wants to fool the discriminator: $\min _{G} \mathbb{E}_{z \sim p}[\log (1-D(G(z)))]$
  • Train the generator and discriminator jointly in a minimax game


3.6 Training algorithm in practice

  • Update discriminator:
    • Repeat for $k$ steps:
      • Sample a mini-batch of noise samples $z_{1}, \ldots, z_{m}$ and a mini-batch of real samples $x_{1}, \ldots, x_{m}$
      • Update parameters of $D$ by stochastic gradient ascent on $\frac{1}{m} \sum_{i=1}^{m}\left[\log D\left(x_{i}\right)+\log \left(1-D\left(G\left(z_{i}\right)\right)\right)\right]$
  • Update generator:
    • Sample a mini-batch of noise samples $z_{1}, \ldots, z_{m}$
    • Update parameters of $G$ by stochastic gradient ascent on $\frac{1}{m} \sum_{i=1}^{m} \log D\left(G\left(z_{i}\right)\right)$
  • Repeat until happy with results (a training sketch follows below)
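
A minimal PyTorch sketch of this alternating loop. `G`, `D`, `real_loader`, and `z_dim` are assumed to be given, `D` is assumed to end in a sigmoid, and the generator step ascends on $\log D(G(z))$ as above:

```python
import torch
import torch.nn as nn

def train_gan(G, D, real_loader, z_dim, k=1, epochs=1, lr=2e-4):
    """Alternate k discriminator steps with one generator step."""
    bce = nn.BCELoss()  # assumes D outputs probabilities in (0, 1)
    opt_D = torch.optim.Adam(D.parameters(), lr=lr)
    opt_G = torch.optim.Adam(G.parameters(), lr=lr)
    for _ in range(epochs):
        for x_real in real_loader:  # assumed to yield batches of real samples
            m = x_real.size(0)
            ones, zeros = torch.ones(m, 1), torch.zeros(m, 1)
            for _ in range(k):
                # Discriminator: push D(x) -> 1 and D(G(z)) -> 0.
                z = torch.randn(m, z_dim)
                loss_D = bce(D(x_real), ones) + bce(D(G(z).detach()), zeros)
                opt_D.zero_grad()
                loss_D.backward()
                opt_D.step()
            # Generator: push D(G(z)) -> 1, i.e., maximize log D(G(z)).
            z = torch.randn(m, z_dim)
            loss_G = bce(D(G(z)), ones)
            opt_G.zero_grad()
            loss_G.backward()
            opt_G.step()
```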

  • Update discriminator: push $D\left(x_{\text {data }}\right)$ close to 1 and $D(G(z))$ close to 0

    • The generator is a “black box” to the discriminator
    • The generator is exposed to real data only via the output of the discriminator (and its gradients)

image-20211210113852972

  • Test time – the discriminator is discarded

image-20211210113926463

3.7 Original GAN results

  • Samples from the original GAN are rather blurry, since blurry samples are harder for the discriminator to classify.

image-20211210114030461

3.8 Problems with GAN training

  • Stability

    • Parameters can oscillate or diverge, generator loss does not correlate with sample quality
    • Behavior very sensitive to hyperparameter selection
  • Mode collapse

    • Generator ends up modeling only a small subset of the training data
    • It imitates only a few modes and fails to cover the truly multi-modal data distribution

image-20211210114206921

3.9 DCGAN

  • Early, influential convolutional architecture for the generator
    • Uses convolutions without pooling; strided convolutions take its place.

image-20211210114305737

  • Discriminator architecture (empirically determined to give best training stability):
    • Don't use pooling, only strided convolutions
    • Use Leaky ReLU activations (sparse gradients cause problems for training)
    • Use only one FC layer before the softmax output
    • Use batch normalization after most layers (in the generator also)
      • This reduces sensitivity to hyperparameter choices.
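
A sketch of a DCGAN-style generator following these guidelines (strided transposed convolutions instead of pooling, batch norm after most layers); the channel counts and 64×64 RGB output are illustrative:

```python
import torch.nn as nn

def dcgan_generator(z_dim=100, ch=64):
    """Maps noise of shape (batch, z_dim, 1, 1) to a 64x64 RGB image."""
    return nn.Sequential(
        nn.ConvTranspose2d(z_dim, ch * 8, 4, 1, 0, bias=False),  # 1x1 -> 4x4
        nn.BatchNorm2d(ch * 8), nn.ReLU(True),
        nn.ConvTranspose2d(ch * 8, ch * 4, 4, 2, 1, bias=False),  # 4 -> 8
        nn.BatchNorm2d(ch * 4), nn.ReLU(True),
        nn.ConvTranspose2d(ch * 4, ch * 2, 4, 2, 1, bias=False),  # 8 -> 16
        nn.BatchNorm2d(ch * 2), nn.ReLU(True),
        nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1, bias=False),      # 16 -> 32
        nn.BatchNorm2d(ch), nn.ReLU(True),
        nn.ConvTranspose2d(ch, 3, 4, 2, 1, bias=False),           # 32 -> 64
        nn.Tanh(),
    )
```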

3.10 DCGAN results

  • Interpolation between different points in the z space
    • i.e., the latent space is continuous.

image-20220201123318837

  • Vector arithmetic in the z space

image-20211210114630443

  • Pose transformation by adding a “turn” vector

image-20211210114740203

4. Conditional generation

  • To condition the generation of samples on discrete side information (label) $y$, we need to add $y$ as an input to both generator and discriminator
    • i.e., add the class label as an extra constraint (see the sketch below)

image-20211210114953302
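
A minimal sketch of feeding the label to both networks, assuming a one-hot encoding concatenated with the input vector:

```python
import torch
import torch.nn.functional as F

def with_label(v, y, num_classes):
    """Append a one-hot label y (integer class indices) to an input vector v
    (noise for G, a flattened sample for D), so both are conditioned on y."""
    return torch.cat([v, F.one_hot(y, num_classes).float()], dim=1)
```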

4.1 BigGAN

  • Class-conditional generation of ImageNet images at up to $512 \times 512$ resolution

image-20211210115037569

  • The z space is truncated: sampling codes only from a restricted part of the distribution avoids the blur caused by low-probability regions and improves sample quality.
  • But truncation can also reduce the variety of samples, so a trade-off is needed.

5. Image-to-image translation

image-20211210115403644

  • Produce modified image $y$ conditioned on input image $x$
    (note change of notation)
    • Generator receives $x$ as input
    • Discriminator receives an $x, y$ pair and has to decide whether it is real or fake

image-20211210115629713

  • The input image serves as a reference for the discriminator, adding a condition to its judgment.
    • E.g., we want the generated shoe to keep the shape of the input sketch.

5.1 Translating between maps and aerial photos

image-20211210115821902

  • Day to night

image-20211210120013228

  • Edges to photos

image-20211210120028070

5.2 Unpaired image-to-image translation

  • Sometimes paired samples are simply not available.

  • Given two unordered image collections $X$ and $Y$, learn to "translate" an image from one into the other and vice versa

image-20211210120218174

image-20211210120245504

5.3 CycleGAN

  • Given: domains $X$ and $Y$
    • We want to translate $X$ into $Y$, and the inverse translation should recover the original image.
    • This constrains the generated $Y$ to keep a shape similar to $X$.
  • Train two generators $F$ and $G$ and two discriminators $D_{X}$ and $D_{Y}$
    • $G$ translates from $X$ to $Y$; $F$ translates from $Y$ to $X$
    • $D_{X}$ recognizes images from $X, D_{Y}$ from $Y$
    • Cycle consistency: we want $F(G(x)) \approx x$ and $G(F(y)) \approx y$
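
A minimal PyTorch sketch of the cycle-consistency term; the L1 distance and the weight `lam` follow common practice for CycleGAN but are assumptions here:

```python
import torch.nn.functional as F

def cycle_consistency_loss(F_net, G_net, x, y, lam=10.0):
    """Penalize F(G(x)) differing from x and G(F(y)) differing from y."""
    return lam * (F.l1_loss(F_net(G_net(x)), x) +
                  F.l1_loss(G_net(F_net(y)), y))
```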

image-20211210120508806

  • Illustration of cycle consistency:

image-20211210120758130

  • Translation between maps and aerial photos

image-20211210120822260

  • Tasks for which paired data is unavailable

image-20211210120854718

5.4 CycleGAN: Limitations

  • Cannot handle shape changes (e.g., dog to cat)

  • Can get confused on images outside of the training domains (e.g., horse with rider)

    • It cannot fit data outside the training domains.
  • Cannot close the gap with paired translation methods

5.5 Multimodal image-to-image translation

5.5.1 Human generation conditioned on pose

image-20211210121344553
