
Convolutional Neural Networks

Tags
DeepLearning
Created
Jun 11, 2023 02:49 PM
Last Updated
Jul 30, 2023 09:49 AM
 
 

Convolution

  • Convolution kernel: a small matrix (3x3, 5x5)
  • Extracts features from the input image
  • Kernel function in image processing
    • Ridge detection, Sharpen, Box blur
  • Padding
    • Add zero values around the boundary of the input image
  • Stride
    • Step size by which the convolution kernel slides over the input
  • Sliding Window
    • MAC
      • Multiply-Accumulate operation (see the sketch after this list)
  • IM2COL
    • Transforms n-dimensional data into a 2D matrix
    • Makes the convolution a more efficient matrix operation (the sketch after this list shows the im2col + GEMM path)
  • GEMM
    • General Matrix-Matrix Multiplication
  • Pooling
    • Resize the feature map
    • Reduce the resolution of feature map
    • Max pooling, Average pooling
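
A minimal NumPy sketch of the ideas above, assuming a single-channel image; the function names conv2d and im2col are just illustrative. The sliding-window loop and the im2col + GEMM path produce the same feature map.

```python
import numpy as np

def conv2d(image, kernel, stride=1, padding=0):
    """Sliding-window convolution: each output pixel is one multiply-accumulate (MAC)."""
    if padding > 0:
        image = np.pad(image, padding, mode="constant", constant_values=0)
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(window * kernel)      # MAC over the window
    return out

def im2col(image, kh, kw, stride=1):
    """Unroll every sliding window into one column of a 2D matrix."""
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    cols = np.zeros((kh * kw, out_h * out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            cols[:, i * out_w + j] = window.ravel()
    return cols, out_h, out_w

image = np.random.rand(6, 6)
sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=float)  # classic sharpen kernel

naive = conv2d(image, sharpen, stride=1, padding=1)      # zero padding keeps the 6x6 size

cols, out_h, out_w = im2col(np.pad(image, 1), 3, 3)      # im2col on the padded image
gemm = (sharpen.ravel() @ cols).reshape(out_h, out_w)    # whole convolution as one GEMM

print(naive.shape, np.allclose(naive, gemm))             # (6, 6) True
```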
 

Fully Connected Layer

  • Flatten the 2D feature maps into a 1D vector
  • Every weight connects an output unit to one feature-map pixel
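
A minimal NumPy sketch of the flatten + fully connected step; the shapes here (an 8x4x4 feature map, 10 output units) are arbitrary examples.

```python
import numpy as np

feature_map = np.random.rand(8, 4, 4)     # 8 channels of 4x4 feature maps
x = feature_map.reshape(-1)               # flatten to a 128-dim vector

W = np.random.rand(10, x.size) * 0.01     # one weight per (output unit, input pixel) pair
b = np.zeros(10)
logits = W @ x + b                        # fully connected layer output
print(logits.shape)                       # (10,)
```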
 

Activation

  • sigmoid
  • tanh
  • ReLU
  • LeakyReLU
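
The four activations above, written out in NumPy for reference (alpha = 0.01 for LeakyReLU is just a common default, not stated in this post).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))       # squashes values into (0, 1)

def tanh(x):
    return np.tanh(x)                      # squashes into (-1, 1), zero-centered

def relu(x):
    return np.maximum(0.0, x)              # zero for negatives, identity for positives

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)   # small slope keeps negative gradients alive

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x), leaky_relu(x))
```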
 

Shallow CNN

  • Shallow Neural Network
  • Backpropagation in Shallow Neural Network
  • Max pooling backpropagation
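
A minimal sketch of max-pooling backpropagation, assuming a 2x2 window with stride 2: the upstream gradient is routed only to the position that held the maximum in each window; all other positions get zero.

```python
import numpy as np

def max_pool_backward(fmap, grad_out, size=2, stride=2):
    grad_in = np.zeros_like(fmap)
    for i in range(grad_out.shape[0]):
        for j in range(grad_out.shape[1]):
            window = fmap[i*stride:i*stride+size, j*stride:j*stride+size]
            mask = (window == window.max())            # 1 at the max position
            grad_in[i*stride:i*stride+size,
                    j*stride:j*stride+size] += mask * grad_out[i, j]
    return grad_in

fmap = np.array([[1., 3., 2., 0.],
                 [4., 2., 1., 5.],
                 [0., 1., 2., 2.],
                 [3., 0., 1., 4.]])
grad_out = np.ones((2, 2))                             # pretend upstream gradient
print(max_pool_backward(fmap, grad_out))
```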
 

Training

  • Feed-forward & backward passes → weights are updated to fit the training data
  • Gradient descent method (see the update-rule sketch after this list)
  • Optimizer
    • The goal of gradient descent is usually to minimize the loss function of a machine learning problem
    • SGD (Stochastic Gradient Descent)
      • Updates weights from the gradient of a single sample; mini-batch gradient descent (MSGD) updates per mini-batch
    • Momentum
      • A velocity term keeps the update moving along the weight's previous gradient direction
    • AdaGrad
      • Scales the learning rate by the accumulated sum of squared gradients
      • The sum only grows, so the effective learning rate shrinks and convergence becomes very slow…
    • RMS-prop
      • Uses an exponential moving average of squared gradients instead of the full sum
    • Adam
      • RMS-prop + Momentum
    • SGD is better than Adam?
      • Adaptive methods (Adam, RMS-prop) can generalize worse than non-adaptive methods (SGD, Momentum)
  • Overfitting
    • When the model is trained on the training dataset, it fits the training data domain
    • On the training dataset, the loss is very low
    • But when the model predicts on a new, unseen dataset, performance is poor
  • Underfitting
    • The model does not fit the training dataset well
    • Causes: too little data, inappropriate hyperparameters, …
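
Minimal update-rule sketches for the optimizers above (plain NumPy on a single parameter array; the hyperparameter values are common defaults, not taken from this post).

```python
import numpy as np

def sgd_step(w, grad, lr=0.01):
    return w - lr * grad                        # vanilla SGD: step against the gradient

def momentum_step(w, grad, v, lr=0.01, beta=0.9):
    v = beta * v + grad                         # velocity keeps the previous direction
    return w - lr * v, v

def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad                # Momentum-style first moment
    v = b2 * v + (1 - b2) * grad ** 2           # RMS-prop-style second moment
    m_hat = m / (1 - b1 ** t)                   # bias correction
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

w = np.array([1.0, -2.0])
grad = np.array([0.5, -0.3])
w, m, v = adam_step(w, grad, np.zeros_like(w), np.zeros_like(w), t=1)
print(w)
```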
 

Regularization

  • Overfitted models tend to have large weight values → add a regularization term to the loss function to penalize large weights
  • Keeps the model's weights small
  • Adds a penalty to the error function.
  • L1 Regularization
    • Penalty term is the sum of absolute values of weights.
  • L2 Regularization
    • Penalty term is the sum of squared values of weights.
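
A minimal sketch of adding L1 / L2 penalties to a loss; the lambda values below are arbitrary examples.

```python
import numpy as np

def regularized_loss(data_loss, weights, l1=0.0, l2=0.0):
    l1_penalty = l1 * np.sum(np.abs(weights))   # L1: sum of absolute weight values
    l2_penalty = l2 * np.sum(weights ** 2)      # L2: sum of squared weight values
    return data_loss + l1_penalty + l2_penalty

weights = np.array([0.5, -2.0, 3.0])
print(regularized_loss(data_loss=0.7, weights=weights, l2=1e-3))
```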
 

Dropout

  • Randomly drop some units (weights) during the training process
  • Prevents some weights from becoming biased toward large values
  • Dropout is not used in test (inference) mode
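
A minimal sketch of dropout in the common inverted form (rescale during training so nothing changes at inference); the drop probability is an arbitrary example.

```python
import numpy as np

def dropout(x, p=0.5, training=True):
    if not training:
        return x                               # dropout is disabled at test time
    mask = np.random.rand(*x.shape) > p        # keep each unit with probability 1 - p
    return x * mask / (1.0 - p)                # rescale to preserve the expected value

x = np.ones((2, 4))
print(dropout(x, p=0.5, training=True))
print(dropout(x, training=False))
```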
 

Batch normalization

  • Normalizes the input batch using its mean and variance so the output has a stable distribution (mean 0, std 1)
  • Provides stable input values before the activation function
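
A minimal batch-norm sketch for training mode only (at inference, running statistics would be used instead of the batch statistics).

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    mean = x.mean(axis=0)                      # per-feature mean over the batch
    var = x.var(axis=0)                        # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)    # normalized to mean 0, std 1
    return gamma * x_hat + beta                # learnable scale and shift

x = np.random.rand(32, 4) * 10 + 3             # a batch of 32 samples, 4 features
out = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0).round(3), out.std(axis=0).round(3))
```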