神刀安全网

Implementing a Convolutional Neural Network (CNN) in Objective-C

Introduction


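The previous article covered an Objective-C implementation of softmax for basic training on the MNIST data, but its accuracy was only 90%. It closed by noting that adding a CNN could raise the accuracy. So what is a CNN?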

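A convolutional neural network (CNN) is a feed-forward neural network whose artificial neurons respond to the surrounding units within a limited region of coverage (the receptive field). It performs very well on large-scale image processing.
A CNN consists of one or more convolutional layers topped by fully connected layers (corresponding to a classic neural network), together with tied weights and pooling layers. This structure lets the network exploit the two-dimensional structure of the input data. Compared with other deep learning architectures, CNNs can give better results in image and speech recognition. The model can also be trained with the back-propagation algorithm. Compared with other deep feed-forward networks, a CNN has fewer parameters to estimate, which makes it an attractive deep learning architecture.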


Next, I introduce the convolutional neural network I implemented in Objective-C.

Principles


The core of a convolutional neural network lies in three ideas: local perception, weight sharing, and pooling.

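  • Local perception: for a complete image, a perceptron captures only local information, so each neuron connects to a small patch instead of the whole image, which reduces the number of trainable parameters. For example, scanning a 1000*1000 image with a 10*10 perceptron over every position needs only 991*991 neurons (1000 - 10 + 1 = 991 positions per side).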
[Figure: local perception]

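  • Weight sharing: the units produced by the same perceptron have the same function and structure and are interchangeable, so they can all share one set of weights, which sharply reduces the number of trainable parameters. In the example above, only 10*10 = 100 parameters need to be trained.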
[Figure: weight sharing]

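  • Pooling: also known as down-sampling. Convolving the 1000×1000 image above with a 10×10 kernel gives a 991×991 feature map. With a 2×2 pooling size, each small block of 4 points outputs its largest value, and the result is a 496×496 feature map (991/2 rounded up; an implementation that drops the incomplete last row and column gets 495×495 instead).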
[Figure: pooling]
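The numbers in the three points above can be checked with a little arithmetic. The following is a minimal sketch in plain C; the helper names (conv_out_size, pool_out_size, shared_params, unshared_params) are hypothetical and are not part of the author's code. It assumes stride-1 convolution without padding and pooling that rounds up at the border.

```c
/* Side length of the feature map after a stride-1 convolution without padding. */
int conv_out_size(int input, int kernel) {
    return input - kernel + 1;
}

/* Side length after non-overlapping pooling, keeping a partial final window
   (rounding up). This is the assumption that turns 991 into 496. */
int pool_out_size(int input, int window) {
    return (input + window - 1) / window;
}

/* Trainable weights when every position shares one kernel. */
long long shared_params(int kernel) {
    return (long long)kernel * kernel;
}

/* Trainable weights if each of the out*out neurons had its own kernel. */
long long unshared_params(int out, int kernel) {
    return (long long)out * out * kernel * kernel;
}
```

For the 1000*1000 image and the 10*10 perceptron, conv_out_size(1000, 10) gives 991 and shared_params(10) gives 100, while unshared_params(991, 10) gives 98,208,100, which shows how much weight sharing saves.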

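The feed-forward pass of a CNN mainly consists of convolution, sampling (pooling), rasterization (full connection), and the perceptron (activation).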

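  • Convolution: implements local perception and weight sharing on the image. The figure below shows a 3×3 convolution kernel being convolved over a 5×5 image. Each convolution is a way of extracting a feature; like a sieve, it filters out the parts of the image that match its pattern.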
[Figure: convolution]

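The computation uses the kernel [1,0,1,0,1,0,1,0,1] shown in the figure, with each term written as image value * kernel value.
The first 4 = 1*1+1*0+1*1+0*0+1*1+1*0+0*1+0*0+1*1.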
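The sum above can be reproduced with a direct loop. The following is a minimal sketch in plain C, not the author's vDSP-based code. The function name conv2d_valid and the arrays example_image and example_kernel are hypothetical. The figure is not preserved here, so the full 5×5 image is an assumption: only its top-left 3×3 patch can be read off the sum above, and the remaining values were chosen for illustration.

```c
/* Assumed 5x5 input; only the top-left 3x3 patch is recoverable from the text. */
const double example_image[25] = {
    1, 1, 1, 0, 0,
    0, 1, 1, 1, 0,
    0, 0, 1, 1, 1,
    0, 0, 1, 1, 0,
    0, 1, 1, 0, 0
};

/* The 3x3 kernel [1,0,1,0,1,0,1,0,1] from the text, in row-major order. */
const double example_kernel[9] = {
    1, 0, 1,
    0, 1, 0,
    1, 0, 1
};

/* Slide the kernel over the image (stride 1, no padding) and write the sum of
   element-wise products for each position into out, which must hold
   (rows - krows + 1) * (cols - kcols + 1) values. */
void conv2d_valid(const double *img, int rows, int cols,
                  const double *ker, int krows, int kcols,
                  double *out)
{
    int out_rows = rows - krows + 1;
    int out_cols = cols - kcols + 1;
    for (int i = 0; i < out_rows; i++) {
        for (int j = 0; j < out_cols; j++) {
            double sum = 0.0;
            for (int p = 0; p < krows; p++) {
                for (int q = 0; q < kcols; q++) {
                    sum += img[(i + p) * cols + (j + q)] * ker[p * kcols + q];
                }
            }
            out[i * out_cols + j] = sum;
        }
    }
}
```

Calling conv2d_valid(example_image, 5, 5, example_kernel, 3, 3, out) fills a 3×3 output whose first element is the 4 computed above.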

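  • Pooling: max pooling was introduced above; there are also mean pooling (taking the mean of each small block), Gaussian pooling, trainable pooling, and others.

  • Rasterization: arranges the sampled feature maps into a single vector.

  • Perceptron: common activation functions include ReLU, tanh, and sigmoid. Many papers analyze their formulas, strengths, and weaknesses, so I will not go into them here.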
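The pooling and activation steps listed above can be sketched in a few lines. This is a minimal plain-C illustration with hypothetical names (pool2x2, relu_inplace), assuming a single feature map stored row-major with even side lengths. It is not the author's implementation.

```c
/* 2x2 non-overlapping pooling over a rows x cols map (both even).
   use_max != 0 selects max pooling, otherwise mean pooling.
   out must hold (rows / 2) * (cols / 2) values. */
void pool2x2(const double *in, int rows, int cols, int use_max, double *out)
{
    int out_cols = cols / 2;
    for (int m = 0; m < rows / 2; m++) {
        for (int n = 0; n < out_cols; n++) {
            double a = in[(2 * m) * cols + 2 * n];
            double b = in[(2 * m) * cols + 2 * n + 1];
            double c = in[(2 * m + 1) * cols + 2 * n];
            double d = in[(2 * m + 1) * cols + 2 * n + 1];
            double mx = a;
            if (b > mx) mx = b;
            if (c > mx) mx = c;
            if (d > mx) mx = d;
            out[m * out_cols + n] = use_max ? mx : (a + b + c + d) / 4.0;
        }
    }
}

/* ReLU applied in place: negative values become 0. */
void relu_inplace(double *x, int n)
{
    for (int i = 0; i < n; i++) {
        if (x[i] < 0.0) x[i] = 0.0;
    }
}
```

With this layout, rasterization needs no extra work: feature maps stored one after another in a single row-major buffer already form the vector that the fully connected layer consumes.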

The back-propagation updates of a CNN will be explained in detail another time; here are just a few formulas:

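  • Pooling: when the loss is back-propagated, max pooling passes each residual to the position that held the maximum in the feed-forward pass and fills the other 3 positions with 0; mean pooling spreads each residual evenly over the 4 positions.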

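  • Convolution: the parameter formulas are given below, where rot180 rotates a matrix by 180 degrees; Oq is the output of the pooling layer connected in front of this "neural hub"; and the gradient for the bias is the sum of all elements of Δp.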

    [Formula: parameter update]

    The loss propagation formula is as follows:

[Formula: loss propagation]
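The two pooling rules described above can be sketched as follows. This is a minimal plain-C illustration with hypothetical names (mean_pool_backward, max_pool_backward), assuming 2×2 non-overlapping windows on a single row-major map. Ties in max pooling go to the first maximum found, which is an arbitrary choice.

```c
/* Mean pooling backward: each residual is split evenly over its 2x2 window.
   grad_in is a (2 * out_rows) x (2 * out_cols) map. */
void mean_pool_backward(const double *grad_out, int out_rows, int out_cols,
                        double *grad_in)
{
    int in_cols = out_cols * 2;
    for (int m = 0; m < out_rows; m++) {
        for (int n = 0; n < out_cols; n++) {
            double share = grad_out[m * out_cols + n] / 4.0;
            grad_in[(2 * m) * in_cols + 2 * n] = share;
            grad_in[(2 * m) * in_cols + 2 * n + 1] = share;
            grad_in[(2 * m + 1) * in_cols + 2 * n] = share;
            grad_in[(2 * m + 1) * in_cols + 2 * n + 1] = share;
        }
    }
}

/* Max pooling backward: the residual goes to the position that held the
   maximum in the forward input fwd_in; the other 3 positions get 0. */
void max_pool_backward(const double *fwd_in, const double *grad_out,
                       int out_rows, int out_cols, double *grad_in)
{
    int in_cols = out_cols * 2;
    for (int m = 0; m < out_rows; m++) {
        for (int n = 0; n < out_cols; n++) {
            int best = (2 * m) * in_cols + 2 * n;
            for (int dr = 0; dr < 2; dr++) {
                for (int dc = 0; dc < 2; dc++) {
                    int idx = (2 * m + dr) * in_cols + 2 * n + dc;
                    grad_in[idx] = 0.0;
                    if (fwd_in[idx] > fwd_in[best]) best = idx;
                }
            }
            grad_in[best] = grad_out[m * out_cols + n];
        }
    }
}
```

For a forward window {1, 5, 3, 2} and a residual of 8, the mean rule yields {2, 2, 2, 2} and the max rule yields {0, 8, 0, 0}.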

CNN Implementation in Objective-C


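The above was a brief introduction to CNNs; now for the concrete implementation.
First, the earlier Softmax implementation needs code that back-propagates the loss into the CNN. The resulting CNN+Softmax update is as follows: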

    - (void)updateModel:(double *)index currentPos:(int)pos {
        for (int i = 0; i < _kType; i++) {
            // delta = (1 if i is the label, else 0) - predicted probability
            double delta;
            if (i != _randomY[pos]) {
                delta = 0.0 - index[i];
            }
            else
            {
                delta = 1.0 - index[i];
            }

            _bias[i] += _descentRate * delta;
            double loss = _descentRate * delta / _randSize;
            double *decay = malloc(sizeof(double) * _dim);
            vDSP_vsmulD(_randomX[pos], 1, &loss, decay, 1, _dim);
            // loss passed back to the CNN; backPropagation: frees this buffer
            double *backLoss = malloc(sizeof(double) * _dim);
            vDSP_vsmulD((_theta + i * _dim), 1, &loss, backLoss, 1, _dim);
            [_cnn backPropagation:backLoss];
            vDSP_vaddD((_theta + i * _dim), 1, decay, 1, (_theta + i * _dim), 1, _dim);
            if (decay != NULL) {
                free(decay);
                decay = NULL;
            }
        }
    }
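Stripped of the vDSP calls, the per-class step in updateModel comes down to a few scalar operations. The following is a minimal plain-C sketch of that step. The function softmax_class_update and its parameter names are hypothetical stand-ins for the instance variables above (rate for _descentRate, batch for _randSize), and it is not a drop-in replacement for the method.

```c
/* One update for class i of the softmax layer.
   prob is the predicted probability of class i, is_target is 1 when i is the
   label. back_loss receives the signal handed to the preceding (CNN) layer,
   computed from the weights before they are updated, as in the listing. */
void softmax_class_update(double *theta_i, double *bias_i,
                          const double *x, int dim,
                          double prob, int is_target,
                          double rate, int batch,
                          double *back_loss)
{
    double delta = (is_target ? 1.0 : 0.0) - prob;
    double scale = rate * delta / batch;
    *bias_i += rate * delta;
    for (int d = 0; d < dim; d++) {
        back_loss[d] = theta_i[d] * scale;
        theta_i[d] += x[d] * scale;
    }
}
```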

The main CNN implementation is as follows:

    //
    //  MLCnn.m
    //  MNIST
    //
    //  Created by Jiao Liu on 9/28/16.
    //  Copyright © 2016 ChangHong. All rights reserved.
    //

    #import "MLCnn.h"

    @implementation MLCnn

    // Normal random number (polar Box-Muller), redrawn until it lies within
    // two standard deviations of the mean.
    + (double)truncated_normal:(double)mean dev:(double)stddev {
        double outP = 0.0;
        do {
            static int hasSpare = 0;
            static double spare;
            if (hasSpare) {
                hasSpare = 0;
                outP = mean + stddev * spare;
                continue;
            }

            hasSpare = 1;
            static double u,v,s;
            do {
                u = (rand() / ((double) RAND_MAX)) * 2.0 - 1.0;
                v = (rand() / ((double) RAND_MAX)) * 2.0 - 1.0;
                s = u * u + v * v;
            } while ((s >= 1.0) || (s == 0.0));
            s = sqrt(-2.0 * log(s) / s);
            spare = v * s;
            outP = mean + stddev * u * s;
        } while (fabs(outP - mean) > 2 * stddev); // measure the distance from the mean
        return outP;
    }

    // In-place ReLU: x = max(x, 0).
    + (double *)relu:(double *)x size:(int)size {
        double *zero = [MLCnn fillVector:0.0f size:size];
        vDSP_vmaxD(x, 1, zero, 1, x, 1, size);
        if (zero != NULL) {
            free(zero);
            zero = NULL;
        }
        return x;
    }

    // Allocates a vector of the given size filled with num; the caller frees it.
    + (double *)fillVector:(double)num size:(int)size {
        double *outP = malloc(sizeof(double) * size);
        vDSP_vfillD(&num, outP, 1, size);
        return outP;
    }

    // 2*2 max pooling; stride[0] is the size of one feature map, stride[1] its width.
    + (double)max_pool:(double *)input dim:(int)dim row:(int)row col:(int)col stride:(NSArray *)stride {
        double maxV = input[dim * [stride[0] intValue] + row * 2 * [stride[1] intValue] + col * 2];
        maxV = MAX(maxV, input[dim * [stride[0] intValue] + (row * 2 + 1) * [stride[1] intValue] + col * 2]);
        maxV = MAX(maxV, input[dim * [stride[0] intValue] + row * 2 * [stride[1] intValue] + col * 2 + 1]);
        maxV = MAX(maxV, input[dim * [stride[0] intValue] + (row * 2 + 1) * [stride[1] intValue] + col * 2 + 1]);
        return maxV;
    }

    // 2*2 mean pooling with the same layout as max_pool.
    + (double)mean_pool:(double *)input dim:(int)dim row:(int)row col:(int)col stride:(NSArray *)stride {
        double sum = input[dim * [stride[0] intValue] + row * 2 * [stride[1] intValue] + col * 2];
        sum += input[dim * [stride[0] intValue] + (row * 2 + 1) * [stride[1] intValue] + col * 2];
        sum += input[dim * [stride[0] intValue] + row * 2 * [stride[1] intValue] + col * 2 + 1];
        sum += input[dim * [stride[0] intValue] + (row * 2 + 1) * [stride[1] intValue] + col * 2 + 1];
        return sum / 4;
    }

    // Plain "valid" 2-D convolution of an NR*NC input with a P*Q filter.
    + (void)conv_2d:(double *)input inputRow:(int)NR inputCol:(int)NC filter:(double *)filter output:(double *)output filterRow:(int)P filterCol:(int)Q {
        int outRow = NR - P + 1;
        int outCol = NC - Q + 1; // columns depend on NC, not NR
        for (int i = 0; i < outRow; i++) {
            for (int j = 0; j < outCol; j++) {
                double sum = 0;
                for (int k = 0; k < P; k++) {
                    // accumulate the dot product of each filter row (input row width is NC)
                    double rowSum = 0;
                    vDSP_dotprD((input + (i + k) * NC + j), 1, (filter + k * Q), 1, &rowSum, Q);
                    sum += rowSum;
                }
                output[i * outCol + j] = sum;
            }
        }
    }

    + (double *)weight_init:(int)size {
        double *outP = malloc(sizeof(double) * size);
        for (int i = 0; i < size; i++) {
            outP[i] = [MLCnn truncated_normal:0 dev:0.1];
        }
        return outP;
    }

    + (double *)bias_init:(int)size {
        return [MLCnn fillVector:0.1f size:size];
    }

    # pragma mark - CNN Main

    // filters: array of [filterRow, filterCol, outputChannels] per convolution layer.
    - (id)initWithFilters:(NSArray *)filters fullConnectSize:(int)size row:(int)dimRow col:(int)dimCol keepRate:(double)rate {
        self = [super init];
        if (self) {
            _filters = filters;
            _connectSize = size;
            _numOfFilter = (int)[filters count];
            _dimRow = dimRow;
            _dimCol = dimCol;
            _keepProb = rate;
            // one slot per convolution layer plus one for the fully connected layer
            _weight = malloc(sizeof(double *) * (_numOfFilter + 1));
            _bias = malloc(sizeof(double *) * (_numOfFilter + 1));
            _filteredImage = malloc(sizeof(double *) * (_numOfFilter + 1));
            _reluFlag = malloc(sizeof(double *) * (_numOfFilter + 1));
            _dropoutMask = malloc(sizeof(double) * (_connectSize));
            int preDim = 1;
            int row = dimRow;
            int col = dimCol;
            for (int i = 0; i < _numOfFilter; i++) {
                _weight[i] = [MLCnn weight_init:[_filters[i][0] intValue] * [_filters[i][1] intValue] * [_filters[i][2] intValue] * preDim];
                _bias[i] = [MLCnn bias_init:[_filters[i][2] intValue]];
                // convolution trims the border, then 2*2 pooling halves the size
                row = (row - ([_filters[i][0] intValue] / 2) * 2) / 2;
                col = (col - ([_filters[i][1] intValue] / 2) * 2) / 2;
                preDim = [_filters[i][2] intValue];
                _filteredImage[i] = NULL;
                _reluFlag[i] = NULL;
            }
            _weight[_numOfFilter] = [MLCnn weight_init:row * col * preDim * _connectSize];
            _bias[_numOfFilter] = [MLCnn bias_init:_connectSize];
            _filteredImage[_numOfFilter] = NULL;
            _reluFlag[_numOfFilter] = NULL;
            _outRow = row;
            _outCol = col;
        }
        return self;
    }

    - (void)dealloc {
        if (_weight != NULL) {
            for (int i = 0; i < _numOfFilter + 1; i++) {
                free(_weight[i]);
                _weight[i] = NULL;
            }
            free(_weight);
            _weight = NULL;
        }
        if (_bias != NULL) {
            for (int i = 0; i < _numOfFilter + 1; i++) {
                free(_bias[i]);
                _bias[i] = NULL;
            }
            free(_bias);
            _bias = NULL;
        }
        if (_filteredImage != NULL) {
            // index 0 is the caller's input image, so it is not freed here
            for (int i = 1; i < _numOfFilter + 1; i++) {
                free(_filteredImage[i]);
                _filteredImage[i] = NULL;
            }
            free(_filteredImage);
            _filteredImage = NULL;
        }
        if (_reluFlag != NULL) {
            for (int i = 0; i < _numOfFilter + 1; i++) {
                free(_reluFlag[i]);
                _reluFlag[i] = NULL;
            }
            free(_reluFlag);
            _reluFlag = NULL;
        }
        if (_dropoutMask != NULL) {
            free(_dropoutMask);
            _dropoutMask = NULL;
        }
    }

    // Feed-forward pass: (convolve -> relu -> 2*2 mean pooling) per filter,
    // then full connection, relu and dropout.
    - (double *)filterImage:(double *)image state:(BOOL)isTraining {
        if (_numOfFilter == 0) {
            return image;
        }

        int preDim = 1;
        int row = _dimRow;
        int col = _dimCol;
        _filteredImage[0] = image;
        for (int i = 0; i < _numOfFilter; i++) {
            double *conv = [MLCnn fillVector:0.0f size:row * col * [_filters[i][2] intValue]];
            // convolve
            for (int k = 0; k < [_filters[i][2] intValue]; k++) {
                double *inner = malloc(sizeof(double) * row * col);
                for (int m = 0; m < preDim; m++) {
                    vDSP_imgfirD((_filteredImage[i] + m * row * col), row, col, (_weight[i] + k * [_filters[i][0] intValue] * [_filters[i][1] intValue] * preDim + m * [_filters[i][0] intValue] * [_filters[i][1] intValue]), inner, [_filters[i][0] intValue], [_filters[i][1] intValue]);
                    vDSP_vaddD((conv + k * row * col), 1, inner, 1, (conv + k * row * col), 1, row * col);
                }
                vDSP_vsaddD((conv + k * row * col), 1, &_bias[i][k], (conv + k * row * col), 1, row * col);
                if (inner != NULL) {
                    free(inner);
                    inner = NULL;
                }
            }

            // keep only the valid (non-border) part of the convolution
            int strideRow = [_filters[i][0] intValue] / 2;
            int strideCol = [_filters[i][1] intValue] / 2;
            row -= strideRow * 2;
            col -= strideCol * 2;
            if (_reluFlag[i] != NULL) {
                free(_reluFlag[i]);
                _reluFlag[i] = NULL;
            }
            _reluFlag[i] = malloc(sizeof(double) * row * col * [_filters[i][2] intValue]);
            for (int k = 0; k < [_filters[i][2] intValue]; k++) {
                for (int r = 0; r < row; ++r)
                {
                    for (int c = 0; c < col; ++c)
                    {
                        _reluFlag[i][k * row *col + r * col + c] = conv[k * (row + strideRow * 2) * (col + strideCol * 2) + (r + strideRow) * (col + strideCol * 2) + c + strideCol];
                    }
                }
            }
            // relu
            _reluFlag[i] = [MLCnn relu:_reluFlag[i] size:row * col * [_filters[i][2] intValue]];

            // pooling 2*2
            if (_filteredImage[i+1] != NULL) {
                free(_filteredImage[i+1]);
                _filteredImage[i+1] = NULL;
            }
            _filteredImage[i+1] = malloc(sizeof(double) * row * col * [_filters[i][2] intValue] / 4);

            for (int k = 0; k < [_filters[i][2] intValue]; k++) {
                for (int m = 0; m < row / 2; m++) {
                    for (int n = 0; n < col / 2; n++) {
                        _filteredImage[i+1][k * row * col / 4 + m * col / 2 + n] = [MLCnn mean_pool:_reluFlag[i] dim:k row:m col:n stride:@[[NSNumber numberWithInt:row * col],[NSNumber numberWithInt:col]]];
                    }
                }
            }

            row /= 2;
            col /= 2;
            preDim = [_filters[i][2] intValue];

            if (conv != NULL) {
                free(conv);
                conv = NULL;
            }
        }

        // full connect
        if (_reluFlag[_numOfFilter] != NULL) {
            free(_reluFlag[_numOfFilter]);
            _reluFlag[_numOfFilter] = NULL;
        }
        _reluFlag[_numOfFilter] = malloc(sizeof(double) * _connectSize);
        vDSP_mmulD(_weight[_numOfFilter], 1, _filteredImage[_numOfFilter], 1, _reluFlag[_numOfFilter], 1, _connectSize, 1, row * col * preDim);
        vDSP_vaddD(_reluFlag[_numOfFilter], 1, _bias[_numOfFilter], 1, _reluFlag[_numOfFilter], 1, _connectSize);
        _reluFlag[_numOfFilter] = [MLCnn relu:_reluFlag[_numOfFilter] size:_connectSize];

        // dropOut
        if (isTraining) {
            for (int i = 0; i < _connectSize; i++) {
                if ((double)rand()/RAND_MAX > _keepProb) {
                    _dropoutMask[i] = 0;
                    _reluFlag[_numOfFilter][i] = 0;
                }
                else
                {
                    _dropoutMask[i] = 1;
                }
            }
        }
        else
        {
            vDSP_vsmulD(_reluFlag[_numOfFilter], 1, &_keepProb, _reluFlag[_numOfFilter], 1, _connectSize);
        }

        return _reluFlag[_numOfFilter];
    }

    // Back-propagates loss (length _connectSize) through the network and
    // updates weights and biases. Takes ownership of loss and frees it.
    - (void)backPropagation:(double *)loss {
        int row = _outRow;
        int col = _outCol;
        // dropOut
        vDSP_vmulD(loss, 1, _dropoutMask, 1, loss, 1, _connectSize);

        // deRelu
        for (int i = 0; i < _connectSize; i++) {
            if (_reluFlag[_numOfFilter][i] == 0) {
                loss[i] = 0;
            }
        }

        // update full-connect layer
        vDSP_vaddD(loss, 1, _bias[_numOfFilter], 1, _bias[_numOfFilter], 1, _connectSize);
        double *flayerLoss = malloc(sizeof(double) * row * col * [_filters[_numOfFilter - 1][2] intValue]);
        double *transWeight = malloc(sizeof(double) * row * col * [_filters[_numOfFilter - 1][2] intValue] * _connectSize);
        vDSP_mtransD(_weight[_numOfFilter], 1, transWeight, 1, row * col * [_filters[_numOfFilter - 1][2] intValue], _connectSize);
        vDSP_mmulD(transWeight, 1, loss, 1, flayerLoss, 1, row * col * [_filters[_numOfFilter - 1][2] intValue], 1, _connectSize);

        double *flayerWeight = malloc(sizeof(double) * row * col * [_filters[_numOfFilter - 1][2] intValue] * _connectSize);
        vDSP_mmulD(loss, 1, _filteredImage[_numOfFilter], 1, flayerWeight, 1, _connectSize, row * col * [_filters[_numOfFilter - 1][2] intValue], 1);
        vDSP_vaddD(_weight[_numOfFilter], 1, flayerWeight, 1, _weight[_numOfFilter], 1, row * col * [_filters[_numOfFilter - 1][2] intValue] * _connectSize);

        if (loss != NULL) {
            free(loss);
            loss = NULL;
        }
        if (flayerWeight != NULL) {
            free(flayerWeight);
            flayerWeight = NULL;
        }
        if (transWeight != NULL) {
            free(transWeight);
            transWeight = NULL;
        }

        // update Conv & pooling layer
        double *convBackLoss = flayerLoss;
        for (int i = _numOfFilter - 1; i >= 0; i--) {
            // unsampling (mean pooling: each residual is spread over 4 points)
            row *= 2;
            col *= 2;
            int preDim = i > 0 ? [_filters[i-1][2] intValue] : 1;
            double *unsample = malloc(sizeof(double) * row * col * [_filters[i][2] intValue]);
            for (int k = 0; k < [_filters[i][2] intValue]; k++) {
                for (int m = 0; m < row / 2; m++) {
                    for (int n = 0; n < col / 2; n++) {
                        unsample[k*row*col + m*2*col + n*2] = unsample[k*row*col + m*2*col + n*2 + 1] = unsample[k*row*col + (m*2+1)*col + n*2] = unsample[k*row*col + (m*2+1)*col + n*2 + 1] = convBackLoss[k*row*col/4 + m*col/2 + n] / 4;
                    }
                }
            }
            // deRelu
            for (int k = 0; k < row * col * [_filters[i][2] intValue]; k++) {
                if (_reluFlag[i][k] == 0) {
                    unsample[k] = 0;
                }
            }

            // update conv bias
            for (int k = 0; k < [_filters[i][2] intValue]; k++) {
                double biasLoss = 0;
                for (int m = 0; m < row / 2; m++) {
                    for (int n = 0; n < col / 2; n++) {
                        biasLoss += convBackLoss[k*row*col/4 + m*col/2 + n];
                    }
                }
                _bias[i][k] += biasLoss;
            }

            int strideRow = [_filters[i][0] intValue] / 2;
            int strideCol = [_filters[i][1] intValue] / 2;

            if (i > 0) { //if not the first layer calculate back loss
                if (convBackLoss != NULL) {
                    free(convBackLoss);
                    convBackLoss = NULL;
                }
                convBackLoss = [MLCnn fillVector:0.0f size:(row + strideRow * 2) * (col + strideCol * 2) * preDim];
                // zero-pad the residual back to the pre-convolution size
                double *curLoss = [MLCnn fillVector:0.0f size:(row + strideRow * 2) * (col + strideCol * 2) * [_filters[i][2] intValue]];
                for (int k = 0; k < [_filters[i][2] intValue]; k++) {
                    for (int p = 0; p < row; p++) {
                        for (int q = 0; q < col; q++) {
                            curLoss[k * (row + strideRow * 2) * (col + strideCol * 2) + (p + strideRow) * (col + strideCol * 2) + q + strideCol] = unsample[k * row * col + p * col + q];
                        }
                    }
                }

                // Δq′ = (∑ p∈C  Δp ∗f rot180(Θp)) ∘ ϕ′(Oq′)
                for (int k = 0; k < preDim; k++) {
                    double *inner = malloc(sizeof(double) * (row + strideRow * 2) * (col + strideCol * 2));
                    for (int m = 0; m < [_filters[i][2] intValue]; m++) {
                        double *reverseWeight = [MLCnn fillVector:0.0f size:[_filters[i][0] intValue] * [_filters[i][1] intValue]];
                        vDSP_vaddD(reverseWeight, 1, (_weight[i] + m * [_filters[i][0] intValue] * [_filters[i][1] intValue] * preDim + k * [_filters[i][0] intValue] * [_filters[i][1] intValue]), 1, reverseWeight, 1, [_filters[i][0] intValue] * [_filters[i][1] intValue]);
                        vDSP_vrvrsD(reverseWeight, 1, [_filters[i][0] intValue] * [_filters[i][1] intValue]);
                        vDSP_imgfirD((curLoss + m * (row + strideRow * 2) * (col + strideCol * 2)), row + strideRow * 2, col + strideCol * 2, reverseWeight, inner, [_filters[i][0] intValue], [_filters[i][1] intValue]);
                        vDSP_vaddD((convBackLoss + k * (row + strideRow * 2) * (col + strideCol * 2)), 1, inner, 1, (convBackLoss + k * (row + strideRow * 2) * (col + strideCol * 2)), 1, (row + strideRow * 2) * (col + strideCol * 2));
                        if (reverseWeight != NULL) {
                            free(reverseWeight);
                            reverseWeight = NULL;
                        }
                    }
                    if (inner != NULL) {
                        free(inner);
                        inner = NULL;
                    }
                }
                if (curLoss != NULL) {
                    free(curLoss);
                    curLoss = NULL;
                }
            }

            // update conv weight
            for (int k = 0; k < [_filters[i][2] intValue]; k++) {
    //            int strideRow = [_filters[i][0] intValue] / 2;
    //            int strideCol = [_filters[i][1] intValue] / 2;
    //            double *curLoss = malloc(sizeof(double) * (row - strideRow * 2) * (col - strideCol * 2));
    //            for (int p = 0; p < row - strideRow * 2; p++) {
    //                for (int q = 0; q < col - strideCol * 2; q++) {
    //                    curLoss[p * (col - strideCol * 2) + q] = unsample[k * row * col + (p + strideRow) * col + q + strideCol];
    //                }
    //            }
    //            vDSP_vrvrsD(curLoss, 1, (row - strideRow * 2) * (col - strideCol * 2));
                vDSP_vrvrsD((unsample + k * row * col), 1, row * col);

                for (int m = 0; m < preDim; m++) {
                    double *inner = malloc(sizeof(double) * (row + strideRow * 2) * (col + strideCol * 2));
                    vDSP_imgfirD((_filteredImage[i] + m * (row + strideRow * 2) * (col + strideCol * 2)), (row + strideRow * 2), (col + strideCol * 2), (unsample + k * row * col), inner, row, col);
                    double *weightLoss = malloc(sizeof(double) * [_filters[i][0] intValue] * [_filters[i][1] intValue]);
                    int P = row / 2;
                    int Q = col / 2;
                    for (int r = P; r <= (row + strideRow * 2) - P; ++r)
                    {
                        for (int c = Q; c <= (col + strideCol * 2) - Q; ++c)
                        {
                            weightLoss[(r-P)*[_filters[i][1] intValue] + (c-Q)] = inner[r*col + c];
                        }
                    }
    //                [MLCnn conv_2d:(_filteredImage[i] + m * (row + strideRow * 2) * (col + strideCol * 2)) inputRow:(row + strideRow * 2) inputCol:(col + strideCol * 2) filter:(unsample + k * row * col) output:weightLoss filterRow:row filterCol:col];
                    vDSP_vrvrsD(weightLoss, 1, [_filters[i][0] intValue] * [_filters[i][1] intValue]);
                    vDSP_vaddD((_weight[i] + k * [_filters[i][0] intValue] * [_filters[i][1] intValue] * preDim + m * [_filters[i][0] intValue] * [_filters[i][1] intValue]), 1, weightLoss, 1, (_weight[i] + k * [_filters[i][0] intValue] * [_filters[i][1] intValue] * preDim + m * [_filters[i][0] intValue] * [_filters[i][1] intValue]), 1, [_filters[i][0] intValue] * [_filters[i][1] intValue]);

                    if (weightLoss != NULL) {
                        free(weightLoss);
                        weightLoss = NULL;
                    }
                    if (inner != NULL) {
                        free(inner);
                        inner = NULL;
                    }
                }
            }

            row += strideRow * 2;
            col += strideCol * 2;
            if (unsample != NULL) {
                free(unsample);
                unsample = NULL;
            }

        }

        if (convBackLoss != NULL) {
            free(convBackLoss);
            convBackLoss = NULL;
        }
    }

    @end

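Here I chose ReLU as the activation function. The convolution kernel parameters are initialized with random numbers drawn from a normal distribution and restricted to its 95% interval (within two standard deviations of the mean). Pooling uses mean pooling, and a max pooling method is implemented as well.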

Finally, with convolution kernels of 5*5*10 and 5*5*20 and only 1000 iterations, one run produced the following output:

[Figure: training results]

The accuracy is clearly higher than with Softmax alone.
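For the 5*5*10 and 5*5*20 configuration, the feature-map sizes can be traced with the same arithmetic that initWithFilters uses. The following plain-C sketch assumes the standard 28×28 MNIST input; the article does not show the values actually passed for row and col, and the helper names (layer_out_size, flattened_size) are hypothetical.

```c
/* Size after one convolution + 2x2 pooling stage, mirroring initWithFilters:
   a k-wide filter trims (k / 2) * 2 from the side, then pooling halves it. */
int layer_out_size(int size, int kernel)
{
    return (size - (kernel / 2) * 2) / 2;
}

/* Length of the vector fed to the fully connected layer.
   Each filter is {filterRow, filterCol, outputChannels}. */
int flattened_size(int rows, int cols, const int filters[][3], int num_filters)
{
    int channels = 1;
    for (int i = 0; i < num_filters; i++) {
        rows = layer_out_size(rows, filters[i][0]);
        cols = layer_out_size(cols, filters[i][1]);
        channels = filters[i][2];
    }
    return rows * cols * channels;
}
```

Under that assumption the maps shrink from 28×28 to 12×12 after the first stage and to 4×4 after the second, so the fully connected layer receives 4 * 4 * 20 = 320 values.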

Conclusion


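That is a simple convolutional neural network implemented in Objective-C. If you are interested, download the code and try changing the convolution kernels, the number of iterations, and other parameters; you may well get a higher accuracy 😊.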
