# 12 - Deep Learning: Core Principles and Algorithms of Neural Networks - Softmax

## softmax


### The softmax training process

The target labels are one-hot encoded: each label is a vector with a 1 at the index of the true class and 0 everywhere else.

With 1000 output nodes, the network can classify samples into 1000 categories.
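As an illustrative NumPy sketch (not part of the original code), softmax turns the output layer's raw scores into a probability distribution, and a one-hot label picks out the probability assigned to the true class:

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the result sums to 1.
    e = np.exp(z - np.max(z))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])   # raw outputs for 3 classes
probs = softmax(scores)              # a probability distribution

one_hot = np.array([1.0, 0.0, 0.0])  # true class is class 0
p_true = probs @ one_hot             # probability assigned to the true class
```

The predicted class is simply the index of the largest probability, which is what `T.argmax` computes in the layer code below.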

## Implementing the softmax layer

```python
class SoftmaxLayer(object):

    def __init__(self, n_in, n_out, p_dropout=0.0):
        self.n_in = n_in
        self.n_out = n_out
        self.p_dropout = p_dropout
        # Weights and biases are initialized to zero for the softmax layer.
        self.w = theano.shared(
            np.zeros((n_in, n_out), dtype=theano.config.floatX),
            name='w', borrow=True)
        self.b = theano.shared(
            np.zeros((n_out,), dtype=theano.config.floatX),
            name='b', borrow=True)
        self.params = [self.w, self.b]
```

```python
    def set_inpt(self, inpt, inpt_dropout, mini_batch_size):
        self.inpt = inpt.reshape((mini_batch_size, self.n_in))
        # At inference time, scale by (1 - p_dropout) to compensate for
        # the units that were dropped during training.
        self.output = softmax((1 - self.p_dropout) * T.dot(self.inpt, self.w) + self.b)
        self.y_out = T.argmax(self.output, axis=1)
        self.inpt_dropout = dropout_layer(
            inpt_dropout.reshape((mini_batch_size, self.n_in)), self.p_dropout)
        self.output_dropout = softmax(T.dot(self.inpt_dropout, self.w) + self.b)
```

```python
    def cost(self, net):
        "Return the log-likelihood cost."
        return -T.mean(T.log(self.output_dropout)[T.arange(net.y.shape[0]), net.y])
```
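The indexing expression `[T.arange(net.y.shape[0]), net.y]` picks out, for each example in the mini-batch, the predicted probability of its true class; the cost is the mean negative log of those probabilities. A NumPy sketch of the same computation (illustrative, not from the original code):

```python
import numpy as np

output = np.array([[0.7, 0.2, 0.1],   # predicted probabilities, one row
                   [0.1, 0.8, 0.1]])  # per example in the mini-batch
y = np.array([0, 1])                  # true class index for each example

# Same indexing trick as the Theano code: row i, column y[i].
p_true = output[np.arange(y.shape[0]), y]   # -> [0.7, 0.8]
cost = -np.mean(np.log(p_true))
```

The closer each true-class probability is to 1, the closer the cost is to 0, so minimizing this cost pushes the softmax output toward the one-hot labels.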

## Handwritten digit recognition with a convolutional neural network

• Fully connected network
• Add convolution and pooling layers
• Use the ReLU activation function
• Add dropout

### Implementation

```python
import pickle
import gzip
```

Changes needed under Python 3:

```
TypeError: 'float' object cannot be interpreted as an integer
```

Pass an `encoding` argument to `pickle.load` when reading the data file.

### The ReLU function

```python
def ReLU(z):
    return T.maximum(0.0, z)
```

### The data-loading function

```python
# Load the MNIST data
def load_data_shared(filename="./data/mnist.pkl.gz"):
    f = gzip.open(filename, 'rb')
    training_data, validation_data, test_data = pickle.load(f, encoding='bytes')
    f.close()

    def shared(data):
        # Place the data in shared variables so Theano can copy them
        # to the GPU, if one is available.
        shared_x = theano.shared(
            np.asarray(data[0], dtype=theano.config.floatX), borrow=True)
        shared_y = theano.shared(
            np.asarray(data[1], dtype=theano.config.floatX), borrow=True)
        return shared_x, T.cast(shared_y, "int32")

    return [shared(training_data), shared(validation_data), shared(test_data)]
```

The file is opened with gzip and read with pickle. The three return values are the training, validation, and test sets.

### How do we use it?

```python
if __name__ == '__main__':
    training_data, validation_data, test_data = load_data_shared()
    mini_batch_size = 10

    net = Network([
        FullyConnectedLayer(n_in=784, n_out=100),
        SoftmaxLayer(n_in=100, n_out=10)], mini_batch_size)

    net.SGD(training_data, 60, mini_batch_size, 0.1, validation_data, test_data)
```


### Adding a convolution and pooling layer

```python
    # Add a convolution + pooling layer
    net = Network([
        ConvPoolLayer(image_shape=(mini_batch_size, 1, 28, 28),
                      filter_shape=(20, 1, 5, 5),
                      poolsize=(2, 2)),
        FullyConnectedLayer(n_in=20*12*12, n_out=100),
        SoftmaxLayer(n_in=100, n_out=10)], mini_batch_size)
```

image_shape = (mini-batch size, number of input feature maps, image height, image width)

filter_shape = (number of filters, number of input feature maps, filter height, filter width)

The input to the softmax layer is the 100 outputs of the preceding fully connected layer; its 10 outputs correspond to the ten digit classes.
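The `n_in=20*12*12` of the fully connected layer follows from the shape arithmetic: a 28x28 image convolved with a 5x5 filter (valid convolution, stride 1) gives 28 - 5 + 1 = 24, and 2x2 max-pooling halves that to 12. A quick check in plain Python (illustrative, not part of the original code):

```python
def conv_pool_out(size, filter_size=5, pool=2):
    """Output spatial size after a valid convolution followed by
    non-overlapping pooling."""
    conv = size - filter_size + 1   # valid convolution, stride 1
    return conv // pool             # 2x2 max-pooling halves each dimension

side = conv_pool_out(28)            # 28 -> 24 -> 12
n_in = 20 * side * side             # 20 feature maps of 12x12
```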


### Adding a second convolution and pooling layer

```python
    # Add a second convolution + pooling layer
    net = Network([
        ConvPoolLayer(image_shape=(mini_batch_size, 1, 28, 28),
                      filter_shape=(20, 1, 5, 5),
                      poolsize=(2, 2)),
        ConvPoolLayer(image_shape=(mini_batch_size, 20, 12, 12),
                      filter_shape=(40, 20, 5, 5),
                      poolsize=(2, 2)),
        FullyConnectedLayer(n_in=40*4*4, n_out=100),
        SoftmaxLayer(n_in=100, n_out=10)], mini_batch_size)
```

• The (20, 12, 12) input passes through 40 filters of size (5, 5): the number of feature maps becomes 40, and each map shrinks to 12 - 5 + 1 = 8, giving (40, 8, 8). After 2x2 pooling this becomes (40, 4, 4), which is why the fully connected layer takes n_in = 40*4*4.
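The same arithmetic confirms `n_in` for the second stage (a quick check in plain Python, not part of the original code):

```python
# Second conv + pool stage: 12x12 input, 5x5 filter, 2x2 pooling.
conv = 12 - 5 + 1        # -> 8, so the feature maps are (40, 8, 8)
pooled = conv // 2       # -> 4, so after pooling they are (40, 4, 4)
n_in = 40 * pooled * pooled   # 640, the fully connected layer's input size
```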


### Replacing sigmoid with the ReLU activation function

```python
    net = Network([
        ConvPoolLayer(image_shape=(mini_batch_size, 1, 28, 28),
                      filter_shape=(20, 1, 5, 5),
                      poolsize=(2, 2),
                      activation_fn=ReLU),
        ConvPoolLayer(image_shape=(mini_batch_size, 20, 12, 12),
                      filter_shape=(40, 20, 5, 5),
                      poolsize=(2, 2),
                      activation_fn=ReLU),
        FullyConnectedLayer(n_in=40*4*4, n_out=100, activation_fn=ReLU),
        SoftmaxLayer(n_in=100, n_out=10)], mini_batch_size)
```


### Adding dropout

```python
    # Add dropout
    net = Network([
        ConvPoolLayer(image_shape=(mini_batch_size, 1, 28, 28),
                      filter_shape=(20, 1, 5, 5),
                      poolsize=(2, 2),
                      activation_fn=ReLU),
        ConvPoolLayer(image_shape=(mini_batch_size, 20, 12, 12),
                      filter_shape=(40, 20, 5, 5),
                      poolsize=(2, 2),
                      activation_fn=ReLU),
        FullyConnectedLayer(n_in=40*4*4, n_out=100,
                            activation_fn=ReLU, p_dropout=0.5),
        SoftmaxLayer(n_in=100, n_out=10)], mini_batch_size)
```
