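# A Fun Tour of Machine Learning Concepts: From Multivariate Fitting and Neural Networks to Deep Learning, for Anyone Who Is Curious

## What is Machine Learning: An Introduction to Its Concepts and Algorithms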

Machine learning is the idea that there are generic algorithms that can tell you something interesting about a set of data without you having to write any custom code specific to the problem. Instead of writing code, you feed data to the generic algorithm and it builds its own logic based on the data.

“Machine learning” is an umbrella term covering lots of these kinds of generic algorithms.

## House Price Estimation With Supervised Learning

### Let’s Write the Program

```python
def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood):
    price = 0

    # Around here, houses go for roughly 200 per square foot
    price_per_sqft = 200

    if neighborhood == "hipsterton":
        # Downtown costs a bit more
        price_per_sqft = 400
    elif neighborhood == "skid row":
        # The outskirts are cheaper
        price_per_sqft = 100

    # Unit price times size gives a base price
    price = price_per_sqft * sqft

    # Adjust a little based on the number of bedrooms
    if num_of_bedrooms == 0:
        # Places with no bedroom are a bit cheaper
        price = price - 20000
    else:
        # More bedrooms usually means more value
        price = price + (num_of_bedrooms * 1000)

    return price
```

```python
def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood):
    price = ...  # <computer, plz do some math for me>
    return price
```

```python
def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood):
    price = 0

    # a little pinch of this
    price += num_of_bedrooms * .841231951398213

    # and a big pinch of that
    price += sqft * 1231.1231231

    # maybe a handful of this
    price += neighborhood * 2.3242341421

    # and finally, just a little extra salt for good measure
    price += 201.23432095

    return price
```

### Weights

#### Step 1

```python
def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood):
    price = 0

    # a little pinch of this
    price += num_of_bedrooms * 1.0

    # and a big pinch of that
    price += sqft * 1.0

    # maybe a handful of this
    price += neighborhood * 1.0

    # and finally, just a little extra salt for good measure
    price += 1.0

    return price
```
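To make "start every weight at 1.0" concrete, here is a minimal sketch that scores a set of weights against known sale prices. The three houses, their prices, and the helper names `predict` and `mean_squared_error` are all made up for illustration; the only thing taken from the text is the idea of a weighted sum whose weights all start at 1.0.

```python
# Hypothetical training data: (num_of_bedrooms, sqft, neighborhood code) -> price.
# The numbers are invented for illustration only.
houses = [
    (3, 2000, 1),
    (2, 800, 2),
    (2, 850, 1),
]
prices = [250000, 300000, 150000]


def predict(weights, bias, features):
    """Weighted sum of the features plus a constant term."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def mean_squared_error(weights, bias):
    """Average squared gap between predicted and actual prices."""
    errors = [
        (predict(weights, bias, house) - price) ** 2
        for house, price in zip(houses, prices)
    ]
    return sum(errors) / len(errors)


# Step 1 from the text: every weight (and the constant) starts at 1.0.
initial_cost = mean_squared_error([1.0, 1.0, 1.0], 1.0)

# Any other guess is scored the same way; a lower cost means a better fit.
other_cost = mean_squared_error([1.0, 150.0, 1.0], 1.0)
```

On this toy data the second guess scores lower than the all-ones starting point, which is the kind of comparison a training procedure would repeat many times.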

### Mind Blowage Time

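• Over the past 40 years, research in many fields, including linguistics and translation, has shown that generic learning algorithms can perform very well, even though the algorithms themselves look like they make no sense.

• The function we just wrote is also "dumb": it has no idea what inputs such as the bedroom count (`num_of_bedrooms`) or the floor area (`sqft`) actually mean. It only knows that some numbers go in and a value comes out. That sets it clearly apart from programs written around specific business logic.

• You probably cannot guess which weights are best, and you may not even understand why the function is written the way it is, although you can prove that it works.

• If the `sqft` parameter were replaced by the brightness of the pixels in an image, the output would no longer be a price but, say, the type of the image. Change what the inputs mean, and the meaning of the output can change with them.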

## Neural Networks

### Introduction to Neural Networks: A First Look at the Model

```python
def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood):
    price = 0

    # a little pinch of this
    price += num_of_bedrooms * 0.123

    # and a big pinch of that
    price += sqft * 0.41

    # maybe a handful of this
    price += neighborhood * 0.57

    return price
```

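#### What Is a Neural Network?

• Any estimation function that takes a set of inputs and computes their weighted sum is called a neuron.

• Connecting many simple neurons together lets us build models that handle complex problems.

#### Giving Our Neural Network a Memory: Adding Some Context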
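The two points above can be sketched in a few lines. This is an illustrative toy, not the article's implementation: `neuron` and `tiny_network` are hypothetical names, and every weight except the first row (copied from the estimator shown earlier) is invented.

```python
def neuron(inputs, weights):
    """A single neuron as described above: a weighted sum of its inputs."""
    return sum(x * w for x, w in zip(inputs, weights))


def tiny_network(inputs, hidden_weights, output_weights):
    """Feed the same inputs to several neurons, then combine their
    outputs with one more neuron."""
    hidden = [neuron(inputs, w) for w in hidden_weights]
    return neuron(hidden, output_weights)


# Hypothetical weights; the first row reuses the values from the estimator above.
hidden_weights = [
    [0.123, 0.41, 0.57],
    [0.5, 0.25, 0.0],
]
output_weights = [1.0, 2.0]

house = [3, 2000, 1]  # num_of_bedrooms, sqft, neighborhood code
estimate = tiny_network(house, hidden_weights, output_weights)
```

Real networks usually pass each neuron's sum through a non-linear activation function; that step is left out here to stay close to the definition given in the text.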

``Robert Cohn was once middleweight boxi``

#### Generating a Story

``hjCTCnhoofeoxelif edElobe negnk e iohehasenoldndAmdaI ayio pe e h’e btentmuhgehi bcgdltt. gey heho grpiahe.  Ddelnss.eelaishaner” cot AAfhB ht ltny  ehbih a”on bhnte ectrsnae abeahngy  amo k ns aeo?cdse nh a taei.rairrhelardr er deffijha``

``hing soor ither. And the caraos, and the crowebel for figttier and ale the room of me? Streat was not to him Bill-stook of the momansbed mig out ust on the bull, out here. I been soms  inick stalling that aid.   “Hon’t me and acrained on .Hw’s don’t you for the roed,” In’s pair.”   “Alough marith him.”``

``He went over to the gate of the café. It was like a country bed.   “Do you know it’s been me.”   “Damned us,” Bill said.   “I was dangerous,” I said. “You were she did it and think I would a fine cape you,” I said.   “I can’t look strange in the cab.”   “You know I was this is though,” Brett said.   “It’s a fights no matter?”   “It makes to do it.”   “You make it?”   “Sit down,” I said. “I wish I wasn’t do a little with the man.”   “You found it.”   “I don’t know.”   “You see, I’m sorry of chatches,” Bill said. “You think it’s a friend off back and make you really drunk.”``

``There were a few people inside at the bar, and outside, alone, sat Harvey Stone. He had a pile of saucers in front of him, and he needed a shave.   “Sit down,” said Harvey, “I’ve been looking for you.”   “What’s the matter?”   “Nothing. Just looking for you.”   “Been out to the races?”   “No. Not since Sunday.”   “What do you hear from the States?”   “Nothing. Absolutely nothing.”   “What’s the matter?”``

### Super Mario: Training a Neural Network on Mario Levels

In 2015, Nintendo released Super Mario Maker™ for the Wii U gaming system.

```
--------------------------
--------------------------
--------------------------
#??#----------------------
--------------------------
--------------------------
--------------------------
-##------=--=----------==-
--------==--==--------===-
-------===--===------====-
------====--====----=====-
=========================-
```

• `-` represents empty space

• `=` represents a solid block

• `#` represents a breakable block

• `?` represents a coin block

```
-----------=
-------#---=
-------#---=
-------?---=
-------#---=
-----------=
-----------=
----------@=
----------@=
-----------=
-----------=
-----------=
---------PP=
---------PP=
----------==
---------===
--------====
-------=====
------======
-----=======
---=========
---=========
```
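The listing just above looks like a level turned on its side, with one line per column, so that a model reading the text line by line would see the level from left to right. A minimal sketch of that transformation, assuming the level is stored as a list of equal-length strings (`rotate_level` is a hypothetical helper name, not something from the original project):

```python
def rotate_level(rows):
    """Turn each column of the level (read top to bottom) into one line."""
    return ["".join(column) for column in zip(*rows)]


# A tiny made-up level using the same symbols as the legend above.
level = [
    "----",
    "-?--",
    "---=",
    "====",
]

rotated = rotate_level(level)
# rotated == ["---=", "-?-=", "---=", "--=="]
```

Each rotated line ends with the ground block, which matches the pattern of trailing `=` characters in the listing.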

``-------------------------- LL+<&=------P------------- -------- ---------------------T--#-- ----- -=--=-=------------=-&--T-------------- -------------------- --=------\$-=#-=-_ --------------=----=<---- -------b -``

```
--
-----------=
----------=
--------PP=
--------PP=
-----------=
-----------=
-----------=
-------?---=
-----------=
-----------=
```

```
--------PP=
--------PP=
----------=
----------=
----------=
---PPP=---=
---PPP=---=
----------=
```

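• It put a Lakitu (that little monster) up in mid-air, exactly as a real Mario level would.

• It learned that pipes should be planted in the ground.

• It did not leave the player with nowhere to go.

• The result looks very much in the style of the most traditional Mario games.

## Object Recognition in Images With Deep Learning

### The Solution is Convolution: Convolutional Neural Networks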

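• The ground is covered in grass and concrete

• There is a child

• The child is sitting on a toy horse

• The toy horse is on the grass

#### 4. Downsampling

Max pooling splits the feature matrix into 2x2 blocks, keeps the most interesting (largest) value in each block, and discards the other three values.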
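A minimal pure-Python sketch of 2x2 max pooling as described above. It assumes the feature map is a list of equal-length rows with even height and width; `max_pool_2x2` is a hypothetical name and the input values are invented.

```python
def max_pool_2x2(feature_map):
    """Keep the largest value from each non-overlapping 2x2 block.

    Assumes a list of equal-length rows with even height and width.
    """
    pooled = []
    for top in range(0, len(feature_map), 2):
        row = []
        for left in range(0, len(feature_map[0]), 2):
            block = [
                feature_map[top][left], feature_map[top][left + 1],
                feature_map[top + 1][left], feature_map[top + 1][left + 1],
            ]
            row.append(max(block))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]

pooled = max_pool_2x2(feature_map)
# pooled == [[4, 2], [2, 8]]
```

The 4x4 input shrinks to 2x2, so later layers have a quarter as many values to process while the strongest response in each region survives.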

#### 6. Adding More Steps

• Convolution

• Max-pooling: keep the maximum value in each region of the feature map

• Fully connected: a fully connected network

### Building Our Bird Classifier
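The three steps listed above can be chained in a small pure-Python sketch. Everything here is an assumption made for illustration: the 5x5 image, the 2x2 kernel, the final weights, and the function names are invented, and the code ignores padding, multiple channels, and activation functions.

```python
def convolve(image, kernel):
    """Slide the kernel over the image and record the weighted sum at
    each position (no padding, stride 1)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    return [
        [
            max(
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


def fully_connected(values, weights, bias):
    """One output neuron: weighted sum of every input plus a bias."""
    return sum(v * w for v, w in zip(values, weights)) + bias


# Hypothetical 5x5 grayscale image with a bright vertical stripe.
image = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
# Hypothetical kernel that responds to a dark-to-bright step, left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]

features = convolve(image, kernel)         # 4x4 feature map
pooled = max_pool(features)                # 2x2 after pooling
flat = [v for row in pooled for v in row]  # flatten for the final layer
score = fully_connected(flat, [0.5, 0.5, 0.5, 0.5], 0.0)
```

In a real network the kernel values and the final weights are learned during training rather than written by hand, and the convolution and pooling steps are typically repeated several times before the fully connected layer.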

```python
# -*- coding: utf-8 -*-

"""
Based on the tflearn example located here:
https://github.com/tflearn/tflearn/blob/master/examples/images/convnet_cifar10.py
"""
from __future__ import division, print_function, absolute_import

# Import tflearn and some helpers
import tflearn
from tflearn.data_utils import shuffle
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.estimator import regression
from tflearn.data_preprocessing import ImagePreprocessing
from tflearn.data_augmentation import ImageAugmentation
import pickle

# Load the data set
X, Y, X_test, Y_test = pickle.load(open("full_dataset.pkl", "rb"))

# Shuffle the data
X, Y = shuffle(X, Y)

# Make sure the data is normalized
img_prep = ImagePreprocessing()
img_prep.add_featurewise_zero_center()
img_prep.add_featurewise_stdnorm()

# Create extra synthetic training data by flipping, rotating and blurring the
# images on our data set.
img_aug = ImageAugmentation()
img_aug.add_random_flip_leftright()
img_aug.add_random_rotation(max_angle=25.)
img_aug.add_random_blur(sigma_max=3.)

# Define our network architecture:

# Input is a 32x32 image with 3 color channels (red, green and blue)
network = input_data(shape=[None, 32, 32, 3],
                     data_preprocessing=img_prep,
                     data_augmentation=img_aug)

# Step 1: Convolution
network = conv_2d(network, 32, 3, activation='relu')

# Step 2: Max pooling
network = max_pool_2d(network, 2)

# Step 3: Convolution again
network = conv_2d(network, 64, 3, activation='relu')

# Step 4: Convolution yet again
network = conv_2d(network, 64, 3, activation='relu')

# Step 5: Max pooling again
network = max_pool_2d(network, 2)

# Step 6: Fully-connected 512 node neural network
network = fully_connected(network, 512, activation='relu')

# Step 7: Dropout - throw away some data randomly during training to prevent over-fitting
network = dropout(network, 0.5)

# Step 8: Fully-connected neural network with two outputs (0=isn't a bird, 1=is a bird) to make the final prediction
network = fully_connected(network, 2, activation='softmax')

# Tell tflearn how we want to train the network
network = regression(network, optimizer='adam',
                     loss='categorical_crossentropy',
                     learning_rate=0.001)

# Wrap the network in a model object
model = tflearn.DNN(network, tensorboard_verbose=0, checkpoint_path='bird-classifier.tfl.ckpt')

# Train it! We'll do 100 training passes and monitor it as it goes.
model.fit(X, Y, n_epoch=100, shuffle=True, validation_set=(X_test, Y_test),
          show_metric=True, batch_size=96,
          snapshot_epoch=True,
          run_id='bird-classifier')

# Save model when training is complete to a file
model.save("bird-classifier.tfl")
print("Network trained and saved as bird-classifier.tfl!")
```

#### Testing Our Network

```python
# -*- coding: utf-8 -*-

from __future__ import division, print_function, absolute_import

import tflearn
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.estimator import regression
from tflearn.data_preprocessing import ImagePreprocessing
from tflearn.data_augmentation import ImageAugmentation
# Import the submodules explicitly; a bare "import scipy" does not guarantee
# that scipy.misc and scipy.ndimage are loaded.
# Note: imread and imresize were removed from later SciPy releases, so this
# script needs an older SciPy to run as written.
import scipy.misc
import scipy.ndimage
import numpy as np
import argparse

parser = argparse.ArgumentParser(description='Decide if an image is a picture of a bird')
parser.add_argument('image', type=str, help='The image file to check')
args = parser.parse_args()

# Same network definition as before
img_prep = ImagePreprocessing()
img_prep.add_featurewise_zero_center()
img_prep.add_featurewise_stdnorm()
img_aug = ImageAugmentation()
img_aug.add_random_flip_leftright()
img_aug.add_random_rotation(max_angle=25.)
img_aug.add_random_blur(sigma_max=3.)

network = input_data(shape=[None, 32, 32, 3],
                     data_preprocessing=img_prep,
                     data_augmentation=img_aug)
network = conv_2d(network, 32, 3, activation='relu')
network = max_pool_2d(network, 2)
network = conv_2d(network, 64, 3, activation='relu')
network = conv_2d(network, 64, 3, activation='relu')
network = max_pool_2d(network, 2)
network = fully_connected(network, 512, activation='relu')
network = dropout(network, 0.5)
network = fully_connected(network, 2, activation='softmax')
network = regression(network, optimizer='adam',
                     loss='categorical_crossentropy',
                     learning_rate=0.001)

model = tflearn.DNN(network, tensorboard_verbose=0, checkpoint_path='bird-classifier.tfl.ckpt')
model.load("bird-classifier.tfl.ckpt-50912")

# Load the image file
img = scipy.ndimage.imread(args.image, mode="RGB")

# Scale it to 32x32
img = scipy.misc.imresize(img, (32, 32), interp="bicubic").astype(np.float32, casting='unsafe')

# Predict
prediction = model.predict([img])

# Check the result.
is_bird = np.argmax(prediction[0]) == 1

if is_bird:
    print("That's a bird!")
else:
    print("That's not a bird!")
```

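### How accurate is 95% accurate?

• First, bird images that are correctly identified as birds are called True Positives

• Second, non-bird images that are correctly identified as non-birds are called True Negatives

• Non-bird images that are classified as birds are called False Positives

• Bird images that are classified as non-birds are called False Negatives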
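A short sketch of how these four counts can be combined into accuracy, precision, and recall. The counts below are hypothetical, chosen only so that accuracy comes out at 95%; they are not results from the classifier above. The code assumes Python 3 division.

```python
def accuracy(true_pos, true_neg, false_pos, false_neg):
    """Share of all images that were labelled correctly."""
    total = true_pos + true_neg + false_pos + false_neg
    return (true_pos + true_neg) / total


def precision_recall(true_pos, false_pos, false_neg):
    """Precision: share of 'bird' guesses that were right.
    Recall: share of actual birds that were found."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


# Hypothetical counts for illustration only.
tp, tn, fp, fn = 90, 860, 10, 40

acc = accuracy(tp, tn, fp, fn)  # 950 / 1000 = 0.95
precision, recall = precision_recall(tp, fp, fn)
```

With these made-up counts the classifier is 95% accurate overall, yet it finds only 90 of the 130 actual birds, which is why a single accuracy figure can hide how a model behaves on the class you care about.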