Pytorch-神经网络相关记录

torch.manual_seed

在训练开始时，参数的初始化是随机的，为了让每次的结果一致，我们需要设置随机种子。

torch.manual_seed(args.seed)#为CPU设置随机种子

torch.cuda.manual_seed(seed) #为当前GPU设置随机种子

torch.cuda.manual_seed_all(seed) #为所有GPU设置随机种子

from future import print_function

该语句是python2的概念，python3对于python2就是future，该句表示，在python2的环境下，超前使用python3的print函数。该形式语句就是将下一个新版本的特性导入到当前版本。

比如，在python2.x的环境下，第二句语法检查通过，第三句语法检查失败。

 from __future__ import print_function
 print('you are good')
 print 'you are good'

pytorch 中设定使用指定的GPU

PyTorch默认使用从0开始的GPU，如果GPU0正在运行程序，需要指定其他GPU。有两种方式指定需要使用的GPU：
一、类似tensorflow中方式，使用CUDA_VISIBLE_DEVICES
直接终端指定：

CUDA_VISIBLE_DEVICES=0,1 python main.py

python代码中设定：

import os  
os.environ["CUDA_VISIBLE_DEVICES"] = "2"

二、使用函数 set_device

import torch
torc.cuda.set_device(id)

官方建议使用CUDA_VISIBLE_DEVICES，不建议使用 set_device 函数

torch.nn.Module

torch.nn.Module类是PyTorch中用于表示多层网络的。构建自己的网络就是定义torch.nn.Module的一个子类，这时需要重写初始化函数__init__()和前向过程forward()。
__init__()函数中需要调用父类的初始化函数；
forward()函数用于构建网络从输入到输出的过程。
这是Module的初始化方法：

def __init__(self): 
    self._backend = thnn_backend 
    self._parameters = OrderedDict() 
    self._buffers = OrderedDict() 
    self._backward_hooks = OrderedDict() 
    self._forward_hooks = OrderedDict() 
    self._forward_pre_hooks = OrderedDict() 
    self._modules = OrderedDict() 
    self.training = True

self._parameters 用来存放注册的Parameter对象
self._buffers 用来存放注册的Buffer对象。（pytorch中的buffer概念就是不需要反向传导更新的值）
self._modules 用来保存注册的Module对象。
self.training 标志位，用来表示是不是在training状态下。
**_hooks 用来保存注册的 hook

只要在nn.Module的子类中定义了forward函数，backward函数会被自动实现。
注：Pytorch基于nn.Module构建的模型中，只支持mini-batch的Variable输入方式，比如，只有一张输入图片，也需要变成 N x C x H x W 的形式：
input_image = torch.FloatTensor(1, 28, 28)
input_image = Variable(input_image)
input_image = input_image.unsqueeze(0) # 1 x 1 x 28 x 28

torch.nn.Sequential() 搭建网络的几种方法

torch.nn.Sequential()是可以将一整个网络搭建在一个Sequential中。首先导入几种方法用到的包：

import torch
import torch.nn.functional as F
from collections import OrderedDict

第一种方式：

class Net1(torch.nn.Module):
    def __init__(self):
        super(Net1, self).__init__()
        self.conv1 = torch.nn.Conv2d(3, 32, 3, 1, 1)
        self.dense1 = torch.nn.Linear(32 * 3 * 3, 128)
        self.dense2 = torch.nn.Linear(128, 10)
    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
        x = x.view(x.size(0), -1)
        x = F.relu(self.dense1(x))
        x = self.dense2(x)
        return x

print("Method 1:")
model1 = Net1()
print(model1)

输出为：

Net1(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (dense1): Linear(in_features=288, out_features=128, bias=True)
  (dense2): Linear(in_features=128, out_features=10, bias=True)
) 这种方式早期常用。

第二种方式：

class Net2(torch.nn.Module):
    def __init__(self):
        super(Net2, self).__init__()
        self.conv1 = torch.nn.Sequential(
            torch.nn.Conv2d(3, 32, 3, 1, 1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(2))
        self.dense = torch.nn.Sequential(
            torch.nn.Linear(32 * 3 * 3, 128),
            torch.nn.ReLU(),
            torch.nn.Linear(128, 10)
        )
    def forward(self, x):
        conv_out = self.conv1(x)
        res = conv_out.view(conv_out.size(0), -1)
        out = self.dense(res)
        return out

print("Method 2:")
model2 = Net2()
print(model2) 输出：

Net2(
  (conv): Sequential(
    (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU()
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (dense): Sequential(
    (0): Linear(in_features=288, out_features=128, bias=True)
    (1): ReLU()
    (2): Linear(in_features=128, out_features=10, bias=True)
  )
)

这种方式使用torch.nn.Sequential()容器进行快速搭建，模型的各层被顺序添加到容器中。缺点是每层的编号是默认的阿拉伯数字，不易区分。

第三种方式：
是对第二种方法的改进，通过add_module()添加每一层，并且为每一层增加了一个单独的名字。

class Net3(torch.nn.Module):
    def __init__(self):
        super(Net3, self).__init__()
        self.conv=torch.nn.Sequential()
        self.conv.add_module("conv1",torch.nn.Conv2d(3, 32, 3, 1, 1))
        self.conv.add_module("relu1",torch.nn.ReLU())
        self.conv.add_module("pool1",torch.nn.MaxPool2d(2))
        self.dense = torch.nn.Sequential()
        self.dense.add_module("dense1",torch.nn.Linear(32 * 3 * 3, 128))
        self.dense.add_module("relu2",torch.nn.ReLU())
        self.dense.add_module("dense2",torch.nn.Linear(128, 10))

    def forward(self, x):
        conv_out = self.conv(x)
        res = conv_out.view(conv_out.size(0), -1)
        out = self.dense(res)
        return out

print("Method 3:")
model3 = Net3()
print(model3)

输出：

Net3(
  (conv): Sequential(
    (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu1): ReLU()
    (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (dense): Sequential(
    (dense1): Linear(in_features=288, out_features=128, bias=True)
    (relu2): ReLU()
    (dense2): Linear(in_features=128, out_features=10, bias=True)
  )
)

第四种方式：
第四种方式是第三种方式的另一种写法。通过字典的形态添加每一层，并且设置单独的层名称。

class Net4(torch.nn.Module):
    def __init__(self):
        super(Net4, self).__init__()
        self.conv1 = torch.nn.Sequential(
            OrderedDict(
                [
                    ("conv1", torch.nn.Conv2d(3, 32, 3, 1, 1)),
                    ("relu1", torch.nn.ReLU()),
                    ("pool", torch.nn.MaxPool2d(2))
                ]
            ))
        self.dense = torch.nn.Sequential(
            OrderedDict([
                ("dense1", torch.nn.Linear(32 * 3 * 3, 128)),
                ("relu2", torch.nn.ReLU()),
                ("dense2", torch.nn.Linear(128, 10))
            ])
        )
    def forward(self, x):
        conv_out = self.conv1(x)
        res = conv_out.view(conv_out.size(0), -1)
        out = self.dense(res)
        return out
print("Method 4:")
model4 = Net4()
print(model4)

输出：

Net4(
  (conv1): Sequential(
    (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu1): ReLU()
    (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (dense): Sequential(
    (dense1): Linear(in_features=288, out_features=128, bias=True)
    (relu2): ReLU()
    (dense2): Linear(in_features=128, out_features=10, bias=True)
  )
)

torch.nn.Linear

class torch.nn.Linear(in_features, out_features, bias=True)

对输入数据做线性变换： y = Ax+b
参数：

in_features: 每个输入样本的大小
out_features : 每个输出样本的大小
bias : 若设置为False，这层不会学习偏置。默认值： True。

形状：

输入： (N, in_features)
输出： (N, out_features)

参数初始化

self.mudules() ：返回当前模型的所有模型的迭代器

for m in self.modules():
    if isinstance(m, nn.Conv2d):
        n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
        m.weight.data.normal_(0, math.sqrt(2. / n))
    elif isinstance(m, nn.BatchNorm2d):
        m.weight.data.fill_(1)
        m.bias.data.zero_()
    elif isinstance(m, nn.Linear):
        m.bias.data.zero_()