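Please credit the original source when reposting.
After a few experiments like the ones below, it becomes clear how to access each individual layer of a network.
step1. The network is defined as follows: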
require "dpnn"

local net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 64, 7, 7, 2, 2, 3, 3))
net:add(nn.SpatialBatchNormalization(64))
net:add(nn.ReLU())
net:add(nn.SpatialMaxPooling(3, 3, 2, 2, 1, 1))
net:add(nn.Inception{
   inputSize = 64,
   kernelSize = {3, 5},
   kernelStride = {1, 1},
   outputSize = {128, 32},
   reduceSize = {96, 16, 32, 64},
   pool = nn.SpatialMaxPooling(3, 3, 1, 1, 1, 1),
   batchNorm = true
})
-- switch to evaluation mode (train = false in every layer)
net:evaluate()
The network above consists of conv + BatchNorm + ReLU + Maxpool + Inception layers.
step2. Calling print(net) directly gives the network structure:
nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
  (1): nn.SpatialConvolution(3 -> 64, 7x7, 2,2, 3,3)
  (2): nn.SpatialBatchNormalization
  (3): nn.ReLU
  (4): nn.SpatialMaxPooling(3x3, 2,2, 1,1)
  (5): nn.Inception @ nn.DepthConcat {
    input
      |`-> (1): nn.Sequential {
      |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> output]
      |      (1): nn.SpatialConvolution(64 -> 96, 1x1)
      |      (2): nn.SpatialBatchNormalization
      |      (3): nn.ReLU
      |      (4): nn.SpatialConvolution(96 -> 128, 3x3, 1,1, 1,1)
      |      (5): nn.SpatialBatchNormalization
      |      (6): nn.ReLU
      |    }
      |`-> (2): nn.Sequential {
      |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> output]
      |      (1): nn.SpatialConvolution(64 -> 16, 1x1)
      |      (2): nn.SpatialBatchNormalization
      |      (3): nn.ReLU
      |      (4): nn.SpatialConvolution(16 -> 32, 5x5, 1,1, 2,2)
      |      (5): nn.SpatialBatchNormalization
      |      (6): nn.ReLU
      |    }
      |`-> (3): nn.Sequential {
      |      [input -> (1) -> (2) -> (3) -> (4) -> output]
      |      (1): nn.SpatialMaxPooling(3x3, 1,1, 1,1)
      |      (2): nn.SpatialConvolution(64 -> 32, 1x1)
      |      (3): nn.SpatialBatchNormalization
      |      (4): nn.ReLU
      |    }
      |`-> (4): nn.Sequential {
             [input -> (1) -> (2) -> (3) -> output]
             (1): nn.SpatialConvolution(64 -> 64, 1x1)
             (2): nn.SpatialBatchNormalization
             (3): nn.ReLU
           }
       ... -> output
  }
}
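In fact, the network object also holds fields that print(net) does not show, such as output and gradInput.
step3. The following code prints the network's fields in more detail: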
for k, curLayer in pairs(net) do
   print(k, curLayer)
end
step4. Output:
_type torch.DoubleTensor
output [torch.DoubleTensor with no dimension]
gradInput [torch.DoubleTensor with no dimension]
modules {
  1 : {
    dH : 2
    dW : 2
    nInputPlane : 3
    output : DoubleTensor - empty
    kH : 7
    train : false
    gradBias : DoubleTensor - size: 64
    padH : 3
    bias : DoubleTensor - size: 64
    weight : DoubleTensor - size: 64x3x7x7
    _type : "torch.DoubleTensor"
    gradWeight : DoubleTensor - size: 64x3x7x7
    padW : 3
    nOutputPlane : 64
    kW : 7
    gradInput : DoubleTensor - empty
  }
  2 : {
    gradBias : DoubleTensor - size: 64
    output : DoubleTensor - empty
    gradInput : DoubleTensor - empty
    running_var : DoubleTensor - size: 64
    momentum : 0.1
    gradWeight : DoubleTensor - size: 64
    eps : 1e-05
    _type : "torch.DoubleTensor"
    affine : true
    running_mean : DoubleTensor - size: 64
    bias : DoubleTensor - size: 64
    weight : DoubleTensor - size: 64
    train : false
  }
  3 : {
    inplace : false
    threshold : 0
    _type : "torch.DoubleTensor"
    output : DoubleTensor - empty
    gradInput : DoubleTensor - empty
    train : false
    val : 0
  }
  4 : {
    dH : 2
    dW : 2
    kW : 3
    gradInput : DoubleTensor - empty
    indices : DoubleTensor - empty
    train : false
    _type : "torch.DoubleTensor"
    padH : 1
    ceil_mode : false
    output : DoubleTensor - empty
    kH : 3
    padW : 1
  }
  5 : {
    outputSize : {
      1 : 128
      2 : 32
    }
    inputSize : 64
    gradInput : DoubleTensor - empty
    modules : {
      1 : {
        train : false
        _type : "torch.DoubleTensor"
        output : DoubleTensor - empty
        gradInput : DoubleTensor - empty
        modules : {
          1 : {...}
          2 : {...}
          3 : {...}
          4 : {...}
        }
        dimension : 2
        size : LongStorage - size: 0
      }
    }
    kernelStride : {
      1 : 1
      2 : 1
    }
    _type : "torch.DoubleTensor"
    module : {
      train : false
      _type : "torch.DoubleTensor"
      output : DoubleTensor - empty
      gradInput : DoubleTensor - empty
      modules : {
        1 : {
          _type : "torch.DoubleTensor"
          output : DoubleTensor - empty
          gradInput : DoubleTensor - empty
          modules : {...}
          train : false
        }
        2 : {
          _type : "torch.DoubleTensor"
          output : DoubleTensor - empty
          gradInput : DoubleTensor - empty
          modules : {...}
          train : false
        }
        3 : {
          _type : "torch.DoubleTensor"
          output : DoubleTensor - empty
          gradInput : DoubleTensor - empty
          modules : {...}
          train : false
        }
        4 : {
          _type : "torch.DoubleTensor"
          output : DoubleTensor - empty
          gradInput : DoubleTensor - empty
          modules : {...}
          train : false
        }
      }
      dimension : 2
      size : LongStorage - size: 0
    }
    poolStride : 1
    padding : true
    reduceStride : {...}
    transfer : {
      inplace : false
      threshold : 0
      _type : "torch.DoubleTensor"
      output : DoubleTensor - empty
      gradInput : DoubleTensor - empty
      val : 0
    }
    batchNorm : true
    train : false
    pool : {
      dH : 1
      dW : 1
      kW : 3
      gradInput : DoubleTensor - empty
      indices : DoubleTensor - empty
      train : false
      _type : "torch.DoubleTensor"
      padH : 1
      ceil_mode : false
      output : DoubleTensor - empty
      kH : 3
      padW : 1
    }
    poolSize : 3
    reduceSize : {
      1 : 96
      2 : 16
      3 : 32
      4 : 64
    }
    kernelSize : {
      1 : 3
      2 : 5
    }
    output : DoubleTensor - empty
  }
}
train false
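In the modules table above, entries 1 to 5 hold the fields of the conv, BatchNorm, ReLU, Maxpool and Inception layers respectively.
step5. net.modules[1] indexes the nn.SpatialConvolution layer. For example, print(net.modules[1]) gives: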
nn.SpatialConvolution(3 -> 64, 7x7, 2,2, 3,3)
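The same layer can also be reached through the container's accessor methods rather than the raw modules table. The following is a minimal sketch, assuming the stock nn package, where nn.Sequential inherits get and size from nn.Container. The two-layer net is only a small stand-in for the step1 network.

```lua
require 'nn'

-- Small stand-in network (assumption: same first layer as in step1).
local net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 64, 7, 7, 2, 2, 3, 3))
net:add(nn.ReLU())

-- net:get(i) returns the child stored at net.modules[i].
local first = net:get(1)
print(first)
print(first == net.modules[1])

-- net:size() is the number of direct children of the container.
for i = 1, net:size() do
   -- torch.type returns the class name of a Torch object as a string
   print(i, torch.type(net:get(i)))
end
```

Indexing net.modules directly, as the rest of this post does, works just as well; get only saves typing the field name.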
step6. To go one step further and print that layer's fields (step4 in fact already showed them), the following code can be used:
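for k, curLayer in pairs(net.modules[1]) do
   if type(curLayer) ~= 'userdata' then
      -- plain Lua values (numbers, booleans, strings, tables)
      print(k, curLayer)
   else
      -- tensors are userdata: collect the size along each dimension
      local strval = ' '
      for i = 1, curLayer:dim() do
         strval = strval .. curLayer:size(i) .. " "
      end
      -- "\27[31m" is the ANSI escape that switches the terminal to red
      print(k .. " " .. type(curLayer) .. " " .. string.format("\27[31m size: %s", strval))
   end
end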
step7. The result is:
dH 2
dW 2
nInputPlane 3
output userdata size:
kH 7
train false
gradBias userdata size: 64
padH 3
bias userdata size: 64
weight userdata size: 64 3 7 7
_type torch.DoubleTensor
gradWeight userdata size: 64 3 7 7
padW 3
nOutputPlane 64
kW 7
gradInput userdata size:
step8. step4 did not fully expand the Inception layer. As in step5, net.modules[5] gives the Inception layer. With the step6 code adapted accordingly, the output is:
outputSize {
  1 : 128
  2 : 32
}
inputSize 64
gradInput userdata size:
modules {
  1 : {
    train : false
    _type : "torch.DoubleTensor"
    output : DoubleTensor - empty
    gradInput : DoubleTensor - empty
    modules : {
      1 : {
        _type : "torch.DoubleTensor"
        output : DoubleTensor - empty
        gradInput : DoubleTensor - empty
        modules : {
          1 : {...}
          2 : {...}
          3 : {...}
          4 : {...}
          5 : {...}
          6 : {...}
        }
        train : false
      }
      2 : {
        _type : "torch.DoubleTensor"
        output : DoubleTensor - empty
        gradInput : DoubleTensor - empty
        modules : {
          1 : {...}
          2 : {...}
          3 : {...}
          4 : {...}
          5 : {...}
          6 : {...}
        }
        train : false
      }
      3 : {
        _type : "torch.DoubleTensor"
        output : DoubleTensor - empty
        gradInput : DoubleTensor - empty
        modules : {
          1 : {...}
          2 : {...}
          3 : {...}
          4 : {...}
        }
        train : false
      }
      4 : {
        _type : "torch.DoubleTensor"
        output : DoubleTensor - empty
        gradInput : DoubleTensor - empty
        modules : {
          1 : {...}
          2 : {...}
          3 : {...}
        }
        train : false
      }
    }
    dimension : 2
    size : LongStorage - size: 0
  }
}
kernelStride {
  1 : 1
  2 : 1
}
_type torch.DoubleTensor
module nn.DepthConcat {
  input
    |`-> (1): nn.Sequential {
    |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> output]
    |      (1): nn.SpatialConvolution(64 -> 96, 1x1)
    |      (2): nn.SpatialBatchNormalization
    |      (3): nn.ReLU
    |      (4): nn.SpatialConvolution(96 -> 128, 3x3, 1,1, 1,1)
    |      (5): nn.SpatialBatchNormalization
    |      (6): nn.ReLU
    |    }
    |`-> (2): nn.Sequential {
    |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> output]
    |      (1): nn.SpatialConvolution(64 -> 16, 1x1)
    |      (2): nn.SpatialBatchNormalization
    |      (3): nn.ReLU
    |      (4): nn.SpatialConvolution(16 -> 32, 5x5, 1,1, 2,2)
    |      (5): nn.SpatialBatchNormalization
    |      (6): nn.ReLU
    |    }
    |`-> (3): nn.Sequential {
    |      [input -> (1) -> (2) -> (3) -> (4) -> output]
    |      (1): nn.SpatialMaxPooling(3x3, 1,1, 1,1)
    |      (2): nn.SpatialConvolution(64 -> 32, 1x1)
    |      (3): nn.SpatialBatchNormalization
    |      (4): nn.ReLU
    |    }
    |`-> (4): nn.Sequential {
           [input -> (1) -> (2) -> (3) -> output]
           (1): nn.SpatialConvolution(64 -> 64, 1x1)
           (2): nn.SpatialBatchNormalization
           (3): nn.ReLU
         }
     ... -> output
}
poolStride 1
padding true
reduceStride {}
transfer nn.ReLU
batchNorm true
train false
pool nn.SpatialMaxPooling(3x3, 1,1, 1,1)
poolSize 3
reduceSize {
  1 : 96
  2 : 16
  3 : 32
  4 : 64
}
kernelSize {
  1 : 3
  2 : 5
}
output userdata size:
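The adapted loop itself is not shown here. One possible adaptation is sketched below; it is not necessarily the code that produced the dump above. The helper name describe is hypothetical. It uses torch.isTensor instead of the 'userdata' check, so that non-tensor userdata (for example the LongStorage stored in the size field of nn.DepthConcat) is not asked for dim(). Nested tables are reduced to tostring() here and are not expanded the way the interactive print expands them.

```lua
require 'nn'

-- Hypothetical helper (not from the original post): returns one line per
-- field of a module, with sizes for tensors and tostring() for the rest.
local function describe(module)
   local lines = {}
   for name, value in pairs(module) do
      if torch.isTensor(value) then
         local dims = {}
         for i = 1, value:dim() do
            dims[#dims + 1] = value:size(i)
         end
         lines[#lines + 1] = tostring(name) .. ' ' .. torch.type(value)
            .. ' size: ' .. table.concat(dims, 'x')
      else
         lines[#lines + 1] = tostring(name) .. ' ' .. tostring(value)
      end
   end
   -- pairs() iterates in no particular order; sort for a stable listing
   table.sort(lines)
   return lines
end

-- Demonstrated on the first layer of step1; any module can be passed in.
local conv = nn.SpatialConvolution(3, 64, 7, 7, 2, 2, 3, 3)
for _, line in ipairs(describe(conv)) do
   print(line)
end
```

With the step1 network loaded, describe(net.modules[5]) would list the Inception fields in the same spirit as the output above.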
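step9. In step8, modules holds the branches of the Inception layer (3*3 convolution, 5*5 convolution, pooling, 1*1 reduce). These branches can be reached through net.modules[5].module. That object also has fields such as train, output, gradInput and modules, and it can be printed with print(net.modules[5].module).
step10. Following the approach of step5, net.modules[5].module.modules[1] gives the details of the 3*3 convolution branch: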
_type torch.DoubleTensor
output userdata size:
gradInput userdata size:
modules {
  1 : {
    dH : 1
    dW : 1
    nInputPlane : 64
    output : DoubleTensor - empty
    kH : 1
    train : false
    gradBias : DoubleTensor - size: 96
    padH : 0
    bias : DoubleTensor - size: 96
    weight : DoubleTensor - size: 96x64x1x1
    _type : "torch.DoubleTensor"
    gradWeight : DoubleTensor - size: 96x64x1x1
    padW : 0
    nOutputPlane : 96
    kW : 1
    gradInput : DoubleTensor - empty
  }
  2 : {
    gradBias : DoubleTensor - size: 96
    output : DoubleTensor - empty
    gradInput : DoubleTensor - empty
    running_var : DoubleTensor - size: 96
    momentum : 0.1
    gradWeight : DoubleTensor - size: 96
    eps : 1e-05
    _type : "torch.DoubleTensor"
    affine : true
    running_mean : DoubleTensor - size: 96
    bias : DoubleTensor - size: 96
    weight : DoubleTensor - size: 96
    train : false
  }
  3 : {
    inplace : false
    threshold : 0
    _type : "torch.DoubleTensor"
    output : DoubleTensor - empty
    gradInput : DoubleTensor - empty
    train : false
    val : 0
  }
  4 : {
    dH : 1
    dW : 1
    nInputPlane : 96
    output : DoubleTensor - empty
    kH : 3
    train : false
    gradBias : DoubleTensor - size: 128
    padH : 1
    bias : DoubleTensor - size: 128
    weight : DoubleTensor - size: 128x96x3x3
    _type : "torch.DoubleTensor"
    gradWeight : DoubleTensor - size: 128x96x3x3
    padW : 1
    nOutputPlane : 128
    kW : 3
    gradInput : DoubleTensor - empty
  }
  5 : {
    gradBias : DoubleTensor - size: 128
    output : DoubleTensor - empty
    gradInput : DoubleTensor - empty
    running_var : DoubleTensor - size: 128
    momentum : 0.1
    gradWeight : DoubleTensor - size: 128
    eps : 1e-05
    _type : "torch.DoubleTensor"
    affine : true
    running_mean : DoubleTensor - size: 128
    bias : DoubleTensor - size: 128
    weight : DoubleTensor - size: 128
    train : false
  }
  6 : {
    inplace : false
    threshold : 0
    _type : "torch.DoubleTensor"
    output : DoubleTensor - empty
    gradInput : DoubleTensor - empty
    train : false
    val : 0
  }
}
train false
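Note: the Inception layer has both a module field and a modules field, and I have not worked out exactly why. Judging from the step8 output, modules has a single entry whose fields match those of module (train, output, gradInput, four sub-modules, dimension : 2), so the two appear to refer to the same nn.DepthConcat.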
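A quick way to probe the relationship between module and modules is to compare the two directly. This is only a sketch, assuming dpnn is installed and that nn.Inception accepts the same options as in step1. The last line prints true only if both fields hold the very same object. I would expect that from the dumps above, but the script makes no assumption either way.

```lua
require 'dpnn'

-- Same Inception configuration as in step1.
local inception = nn.Inception{
   inputSize = 64,
   kernelSize = {3, 5},
   kernelStride = {1, 1},
   outputSize = {128, 32},
   reduceSize = {96, 16, 32, 64},
   pool = nn.SpatialMaxPooling(3, 3, 1, 1, 1, 1),
   batchNorm = true
}

-- Class of the wrapped container held in the module field.
print(torch.type(inception.module))
-- Number of entries in the modules table.
print(#inception.modules)
-- Reference comparison between the two fields.
print(inception.module == inception.modules[1])
```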
step11. net.modules[5].module.modules[1].modules shows the contents of this branch in more detail:
1 nn.SpatialConvolution(64 -> 96, 1x1)
2 nn.SpatialBatchNormalization
3 nn.ReLU
4 nn.SpatialConvolution(96 -> 128, 3x3, 1,1, 1,1)
5 nn.SpatialBatchNormalization
6 nn.ReLU
As shown, this branch consists of 1*1 conv, BatchNorm, ReLU, 3*3 conv, BatchNorm and ReLU.
step12. To inspect the 3*3 convolution layer from step11, use the following index:
net.modules[5].module.modules[1].modules[4]
The result is:
dH 1
dW 1
nInputPlane 96
output userdata size:
kH 3
train false
gradBias userdata size: 128
padH 1
bias userdata size: 128
weight userdata size: 128 96 3 3
_type torch.DoubleTensor
gradWeight userdata size: 128 96 3 3
padW 1
nOutputPlane 128
kW 3
gradInput userdata size:
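step13. With step12 we have indexed all the way down to the deepest layer of the network from step1. Every layer in the network has its own output, gradInput and similar fields.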
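Rather than typing each index chain by hand, the whole tree can be walked recursively. The sketch below assumes only the stock nn package. The helper walk is hypothetical, and the small nested network stands in for the step1 network. It follows the modules field at every level, so for the Inception layer of step1 it would reach the nn.DepthConcat as net.modules[5].modules[1], which step9 reaches as net.modules[5].module.

```lua
require 'nn'

-- Hypothetical helper: records "index path  class name" for every module.
local function walk(module, path, out)
   out[#out + 1] = path .. '  ' .. torch.type(module)
   if module.modules then
      for i, child in ipairs(module.modules) do
         walk(child, path .. '.modules[' .. i .. ']', out)
      end
   end
   return out
end

-- Small nested stand-in network (assumption: not the step1 network).
local branch = nn.Sequential()
branch:add(nn.SpatialConvolution(8, 16, 1, 1))
branch:add(nn.ReLU())

local net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 8, 3, 3))
net:add(nn.ReLU())
net:add(branch)

local paths = walk(net, 'net', {})
for _, line in ipairs(paths) do
   print(line)
end

-- nn also ships a search by class name: findModules returns every
-- matching module in the tree, at any depth.
local convs = net:findModules('nn.SpatialConvolution')
for i, conv in ipairs(convs) do
   print(i, conv.nInputPlane, conv.nOutputPlane)
end
```

The printed paths can be pasted back into the interpreter as index expressions, which is handy for deep networks where the chain of modules becomes long.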
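step14. For the Inception layer net.modules[5], net.modules[5].output and net.modules[5].module.output give the same result. For example (only a small slice is shown for readability; printing all of net.modules[5].output may show many entries that are all 0):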
local imgBatch = torch.rand(1, 3, 128, 128)
local infer = net:forward(imgBatch)
-- first sample, second channel, third row of the Inception output
print(net.modules[5].output[1][2][3])
print(net.modules[5].module.output[1][2][3])
The result is:
0.01 *
 2.7396
 2.9070
 3.1895
 1.5040
 1.9784
 4.0125
 3.2874
 3.3137
 2.1326
 2.3930
 2.8170
 3.5226
 2.3162
 2.7308
 2.8511
 2.5278
 3.3325
 3.0819
 3.2826
 3.5363
 2.5749
 2.8816
 2.2393
 2.4765
 2.4803
 3.2553
 3.0837
 3.1197
 2.4632
 1.5145
 3.7101
 2.1888
[torch.DoubleTensor of size 32]

0.01 *
 2.7396
 2.9070
 3.1895
 1.5040
 1.9784
 4.0125
 3.2874
 3.3137
 2.1326
 2.3930
 2.8170
 3.5226
 2.3162
 2.7308
 2.8511
 2.5278
 3.3325
 3.0819
 3.2826
 3.5363
 2.5749
 2.8816
 2.2393
 2.4765
 2.4803
 3.2553
 3.0837
 3.1197
 2.4632
 1.5145
 3.7101
 2.1888
[torch.DoubleTensor of size 32]
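Comparing one printed row by eye is a weak check, so the whole tensor can be compared numerically instead. A minimal sketch, assuming dpnn is installed. It runs the Inception layer on its own with a random 64-channel input rather than the full step1 network, and the variable names are illustrative.

```lua
require 'dpnn'

-- Same Inception configuration as in step1, used stand-alone here.
local inception = nn.Inception{
   inputSize = 64,
   kernelSize = {3, 5},
   kernelStride = {1, 1},
   outputSize = {128, 32},
   reduceSize = {96, 16, 32, 64},
   pool = nn.SpatialMaxPooling(3, 3, 1, 1, 1, 1),
   batchNorm = true
}
inception:evaluate()

-- One random sample with 64 channels, matching inputSize.
inception:forward(torch.rand(1, 64, 16, 16))

local outer = inception.output          -- output field of nn.Inception
local inner = inception.module.output   -- output field of the wrapped nn.DepthConcat

-- Largest absolute element-wise difference; 0 means identical values.
local maxDiff = (outer - inner):abs():max()
print(maxDiff)

-- Whether both fields point at the very same tensor object.
print(torch.pointer(outer) == torch.pointer(inner))
```

If the pointer comparison also prints true, the outer layer is simply handing back the inner container's output tensor rather than a copy.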