
PyTorch_Tutorial_Build the NN (PyTorch official site)


Build the Neural Network

Neural networks comprise layers/modules that perform operations on data.

The torch.nn namespace provides all the building blocks you need to build your own neural network.

Every module in PyTorch subclasses the nn.Module.

A neural network is a module itself that consists of other modules (layers).

This nested structure allows for building and managing complex architectures easily.

Every module in PyTorch is a subclass of nn.Module.

A neural network is itself a module composed of other modules (layers).

 

In the following sections, we'll build a neural network to classify images in the FashionMNIST dataset.

import os
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

Get Device for Training

We want to be able to train our model on a hardware accelerator like the GPU, if it is available. Let's check to see if torch.cuda is available, else we continue to use the CPU.

device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")
Using cuda device
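
If no NVIDIA GPU is present, the same pattern extends to other accelerators. A minimal sketch, assuming a PyTorch build with Apple's MPS backend available (this is not part of the code above):

# Prefer CUDA, then Apple's MPS backend, then fall back to the CPU.
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"
print(f"Using {device} device")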

Define the Class

We define our neural network by subclassing nn.Module, and initialize the neural network layers in __init__.

Every nn.Module subclass implements the operations on input data in the forward method.

Every class that subclasses nn.Module implements the operations on input data in its forward method.

class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

We create an instance of NeuralNetwork, move it to the device, and print its structure.

We create an instance of NeuralNetwork and move it to the device.

model = NeuralNetwork().to(device)
print(model)
NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)

To use the model, we pass it the input data. This executes the model's forward, along with some background operations. Do not call model.forward() directly!

To use the model, we pass it the input data. This executes the model's forward along with some background operations; we do not call forward directly.

Calling the model on the input returns a 2-dimensional tensor with dim=0 corresponding to each output of 10 raw predicted values for each class, and dim=1 corresponding to the individual values of each output.

We get the prediction probabilities by passing it through an instance of the nn.Softmax module.

We obtain the prediction probabilities by passing the raw predicted values through an instance of the nn.Softmax module.

X = torch.rand(1, 28, 28, device=device)
logits = model(X)
pred_probab = nn.Softmax(dim=1)(logits)
y_pred = pred_probab.argmax(1)
print(f"Predicted class: {y_pred}")
Predicted class: tensor([5], device='cuda:0')
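
To see the two dimensions described above, inspect the shape of the returned tensors; for a single input image the shape is (1, 10), where dim=0 indexes the outputs in the batch and dim=1 indexes the 10 raw class values. A quick check, not in the original tutorial:

print(logits.size())       # torch.Size([1, 10]): 1 output, 10 raw class scores
print(pred_probab.size())  # torch.Size([1, 10]): same shape, now probabilities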

Model Layers

Let's break down the layers in the FashionMNIST model. To illustrate it, we will take a sample minibatch of 3 images of size 28x28 and see what happens to it as we pass it through the network.

input_image = torch.rand(3, 28, 28)
print(input_image.size())
torch.Size([3, 28, 28])

nn.Flatten

We initialize the nn.Flatten layer to convert each 2D 28x28 image into a contiguous array of 784 pixel values (the minibatch dimension at dim=0 is maintained).

flatten = nn.Flatten()
flat_image = flatten(input_image)
print(flat_image.size())
torch.Size([3, 784])
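
The minibatch dimension is kept because nn.Flatten defaults to start_dim=1. As an illustration (not in the original tutorial), flattening from start_dim=0 would also fold the batch dimension into the result:

print(nn.Flatten(start_dim=0)(input_image).size())  # torch.Size([2352]): the 3-image batch is merged into one vector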

nn.ReLU

Non-linear activations are what create the complex mappings between the model's inputs and outputs.

They are applied after linear transformations to introduce nonlinearity, helping neural networks learn a wide variety of phenomena. Here, hidden1 is the output of a linear layer (layer1) that maps the 784 flattened pixel values to 20 features.

print(f"Before ReLU: {hidden1}\n\n")
hidden1 = nn.ReLU()(hidden1)
print(f"After ReLU: {hidden1}")
Before ReLU: tensor([[ 0.0374, -0.3056,  0.0848,  0.3595,  0.0654,  0.1709,  0.1897, -0.5805,
          0.3290, -0.3406, -0.1706,  0.1420, -0.3289, -0.1867,  0.1950,  0.1746,
          0.3885,  0.2657, -0.0523,  0.0655],
        [-0.1608, -0.0763,  0.0113,  0.2765,  0.0735, -0.1160,  0.0648, -0.2053,
          0.0918, -0.2683, -0.1944,  0.0756, -0.2193, -0.1026,  0.0572,  0.0362,
          0.3400,  0.4623,  0.3123,  0.1824],
        [-0.1569,  0.2844, -0.3348, -0.0135, -0.1282, -0.0258, -0.2016, -0.2160,
         -0.0594, -0.1819, -0.2847,  0.0146, -0.1432, -0.1348,  0.2704,  0.4425,
          0.2608, -0.1394, -0.1564, -0.3325]], grad_fn=<AddmmBackward0>)


After ReLU: tensor([[0.0374, 0.0000, 0.0848, 0.3595, 0.0654, 0.1709, 0.1897, 0.0000, 0.3290,
         0.0000, 0.0000, 0.1420, 0.0000, 0.0000, 0.1950, 0.1746, 0.3885, 0.2657,
         0.0000, 0.0655],
        [0.0000, 0.0000, 0.0113, 0.2765, 0.0735, 0.0000, 0.0648, 0.0000, 0.0918,
         0.0000, 0.0000, 0.0756, 0.0000, 0.0000, 0.0572, 0.0362, 0.3400, 0.4623,
         0.3123, 0.1824],
        [0.0000, 0.2844, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
         0.0000, 0.0000, 0.0146, 0.0000, 0.0000, 0.2704, 0.4425, 0.2608, 0.0000,
         0.0000, 0.0000]], grad_fn=<ReluBackward0>)

nn.Sequential

nn.Sequential is an ordered container of modules. The data is passed through all the modules in the same order as defined. You can use sequential containers to put together a quick network like seq_modules.

nn.Sequential is a container that holds modules in a fixed order.

seq_modules = nn.Sequential(
    flatten,
    layer1,
    nn.ReLU(),
    nn.Linear(20, 10)
)
input_image = torch.rand(3, 28, 28)
logits = seq_modules(input_image)
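
As a quick sanity check (not part of the original tutorial), the output of seq_modules has one row of 10 raw scores per image in the minibatch, confirming the data flowed through every module in order:

print(logits.size())  # torch.Size([3, 10]): 3 images, 10 class scores each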

nn.Softmax

The last linear layer of the neural network returns logits - raw values in [-infty, infty] - which are passed to the nn.Softmax module. The logits are scaled to values in [0, 1] representing the model's predicted probabilities for each class. The dim parameter indicates the dimension along which the values must sum to 1.

softmax = nn.Softmax(dim=1)
pred_probab = softmax(logits)
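
Because dim=1 is the dimension along which the values must sum to 1, summing the probabilities over that dimension gives 1 for every image. A small check, not in the original tutorial:

print(pred_probab.sum(dim=1))  # tensor([1., 1., 1.]), up to floating-point error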

Model Parameters

Many layers inside a neural network are parameterized, i.e. have associated weights and biases that are optimized during training. Subclassing nn.Module automatically tracks all fields defined inside your model object, and makes all parameters accessible using your model's parameters() or named_parameters() methods.

print(f"Model structure: {model}\n\n")
for name, param in model.named_parameters():
	print(f"Layer: {name} | Size: {param.size()} | Values : {param[:2]}\n")
Model structure: NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)


Layer: linear_relu_stack.0.weight | Size: torch.Size([512, 784]) | Values : tensor([[-0.0047,  0.0085,  0.0057,  ..., -0.0256, -0.0020,  0.0332],
        [ 0.0163, -0.0193,  0.0129,  ..., -0.0240,  0.0232,  0.0134]],
       device='cuda:0', grad_fn=<SliceBackward0>)

Layer: linear_relu_stack.0.bias | Size: torch.Size([512]) | Values : tensor([0.0338, 0.0183], device='cuda:0', grad_fn=<SliceBackward0>)

Layer: linear_relu_stack.2.weight | Size: torch.Size([512, 512]) | Values : tensor([[ 0.0267, -0.0284,  0.0260,  ...,  0.0068, -0.0394,  0.0128],
        [ 0.0184, -0.0429, -0.0166,  ..., -0.0362,  0.0020, -0.0371]],
       device='cuda:0', grad_fn=<SliceBackward0>)

Layer: linear_relu_stack.2.bias | Size: torch.Size([512]) | Values : tensor([ 0.0425, -0.0381], device='cuda:0', grad_fn=<SliceBackward0>)

Layer: linear_relu_stack.4.weight | Size: torch.Size([10, 512]) | Values : tensor([[ 0.0436,  0.0070,  0.0214,  ..., -0.0170,  0.0320,  0.0129],
        [ 0.0198, -0.0013, -0.0326,  ..., -0.0114,  0.0281, -0.0406]],
       device='cuda:0', grad_fn=<SliceBackward0>)

Layer: linear_relu_stack.4.bias | Size: torch.Size([10]) | Values : tensor([ 0.0101, -0.0142], device='cuda:0', grad_fn=<SliceBackward0>)
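
Since parameters() yields every parameter tensor of the model, it can also be used to count the trainable parameters. A small sketch, not part of the original tutorial:

total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Total trainable parameters: {total_params}")
# 784*512 + 512 + 512*512 + 512 + 512*10 + 10 = 669,706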