The Pipeline

One time setup

activation function, preprocessing, weight initialization, regularization, gradient checking

Things that are set up once and need no further setup afterward
Training dynamics

babysitting the learning process, parameter updates, hyperparameter optimization

Things that keep changing throughout the training process
Evaluation

model ensembles

Model performance

Activation Functions

A schematic diagram of a neuron

Each dendrite receives an input from another neuron, multiplies it by a weight, and passes the result on to the neuron

The neuron, which corresponds to the cell body, consists of two parts: the sum of the dendrite inputs and the activation function

The weighted sum of the inputs plus a bias is passed through the activation function, which shapes the output according to its own characteristics, and that output is then passed on to the next neuron.
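As a rough illustration of the computation described above, here is a minimal NumPy sketch of a single neuron. The function name `neuron_forward` and the example values for `x`, `w`, and `b` are made up for illustration; sigmoid is used only as one possible activation.

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def neuron_forward(x, w, b, activation=sigmoid):
    # Weighted sum of the inputs plus a bias (the "cell body" sum) ...
    z = np.dot(w, x) + b
    # ... followed by the activation function.
    return activation(z)

# Hypothetical inputs, weights, and bias chosen so the pre-activation is 0.
x = np.array([1.0, -2.0, 0.5])
w = np.array([0.5, 0.25, -1.0])
b = 0.5

out = neuron_forward(x, w, b)
print(out)
```

With these values the weighted sum plus bias is 0, so the sigmoid output is 0.5. Swapping `activation` for another function changes only the last step.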

Activation functions - 대표적인 예들

Sigmoid

$\sigma(x) = {1 \over 1 + e ^ {-x}}$
tanh

$\tanh(x)$
ReLU

$\max(0, x)$
Leaky ReLU

$\max(0.1x, x)$
ELU

$\begin{cases} {x} & x \ge 0 \\ \alpha (e^x - 1) & x < 0 \end{cases}$
Maxout

$\max(w_1^Tx + b_1, w_2^Tx + b_2)$

All of these contribute to the non-linearity of a neural net
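The formulas listed above can be sketched in NumPy as follows. This is only an illustrative sketch: the function names, the default `slope=0.1` (taken from the Leaky ReLU formula above), the default `alpha=1.0` for ELU, and the vector-weight form of `maxout` for a single unit are assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    # 1 / (1 + e^{-x}), output in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Output in (-1, 1), zero-centered
    return np.tanh(x)

def relu(x):
    # max(0, x)
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.1):
    # max(slope * x, x); equals x for x >= 0 when 0 < slope < 1
    return np.maximum(slope * x, x)

def elu(x, alpha=1.0):
    # x for x >= 0, alpha * (e^x - 1) for x < 0.
    # np.minimum keeps the exponential from overflowing on large positive x.
    return np.where(x >= 0, x, alpha * np.expm1(np.minimum(x, 0.0)))

def maxout(x, w1, b1, w2, b2):
    # max(w1^T x + b1, w2^T x + b2) for a single unit with two linear pieces
    return np.maximum(np.dot(w1, x) + b1, np.dot(w2, x) + b2)

xs = np.array([-2.0, 0.0, 3.0])
print(relu(xs))
print(leaky_relu(xs))
print(elu(xs))
```

Unlike the others, `maxout` takes its own weights and biases, since it picks the larger of two linear functions of the input rather than applying a fixed element-wise curve.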

Sigmoid