Let’s introduce the architecture that revolutionized everything: the Neural Network. Most of the modern AI and Deep Learning advancements are based on Neural Networks.
It is a remarkably simple model inspired by the functioning of the brain. Today, we will look into the simplest form of a Neural Network, known as a Feed-forward Neural Network (FNN for short).
So, what exactly is a neural network?
The fundamental unit of the FNN is the neuron. It’s not as complex as it may seem. You’ve actually seen it before... The neuron is essentially the perceptron! Typically, it is represented as follows:
The neuron consists of a weighted sum (linear combination) of the inputs:
$$y = w_0 + w_1 x_1 + w_2 x_2 + \dots + w_k x_k$$
where $w_0$ is a bias parameter, $x_i$ is the $i$-th input, and $w_i$ is the weight (multiplicative factor) associated with the $i$-th input.
The result $y$ is then transformed by passing it through a function $g(\cdot)$:
$$z = g(y)$$
The function $g(\cdot)$ is commonly referred to as the activation function.
$$y = w_0 + w_1 x_1 + w_2 x_2 + \dots + w_k x_k$$
$$z = g(y)$$
These equations completely define the concept of the Neuron within the context of the FNN.
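To make this concrete, here is a minimal sketch of a single neuron in plain NumPy (the weights, inputs, and the choice of ReLU as $g(\cdot)$ are arbitrary, just for illustration):

```python
import numpy as np

def neuron(x, w, w0, g):
    """Single neuron: weighted sum of the inputs plus bias, then activation."""
    y = w0 + np.dot(w, x)    # y = w0 + w1*x1 + ... + wk*xk
    return g(y)              # z = g(y)

# Example with made-up numbers and ReLU as the activation function g.
x = np.array([0.5, -1.0, 2.0])       # inputs x1, x2, x3
w = np.array([0.1, 0.4, -0.3])       # weights w1, w2, w3
w0 = 0.2                             # bias
relu = lambda y: np.maximum(0.0, y)

print(neuron(x, w, w0, relu))        # z = g(w0 + w·x)
```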
Activation Functions are usually non-linear functions used to transform the linear output of a neuron. They help increase the expressive power of our Neural Network.
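For illustration, here is a quick sketch of three widely used activation functions, sigmoid, tanh and ReLU, written in NumPy:

```python
import numpy as np

def sigmoid(y):
    # Squashes any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-y))

def tanh(y):
    # Squashes any real number into (-1, 1).
    return np.tanh(y)

def relu(y):
    # Keeps positive values, zeroes out negative ones.
    return np.maximum(0.0, y)

y = np.array([-2.0, 0.0, 2.0])
print(sigmoid(y), tanh(y), relu(y))
```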
Among these common choices (sigmoid, tanh, ReLU), ReLU is the most commonly used nowadays*.
*The ReLU function has numerous variations that are used instead to help mitigate some of its shortcomings.

Neurons are usually assembled in a structure called a layer.
An arbitrary number of neurons can be arranged side by side to create a layer with $m$ outputs, where $m$ is the number of neurons in the layer.
This kind of layer is sometimes called a Dense layer, Fully-connected layer or Linear layer.
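In code, a Dense layer boils down to a matrix-vector product followed by the activation. A minimal sketch (the shapes and random values below are arbitrary, chosen only for illustration):

```python
import numpy as np

def dense_layer(x, W, b, g):
    """Fully-connected layer: each row of W holds the weights of one neuron."""
    y = W @ x + b      # m weighted sums computed at once
    return g(y)        # activation applied element-wise

rng = np.random.default_rng(0)
k, m = 4, 3                          # k inputs, m neurons (= m outputs)
W = rng.normal(size=(m, k))          # one row of weights per neuron
b = rng.normal(size=m)               # one bias per neuron
x = rng.normal(size=k)

relu = lambda y: np.maximum(0.0, y)
print(dense_layer(x, W, b, relu))    # vector of m outputs
```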
Layers, in turn, can be stacked on top of each other to enhance the complexity of the final model and increase its expressive power.
The output of a layer will be the input for the next layer.
More specifically, each neuron of a layer receives as inputs all the outputs of the previous layer, hence the name Fully-connected layer.
N.B. Signals propagate only from the input to the output; there are no loops! That's why we call it a Feed-forward Neural Network!
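Putting the pieces together, here is a sketch of a tiny FNN in which the output of each layer becomes the input of the next, and signals only ever flow forward (layer sizes and random parameters are arbitrary):

```python
import numpy as np

relu = lambda y: np.maximum(0.0, y)

def dense_layer(x, W, b, g):
    return g(W @ x + b)

rng = np.random.default_rng(1)
sizes = [4, 8, 8, 2]   # input size, two hidden layers, output size

# One (W, b) pair per layer.
params = [(rng.normal(size=(m, k)), rng.normal(size=m))
          for k, m in zip(sizes[:-1], sizes[1:])]

x = rng.normal(size=sizes[0])
for W, b in params:
    x = dense_layer(x, W, b, relu)   # output of one layer is input of the next

print(x)   # final output of the network
```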
Given enough neurons and layers, an FNN can approximate virtually any function of interest (more precisely, any continuous function, to arbitrary precision). This result is known as the Universal Approximation Theorem.
If it weren't for activation functions, a neural network would just be a composition of linear (affine) functions.
A composition of linear functions is still a linear function.
Activation functions introduce non-linearity into the model, allowing it to represent functions that are non-linear.
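A quick numerical check of this claim: two stacked linear layers with no activation in between collapse into a single equivalent linear layer (random matrices, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)
W2, b2 = rng.normal(size=(2, 5)), rng.normal(size=2)
x = rng.normal(size=3)

# Two linear layers stacked, with no activation in between...
two_layers = W2 @ (W1 @ x + b1) + b2

# ...are equivalent to one single linear layer.
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b

print(np.allclose(two_layers, one_layer))   # True
```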
Even though FNNs are conceptually simple, they can become quite big and complex very easily, which brings all sorts of problems: they can be hard to train and are prone to overfitting. Exercise maximum care!