
[Activation Function] 01: What is an Activation Function?


Introduction

In this article, I will explain activation functions, one of the core components of a neural network.

You likely often encounter the term "activation function" when reading articles about AI, machine learning, and deep learning. Activation functions are indispensable to neural networks.

However, many people might feel like, "I looked it up, but I don't quite get it yet." In this post, I will provide a clear and thorough explanation of activation functions specifically for beginners.

What we will cover:

  1. Review the flow of a neural network.
  2. Introduce the role of activation functions and their visual concepts.

Please read through to the end.

↓ You can also watch it in video format here ↓
https://www.youtube.com/watch?v=Oqu6sVWSE9U

Reviewing the Flow of Neural Networks

First, let's briefly review the flow of a neural network.

In a neural network, various transformations (calculations) are performed on the input numbers to produce a final result. Simply put, this is all a neural network does.

To explain it in a bit more detail, calculations are performed in the following three steps:

  • Step 1: Multiply by weights.
  • Step 2: Sum up the neuron values.
  • Step 3: Perform transformation using an activation function.

Let's take a closer look at the flow of numbers within this neural network.

▶️ 1st Transformation

First, the numbers input into the neural network enter "neurons." A neuron is one of the circular elements drawn in a typical network diagram. You can imagine a neuron as a box that holds exactly one number.

The value of that neuron travels through a line called a "synapse" to the next neuron. A synapse is a pathway for numbers.

When passing through this synapse, the value is converted into a different number. This is the first transformation. In this transformation, the "weight" assigned to the synapse is multiplied by the neuron's value.

For example, if the value of the first neuron is 3, it is multiplied by a weight of 2, and 6 is carried to the next neuron. This weight varies for each synapse.
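The example above can be sketched in a few lines of Python. The numbers are the same illustrative ones from the text (neuron value 3, weight 2):

```python
# First transformation: each synapse multiplies the neuron's value
# by its own weight before passing it on.
neuron_value = 3
weight = 2

carried_value = neuron_value * weight  # value arriving at the next neuron
print(carried_value)  # 6
```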

▶️ 2nd Transformation

Next is the second transformation.

Numbers coming from many previous neurons gather into a single neuron and are added together.

This "addition" process can also be considered a type of numerical transformation.

In other words, you can imagine it as consolidating values into a single neuron.
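As a sketch, the second transformation is just a sum over the incoming values. The three values below are made-up illustrative numbers:

```python
# Second transformation: values arriving from several previous neurons
# are added together (consolidated) in a single neuron.
incoming_values = [6, -2, 1.5]

consolidated = sum(incoming_values)
print(consolidated)  # 5.5
```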

▶️ 3rd Transformation

Now, that number proceeds to the next neuron again.

However, at this point, the neuron's value cannot proceed as is. Within the neuron, the value undergoes a certain transformation before moving on to the next one. This transformation is called an "activation function." This is the third transformation.

The above covers the basic numerical transformations occurring within a neural network.
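The three transformations can be combined into one small sketch. Note that the article has not yet introduced any specific activation function, so ReLU (a common choice, covered in a later article) is used here purely as a placeholder example, and all input values and weights are made up:

```python
def relu(x):
    """A common activation function, used here only as an example."""
    return max(0.0, x)

inputs  = [3.0, -1.0, 2.0]   # values from previous neurons (illustrative)
weights = [2.0,  0.5, -1.0]  # one weight per synapse (illustrative)

weighted = [x * w for x, w in zip(inputs, weights)]  # step 1: multiply by weights
total = sum(weighted)                                # step 2: sum into one neuron
output = relu(total)                                 # step 3: apply the activation
print(output)  # 3.5
```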

From here, we will delve deeper into the main topic: activation functions.

So, What is an Activation Function?

▶️ What is it in the first place?

Activation functions are not simple calculations like addition or multiplication; they play the crucial role of "determining what kind of information to output to the next neuron."

To put it a bit more specifically, using activation functions allows for more complex calculations.

While very simple problems can be solved without activation functions, you cannot perform difficult predictions or classifications without them.

For example, classifying massive amounts of data or recognizing speech—most of the problems we want to solve are difficult.

Activation functions are what enable us to solve these kinds of difficult problems that occur in the real world.

▶️ A bit more detail

Now, let's look at activation functions in a bit more detail.

To put the previous explanation more technically, we can say that activation functions increase the "degree of freedom in expression."

Most people probably don't quite get it when told it "increases the degree of freedom in expression." So first, let's just look at the visual concept.

The concept looks like this: imagine a scatter plot in which each circle represents an individual piece of data, and we want to cleanly separate two types of data by drawing a single line.

In this case, without an activation function, you can only draw a straight line. However, by using an activation function, you can draw a wavy curve. This is the idea behind "increasing the degree of freedom in expression."

It's okay if you don't understand the details perfectly as long as you grasp the general concept of activation functions. Just keep in mind that using an activation function allows us to classify data by drawing wavy curves.

Conclusion

Next time, I will explain what types of activation functions are actually used.

By learning the specific details, your understanding of activation functions will deepen further.

↓ Next... ↓
https://zenn.dev/nekoallergy/articles/ml-basic-act-02
