Machine Learning ---- Gradient Descent

Table of Contents

[1. The concept of gradient:](#1. The concept of gradient:)

[① In a univariate function:](#① In a univariate function:)

[② In multivariate functions:](#② In multivariate functions:)

[2. Introducing gradient descent with an example:](#2. Introducing gradient descent with an example:)

[3. Gradient descent formula and its simple understanding:](#3. Gradient descent formula and its simple understanding:)

[4. Precautions when applying the formula:](#4. Precautions when applying the formula:)


1. The concept of gradient:

① In a univariate function:

The gradient is simply the derivative of the function: it gives the slope of the tangent line to the function at a given point.

② In multivariate functions:

The gradient is a vector, so it has a direction. That direction is the one in which the function increases fastest at a given point.
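Both cases can be checked numerically. The following is a minimal Python sketch, not part of the original post: the helper `numerical_gradient` and the two test functions `f1` and `f2` are hypothetical names chosen for illustration. It approximates each partial derivative with a central difference.

```python
def numerical_gradient(f, point, h=1e-6):
    """Approximate the gradient of f at `point` using central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        # Slope of f along coordinate i, holding the other coordinates fixed
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


def f1(p):
    """Univariate example: f(x) = x^2, whose derivative is 2x."""
    return p[0] ** 2


def f2(p):
    """Multivariate example: f(x, y) = x^2 + 3y^2, whose gradient is (2x, 6y)."""
    x, y = p
    return x ** 2 + 3 * y ** 2


slope = numerical_gradient(f1, [3.0])      # one number: the tangent slope at x = 3
grad = numerical_gradient(f2, [1.0, 2.0])  # a vector: the direction of fastest increase
print(slope)  # approximately [6.0]
print(grad)   # approximately [2.0, 12.0]
```

In the univariate case the result is a single slope; in the multivariate case it is a vector with one component per variable.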

2. Introducing gradient descent with an example:

Do you remember the golf course in Tom and Jerry? It looks like this in the animation:

Take a look at these two pictures. You can easily see the hill in the distance, right? We can use it as a typical example, and the golf course can be abstracted into a coordinate plot:

In this coordinate system, the horizontal coordinates (x, y) correspond to (w, b), and the height corresponds to J(w, b). We start the gradient descent process where J(w, b) is at its maximum, which is the peak in the red area of the graph.

First, standing at the highest point, we turn a full circle to find the direction with the steepest downward slope, and then take a small step in that direction. We choose this direction because it is the steepest: for the same step length, we descend the most height, so we reach the lowest point (a local minimum) faster. After each step, we look around again and choose the steepest direction from the new position. In the end, we can trace out this path:

We finally reach the local minimum point A. Is this the only minimum point? Of course not:

It is also possible to reach point B, which is another local minimum. That completes the intuitive description of how gradient descent proceeds. Next, we look at its meaning through the mathematical formula.
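The walk down the hill can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the surface J(w, b) = (w² − 1)² + b² is a hypothetical stand-in for the golf course (it has two local minima, at w = 1 and w = −1), and the names `grad_J` and `descend`, the step size, and the starting points are all made up for this example. Exactly at a peak the gradient is zero, so the sketch starts slightly to either side of the ridge at w = 0.

```python
def grad_J(w, b):
    """Gradient of the hypothetical surface J(w, b) = (w**2 - 1)**2 + b**2."""
    return 4 * w * (w ** 2 - 1), 2 * b


def descend(w, b, alpha=0.05, num_steps=500):
    """Repeatedly take a small step in the steepest downhill direction."""
    for _ in range(num_steps):
        dj_dw, dj_db = grad_J(w, b)
        # Look around (compute the gradient), then step against it
        w, b = w - alpha * dj_dw, b - alpha * dj_db
    return w, b


point_a = descend(w=0.5, b=1.0)   # starts slightly to one side of the ridge
point_b = descend(w=-0.5, b=1.0)  # starts slightly to the other side
print(point_a)  # close to (1, 0)
print(point_b)  # close to (-1, 0)
```

Two nearby starting positions end in two different local minima, which mirrors the A and B points in the description above.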

3. Gradient descent formula and its simple understanding:

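We first give the formula for gradient descent:

$$
w = w - \alpha \frac{\partial}{\partial w} J(w, b), \qquad b = b - \alpha \frac{\partial}{\partial b} J(w, b)
$$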

In the formula, $\alpha$ is what we call the learning rate, and the equal sign has the same meaning as the assignment operator in program code. J(w, b) is defined in the previous post on the regression equation. How to choose the learning rate will be covered next time. Here, we first look at what the formula means:
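As a rough illustration of the update rule in code, here is a minimal Python sketch. It assumes that J(w, b) is the usual squared-error cost of linear regression, J(w, b) = 1/(2m) · Σ (w·xᵢ + b − yᵢ)², which may differ from the definition in the previous post. The function names, the training data, and the value of `alpha` are hypothetical.

```python
def compute_gradient(x, y, w, b):
    """Partial derivatives of the assumed cost J(w, b) = 1/(2m) * sum((w*x_i + b - y_i)**2)."""
    m = len(x)
    dj_dw = sum((w * xi + b - yi) * xi for xi, yi in zip(x, y)) / m
    dj_db = sum((w * xi + b - yi) for xi, yi in zip(x, y)) / m
    return dj_dw, dj_db


def gradient_descent(x, y, w, b, alpha, num_iters):
    """Apply the update rule num_iters times and return the final (w, b)."""
    for _ in range(num_iters):
        dj_dw, dj_db = compute_gradient(x, y, w, b)
        # "=" is an assignment: the new values overwrite the old ones
        w, b = w - alpha * dj_dw, b - alpha * dj_db
    return w, b


# Hypothetical training data generated from y = 2x + 1
x_train = [1.0, 2.0, 3.0, 4.0]
y_train = [3.0, 5.0, 7.0, 9.0]

w_final, b_final = gradient_descent(x_train, y_train, w=0.0, b=0.0,
                                    alpha=0.05, num_iters=5000)
print(w_final, b_final)  # close to 2 and 1
```

The learning rate `alpha` scales every step; the value used here was simply picked so that this small example converges.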

First, let's simplify the formula by taking b equal to 0 as an example. This way, we can understand its meaning in a two-dimensional Cartesian coordinate system:

In this plot of J(w, b), which is a quadratic function, b is fixed at 0, so we can write $\frac{\partial}{\partial w} J(w, b) = \frac{d}{dw} J(w)$. In other words, the partial derivative can be treated as the ordinary derivative of a single-variable function. Since $\alpha > 0$, when w lies on the right half of the curve the derivative is positive, that is, the slope is positive. The update then subtracts a positive number from w, so w moves to the left, closer to the minimum, which is the optimal solution. Similarly, when w lies on the left half of the curve, the derivative is negative, so w moves to the right, again closer to the minimum. The size of each move is $\alpha \left| \frac{d}{dw} J(w) \right|$.
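The sign argument can be verified with a single update step. The sketch below is only an illustration: it uses a hypothetical quadratic cost J(w) = (w − 2)² with its minimum at w = 2, and the names `dJ_dw` and `update` as well as the learning rate 0.1 are assumptions, not taken from the original post.

```python
def dJ_dw(w):
    """Derivative of the hypothetical cost J(w) = (w - 2)**2, minimized at w = 2."""
    return 2 * (w - 2)


def update(w, alpha=0.1):
    """One gradient descent step with b fixed at 0: w = w - alpha * dJ/dw."""
    return w - alpha * dJ_dw(w)


w_right = update(5.0)   # right of the minimum: derivative > 0, so w decreases
w_left = update(-1.0)   # left of the minimum: derivative < 0, so w increases
print(w_right)  # about 4.4, moved left toward 2
print(w_left)   # about -0.4, moved right toward 2
```

In both cases the step has length `alpha * abs(dJ_dw(w))`, so steps shrink automatically as w approaches the minimum, where the derivative tends to zero.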

This is a simple understanding of the gradient descent formula.


4. Precautions when applying the formula:

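When applying the formula, w and b should be updated simultaneously: both partial derivatives are computed from the current values of w and b, and only then are the two parameters overwritten (tmp_w and tmp_b are temporary variables), just like this:

$$
\begin{aligned}
\text{tmp\_w} &= w - \alpha \frac{\partial}{\partial w} J(w, b) \\
\text{tmp\_b} &= b - \alpha \frac{\partial}{\partial b} J(w, b) \\
w &= \text{tmp\_w} \\
b &= \text{tmp\_b}
\end{aligned}
$$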

The following is an incorrect order of operations that should be avoided:
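To make the precaution concrete, here is a minimal Python sketch contrasting the two orders. It assumes that the precaution refers to the usual simultaneous-update rule, and that J(w, b) is the squared-error cost of linear regression; the function names and the tiny data set are hypothetical.

```python
def dj_dw(x, y, w, b):
    """Partial derivative of the assumed squared-error cost with respect to w."""
    return sum((w * xi + b - yi) * xi for xi, yi in zip(x, y)) / len(x)


def dj_db(x, y, w, b):
    """Partial derivative of the assumed squared-error cost with respect to b."""
    return sum((w * xi + b - yi) for xi, yi in zip(x, y)) / len(x)


x_train = [1.0, 2.0]
y_train = [3.0, 5.0]
alpha = 0.1

# Correct: compute both temporaries from the OLD (w, b), then assign
w, b = 0.0, 0.0
tmp_w = w - alpha * dj_dw(x_train, y_train, w, b)
tmp_b = b - alpha * dj_db(x_train, y_train, w, b)
w_ok, b_ok = tmp_w, tmp_b

# Incorrect: w is overwritten first, so the b update sees the NEW w
w, b = 0.0, 0.0
w = w - alpha * dj_dw(x_train, y_train, w, b)
b = b - alpha * dj_db(x_train, y_train, w, b)
w_bad, b_bad = w, b

print(w_ok, b_ok)    # about 0.65 and 0.4
print(w_bad, b_bad)  # about 0.65 and 0.3025
```

After one step the two variants already disagree on b, because in the sequential version the derivative with respect to b is evaluated at a different point than the derivative with respect to w.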

That covers the gradient descent formula and how the algorithm proceeds. The code implementation will be explained in future articles.

Related post: Machine Learning ---- Cost function (CSDN blog)
