Optimizing Artificial Intelligence Algorithms with Gradient Descent

Naveen Joshi 06/04/2020

With the help of gradient descent algorithms, developers can minimize a model's cost function and thereby improve how well the model fits its data.

Machine learning (ML) algorithms and deep learning neural networks work on a set of parameters, such as weights and biases, and a cost function that evaluates how good a set of parameters is. The lower the cost, the better the model fits the training data. Reducing this cost function is therefore one of the key aspects of optimization in AI models. Various optimization algorithms are available for this purpose, and gradient descent is among the most popular of them.
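To make the idea of a cost function concrete, here is a minimal sketch in Python. It assumes a one-feature linear model and mean squared error as the cost; the article does not name a specific model or cost, so the function names and the toy data are illustrative only.

```python
def predict(w, b, x):
    """Linear model with one weight and one bias (an assumed example model)."""
    return w * x + b


def mse_cost(w, b, xs, ys):
    """Mean squared error: the average squared gap between prediction and target."""
    n = len(xs)
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / n


# Toy data generated from y = 2x + 1 (hypothetical, for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

cost_bad = mse_cost(0.0, 0.0, xs, ys)   # parameters far from the data
cost_good = mse_cost(2.0, 1.0, xs, ys)  # parameters that match the data
print(cost_bad, cost_good)
```

The parameters that match the data give a lower cost than the poor guess, which is the signal an optimizer uses to compare parameter sets.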

How a Gradient Descent Algorithm Works

Suppose you place a ball on an inclined plane. Gravity makes the ball roll downwards until it reaches level ground and stops. A gradient descent algorithm works in a similar way. The ball's position corresponds to the current set of parameters, its height corresponds to the value of the cost function, and gradient descent plays the role of gravity, moving the parameters downhill until the cost reaches a minimum, which corresponds to the level ground.

Gradient descent starts with an initial set of parameters and improves them gradually. On every iteration, it evaluates the gradient (derivative) of the cost function at the current parameters, which gives the slope of the cost function at that point. Because the gradient points in the direction of steepest increase, the algorithm moves the parameters a small step in the opposite direction to reduce the cost. This process repeats until the cost can no longer be reduced meaningfully. The variants of gradient descent differ in how much of the training data they use to compute the gradient for each step.
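The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cost function (x - 3)^2, the starting point, the learning rate (step size) of 0.1, and the fixed number of steps are all choices made for the example, not values from the article.

```python
def cost(x):
    """A simple bowl-shaped cost with its minimum at x = 3 (assumed example)."""
    return (x - 3.0) ** 2


def gradient(x):
    """Derivative of the cost: the slope at the current parameter value."""
    return 2.0 * (x - 3.0)


def gradient_descent(start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient and record the cost after each step."""
    x = start
    history = [cost(x)]
    for _ in range(steps):
        x = x - learning_rate * gradient(x)  # move opposite to the slope
        history.append(cost(x))
    return x, history


x_min, history = gradient_descent(start=10.0)
print(x_min, history[0], history[-1])
```

In this sketch the cost shrinks on every step and the parameter approaches 3. A learning rate that is too large would make the steps overshoot, which is why the step size is usually tuned.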

What are the Different Types of Gradient Descent Algorithms?

On the basis of the amount of data a gradient descent algorithm ingests, it can be classified into three types:

Batch Gradient Descent

Batch gradient descent computes the gradient of the cost function with respect to the parameters over the entire training dataset. Since it has to process the whole dataset to perform a single parameter update, it can be very slow on large datasets.
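As a rough sketch of batch gradient descent, the Python example below fits a one-feature linear model with a mean squared error cost. The model, the toy data, the learning rate, and the epoch count are assumptions chosen for illustration. Note that every single update sums over all training points.

```python
def batch_gradient_descent(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b; each update uses the gradient over the full dataset."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean squared error, averaged over ALL points.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = batch_gradient_descent(xs, ys)
print(w, b)
```

With four data points the full pass is trivial, but the cost of one update grows linearly with the size of the dataset.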

Stochastic Gradient Descent

Stochastic gradient descent computes the gradient using a single training point chosen at random, and updates the parameters after each such point. Each update is therefore cheap, but because a single point gives only a noisy estimate of the true gradient, the cost fluctuates from step to step, and the algorithm can need many updates to settle near a minimum.
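A minimal Python sketch of stochastic gradient descent follows, again assuming a one-feature linear model with a squared error cost. The learning rate, step count, and fixed random seed are illustrative assumptions; the seed only makes the run repeatable.

```python
import random


def stochastic_gradient_descent(xs, ys, learning_rate=0.01, steps=10000, seed=0):
    """Fit y = w * x + b; each update uses ONE randomly chosen training point."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    w, b = 0.0, 0.0
    for _ in range(steps):
        i = rng.randrange(len(xs))       # pick a single point at random
        error = w * xs[i] + b - ys[i]    # prediction error on that point
        w -= learning_rate * 2 * error * xs[i]
        b -= learning_rate * 2 * error
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = stochastic_gradient_descent(xs, ys)
print(w, b)
```

Because this toy data has no noise, the sketch can settle very close to the exact parameters. On real, noisy data the parameters would typically keep jittering around a minimum unless the learning rate is reduced over time.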

Mini-Batch Gradient Descent

In mini-batch gradient descent, the model developer first divides the entire dataset into mini-batches, and the gradient is then calculated for each mini-batch. It is faster than batch gradient descent because each parameter update has to go through far less data. However, since each mini-batch only approximates the full dataset, the cost does not decrease smoothly: it moves back and forth on its way down and can keep oscillating around the minimum instead of settling exactly on it. Hence, the learning curve is noisier than with batch gradient descent.
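The mini-batch variant can be sketched in Python as shown below. As before, the linear model, the toy data, the batch size of 5, the learning rate, and the seed are assumptions for illustration. Reshuffling the data before each pass is a common practice rather than something the article prescribes.

```python
import random


def mini_batch_gradient_descent(xs, ys, batch_size=5, learning_rate=0.1,
                                epochs=500, seed=0):
    """Fit y = w * x + b; each update uses the gradient over one mini-batch."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new random split into mini-batches each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            m = len(batch)
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / m
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / m
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b


# Toy data: 20 points generated from y = 2x + 1 (hypothetical).
xs = [i / 10 for i in range(20)]
ys = [2 * x + 1 for x in xs]

w, b = mini_batch_gradient_descent(xs, ys)
print(w, b)
```

Setting `batch_size` to the dataset size turns this sketch into batch gradient descent, and setting it to 1 gives a shuffled form of stochastic gradient descent, which is why mini-batches are often described as a middle ground between the two.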

Gradient descent algorithms work well in most cases, but there are some situations in which they might not perform properly. For instance, if the model and data lead to a non-convex optimization problem, the cost function can have several local minima, saddle points, and flat regions. The gradient only describes the slope around the current parameters, so in such cases gradient descent can settle in a local minimum that is not the best one overall, or converge very slowly.
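The effect of non-convexity can be illustrated with a small Python sketch. The function x^4 - 3x^2 + x is a hypothetical example chosen because it has two separate valleys; it does not come from the article, and the learning rate and starting points are likewise assumptions.

```python
def cost(x):
    """A non-convex cost with two valleys (assumed example function)."""
    return x ** 4 - 3 * x ** 2 + x


def gradient(x):
    """Derivative of the cost above."""
    return 4 * x ** 3 - 6 * x + 1


def descend(start, learning_rate=0.01, steps=1000):
    """Plain gradient descent from a given starting point."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


x_left = descend(start=-2.0)   # starts on the left slope
x_right = descend(start=2.0)   # starts on the right slope
print(x_left, cost(x_left))
print(x_right, cost(x_right))
```

The two runs end in different valleys, and the one that starts on the right settles at a higher cost than the one that starts on the left. The same algorithm with the same settings therefore gives a different answer depending only on where it starts, which is the core difficulty with non-convex cost functions.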

Naveen Joshi

Tech Expert

Naveen is the Founder and CEO of Allerin, a software solutions provider that delivers innovative and agile solutions that enable to automate, inspire and impress. He is a seasoned professional with more than 20 years of experience, with extensive experience in customizing open source products for cost optimizations of large scale IT deployment. He is currently working on Internet of Things solutions with Big Data Analytics. Naveen completed his programming qualifications in various Indian institutes.
