Gradient descent, also called the 'downhill' or steepest-descent algorithm, is one of the simplest algorithms for optimization problems. The algorithm searches for a local minimum. In its basic form it handles unconstrained problems; constrained problems require modifications such as projection onto the feasible set. The algorithm requires computing the first derivative of the objective function. Another drawback is its convergence speed: as a first-order method, it may need many iterations to reach an accurate solution.

1. Define the objective function f(x), where x = {x1, x2, . . .}.

2. Compute the gradient of f with respect to x, i.e. gradient(f,x) = [df/dx1, df/dx2, . . .].

3. Perform the iterative loop below, given a learning rate s (a constant). s must not be too large, or the solution diverges.

```matlab
for i = 1:maxiter
    fo = f(x);                  % objective at the current point
    x  = x - gradient(f,x)*s;   % update variable
    fn = f(x);                  % objective at the updated point
    if norm(fn - fo, 2) < tol   % stop when the improvement is small
        break
    end
end
```
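The loop above can be sketched in Python; the finite-difference gradient, the test function, and the parameter values below are illustrative assumptions, not part of the original pseudocode.

```python
def gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x (an assumption;
    in practice the analytic gradient is preferred when available)."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def gradient_descent(f, x, s=0.1, maxiter=1000, tol=1e-10):
    fo = f(x)
    for _ in range(maxiter):
        g = gradient(f, x)
        x = [xi - s * gi for xi, gi in zip(x, g)]  # update variable
        fn = f(x)
        if abs(fn - fo) < tol:                     # stop when the change is small
            break
        fo = fn
    return x

# Illustrative test: minimize f(x) = (x1-1)^2 + (x2+2)^2, whose minimum is at (1, -2)
print(gradient_descent(lambda x: (x[0] - 1)**2 + (x[1] + 2)**2, [0.0, 0.0]))
```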

__EXAMPLE 1__: Himmelblau's function

As a test case, we apply the algorithm to Himmelblau's function.

The objective function f and its partial derivatives with respect to x and y are

f(x,y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2

df/dx = 4*x*(x^2 + y - 11) + 2*(x + y^2 - 7)
df/dy = 2*(x^2 + y - 11) + 4*y*(x + y^2 - 7)

Himmelblau's function is nonconvex and contains four local minima (all with the same function value). The solution found depends on the starting point, much as a stone rolled downhill comes to rest in the nearest valley.
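This can be sketched in Python; the step size s = 0.01, the stopping tolerance, and the two starting points are illustrative choices.

```python
def f(x, y):
    """Himmelblau's function."""
    return (x**2 + y - 11)**2 + (x + y**2 - 7)**2

def grad(x, y):
    """Analytic partial derivatives of Himmelblau's function."""
    dfdx = 4*x*(x**2 + y - 11) + 2*(x + y**2 - 7)
    dfdy = 2*(x**2 + y - 11) + 4*y*(x + y**2 - 7)
    return dfdx, dfdy

def descend(x, y, s=0.01, maxiter=10000, tol=1e-12):
    fo = f(x, y)
    for _ in range(maxiter):
        gx, gy = grad(x, y)
        x, y = x - s*gx, y - s*gy     # gradient step
        fn = f(x, y)
        if abs(fn - fo) < tol:        # stop when the change is small
            break
        fo = fn
    return x, y

# Different starting points roll into different minima:
print(descend(1.0, 1.0))    # tends toward the minimum near (3, 2)
print(descend(-2.0, 2.0))   # tends toward a minimum with negative x
```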

__EXAMPLE 2__: Profit target

A company plans to quote a product's price (x) to its customer. The cost (c) and profit target (a) are given. The objective function can be written in the form:

f = 1/2*(x-c-a)^2

where

gradient(f,x) = x - c - a

The gradient descent solution is as follows. Given f and gradient(f,x) = x - c - a:

```matlab
for i = 1:maxiter
    fo = 1/2*(x - c - a)^2;
    x  = x - (x - c - a)*s;     % gradient step
    fn = 1/2*(x - c - a)^2;
    if norm(fn - fo, 2) < tol
        break
    end
end
```
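A minimal Python sketch of this example follows; the values of c, a, the starting price x0, and the step size s are illustrative assumptions.

```python
def solve_price(c, a, x0, s=0.5, maxiter=1000, tol=1e-12):
    """Gradient descent on f = 1/2*(x - c - a)^2; the minimizer is x = c + a."""
    x = x0
    fo = 0.5 * (x - c - a)**2
    for _ in range(maxiter):
        x = x - s * (x - c - a)        # gradient step: gradient(f,x) = x - c - a
        fn = 0.5 * (x - c - a)**2
        if abs(fn - fo) < tol:         # stop when the change is small
            break
        fo = fn
    return x

# The price converges to cost plus profit target, x = c + a:
print(solve_price(c=100.0, a=20.0, x0=0.0))   # approaches 120.0
```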

## Further reading
