The Unbreakable Relation Of GPU And Deep Learning

We often hear that GPUs and deep learning are tied together, and that the latter performs better on a GPU than on a CPU. Several enthusiast benchmarks have shown basic matrix multiplication running as much as 55x faster on a GPU than on a CPU. I admit that most of the examples I have seen are rookie-level at best. But there is no denying that GPUs are better tuned for basic math operations than CPUs. So what makes a GPU so much more efficient?

It's All About Mathematics

To understand why GPUs perform better for deep learning, we first need to understand what deep learning is. If you haven't seen it already, refer to this article for a basic explanation. Deep learning is fundamentally a computer algorithm, and just like any other computer program it has to deal with Input-Process-Output cycles. Let's understand it with an example.

Let's assume that you are working on an ML model meant to predict wind speed for a particular region. You have a large set of statistical data comprising a few parameters like pressure, altitude, temperature, time of year, etc. You decide to employ deep learning for your prediction, but how would you go about configuring it?


Our brains have neurons, and so do machine learning models. At its most fundamental, a neuron takes multiple inputs and generates one output. That output can be final, or it can be the input to another neuron.

In the case of your wind prediction model, the inputs to your first layer of neurons could be the parameters from your dataset (pressure, altitude, temperature, etc.).


As your neurons learn, they assign a weight to each input, ranking some inputs as more important than others based on the patterns observed while training on the dataset. If 'a' were your input and 'w' were the weight, then the input to the neuron becomes a multiplied by w, i.e., 'a*w' (see the maths seeping in). With several inputs, the neuron adds up these products.


The bias is a constant that the neuron adds to its weighted input, letting it shift its output up or down regardless of the input values. Like the weights, it is learned during training. (This is not the same thing as the statistical bias of an algorithm, where a high-bias model underfits the data.) Mathematically, just after the neuron has finished applying the weight to the input, it adds the bias. If 'b' were the bias, the result of the operation would be 'a*w+b'.
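To make the 'a*w+b' step concrete, here is a minimal sketch in plain Python. The function name `neuron_pre_activation` and all the numbers are made up for illustration, and the readings are assumed to be already scaled to small values.

```python
def neuron_pre_activation(inputs, weights, bias):
    """Return the weighted sum of the inputs plus the bias: sum(a_i * w_i) + b."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(a * w for a, w in zip(inputs, weights)) + bias


# Hypothetical, pre-scaled readings: pressure, altitude, temperature
readings = [1.0, 0.5, 2.0]
weights = [0.5, -1.0, 0.25]   # one weight per input (illustrative values)
bias = 0.25

z = neuron_pre_activation(readings, weights, bias)
print(z)  # 0.5 - 0.5 + 0.5 + 0.25 = 0.75
```

The value `z` is what the neuron hands over to its activation function.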

Activation Function

This is the 'processing' part of the neuron. After the weights and biases are applied, we need a function that translates all of that into an output. Usually this function is non-linear, i.e., the change in output is not directly proportional to the change in input. This non-linearity is what makes neural networks so powerful and math intensive. Many such functions exist, like Sigmoid (1/(1+e^(-x))), ReLU (max(x,0)), Softmax, etc.
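The three functions named above can be sketched in a few lines of plain Python. This is an illustration using only the standard library, not how an ML framework implements them. Subtracting the maximum inside `softmax` is a common trick to avoid overflow and is my addition here.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    # Note: math.exp overflows for very large negative x; fine for a sketch.
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """Pass positive values through unchanged and clamp negatives to zero."""
    return max(x, 0.0)


def softmax(values):
    """Turn a list of scores into positive numbers that sum to 1."""
    shift = max(values)  # subtracting the max keeps math.exp from overflowing
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


print(sigmoid(0.0))   # 0.5
print(relu(-2.0))     # 0.0
print(softmax([1.0, 2.0, 3.0]))
```

Sigmoid and ReLU act on one number at a time, while Softmax looks at a whole list of outputs together.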

Neural Network

Now multiply these neurons into the hundreds, spread across multiple layers with each assessing a different attribute of the wind model, and you have a big number crunching problem in front of you!
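To get a feel for how quickly the arithmetic grows, the sketch below counts the multiply-add operations in one pass through a small fully connected network. The layer sizes are hypothetical and the helper `count_multiply_adds` is just for illustration. It assumes every neuron is connected to every output of the previous layer.

```python
def count_multiply_adds(layer_sizes):
    """Count the a*w products needed for one pass through fully connected layers."""
    # Each neuron in a layer multiplies every output of the previous layer by a weight.
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# Hypothetical network: 4 inputs, two hidden layers of 100 neurons, 1 output
sizes = [4, 100, 100, 1]
ops = count_multiply_adds(sizes)
print(ops)  # 4*100 + 100*100 + 100*1 = 10500
```

That is over ten thousand multiplications for a single prediction on a tiny network, and training repeats this for every row of the dataset, many times over.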

The Matrix

Let me take you back to your school days. If you remember, we learnt something called a 'Matrix' (plural 'Matrices') as part of the Mathematics subject. It is defined as an arrangement of numbers in rows and columns. Matrices can be added, subtracted and multiplied with each other. A single-row matrix is called a row vector and a single-column matrix is called a column vector.

But why are we discussing matrices here? Well, all of the math we discussed above is actually represented in matrix form. The languages used to build and execute ML algorithms, like Python or MATLAB, provide rich libraries for matrix manipulation.
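As a small illustration of that matrix form, a whole layer of neurons can be written as one weight matrix (one row per neuron) multiplied by the input vector, plus a vector of biases. The sketch below uses plain Python lists so that nothing is hidden. The name `layer_output` and the numbers are made up.

```python
def layer_output(weight_matrix, inputs, biases):
    """Compute W*a + b with plain lists: one row of W and one bias per neuron."""
    return [
        sum(w * a for w, a in zip(row, inputs)) + b
        for row, b in zip(weight_matrix, biases)
    ]


# Two neurons, three inputs each (illustrative values)
W = [[0.5, -1.0, 0.25],
     [1.0,  0.0, -0.5]]
a = [1.0, 0.5, 2.0]
b = [0.25, 0.5]

z = layer_output(W, a, b)
print(z)  # [0.75, 0.5]
```

Each entry of `z` is the 'a*w+b' result for one neuron, so the layer is a single matrix-vector operation rather than a separate calculation per neuron.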

Why Matrix/Vectors?

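For a very simple reason: expressing the math as matrix operations is much more efficient than traditional programming constructs like looping and recursion, because a single matrix operation can be handed to a heavily optimized routine instead of being executed one element at a time. Most ML problems are statistical in nature, and linear algebra is the de facto choice.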
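Here is a rough sketch of the difference, assuming the NumPy library is installed. The hand-written function `matmul_loops` is for illustration only. It computes the same product as NumPy's `@` operator, but visits one element at a time from Python.

```python
import numpy as np


def matmul_loops(A, B):
    """Matrix multiplication written with explicit Python loops."""
    n, k = A.shape
    k2, m = B.shape
    if k != k2:
        raise ValueError("inner dimensions must match")
    C = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for p in range(k):
                C[i, j] += A[i, p] * B[p, j]
    return C


rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
A = rng.random((20, 30))
B = rng.random((30, 10))

C_loop = matmul_loops(A, B)      # 20 * 10 * 30 = 6000 steps in Python
C_vec = A @ B                    # one call into NumPy's optimized routine

print(np.allclose(C_loop, C_vec))
```

Both versions produce the same numbers. The gap in running time grows with the size of the matrices, and you can measure it on your own machine with the standard `timeit` module.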

How Does It All Tie to GPU?

GPUs differ from CPUs in that they are specifically designed for number crunching. Small vectors and matrices, especially of two, three or four dimensions, are natively supported by GPUs because graphics work relies on them. Operations like matrix multiplication (extensively used in ML) therefore execute faster. Moreover, a GPU contains far more cores than a CPU (hundreds or even thousands), and each can work on part of a matrix multiplication at the same time, increasing the per-second throughput. This is called parallelization.
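The reason matrix multiplication parallelizes so well is that every row of the result can be computed without waiting for any other row. The sketch below only illustrates that independence using CPU threads from Python's standard library. It is not GPU code, and Python threads will not actually make this faster. The function names are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def row_times_matrix(row, B):
    """Compute one row of the product; needs only this row of A and all of B."""
    cols = len(B[0])
    return [sum(row[p] * B[p][j] for p in range(len(B))) for j in range(cols)]


def parallel_matmul(A, B, workers=4):
    """Hand each row of A to a separate worker; map() keeps the rows in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: row_times_matrix(row, B), A))


A = [[1, 2], [3, 4], [5, 6]]
B = [[7, 8], [9, 10]]

C = parallel_matmul(A, B)
print(C)  # [[25, 28], [57, 64], [89, 100]]
```

A GPU takes the same idea much further by splitting the work into far smaller pieces and running them on its many cores at once.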

Does this mean the CPU is less powerful? Not really. On the contrary, a CPU core typically runs at a much higher clock frequency than a GPU core. But the CPU supports the more generic operations required to 'operate' your machine, while having a limited number of cores to do the processing.


With the advent of machine learning, the GPU has become a popular computing platform. It is indeed faster in raw calculation throughput, but only if used carefully. Don't forget that much of the data plumbing (loading data into and out of GPU memory) is done by the CPU. Therefore, to harness the full potential, algorithms have to be carefully designed to spend the majority of their time on the GPU.
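A back-of-the-envelope model shows why the data plumbing matters. The function `gpu_speedup` and every timing below are made up purely for illustration and are not measurements. The model simply assumes that the time on the GPU path is compute time plus transfer time.

```python
def gpu_speedup(cpu_seconds, gpu_compute_seconds, transfer_seconds):
    """Overall speedup once copying data to and from GPU memory is counted."""
    return cpu_seconds / (gpu_compute_seconds + transfer_seconds)


# Hypothetical job that takes 10 s on the CPU and 0.5 s of pure GPU compute
few_transfers = gpu_speedup(cpu_seconds=10.0, gpu_compute_seconds=0.5, transfer_seconds=0.5)
many_transfers = gpu_speedup(cpu_seconds=10.0, gpu_compute_seconds=0.5, transfer_seconds=4.5)

print(few_transfers)   # 10.0
print(many_transfers)  # 2.0
```

With the same GPU compute time, the job that keeps shuttling data back and forth ends up only 2x faster instead of 10x, which is why keeping the data on the GPU for as long as possible pays off.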
