Showing posts with label CUDA. Show all posts

Saturday, 8 August 2020

CUDA Memory Architecture of GPU | CUDA GPU Architecture

                    In this post, we will see the CUDA memory architecture of a GPU.

Basics of CUDA Programming | CUDA Terminologies | Host, Device, Kernel, Thread, Block, Grid, Warp:
https://www.youtube.com/watch?v=lwA4SK-82rI

What is CUDA? / Basics of CUDA (Necessity of GPU, Host, Device, Kernel, Stream Multiprocessor, Stream Processor, Thread, Block, Grid, Warp, Memory architecture of GPU):
https://www.comrevo.com/2015/05/what-is-cuda-basics-gpu-host-device-kernel-stream-multiprocessor-thread-block-grid-warp-memory-architecture-global-shared-constant-texture-local-registers.html 

Watch on YouTube: https://www.youtube.com/watch?v=9zeAcO2Etlk

Friday, 7 August 2020

CUDA Vector Addition Program | Basics of CUDA Programming with CUDA Array Addition with All Cases

                      In this post, we will see a CUDA vector addition program, covering the basics of CUDA programming through array addition in all cases.

Blog link for Cuda program for addition of two one dimensional arrays:
https://www.comrevo.com/2015/05/cuda-program-for-addition-of-two-one-dimensional-arrays.html 

Watch on YouTube: https://www.youtube.com/watch?v=vo0eCxoAf68

Basics of CUDA Programming | CUDA Terminologies | Host, Device, Kernel, Thread, Block, Grid, Warp

                       In this post, we will see the basics of CUDA programming and its core terminologies: host, device, kernel, streaming multiprocessor, streaming processor, thread, block, grid, and warp.

Blog link for What is CUDA? / Basics of CUDA (Necessity of GPU, Host, Device, Kernel, Stream Multiprocessor, Stream Processor, Thread, Block, Grid, Warp, Memory architecture of GPU):
https://www.comrevo.com/2015/05/what-is-cuda-basics-gpu-host-device-kernel-stream-multiprocessor-thread-block-grid-warp-memory-architecture-global-shared-constant-texture-local-registers.html 

Watch on YouTube: https://www.youtube.com/watch?v=lwA4SK-82rI&t=948s

Thursday, 30 July 2020

How to run CUDA program on Google Colab | How to run CUDA program online | Run CUDA prog without GPU

                   In this post, we will see how to run a CUDA program on Google Colab, i.e., how to run a CUDA program online without having a GPU of your own.

Link for steps to set up Google Colab for CUDA Programming:
https://medium.com/@harshityadav95/how-to-run-cuda-c-or-c-on-google-colab-or-azure-notebook-ea75a23a5962 

Watch on YouTube: https://www.youtube.com/watch?v=gggQ9-_crmU

Wednesday, 25 July 2018

How to run CUDA Program on Remote Machine

                   In this post, we will see how to run a CUDA program on a remote machine, i.e., how to run a CUDA program from a system that has neither a GPU nor a CUDA installation, by using the GPU of another system on your network.

Sunday, 28 May 2017

Interview Questions on CUDA Programming

                  The following questions on CUDA programming are commonly asked in interviews. Go through them.

Thursday, 9 March 2017

OpenCL Program for Vector / Array Addition

          To learn parallel computing with OpenCL, you should start with the array-addition example, as it illustrates proper use of the multi-threading paradigm.
           In this post, we will see an OpenCL program for array/vector addition.

Monday, 25 May 2015

CUDA Multi-GPU : To set a GPU of required Compute Capability as current GPU

              Suppose your system has multiple CUDA-enabled GPUs. Only one GPU can be set as the current GPU at a time. In this situation, you can set a particular GPU with a specific compute capability (say, 1.2) as the current GPU.
              There are two ways to achieve this:
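A minimal sketch of both approaches, using the CUDA runtime API (the 1.2 values are just the example capability from above; error checking is omitted for brevity):

```cuda
#include <stdio.h>
#include <string.h>
#include <cuda_runtime.h>

int main(void)
{
    cudaDeviceProp prop;
    memset(&prop, 0, sizeof(prop));
    prop.major = 1;
    prop.minor = 2;

    // Way 1: let the runtime choose the closest matching device.
    int dev;
    cudaChooseDevice(&dev, &prop);
    cudaSetDevice(dev);
    printf("cudaChooseDevice picked device %d\n", dev);

    // Way 2: scan all devices yourself and pick an exact match.
    int count;
    cudaGetDeviceCount(&count);
    for (int i = 0; i < count; i++) {
        cudaGetDeviceProperties(&prop, i);
        if (prop.major == 1 && prop.minor == 2) {
            cudaSetDevice(i);      // make this GPU current
            break;
        }
    }
    return 0;
}
```

Note the difference: cudaChooseDevice returns the closest match even if no device has exactly that capability, while the manual loop only switches devices on an exact match.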

Cuda program using Constant memory of a GPU
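A minimal sketch of how constant memory is typically used (the array size, kernel, and variable names here are illustrative, not the post's exact code):

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

#define N 64

// Read-only data placed in the GPU's constant memory space.
__constant__ int coeff[N];

__global__ void scale(int *out)
{
    int i = threadIdx.x;
    out[i] = coeff[i] * 2;   // every thread reads from constant memory
}

int main(void)
{
    int h_coeff[N], h_out[N];
    for (int i = 0; i < N; i++) h_coeff[i] = i;

    // Constant memory is written from the host with cudaMemcpyToSymbol,
    // not with an ordinary cudaMemcpy.
    cudaMemcpyToSymbol(coeff, h_coeff, sizeof(h_coeff));

    int *d_out;
    cudaMalloc(&d_out, sizeof(h_out));
    scale<<<1, N>>>(d_out);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    printf("h_out[5] = %d\n", h_out[5]);   // 5 * 2 = 10
    cudaFree(d_out);
    return 0;
}
```

Constant memory is cached and broadcast efficiently when all threads in a warp read the same address, which makes it a good fit for small read-only coefficient tables like this one.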


Monday, 18 May 2015

Cuda program for matrix multiplication using shared memory

                  In this post, we will see CUDA matrix multiplication using shared memory, with code and a tutorial.

Watch on YouTube: https://www.youtube.com/watch?v=XeR400_QFXQ

            In the last post, we saw matrix multiplication in CUDA. In this post, we will see matrix multiplication using shared memory.
           Here, I have considered two matrices of sizes row1*col1 and row2*col2. The resultant (product) matrix will be of size row1*col2. That is why I have considered a two-dimensional grid of row1*col2 blocks: each block is responsible for calculating one value of the product. To find each value of the product, there are col1 (equal to row2) multiplications, so I gave each block col1 threads; one thread per multiplication. In short, the total number of blocks is row1*col2, and each block has col1 (or row2) threads.
           I have used a shared array p[] to store the intermediate multiplication values. In each block, each thread computes one value p[i]. __syncthreads() makes sure all threads have finished their computation, i.e., all values of p[] are available, before they are added together to produce one value of the product.
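The scheme described above can be sketched as a kernel like this (a reconstruction from the description, not the post's exact code; here thread 0 does the final summation after the barrier):

```cuda
// Grid:  row1 x col2 blocks, one block per output element.
// Block: col1 threads, one thread per multiplication.
// The shared array size (col1 ints) is passed at launch time.
__global__ void matmul_shared(const int *a, const int *b, int *c,
                              int col1, int col2)
{
    extern __shared__ int p[];      // one slot per thread

    int row = blockIdx.x;           // row index in the product
    int col = blockIdx.y;           // column index in the product
    int k   = threadIdx.x;          // which multiplication this thread does

    // Each thread computes one intermediate product.
    p[k] = a[row * col1 + k] * b[k * col2 + col];

    __syncthreads();                // wait until all p[] values exist

    // One thread sums the shared array into the output element.
    if (k == 0) {
        int sum = 0;
        for (int i = 0; i < col1; i++) sum += p[i];
        c[row * col2 + col] = sum;
    }
}

// Launch (sketch):
// dim3 grid(row1, col2);
// matmul_shared<<<grid, col1, col1 * sizeof(int)>>>(d_a, d_b, d_c, col1, col2);
```

Without the __syncthreads() barrier, thread 0 could start summing p[] before the other threads had written their products, giving wrong results.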

Thursday, 14 May 2015

Cuda: Finding Compute Capability of a GPU

What is Compute Capability?
           Compute capability represents the microarchitecture generation of a GPU: 1.x represents the Tesla generation, 2.x Fermi, 3.x Kepler, and 5.x Maxwell. The number before the decimal point is called the major version; it represents a significant change in the microarchitecture generation. The number after the decimal point is called the minor version; it represents a smaller change within that generation. It is just like Android versions: 4.x is called KitKat while 5.x is called Lollipop, and 4.1 and 4.2 represent smaller changes within KitKat.
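Querying the compute capability from a program is straightforward with the CUDA runtime API (a minimal sketch):

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int count;
    cudaGetDeviceCount(&count);
    for (int i = 0; i < count; i++) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        // prop.major and prop.minor together form the compute capability.
        printf("Device %d: %s, compute capability %d.%d\n",
               i, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```

On a single-GPU system this prints one line, e.g. a compute capability of 1.2 for the Tesla-generation card used elsewhere in these posts.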

Wednesday, 13 May 2015

Cuda program for multiplication of two matrices

                 For CUDA matrix multiplication using shared memory, see the next post.

Watch on YouTube: https://www.youtube.com/watch?v=XeR400_QFXQ

                 In this post on "Cuda program for multiplication of two matrices", I have considered two cases:
1. A two-dimensional grid of blocks with one thread per block.
2. One block with a two-dimensional arrangement of threads.
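The two cases differ only in how each output element's indices are derived; a sketch of both kernels (illustrative, not the post's exact code):

```cuda
// Case 1: a (row1 x col2) grid of blocks, one thread per block.
__global__ void matmul_blocks(const int *a, const int *b, int *c,
                              int col1, int col2)
{
    int row = blockIdx.x, col = blockIdx.y;
    int sum = 0;
    for (int k = 0; k < col1; k++)
        sum += a[row * col1 + k] * b[k * col2 + col];
    c[row * col2 + col] = sum;
}

// Case 2: one block with a (row1 x col2) arrangement of threads.
__global__ void matmul_threads(const int *a, const int *b, int *c,
                               int col1, int col2)
{
    int row = threadIdx.x, col = threadIdx.y;
    int sum = 0;
    for (int k = 0; k < col1; k++)
        sum += a[row * col1 + k] * b[k * col2 + col];
    c[row * col2 + col] = sum;
}

// Launches (sketch):
// matmul_blocks<<<dim3(row1, col2), 1>>>(d_a, d_b, d_c, col1, col2);
// matmul_threads<<<1, dim3(row1, col2)>>>(d_a, d_b, d_c, col1, col2);
```

Case 2 is limited by the maximum number of threads per block, so it only works for small matrices; case 1 scales to larger ones.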

CUDA program to add two matrices

                       In this post, we will see a CUDA program for the addition of two matrices.

Watch on YouTube: https://www.youtube.com/watch?v=jTAxCbcxwJA

Here, two cases are considered:
1. A two-dimensional grid of blocks with one thread per block.
2. One block with a two-dimensional arrangement of threads.
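Since matrix addition is purely element-wise, each kernel is one line; a sketch of both cases (illustrative, not the post's exact code):

```cuda
// Case 1: a (rows x cols) grid of blocks, one thread per block.
__global__ void add_blocks(const int *a, const int *b, int *c, int cols)
{
    int i = blockIdx.x * cols + blockIdx.y;   // flatten 2D block index
    c[i] = a[i] + b[i];
}

// Case 2: one block with a (rows x cols) arrangement of threads.
__global__ void add_threads(const int *a, const int *b, int *c, int cols)
{
    int i = threadIdx.x * cols + threadIdx.y; // flatten 2D thread index
    c[i] = a[i] + b[i];
}

// Launches (sketch):
// add_blocks<<<dim3(rows, cols), 1>>>(d_a, d_b, d_c, cols);
// add_threads<<<1, dim3(rows, cols)>>>(d_a, d_b, d_c, cols);
```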

Monday, 11 May 2015

Cuda program for addition of two one dimensional arrays

                              In this post, we will see a CUDA vector addition program, covering the basics of CUDA programming through array addition in all cases.

Watch on YouTube: https://www.youtube.com/watch?v=vo0eCxoAf68

                       The following programs were checked in Nsight Eclipse, the Eclipse IDE for C/C++ bundled with the CUDA libraries. The results were checked on a system with a GPU of compute capability 1.2, but you will get the same result for the following code on a GPU of any compute capability.

             Here, three cases are considered for the addition of two arrays:
1. n blocks and one thread per block.
2. 1 block and n threads in that block.
3. m blocks and n threads per block.

            The program's complete code and the respective output are given below.
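The three cases boil down to three ways of computing the global element index; a self-contained sketch (illustrative sizes and data, not the post's exact code):

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

#define N 8

// Case 1: n blocks, one thread per block.
__global__ void add_blocks(const int *a, const int *b, int *c)
{
    int i = blockIdx.x;
    c[i] = a[i] + b[i];
}

// Case 2: one block, n threads.
__global__ void add_threads(const int *a, const int *b, int *c)
{
    int i = threadIdx.x;
    c[i] = a[i] + b[i];
}

// Case 3: m blocks with n threads per block.
__global__ void add_grid(const int *a, const int *b, int *c)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    c[i] = a[i] + b[i];
}

int main(void)
{
    int h_a[N], h_b[N], h_c[N];
    for (int i = 0; i < N; i++) { h_a[i] = i; h_b[i] = 10 * i; }

    int *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, sizeof(h_a));
    cudaMalloc(&d_b, sizeof(h_b));
    cudaMalloc(&d_c, sizeof(h_c));
    cudaMemcpy(d_a, h_a, sizeof(h_a), cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, sizeof(h_b), cudaMemcpyHostToDevice);

    add_grid<<<2, N / 2>>>(d_a, d_b, d_c);   // case 3: 2 blocks of 4 threads

    cudaMemcpy(h_c, d_c, sizeof(h_c), cudaMemcpyDeviceToHost);
    for (int i = 0; i < N; i++) printf("%d ", h_c[i]);   // 0 11 22 ...
    printf("\n");

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    return 0;
}
```

Swapping add_grid for add_blocks<<<N, 1>>> or add_threads<<<1, N>>> gives the other two cases with the same result.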


Saturday, 9 May 2015

What is CUDA? / Basics of CUDA (Necessity of GPU, Host, Device, Kernel, Stream Multiprocessor, Stream Processor, Thread, Block, Grid, Warp, Memory architecture of GPU)

                     In this post, we will see the basics of CUDA programming and its core terminologies: host, device, kernel, streaming multiprocessor, streaming processor, thread, block, grid, and warp.

Watch on YouTube: https://www.youtube.com/watch?v=lwA4SK-82rI


Why do we need GPU when already we have CPU?
              GPU stands for Graphics Processing Unit, while CPU stands for Central Processing Unit. On a CPU with, say, four cores, we can run at most four threads simultaneously (one thread per core). In graphics (e.g., image) processing, many pixels are processed simultaneously, which requires many threads running at once. But, as noted, a CPU with its limited number of cores can run only a limited number of threads simultaneously. Hence we need a GPU, which consists of many more cores, especially for graphics processing (e.g., computer games).
               For your information, Nvidia's GeForce GTX TITAN X GPU consists of 3072 cores. For its specification, go through http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-titan-x/specifications.