Monday 25 May 2015

CUDA program using the constant memory of a GPU




1. Memory architecture of a GPU & Use of Constant memory
            For background, please see the post "What is CUDA? / Basics of CUDA / An overview of CUDA".

2. Overview
            In CUDA, a constant is declared by prefixing its definition with the keyword __constant__ (e.g. __constant__ int x;). There is no need to call cudaMalloc() for a constant: the GPU's constant memory is allocated to it automatically when it is declared with __constant__. To transfer the contents of a CPU variable to a GPU constant, we use cudaMemcpyToSymbol() instead of cudaMemcpy(). The direction argument of cudaMemcpyToSymbol() defaults to cudaMemcpyHostToDevice, i.e. a transfer from host (CPU) to device (GPU). That is why there is no need to pass cudaMemcpyHostToDevice explicitly when copying from the host.
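To make the default arguments concrete, here is a minimal sketch that writes out the full argument list of cudaMemcpyToSymbol() and then reads the constant back with cudaMemcpyFromSymbol(). The names offsets and roundTripConstant are hypothetical and chosen only for this illustration; the sketch assumes a CUDA-capable device and compilation with nvcc.

```cuda
#include <cuda_runtime.h>

// Constant array on the device. The name `offsets` is hypothetical.
__constant__ int offsets[5];

// Hypothetical host helper: uploads `src` into `offsets`, reads the constant
// back into `dst`, and returns true only if both copies succeed.
bool roundTripConstant(const int (&src)[5], int (&dst)[5])
{
    // Full form: symbol, source, byte count, byte offset into the symbol, direction.
    // The last two arguments default to 0 and cudaMemcpyHostToDevice.
    cudaError_t err = cudaMemcpyToSymbol(offsets, src, sizeof(src),
                                         0, cudaMemcpyHostToDevice);
    if (err != cudaSuccess) return false;

    // Reverse direction: copy the constant's contents back to host memory.
    err = cudaMemcpyFromSymbol(dst, offsets, sizeof(dst),
                               0, cudaMemcpyDeviceToHost);
    return err == cudaSuccess;
}
```

Calling roundTripConstant(b, check) with an array such as b = {1,2,3,4,5} should leave the same five values in check, which is a quick way to confirm that the upload reached constant memory.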



3. Program to add a constant array of size 5, {1,2,3,4,5}, to a user-entered array:
#include <stdio.h>
#include <cuda_runtime.h> /* runtime API: cudaMalloc, cudaMemcpy, cudaMemcpyToSymbol */

__constant__ int d[5];

__global__ void add(int *c)
{
    int id=threadIdx.x;

    c[id]=c[id]+d[id];
}

int main()
{
    int a[5];
    int b[5]={1,2,3,4,5};
    int *c;
    int i;

    printf("Enter the five elements of your array:\n");

    for(i=0;i<5;i++)
    {
        scanf("%d",&a[i]);
    }

    cudaMalloc((void **)&c,5*sizeof(int));

    cudaMemcpy(c,a,5*sizeof(int),cudaMemcpyHostToDevice);

    cudaMemcpyToSymbol(d,b,5*sizeof(int)); /*copying contents of array b to constant array d */

    add<<<1,5>>>(c);

    cudaMemcpy(a,c,5*sizeof(int),cudaMemcpyDeviceToHost);

    printf("Elements of your array after addition with constant array {1,2,3,4,5} :\n");

    for(i=0;i<5;i++)
    {
        printf("%d\t",a[i]);
    }

    cudaFree(c);
    /* d is a __constant__ symbol, not memory from cudaMalloc(), so it is not passed to cudaFree() */

    return 0;
}

4. Output:
Enter the five elements of your array:
2 3 4 5 6
Elements of your array after addition with constant array {1,2,3,4,5} :
3    5    7    9    11   
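
The program in section 3 does not look at the status codes that the CUDA runtime calls return. Below is a minimal sketch of the same addition with each call checked. The helper names ok and addConstantChecked are hypothetical, and this is just one possible way to structure the checks.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__constant__ int d[5];

__global__ void add(int *c)
{
    int id = threadIdx.x;
    c[id] = c[id] + d[id];
}

// Hypothetical helper: print a message if a runtime call failed.
static bool ok(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Hypothetical wrapper: adds constant array b to a in place on the GPU.
// Returns false if any CUDA call reports an error.
bool addConstantChecked(int (&a)[5], const int (&b)[5])
{
    int *c = nullptr;
    if (!ok(cudaMalloc((void **)&c, sizeof(a)), "cudaMalloc")) return false;

    bool success =
        ok(cudaMemcpy(c, a, sizeof(a), cudaMemcpyHostToDevice), "cudaMemcpy to device") &&
        ok(cudaMemcpyToSymbol(d, b, sizeof(b)), "cudaMemcpyToSymbol");

    if (success) {
        add<<<1, 5>>>(c);
        // cudaGetLastError() reports launch errors; the blocking cudaMemcpy
        // that follows waits for the kernel to finish before copying.
        success = ok(cudaGetLastError(), "kernel launch") &&
                  ok(cudaMemcpy(a, c, sizeof(a), cudaMemcpyDeviceToHost), "cudaMemcpy to host");
    }

    cudaFree(c); // only memory obtained from cudaMalloc() is freed
    return success;
}
```

With a = {2,3,4,5,6} and b = {1,2,3,4,5}, a successful call should leave a holding {3,5,7,9,11}, matching the output above.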



