A warp is the group of threads from a block that an SM (streaming multiprocessor) executes together, in lockstep. On NVIDIA GPUs a warp holds 32 threads.
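Suppose a block has 128 threads. That block runs on a single SM, which in this example has only 8 SPs (streaming processors). In any one clock cycle, 8 SPs can start an instruction for only 8 threads. If you picture each SP as an instruction pipeline with four phases (say Fetch, Decode, Execute, Write-back), then 4 threads can be in flight on one SP at a time. Hence the number of threads in flight together on that SM is 8 * 4 = 32. These 32 threads form one warp. The warp size itself is fixed at 32 by the hardware, so treat the pipeline picture as a way to remember the number rather than an exact description of an SP. Since the block in our example contains 128 threads, it is split into 128 / 32 = 4 warps. The SM then interleaves these warps: for example the first warp runs, then the second, third and fourth, and after that the first again, and so on. The order is not guaranteed, because the warp scheduler picks whichever warp is ready to run.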
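The warp arithmetic above can be written as a small sketch in plain C++. The helper names (`warps_per_block`, `warp_id`, `lane_id`) and the constant `kWarpSize` are made up for this illustration and are not part of any CUDA API. The sketch assumes a one-dimensional block and a warp size of 32.

```cpp
// Illustrative helpers only; these names are hypothetical, not CUDA API.
// Assumes a one-dimensional block and a warp size of 32 threads.
constexpr int kWarpSize = 32;

// Number of warps needed for a block, rounding up so that a
// partially filled last warp still counts as one warp.
constexpr int warps_per_block(int threads_per_block) {
    return (threads_per_block + kWarpSize - 1) / kWarpSize;
}

// Index of the warp that a thread belongs to, given the thread's
// linear index inside its block.
constexpr int warp_id(int thread_index) {
    return thread_index / kWarpSize;
}

// Position of the thread inside its warp, in the range 0..31.
constexpr int lane_id(int thread_index) {
    return thread_index % kWarpSize;
}
```

For the 128-thread block in the example, `warps_per_block(128)` gives 4, and thread 40 lands in warp 1 at lane 8. Inside a CUDA kernel the same calculation would typically use `threadIdx.x` and the built-in `warpSize` variable instead of these constants.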