gpu-shared-memory

Interpreting the verbose output of ptxas, part II

This question is a continuation of Interpreting the verbose output of ptxas, part I. When we compile a kernel .ptx file with ptxas -v, or compile it from a .cu file with -ptxas-options=-v, we get a few lines of output such as: ptxas info : Compiling entry function 'searchkernel(octree, int*,...
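As a quick illustration of where that output comes from, here is a minimal sketch (the kernel name and file name are made up for this example, not taken from the question) showing the two common ways of requesting the verbose ptxas resource report; the exact register and shared-memory numbers printed will vary with GPU architecture and compiler version:

// minimal_kernel.cu -- hypothetical kernel, used only to show how to
// request the verbose ptxas resource-usage report.
__global__ void scaleKernel(float *data, float factor, int n)
{
    // Static shared memory is declared only so the report has a
    // non-zero smem figure to show.
    __shared__ float tile[256];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        tile[threadIdx.x] = data[i] * factor;
        __syncthreads();
        data[i] = tile[threadIdx.x];
    }
}

// Compile from the .cu file with either of:
//   nvcc --ptxas-options=-v -c minimal_kernel.cu
//   nvcc -Xptxas -v -c minimal_kernel.cu
// or, for a standalone .ptx file:
//   ptxas -v minimal_kernel.ptx -o minimal_kernel.cubin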

GPU shared memory size is very small - what can I do about it?

The size of the shared memory ("local memory" in OpenCL terms) is only 16 KiB on most nVIDIA GPUs of today. I have an application in which I need to create an array that has 10,000 integers. So the amount of memory I will need to fit 10,000 integers = 10,000 * 4 B ≈ 40 KB. How can I work around this?...
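The usual workaround is not to stage the whole array in shared memory at once, but to process it in tiles that fit within the limit, keeping the full array in global memory. Below is a minimal sketch of that pattern, assuming the work to be done is a simple sum (chosen only for illustration; the tile size, kernel name, and reduction via atomics are all my own choices, not from the question):

#include <cuda_runtime.h>

// Hypothetical tile size: 2,048 ints = 8 KiB, comfortably under the
// 16 KiB shared-memory limit mentioned in the question.
#define TILE_SIZE 2048

// Each block walks over the array one TILE_SIZE chunk at a time,
// loads the chunk into shared memory, works on it, then moves on.
__global__ void tiledSum(const int *input, int *result, int n)
{
    __shared__ int tile[TILE_SIZE];
    int partial = 0;

    for (int base = 0; base < n; base += TILE_SIZE) {
        // Cooperative, strided load of the current tile.
        for (int i = threadIdx.x; i < TILE_SIZE && base + i < n; i += blockDim.x)
            tile[i] = input[base + i];
        __syncthreads();

        // Each thread consumes part of the tile from shared memory.
        for (int i = threadIdx.x; i < TILE_SIZE && base + i < n; i += blockDim.x)
            partial += tile[i];
        __syncthreads();   // finish this tile before overwriting it
    }

    // Naive accumulation of per-thread partials; atomics keep the sketch short.
    atomicAdd(result, partial);
}

If the access pattern is not tile-friendly, the simpler alternative is to leave the array in global memory entirely and rely on caching, using shared memory only for the hot working set.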