File kernel.h
Kernel functions.
Functions
-
int
GpuKernel_init(GpuKernel *k, gpucontext *ctx, unsigned int count, const char **strs, const size_t *lens, const char *name, unsigned int argcount, const int *types, int flags, char **err_str)
Initialize a kernel structure.
lens holds the size of each source string. If it is NULL or an element has a value of 0, the length will be determined using strlen() or equivalent code.
If *err_str is not NULL on return, it must be free()d by the caller.
- Parameters
k: a kernel structure
ctx: context in which to build the kernel
count: number of source code strings
strs: C array of source code strings
lens: C array with the size of each string or NULL
name: name of the kernel function
argcount: number of kernel arguments
types: typecode for each argument
flags: kernel use flags (see ga_usefl)
err_str: (if not NULL) location to write GPU-backend provided debug info
- Return
GA_NO_ERROR if the operation is successful
- Return
any other value if an error occurred
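The length rule for lens can be sketched in plain C. source_length below is a hypothetical helper written for illustration only. It is not declared in kernel.h, and the library's own code may differ.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of kernel.h): the length that would be
 * used for source string i under the rule described for lens.
 */
static size_t source_length(const char **strs, const size_t *lens,
                            unsigned int i) {
  if (lens == NULL || lens[i] == 0)
    return strlen(strs[i]);  /* fall back to the NUL terminator */
  return lens[i];            /* explicit length supplied by the caller */
}
```

Passing NULL for lens is therefore the simplest choice when every source string is NUL-terminated.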
-
void
GpuKernel_clear(GpuKernel *k)
Clear and release data associated with a kernel.
- Parameters
k: the kernel to release
-
gpucontext *
GpuKernel_context(GpuKernel *k)
Returns the context in which a kernel was built.
- Return
a context pointer
- Parameters
k: a kernel
-
int
GpuKernel_sched(GpuKernel *k, size_t n, size_t *gs, size_t *ls)
Schedule the local and global size for a kernel.
This function will find an optimal grid and block size for the number of elements specified in n when running kernel k. The chosen sizes may launch slightly more instances than n for efficiency reasons, so your kernel must be prepared to handle the extra instances.
If either gs or ls is not 0 on entry, its value will not be altered and will be taken into account when choosing the other value.
- Parameters
k: the kernel to schedule for
n: number of elements to handle
gs: grid size (in/out)
ls: local size (in/out)
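The in/out contract for gs and ls, and the need to guard against extra instances, can be illustrated with a CPU-only sketch. sketch_sched, run_guarded, ceil_div and the max_ls limit are all hypothetical names introduced here. This is not the algorithm GpuKernel_sched uses, only one simple policy that honors the same contract.

```c
#include <stddef.h>

/* Hypothetical helper: ceil(a / b) for b > 0. */
static size_t ceil_div(size_t a, size_t b) {
  return a / b + (a % b != 0);
}

/*
 * Hypothetical stand-in for the gs/ls contract: a nonzero value on entry
 * is kept, a zero value is filled in. max_ls is an assumed device limit
 * and is not a parameter of the real function.
 * Returns 0 on success, -1 if n elements cannot be covered.
 */
static int sketch_sched(size_t n, size_t max_ls, size_t *gs, size_t *ls) {
  size_t g = *gs, l = *ls;
  if (n == 0 || max_ls == 0) return -1;
  if (l == 0)
    l = (g == 0) ? (n < max_ls ? n : max_ls) : ceil_div(n, g);
  if (l > max_ls) return -1;
  if (g == 0) g = ceil_div(n, l);
  if (g * l < n) return -1;
  *gs = g;
  *ls = l;
  return 0;
}

/*
 * CPU simulation of a launch of gs * ls instances. Each instance
 * increments its own element and skips indices past n.
 * Returns the number of extra instances that were skipped.
 */
static size_t run_guarded(size_t n, size_t gs, size_t ls, int *data) {
  size_t skipped = 0;
  for (size_t block = 0; block < gs; block++) {
    for (size_t thread = 0; thread < ls; thread++) {
      size_t i = block * ls + thread;
      if (i >= n) { skipped++; continue; }  /* the guard a kernel needs */
      data[i] += 1;
    }
  }
  return skipped;
}
```

With n = 1000 and an assumed limit of 256, this policy picks 4 blocks of 256, so 24 instances fall past the end of the data and must do nothing.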
-
int
GpuKernel_call(GpuKernel *k, unsigned int n, const size_t *gs, const size_t *ls, size_t shared, void **args)
Launch the execution of a kernel.
- Parameters
k: the kernel to launch
n: dimensionality of the grid/blocks
gs: sizes of launch grid
ls: sizes of launch blocks
shared: amount of dynamic shared memory to allocate
args: table of pointers to arguments
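Since n is the dimensionality, gs and ls are each read as arrays of n entries. The sketch below assumes that gs[d] counts blocks along dimension d and ls[d] counts instances per block, as in CUDA-style launches. The page does not state this explicitly. total_instances is a hypothetical helper, not part of kernel.h.

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of kernel instances implied by an
 * n-dimensional launch, assuming gs[d] is a block count and ls[d] is a
 * per-block size along dimension d.
 */
static size_t total_instances(unsigned int n, const size_t *gs,
                              const size_t *ls) {
  size_t total = 1;
  for (unsigned int d = 0; d < n; d++)
    total *= gs[d] * ls[d];
  return total;
}
```

Under that assumption, a 2-dimensional launch with gs = {4, 2} and ls = {32, 8} runs 2048 instances.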
-
struct
GpuKernel
#include <kernel.h>
Kernel information structure.