airpack.deploy.trt_utils
Utility code for performing inference using the TensorRT framework with settings optimized for AIR-T hardware.
Module Contents
airpack.deploy.trt_utils.make_cuda_context(gpu_index=0)

Initializes a CUDA context for use with the selected GPU and makes it active. The context is created with a set of flags that allow the use of device-mapped (pinned) memory, which supports zero-copy operations on the AIR-T.

- Parameters
  - gpu_index (int) – Which GPU in the system to use; defaults to the first GPU (index 0)
- Return type
  - pycuda.driver.Context
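For reference, a context with these properties can be created in pycuda along the following lines. This is a minimal sketch of the pattern the description implies, not the module's actual implementation, and the function name here is hypothetical:

```python
import pycuda.driver as cuda

def make_cuda_context_sketch(gpu_index: int = 0) -> cuda.Context:
    """Sketch: create and activate a context that permits device-mapped memory."""
    cuda.init()
    device = cuda.Device(gpu_index)
    # MAP_HOST allows host allocations made with the DEVICEMAP flag to be
    # mapped into the device address space (zero-copy on Jetson-class GPUs).
    return device.make_context(flags=cuda.ctx_flags.MAP_HOST)
```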
 
class airpack.deploy.trt_utils.MappedBuffer(num_elems, dtype)

A device-mapped memory buffer for sharing data between CPU and GPU. Once created, the `host` field can be used to access the memory from the CPU as a `numpy.ndarray`, and the `device` field can be used to access the memory from the GPU.

Example usage:

```python
# Create a buffer of 16 single-precision floats
buffer = MappedBuffer(num_elems=16, dtype=numpy.float32)

# Zero the buffer by writing to it on the CPU
buffer.host[:] = 0.0

# Pass the device pointer to an API that works with GPU buffers
func_that_uses_gpu_buffer(buffer.device)
```

Note

Device-mapped memory is meant for Jetson embedded GPUs like the one found on the AIR-T, where both the host and device pointers refer to the same physical memory. Using this type of memory buffer on desktop GPUs will be very slow.

- Parameters
  - num_elems (int) – Number of elements in the created buffer
  - dtype (numpy.dtype) – Data type of an element (e.g., `numpy.float32` or `numpy.int16`)
- Variables
  - host (numpy.ndarray) – Access to the buffer from the CPU
  - device (CUdeviceptr) – Access to the buffer from the GPU
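For illustration, the host/device pairing described above can be built from pycuda's page-locked, device-mapped allocations. The following is a minimal sketch, assuming an active context created with the MAP_HOST flag (as make_cuda_context provides); the class name is hypothetical and this is not the module's actual implementation:

```python
import numpy
import pycuda.driver as cuda

class MappedBufferSketch:
    """Sketch of a device-mapped buffer: one physical allocation, two views."""

    def __init__(self, num_elems: int, dtype: numpy.dtype) -> None:
        # Page-locked host allocation that is also mapped into the GPU's
        # address space (requires a context created with MAP_HOST).
        self.host = cuda.pagelocked_empty(
            num_elems, dtype, mem_flags=cuda.host_alloc_flags.DEVICEMAP
        )
        # Device pointer aliasing the same physical memory as self.host,
        # suitable for passing to kernels or TensorRT bindings.
        self.device = self.host.base.get_device_pointer()
```

Because both views refer to the same physical memory on Jetson-class hardware, writes made through `host` are visible to the GPU without an explicit copy, which is what makes the zero-copy inference path possible.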