Towards Calculating HPC CUDA Kernel Performance on NVIDIA GPUs