Benz, Jason and Locher, Adrian (2023) CuSharp - A GPU Compute Framework for .NET. Other thesis, OST Ostschweizer Fachhochschule.
FS 2023-BA-EP-Benz-Locher-CuSharp - A GPU Compute Framework for .NET.pdf - Supplemental Material
Abstract
The number of computationally intensive applications is growing. For many easily parallelizable problems, GPUs offer better performance than CPUs. As a result, GPUs are now used not only for graphical applications, but also for machine learning and cryptography. GPU-accelerated programs have traditionally been written in C and C++ for high-performance applications such as physics simulations and graphics, and more recently in Python to optimize machine learning algorithms. Most GPU APIs, including Nvidia CUDA, restrict developers to writing programs targeting those APIs in C, C++, or Python.
In this thesis, a framework called CuSharp was developed that allows developers to build and run GPU-executable kernels directly in C#. This is achieved by combining existing toolchains with a purpose-built cross-compiler. First, the Roslyn compiler compiles the C# kernel to Microsoft Intermediate Language (MSIL). The CuSharp compiler then cross-compiles MSIL to NVVM IR, a low-level but platform-independent intermediate representation. Finally, the NVVM compiler library (libNVVM) translates NVVM IR to PTX ISA, an assembly-like language for Nvidia GPUs. Because all device-specific compiler settings are encapsulated, future development efforts can add support for devices other than those manufactured by Nvidia.
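To make the pipeline concrete, the sketch below illustrates what a kernel written directly in C# might look like before being compiled to MSIL, NVVM IR, and finally PTX. The attribute name, the `KernelTools` thread-index accessors, and the overall surface shown here are illustrative assumptions rather than the actual CuSharp API; only the general idea of a static C# method serving as a GPU kernel is taken from the thesis.

```csharp
// Illustrative sketch only: [Kernel] and KernelTools.* are assumed names,
// not the verified CuSharp API.
public static class Kernels
{
    [Kernel] // hypothetical marker telling the cross-compiler to translate this method
    public static void AddVectors(int[] a, int[] b, int[] result, int n)
    {
        // hypothetical accessors for block/thread indices, analogous to CUDA's
        // blockIdx.x * blockDim.x + threadIdx.x
        int i = KernelTools.BlockIndex.X * KernelTools.BlockDimension.X
                + KernelTools.ThreadIndex.X;
        if (i < n)
        {
            result[i] = a[i] + b[i];
        }
    }
}
```

In such a design, the method body stays ordinary C#, so Roslyn can compile it to MSIL unchanged; the cross-compiler only needs to recognize the kernel marker and map the index accessors onto the corresponding NVVM intrinsics.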
CuSharp supports the compilation of static methods, written in a specific C# subset, either just-in-time or ahead-of-time. The performance of the resulting PTX ISA kernels was benchmarked against kernels compiled by the Nvidia CUDA Compiler (NVCC). For a matrix multiplication kernel, a slowdown between 1.4% and 4.8% was measured for CuSharp-compiled kernels compared to NVCC-compiled kernels. The measurements cover only kernel execution time on the graphics card, excluding compilation and data transfers.
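As an illustration of the kind of kernel that was benchmarked, a naive matrix multiplication written as a static method in a C# subset might look like the following. The `KernelTools` accessor names are assumptions, and the actual benchmark kernel in the thesis may be structured differently (for example, using tiling or shared memory).

```csharp
// Naive sketch; KernelTools.* are assumed accessor names, not the verified CuSharp API.
public static class MatrixKernels
{
    // Computes c = a * b for square n x n matrices stored in row-major order.
    public static void MatrixMultiply(float[] a, float[] b, float[] c, int n)
    {
        int row = KernelTools.BlockIndex.Y * KernelTools.BlockDimension.Y
                  + KernelTools.ThreadIndex.Y;
        int col = KernelTools.BlockIndex.X * KernelTools.BlockDimension.X
                  + KernelTools.ThreadIndex.X;
        if (row < n && col < n)
        {
            float sum = 0.0f;
            for (int k = 0; k < n; k++)
            {
                sum += a[row * n + k] * b[k * n + col];
            }
            c[row * n + col] = sum;
        }
    }
}
```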
This thesis shows the challenges of interfacing with the LLVM compiler infrastructure and the Nvidia CUDA API. In addition, it provides an overview of the complex landscape of APIs that can be used to interface with GPU devices in general, by comparing their toolchains and languages. Furthermore, it demonstrates that GPU kernels can be written in a high-level language such as C#, which is widely used in the industry, while suffering only minor performance degradation.
| Item Type: | Thesis (Other) |
|---|---|
| Subjects: | Topics > Software; Topics > Software > Performance; Area of Application > Development Tools; Technologies > Programming Languages > C#; Technologies > Programming Languages > C++; Technologies > Frameworks and Libraries; Technologies > Frameworks and Libraries > .NET; Technologies > Parallel Computing; Technologies > Parallel Computing > CUDA (Compute Unified Device Architecture); Brands > nVidia |
| Divisions: | Bachelor of Science FHO in Informatik > Bachelor Thesis |
| Depositing User: | OST Deposit User |
| Contributors: | Thesis advisor: Kramer, Philipp |
| Date Deposited: | 21 Oct 2023 12:18 |
| Last Modified: | 21 Oct 2023 12:18 |
| URI: | https://eprints.ost.ch/id/eprint/1156 |