CuSharp - A GPU Compute Framework for .NET

Benz, Jason and Locher, Adrian (2023) CuSharp - A GPU Compute Framework for .NET. Other thesis, OST Ostschweizer Fachhochschule.

Abstract

The number of computationally intensive applications is growing. For many easily parallelizable problems, GPUs offer better performance than CPUs. As a result, GPUs are now used not only for graphical applications, but also for machine learning and cryptography. GPU-accelerated programs have traditionally been written in C or C++ for high-performance applications such as physics simulations and graphics, and more recently in Python to optimize machine learning algorithms. Most GPU APIs, including Nvidia CUDA, restrict developers to C, C++, or Python when writing programs that target them.

In this thesis, a framework called CuSharp was developed that allows developers to build and run GPU-executable kernels directly in C#. This is achieved by combining existing toolchains with a purpose-built cross-compiler. In a first step, the Roslyn compiler compiles the C# kernel to Microsoft Intermediate Language (MSIL). Subsequently, the CuSharp compiler cross-compiles MSIL to NVVM IR, a low-level but platform-independent intermediate representation. Finally, the NVVM compiler library (libNVVM) translates NVVM IR to PTX ISA, an assembly-like language for Nvidia GPUs. Because all device-specific compiler settings are encapsulated, future development efforts can add support for devices from manufacturers other than Nvidia.
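The three compilation stages described above can be summarized as a small data model. The following Python sketch is only an illustration of how the stages chain together; it does not invoke Roslyn, the CuSharp compiler, or libNVVM, and the names `Stage`, `PIPELINE`, and `representations` are hypothetical and not part of CuSharp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    tool: str    # component named in the abstract
    source: str  # representation the stage consumes
    target: str  # representation the stage produces


# Stages as listed in the abstract. This is a description only:
# no compiler is called anywhere in this sketch.
PIPELINE = [
    Stage("Roslyn compiler", "C#", "MSIL"),
    Stage("CuSharp compiler", "MSIL", "NVVM IR"),
    Stage("libNVVM", "NVVM IR", "PTX ISA"),
]


def representations(pipeline):
    """Return the chain of representations, checking that each stage
    consumes what the previous stage produced."""
    chain = [pipeline[0].source]
    for stage in pipeline:
        if stage.source != chain[-1]:
            raise ValueError(
                f"{stage.tool} expects {stage.source}, got {chain[-1]}"
            )
        chain.append(stage.target)
    return chain


print(" -> ".join(representations(PIPELINE)))
```

Modelling each stage with an explicit source and target makes it visible that only the last stage is specific to Nvidia devices, which is where support for other devices would have to plug in.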

CuSharp supports the compilation of static methods, written in a specific C# subset, either just-in-time or ahead-of-time. The resulting PTX ISA kernels’ performance was benchmarked and compared to kernels compiled by the Nvidia CUDA Compiler (NVCC). For a kernel that computes matrix multiplications, a performance slowdown between 1.4% and 4.8% was measured for CuSharp-compiled kernels compared to NVCC-compiled kernels. The measurements refer to the execution time of the kernel on the graphics card, excluding the compilation process and the data transfers.
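To make the reported percentages concrete, the following Python sketch shows one common way to compute a relative slowdown from two kernel execution times. It assumes the slowdown is defined as the extra time relative to the NVCC baseline; the abstract does not state the exact formula, and the timings used here are hypothetical placeholders, not measurements from the thesis.

```python
def relative_slowdown(t_candidate_ms, t_baseline_ms):
    """Relative slowdown of a candidate over a baseline, in percent.

    Assumed definition: (candidate - baseline) / baseline * 100.
    """
    if t_baseline_ms <= 0:
        raise ValueError("baseline time must be positive")
    return (t_candidate_ms - t_baseline_ms) / t_baseline_ms * 100.0


# Hypothetical kernel execution times in milliseconds (not thesis data).
nvcc_ms = 100.0
cusharp_ms = 103.0

print(f"{relative_slowdown(cusharp_ms, nvcc_ms):.1f}%")
```

Under this definition, a positive value means the candidate kernel ran longer than the baseline, and a value of zero means both kernels took the same time.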

This thesis shows the challenges of interfacing with the LLVM compiler infrastructure and the Nvidia CUDA API. In addition, it provides an overview of the complex landscape of APIs that can be used to interface with GPU devices in general, by comparing their toolchains and languages. Furthermore, it demonstrates that GPU kernels can be written in a high-level language such as C#, which is widely used in the industry, while suffering only minor performance degradation.

Item Type: Thesis (Other)
Subjects: Topics > Software
Topics > Software > Performance
Area of Application > Development Tools
Technologies > Programming Languages > C#
Technologies > Programming Languages > C++
Technologies > Frameworks and Libraries
Technologies > Frameworks and Libraries > .NET
Technologies > Parallel Computing
Technologies > Parallel Computing > CUDA (Compute Unified Device Architecture)
Brands > nVidia
Divisions: Bachelor of Science FHO in Informatik > Bachelor Thesis
Depositing User: OST Deposit User
Contributors: Kramer, Philipp (Thesis advisor; email unspecified)
Date Deposited: 21 Oct 2023 12:18
Last Modified: 21 Oct 2023 12:18
URI: https://eprints.ost.ch/id/eprint/1156
