CUDA Tile Now Available for BASIC!
CUDA 13.1 introduced CUDA Tile, a next-generation tile-based GPU programming paradigm designed to make fine-grained parallelism more accessible and flexible. One of its key strengths is language openness: any programming language can target CUDA Tile, enabling developers to bring tile-based GPU acceleration into a wide range of ecosystems. In response to overwhelming demand from seasoned developers everywhere, we’re releasing cuTile BASIC for GPUs, bringing CUDA Tile programming to this long-overlooked language.
What is cuTile BASIC?
cuTile BASIC is an expression of the CUDA Tile programming model in BASIC, built on top of the CUDA Tile IR specification. It enables you to write tile kernels in BASIC using a tile-based model, which is a natural fit for a programming language like BASIC that predates multi-threaded programming. cuTile BASIC is the perfect marriage of the power of GPUs with the anachronistic charm and syntactic simplicity of the BASIC programming language – an elegant language from a more pixelated era. Manually numbering your lines of code has never looked so good or run so fast!
Who is cuTile BASIC for?
BASIC is one of the oldest programming languages around and is revered by a whole generation of developers who fondly remember the sound of a 300 baud dial-up modem. For many such developers, BASIC was their first introduction to computer programming. Now, developers with BASIC still burned into their brains can bring legacy applications onto NVIDIA GPU-accelerated computing for the first time. This unlocks performance and functionality that the BASIC programming language could never have previously imagined, allowing your Lunar Lander to zip around the moon’s surface faster than an Artemis mission.
To get started, first install cuTile BASIC with PIP: pip install git+https://github.com/nvidia/cuda-tile.git@basic-experimental. The full hardware and software requirements for running cuTile BASIC are listed at the end of this post (64k of RAM or more recommended).
If you’ve learned CUDA C++, you’ve probably encountered the canonical vector addition kernel. A vector add kernel in CUDA C++ looks something like this: __global__ void vecAdd(float* A, float* B, float* C, int vectorLength) {...}. Now let’s look at the equivalent code written in cuTile BASIC. We don’t need to specify what each thread does. We only have to break the data into tiles and specify what mathematical operations should happen to these tiles. Everything else is handled for us.
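To make the tile-based idea concrete without depending on cuTile BASIC's actual syntax, here is a minimal CPU-only sketch in Python of the same vector add expressed over tiles rather than threads. The function name, tile size, and loop structure are illustrative assumptions, not the cuTile API: the point is that the author only describes what happens to each tile, and iterating over (or scheduling) the tiles is handled elsewhere.

```python
TILE_SIZE = 4  # illustrative tile width; a real tile kernel tunes this per GPU

def vec_add_tilewise(a, b, tile_size=TILE_SIZE):
    """Elementwise add expressed tile by tile (hypothetical sketch).

    Each iteration of the outer loop handles one whole tile, mirroring how
    a tile kernel describes math on tiles instead of per-thread behavior.
    """
    c = [0.0] * len(a)
    for start in range(0, len(a), tile_size):
        # The per-tile operation: add the corresponding slices of a and b.
        for i in range(start, min(start + tile_size, len(a))):
            c[i] = a[i] + b[i]
    return c

a = [float(i) for i in range(10)]
b = [1.0] * 10
print(vec_add_tilewise(a, b))
```

In the real model, the per-tile arithmetic is all the programmer writes; mapping tiles onto the GPU's execution resources is handled by the compiler and runtime.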
Now we’ll show how to run this vector add kernel in BASIC. The workflow is straightforward: the BASIC function is first compiled to a cubin, which is then launched on the GPU. If you have the proper versions of the CUDA Toolkit and Python installed and have cloned the cuTile BASIC repo from GitHub, you can execute the following command: $ python examples/vector_add.py. If your output matches, congratulations, you just ran your very first cuTile BASIC program!