What it is
ClangIR is a new high-level Intermediate Representation (IR) for the Clang C/C++ compiler, originally incubated at Meta. It sits between the Clang AST and LLVM IR, where it enables better optimizations and static analysis.
When I started looking at it, ClangIR was still in its early incubation phase and CUDA support was entirely unimplemented. I implemented the CUDA lowering path: from the Clang AST to ClangIR, and from ClangIR down to LLVM IR.
What I did
I implemented the lowering logic needed to translate ClangIR constructs into LLVM IR that the NVPTX backend (LLVM's NVIDIA GPU backend) can consume. The core work involved (a small sketch follows the list):
- Host vs. Device Split: Writing the logic to correctly identify and separate host code (CPU) from device code (GPU) during the lowering phase.
- Variable Handling: Mapping global and local variables to their correct CUDA address spaces.
- Texture Support: Implementing support for CUDA-specific surface and texture types.
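To make the list concrete, here is a minimal, illustrative CUDA translation unit (my own sketch, not code from the patches) containing the constructs each of those items has to handle: the host/device split, variables in the constant, device, and shared address spaces, and CUDA texture/surface reference types.

```cuda
// Illustrative sketch only (not code from the patches): a small CUDA
// translation unit exercising the constructs the lowering has to handle.
#include <cuda_runtime.h>

__constant__ float scale;      // constant address space (addrspace 4 in NVPTX)
__device__   float bias;       // device/global address space (addrspace 1)

// Legacy texture/surface reference types: CUDA-specific types that the
// frontend must model specially (deprecated in recent toolkits, but they
// illustrate the point).
texture<float, 1, cudaReadModeElementType> texRef;
surface<void, 2> surfRef;

// Device code: emitted only when compiling for the GPU.
__global__ void apply(float *out, int n) {
  __shared__ float tile[256];  // shared address space (addrspace 3)
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    tile[threadIdx.x] = tex1Dfetch(texRef, i) * scale + bias;
    out[i] = tile[threadIdx.x];
  }
}

// Host code: on the CPU side, only the launch stub and runtime calls survive.
void launch(float *out, int n) {
  apply<<<(n + 255) / 256, 256>>>(out, n);
}
```

Lowering this correctly means emitting the kernel and the device-side variables only for the GPU, annotating each variable with its NVPTX address space, and turning the `<<<...>>>` launch on the host side into the corresponding stub and runtime calls.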
Notes / results
The goal wasn't just to write code, but to get it merged upstream.
- Validation: I used the PolyBench benchmark suite to test the completeness of the lowering (a representative kernel is sketched after this list).
- Upstreaming: All code was reviewed by the core ClangIR maintainers (including the project creator) and merged into the main incubator repository.
- Status: The groundwork is now in place for others to build more complex CUDA optimizations on top of ClangIR.
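For a sense of what the benchmark-driven testing exercised: the PolyBench GPU codes are small, regular numeric kernels. The sketch below is a GEMM-style kernel in that spirit (illustrative, not taken verbatim from the suite); kernels like this run the host/device split, kernel launches, and global-memory handling end to end.

```cuda
// GEMM-style kernel in the spirit of PolyBench's GPU benchmarks
// (illustrative only, not taken from the suite).
__global__ void gemm_kernel(int n, float alpha, float beta,
                            const float *A, const float *B, float *C) {
  int i = blockIdx.y * blockDim.y + threadIdx.y;  // row
  int j = blockIdx.x * blockDim.x + threadIdx.x;  // column
  if (i < n && j < n) {
    float acc = 0.0f;
    for (int k = 0; k < n; ++k)
      acc += A[i * n + k] * B[k * n + j];
    C[i * n + j] = alpha * acc + beta * C[i * n + j];
  }
}
```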