Mojo: The new AI programming language
I stumbled across Mojo a few months ago, and while it seemed super interesting, I never found the time to really dig into it. Here I've tried to explain the motivation for building the language in simpler terms.
So the Mojo team realized that most traditional compiler tech isn't built for modern AI hardware. Compiler technologies like LLVM and GCC were designed for regular CPUs. Google's MLIR (Multi-Level Intermediate Representation), on the other hand, was designed to target domain-specific hardware. What makes MLIR special is its ability to handle "weird" domains: not just AI accelerators, but quantum computing systems, custom silicon, and so on. Mojo was built from the ground up to take full advantage of MLIR.
Why Python
They chose to build on the Python ecosystem because it's so widely used and embraced by the AI community. However, while they want full compatibility with Python, they also want complete control over low-level details like memory management. They don't want to fragment the ecosystem either, so existing Python code will keep running on CPython, while native Mojo code compiles through MLIR, giving developers the freedom to choose between dynamic and static behavior. They're also building a migration tool to help developers move code from Python to Mojo incrementally. One feature they mention is backticked identifiers, which let you use any name as a variable, even names that collide with Mojo keywords. Note: they're trying to make Mojo a "first-class" language in its own right, not just "Python but faster".
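To make that concrete, here's a minimal sketch of both ideas in Mojo. The `Python.import_module` interop call matches the Mojo docs; the backtick usage is my illustration of the feature as they describe it, so don't treat it as verified syntax.

```mojo
from python import Python


fn main() raises:
    # CPython interop: the imported library runs on the CPython
    # interpreter, while this surrounding Mojo code compiles via MLIR.
    var np = Python.import_module("numpy")
    var arr = np.arange(6).reshape(2, 3)
    print(arr)

    # Backticked identifier (my reading of the feature, not verified):
    # lets migrated Python code keep a name that is a Mojo keyword.
    var `fn` = "still a valid variable name"
    print(`fn`)
```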
Python's Problems: The Two-World Problem
They start by acknowledging Python's two-world problem. Python has bindings to C and C++ that let people build high-performance libraries, but writing those bindings requires low-level knowledge of CPython and of C/C++, and it pushes developers toward a "graph-based" programming model, where you define everything up front before running computations, rather than the "eager", run-it-as-you-write-it style Python programmers expect (think TensorFlow 1.x sessions versus PyTorch). There's also tooling fragmentation: debuggers can't step across the boundary between Python and C code, which makes debugging hybrid libraries painful.
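For contrast, Mojo's own answer to this split is to host both styles in one language. A minimal sketch, assuming current Mojo syntax: `def` keeps Python's dynamic feel, while `fn` in the same file is statically typed and compiled, so there's no foreign-language boundary for a debugger to step across.

```mojo
# The "Python world": dynamic, no type annotations required.
def add(a, b):
    return a + b


# The "C world": statically typed, checked and optimized at compile
# time, yet still the same language, file, and toolchain.
fn mul(a: Int, b: Int) -> Int:
    return a * b


def main():
    print(add(1, 2))
    print(mul(3, 4))
```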
The N-World Problem
This extends into a "three-world" or even "N-world" problem with accelerator platforms like CUDA, each with its own quirks and limitations. There are now several limited programming systems for accelerators (OpenCL, SYCL, oneAPI, and others), all of which worsen the fragmentation further.
Deployment Challenges
The team then talks about deployment problems in Python: dependency management, shipping "hermetically compiled a.out files", and improving multithreading and performance. "Hermetically compiled" means the program is completely self-contained and carries everything it needs to run. This matters a lot in ML pipelines, where even one library version mismatch can break an entire deployment.
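As a rough illustration, Mojo ships an ahead-of-time compiler: assuming the current CLI, `mojo build` turns a source file like the one below into a single native executable, the kind of self-contained artifact they're describing.

```mojo
# hello.mojo
# Build with (assuming the current Mojo CLI): mojo build hello.mojo
# The output is one self-contained native binary: no interpreter,
# no site-packages, nothing to version-mismatch at run time.
fn main():
    print("Hello from a self-contained Mojo binary")
```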
I'll probably end up writing a few more blogs once I get around to building stuff with it, so look out for those posts!