‘Codon’ Compiles Python to Native Machine Code That’s Sometimes Even Faster Than C

Codon is a new “high-performance Python compiler that compiles Python code to native machine code without any runtime overhead,” according to its README file on GitHub.

Typical speedups over Python are on the order of 10-100x or more, on a single thread. Codon’s performance is typically on par with (and sometimes better than) that of C/C++. Unlike Python, Codon supports native multithreading, which can lead to speedups many times higher still.
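To make the kind of workload behind those numbers concrete, here is a minimal sketch of a single-threaded, loop-heavy script, loosely modeled on the prime-counting example in Codon's README. The function names and the limit of 5000 are illustrative choices, not taken from the README. The script runs unchanged under CPython. The README appears to describe running such a file with `codon run -release` and marking a `for` loop with a `@par` annotation for multithreading; that annotation is mentioned only in a comment here because it is not valid CPython syntax. No timings are claimed.

```python
from time import perf_counter


def is_prime(n: int) -> bool:
    """Trial division: deliberately naive, so the inner loop dominates."""
    if n < 2:
        return False
    for i in range(2, n):
        if n % i == 0:
            return False
    return True


def count_primes(limit: int) -> int:
    """Count the primes in the half-open range [2, limit)."""
    total = 0
    # Assumption: under Codon, this is the loop one would annotate for
    # multithreading (the README shows `@par` placed directly above a `for`).
    for i in range(2, limit):
        if is_prime(i):
            total += 1
    return total


if __name__ == "__main__":
    start = perf_counter()
    found = count_primes(5000)
    elapsed = perf_counter() - start
    print(f"{found} primes below 5000 in {elapsed:.3f} s")
```

Running the same file under both the standard interpreter and Codon would be one simple way to check the claimed speedup on your own machine.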

Its development team includes researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), according to this announcement from MIT shared by long-time Slashdot user Futurepower(R):

The compiler lets developers create new domain-specific languages (DSLs) within Python — which is typically orders of magnitude slower than languages like C or C++ — while still getting the performance benefits of those other languages. “We realized that people don’t necessarily want to learn a new language, or a new tool, especially those who are nontechnical. So we thought, let’s take Python syntax, semantics, and libraries and incorporate them into a new system built from the ground up,” says Ariya Shajii SM ’18, PhD ’21, lead author on a new paper about the team’s new system, Codon. “The user simply writes Python like they’re used to, without having to worry about data types or performance, which we handle automatically — and the result is that their code runs 10 to 100 times faster than regular Python. Codon is already being used commercially in fields like quantitative finance, bioinformatics, and deep learning.”

The team put Codon through some rigorous testing, and it punched above its weight. Specifically, they took roughly 10 commonly used genomics applications written in Python and compiled them using Codon, and achieved five to 10 times speedups over the original hand-optimized implementations…. The Codon platform also has a parallel backend that lets users write Python code that can be explicitly compiled for GPUs or multiple cores, tasks which have traditionally required low-level programming expertise…. Part of the innovation with Codon is that the tool does type checking before running the program. That lets the compiler convert the code to native machine code, which avoids all of the overhead that Python has in dealing with data types at runtime.
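The point about type checking before the program runs can be illustrated with plain Python. The sketch below is hypothetical (the function `total_length` and the sample data are invented for illustration) and shows what the standard interpreter does today: it looks up the type of each element at run time, so a type mismatch surfaces only when the offending element is reached. A compiler that checks types ahead of time, as the announcement describes, would be expected to reject the mixed list before running anything; the exact diagnostic Codon prints is not shown here because it is not given in the article.

```python
def total_length(items):
    """Sum the lengths of all items."""
    total = 0
    for item in items:
        # CPython resolves len() for each element while the loop runs.
        total += len(item)
    return total


def run_demo():
    """Return the result for a well-typed call and the error for a mixed one."""
    ok = total_length(["ACGT", "GG"])  # homogeneous list of strings
    try:
        total_length(["ACGT", 7])  # mixed str/int list
    except TypeError as exc:
        failure = f"runtime TypeError: {exc}"
    else:
        failure = None
    return ok, failure


if __name__ == "__main__":
    print(run_demo())
```

Under CPython the first call succeeds and the second fails partway through the loop, after some work has already been done. Moving that check to compile time is what lets a compiler emit machine code specialized for one element type.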

“Python is the language of choice for domain experts that are not programming experts. If they write a program that gets popular, and many people start using it and run larger and larger datasets, then the lack of performance of Python becomes a critical barrier to success,” says Saman Amarasinghe, MIT professor of electrical engineering and computer science and CSAIL principal investigator. “Instead of needing to rewrite the program using a C-implemented library like NumPy or totally rewrite in a language like C, Codon can use the same Python implementation and give the same performance you’ll get by rewriting in C. Thus, I believe Codon is the easiest path forward for successful Python applications that have hit a limit due to lack of performance.”

The other piece of the puzzle is the optimizations in the compiler. The genomics plugin, for example, performs its own set of optimizations specific to that computing domain, which involves working with genomic sequences and other biological data. The result is an executable file that runs at the speed of C or C++, or even faster once domain-specific optimizations are applied.
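For readers unfamiliar with the domain, the following is a small sketch of the sort of sequence-processing code such a plugin would target. It is ordinary Python and does not use any Codon or genomics-plugin API; the function names, the sample sequence, and the choice of k are all made up for illustration.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every length-k substring (k-mer) of a DNA sequence."""
    counts = Counter()
    for start in range(len(sequence) - k + 1):
        counts[sequence[start:start + k]] += 1
    return counts


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence over A, C, G, T."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(sequence))


if __name__ == "__main__":
    dna = "ACGTACG"  # made-up sample sequence
    print(count_kmers(dna, 3).most_common(1))
    print(reverse_complement(dna))
```

Loops like these slice and hash many short strings, which is cheap to write in Python but slow to interpret; that is the kind of pattern a domain-aware compiler has room to specialize.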

