In our bii internal series we've walked through the process of converting our Python code into C code and compiling it as a native Python extension that we can distribute for different platforms. One of the major drawbacks of using native code is that we can't support every system but, on the other hand, we gain efficiency and have more control over the environments where the app runs. We've run some benchmarks to see how much faster biicode processes projects with the native extensions than when running everything as pure Python code.
First of all, let's explain a bit what biicode does on every operation. It starts with a check-in: it reads the files from the hard disk, checks whether they've changed, and caches them. Then, if any files have changed, it processes your project, which means that it parses the source code searching for dependencies, then analyzes those dependencies, the configuration, etc. The final step is checking out to disk the file changes and the external dependencies that were already in the local cache.
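To make the check-in step more concrete, here is a minimal sketch of change detection by content hash. It is not biicode's actual implementation: the function name `check_in`, the dictionary cache, and the choice of SHA-1 are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def check_in(root, cache):
    """Read every file under root, refresh the cache, return changed paths.

    cache is a hypothetical in-memory dict: relative path -> SHA-1 hex digest
    of the content seen on the previous check-in.
    """
    changed = []
    seen = set()
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        seen.add(rel)
        digest = hashlib.sha1(path.read_bytes()).hexdigest()
        if cache.get(rel) != digest:
            # New file or modified content: remember it and report it.
            cache[rel] = digest
            changed.append(rel)
    # Files that disappeared from disk also count as changes.
    for rel in sorted(set(cache) - seen):
        del cache[rel]
        changed.append(rel)
    return changed
```

If the returned list is empty, the expensive processing step can be skipped, which is the idea behind the cheap "reprocess" path.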
We've tested how biicode processes different libraries, running Python code vs. cythonized code. We've measured the following times:
- Check-In: Time to read all files and cache them in memory.
- Process: Time to parse code, and analyze dependencies.
- Reprocess: Time to reprocess files without changes.
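A per-phase measurement like this can be sketched with `time.perf_counter`. The helper `time_phases` and the `fake_*` workloads below are hypothetical stand-ins, not biicode functions; swap in the real calls to measure an actual run.

```python
import time


def time_phases(phases):
    """Run each (name, callable) pair in order and return {name: seconds}."""
    timings = {}
    for name, run in phases:
        start = time.perf_counter()
        run()
        timings[name] = time.perf_counter() - start
    return timings


# Hypothetical stand-ins for the real work; replace them with actual calls.
def fake_check_in():
    sum(range(1_000))


def fake_process():
    sum(i * i for i in range(100_000))


def fake_reprocess():
    sum(range(10_000))


if __name__ == "__main__":
    results = time_phases([
        ("Check-In", fake_check_in),
        ("Process", fake_process),
        ("Reprocess", fake_reprocess),
    ])
    for name, seconds in results.items():
        print(f"{name}: {seconds:.6f} s")
```

Running the same harness once against the pure-Python build and once against the cythonized build gives directly comparable numbers.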
These are the results (in seconds) for the SDL library, which contains 2130 files:
| |Python|Cythonized|
|---|---|---|
|Check-In|0.26206111908 s|0.262398004532 s|
|Process|9.54270887375 s|6.47844004631 s|
|Reprocess|1.45510792732 s|1.36480784416 s|
As you can see, check-in time is the same in both cases: it involves reading tons of files from disk, so it's I/O-bound, not CPU-bound.
Reprocess time is also very similar, with a slight improvement for the native extensions. Reprocess only makes sure there's no need to calculate anything new.
The performance gain is not constant for every library; it increases with the number of files and relations being processed. For projects smaller than 500 files the gain is between 7% and 8%, while for larger projects it rises to 32%, as observed in the process time of the SDL case.
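The SDL percentage can be recomputed from the measurements above. This sketch assumes the first column of the results is the pure-Python run and the second is the cythonized run, which is consistent with the 32% figure; `relative_gain` is a helper name introduced here for illustration.

```python
def relative_gain(python_seconds, native_seconds):
    """Fraction of the pure-Python time saved by the native build."""
    return (python_seconds - native_seconds) / python_seconds


# SDL measurements from the results above: (python, cythonized), in seconds.
SDL_TIMES = {
    "Check-In": (0.26206111908, 0.262398004532),
    "Process": (9.54270887375, 6.47844004631),
    "Reprocess": (1.45510792732, 1.36480784416),
}

if __name__ == "__main__":
    for phase, (py, native) in SDL_TIMES.items():
        print(f"{phase}: {relative_gain(py, native):+.1%}")
```

With these numbers the process phase comes out at about 32.1%, reprocess at about 6.2%, and check-in at essentially zero (about -0.1%, which is within noise).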
So, is it worth compiling to C code? Well, it depends on your project, of course. If your program is I/O-bound then it probably isn't worth it, but if you need to do tons of processing then you might consider it. Setting up the compile-and-package process is very easy.
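One rough way to tell which case applies is to compare CPU time with wall-clock time for the operation you care about. The sketch below is only a heuristic and not something from our benchmark: `cpu_fraction`, `busy`, and `wait` are hypothetical names, and the result ignores time spent in child processes.

```python
import time


def cpu_fraction(func, *args, **kwargs):
    """Return CPU time / wall time for one call of func.

    Values near 1.0 suggest CPU-bound work (a candidate for native code);
    values near 0.0 suggest the call mostly waits on I/O or sleeps.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else 0.0


def busy(n=200_000):
    """Hypothetical CPU-heavy workload."""
    return sum(i * i for i in range(n))


def wait(seconds=0.2):
    """Hypothetical workload that only waits, like slow I/O would."""
    time.sleep(seconds)


if __name__ == "__main__":
    print(f"busy: {cpu_fraction(busy):.2f}")
    print(f"wait: {cpu_fraction(wait):.2f}")
```

Keep in mind that reads served from the operating system's file cache can still show up as CPU time, so treat the ratio as a hint rather than a verdict.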
You can check all the posts in the series and, if you have any doubts, contact me; I'll be happy to help.