There's also the argument that at a certain scale the time of a developer is simply more expensive than time on a server.

If I write something in C++ that does a task in 1 second and it takes me 2 days to write, and I write the same thing in Python where the task takes 2 seconds but I can write it in 1 day, the day of dev time saved might just pay for throwing a more powerful server at it and calling it a day. And that's before even taking into account that a lot of applications spend most of their time waiting for database queries, or considering the maintainability of the code and the fact that high-performance servers get cheaper over time.

If you work at some big corp where this would mean thousands of high-performance servers, that's simply not worth it, but at small/medium-sized companies it usually is.


Realistically, something that takes 1 second in C++ will take 10 seconds (if you write efficient Python and lean heavily on fast libraries) to 10 minutes in Python. But the rest of your point stands.

I spend most of my time waiting on IO; something like C++ isn't going to improve my performance much. If C++ takes 1ms to transform data and my Python code takes 10ms, it's not much of a win for me when I'm waiting 100ms for IO.

With Python I can write and test on a Mac or Windows and easily deploy on Linux. I can iterate quickly and if I really need "performance" I can throw bigger or more VPSes at the problem with little extra cognitive load.

I do not have anywhere near the same flexibility and low cognitive load with C++. The better performance is nice, but for almost everything I do day to day it's completely unnecessary and not worth the effort. My case isn't all cases; C++ (or whatever compiled language you pick) will be a win for some people, but not for me.
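
A rough sketch of that arithmetic, with made-up numbers standing in for the real IO and transform times:

    import time

    def fetch_rows():
        # stand-in for a database/network call; the 100 ms figure is invented
        time.sleep(0.100)
        return list(range(1_000))

    def transform(rows):
        # the "slow" pure-Python part; doing this 10x faster saves ~9 ms at best
        return [r * 2 + 1 for r in rows]

    start = time.perf_counter()
    result = transform(fetch_rows())
    print(f"total: {(time.perf_counter() - start) * 1000:.0f} ms (dominated by the IO wait)")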

And how much code is generally written that actually is compute heavy? All the code I've ever written in my job puts data into and retrieves it from databases and makes some basic calculations or decisions based on it.

Rule of thumb:

Code is "compute heavy" (could equally be memory heavy or IOPs heavy) if it's deployed into many servers or "the cloud" and many instances of it are running serving a lot of requests to a lot of users.

Then the finance people start to notice how much you are paying for those servers and suddenly serving the same number of users with less hardware becomes very significant for the company's bottom line.

The other big one is reducing notable latency for users of your software.

That is absolutely true.

But sometimes, you do end up writing that compute heavy piece of code. At that stage, you have to learn how to write your own native library :)

Speaking of which, I've written some Python modules in Rust using PyO3; it's a very agreeable experience.

Damn! Is the rule of thumb really a 10x performance hit between Python/C++? I don’t doubt you’re correct, I’m just thinking of all the unnecessary cycles I put my poor CPU through.

Outside cases where Python is used as a thin wrapper around some C library (simple networking code, numpy, etc) 10x is frankly quite conservative. Depending on the problem space and how aggressively you optimize, it's easily multiple orders of magnitude.

Those cases are about 95% of scientific programming.

This is the first line in most scientific code:

    import numpy

FFI into lean C isn't some perf panacea either; beyond the overhead, you're also depriving yourself of interprocedural optimization and other Good Things from the native space.

Of course it depends on what you are doing, but 10x is a pretty good case. I recently rewrote a C++ tool in Python, and even though all the data parsing and computing was done by Python libraries that wrap high-performance C libraries, the program was still 6 or 7 times slower than the C++. Had I written the Python version in pure Python (no numpy, no third-party C libraries) it would no doubt have been 1000x slower.

It depends on what you're doing. If you load some data, process it with some Numpy routines (where speed-critical parts are implemented in C) and save a result, you can probably be almost as fast as C++... however, if you write your algorithm fully in Python, you might have much worse results than being 10x slower. See for example: https://shvbsle.in/computers-are-fast-but-you-dont-know-it-p... (here they have ~4x speedup from good Python to unoptimized C++, and ~1000x from heavy Python to optimized one...)

It can be anywhere from 2-3x for IO-heavy code to 2000x for tight vectorizable loops. But 20x-80x is pretty typical.
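
To give a feel for the "tight vectorizable loop" end of that range, a small, unscientific comparison (assumes numpy is installed; exact timings will vary by machine):

    import time
    import numpy as np

    xs = list(range(10_000_000))
    arr = np.arange(10_000_000, dtype=np.float64)

    t0 = time.perf_counter()
    total = 0.0
    for x in xs:                        # interpreted loop, one boxed object per element
        total += x * x
    t1 = time.perf_counter()

    total_np = float(np.dot(arr, arr))  # the same reduction done in compiled code
    t2 = time.perf_counter()

    print(f"pure Python: {t1 - t0:.2f}s   numpy: {t2 - t1:.4f}s")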

Last time I checked (which was a few years ago), the performance gain of porting a non-trivial calculation-heavy piece of code from Python to OCaml was actually 25x. I believe that performance of Python has improved quite a lot since then (as has OCaml's), but I doubt it's sufficient to erase this difference.

And OCaml (which offers productivity comparable to Python's) is noticeably slower than Rust or C++.

It really depends on what you're doing, but I don't think it is generally accurate.

What slows Python down is generally the "everything is an object" attitude of the interpreter. I.e. when you call a function, the interpreter first has to create an object for the thing you're calling.

In C++, due to zero-cost abstractions, this usually just boils down to a CALL instruction preceded by a bunch of PUSH instructions in assembly, depending on the number of parameters (and the calling convention). This is of course a lot faster than running through the abstractions of creating some Python object.
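
You can see the interpreter's side of that with the dis module: every call is generic bytecode dispatch plus object-protocol work rather than a bare CALL/PUSH sequence (the exact opcodes vary by CPython version):

    import dis

    def f(x):
        return x + 1

    def caller(x):
        return f(x)

    # prints the bytecode the interpreter walks through for a single call:
    # loading the global f, loading the argument, and a generic CALL opcode,
    # each of which goes through dynamic lookup and object handling
    dis.dis(caller)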

> What slows Python down is generally the "everything is an object" attitude of the interpreter

Nah, it's the interpreter itself. Because it has no JIT compilation, there is a hard ceiling it cannot surpass even in theory (as opposed to things like PyPy or GraalPy).

I don't think this is true: Other Python runtimes and compilers (e.g. Nuitka) won't magically speed up your code to the level of C++.

Python is primarily slowed down by the fact that each attribute and method access results in multiple CALL instructions, since it's dictionaries and magic methods all the way down.

Other than this, dynamic typing is a big culprit. I can't find the article with the numbers again, but its performance overhead is enormous.
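
A small illustration of the "dictionaries all the way down" point: instance attributes really do live in a per-object dict, and every attribute access goes through that machinery (which is largely why __slots__ exists as an optimization):

    class Point:
        def __init__(self, x, y):
            self.x = x
            self.y = y

    p = Point(1, 2)
    print(p.__dict__)        # {'x': 1, 'y': 2} -- the attributes are stored in a dict
    p.__dict__["z"] = 3      # writing the dict directly...
    print(p.z)               # ...is the same as setting an attribute: prints 3
    # every p.x goes through type(p).__getattribute__ and that dict lookup;
    # a method call additionally builds a bound-method object on the fly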

Well, at least 10x, sometimes more. Not really surprising when you consider that it's a VM reading and parsing your code as a string at runtime.

> it's a VM reading and parsing your code as a string at runtime.

Commonly it creates the .pyc files, so it doesn't really re-parse your code as a string every time. But it does check the source file's timestamp to make sure that the .pyc file is up to date.

On Debian (and I guess most distributions) the .pyc files get created when you install the package, because generally they go in /usr and that's only writeable by root.

It does include the full parser in the runtime, but I'd expect most code to not be re-parsed entirely at every start.

The import thing is really slow anyway. People writing command-line tools have to defer imports to avoid huge startup times loading libraries that may be needed only by some functions that might not even be used in that particular run.
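
The usual workaround in CLI tools looks something like this (the heavy library named here is just an example):

    import argparse  # cheap, fine to import at module level

    def plot_report(path):
        # heavy dependency imported only when this code path actually runs,
        # so plain invocations of the tool don't pay its import cost
        import matplotlib.pyplot as plt
        plt.plot([1, 2, 3])
        plt.savefig(path)

    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument("--plot", metavar="PATH", help="write a demo plot")
        args = parser.parse_args()
        if args.plot:
            plot_report(args.plot)

    if __name__ == "__main__":
        main()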

> re-parse your code as a string every time

That doesn’t really take any significant time though on modern processors.

Aren't those pyc files still technically just string bytecode, but encoded as hex?

Well bytecode isn't the same as the actual code you write in your editor.
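
Compiling a snippet yields a code object whose co_code is raw opcode bytes, not the source text in another encoding; a .pyc file is essentially such code objects marshalled to disk behind a small header. A quick way to see it (exact output varies by CPython version):

    src = "x = 1 + 2"
    code = compile(src, "<demo>", "exec")
    print(code.co_code)    # raw bytecode, e.g. b'd\x00Z\x00...' -- not the source string
    print(code.co_consts)  # constants the bytecode refers to, e.g. (3, None)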

As a long-time Python lover, yes that's a decent rule of thumb.

It is anywhere from 1x to 100x+.

If the 1 second is spent waiting for IO, it will take 1 second in whatever language.

But yes, Python is slow.

However, I've seen good Python code be faster than bad C code.

Well, to be fair the "good Python code" is probably just executing something written in C lol. But lots of Python is backed by stuff written in C.
Not necessarily. Just using a better optimized sort or hash algorithm can make a big difference.

I was talking specifically about pure Python code (except for Python's standard library itself, where it really is unavoidable).

Of course algorithmic complexity will trump anything else at big enough n values.
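
A toy illustration of the "better algorithm beats faster language" point, using Python's own structures (timings will vary):

    import time

    items = list(range(200_000))
    lookup = set(items)              # hash-based membership test

    t0 = time.perf_counter()
    for _ in range(1_000):
        _ = 199_999 in lookup        # O(1) hash lookup each time
    t1 = time.perf_counter()
    for _ in range(1_000):
        _ = 199_999 in items         # O(n) scan each time -- what a naive C loop would amount to
    t2 = time.perf_counter()

    print(f"set membership: {t1 - t0:.4f}s   list scan: {t2 - t1:.2f}s")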

Not for everything. There are plenty of Python operations that are not 10x slower than C.

That is true, but there are relatively few real-world applications that consist only of those operations. In the example I mentioned below, there were actually some parts of my Python rewrite that ended up faster than the original C++ code, but once everything was strung together into a complete application those parts were swamped by the slow parts.

Most of the time these are tight arithmetic loops that need optimisation, and it's easy to extract those into separate compiled Cython modules without losing overall cohesion within the same Python ecosystem.

If Python were merely twice as slow then I could agree with you.

Not all code needs to process terabytes of data.

I have code running that reads ~20 bytes, checks the internal status in a hashmap and flips a bit.

Would it be faster in C? Of course.

Would it have taken me much longer to write to achieve absolutely no benefit? Yes.

Speeding up the time-critical parts with Cython or Numba or ... is rather easy.
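
For example, something along these lines with Numba (assuming numba and numpy are installed; Cython gets there by compiling an annotated module instead):

    import numpy as np
    from numba import njit

    @njit  # compiled to machine code on first call
    def pairwise_min_dist(points):
        # triple nested loop that would crawl in pure Python
        best = np.inf
        n, dims = points.shape
        for i in range(n):
            for j in range(i + 1, n):
                d = 0.0
                for k in range(dims):
                    diff = points[i, k] - points[j, k]
                    d += diff * diff
                if d < best:
                    best = d
        return best ** 0.5

    points = np.random.rand(2_000, 3)
    print(pairwise_min_dist(points))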
