Why I hate programming in Python
Python is a very popular and powerful programming language. It has many strengths that have contributed to its success as a programming language:
- It is a good introductory language that many people start out with.
- It is an interpreted language and interactive Python (e.g.
ipython
, or interactive debugging withIPython.embed()
orcode.interact(local=locals())
) allows for powerful problem solving and debugging to be done. - It has a large body of freely available libraries that enable programmers to do many things with relative ease. For example whether it is web development with the Django framework, working with numerical data with
numpy
andscipy
, usingmatplotlib
for data visualization, or doing machine learning with Python API like that provided bykeras
.
I use Python myself quite a lot for my own research work primarily because of the useful libraries that exist. I have been very productive using the language and I think it has deserved the success it has experienced. However the more I use Python, the more I have come to notice some short-comings of it, and have come to see it as an old, out-dated language. This post is meant to describe weaknesses that I feel are intrinsic to the language itself. This means that I will not be talking about the libraries that exist in the language, nor will I be complaining about things like the package management system or unit testing libraries. I am not trying to advocated against people using Python, because if you can live with its idiosyncrasies then Python can be a very powerful tool to do a great many things, whether it is building a website, analyzing complex data sets, or writing a program to automate your smart devices.
- Does not feel object-oriented
- Plethora of double-underscore method names
- Global methods
- List comprehensions are backwards
- Whitespace and lack of block-end delimiters
- Lack of multiline blocks
1 – Does not feel object-oriented
Python is supposed to be an object-oriented programming language, yet it often feels like it is a C-like language masquerading as an object-oriented language.
1.1 – self
as first argument
In C, one common technique is to define struct
s to store some data, and have a set of methods which take a pointer to this struct as a first argument.
1
2
3
4
5
6
7
8
typedef struct {
float x, y;
} Point;
void translate_point(Point* point, const Point* translation) {
point->x += translation->x;
point->y += translation->y;
}
In Python, emulating this behavior with a class would look like this
1
2
3
4
5
6
7
8
class Point(object):
def __init__(self, x, y):
self.x = x
self.y = y
def translate(self, other):
self.x += other.x
self.y += other.y
One can see that the translate
method here takes self
as a first parameter.
However, why is it necessary to pass self
as the first argument to each method here?
Other object-oriented programming languages do not require explicit self
because it is obvious, and this removes boilerplate code and improves readability.
Languages like C++, C#, Java, and Ruby do not require this parameter, and its presence makes Python feel like the object-oriented capabilities was tacked on:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
# Ruby
class Point
attr_reader :x, :y
def initialize(x, y) # no `self`
@x = x
@y = y
end
def translate(other) # no `self`
@x += other.x
@y += other.y
end
end
1
2
3
4
5
6
7
8
9
10
11
12
// C++
class Point {
public:
double x, y;
Point(double x_, double y_) : x(x_), y(y_) {} // no `this`
void translate(const Point& point) { // no `this`
x += point.x;
y += point.y;
}
};
Similarly, class methods in Python require a class parameter as the first argument, in addition to the eye-opening decorator @classmethod
:
1
2
3
4
class PointFactory(object):
@classmethod
def build(klass, x, y): # <= Requiring class as 1st argument is verbose
return Point(x, y)
1.2 – super
Another place where Python feels like it falls short is the super
syntax.
It is way more complicated to do something as common as calling the super constructor than it has to be.
This difference is very apparent when you compare similar Python and Ruby code.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
# Python
class Animal(object):
def __init__(self, name):
self.name = name
def make_noise(self):
print("The animal named {} made noise".format(self.name))
class Dog(Animal):
def __init__(self, name, breed):
super(Dog, self).__init__(name) # <= That's a mouthful
self.breed = breed
def make_noise(self):
print("The {} dog named {} barked".format(self.breed, self.name))
Notice in particular the line super(Dog, self).__init__(name)
, which is extremely complicated.
One can simplify this by typing directly Animal.__init__(name)
, but this is not DRY – the sub-class should not need to directly reference the specific class it is inheriting from within the code because it is implied.
This equivalent line in Ruby is just super name
as seen below.
The super class is already known by the class at the point that method is invoked.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
# Ruby
class Animal
attr_reader :name
def initialize(name)
@name = name
end
def make_noise
puts "The animal named #{name} made noise"
end
end
class Dog < Animal
attr_reader :breed
def initialize(name, breed)
super name # <= nice and simple
@breed = breed
end
def make_noise
puts "The #{breed} dog named #{name} barked"
end
end
Part of the reason for why this is the case is because Python supports multiple inheritance, which greatly complicates object oriented programming. It is my preference to stick with single inheritance augmented with interfaces/abstract classes (C++ and Java) or with modules/mixins (see Ruby), and avoid multiple inheritance.
1.3 – Lack of public, protected, private distinctions
When creating a class hierarchy, defining certain attributes/methods to be public
, protected
and private
is a common and useful strategy. While the exact definition of protected
varies from language to language, the meaning of public
and private
are fairly universal. Python only technically supports private
, and has no support for protected
whatsoever.
Python accomplishes making methods private by appending two underscores in the front of the name, for instance def __check_validations(self)
. Whenever this method is called inside the class, it works as expected, self.__check_validations()
, but outside this would not work. This feature is implemented ad-hocly through name mangling; if the class is User
, then the actual method name becomes _User__check_validations
. This is a gross and verbose hack to add something as commonplace as private methods to the language.
2 – Plethora of double-underscore method names
In programming languages, appending _
or __
to the front of some name traditionally means “please stay away”. It is often used for denoting internals that should not be messed with by people who do not know exactly what they are doing. This convention is partially reflected in the double-underscore naming convention for private methods in Python. However, there are also plenty of methods that are extremely common which have the format __foo__
! Here is a partial list of ones that it is not unusual for you to come across a need for (see also http://minhhh.github.io/posts/a-guide-to-pythons-magic-methods):
__init__
: This is the constructor of an object, and so you will be writing it very often when you write classes.- Comparison operators:
__eq__
,__ne__
,__lt__
,__le__
,__gt__
,__ge__
, and__cmp__
. For instance, if you want your objects to be sortable, you will need to use these methods. - Arithmetic operators: unary
__pos__
and__neg__
, binary__add__
,__sub__
,__mul__
,__div__
,__truediv__
,__floordiv__
,__and__
,__or__
, and__xor__
, in addition to those with anr
in front to denote reflection (which is needed to allow something like1 + foo
to work, where the first argument is not an instance of your class). - String representations:
__str__
and__repr__
. The first is used to convert to a human readable string, while the second is used for instance by the command line interpreter to output a representation of objects. - Attribute access:
__getattr__
,__setattr__
,__hasattr__
,__delattr__
for getting, setting, checking presence of, and deleting attributes of an object. - Container behavior:
__len__
,__getitem__
,__setitem__
,__hasitem__
,__delitem__
- Iterators:
__iter__
- Calling like a function:
__call__
- Context managing:
__enter__
and__exit__
- Main method through the idiom
if __name__ == "__main__":
, which is just plain verbose.
Don’t you all get tired typing all these underscores? I sure do! The reason for these long and awkward names is that it avoids name conflicts. However there are other approaches that do not make so many underscores necessary. For example, a language like C++ and C# uses the keyword operator
to avoid conflicts when writing operator methods. Ruby uses the method name initialize
for the constructor, the appropriate symbols for comparisons and arithmetic operators directly (<
, ==
, +
, etc), coerce
to avoid a need for reverse operators like __radd__
, to_s
and inspect
convention method names for string representation methods, allows you to define assignment operators like name=
for attribute assignment which alleviates the need for the __setattr__
like methods, container like behavior is superceded by strong support for iterators through blocks and allowing definitions of []
and []=
methods, the Proc#call
method for function-like calling, and there is no need for context management due to blocks (e.g. File.open("README.md", "r") {|f| puts f.read}
).
3 – Global methods
Python has some global methods that are commonly used. These are given the nice name “built-in functions”. However, since Python is an object-oriented language, I feel it makes more sense for many of these to be instance methods, or methods associated with a specific module. Furthermore, by being global methods these are essentially keywords which further restricts the names available for use (e.g. defining a len
variable automatically makes the len
method fail to work where the variable is in scope). Shown here are some global method calls in Python, and better hypothetical ways to doing the same thing using instance methods or module methods.
1
2
3
4
5
6
7
8
9
abs(-2)
(-2).abs() # instance method
Math.abs(-2) # Math module class method
len([1,2,3])
[1,2,3].len() # instance method
chr(ord('a'))
'a'.ord().chr() # instance methods
str
, set
, list
, dict
are also types, so these “global functions” are really just constructors, which is a valid use-case. Having the names be lower case is a bad choice though, as the official style guide itself even says classes should use camel-case: https://www.python.org/dev/peps/pep-0008/#class-names, so the chances of colliding with user defined variables is high.
Functional programming methods filter
, reduce
, sorted
, map
, and reversed
make more sense as instance methods on iterable objects. For example, [1, 2, 3, 4].filter(lambda x: x % 2 == 0)
would be preferred to filter(lambda x: x % 2 == 0, [1, 2, 3, 4])
. This also makes code more readable, as the application of operations is in the order you read, left to right.
4 – List comprehensions are backwards
List and dictionary comprehensions in Python are useful constructs for doing common manipulations. For instance, to build a list of squares, you just type [x**2 for x in range(10)]
. Nice and simple and readable. The concern I have with this is that the order you write/read it is backwards from how humans think, and readability can be improved when you flip the order.
To see what I mean, consider the following Python and Ruby code which sums the squares of odd integers:
1
2
# Python
sum([x**2 for x in [x for x in range(10) if x % 2 == 1]])
1
2
# Ruby
(0...10).select {|x| x.odd?}.map {|x| x**2}.sum
The Ruby is much easier to read because reading things left-to-right exactly corresponds to the order of operations: start with the numbers from 0 up to 10, then select the odd ones, then map them to their squares, then sum the result. It requires a lot more mental work to dissect the Python code. While I wrote the above Ruby code in one pass, when writing the Python list comprehension I wrote it in the following complicated steps from the inside out:
1
2
3
4
range(10)
[x for x in range(10) if x % 2 == 1]
[x**2 for x in [x for x in range(10) if x % 2 == 1]]
sum([x**2 for x in [x for x in range(10) if x % 2 == 1]])
5 – Whitespace and lack of block-end delimiters
One use of whitespace in code is to improve readability (e.g. indenting blocks so it is easy to see at a glance what the block consists of). By forcing users to indent their code, this forces new programmers to develop better programming style habits. However, this restriction is only truly useful for absolute beginners to programming languages, as any decent developer should know how to write readable code using indentation already so this is an unnecessary feature that can be handled by a linter.
Related to this is the fact that there are no ending delimiters to blocks like conditionals and loops. This means that, for example, if you have a code fragment that has had its whitespace mangled, it is literally impossible to know where whitespace should be in general. When writing code in your favorite editor, there is often a way to fix indentation of code, either to bring everything into the same style (space vs tabs, number of spaces per indentation) or to quickly repair code (for instance if you delete surrounding namespace markers and want to have everything inside de-indent correctly automatically this can be a quick way of doing it). This is not possible in Python. I come across this very often while programming in Python. For instance I will copy a useful line or block of code from some other file and paste it into the file I am currently working on, only to have the indentation be completely off. And my trusty =
vim command to re-align indentation does not work, so I have to manually increase/decrease the indentation to fit. This problem does not exist in other languages which have ending delimiters.
This lack of ending delimiters also means that jumping to the end of a large block in your editor cannot be done simply. For instance, using the %
key in vim will jump to the matching brace, and using this on {}
curly braces allows for quick jumping to the start/end of for
and if
statements.
6 – Lack of multiline blocks
A common and useful feature of many modern programming languages is the ability to define anonymous functions/classes within code at the point of usage. A prominent example of where this is useful is in user interface programming. When a user clicks a button, it is necessary to write code that does something when the click occurs. This is called a callback. In C++ this is done through lambda expressions, in Java through anonymous classes or lambda expressions, and in Ruby using blocks.
Python has support for lambdas, for example in sorted([1, 2, 3, 4], key = lambda x: -x)
to sort numbers in reverse order, however these lambdas are limited to a single line. What if you wanted a button callback to get the name field, pull out the first name, and then pop-up a message that says “Hello ${first_name}”? In Ruby you could do it like this:
1
2
3
4
5
6
7
8
9
text_field = MyUI::TextField.new(label: "Full name")
button = MyUI::Button.new
# When the button is pressed, we will do more than just one thing,
# so the existence of a multi-line block is very convenient
button.on_press do
first_name = text_field.text.split(" ").first
MyUI::PopUp.new("Hello #{first_name}")
end
Like in the above sorted
example, such multiline blocks would also come in handy when doing functional programming with the methods map
, filter
, and reduce
.