Although I have written Python for several years, I am still not very familiar with its idioms. In particular, I don't quite understand how to compare and sort the objects in Python. That's the reason why I am writing this article.

The rest of this article consists of following sections:

- Comparison Operators -- lists the comparison operators.
- Built-in Types -- lists the
**semantics of the comparison operators**corresponding to the operand types: - User-defined Classes -- discusses how to define our own comparison
operators for the
**user-defined classes**. - Sort the List -- will talk about how to customize the ordering while sorting the elements in a list.

## Comparison Operators

There are several comparison operators in Python. All of them are listed in the following table:

Operator | Semantics | Override Method |
---|---|---|

`<` |
Strictly less than | `__lt__` |

`<=` |
Less than or equal | `__le__` |

`>` |
Strictly greater than | `__gt__` |

`>=` |
Greater than or equal | `__ge__` |

`==` |
Equal | `__eq__` |

`!=` |
Not equal | `__ne__` |

`is` |
Object identity | Not overridable |

`is not` |
Negated object identity | Not overridable |

The six items from the beginning are the comparison operators that determine
**the order of the values**.

The last two items at the end are the **identity operators** which can tell
whether the objects on both sides are the same, i.e. both the expressions are
evaluated to same object.

For what it's worth, it is a bad idea to compare `int`

or `float`

values with `is`

or `is not`

operator. You will confront some
non-intuitive results. We will find out more examples in the following
section.

## Built-in Types

There are several built-in types in Python 3. Most of them have reasonable
comparison ordering. However, notice that `complex`

numbers,
`dict`

, and `range`

are not ordered.

Category | Types | Ordering |
---|---|---|

Numberic | `bool` , `int` ,
`float` |
Real number ordering |

String | `str` |
Unicode lexicographical |

Sequential | `tuple` , `list` ,
`bytearray` , `bytes` |
Element-wise |

Set | `set` , `frozenset` |
Subset |

Unordered |
`complex` ,
`dict` , `range` ,
`NoneType` |
N/A (Equality comparable) |

### Numbers: Bool, Int, and Float

The number types in Python have reasonable comparison rules:

- For the
`bool`

values,`False`

is equal to`0`

, and`True`

is equal to`1`

. - The
`int`

values are mapped to corresponding real numbers. - Most
`float`

values are mapped to corresponding real numbers. Here are some special rules:`float('nan')`

is a special floating point number which is standing for*not a number*. NAN is not equal to all of the numbers. Besides,`<`

,`<=`

,`>`

, and`>=`

will return`False`

as well.`float('inf')`

will be treated as*infinite*. It will be greater than other numbers.`float('-inf')`

will be treated as negative*infinite*. It will be less than other numbers.

These examples demonstrate the simple cases with implicit type conversions:

```
>>> 1 < 2
True
>>> 1.0 < 2.0
True
>>> 1 < 1.2
True
>>> 1 == 1.0
True
>>> -0.0 == 0
True
>>> -0.0 == 0.0
True
>>> False == 0.0
True
>>> True == 1
True
```

Here are the examples for `float('nan')`

:

```
>>> nan = float('nan')
>>> nan == nan
False
>>> nan != nan
True
>>> nan < 0
False
>>> nan > 0
False
>>> nan <= 0
False
>>> nan >= 0
False
```

Here are the examples for `float('inf')`

and `float('-inf')`

:

```
>>> inf = float('inf')
>>> inf == inf
True
>>> inf != inf
False
>>> inf < inf
False
>>> inf < nan
False
>>> inf > nan
False
```

```
>>> ninf = float('-inf')
>>> ninf < ninf
False
>>> ninf < nan
False
>>> ninf > nan
False
```

Here are the examples which compare `float('inf')`

with the maximum
floating point number on the system and a huge integer numbers.

```
>>> import sys
>>> mx = sys.float_info.max
>>> huge = (1 << 5000)
>>> mx < huge
True
>>> huge < inf
True
>>> mn = sys.float_info.min
>>> -huge < mn
True
>>> ninf < -huge
True
```

Please notice that the floating point arithmetics are performed in binary. As a result, the outcome might be different from the one computed in decimal:

```
>>> 0.1 + 0.2 > 0.3
True
>>> 0.1 + 0.2
0.30000000000000004
```

By the way, **DON'T** compare `int`

or `float`

values with
**identity operators** (`is`

or `is not`

.) Some implementations
will reuse the objects for small numbers and give *non-intuitive*,
*unexpected*, and *surprising* results:

```
>>> (1 + 2) is 3
True
>>> (1 << 4) is 16
True
>>> (1 << 30) is 0x40000000
False
>>> (1 << 30) == 0x40000000
True
```

### Complex Numbers

We can compare the complex numbers with equality or inequality operators:

```
>>> complex(1) == 1.0
True
>>> complex(1.4j) == 1.4j
True
>>> complex(1.4j) == 1.4
False
>>> complex(2j) != 2
True
```

However, the `complex`

numbers are unordered:

```
>>> 1 < 1j
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: int() < complex()
```

Even if their imaginary parts are equal to zero, they are unordered either:

```
>>> complex(4) < complex(5)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: complex() < complex()
```

### String

The `str`

objects are ordered by lexicographical ordering. In the other
words, the characters in the strings are element-wise compared. The characters
are encoded in unicode and the comparison is case sensitive. If two strings
have common prefix, then the shorter one is less than the longer one.

```
>>> 'a' < 'b'
True
>>> 'aa' < 'b'
True
>>> 'a' < 'aa'
True
>>> '' < 'a'
True
>>> 'aa' < 'aa'
False
>>> 'a' == 'A'
False
```

### Sequential Types: Tuple, List, ByteArray, and Bytes

The sequential types, such as `tuple`

, `list`

, `bytearray`

and `bytes`

, are element-wise compared. Like `str`

objects, if two
objects have common prefix, then the shorter one is less than the longer one.

Here are the examples for `tuple`

:

```
>>> () < (0,)
True
>>> (0, 1) < (0, 1, 2)
True
>>> (0,) < (1,)
True
>>> (0, 1) == (0, 1)
True
>>> (0, 1) == (0, 1, 2)
False
```

Here are the examples for `list`

:

```
>>> [] < [0,]
True
>>> [0, 1] < [0, 1, 2]
True
>>> [0] < [1]
True
>>> [0, 1] == [0, 1]
True
>>> [0, 1] == [0, 1, 2]
False
```

Here are the examples for `bytes`

:

```
>>> b'' < b'a'
True
>>> b'ab' < b'abc'
True
>>> b'a' < b'b'
True
>>> b'ab' == b'ab'
True
>>> b'ab' == b'abc'
False
```

We are allowed to compare `bytearray`

with `bytes`

:

```
>>> bytearray(b'a') == b'a'
True
>>> bytearray(b'aa') > b'a'
False
```

We are also allowed to compare `tuple`

and `list`

with equality and
inequality operations:

```
>>> [0, 1] == (0, 1)
True
>>> [0, 1] != (0,)
True
```

But, unfortunately, we can't get the order between `tuple`

object and
`list`

object without type conversion:

```
>>> [0, 1] < (0, 2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: list() < tuple()
```

```
>>> (0, 2) > [0, 1]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: tuple() > list()
```

### Sets and FrozenSets

The `set`

and `frozenset`

instances are ordered by the **subset**
relationship. For example,

```
>>> a = {0, 1}
>>> b = {0}
>>> a < b
False
>>> a <= b
False
>>> a == b
False
>>> a != b
True
>>> a > b
True
>>> a >= b
True
```

Notice that **subset** is not a total order. We can check this by adding
`2`

to `b`

:

```
>>> b.add(2)
>>> a < b
False
>>> a <= b
False
>>> a == b
False
>>> a != b
True
>>> a > b
False
>>> a >= b
False
```

Of course, it is fine to compare `set`

objects with `frozenset`

objects:

```
>>> c = frozenset(a)
>>> c == a
True
>>> a.remove(0)
>>> a < c
True
```

## User-defined Classes

We can declare a new class with a `class`

statement. For example, we can
define a `Date`

class in `date.py`

:

```
#!/usr/bin/env python3
class Date:
def __init__(self, year, month, date):
self.year = year
self.month = month
self.date = date
def __str__(self):
return '{0}/{1:02}/{2:02}'.format(self.year, self.month, self.date)
def __repr__(self):
return 'Date({0},{1},{2})'.format(self.year, self.month, self.date)
```

What will happen if we compare two instances of `Date`

? By default, we
can only compare them with **equality/inequality operators** and **identity
operators**:

```
$ python3
```

```
>>> from date import Date
>>> a = Date(2015, 6, 14)
>>> b = Date(2015, 6, 14)
>>> a == b
False
>>> a != b
True
>>> a is b
False
>>> a is not b
True
```

As we can see, the default behavior for **equality/inequality operators** are
equal to the **identity operators**.

If we compare the objects with other operators, such as *less-than operator*,
then the `TypeError`

exception will be raised:

```
>>> a < b
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: Date() < Date()
```

### Customize the Equality and Inequality Operators

We can customize the **equality operators** by overriding `__eq__`

:

```
class Date:
# ... skipped ...
def __eq__(self, other):
return (self.year == other.year and
self.month == other.month and
self.date == other.date)
```

Let's try out:

```
>>> from date import Date
>>> a = Date(2015, 6, 14)
>>> b = Date(2015, 6, 14)
>>> c = Date(2015, 6, 15)
>>> a == b
True
>>> a == c
False
```

We can see that the *equality* operator works as expected.

Let's look at the *inequality* operator. If you don't have special logic, it
is suggested to define `__ne__`

in terms of `not`

and `==`

.
For example:

```
class Date:
# ... skipped ...
def __ne__(self, other):
return not self == other
```

```
>>> from date import Date
>>> a = Date(2015, 6, 14)
>>> b = Date(2015, 6, 14)
>>> c = Date(2015, 6, 15)
>>> a != b
False
>>> a != c
True
```

On a side note, we should not assume `a == b`

is complement with
`a != b`

. This assumption may not be true in general. Some class may
return `True`

on both cases.

**Remarks**: If you override `__eq__`

, then you should override
`__hash__`

as well. Otherwise, the object won't be hashable. In the
other words, we can't insert the objects to `set`

and the objects can't
be used as the key of `dict`

. Besides, we have to make sure that
`a == b`

implies `hash(a) == hash(b)`

; otherwise, the uniqueness
property of the `set`

will become invalid. Here's the `__hash__`

function for our `Date`

class:

```
class Date:
# ... skipped ...
def __hash__(self):
return self.year * 10000 + self.month * 100 + self.date
```

```
>>> from date import Date
>>> a = Date(2015, 6, 14)
>>> b = Date(2015, 6, 14)
>>> c = Date(2015, 6, 15)
>>> hash(a)
20150614
>>> set([a, b, c])
{Date(2015,6,14), Date(2015,6,15)}
```

### Customize the Ordering Comparison Operators

We can customize the ordering comparison operators by overriding `__lt__`

,
`__le__`

, `__gt__`

, and `__ge__`

. They are corresponding to
`<`

, `<=`

, `>`

, and `>=`

respectively. For example:

```
class Date:
# ... skipped ...
def __hash__(self):
return self.year * 10000 + self.month * 100 + self.date
def __lt__(self, other):
return hash(self) < hash(other)
```

In our example, we simply implement `__lt__`

by comparing the hash value
of the object. You may implement this with a series of integer comparisons and
boolean arithmetics as well. Here are the results:

```
>>> from date import Date
>>> a = Date(2015, 6, 14)
>>> b = Date(2015, 6, 14)
>>> c = Date(2015, 6, 15)
>>> a < b
False
>>> a < c
True
>>> c < a
False
```

However, you might be surprised that the following code works as expected even
though we haven't overridden `__gt__`

yet:

```
>>> a > b
False
>>> c > a
True
```

It is working because the Python run-time will *reflect the comparison*. If
(1) the left-hand-side class does not override the corresponding method, or (2)
the corresponding method returns `NotImplemented`

, then the Python
run-time will **swap the arguments** and call the **reflected method**
defined in the right-hand-side class. Here are the mapppings:

Original | Reflects To | Mnemonic |
---|---|---|

`__lt__(a, b)` |
`__gt__(b, a)` |
`a < b` <=> `b > a` |

`__gt__(a, b)` |
`__lt__(b, a)` |
(ditto) |

`__le__(a, b)` |
`__ge__(b, a)` |
`a <= b` <=> `b >= a` |

`__ge__(a, b)` |
`__le__(b, a)` |
(ditto) |

`__eq__(a, b)` |
`__eq__(b, a)` |
`a == b` <=> `b == a` |

`__ne__(a, b)` |
`__ne__(b, a)` |
`a != b` <=> `b != a` |

With the reflection rules, it will be sufficient to define a total ordering
class by overriding `__eq__`

, `__ne__`

, `__lt__`

, and
`__le__`

.

### Total Ordering Decorator

To define a **class with total ordering**, we should override all of
`__eq__`

, `__ne__`

, `__lt__`

, and `__le__`

. However,
it might be cumbersome to override all of them. It will be simpler to decorate
the class with `functools.total_ordering`

.

With `functools.total_ordering`

decorator, overriding `__eq__`

and
one of `__lt__`

, `__le__`

, `__gt__`

, or `__ge__`

is
sufficient to define a total ordering class:

```
import functools
@functools.total_ordering
class Date:
# ... skipped ...
def __hash__(self):
return self.year * 10000 + self.month * 100 + self.date
def __eq__(self, other):
return (self.year == other.year and
self.month == other.month and
self.date == other.date)
def __lt__(self, other):
return hash(self) < hash(other)
```

```
>>> from date import Date
>>> a = Date(2015, 6, 14)
>>> b = Date(2015, 6, 14)
>>> c = Date(2015, 6, 15)
>>> (a == b, a != b, a < b, a <= b, a > b, a >= b)
(True, False, False, True, False, True)
>>> (a == c, a != c, a < c, a <= c, a > c, a >= c)
(False, True, True, True, False, False)
```

Althoguh it is much simpler to define a total ordering class with the
decorator, it comes with the *performance penalty*. If this has been proved to
be a performance bottleneck, then it will be better to override six methods
directly.

## Sort the List

We can sort the list with either `a.sort()`

or `b = sorted(a)`

.
The former will sort the associate list and the later will create a new sorted
list from the iteratables (and leave the argument unchanged). Both of them
will sort the elements by *less-than* comparison (i.e. in ascending order.)
For example,

```
>>> a = [2, 5, 3, 13, 8, 1]
>>> sorted(a)
[1, 2, 3, 5, 8, 13]
>>> a
[2, 5, 3, 13, 8, 1]
>>> a.sort()
>>> a
[1, 2, 3, 5, 8, 13]
```

The element type doesn't matter as long as they are pairwise comparable:

```
>>> sorted([set([1]), frozenset([1,2,3]), set(), set([4, 2, 3, 1])])
[set(), {1}, frozenset({1, 2, 3}), {1, 2, 3, 4}]
```

### Sort the List with Customized Ordering

If we want to sort the elements in different order, we can specify the **key
function** to extract the key from the elements.

First, let's sort the list of tuples with `sorted()`

directly:

```
>>> xs = [(0, 3), (1, 1), (0, 2), (2, 3), (1, 3)]
>>> sorted(xs)
[(0, 2), (0, 3), (1, 1), (1, 3), (2, 3)]
```

As what we have expected, the tuple are pairwisely compared.

Second, let's sort the tuples by the first element:

```
>>> xs = [(0, 3), (1, 1), (0, 2), (2, 3), (1, 3)]
>>> sorted(xs, key=lambda x: x[0])
[(0, 3), (0, 2), (1, 1), (1, 3), (2, 3)]
```

As we can see, the tuples with `x[0]`

equals to `0`

comes before
the tuples with `x[0]`

equals to `1`

or `2`

. However,
`(0, 3)`

and `(0, 2)`

are not reordered. This is due to the
**stable** guarantee of the sorting algorithm.

Third, the key function can **return a new tuple** as well. For example, we
can compare the second element first and then compare the first element.

```
>>> xs = [(0, 3), (1, 1), (0, 2), (2, 3), (1, 3)]
>>> sorted(xs, key=lambda x: (x[1], x[0]))
[(1, 1), (0, 2), (0, 3), (1, 3), (2, 3)]
```

However, returning a new tuple might result in performance penalty since

- we have to allocate new tuples from the heap, and
- the element-wise comparison for tuples might be more expensive than necessary.

It will be better to return primitive types if it is possible. For example, if
it is guaranteed that both `x[0]`

and `x[1]`

are less than 1000,
then we can simply sort the tuples with:

```
>>> xs = [(0, 3), (1, 1), (0, 2), (2, 3), (1, 3)]
>>> sorted(xs, key=lambda x: (x[1] * 1000 + x[0]))
[(1, 1), (0, 2), (0, 3), (1, 3), (2, 3)]
```

## Wrap Up

In this article, we have discussed the Comparison Operators that are available in the Python programming language. Besides, we have discussed the semantics of the comparison operators for the Built-in Types and how to create User-defined Classes with specific ordering. Finally, we have mentioned how to Sort the List with customized order by extracting the keys.

Hope you enjoy this post!