Python sort by multiple keys

Say we have

L = ["a", "b", "zz", "zzz", "aa"]

We want to sort by length first, then alphabetically, i.e. we want to end up with ['a', 'b', 'aa', 'zz', 'zzz']. How do we do that?

The usual sorted() will give us this.

>>> sorted(L)
['a', 'aa', 'b', 'zz', 'zzz']

Very much alphabetical, not what we want.

Trying sorted() with key=len gives us this.

>>> sorted(L, key=len)
['a', 'b', 'zz', 'aa', 'zzz']

The string length is taken care of, but that’s just half of it.

Notice that zz appears before aa. That’s because zz comes before aa in the original input list, and the key function len returns 2 for both, so they compare as equal and keep their input order.
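
The ordering of zz and aa here reflects the fact that Python’s sort is stable: items whose keys are equal keep their relative input order. A minimal sketch to check this, using a made-up list named words (not part of the example above):

```python
# sorted() is stable: items with equal keys keep their input order.
words = ["zz", "aa", "b", "a"]  # hypothetical input for illustration

by_length = sorted(words, key=len)
print(by_length)  # ['b', 'a', 'zz', 'aa']

# Reversing the input flips the order inside each length group,
# because ties are resolved purely by input position.
by_length_reversed_input = sorted(reversed(words), key=len)
print(by_length_reversed_input)  # ['a', 'b', 'aa', 'zz']
```

So key=len alone can never put aa before zz unless the input already had them in that order.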

It turns out we can swap the key function for a lambda function, so that we can pack a little more custom logic into the sort key.

>>> sorted(L, key=lambda x: len(x))
['a', 'b', 'zz', 'aa', 'zzz']

This gives us the same result. But with a lambda function, we can do more than just pass in a built-in or predefined function.

Since we want to sort by length first, then by alphabetical order, how should we look at a string? What is "aa" compared to "zzz"? We can look at "aa" as (2, "aa"), which is a tuple with the int 2 and the string "aa". For "zzz" that’s (3, "zzz").
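
Why this works: Python compares tuples element by element from left to right, and only looks at the next element when the earlier ones are equal. A small sketch of that, where length_then_alpha is a hypothetical helper name introduced just for this illustration:

```python
# Tuples compare element by element, left to right.
print((2, "aa") < (3, "zzz"))  # True: 2 < 3, so the strings are never compared
print((2, "aa") < (2, "zz"))   # True: lengths tie, so "aa" < "zz" decides
print((1, "b") < (2, "aa"))    # True: length wins even though "b" > "aa"

def length_then_alpha(s):
    """Hypothetical helper: view a string as (length, string)."""
    return (len(s), s)

keys = [length_then_alpha(s) for s in ["a", "b", "zz", "zzz", "aa"]]
print(keys)  # [(1, 'a'), (1, 'b'), (2, 'zz'), (3, 'zzz'), (2, 'aa')]
```

Sorting these tuples orders them by length first and falls back to the string itself only when the lengths are equal.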

So now, we can do our comparison using tuples.

>>> sorted(L, key=lambda x: (len(x), x))
['a', 'b', 'aa', 'zz', 'zzz']

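Now the resulting order is right. To sort a numeric field in descending order, just put a negative sign in front of it in the key, e.g. key=lambda x: (-len(x), x) puts the longest strings first. This only works for numbers; a string field cannot be negated.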
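
A short sketch of descending order on one field. The first part negates the numeric length. The second part is one possible workaround for a descending string field, which is an addition here rather than something from the steps above: sort twice, least significant key first, and rely on the sort being stable. The variable names are made up for illustration.

```python
L = ["a", "b", "zz", "zzz", "aa"]

# Longest first, ties broken alphabetically: negate the numeric field.
longest_first = sorted(L, key=lambda x: (-len(x), x))
print(longest_first)  # ['zzz', 'aa', 'zz', 'a', 'b']

# Shortest first, ties broken in reverse alphabetical order.
# Strings cannot be negated, so sort by the secondary key first,
# then by the primary key; the stable sort preserves the first pass
# within each length group.
reverse_alpha = sorted(L, reverse=True)
shortest_first_reverse_alpha = sorted(reverse_alpha, key=len)
print(shortest_first_reverse_alpha)  # ['b', 'a', 'zz', 'aa', 'zzz']
```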
