I’m Doubling Down on a Mistake

Let me double down on a mistake. Two days ago, I wrote that Python lists are not lists, they’re arrays.

I got called out on twitter for not realising that Lists refers to the abstract data type called list and not a particular implementation.

These people are right that list is used for an abstract data type that may be implemented by an array.

§

However, the word list carries more meaning than the abstract data type list. When people say a list they are not thinking of an array. If a group of theoreticians uses the term list to in a restricted context, sure, they can go ahead, but it does not change the fact that when people say list in the context of computer programming, they most often mean “some type of linked list”.

Which is why it is not meaningless to say Python lists are arrays, not lists or to ask should we use an array or a list for this?

§

If I look at the Most used programming languages, Python is the only one I know of whose list type is not a list. Even PHP calls its arrays arrays [1]. Other languages, use array or vector for arrays, and list for, well, lists.

§

Calling arrays lists is confusing. I mean this quite literally [2]: I have cleared up confusions in other people’s minds about Python lists [3] by just saying they’re not lists, they’re arrays.

§

I also do not particularly care about implementation details, but I do care about performance. I think that if you are going to call something a list, it should have O(1) splicing.

I quite like the way that C++ does it (one of my #UnpopularOpinon is that C++ is still one of the most advanced programming languages around and others are still catching up). Its standard data structures are defined as abstract data types plus big-O performance guarantees. Naturally, this often constrains the implementation to a specific family of techniques (particularly for the basic types), but it still leaves a bit of wiggle room for implementers to tune to their particular architectures.

More importantly, it lets me use the structure without caring for the internal implementation if I have the right performance guarantees. If you tell me here’s this structure with these functions it is not really very usable until I have an idea of the performance. This is more and more important as data grows. Implementing car/cdr with arrays is fine if you have 100 elements, not if you have a billion.

[1] Cheap shot, sorry. PHP probably got this right by coincidence.
[2] In this case, I am using literal in the strict sense, although literally also means emphatically. The history of the word literally is similar to the overall gist of this post: in a restricted context, literally means non-figuratively, but the word has morphed to mean figuratively.
[3] Because I started using Python a long time ago and am fairly vocal about how great it is, a lot of people turn to me for doubts when they’re learning the language.

8 thoughts on “I’m Doubling Down on a Mistake

  1. > If I look at the Most used programming languages, Python is the only one I know of whose list type is not a list.

    In Java, List defines the abstract data type (interface) that is implemented by either ArrayList or LinkedList (and others).

    1. Thanks for the info. I actually managed to never really have to use modern Java. I only learned a version from 10 years ago in college, never had to use it since.

      1. Java’s connection framework has been in Java since 1998 🙂 Anyway, I’m not using Java nowadays, either.

  2. I agree 100% about defining standard containers as abstract types with performance guarantees. Even though I’m almost completely invested in Python these days, I am still befuddled that Big-O performance does not seem to be a concern anywhere in the Python docs.

  3. It’s also worth pointing out that C# also calls C++ style vectors Lists. There is the separate LinkedList, but List is backed by an underlying array.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.