I sometimes encounter errors like TypeError: unhashable type: 'list' while programming. I therefore decided to set aside some time to understand three important Python concepts: mutable, hashable and iterable.
1. mutable and immutable
Python represents all its data as objects. Each object is identified by an integer that is unique among the objects existing at the same time and stays constant during the object's lifetime. The built-in function id(object) returns the identity of a given object.
Python objects can be categorized into two types: mutable and immutable. The content of a mutable object can be altered without changing its identity. One trick to check whether a type is mutable is to use id(object). For instance,
# immutable type, str()
>>> s = 'abc'
>>> id(s)
140125615331648
>>> s += 'def' # a new object is created
>>> id(s)
140125614721616 # the id is changed
# mutable type, set()
>>> s = set(['a', 'b'])
>>> id(s)
140125614845184
>>> s.add('c')
>>> id(s)
140125614845184
Note that an immutable object cannot be changed in place: an operation that appears to modify it, such as s += 'def' above, creates a new object and rebinds the name to it. (A new object has to be created if a different value has to be stored.)
The principal built-in types in Python are numerics, sequences, mappings, classes, instances and exceptions.
(1) immutable types
- numbers: int(), float(), complex()
- sequences: str(), tuple(), bytes()
- set types: frozenset()
(2) mutable types
- sequences: list(), bytearray()
- set types: set()
- mapping types: dict(), collections.OrderedDict([items])
- classes, instances and exceptions (instances of user-defined classes are mutable by default)
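The two categories above can be checked with a short script. This is only a minimal sketch with illustrative variable names: it shows that a list keeps its identity when modified in place, that a tuple rejects item assignment, and that "changing" a tuple really means binding the name to a new object.

```python
# Mutable: a list can be changed in place and keeps the same identity.
numbers = [1, 2, 3]
id_before = id(numbers)
numbers.append(4)
same_object = id(numbers) == id_before

# Immutable: a tuple rejects item assignment with a TypeError.
point = (1, 2)
try:
    point[0] = 10
    tuple_assignment_failed = False
except TypeError:
    tuple_assignment_failed = True

# "Modifying" an immutable value rebinds the name to a new object;
# the original tuple is left untouched.
original = point
point = point + (3,)
rebound_to_new_object = point is not original

print(same_object, tuple_assignment_failed, rebound_to_new_object)
# True True True
```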
2. hashable and unhashable
The following description of hashable is excerpted from the glossary of the Python documentation.
An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() method). Hashable objects which compare equal must have the same hash value.
Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.
All of Python’s immutable built-in objects are hashable, while no mutable containers (such as lists or dictionaries) are. Objects which are instances of user-defined classes are hashable by default; they all compare unequal (except with themselves), and their hash value is derived from their id().
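A small sketch of these rules follows. The helper is_hashable and the class Plain are hypothetical names made up for this illustration. One caveat worth keeping in mind: a tuple is immutable, yet it is only hashable when all of its elements are hashable, so a tuple that holds a list cannot be hashed.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds (hypothetical helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


class Plain:
    """User-defined class that overrides neither __eq__ nor __hash__."""


print(is_hashable("abc"))      # True: immutable str
print(is_hashable((1, 2)))     # True: tuple of hashable items
print(is_hashable([1, 2]))     # False: list is a mutable container
print(is_hashable((1, [2])))   # False: tuple holding a list
print(is_hashable(Plain()))    # True: instances are hashable by default

# Hashable objects work as dictionary keys. Using a list as a key raises
# a TypeError that mentions the unhashable type.
lookup = {(0, 0): "origin"}
try:
    lookup[[0, 0]] = "origin"
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

When a list is needed as a dictionary key or set member, converting it to a tuple first (for example tuple(my_list)) is the usual way around the error, provided its elements are themselves hashable.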
3. iteration, iterable and iterator
Excerpt from [2]:
Iteration is a general term for taking each item of something, one after another. Any time you use a loop, explicit or implicit, to go over a group of items, that is iteration.
An iterable is an object that has an __iter__ method which returns an iterator, or which defines a __getitem__ method that can take sequential indexes starting from zero (and raises an IndexError when the indexes are no longer valid). So an iterable is an object that you can get an iterator from.
An iterator is an object with a next (Python 2) or __next__ (Python 3) method.
Whenever you use a for loop, or map, or a list comprehension, etc. in Python, the next method is called automatically to get each item from the iterator, thus going through the process of iteration.
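To make these definitions concrete, here is a minimal Python 3 sketch. The class names Countdown, CountdownIterator and Squares are invented for this illustration. It shows an iterable that defines __iter__, a separate iterator with __next__, and an iterable that relies only on __getitem__.

```python
class Countdown:
    """Iterable: __iter__ returns a fresh iterator on every call."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator: __next__ produces items, __iter__ returns self."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


class Squares:
    """Iterable through __getitem__ with sequential indexes from zero."""

    def __init__(self, count):
        self.count = count

    def __getitem__(self, index):
        if index >= self.count:
            raise IndexError(index)
        return index * index


print(list(Countdown(3)))  # [3, 2, 1]
print(list(Squares(4)))    # [0, 1, 4, 9]
```

Keeping the iterator in its own class means each pass over a Countdown starts from the beginning, because the iterable itself holds no iteration state.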
The following example from [2] illustrates these concepts.
>>> s = 'cat' # s is an ITERABLE
# s is a str object that is immutable
# s has no state
# s has a __getitem__() method
>>> t = iter(s) # t is an ITERATOR
# t has state (it starts by pointing at the "c")
# t has a __next__() method (next() in Python 2) and an __iter__() method
>>> next(t) # the next() function returns the next value and advances the state
'c'
>>> next(t) # the next() function returns the next value and advances
'a'
>>> next(t) # the next() function returns the next value and advances
't'
>>> next(t) # next() raises StopIteration to signal that iteration is complete
Traceback (most recent call last):
...
StopIteration
>>> iter(t) is t # the iterator is self-iterable
True
Excerpt from the Python glossary entry for iterable:
An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() or __getitem__() method.
Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), ...). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop.
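What the for statement does automatically can be written out by hand. The following is a rough sketch of the equivalent steps, not the interpreter's actual implementation, and the variable names are illustrative.

```python
letters = ["a", "b", "c"]

# The usual form: the for statement manages the iterator for us.
collected_by_for = []
for letter in letters:
    collected_by_for.append(letter)

# Roughly the same steps spelled out explicitly.
collected_by_hand = []
iterator = iter(letters)          # ask the iterable for an iterator
while True:
    try:
        letter = next(iterator)   # fetch the next item
    except StopIteration:         # the iterator is exhausted
        break
    collected_by_hand.append(letter)

print(collected_by_for == collected_by_hand)  # True
```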
Excerpt from the Python glossary entry for iterator:
An object representing a stream of data. Repeated calls to the iterator’s __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available, a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again.
Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted.
One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container.
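The multiple-pass pitfall is easy to reproduce. In this short sketch (variable names are illustrative), a list can be traversed twice, while an iterator obtained from it yields nothing on the second pass.

```python
words = ["spam", "eggs"]

# A container hands out a fresh iterator on every pass.
first_pass = list(words)
second_pass = list(words)

# An iterator returns itself from iter(), so a second pass finds it
# already exhausted.
word_iterator = iter(words)
iterator_first_pass = list(word_iterator)
iterator_second_pass = list(word_iterator)

print(first_pass, second_pass)
# ['spam', 'eggs'] ['spam', 'eggs']
print(iterator_first_pass, iterator_second_pass)
# ['spam', 'eggs'] []
```

If the data has to be traversed more than once, keeping the container itself (or materializing the iterator with list()) avoids the surprise.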
References:
[1] Immutable vs mutable types - Python
[2] What exactly are Python's iterator, iterable, and iteration protocols?