Understand Common Sequence Data Types in Python – String, Tuple, and List

String, tuple, and list are the three common built-in ordered collection data types in Python. These sequence data types share a set of common operations.

Common Sequence Operations in Python
Name Operator Example
reference: Operations on Any Sequence in Python (interactivepython.org), 5.6 Sequence Types
indexing [n]
            data = [1,2,3,4,5]
            data[3] # return 4
concatenation +
            data = [1,2,3,4,5]
            data + [9] # return [1,2,3,4,5,9]
repetition *
            data = [1,2,3,4,5]
            data * 2 # return [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
membership in
            data = [9,2,4,4,6,2,8]
            4 in data # return True
            for val in data: print(val), # in also drives iteration; prints 9 2 4 4 6 2 8 (Python 2 syntax)
length len()
            data = [1,2,3,4,5]
            len(data) # return 5
slicing with step k [i:j:k]
            data = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
            data[0:10:2] # return [1, 3, 5, 7, 9]
slicing [i:j]
            data = [1,2,3,4,5]
            data[1:3] # return [2,3]
Minimum min(data)
            data = [1,2,3,4,5]
            min(data) # return 1
Maximum max(data)
            data = [1,2,3,4,5]
            max(data) # return 5
Index data.index(sub[, start[, end]])
            data = [1,2,3,4,5]
            data.index(3) # return 2
Count data.count(i)
            data = [1,2,3,4,5,3]
            data.count(3) # return 2


Of these three sequence data types, string and tuple are immutable, while list is mutable. The common operations above can be used on both mutable and immutable sequence types.


my_list = [1,2,3,4,5,6,7,8]
my_string = 'My name is Eva'
my_tuple = (1,2,3,4,'A','B','C')
my_list[0:2] # return [1, 2]
my_string[0:2] # return 'My'
my_tuple[0:2] # return (1, 2)
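
As a rough illustration of the difference between mutable and immutable sequences, the sketch below (written in Python 3 syntax, reusing the variable names from the example above) tries item assignment on each type. Only the list accepts it; the string and the tuple raise TypeError.

```python
my_list = [1, 2, 3, 4, 5, 6, 7, 8]
my_string = 'My name is Eva'
my_tuple = (1, 2, 3, 4, 'A', 'B', 'C')

# Lists are mutable, so item assignment works in place.
my_list[0] = 'changed'

# Strings and tuples are immutable, so item assignment raises TypeError.
rejected = []
for seq in (my_string, my_tuple):
    try:
        seq[0] = 'changed'
    except TypeError:
        rejected.append(type(seq).__name__)

print(my_list)   # ['changed', 2, 3, 4, 5, 6, 7, 8]
print(rejected)  # ['str', 'tuple']
```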

Tuple

Before we talk about the special methods for each data type, I’d like to talk about tuple first.

In Python, both list and tuple can hold heterogeneous items (although a list is intended to be a homogeneous sequence); however, a tuple has no methods that modify it, because a tuple is immutable.

So, why use a tuple?

Greg Wilson suggested that tuples should be one of the things Python 3000 could leave out, but Phillip Eby pointed out that tuples are not just constant lists but heterogeneous data structures.

Tuples are not constant lists; this is a common misconception. Lists are intended to be homogeneous sequences, while tuples are heterogeneous data structures.

(from Python Tuples are Not Just Constant Lists, jtauber.com)

If you treat a tuple as a constant list, its purpose can be confusing; but if you think of a tuple as a data structure in which each position has its own meaning, much like the fields of a JSON object, it is easier to see why tuples are useful.


data = ('Eva', 20, 'Front-End Software Engineer', 'F') # tuple packing
(name, age, job_title, gender) = data # tuple unpacking
name # return 'Eva'
age # return 20
job_title # return 'Front-End Software Engineer'
gender # return 'F'
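
One way to make the "tuple as a record" idea more explicit is collections.namedtuple from the standard library, which gives each position a name. This is only a sketch: the Person type and its field names are made up for illustration and are not part of the original example.

```python
from collections import namedtuple

# Hypothetical record type; the field names mirror the unpacking example above.
Person = namedtuple('Person', ['name', 'age', 'job_title', 'gender'])

eva = Person('Eva', 20, 'Front-End Software Engineer', 'F')

# Fields can be read by name or by position, and unpacking still works.
print(eva.name)  # Eva
print(eva[1])    # 20
name, age, job_title, gender = eva
print(job_title)  # Front-End Software Engineer
```

A named tuple is still a tuple, so it stays immutable and supports all the common sequence operations listed earlier.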

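A tuple is also a good fit for data that should not change. Because a tuple is immutable, it can be stored at a fixed size, without the spare capacity a list may reserve for growth, so it generally takes less memory than a list holding the same items.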

Measured in bytes using Python 2.5 in 64-bit Ubuntu Linux
data type bytes
source: Python Memory Usage: What values are taking up so much memory?
int 24
float 24
tuple 63
list 101
dict 298
old-style class 345
new-style class 336
subclassed tuple 79
Record 79
Record with old class mixin 79
Record with new class mixin 79
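
The numbers above come from Python 2.5 and will differ on other versions and platforms. A rough way to check on your own interpreter is sys.getsizeof, which reports the size of the container object itself and not of the items it refers to. The sketch below assumes CPython, where a tuple is expected to be smaller than a list holding the same items; the exact byte counts are deliberately not shown because they vary.

```python
import sys

items_as_tuple = (1, 2, 3, 4, 5)
items_as_list = [1, 2, 3, 4, 5]

# getsizeof counts only the container, not the integers stored in it.
tuple_size = sys.getsizeof(items_as_tuple)
list_size = sys.getsizeof(items_as_list)

print('tuple:', tuple_size, 'bytes')
print('list: ', list_size, 'bytes')
```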


List

In the tuple section, we mentioned list briefly. A list is an ordered collection data type that can hold heterogeneous items and is indexed starting from 0. Here are some common methods for list:

Methods for list
Name Operator Example
Delete item del data[i:j]
data = [0,1,2,3,4,5]
del data[1:2]
data # return [0, 2, 3, 4, 5]
Delete item with k step del data[i:j:k]
data = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
del data[0:10:2]
data # return [1, 3, 5, 7, 9, 10, 11, 12, 13, 14, 15]
Append data.append(item)
data = [0,1,2,3,4,5]
data.append('ABC')
data # return [0, 1, 2, 3, 4, 5, 'ABC']


data = [0,1,2,3,4,5]
data.append(['a', 'b', 'c'])
data # return [0, 1, 2, 3, 4, 5, ['a', 'b', 'c']]
Extend data.extend(item)
data = [0,1,2,3,4,5]
data.extend('ABC')
data # return [0, 1, 2, 3, 4, 5, 'A', 'B', 'C']


data = [0,1,2,3,4,5]
data.extend(['a', 'b', 'c'])
data # return [0, 1, 2, 3, 4, 5, 'a', 'b', 'c']
Insert data.insert(i,item)
data = [0,1,2,3,4,5]
data.insert(2, 'ABC')
data # return [0, 1, 'ABC', 2, 3, 4, 5]
Pop data.pop([i])
data = [0,1,2,3,4,5]
data.pop() # return 5
data # return [0, 1, 2, 3, 4]

You can also specify the index of the item you want to pop:

data = [0,1,2,3,4,5]
data.pop(2) # return 2
data # return [0, 1, 3, 4, 5]
Reverse data.reverse()
data = [0,1,2,3,4,5]
data.reverse()
data # return [5, 4, 3, 2, 1, 0]
Remove Item data.remove(item)
data = [0,1,2,3,4,5]
data.remove(3) # 3 is the item value, not the item position
data # return [0, 1, 2, 4, 5]
Sorting data.sort([cmp, key, reverse])
data = [4,5,3,2,6,8,1,0]
data.sort()
data # return [0, 1, 2, 3, 4, 5, 6, 8]
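
A detail the table does not spell out: most of these methods change the list in place and return None, while pop() returns the removed item. A small sketch in Python 3 syntax to illustrate:

```python
data = [0, 1, 2, 3, 4, 5]

# append() and extend() modify the list and return None.
append_result = data.append(6)
extend_result = data.extend([7, 8])

# pop() modifies the list and returns the removed item.
popped = data.pop()

# remove() deletes the first matching value; a missing value raises ValueError.
data.remove(0)
try:
    data.remove(100)
except ValueError:
    missing = True

print(append_result, extend_result)  # None None
print(popped)                        # 8
print(data)                          # [1, 2, 3, 4, 5, 6, 7]
```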

For more sorting examples, see the sort() method section below.

sort() method

In Python 2, both sort() and sorted() accept three optional arguments: cmp, key, and reverse. Using key and reverse is preferred because they are much faster than cmp: when Python sorts a list, cmp is called multiple times for each list element, while key and reverse touch each element only once. (Python 3 removed the cmp argument.)

Instead of data.sort(), you can also use sorted(data). The difference is that data.sort() modifies the original list, while sorted(data) leaves it alone and returns a new sorted list.


data = [4,5,3,2,6,8,1,0]
data.sort()
data # return [0, 1, 2, 3, 4, 5, 6, 8]


data = [4,5,3,2,6,8,1,0]
sorted(data) # return [0, 1, 2, 3, 4, 5, 6, 8]
data # return [4, 5, 3, 2, 6, 8, 1, 0]
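
Two practical consequences of this difference, sketched below in Python 3 syntax: data.sort() returns None, so assigning its result is a common mistake, and sorted() accepts any iterable, including tuples and strings, and always returns a new list.

```python
data = [4, 5, 3, 2, 6, 8, 1, 0]

# sort() works in place and returns None.
result = data.sort()
print(result)  # None
print(data)    # [0, 1, 2, 3, 4, 5, 6, 8]

# sorted() works on immutable sequences too and always returns a list.
print(sorted((3, 1, 2)))  # [1, 2, 3]
print(sorted('eva'))      # ['a', 'e', 'v']
```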

cmp specifies a comparison function of two arguments. The function checks whether the first argument is smaller than, equal to, or larger than the second, and returns a negative number, zero, or a positive number depending on the result.

Here is the simplest example of cmp, which shows how the sort order follows from the returned value.


data = [12,3,5,16,9,7,2,11,14]
data.sort(cmp=lambda x, y: x - y)
data # return [2, 3, 5, 7, 9, 11, 12, 14, 16]


data = [12,3,5,16,9,7,2,11,14]
data.sort(cmp=lambda x, y: y - x)
data # return [16, 14, 12, 11, 9, 7, 5, 3, 2]
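
The cmp argument only exists in Python 2. In Python 3, a comparison function can still be used by wrapping it with functools.cmp_to_key, which turns it into a key function. The sketch below repeats the two examples above in that form; the function names ascending and descending are made up for illustration.

```python
from functools import cmp_to_key

def ascending(x, y):
    # Negative if x < y, zero if equal, positive if x > y.
    return x - y

def descending(x, y):
    # Swapping the operands reverses the order.
    return y - x

data = [12, 3, 5, 16, 9, 7, 2, 11, 14]

smallest_first = sorted(data, key=cmp_to_key(ascending))
largest_first = sorted(data, key=cmp_to_key(descending))

print(smallest_first)  # [2, 3, 5, 7, 9, 11, 12, 14, 16]
print(largest_first)   # [16, 14, 12, 11, 9, 7, 5, 3, 2]
```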

key specifies a function of one argument that returns the value to compare for each element. The default value is None, which means the elements are compared directly.


data = ['bbbb', 'aa', 'ccc', 'eeeee', 'f']
data.sort()
data # return ['aa', 'bbbb', 'ccc', 'eeeee', 'f']


data = ['bbbb', 'aa', 'ccc', 'eeeee', 'f']
data.sort(key=len)
data # return ['f', 'aa', 'ccc', 'bbbb', 'eeeee']
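
A key function is also handy for sorting records. The sketch below (Python 3 syntax, with made-up sample data) sorts a list of tuples by their second field and sorts strings without regard to case.

```python
people = [('Eva', 20), ('Dean', 35), ('Sam', 31)]

# Sort by age, the second field of each tuple.
by_age = sorted(people, key=lambda person: person[1])
print(by_age)  # [('Eva', 20), ('Sam', 31), ('Dean', 35)]

# Uppercase letters sort before lowercase ones by default;
# key=str.lower compares lowercased copies instead.
words = ['banana', 'apple', 'Cherry']
default_order = sorted(words)
ignore_case = sorted(words, key=str.lower)
print(default_order)  # ['Cherry', 'apple', 'banana']
print(ignore_case)    # ['apple', 'banana', 'Cherry']
```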

reverse is a Boolean value, which tells sort() to reverse the result or not. This argument can also be used with cmp or key.


data = [12,3,5,16,9,7,2,11,14]
data.sort(cmp=lambda x, y: x - y, reverse=True)
data # return [16, 14, 12, 11, 9, 7, 5, 3, 2]
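
reverse combines with key in the same way. A short sketch in Python 3 syntax, where cmp is not available; note that elements with equal keys keep their original relative order even when reverse=True.

```python
data = ['bbbb', 'aa', 'ccc', 'eeeee', 'f', 'dd']

# Longest strings first; 'aa' and 'dd' have equal length and stay in input order.
data.sort(key=len, reverse=True)
print(data)  # ['eeeee', 'bbbb', 'ccc', 'aa', 'dd', 'f']
```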

list and reference

When you work with lists, you need to be careful about references.


data = [1,2,3,4,5]
A = data * 3
A # return [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
data[0] = 'Castiel'
A # still returns [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

But if you do something like this:


data = [1,2,3,4,5]
A = [data] * 3
A # return [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
data[0] = 'Castiel'
A # will return [['Castiel', 2, 3, 4, 5], ['Castiel', 2, 3, 4, 5], ['Castiel', 2, 3, 4, 5]]

When you put a list inside another list, Python stores a reference to it rather than a copy, so every repetition points to the same object. If you do not want to share the original list, you can simply use [:] to copy it.


data = [1,2,3,4,5]
A = [data[:]] * 3
A # return [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
data[0] = 'Castiel'
A # still return [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
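
Note that [data[:]] * 3 makes one copy of data and then repeats a reference to that single copy three times, so the three inner lists are still the same object. If three independent inner lists are wanted, a list comprehension that copies on every iteration is one option. A minimal sketch; the names shared and independent are made up for illustration:

```python
data = [1, 2, 3, 4, 5]

shared = [data[:]] * 3                      # one copy, referenced three times
independent = [data[:] for _ in range(3)]   # a fresh copy per element

shared[0][0] = 'Castiel'
independent[0][0] = 'Castiel'

print(shared)       # [['Castiel', 2, 3, 4, 5], ['Castiel', 2, 3, 4, 5], ['Castiel', 2, 3, 4, 5]]
print(independent)  # [['Castiel', 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
print(shared[0] is shared[1])            # True
print(independent[0] is independent[1])  # False
```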

However, [:] makes only a shallow copy: it copies the first level of elements and does not recursively duplicate the objects nested inside them. Let's reuse the example above and see what happens with a nested list.


data = [['a','b', 'c'],2,3,4,5]
A = [data[:]] * 3
A # return [[['a', 'b', 'c'], 2, 3, 4, 5], [['a', 'b', 'c'], 2, 3, 4, 5], [['a', 'b', 'c'], 2, 3, 4, 5]]
data[1] = 'Castiel'
A # still return [[['a', 'b', 'c'], 2, 3, 4, 5], [['a', 'b', 'c'], 2, 3, 4, 5], [['a', 'b', 'c'], 2, 3, 4, 5]]
data[0][1] = 'Castiel'
A # return [[['a', 'Castiel', 'c'], 2, 3, 4, 5], [['a', 'Castiel', 'c'], 2, 3, 4, 5], [['a', 'Castiel', 'c'], 2, 3, 4, 5]]

To copy a nested list so that the inner lists are duplicated as well, and no references are shared with the original, you will need to use copy.deepcopy().
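
As a minimal sketch of the difference, the example below copies the same nested list with [:] and with copy.deepcopy() from the standard library, then changes the inner list. The names shallow and deep are made up for illustration.

```python
import copy

data = [['a', 'b', 'c'], 2, 3, 4, 5]

shallow = data[:]           # new outer list, same inner list object
deep = copy.deepcopy(data)  # new outer list and new inner list

data[0][1] = 'Castiel'

print(shallow[0])  # ['a', 'Castiel', 'c']
print(deep[0])     # ['a', 'b', 'c']
```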

String

Like tuple, string is an immutable sequence data type. Besides the common sequence operations introduced above, string has many more methods that make it easy to work with text, but I am not going to list all of them. If you want to know more about string methods, please check the official Python documentation.
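
Because strings are immutable, every string method returns a new string and leaves the original untouched. A quick sketch in Python 3 syntax:

```python
data = 'this is my string'

# upper() returns a new string; data itself does not change.
shouted = data.upper()
print(shouted)  # THIS IS MY STRING
print(data)     # this is my string

# To "change" a string, rebind the name to the new value.
data = data.replace('my', 'your')
print(data)     # this is your string
```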

Common String Methods in Python
Name Method Example
source: 5.6.1. String Methods
Capitalize str.capitalize()
data = 'this is my string'
data.capitalize() # return 'This is my string'
Center str.center(width[, fillchar])
data = 'In Center'
data.center(15) # return '   In Center   '
data.center(15,'*') # return '***In Center***'
Ends With str.endswith(suffix[, start[, end]])
data = 'this is my string'
data.endswith('ing') # return True
data.endswith('ing', 0, 8) # return False
Expand Tabs to Spaces str.expandtabs([tabsize])
data = '01\t02\t03'
data.expandtabs(4) # return '01  02  03'
Find
str.find(sub[, start[, end]])
str.rfind(sub[, start[, end]])

data = 'this is my string'
data.find('is') # return 2
data.find('is', 4) # return 5
data.find('is', 7, 10) # return -1
data.rfind('is') # return 5
Format str.format(*args, **kwargs)
"The three common sequence collection data types in Python are {0}, {1}, and {2}.".format('string', 'tuple', 'list')
# return 'The three common sequence collection data types in Python are string, tuple, and list.'
Left justified/Right justified
str.ljust(width[, fillchar])
str.rjust(width[, fillchar])

data = 'my string'
data.ljust(20, '-') # return 'my string-----------'
data.rjust(20) # return '           my string'
Join str.join(iterable)
'-'.join(['python', 'in', 'here']) # return 'python-in-here'
Partition str.partition(sep)
data = 'Me: This is a long article'
data.partition(':') # return ('Me', ':', ' This is a long article')
Replace str.replace(old, new[, count])
data='this is python'
data.replace(' ', '-') # return 'this-is-python'
data.replace(' ', ', ', 1) # return 'this, is python'
Strip
str.strip()
str.lstrip()
str.rstrip()

data = ' How are you~~ '
data.strip() # return 'How are you~~'
data.lstrip() # return 'How are you~~ '
data.rstrip() # return ' How are you~~'
Split
str.split([sep[, maxsplit]])
str.rsplit([sep[, maxsplit]])
str.splitlines([keepends])

data = 'my name is eva'
data.split(' ') # return ['my', 'name', 'is', 'eva']
data.split(' ', 2) # return ['my', 'name', 'is eva']
data.rsplit(' ', 2) # return ['my name', 'is', 'eva']
data = """Hello~
Where
is
Supernatural 9x14"""
data.splitlines() # return ['Hello~', 'Where', 'is', 'Supernatural 9x14']
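
These methods are often chained. As an illustration (Python 3 syntax, with a made-up input string), the sketch below collapses repeated whitespace with split() and join(), and then capitalizes the result:

```python
raw = '   my   name is    eva  '

# split() with no arguments splits on any run of whitespace
# and ignores leading and trailing whitespace.
words = raw.split()
print(words)  # ['my', 'name', 'is', 'eva']

# join() puts the pieces back together with a single space.
cleaned = ' '.join(words).capitalize()
print(cleaned)  # My name is eva
```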

Okay! This article is long enough.
