Introducing Strings, Collections, and Iteration
str: the data type for strings in Python
Sequence of Unicode code points
Python 3 source files are UTF-8 encoded by default
'This is a string'
"This is also a string"
'"Yes!", he said, "I agree!"'
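To make the "sequence of Unicode code points" point concrete, here is a minimal sketch. The sample text and the variable names are illustrative assumptions, not part of the original slides. It compares the number of code points in a string with the number of bytes in its UTF-8 encoding.

```python
# A str is a sequence of Unicode code points, not of bytes.
text = "π ≈ 3.14"  # hypothetical sample text containing two non-ASCII characters

code_points = [ord(ch) for ch in text]  # one integer per character
utf8_bytes = text.encode("utf-8")       # the encoded form, for comparison

print(len(text))        # 8 code points
print(len(utf8_bytes))  # 11 bytes: π takes 2 bytes and ≈ takes 3 in UTF-8
print(code_points[0])   # 960, the code point of π
```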
String Literals
Adjacent string literals are concatenated into a single string
>>> "first" "second"
'firstsecond'
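One common use of adjacent-literal concatenation is splitting a long string across source lines. The sketch below assumes a made-up message; the parentheses only group the literals and add no characters to the result.

```python
# Adjacent literals are joined at compile time; no + operator is needed.
message = ("first"
           "second")

# Quote styles can be mixed freely between the adjacent pieces.
mixed = 'single ' "and double"

print(message)  # firstsecond
print(mixed)    # single and double
```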
Strings with Newlines
Multiline strings
Spread the literal across multiple lines
>>> """This is
... a multiline
... string"""
'This is\na multiline\nstring'
>>> '''So
... is
... this.'''
'So\nis\nthis.'
>>> m = 'This string\nspans multiple\nlines'
>>> m
'This string\nspans multiple\nlines'
>>> print(m)
This string
spans multiple
lines
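The two ways of writing newlines produce the same string object contents. A small sketch, with illustrative variable names, that checks this and splits the result back into lines:

```python
# A triple-quoted literal spread over several source lines...
triple = """This string
spans multiple
lines"""

# ...holds exactly the same characters as a single-line literal with \n escapes.
escaped = 'This string\nspans multiple\nlines'

same = triple == escaped      # True
lines = triple.splitlines()   # split on the embedded newlines

print(same)
print(lines)  # ['This string', 'spans multiple', 'lines']
```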
Escape sequences
Embed escape sequences in a single-line literal
>>> "This is a \" in a string"
'This is a " in a string'
>>> 'This is a \' in a string'
"This is a ' in a string"
>>> 'This is a \" and a \' in a string'
'This is a " and a \' in a string'
>>> k = 'A \\ in a string'
>>> k
'A \\ in a string'
>>> print(k)
A \ in a string
All Escape Sequences
Sequence | Meaning |
\newline | Backslash and newline ignored |
\\ | Backslash (\) |
\' | Single quote (') |
\" | Double quote (") |
\a | ASCII Bell (BEL) |
\b | ASCII Backspace (BS) |
\f | ASCII Formfeed (FF) |
\n | ASCII Linefeed (LF) |
\r | ASCII Carriage Return (CR) |
\t | ASCII Horizontal Tab (TAB) |
\v | ASCII Vertical Tab (VT) |
\ooo | Character with octal value ooo |
\xhh | Character with hex value hh |
Only recognized in string literals:
\N{name} | Character named name in the Unicode database |
\uxxxx | Character with 16-bit hex value xxxx |
\Uxxxxxxxx | Character with 32-bit hex value xxxxxxxx |
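A few of the escape sequences from the table can be checked directly. This is a small sketch with illustrative variable names; the particular characters chosen are assumptions for demonstration.

```python
# Numeric escapes: octal and hex both spell the same character here.
octal_a = "\101"   # octal 101 is decimal 65, i.e. 'A'
hex_a = "\x41"     # hex 41 is decimal 65, i.e. 'A'

# Unicode escapes (string literals only): by code point or by name.
by_code = "\u00e9"
by_name = "\N{LATIN SMALL LETTER E WITH ACUTE}"
wide = "\U0001F40D"  # a code point beyond 16 bits needs the 8-digit form

# A tab escape counts as a single character in the string.
tabbed = "name\tvalue"

print(octal_a == hex_a == "A")  # True
print(by_code == by_name)       # True
print(len(wide), len(tabbed))   # 1 10
```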
String Features
>>> path = r'C:\Users\Merlin\Documents\Spells'
>>> path
'C:\\Users\\Merlin\\Documents\\Spells'
>>> print(path)
C:\Users\Merlin\Documents\Spells
>>> str(496)
'496'
>>> str(6.02e23)
'6.02e+23'
>>> s = 'parrot'
>>> s[4]
'o'
>>> type(s[4])
<class 'str'>
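The three features shown above (raw strings, the str constructor, and indexing) can be summarized in one short sketch. Variable names are illustrative.

```python
# A raw string leaves backslashes alone, so it equals the doubled-backslash form.
raw_path = r'C:\Users\Merlin\Documents\Spells'
escaped_path = 'C:\\Users\\Merlin\\Documents\\Spells'
same_path = raw_path == escaped_path   # True

# The str constructor builds string representations of other types.
int_text = str(496)        # '496'
float_text = str(6.02e23)  # '6.02e+23'

# There is no separate character type: indexing a str gives a one-character str.
s = 'parrot'
ch = s[4]                  # 'o'

print(same_path, int_text, float_text, ch, type(ch).__name__)
```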
bytes: the data type for sequences of bytes
Raw binary data
Fixed-width single-byte encodings
>>> b'data'
b'data'
>>> b"data"
b'data'
>>> d = b'some bytes'
>>> d[0]
115
>>> d.split()
[b'some', b'bytes']
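The relationship between str and bytes is easiest to see as a round trip. A minimal sketch, assuming UTF-8 as the encoding and illustrative variable names:

```python
text = "some bytes"

# Encoding turns a str into bytes; decoding reverses it.
data = text.encode("utf-8")
round_trip = data.decode("utf-8")

# Indexing bytes yields an int (the byte value), not a one-byte bytes object.
first = data[0]

print(data)                # b'some bytes'
print(first)               # 115, the byte value of 's'
print(round_trip == text)  # True
```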
list: sequences of objects
A workhorse in Python
>>> [1, 9, 8]
[1, 9, 8]
>>> a = ["apple", "orange", "pear"]
>>> a[1]
'orange'
>>> a[1] = 7
>>> a
['apple', 7, 'pear']
>>> b = []
>>> b.append(1.618)
>>> b
[1.618]
>>> b.append(1.414)
>>> b
[1.618, 1.414]
>>> list("characters")
['c', 'h', 'a', 'r', 'a', 'c', 't', 'e', 'r', 's']
>>> c = ['bear',
... 'giraffe',
... 'elephant',
... 'caterpillar',]
>>> c
['bear', 'giraffe', 'elephant', 'caterpillar']
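The list operations above, gathered into one script-style sketch. The values are illustrative and "plum" is an assumption added for the append step.

```python
# Lists are mutable and may hold objects of different types.
fruits = ["apple", "orange", "pear"]
fruits[1] = 7            # replace an element by index
fruits.append("plum")    # grow the list at the end

# The list constructor builds a list from any iterable, such as a string.
letters = list("abc")

# A trailing comma after the last element is allowed in a multi-line literal.
animals = [
    'bear',
    'giraffe',
]

print(fruits)   # ['apple', 7, 'pear', 'plum']
print(letters)  # ['a', 'b', 'c']
print(animals)  # ['bear', 'giraffe']
```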
dict: a fundamental data structure in Python
Map keys to values
Also known as maps or associative arrays
>>> d = {'alice': '878-8728-922', 'bob': '256-4532-523', 'eve': '123-3432-342'}
>>> d['alice']
'878-8728-922'
>>> d['alice'] = '323-3123-156'
>>> d
{'alice': '323-3123-156', 'bob': '256-4532-523', 'eve': '123-3432-342'}
>>> d['charles'] = '123-5232-125'
>>> d
{'alice': '323-3123-156', 'bob': '256-4532-523', 'eve': '123-3432-342', 'charles': '123-5232-125'}
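A compact sketch of the same dictionary operations. The membership test with `in` and the `get` method are not shown in the slides above; they are standard dict features included here as an assumption about what a reader would want next.

```python
phone_book = {'alice': '878-8728-922', 'bob': '256-4532-523'}

# Assigning to an existing key replaces its value; a new key adds an entry.
phone_book['alice'] = '323-3123-156'
phone_book['charles'] = '123-5232-125'

# Looking up a missing key with [] raises KeyError, so test or use get().
has_eve = 'eve' in phone_book
eve_number = phone_book.get('eve', 'unknown')

print(phone_book['alice'])  # 323-3123-156
print(has_eve, eve_number)  # False unknown
print(len(phone_book))      # 3
```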
for loop: visit each item in an iterable sequence
for item in iterable:
    body
>>> cities = ["Cape Town", "London", "New York", "Paris"]
>>> for city in cities:
...     print(city)
...
Cape Town
London
New York
Paris
>>> colors = {'crimson': 0xdc143c, 'coral': 0xff7f50, 'teal': 0x008080}
>>> for color in colors:
...     print(color, colors[color])
...
crimson 14423100
coral 16744272
teal 32896
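Iterating over a dictionary yields its keys, as in the loop above. As a hedged variation, the sketch below uses the `items()` method (not shown in the slides) to get each key and value together, and formats the values back into six-digit hex. The `hex_codes` name is illustrative.

```python
colors = {'crimson': 0xdc143c, 'coral': 0xff7f50, 'teal': 0x008080}

hex_codes = []
# items() yields (key, value) pairs, which the loop unpacks into two names.
for name, value in colors.items():
    hex_codes.append(f"{name} #{value:06x}")  # zero-padded lowercase hex

for entry in hex_codes:
    print(entry)
# crimson #dc143c
# coral #ff7f50
# teal #008080
```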
Putting it all Together
>>> from urllib.request import urlopen
>>> story = urlopen('')
>>> story_words = []
>>> for line in story:
...     line_words = line.split()
...     for word in line_words:
...         story_words.append(word)
...
>>> story.close()
>>> story_words
[b'It', b'was', b'the', b'best', b'of', b'times', b'it', b'was', b'the', b'worst', b'of', b'times', b'it', b'was', b'the', b'age', b'of', b'wisdom', b'it', b'was', b'the', b'age', b'of', b'foolishness', b'it', b'was', b'the', b'epoch', b'of', b'belief', b'it', b'was', b'the', b'epoch', b'of', b'incredulity', b'it', b'was', b'the', b'season', b'of', b'Light', b'it', b'was', b'the', b'season', b'of', b'Darkness', b'it', b'was', b'the', b'spring', b'of', b'hope', b'it', b'was', b'the', b'winter', b'of', b'despair', b'we', b'had', b'everything', b'before', b'us', b'we', b'had', b'nothing', b'before', b'us', b'we', b'were', b'all', b'going', b'direct', b'to', b'Heaven', b'we', b'were', b'all', b'going', b'direct', b'the', b'other', b'way', b'in', b'short', b'the', b'period', b'was', b'so', b'far', b'like', b'the', b'present', b'period', b'that', b'some', b'of', b'its', b'noisiest', b'authorities', b'insisted', b'on', b'its', b'being', b'received', b'for', b'good', b'or', b'for', b'evil', b'in', b'the', b'superlative', b'degree', b'of', b'comparison', b'only']
Recall Bytes
Bytes literals prefixed with lowercase ‘b’
HTTP data is provided as bytes
Use bytes.decode() to get strings
>>> story = urlopen('')
>>> story_words = []
>>> for line in story:
...     line_words = line.decode('utf8').split()
...     for word in line_words:
...         story_words.append(word)
...
>>> story.close()
>>> story_words
['It', 'was', 'the', 'best', 'of', 'times', 'it', 'was', 'the', 'worst', 'of', 'times', 'it', 'was', 'the', 'age', 'of', 'wisdom', 'it', 'was', 'the', 'age', 'of', 'foolishness', 'it', 'was', 'the', 'epoch', 'of', 'belief', 'it', 'was', 'the', 'epoch', 'of', 'incredulity', 'it', 'was', 'the', 'season', 'of', 'Light', 'it', 'was', 'the', 'season', 'of', 'Darkness', 'it', 'was', 'the', 'spring', 'of', 'hope', 'it', 'was', 'the', 'winter', 'of', 'despair', 'we', 'had', 'everything', 'before', 'us', 'we', 'had', 'nothing', 'before', 'us', 'we', 'were', 'all', 'going', 'direct', 'to', 'Heaven', 'we', 'were', 'all', 'going', 'direct', 'the', 'other', 'way', 'in', 'short', 'the', 'period', 'was', 'so', 'far', 'like', 'the', 'present', 'period', 'that', 'some', 'of', 'its', 'noisiest', 'authorities', 'insisted', 'on', 'its', 'being', 'received', 'for', 'good', 'or', 'for', 'evil', 'in', 'the', 'superlative', 'degree', 'of', 'comparison', 'only']
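The decode-then-split loop can be wrapped in a function and tried without a network connection. This is a sketch under stated assumptions: `collect_words` is a hypothetical helper name, and an `io.BytesIO` object stands in for the HTTP response, since both yield bytes lines when iterated.

```python
import io


def collect_words(byte_lines, encoding="utf-8"):
    """Decode each bytes line and gather its whitespace-separated words."""
    words = []
    for line in byte_lines:
        # Decode first so the resulting words are str rather than bytes.
        for word in line.decode(encoding).split():
            words.append(word)
    return words


# Stand-in for the response object: iterating a BytesIO yields bytes lines.
fake_response = io.BytesIO(b"It was the best of times\nit was the worst of times\n")
story_words = collect_words(fake_response)
fake_response.close()

print(story_words)
```

The same function should accept the object returned by `urlopen`, because the loop only relies on iteration producing bytes lines.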