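Python syntax notes

Reference textbook:

Python编程基础 (Python Programming Basics)

1. Python basics

Operators (listed from highest to lowest precedence; *, /, // and % share one level, as do + and -):

** * / // % + -

String concatenation:

'Alice'+' hello' evaluates to 'Alice hello'

String replication (the only valid form is string * integer):

'Alice'*5 evaluates to 'AliceAliceAliceAliceAlice'

Assignment statements: a variable can be overwritten, and its data type does not need to be declared.

A variable name cannot contain hyphens, spaces, or special characters, and cannot begin with a digit.

Comments start with #.

Input and output functions: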

print('hello,world')
myName=input()
print('It is good to meet you, '+myName)
print(len(myName))

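The str() function can be passed an integer value and evaluates to its string form.

>>> str(29)
'29'

The int() and float() functions convert the value passed to them to integer and floating-point form, respectively.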
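As a small illustration of these conversions, the sketch below applies str(), int(), and float() to a few sample values. The variable names and values are made up for this example.

```python
# Converting between strings and numbers with str(), int(), and float().
age_text = '29'                 # e.g. the kind of string input() returns
age = int(age_text)             # string -> integer
price = float('3.14')           # string -> floating-point number
label = 'Age: ' + str(age + 1)  # integer -> string, so it can be concatenated

print(age, price, label)
print(int(7.9))                 # int() drops the fractional part instead of rounding
```

Because input() always returns a string, a conversion like int(age_text) is needed before doing arithmetic on what the user typed.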

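2. Flow control

An integer or floating-point value is never equal to a string.

>>> 42=='42'
False

The <, >, <=, and >= operators are used here only with integer and floating-point values; comparing a number with a string raises a TypeError.

Boolean operators:

Binary: and, or. Unary: not.

Precedence: not > and > or

>>> 2+2==4 and not 2+2==5 and 2*2==4
True

Flow control statements:

if else elif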

# name='Mary'
name='lee'
password='1234'
age=10
if name=='Mary':
    print("hello,Mary")
    if password=='1234':
        print('Access granted.')
    else:
        print('Wrong password.')
elif age<12:
    print('You are not Mary.')

while

spam=0
while spam<5:
    print('hello')
    spam=spam+1
# --------------------------------
name=''
while name!='lee':
    print('Please type your name.')
    name=input()
print('Thank you.')

break

while True:
    print('Type your name.')
    name=input()
    if name=='lee':
        break
print('Thank you.')

continue

while True:
    print('Type your name.')
    name=input()
    if name!='Joe':
        continue
    else:
        break

for loops and the range() function

print('My name is ')
for i in range(5):
    print('Jimmy Five Times (' + str(i)+')')
# Output
My name is 
Jimmy Five Times (0)
Jimmy Five Times (1)
Jimmy Five Times (2)
Jimmy Five Times (3)
Jimmy Five Times (4)
# Equivalent while loop
i=0
while i<5:
    print('Jimmy Five Times ('+str(i)+')')
    i=i+1

total = 0
for num in range(101):
    total = total + num
print(total)
# Output
5050

The start, stop, and step arguments to range()

for i in range(0,10,2):
    print(i)
# Output
0
2
4
6
8
# --------------------------
for i in range(5,-1,-1):
    print(i)
# Output
5
4
3
2
1
0

Importing modules

import random
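# Alternatively, from random import * lets randint() be called without the random. prefix
# Import several modules at once: import random,sys,os,math
for i in range(5):
    # Generate a random integer from 1 to 10, including both 1 and 10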
    print(random.randint(1,10))
# Output
8
8
9
1
5

Ending the program early with the sys.exit() function

import sys
while True:
    print('Type exit to exit.')
    response=input()
    if response=='exit':
        sys.exit()
    print('You typed '+response+'.')

3. Functions

Use the def keyword

# name is the parameter
def hello(name):
    print('Hello,'+name)
name='lee'
hello(name)

Return values and the return statement

def getStr(numberStr):
    return int(numberStr)
n='56'
s=getStr(n)
print(s)

Python has a value called None, comparable to nullptr in C++ or null in Java

>>> spam=print('hello')
hello
>>> None==spam
True

The end keyword argument replaces the newline that print() adds after the printed string by default with another string

print('Hello',end='')
print('World')
# Output
HelloWorld

The sep keyword argument replaces the default separator string (a single space)

print('cats','dogs','mice')
print('cats','dogs','mice',sep=',')
# Output
cats dogs mice
cats,dogs,mice

The global statement: modifying a global variable from within a function

def spam():
    global eggs
    eggs='spam local'
eggs='global'
spam()
print(eggs)
# Output
spam local

Exception handling:

def spam(divideBy):
    try:
        return 42/divideBy
    except ZeroDivisionError:
        print('Error: Invalid argument.')
print(spam(2))
print(spam(21))
print(spam(0))
# Output
21.0
2.0
Error: Invalid argument.
None

Another way to write it

def spam(divideBy):
    return 42/divideBy
try:
    print(spam(2))
    print(spam(21))
    print(spam(0))
except ZeroDivisionError:
    print('Error: Invalid argument.')
# Output
21.0
2.0
Error: Invalid argument.

Small program: Zigzag

import time,sys
indent=0
indentIncreasing=True
try:
    while True:
        print(' '*indent,end='')
        print('********')
        time.sleep(0.1)
        if indentIncreasing:
            indent=indent+1
            if indent==20:
                indentIncreasing=False
        else:
            indent=indent-1
            if indent==0:
                indentIncreasing=True
except KeyboardInterrupt: # the user pressed Ctrl-C to quit
    sys.exit() 
# Output
********
 ********
  ********
   ********
    ********
     ********
      ********
       ********
        ********
         ********
          ********
           ********
            ********
             ********
              ********
               ********
                ********
                 ********
                  ********
                   ********
                    ********
                   ********
                  ********
                 ********
                ********
               ********
              ********
             ********
            ********
           ********
          ********
         ********
        ********
       ********
      ********
     ********
    ********
   ********
  ********
 ********
********

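4. Lists

A "list" is a value that contains a sequence of multiple values.

>>> spam=['hello',3.1415,True,None,42]
>>> spam
['hello', 3.1415, True, None, 42]

The spam variable is still assigned only one value: the list value.

The value [] is an empty list.

Use an index to get a single value from a list: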

>>> spam=['hello',3.1415,True,None,42]
>>> spam
['hello', 3.1415, True, None, 42]
>>> spam[0]
'hello'
>>> spam[1]
3.1415
>>> spam[2]
True
>>> spam[3]
>>> spam[4]
42

An index must be an integer, not a floating-point number

>>> spam[1.0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers or slices, not float
>>> spam[int(1.0)]
3.1415

Lists can contain other list values, which are accessed with multiple indexes

>>> spam=[['cat','bat'],[10,20,30]]
>>> spam[0]
['cat', 'bat']
>>> spam[0][1]
'bat'
>>> spam[1][2]
30

Negative indexes: -1 refers to the last index, -2 to the second-to-last

>>> spam=['cat','bat','rat','elephant']
>>> spam[-1]
'elephant'
>>> spam[-3]
'bat'

Getting a sublist with a slice

>>> spam=['cat','bat','rat','elephant']
>>> spam[0:4]
['cat', 'bat', 'rat', 'elephant']
>>> spam[0:-1]
['cat', 'bat', 'rat']
>>> spam[:2]
['cat', 'bat']
>>> spam[1:]
['bat', 'rat', 'elephant']
>>> spam[:]
['cat', 'bat', 'rat', 'elephant']

Getting a list's length with the len() function

>>> spam=['cat','bat','rat','elephant']
>>> len(spam)
4

Changing a value in a list

>>> spam[2]=12345
>>> spam
['cat', 'bat', 12345, 'elephant']

List concatenation and replication

>>> [1,2,3]+['A','B','C']
[1, 2, 3, 'A', 'B', 'C']
>>> ['x','y','z']*3
['x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z']
>>> spam=[1,2,3]
>>> spam=spam+['A','B','C']
>>> spam
[1, 2, 3, 'A', 'B', 'C']

Removing values from a list with the del statement

>>> spam
[1, 2, 3, 'A', 'B', 'C']
>>> del spam[2]
>>> spam
[1, 2, 'A', 'B', 'C']

Using lists in loops

for i in [1,2,3,4]:
    print(i)

In a for loop, use range(len(someList)) to iterate over every index of a list

spam=[1,2,3,4]
for i in range(len(spam)):
    print(spam[i])

The in and not in operators determine whether a value is in a list

spam=[1,2,3,4]
num= 1 not in spam
t=5 in spam
print(num,t)
# Output
False False

The multiple assignment trick

>>> cat=[1,2,3]
>>> x,y,z=cat
>>> x
1
>>> y
2
>>> z
3

Using the enumerate() function with a list: on each iteration of the loop, enumerate() returns two values, the index in the list and the item itself

test=['a','b','c']
for index,item in enumerate(test):
    print('Index '+str(index)+' in test is: '+item)
# Output
Index 0 in test is: a
Index 1 in test is: b
Index 2 in test is: c

The random.choice() function returns a randomly selected item from a list

>>> import random
>>> pets=['cat','dog','bat']
>>> random.choice(pets)
'bat'
>>> random.choice(pets)
'bat'
>>> random.choice(pets)
'dog'

The random.shuffle() function reorders the items of a list, modifying the list in place rather than returning a new one

>>> import random
>>> pets=['cat','dog','bat']
>>> random.shuffle(pets)
>>> pets
['dog', 'bat', 'cat']
>>> random.shuffle(pets)
>>> pets
['bat', 'dog', 'cat']

Augmented assignment statement	Equivalent assignment statement
spam+=1	spam=spam+1
spam-=1	spam=spam-1
spam*=1	spam=spam*1
spam/=1	spam=spam/1
spam%=1	spam=spam%1

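Methods:

Finding a value in a list with the index() method

Methods belong to a single data type: append() and insert() are list methods and can be called only on lists

Use remove() to delete a value from a list; if the value appears several times, only its first occurrence is removed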

>>> spam=['a','b','c']
>>> spam.index('a')
0
>>> spam.append('d')
>>> spam
['a', 'b', 'c', 'd']
>>> spam.insert(1,'e')
>>> spam
['a', 'e', 'b', 'c', 'd']
>>> spam.remove('e')
>>> spam
['a', 'b', 'c', 'd']
>>> spam.insert(1,'b')
>>> spam
['a', 'b', 'b', 'c', 'd']
>>> spam.remove('b')
>>> spam
['a', 'b', 'c', 'd']

Sorting the values in a list with the sort() method

Pass True for the reverse keyword argument to make sort() sort in reverse order

>>> spam=[2,5,3.14,1,-7]
>>> spam.sort()
>>> spam
[-7, 1, 2, 3.14, 5]
>>> spam.sort(reverse=True)
>>> spam
[5, 3.14, 2, 1, -7]
>>> test=['b','a','c']
>>> test.sort()
>>> test
['a', 'b', 'c']
>>> test.sort(reverse=True)
>>> test
['c', 'b', 'a']

A list that contains both numbers and strings cannot be sorted
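To make the restriction concrete, here is a minimal sketch of what happens when sort() is called on such a list. The list contents are arbitrary, and the key=str workaround at the end is only one possible approach assumed for this example, not something the notes prescribe.

```python
# Sorting a list that mixes numbers and strings raises TypeError in Python 3.
mixed = [1, 3, 2, 'Alice', 'Bob']
try:
    mixed.sort()
    sort_failed = False
except TypeError as err:
    sort_failed = True
    print('Cannot sort:', err)

# One possible workaround (an assumption of this sketch): compare every item by its string form.
as_text = sorted(mixed, key=str)
print(as_text)
```

Sorting by string form places digits before letters, which may or may not be the order you want.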

The sort() method sorts strings in "ASCII order" (uppercase letters before lowercase ones) rather than true alphabetical order

>>> spam=['A','a','B','b','C','c']
>>> spam.sort()
>>> spam
['A', 'B', 'C', 'a', 'b', 'c']

Setting the key keyword argument to str.lower when calling sort() sorts in regular alphabetical order

>>> spam=['a','z','A','Z']
>>> spam.sort(key=str.lower)
>>> spam
['a', 'A', 'z', 'Z']

Reversing the values in a list with the reverse() method

>>> spam
['a', 'A', 'z', 'Z']
>>> spam.reverse()
>>> spam
['Z', 'z', 'A', 'a']

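Strings and lists are similar: many list operations also work on strings and other sequence-type values, including indexing, slicing, use in for loops, use with len(), and use with the in and not in operators.

Lists are mutable; strings are immutable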
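The difference in mutability can be seen in a short sketch. The sample string and variable names are chosen only for illustration.

```python
# A string cannot be changed in place; build a new string from slices instead.
name = 'Zophie a cat'
try:
    name[7] = 't'          # strings do not support item assignment
    assigned = True
except TypeError:
    assigned = False

new_name = name[:7] + 'the' + name[8:]
print(new_name)            # the original name is left untouched

# The sequence operations used on lists also work on strings.
print(name[0], name[-3:], len(name), 'cat' in name)
```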

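Differences between tuples and lists:

1. Tuples use parentheses instead of square brackets

2. Tuples are immutable, while lists are mutable

Use a tuple when you need a sequence of values that never changes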

>>> eggs=('hello',42,0.5)
>>> eggs[0]
'hello'
>>> len(eggs)
3
>>> eggs[1]=99
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

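If a tuple has only one value, place a comma after that value inside the parentheses to indicate this; otherwise Python treats it as a value typed inside ordinary parentheses.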

>>> ('hello',)
('hello',)
>>> ('hello')
'hello'

Converting types with the list() and tuple() functions

>>> tuple([1,2,3])
(1, 2, 3)
>>> list((1,2,3))
[1, 2, 3]
>>> list('hello')
['h', 'e', 'l', 'l', 'o']

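Reassigning a variable after copying it does not affect the copy, whereas modifying a list through one reference changes the list seen through every reference to it

# Variable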
>>> spam=42
>>> c=spam
>>> spam=100
>>> spam
100
>>> c
42
#-----------------
# List
>>> spam=[0,1,2,3,4,5]
>>> c=spam
>>> c[1]='hello'
>>> spam
[0, 'hello', 2, 3, 4, 5]
>>> c
[0, 'hello', 2, 3, 4, 5]

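Every value in Python has a unique identity, which can be obtained with the id() function

>>> id('lee')
1939550599184

Lists are mutable objects: the append() method does not create a new list object but changes the existing one.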

>>> eggs=['a','b','c']
>>> id(eggs)
1939550606152
>>> eggs.append('d')
>>> id(eggs)
1939550606152
>>> eggs=[1,2,3,4] # creates a new list
>>> id(eggs)
1939543515720 # differs from the two earlier ids

Passing references:

def eggs(p):
    p.append('hello')
spam=[1,2,3]
eggs(spam)
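print(spam) # prints [1, 2, 3, 'hello']

The copy() and deepcopy() functions of the copy module

The copy.copy() function can be used to copy a mutable value such as a list or dictionary, rather than just copying a reference

When the list to be copied contains lists, use the copy.deepcopy() function, which copies the inner lists as well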

# copy()
>>> import copy
>>> spam=[1,2,3,4]
>>> id(spam)
1939550606344
>>> c=copy.copy(spam)
>>> id(c)
1939550691208
>>> c[1]=42
>>> spam
[1, 2, 3, 4]
>>> c
[1, 42, 3, 4]
#-----------------------
# deepcopy()
>>> import copy
>>> spam=[[1,2,3],[4,5,6]]
>>> id(spam)
1939550691144
>>> c=copy.deepcopy(spam)
>>> id(c)
1939550606344
>>> c[1][1]=10
>>> spam
[[1, 2, 3], [4, 5, 6]]
>>> c
[[1, 2, 3], [4, 10, 6]]

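5. Dictionaries and structuring data

Items in a dictionary have no positional order to index by: they cannot be accessed by position or sliced like a list, but keys can be any hashable value (strings, numbers, tuples) rather than only integers

As of Python 3.7, a dictionary remembers the insertion order of its key-value pairs, which shows when a sequence value is created from it

>>> eggs={'name':'lee','species':'cat','age':'8'}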
>>> list(eggs)
['name', 'species', 'age']

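The keys(), values(), and items() methods return the keys, the values, and the key-value pairs, respectively. The returned values are not true lists: they cannot be modified and have no append() method, but they can be used in for loops

The dict_items value returned by items() contains tuples of the key and value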

>>> spam={'color':'red','age':42}
>>> spam.keys()
dict_keys(['color', 'age'])
>>> list(spam.keys())
['color', 'age']

Multiple assignment puts the key and the value into separate variables

>>> for k,v in spam.items():
...     print('key: '+k+' value: '+str(v))
...
key: color value: red
key: age value: 42

The in and not in operators check whether a key exists in a dictionary; to check for a value, test against values()
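A brief sketch of these membership checks follows. The dictionary is the spam example used above, and the result variable names are made up for illustration.

```python
# Membership tests on a dictionary look at keys unless values() is used explicitly.
spam = {'color': 'red', 'age': 42}

has_color_key = 'color' in spam             # keys are searched by default
has_red_key = 'red' in spam                 # 'red' is a value, not a key
has_red_value = 'red' in spam.values()      # search the values explicitly
missing_name = 'name' not in spam.keys()    # same as: 'name' not in spam

print(has_color_key, has_red_key, has_red_value, missing_name)
```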

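Dictionaries have a get() method that takes two arguments: the key of the value to retrieve and a fallback value to return if that key does not exist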

>>> item={'cups':2}
>>> 'I am bringing '+str(item.get('cups',0))+' cups.'
'I am bringing 2 cups.'
>>> 'I am bringing '+str(item.get('eggs',0))+' eggs.'
'I am bringing 0 eggs.'

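The setdefault() method takes the key to check for as its first argument and the value to set if that key does not exist as its second; it returns the key's value, which is the existing one if the key is already present.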

>>> item={'cups':2}
>>> item.setdefault('color','red')
'red'
>>> item
{'cups': 2, 'color': 'red'}
>>> item.setdefault('color','black')
'red'

A small program that counts how often each character appears in a string

message='It was a bright cold day in April, and the clocks were striking thirteen.'
counts={}
for character in message:
    counts.setdefault(character,0)
    counts[character]+=1
print(counts)
# Output
{'I': 1, 't': 6, ' ': 13, 'w': 2, 'a': 4, 's': 3, 'b': 1, 'r': 5, 'i': 6, 'g': 2, 'h': 3, 'c': 3, 'o': 2, 'l': 3, 'd': 3, 'y': 1, 'n': 4, 'A': 1, 'p': 1, ',': 1, 'e': 5, 'k': 2, '.': 1}

Importing the pprint module makes the output nicer; the keys are printed in sorted order

import pprint
message='It was a bright cold day in April, and the clocks were striking thirteen.'
counts={}
for character in message:
    counts.setdefault(character,0)
    counts[character]+=1
pprint.pprint(counts)
# Output
{' ': 13,
 ',': 1, 
 '.': 1,
 'A': 1,
 'I': 1,
 'a': 4,
 'b': 1,
 'c': 3,
 'd': 3,
 'e': 5,
 'g': 2,
 'h': 3,
 'i': 6,
 'k': 2,
 'l': 3,
 'n': 4,
 'o': 2,
 'p': 1,
 'r': 5,
 's': 3,
 't': 6,
 'w': 2,
 'y': 1}

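The pprint.pprint() function is especially helpful when the dictionary itself contains nested lists or dictionaries

To get the prettified text as a string value instead of displaying it on the screen, call pprint.pformat(); the following two lines are equivalent

pprint.pprint(counts)
print(pprint.pformat(counts))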

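6. Manipulating strings

Strings can begin and end with double quotes; one benefit is that the string can then contain a single quote character

Placing an r before the opening quotation mark makes the string a raw string, which ignores escape characters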

>>> print(r'That is Lee\'s cat.')
That is Lee\'s cat.
>>> print('That is Lee\'s cat.')
That is Lee's cat.

Multiline strings are enclosed in three single quotes or three double quotes (at both the start and the end)

>>> print('''Dear Lee,
... hello.
... Sincerely,
... Bob''')
Dear Lee,
hello.
Sincerely,
Bob

Multiline strings are often used as multiline comments; the following is perfectly valid Python code

import pprint
message='It was a bright cold day in April, and the clocks were striking thirteen.'
counts={}
for character in message:
    counts.setdefault(character,0)
    counts[character]+=1
""" pprint.pprint(counts)
print(pprint.pformat(counts)) """

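Like lists, strings support indexes and slices, as well as the in and not in operators

With "string interpolation", the %s operator inside the string acts as a marker that is replaced by the values following the string. One benefit is that str() does not need to be called.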

>>> name='Alice'
>>> age=18
>>> 'My name is %s,and I am %d years old.' % (name,age)
'My name is Alice,and I am 18 years old.'

Using an "f-string", which has an f prefix before the opening quotation mark

>>> name='Alice'
>>> age=18
>>> f'My name is {name},and I am {age} years old.'
'My name is Alice,and I am 18 years old.'

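The upper() and lower() string methods convert all the letters in the original string to uppercase or lowercase, respectively; non-letter characters stay unchanged. Note that these methods do not change the string itself but return a new string.

If the string contains at least one letter and all its letters are uppercase or lowercase, the isupper() and islower() methods return the Boolean value True accordingly; otherwise they return False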
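These four methods can be tried out in a few lines. The sample string and variable names below are assumptions made for this sketch.

```python
# upper() and lower() return new strings; the original is not modified.
greeting = 'Hello, world!'
shouted = greeting.upper()
quiet = greeting.lower()

print(greeting)    # still 'Hello, world!'
print(shouted, quiet)

# isupper()/islower() need at least one letter, and every letter must match the case.
print(shouted.isupper(), quiet.islower(), greeting.isupper())
print('12345'.isupper(), '12345'.islower())   # no letters at all, so both are False
```

A common use is a case-insensitive comparison, for example checking user input with response.lower() == 'yes'.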

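The isX() string methods:

Method	Returns True when
isalpha()	the string consists only of letters and is not empty
isalnum()	the string consists only of letters and digits and is not empty
isdecimal()	the string consists only of numeric characters and is not empty
isspace()	the string consists only of spaces, tabs, and newlines and is not empty
istitle()	the string consists only of words that begin with an uppercase letter followed by all lowercase letters, plus digits or spaces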
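The table can be checked against a few sample strings. The strings, the variable names, and the helper is_valid_age() below are all hypothetical and exist only for this sketch.

```python
# Each isX() method returns a Boolean; the sample strings are arbitrary.
only_letters = 'hello'.isalpha()
letters_and_digits = 'hello123'.isalnum()
not_only_letters = 'hello123'.isalpha()      # contains digits
only_digits = '123'.isdecimal()
only_whitespace = ' \t\n'.isspace()
title_case = 'This Is Title Case'.istitle()
not_title_case = 'This is not'.istitle()     # 'is' and 'not' start with lowercase
empty_string = ''.isalpha()                  # an empty string never qualifies


def is_valid_age(text):
    """Hypothetical helper: accept only a non-empty string of decimal digits."""
    return text.isdecimal()


print(is_valid_age('42'), is_valid_age('forty two'), is_valid_age(''))
```

Methods like these are handy for validating input() results before converting them with int().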

join() and split()

>>> 'ABC'.join(['a', 'b', 'c'])
'aABCbABCc'
>>> ' '.join(['a', 'b', 'c'])
'a b c'
>>> ', '.join(['a', 'b', 'c'])
'a, b, c'
>>> 'I am Lee'.split()
['I', 'am', 'Lee']
>>> 'IABCamABCLee'.split('ABC')
['I', 'am', 'Lee']

Splitting strings with the partition() method

>>> 'hello,world'.partition('w')
('hello,', 'w', 'orld')
>>> 'hello,world'.partition('x') # the separator is not found
('hello,world', '', '')

Justifying text with the rjust(), ljust(), and center() methods
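A minimal sketch of the three justification methods is shown here. The widths, fill characters, and the small items table are assumptions chosen for this example.

```python
# Each method pads the string to the given total width; an optional second
# argument sets the fill character (a space by default).
right = 'Hello'.rjust(10)
left = 'Hello'.ljust(10, '-')
middle = 'Hello'.center(11, '=')
print(repr(right), repr(left), repr(middle))

# Aligning a small two-column table (hypothetical data).
items = {'sandwiches': 4, 'apples': 12}
lines = [name.ljust(12, '.') + str(count).rjust(4) for name, count in items.items()]
for line in lines:
    print(line)
```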

Removing whitespace with the strip(), rstrip(), and lstrip() methods
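The following sketch shows the three stripping methods on a sample string. The strings and variable names are made up for illustration.

```python
# strip() trims both ends, lstrip() only the left end, rstrip() only the right end.
padded = '   Hello, World   '
both = padded.strip()
left_only = padded.lstrip()
right_only = padded.rstrip()
print(repr(both), repr(left_only), repr(right_only))

# An optional argument lists the characters to strip (order does not matter).
trimmed = 'SpamSpamBaconSpamEggsSpamSpam'.strip('ampS')
print(trimmed)
```

Characters in the middle of the string are never removed; stripping stops at the first character on each end that is not in the set.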

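Use the ord() function to get the code point of a one-character string, and the chr() function to get the one-character string of an integer code point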

>>> ord('A')
65
>>> ord('3')
51
>>> chr(65)
'A'

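Copying and pasting strings with the pyperclip module

Automating tasks

7. Pattern matching with regular expressions

Finding patterns of text with regular expressions

re.compile() is passed a string value representing the regular expression and returns a Regex pattern object

If the pattern is found, the Regex object's search() method returns a Match object (otherwise it returns None); the Match object has a group() method that returns the actual matched text from the searched string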

import re
phoneNumberRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
mo = phoneNumberRegex.search('My number is 415-555-4242.')
print('Phone number found: ',mo.group())
# Output
Phone number found:  415-555-4242

Grouping with parentheses

>>> import re
>>> phoneNumberRegex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
>>> mo=phoneNumberRegex.search('My number is 415-555-4242.')
>>> mo.group(1)
'415'
>>> mo.group(2)
'555-4242'
>>> mo.group(0)
'415-555-4242'
>>> mo.group()
'415-555-4242'

Use the groups() method to get all the groups at once

>>> mo.groups()
('415', '555-4242')

To match literal parentheses in the text, escape them as \( and \)

>>> phoneNumberRegex = re.compile(r'(\(\d\d\d\)) (\d\d\d-\d\d\d\d)')
>>> mo=phoneNumberRegex.search('My number is (415) 555-4242.')
>>> mo.group()
'(415) 555-4242'
>>> mo.groups()
('(415)', '555-4242')

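The following characters have special meanings and must be escaped to be matched literally

. ^ $ * + ? { } [ ] \ | ( )

The | character is called a "pipe" and matches one of several alternative expressions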

>>> h=re.compile(r'a|b')
>>> mo=h.search('a and b')
>>> mo.group()
'a'
>>> mo=h.search('b and a')
>>> mo.group()
'b'

Matching strings that share a common prefix

>>> bat=re.compile(r'bat(man|cow)')
>>> mo=bat.search('batman')
>>> print(mo.group())
batman
>>> print(mo.group(1))
man

Optional matching with the question mark ?

>>> bat=re.compile(r'bat(wo)?man')
>>> mo=bat.search('batman')
>>> mo.group()
'batman'
>>> mo=bat.search('batwoman')
>>> mo.group()
'batwoman'

Matching zero or more times with the star *

>>> bat=re.compile(r'bat(wo)*man')
>>> mo=bat.search('batman')
>>> mo.group()
'batman'
>>> mo=bat.search('batwowowoman')
>>> mo.group()
'batwowowoman'

Matching one or more times with the plus +

>>> bat=re.compile(r'bat(wo)+man')
>>> mo=bat.search('batman')
>>> mo==None
True
>>> mo=bat.search('batwoman')
>>> mo.group()
'batwoman'

Matching a specific number of repetitions with braces {}

{3,5} matches 3 to 5 instances, {,5} matches 0 to 5, and {3,} matches 3 or more

>>> h=re.compile(r'(h){3}')
>>> mo=h.search('hhh')
>>> mo.group()
'hhh'
>>> mo=h.search('h')
>>> mo==None
True

Greedy and non-greedy matching

>>> h=re.compile(r'(h){3,5}') # greedy matching is the default
>>> mo=h.search('hhhhh')
>>> mo.group()
'hhhhh'
>>> h=re.compile(r'(h){3,5}?') # non-greedy matching
>>> mo=h.search('hhhhh')
>>> mo.group()
'hhh'

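The search() method returns the first match, while findall() returns every match; findall() returns a list of strings rather than a Match object

If the regular expression contains groups, findall() returns a list of tuples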

>>> h=re.compile(r'h')
>>> mo=h.search('hhhhh')
>>> mo.group()
'h'
>>> h.findall('hhhhh')
['h', 'h', 'h', 'h', 'h']
#------------------------------------------------------------
>>> h=re.compile(r'(h)(a)')
>>> h.findall('hahahahaha')
[('h', 'a'), ('h', 'a'), ('h', 'a'), ('h', 'a'), ('h', 'a')]

Making your own character classes with square brackets []

>>> r=re.compile(r'[aeiouAEIOU]')
>>> r.findall('RoboCop eats baby food. BABY FOOD.')
['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'A', 'O', 'O']

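The character class [a-zA-Z0-9] matches all lowercase letters, uppercase letters, and digits

Inside square brackets, the normal regular expression symbols are not interpreted as such

Placing a caret ^ just after the opening bracket makes a "negative character class", which matches every character not in the class

>>> r=re.compile(r'[^aeiouAEIOU]')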
>>> r.findall('RoboCop eats baby food. BABY FOOD.')
['R', 'b', 'C', 'p', ' ', 't', 's', ' ', 'b', 'b', 'y', ' ', 'f', 'd', '.', ' ', 'B', 'B', 'Y', ' ', 'F', 'D', '.']

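A caret ^ at the start of a regular expression indicates that the match must occur at the beginning of the searched text; a dollar sign $ at the end indicates that the string must end with the pattern; using both together means the entire string must match the entire pattern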
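The anchors can be illustrated with a short sketch. The patterns, sample strings, and variable names are assumptions chosen for this example.

```python
import re

begins_with_hello = re.compile(r'^Hello')    # must match at the start
ends_with_digit = re.compile(r'\d$')         # must match at the end
whole_string_digits = re.compile(r'^\d+$')   # the entire string must be digits

start_ok = begins_with_hello.search('Hello, world!') is not None
start_fail = begins_with_hello.search('He said Hello.') is None
last_digit = ends_with_digit.search('Your number is 42').group()
all_digits = whole_string_digits.search('1234567890') is not None
mixed = whole_string_digits.search('12345xyz67890') is None

print(start_ok, start_fail, last_digit, all_digits, mixed)
```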

The . is called a "wildcard" and matches any character except a newline

>>> at=re.compile(r'.at')
>>> at.findall('The cat in the hat sat on the flat mat.')
['cat', 'hat', 'sat', 'lat', 'mat']

Matching everything with .*

>>> name=re.compile(r'First Name: (.*) Last Name: (.*)')
>>> mo=name.search('First Name: Al Last Name: Sweigart')
>>> mo.group(1)
'Al'
>>> mo.group(2)
'Sweigart'

Use .*? to make Python match in a non-greedy way
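The difference between .* and .*? shows up when the closing delimiter occurs more than once. The sample text below is an assumption made for this sketch.

```python
import re

text = '<To serve man> for dinner.>'

# Greedy: .* grabs as much as possible, up to the last '>'.
greedy = re.compile(r'<.*>').search(text).group()

# Non-greedy: .*? stops at the first '>' that lets the pattern match.
lazy = re.compile(r'<.*?>').search(text).group()

print(greedy)
print(lazy)
```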

Passing re.DOTALL as the second argument to re.compile() makes the wildcard match newlines as well

>>> a=re.compile(r'.*')
>>> a.search('a\nb').group()
'a'
#------------
>>> a=re.compile('.*',re.DOTALL)
>>> a.search('a\nb').group()
'a\nb'

Passing re.I as the second argument to re.compile() makes the regular expression case-insensitive

>>> r=re.compile('a',re.I)
>>> r.search('AAA').group()
'A'

Substituting strings with the sub() method

In the replacement string, \1 is replaced by the text matched by group 1

>>> n=re.compile(r'A')
>>> n.sub('B','A is C')
'B is C'
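The \1 form mentioned above is not exercised by the previous transcript, so here is a minimal sketch of it. The pattern, the sample sentence, and the variable names are assumptions made for this example.

```python
import re

# Group 1 captures only the first letter of the name that follows 'Agent '.
agent_names = re.compile(r'Agent (\w)\w*')

# In the replacement string, \1 stands for the text matched by group 1.
message = 'Agent Alice gave the secret documents to Agent Bob.'
censored = agent_names.sub(r'\1****', message)
print(censored)
```

Writing the replacement as a raw string (r'\1****') keeps Python from interpreting the backslash before the regex engine sees it.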

Passing re.VERBOSE as the second argument to re.compile() makes it ignore whitespace and comments inside the regular expression string
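As a sketch of what verbose mode allows, the phone number pattern from earlier in this section can be spread over several commented lines. The three-group layout and the variable names are assumptions of this example.

```python
import re

# With re.VERBOSE, whitespace and # comments inside the pattern are ignored.
phone_regex = re.compile(r'''
    (\d{3})     # area code
    -           # separator
    (\d{3})     # first three digits
    -           # separator
    (\d{4})     # last four digits
    ''', re.VERBOSE)

mo = phone_regex.search('My number is 415-555-4242.')
print(mo.group())
print(mo.groups())
```

Several options can be combined with the | operator, for example re.VERBOSE | re.I.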

9. Reading and writing files

Using Path() from the pathlib module

>>> from pathlib import Path
>>> Path('spam','bacon','eggs')
WindowsPath('spam/bacon/eggs')
>>> str(Path('spam','bacon','eggs'))
'spam\\bacon\\eggs'

Joining paths with the / operator
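A minimal sketch of joining with / follows. The folder and file names are hypothetical, and the printed separator depends on the operating system.

```python
from pathlib import Path

# At least one operand of / must be a Path object; the rest can be strings.
base = Path('spam')
full = base / 'bacon' / 'eggs.txt'

print(full)           # uses \ on Windows and / elsewhere
print(full.parts)     # the individual components
print(full.name, full.suffix, full.parent)
```

Joining two plain strings with / does not work, because the operator is defined by Path rather than by str.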

Current working directory:

>>> from pathlib import Path
>>> import os
>>> Path.cwd()
WindowsPath('C:/Users/Administrator')
>>> os.chdir('C:/Users/Administrator')
>>> Path.cwd()
WindowsPath('C:/Users/Administrator')

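Home directory:

>>> Path.home()
WindowsPath('C:/Users/Administrator')

Creating new folders:

>>> os.makedirs('C:/Users/Administrator/test') # os.makedirs() also creates any missing intermediate directories
>>> Path(r'C:/Users/Administrator/test1').mkdir() # creating a directory through a Path object; by default it creates only one directory at a time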

The is_absolute() method of a Path object tells whether the path is absolute

os.listdir() lists the contents of a folder
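Both calls can be tried on the current working directory. This is only a sketch: the relative path 'spam/bacon' is hypothetical, and the listing depends on where the script runs.

```python
import os
from pathlib import Path

cwd = Path.cwd()
cwd_is_absolute = cwd.is_absolute()                    # the working directory is an absolute path
relative_is_absolute = Path('spam/bacon').is_absolute()

# os.listdir() returns the entry names as a list of strings (no guaranteed order).
entries = os.listdir(cwd)
for name in sorted(entries)[:5]:
    print(name)

print(cwd_is_absolute, relative_is_absolute)
```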

Opening a file with the open() function:

testFile = open('test.txt')

Reading a file:

from pathlib import Path
testFile = open('test.txt')
content=testFile.read()
print(content)
# Output
test
test
#----------------------------
from pathlib import Path
testFile = open('test.txt')
content=testFile.readlines()
print(content)
# Output
['test\n', 'test']

Writing to a file: the argument w means write mode, which overwrites the existing content, and a means append mode:

testFile = open('test.txt','w')
testFile.write('This is a test file')
testFile.close()
testFile = open('test.txt','a')
testFile.write('\nThis is a test file')
testFile.close()
# File contents after running (the file was empty beforehand)
This is a test file
This is a test file