2025-3-17

A Detailed Guide to Python's `itertools` Module: A Powerful Tool for Efficient Iteration

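In Python, iterators and generators are powerful tools for working with sequential data. They let you traverse data efficiently and with little memory, because the whole sequence never has to be loaded into memory at once. The `itertools` module provides a set of functions for creating and manipulating iterators, which makes a wide range of iteration patterns possible. This article takes a close look at `itertools` and uses examples and advanced usage to show what it can do in real code.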

1. Overview of the `itertools` module

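`itertools` is part of the Python standard library and contains a collection of functions for building efficient iterators. They fall roughly into three groups:

- **Infinite iterators:** produce endless sequences, e.g. `count`, `cycle`, `repeat`.
- **Finite iterators:** produce a finite sequence from their input, e.g. `chain`, `compress`, `dropwhile`, `filterfalse`, `groupby`, `islice`, `starmap`, `takewhile`, `tee`, `zip_longest`.
- **Combinatoric iterators:** produce the combinations, permutations, or Cartesian product of their input, e.g. `combinations`, `combinations_with_replacement`, `permutations`, `product`.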

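The functions in `itertools` generally share the following advantages:

- **Efficiency:** in CPython the `itertools` functions are implemented in C, so they run fast.
- **Memory friendliness:** they return iterators rather than lists or tuples, which can save a lot of memory, especially with large datasets.
- **Composability:** the functions can be combined with one another to express complex iteration logic.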
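
The three points above can be seen together in a small pipeline. This is only a sketch: the variable names are made up for illustration, and nothing is computed until `islice` pulls values through the chain.

```python
from itertools import count, islice

# Illustrative names only; none of them come from the itertools API.
evens = (n for n in count() if n % 2 == 0)  # infinite and lazy
squares = (n * n for n in evens)            # still lazy, nothing computed yet
first_five = list(islice(squares, 5))       # only now are five values produced

print(first_five)  # [0, 4, 16, 36, 64]
```

Because every stage is an iterator, the infinite source never has to be materialized in memory.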

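2. Common functions in detail, with examples

The following sections walk through some of the most commonly used functions in `itertools`, each with sample code.

2.1 Infinite iterators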

`count(start=0, step=1)`: creates an infinite iterator that starts at `start` and advances by `step` each time.

```python
from itertools import count

# Start at 10 and step by 2
counter = count(10, 2)
for _ in range(5):
    print(next(counter))  # Prints 10, 12, 14, 16, 18 (one per line)
```
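
A common way to use `count` safely is to pair it with a finite iterable through `zip`, which stops as soon as the shorter input runs out. The sketch below assumes a small, made-up task list.

```python
from itertools import count

tasks = ["parse", "validate", "store"]  # hypothetical sample data

# zip stops when tasks is exhausted, so the infinite counter is harmless here
numbered = list(zip(count(1), tasks))

print(numbered)  # [(1, 'parse'), (2, 'validate'), (3, 'store')]
```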

`cycle(iterable)`: creates an infinite iterator that loops over the elements of `iterable` again and again.

```python
from itertools import cycle

colors = cycle(['red', 'green', 'blue'])
for _ in range(7):
    print(next(colors))  # Prints red, green, blue, red, green, blue, red (one per line)
```
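
One practical use of `cycle` is round-robin assignment. The following is a minimal sketch with invented worker and job names; `zip` ends the iteration when the jobs run out.

```python
from itertools import cycle

workers = ["w1", "w2"]            # hypothetical worker names
jobs = ["a", "b", "c", "d", "e"]  # hypothetical job names

# Each job is paired with the next worker in rotation
assignment = list(zip(jobs, cycle(workers)))

print(assignment)
# [('a', 'w1'), ('b', 'w2'), ('c', 'w1'), ('d', 'w2'), ('e', 'w1')]
```

Note that `cycle` keeps a copy of the elements it has seen so that it can replay them, so its memory use grows with the length of the input.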
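
`repeat(object[, times])`: creates an iterator that returns `object` over and over, `times` times in total. If `times` is omitted, it repeats forever.

```python
from itertools import repeat

# Repeat the number 5 three times
fives = repeat(5, 3)
for num in fives:
    print(num)  # Prints 5, 5, 5 (one per line)

# Repeat the string "hello" forever
hellos = repeat("hello")

# Note: never loop over an infinite iterator without a stopping condition,
# or the loop will never end!
```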
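
Two hedged examples of putting `repeat` to work: supplying a constant argument to `map`, and bounding an infinite `repeat` with `islice`. The variable names are illustrative.

```python
from itertools import islice, repeat

# map stops at the shortest input, so the endless repeat(2) is safe here
squares = list(map(pow, range(5), repeat(2)))

# islice takes a bounded number of items from an infinite iterator
three_hellos = list(islice(repeat("hello"), 3))

print(squares)       # [0, 1, 4, 9, 16]
print(three_hellos)  # ['hello', 'hello', 'hello']
```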

2.2 Finite iterators

`chain(*iterables)`: joins several iterables into a single iterator.

```python
from itertools import chain

list1 = [1, 2, 3]
list2 = ['a', 'b', 'c']
tuple1 = (4, 5, 6)

chained_iter = chain(list1, list2, tuple1)
for item in chained_iter:
    print(item)  # Prints 1, 2, 3, a, b, c, 4, 5, 6 (one per line)
```
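
When the iterables are themselves stored in a single container, `chain.from_iterable` does the same job without unpacking. A minimal sketch with made-up data:

```python
from itertools import chain

nested = [[1, 2], [3], [], [4, 5, 6]]  # hypothetical nested data

# Flatten exactly one level of nesting, lazily
flat = list(chain.from_iterable(nested))

print(flat)  # [1, 2, 3, 4, 5, 6]
```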
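
`islice(iterable, stop)` / `islice(iterable, start, stop[, step])`: slices an iterable much like list slicing does, but returns an iterator. Unlike list slicing, negative values for `start`, `stop`, and `step` are not supported.

```python
from itertools import islice

data = range(10)  # The integers 0 through 9

# Take the elements at indices 2 up to (but not including) 7, with a step of 2
sliced_iter = islice(data, 2, 7, 2)
for item in sliced_iter:
    print(item)  # Prints 2, 4, 6 (one per line)
```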
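
`islice` is most useful on objects that cannot be sliced with `[]`, such as generators. The sketch below uses a hypothetical `naturals` generator and also shows that `islice` consumes the underlying iterator, so a second slice continues from where the first one stopped.

```python
from itertools import islice

def naturals():
    """Hypothetical helper: yield 0, 1, 2, ... forever."""
    n = 0
    while True:
        yield n
        n += 1

gen = naturals()
head = list(islice(gen, 3))     # single-argument form: the first 3 items
rest = list(islice(gen, 2, 5))  # indices are relative to what is left in gen

print(head)  # [0, 1, 2]
print(rest)  # [5, 6, 7]
```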

`zip_longest(*iterables, fillvalue=None)`: works like the built-in `zip`, but keeps going until the longest iterable is exhausted. Missing values from the shorter iterables are replaced with `fillvalue`.

```python
from itertools import zip_longest

list1 = [1, 2, 3]
list2 = ['a', 'b']

zipped = zip_longest(list1, list2, fillvalue='-')
for item in zipped:
    print(item)  # Prints (1, 'a'), (2, 'b'), (3, '-') (one per line)
```
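
The overview also names `takewhile`, `dropwhile`, `filterfalse`, `compress`, and `starmap`, which are not demonstrated above. The following sketch shows each on small, made-up inputs; `is_odd` is a helper defined only for this example.

```python
from itertools import compress, dropwhile, filterfalse, starmap, takewhile

def is_odd(n):
    """Illustrative predicate used by the examples below."""
    return n % 2 == 1

nums = [1, 3, 5, 8, 2, 7]

leading_odds = list(takewhile(is_odd, nums))   # stop at the first even number
after_odds = list(dropwhile(is_odd, nums))     # skip the leading odd numbers
evens = list(filterfalse(is_odd, nums))        # keep items where is_odd is False
picked = list(compress("ABCDEF", [1, 0, 1, 0, 1, 1]))  # keep items whose selector is truthy
powers = list(starmap(pow, [(2, 3), (3, 2), (10, 0)]))  # call pow(*args) per tuple

print(leading_odds)  # [1, 3, 5]
print(after_odds)    # [8, 2, 7]
print(evens)         # [8, 2]
print(picked)        # ['A', 'C', 'E', 'F']
print(powers)        # [8, 9, 1]
```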

2.3 Combinatoric iterators (key topic)

`combinations(iterable, r)`: returns an iterator over every combination of length `r` drawn from `iterable`.

```python
from itertools import combinations

letters = ['a', 'b', 'c', 'd']

# Every combination of 2 elements chosen from letters
combs = combinations(letters, 2)
for comb in combs:
    print(comb)  # Prints ('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')
```
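
The number of results can be checked against `math.comb` (available since Python 3.8), and `combinations_with_replacement` covers the case where an element may be chosen more than once. A small sketch:

```python
from itertools import combinations, combinations_with_replacement
from math import comb

letters = ['a', 'b', 'c', 'd']

pairs = list(combinations(letters, 2))
print(len(pairs), comb(4, 2))  # 6 6

# Elements may repeat within a single combination
with_repeats = list(combinations_with_replacement('ab', 2))
print(with_repeats)  # [('a', 'a'), ('a', 'b'), ('b', 'b')]
```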

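**Advanced usage:** `combinations` is very handy for combinatorial problems, for example enumerating every way to draw a given number of cards from a deck.

```python
from itertools import combinations, islice

# Represent a standard deck of cards
suits = ["C", "D", "H", "S"]  # Clubs, Diamonds, Hearts, Spades
ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "T", "J", "Q", "K", "A"]
deck = [r + s for r in ranks for s in suits]  # Build the full 52-card deck

# Iterate over the 5-card hands that can be drawn from the deck.
# There are 2,598,960 of them, so only the first three are printed here.
for hand in islice(combinations(deck, 5), 3):
    # Each hand could be analyzed here, e.g. checked for a straight flush
    print(hand)
```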
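
As one way to analyze hands, the sketch below counts flushes (all cards of one suit). `is_flush` and `count_flushes` are hypothetical helpers, and the deck encoding (rank followed by suit) follows the example above. To keep the run short it uses 3-card hands; the same code works for 5-card hands but has to visit about 2.6 million of them.

```python
from itertools import combinations

suits = ["C", "D", "H", "S"]
ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "T", "J", "Q", "K", "A"]
deck = [r + s for r in ranks for s in suits]

def is_flush(hand):
    """Return True if every card in the hand has the same suit."""
    return len({card[-1] for card in hand}) == 1

def count_flushes(cards, hand_size):
    """Count the hands of the given size in which all suits match."""
    return sum(1 for hand in combinations(cards, hand_size) if is_flush(hand))

three_card_flushes = count_flushes(deck, 3)
print(three_card_flushes)  # 1144, i.e. 4 suits * C(13, 3)
```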

`permutations(iterable, r=None)`: returns an iterator over every permutation of length `r` drawn from `iterable`. If `r` is `None`, it defaults to the length of `iterable`, so all full-length permutations are returned.

```python
from itertools import permutations

letters = ['a', 'b', 'c']

# All full-length permutations of letters
perms = permutations(letters)
for perm in perms:
    print(perm)  # Prints ('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'a', 'c'), ('b', 'c', 'a'), ('c', 'a', 'b'), ('c', 'b', 'a')

# All permutations of 2 elements chosen from letters
perms2 = permutations(letters, 2)
for perm in perms2:
    print(perm)  # Prints ('a', 'b'), ('a', 'c'), ('b', 'a'), ('b', 'c'), ('c', 'a'), ('c', 'b')
```
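
`permutations` treats elements as distinct by position, not by value, so an input with repeated values yields repeated results. The sketch below illustrates this with a made-up word and removes the duplicates with a `set`.

```python
from itertools import permutations
from math import perm

word = "aab"  # hypothetical input with a repeated character

all_perms = ["".join(p) for p in permutations(word)]
unique_perms = sorted(set(all_perms))

print(len(all_perms), perm(3, 3))  # 6 6
print(unique_perms)                # ['aab', 'aba', 'baa']
```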

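**Advanced usage:** permutations help with problems such as rearranging strings or generating candidate passwords in which no character is used twice.

```python
from itertools import permutations

# Generate candidate passwords of length 3 from the characters 'abc'
# (each character appears at most once per password)
for password in permutations('abc', 3):
    print("".join(password))
```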
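
Because `permutations` never reuses an element, strings such as `aaa` are not generated. If repeated characters should be allowed, `product` with `repeat` is one way to get them. A sketch for comparison:

```python
from itertools import permutations, product

alphabet = "abc"

no_repeats = ["".join(p) for p in permutations(alphabet, 3)]
with_repeats = ["".join(p) for p in product(alphabet, repeat=3)]

print(len(no_repeats))    # 6  (3 * 2 * 1)
print(len(with_repeats))  # 27 (3 ** 3)
print(with_repeats[:3])   # ['aaa', 'aab', 'aac']
```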

`product(*iterables, repeat=1)`: returns an iterator over the Cartesian product of the input iterables. The `repeat` argument says how many times the inputs are repeated.

```python
from itertools import product

colors = ['red', 'blue']
sizes = ['S', 'M', 'L']

# Cartesian product of colors and sizes
cartesian_product = product(colors, sizes)
for item in cartesian_product:
    print(item)  # Prints ('red', 'S'), ('red', 'M'), ('red', 'L'), ('blue', 'S'), ('blue', 'M'), ('blue', 'L')

# Product of colors with itself
repeated_product = product(colors, repeat=2)
for item in repeated_product:
    print(item)  # Prints ('red', 'red'), ('red', 'blue'), ('blue', 'red'), ('blue', 'blue')
```

**Advanced usage:** `product` can generate every combination of multi-dimensional data, or every combination of inputs for test cases.

```python
from itertools import product

# Suppose two parameters need testing, each with a few possible values
param1_values = [1, 2, 3]
param2_values = ['A', 'B']

# Use product to generate every parameter combination
for test_case in product(param1_values, param2_values):
    param1, param2 = test_case
    # Run the test here with the values of param1 and param2
    print(f"Testing with param1={param1}, param2={param2}")
```
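
When the number of parameters varies, the same idea can be driven from a dictionary. The sketch below assumes a hypothetical `grid` mapping parameter names to candidate values and builds one dict per test case.

```python
from itertools import product

# Hypothetical parameter grid; the names and values are for illustration only
grid = {"param1": [1, 2, 3], "param2": ["A", "B"]}

# Pair each tuple from the product back up with the parameter names
cases = [dict(zip(grid, values)) for values in product(*grid.values())]

print(len(cases))  # 6
print(cases[0])    # {'param1': 1, 'param2': 'A'}
print(cases[-1])   # {'param1': 3, 'param2': 'B'}
```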
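
`groupby(iterable, key=None)`: groups adjacent elements of `iterable` that share the same key. `key` is a function that computes the key for each element; if `key` is `None`, the elements themselves are compared. (Strictly speaking, `groupby` belongs with the finite iterators of section 2.2 rather than with the combinatoric ones.)

```python
from itertools import groupby

data = [
    {'name': 'Alice', 'age': 30},
    {'name': 'Bob', 'age': 25},
    {'name': 'Charlie', 'age': 30},
    {'name': 'David', 'age': 25},
    {'name': 'Eve', 'age': 35},
]

# Group by age. The list is not sorted by age, so no two neighbours share
# a key and every person ends up in a group of their own.
key_func = lambda x: x['age']
grouped_data = groupby(data, key=key_func)

for age, group in grouped_data:
    print(f"Age: {age}")
    for person in group:
        print(f" - {person['name']}")
```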

Important: `groupby` only groups adjacent elements with equal keys. For that reason, the data usually needs to be sorted by the same key before it is passed to `groupby`.

```python
from itertools import groupby

data = [1, 1, 2, 2, 2, 1, 1, 3, 3, 3, 3]

# Unsorted data: the two runs of 1 become two separate groups
print("Unsorted data:")
for key, group in groupby(data):
    print(f"{key}: {list(group)}")

# Sorted data: each value appears in exactly one group
sorted_data = sorted(data)
print("\nSorted data:")
for key, group in groupby(sorted_data):
    print(f"{key}: {list(group)}")
```

**Advanced usage:** `groupby` is often used in data analysis and processing, for example to group log entries by date or products by category.

```python
from itertools import groupby

# Suppose we have a list of log entries, each with a date and a message
logs = [
    {'date': '2023-10-26', 'message': 'Event A'},
    {'date': '2023-10-26', 'message': 'Event B'},
    {'date': '2023-10-27', 'message': 'Event C'},
    {'date': '2023-10-27', 'message': 'Event D'},
    {'date': '2023-10-28', 'message': 'Event E'},
]

# Group the log entries by date
key_func = lambda x: x['date']
sorted_logs = sorted(logs, key=key_func)  # Sorting first is essential!
for date, group in groupby(sorted_logs, key=key_func):
    print(f"Date: {date}")
    for log in group:
        print(f" - {log['message']}")
```
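
The adjacency rule is not always a limitation. Run-length encoding, for instance, relies on it and must not sort the input. A minimal sketch, with `run_length_encode` as a hypothetical helper name:

```python
from itertools import groupby

def run_length_encode(text):
    """Return (character, run length) pairs for consecutive equal characters."""
    return [(ch, sum(1 for _ in run)) for ch, run in groupby(text)]

encoded = run_length_encode("aaabccdd")
print(encoded)  # [('a', 3), ('b', 1), ('c', 2), ('d', 2)]
```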

3. Advanced applications of `itertools`

Beyond the basic usage above, `itertools` can also help with more involved problems.

3.1 Sliding windows

A sliding window can be built from `islice` and `tee`. The `tee` function splits one iterator into several independent iterators.

```python
from itertools import islice, tee

def sliding_window(iterable, n):
    """
    Return an iterator over the sliding windows of length n in iterable.
    """
    its = tee(iterable, n)
    for i, it in enumerate(its):
        next(islice(it, i, i), None)  # Advance the i-th copy past its first i elements
    return zip(*its)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Sliding windows of length 3
for window in sliding_window(data, 3):
    print(window)  # Prints (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 7), (6, 7, 8), (7, 8, 9)
```
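
A sliding window can also be kept in a bounded `collections.deque`, which avoids creating `n` tee'd copies. This is an alternative sketch, not the approach used above; `sliding_window_deque` is a hypothetical name, and the moving average is just one possible use.

```python
from collections import deque
from itertools import islice

def sliding_window_deque(iterable, n):
    """Yield tuples of n consecutive items, keeping only n items in memory."""
    iterator = iter(iterable)
    # Pre-fill with the first n - 1 items; maxlen discards the oldest on append
    window = deque(islice(iterator, n - 1), maxlen=n)
    for item in iterator:
        window.append(item)
        yield tuple(window)

windows = list(sliding_window_deque([1, 2, 3, 4, 5], 3))
averages = [sum(w) / len(w) for w in windows]

print(windows)   # [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
print(averages)  # [2.0, 3.0, 4.0]
```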

3.2 Reading in chunks

`islice` can be used to read data in fixed-size chunks.

```python
from itertools import islice

def chunked_read(iterable, size):
    """
    Split iterable into chunks of length size (the last chunk may be shorter).
    """
    iterator = iter(iterable)
    while True:
        chunk = tuple(islice(iterator, size))
        if not chunk:
            break
        yield chunk

data = range(20)

# Split data into chunks of 5
for chunk in chunked_read(data, 5):
    print(chunk)  # Prints (0, 1, 2, 3, 4), (5, 6, 7, 8, 9), (10, 11, 12, 13, 14), (15, 16, 17, 18, 19)
```
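
The same chunking logic can be written more compactly with the two-argument form of the built-in `iter(callable, sentinel)`, which keeps calling the function until it returns the sentinel. The helper name `chunked` below is hypothetical.

```python
from itertools import islice

def chunked(iterable, size):
    """Return an iterator of tuples holding up to size items each."""
    iterator = iter(iterable)
    # Keep slicing until islice yields nothing, i.e. an empty tuple
    return iter(lambda: tuple(islice(iterator, size)), ())

chunks = list(chunked(range(7), 3))
print(chunks)  # [(0, 1, 2), (3, 4, 5), (6,)]
```

On Python 3.12 and later, `itertools.batched` covers the same need directly.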

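4. Summary

The `itertools` module is a powerful and efficient part of Python that provides a set of functions for creating and manipulating iterators. Once you are comfortable with it, you can write code that is more concise, more efficient, and easier to read. Its memory-friendly design is especially valuable with large datasets, where it can noticeably reduce a program's memory use. Practice with these functions and try them in real projects to see how much they simplify everyday iteration.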

Author: admin

Link: https://hostlocvps.com/2025/03/17/%e5%a6%82%e6%9e%9c%e6%96%87%e7%ab%a0%e5%8c%85%e5%90%ab%e7%a4%ba%e4%be%8b%e5%92%8c%e8%bf%9b%e9%98%b6%e7%94%a8%e6%b3%95%ef%bc%8c6-9-10-14-%e6%9b%b4%e5%90%88%e9%80%82%e3%80%82/

Copyright belongs to the author. Do not republish without permission.
