It has been called a “gem” and “pretty much the coolest thing ever,” and if you have not heard of it, then you are missing out on one of the greatest corners of the Python 3 standard library: itertools.

A handful of excellent resources exist for learning what functions are available in the itertools module. The docs themselves are a great place to start. So is this post.

The thing about itertools, though, is that it is not enough to just know the definitions of the functions it contains. The real power lies in composing these functions to create fast, memory-efficient, and good-looking code.

This article takes a different approach. Rather than introducing itertools to you one function at a time, you will construct practical examples designed to encourage you to “think iteratively.” In general, the examples will start simple and gradually increase in complexity.

A word of warning: this article is long and intended for the intermediate-to-advanced Python programmer. Before diving in, you should be confident using iterators and generators in Python 3, multiple assignment, and tuple unpacking. If you aren’t, or if you need to brush up on your knowledge, consider checking out the following before reading on:

  • Python Iterators: A Step-By-Step Introduction
  • Introduction to Python Generators
  • Chapter 6 of Python Tricks: The Book by Dan Bader
  • Multiple assignment and tuple unpacking improve Python code readability

All set? Let’s start the way any good journey should—with a question.

What Is Itertools and Why Should You Use It?

According to the itertools docs, it is a “module [that] implements a number of iterator building blocks inspired by constructs from APL, Haskell, and SML… Together, they form an ‘iterator algebra’ making it possible to construct specialized tools succinctly and efficiently in pure Python.”

Loosely speaking, this means that the functions in itertools “operate” on iterators to produce more complex iterators. Consider, for example, the built-in zip() function, which takes any number of iterables as arguments and returns an iterator over tuples of their corresponding elements:

>>> list(zip([1, 2, 3], ['a', 'b', 'c']))
[(1, 'a'), (2, 'b'), (3, 'c')]

How, exactly, does zip() work?

[1, 2, 3] and ['a', 'b', 'c'], like all lists, are iterable, which means they can return their elements one at a time. Technically, any Python object that implements the .__iter__() or .__getitem__() methods is iterable. (See the Python 3 docs glossary for a more detailed explanation.)

The iter() built-in function, when called on an iterable, returns an iterator object for that iterable:

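For example (the address in the repr will differ from run to run):

>>> iter([1, 2, 3, 4])
<list_iterator object at 0x...>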
Under the hood, the zip() function works, in essence, by calling iter() on each of its arguments, then advancing each iterator returned by iter() with next() and aggregating the results into tuples. The iterator returned by zip() iterates over these tuples.

The map() built-in function is another “iterator operator” that, in its simplest form, applies a single-parameter function to each element of an iterable one element at a time:

>>> list(map(len, ['abc', 'de', 'fghi']))
[3, 2, 4]

The map() function works by calling iter() on its second argument, advancing this iterator with next() until the iterator is exhausted, and applying the function passed to its first argument to the value returned by next() at each step. In the above example, len() is called on each element of ['abc', 'de', 'fghi'] to return an iterator over the lengths of each string in the list.

Since iterators are iterable, you can compose zip() and map() to produce an iterator over combinations of elements in more than one iterable. For example, the following sums corresponding elements of two lists:

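For instance:

>>> list(map(sum, zip([1, 2, 3], [4, 5, 6])))
[5, 7, 9]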
This is what is meant by the functions in itertools forming an “iterator algebra.” itertools is best viewed as a collection of building blocks that can be combined to form specialized “data pipelines” like the one in the example above.

Historical Note: In Python 2, the built-in zip() and map() functions do not return an iterator, but rather a list. To return an iterator, the izip() and imap() functions of itertools must be used. In Python 3, izip() and imap() have been removed from itertools, and their behavior is now provided by the built-in zip() and map() functions. So, in a way, if you have ever used zip() or map() in Python 3, you have already been using itertools!

There are two main reasons why such an “iterator algebra” is useful: improved memory efficiency (via lazy evaluation) and faster execution time. To see this, consider the following problem:

Given a list of values inputs and a positive integer n, write a function that splits inputs into groups of length n. For simplicity, assume that the length of the input list is divisible by n. For example, if inputs = [1, 2, 3, 4, 5, 6] and n = 2, your function should return [(1, 2), (3, 4), (5, 6)].

Taking a naive approach, you might write something like this:

def naive_grouper(inputs, n):
    num_groups = len(inputs) // n
    return [tuple(inputs[i*n:(i+1)*n]) for i in range(num_groups)]

When you test it, you see that it works as expected:

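For instance:

>>> nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> naive_grouper(nums, 2)
[(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]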
What happens when you try to pass it a list with, say, 100 million elements? You will need a whole lot of available memory! Even if you have enough memory available, your program will hang for a while until the output list is populated.

To see this, store the following in a script called naive.py:

def naive_grouper(inputs, n):
    num_groups = len(inputs) // n
    return [tuple(inputs[i*n:(i+1)*n]) for i in range(num_groups)]


for _ in naive_grouper(range(100000000), 10):
    pass

From the console, you can use the time command (on UNIX systems) to measure memory usage and CPU user time. Make sure you have at least 5GB of free memory before executing the following:

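The command looks like this (the exact numbers you see will depend on your machine):

$ time -f "Memory used (kB): %M\nUser time (seconds): %U" python3 naive.py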
Note: On Ubuntu, you may need to run /usr/bin/time instead of time for the above example to work.

The list and tuple implementation in naive_grouper() requires approximately 4.5GB of memory to process range(100000000). Working with iterators drastically improves this situation. Consider the following:

def better_grouper(inputs, n):
    iters = [iter(inputs)] * n
    return zip(*iters)

There’s a lot going on in this little function, so let’s break it down with a concrete example. The expression [iter(inputs)] * n creates a list of n references to the same iterator:

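A quick way to confirm that all n entries really point to the same iterator:

>>> nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> iters = [iter(nums)] * 2
>>> iters[0] is iters[1]
True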
Next, zip(*iters) returns an iterator over pairs of corresponding elements of each iterator in iters. When the first element, 1, is taken from the “first” iterator, the “second” iterator now starts at 2 since it is just a reference to the “first” iterator and has therefore been advanced one step. So, the first tuple produced by zip() is (1, 2).

At this point, “both” iterators in iters start at 3, so when zip() pulls 3 from the “first” iterator, it gets 4 from the “second” to produce the tuple (3, 4). This process continues until zip() finally produces (9, 10) and “both” iterators in iters are exhausted:

>>> nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> list(better_grouper(nums, 2))
[(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]

The better_grouper() function is better for a couple of reasons. First, without the reference to the len() built-in, better_grouper() can take any iterable as an argument (even infinite iterators). Second, by returning an iterator rather than a list, better_grouper() can process enormous iterables without trouble and uses much less memory.

Store the following in a file called better.py and run it with time from the console again:

$ time -f "Memory used (kB): %M\nUser time (seconds): %U" python3 better.py
Memory used (kB): 7224
User time (seconds): 2.48

That’s a whopping 630 times less memory used than naive.py in less than a quarter of the time!

Now that you’ve seen what itertools is (“iterator algebra”) and why you should use it (improved memory efficiency and faster execution time), let’s take a look at how to take better_grouper() to the next level with itertools.

The grouper Recipe

The problem with better_grouper() is that it doesn’t handle situations where the value passed to the second argument isn’t a factor of the length of the iterable in the first argument:

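For example:

>>> nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> list(better_grouper(nums, 4))
[(1, 2, 3, 4), (5, 6, 7, 8)]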
The elements 9 and 10 are missing from the grouped output. This happens because zip() stops aggregating elements once the shortest iterable passed to it is exhausted. It would make more sense to return a third group containing 9 and 10.

To do this, you can use itertools.zip_longest(). This function accepts any number of iterables as arguments and a fillvalue keyword argument that defaults to None. The easiest way to get a sense of the difference between zip() and zip_longest() is to look at some example output:

>>> import itertools as it
>>> x = [1, 2, 3, 4, 5]
>>> y = ['a', 'b', 'c']
>>> list(zip(x, y))
[(1, 'a'), (2, 'b'), (3, 'c')]
>>> list(it.zip_longest(x, y))
[(1, 'a'), (2, 'b'), (3, 'c'), (4, None), (5, None)]

With this in mind, replace zip() in better_grouper() with zip_longest():

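One version with that single change (plus a fillvalue pass-through, as in the itertools recipe) looks like this:

def grouper(inputs, n, fillvalue=None):
    iters = [iter(inputs)] * n
    return it.zip_longest(*iters, fillvalue=fillvalue)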
Now you get a better result:

>>> nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> print(list(grouper(nums, 4)))
[(1, 2, 3, 4), (5, 6, 7, 8), (9, 10, None, None)]

The grouper() function can be found in the Recipes section of the itertools docs. The recipes are an excellent source of inspiration for ways to use itertools to your advantage.

Note: From this point forward, the line import itertools as it will not be included at the beginning of examples. All itertools methods in code examples are prefaced with it. The module import is implied.

If you get a NameError: name 'itertools' is not defined or a NameError: name 'it' is not defined exception when running one of the examples in this tutorial you’ll need to import the itertools module first.

Et tu, Brute Force?

Here’s a common interview-style problem:

You have three $20 dollar bills, five $10 dollar bills, two $5 dollar bills, and five $1 dollar bills. How many ways can you make change for a $100 dollar bill?

To “brute force” this problem, you just start listing off the ways there are to choose one bill from your wallet, check whether any of these makes change for $100, then list the ways to pick two bills from your wallet, check again, and so on and so forth.

But you are a programmer, so naturally you want to automate this process.

First, create a list of the bills you have in your wallet:

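One flat list with one entry per bill does the job:

bills = [20, 20, 20, 10, 10, 10, 10, 10, 5, 5, 1, 1, 1, 1, 1]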
A choice of k things from a set of n things is called a combination, and itertools has your back here. The itertools.combinations() function takes two arguments—an iterable inputs and a positive integer n—and produces an iterator over tuples of all combinations of n elements in inputs.

For example, to list the combinations of three bills in your wallet, just do:

>>> list(it.combinations(bills, 3))
[(20, 20, 20), (20, 20, 10), (20, 20, 10), ... ]

To solve the problem, you can loop over the positive integers from 1 to len(bills), then check which combinations of each size add up to $100:

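A straightforward version (the makes_100 list is reused below) might be:

makes_100 = []
for n in range(1, len(bills) + 1):
    for combination in it.combinations(bills, n):
        if sum(combination) == 100:
            makes_100.append(combination)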
If you print out makes_100, you will notice there are a lot of repeated combinations. This makes sense because you can make change for $100 with three $20 dollar bills and four $10 bills, but combinations() does this with the first four $10 dollars bills in your wallet; the first, third, fourth and fifth $10 dollar bills; the first, second, fourth and fifth $10 bills; and so on.

To remove duplicates from makes_100, you can convert it to a set:

>>> set(makes_100)
{(20, 20, 10, 10, 10, 10, 10, 5, 1, 1, 1, 1, 1),
 (20, 20, 10, 10, 10, 10, 10, 5, 5),
 (20, 20, 20, 10, 10, 10, 5, 1, 1, 1, 1, 1),
 (20, 20, 20, 10, 10, 10, 5, 5),
 (20, 20, 20, 10, 10, 10, 10)}

So, there are five ways to make change for a $100 bill with the bills you have in your wallet.

Here’s a variation on the same problem:

How many ways are there to make change for a $100 bill using any number of $50, $20, $10, $5, and $1 dollar bills?

In this case, you don’t have a pre-set collection of bills, so you need a way to generate all possible combinations using any number of bills. For this, you’ll need the itertools.combinations_with_replacement() function.

It works just like combinations(), accepting an iterable inputs and a positive integer n, and returns an iterator over n-tuples of elements from inputs. The difference is that combinations_with_replacement() allows elements to be repeated in the tuples it returns.

For example:

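For example:

>>> list(it.combinations_with_replacement([1, 2], 2))
[(1, 1), (1, 2), (2, 2)]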
Compare that to combinations():

>>> list(it.combinations([1, 2], 2))
[(1, 2)]

Here’s what the solution to the revised problem looks like:

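One possible version, allowing anywhere from 1 to 100 bills per combination:

bills = [50, 20, 10, 5, 1]
makes_100 = []
for n in range(1, 101):
    for combination in it.combinations_with_replacement(bills, n):
        if sum(combination) == 100:
            makes_100.append(combination)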
In this case, you do not need to remove any duplicates since combinations_with_replacement() won’t produce any:

>>> len(makes_100)
343

If you run the above solution, you may notice that it takes a while for the output to display. That is because it has to process 96,560,645 combinations!

Another “brute force” itertools function is permutations(), which accepts a single iterable and produces all possible permutations (rearrangements) of its elements:

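For example:

>>> list(it.permutations(['a', 'b', 'c']))
[('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'a', 'c'),
 ('b', 'c', 'a'), ('c', 'a', 'b'), ('c', 'b', 'a')]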
Any iterable of three elements will have six permutations, and the number of permutations of longer iterables grows extremely fast. In fact, an iterable of length n has n! permutations, where n! = n × (n − 1) × (n − 2) × ⋯ × 2 × 1.

To put this in perspective, here’s a table of these numbers for n = 1 to n = 10:

n       n!
2       2
3       6
4       24
5       120
6       720
7       5,040
8       40,320
9       362,880
10      3,628,800

The phenomenon of just a few inputs producing a large number of outcomes is called a combinatorial explosion and is something to keep in mind when working with combinations(), combinations_with_replacement(), and permutations().

It is usually best to avoid brute force algorithms, although there are times you may need to use one (for example, if the correctness of the algorithm is critical, or every possible outcome must be considered). In that case, itertools has you covered.

Section Recap

In this section you met three itertools functions: combinations(), combinations_with_replacement(), and permutations().

Let’s review these functions before moving on:

itertools.combinations Example

combinations(iterable, n)

Return successive n-length combinations of elements in the iterable.

>>> combinations([1, 2, 3], 2)
(1, 2), (1, 3), (2, 3)

itertools.combinations_with_replacement Example

combinations_with_replacement(iterable, n)

Return successive n-length combinations of elements in the iterable allowing individual elements to have successive repeats.

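For example:

>>> combinations_with_replacement([1, 2], 2)
(1, 1), (1, 2), (2, 2)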
itertools.permutations Example

permutations(iterable, n=None)

Return successive n-length permutations of elements in the iterable.

>>> permutations('abc')
('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'a', 'c'),
('b', 'c', 'a'), ('c', 'a', 'b'), ('c', 'b', 'a')

Sequences of Numbers

With itertools, you can easily generate iterators over infinite sequences. In this section, you will explore numeric sequences, but the tools and techniques seen here are by no means limited to numbers.

Evens and Odds

For the first example, you will create a pair of iterators over even and odd integers without explicitly doing any arithmetic. Before diving in, let’s look at an arithmetic solution using generators:

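A plain-generator version (no itertools involved) might look something like this:

def evens():
    """Yield even integers, starting with 0."""
    n = 0
    while True:
        yield n
        n += 2


def odds():
    """Yield odd integers, starting with 1."""
    n = 1
    while True:
        yield n
        n += 2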
That is pretty straightforward, but with itertools you can do this much more compactly. The function you need is itertools.count(), which does exactly what it sounds like: it counts, starting by default with the number 0.

>>> counter = it.count()
>>> list(next(counter) for _ in range(5))
[0, 1, 2, 3, 4]

You can start counting from any number you like by setting the start keyword argument, which defaults to 0. You can even set a step keyword argument to determine the interval between numbers returned from count()—this defaults to 1.

With count(), iterators over even and odd integers become literal one-liners:

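For example (the variable names are just illustrative):

evens = it.count(step=2)          # 0, 2, 4, 6, ...
odds = it.count(start=1, step=2)  # 1, 3, 5, 7, ...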
Ever since Python 3.1, the count() function also accepts non-integer arguments:

>>> count_with_floats = it.count(start=0.5, step=0.75)
>>> list(next(count_with_floats) for _ in range(5))
[0.5, 1.25, 2.0, 2.75, 3.5]

You can even pass it negative numbers:

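For instance, counting down by halves:

>>> negative_count = it.count(start=-1, step=-0.5)
>>> list(next(negative_count) for _ in range(5))
[-1, -1.5, -2.0, -2.5, -3.0]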
In some ways, count() is similar to the built-in range() function, but count() always returns an infinite sequence. You might wonder what good an infinite sequence is since it’s impossible to iterate over completely. That is a valid question, and I admit the first time I was introduced to infinite iterators, I too didn’t quite see the point.

The example that made me realize the power of the infinite iterator was the following, which emulates the behavior of the built-in enumerate() function:

>>> list(zip(it.count(), ['a', 'b', 'c']))
[(0, 'a'), (1, 'b'), (2, 'c')]

It is a simple example, but think about it: you just enumerated a list without a for loop and without knowing the length of the list ahead of time.

Recurrence Relations

A recurrence relation is a way of describing a sequence of numbers with a recursive formula. One of the best-known recurrence relations is the one that describes the Fibonacci sequence.

The Fibonacci sequence is the sequence 0, 1, 1, 2, 3, 5, 8, 13, .... It starts with 0 and 1, and each subsequent number in the sequence is the sum of the previous two. The numbers in this sequence are called the Fibonacci numbers. In mathematical notation, the recurrence relation describing the n-th Fibonacci number looks like this:

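F(n) = F(n-1) + F(n-2),  with F(0) = 0 and F(1) = 1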
Note: If you search Google, you will find a host of implementations of these numbers in Python. You can find a recursive function that produces them in the Thinking Recursively in Python article here on Real Python.

It is common to see the Fibonacci sequence produced with a generator:

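A typical version (called fibs() here so it can be compared with second_order() later) might be:

def fibs():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b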
The recurrence relation describing the Fibonacci numbers is called a second order recurrence relation because, to calculate the next number in the sequence, you need to look back two numbers behind it.

In general, second order recurrence relations have the form:

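sᵢ = Psᵢ₋₁ + Qsᵢ₋₂ + R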
Here, P, Q, and R are constants. To generate the sequence, you need two initial values. For the Fibonacci numbers, P = Q = 1, R = 0, and the initial values are 0 and 1.

As you might guess, a first order recurrence relation has the following form:

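sᵢ = Psᵢ₋₁ + Q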
There are countless sequences of numbers that can be described by first and second order recurrence relations. For example, the positive integers can be described as a first order recurrence relation with P = Q = 1 and initial value 1. For the even integers, take P = 1 and Q = 2 with initial value 0.

In this section, you will construct functions for producing any sequence whose values can be described with a first or second order recurrence relation.

First Order Recurrence Relations

You’ve already seen how count() can generate the sequence of non-negative integers, the even integers, and the odd integers. You can also use it to generate the sequence 3n = 0, 3, 6, 9, 12, … and 4n = 0, 4, 8, 12, 16, ….

count_by_three = it.count(step=3)
count_by_four = it.count(step=4)

In fact, count() can produce sequences of multiples of any number you wish. These sequences can be described with first-order recurrence relations. For example, to generate the sequence of multiples of some number n, just take P = 1, Q = n, and initial value 0.

Another easy example of a first-order recurrence relation is the constant sequence n, n, n, n, n…, where n is any value you’d like. For this sequence, set P = 1 and Q = 0 with initial value n. itertools provides an easy way to implement this sequence as well, with the repeat() function:

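For example:

all_ones = it.repeat(1)  # 1, 1, 1, 1, ...
all_twos = it.repeat(2)  # 2, 2, 2, 2, ...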
If you need a finite sequence of repeated values, you can set a stopping point by passing a positive integer as a second argument:

five_ones = it.repeat(1, 5)   # 1, 1, 1, 1, 1
three_fours = it.repeat(4, 3)  # 4, 4, 4

What may not be quite as obvious is that the sequence 1, -1, 1, -1, 1, -1, ... of alternating 1s and -1s can also be described by a first order recurrence relation. Just take P = -1, Q = 0, and initial value 1.

There’s an easy way to generate this sequence with the itertools.cycle() function. This function takes an iterable inputs as an argument and returns an infinite iterator over the values in inputs that returns to the beginning once the end of inputs is reached. So, to produce the alternating sequence of 1s and -1s, you could do this:

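alternating_ones = it.cycle([1, -1])  # 1, -1, 1, -1, 1, -1, ...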
The goal of this section, though, is to produce a single function that can generate any first order recurrence relation—just pass it P, Q, and an initial value. One way to do this is with itertools.accumulate().

The accumulate() function takes two arguments—an iterable inputs and a binary function func (that is, a function with exactly two inputs)—and returns an iterator over accumulated results of applying func to elements of inputs. It is roughly equivalent to the following generator:

def accumulate(inputs, func):
    itr = iter(inputs)
    prev = next(itr)
    yield prev
    for cur in itr:
        prev = func(prev, cur)
        yield prev

For example:

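For example, with operator.add as the binary function:

>>> import operator
>>> list(it.accumulate([1, 2, 3, 4, 5], operator.add))
[1, 3, 6, 10, 15]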
The first value in the iterator returned by accumulate() is always the first value in the input sequence. In the above example, this is 1—the first value in [1, 2, 3, 4, 5].

The next value in the output iterator is the sum of the first two elements of the input sequence: sum(1, 2) = 3. To produce the next value, accumulate() takes the result of sum(1, 2) and adds this to the third value in the input sequence:

sum(3, 3) = sum(sum(1, 2), 3) = 6

The fourth value produced by accumulate() is sum(sum(sum(1, 2), 3), 4) = 10, and so on.

The second argument of accumulate() defaults to addition (operator.add), so the previous example can be simplified to:

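>>> list(it.accumulate([1, 2, 3, 4, 5]))
[1, 3, 6, 10, 15]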
Passing the built-in min() to accumulate() will keep track of a running minimum:

>>> list(it.accumulate([9, 21, 17, 5, 11, 12, 2, 6], min))
[9, 9, 9, 5, 5, 5, 2, 2]

More complex functions can be passed to accumulate() with lambda expressions:

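For instance, a lambda that multiplies instead of adds produces a running product (a made-up example, just to show the shape of the call):

>>> list(it.accumulate([1, 2, 3, 4, 5], lambda x, y: x * y))
[1, 2, 6, 24, 120]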
The order of the arguments in the binary function passed to accumulate() is important. The first argument is always the previously accumulated result and the second argument is always the next element of the input iterable. For example, consider the difference in output of the following expressions:

>>> list(it.accumulate([1, 2, 3, 4, 5], lambda x, y: x - y))
[1, -1, -4, -8, -13]

>>> list(it.accumulate([1, 2, 3, 4, 5], lambda x, y: y - x))
[1, 1, 2, 2, 3]

To model a recurrence relation, you can just ignore the second argument of the binary function passed to accumulate(). That is, given values p and q, lambda x, _: p*x + q will return the value following x in the recurrence relation defined by sᵢ = Psᵢ₋₁ + Q.

In order for accumulate() to iterate over the resulting recurrence relation, you need to pass to it an infinite sequence with the right initial value. It doesn’t matter what the rest of the values in the sequence are, as long as the initial value is the initial value of the recurrence relation. You can do this with repeat():

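One compact way to write it (a sketch; the name first_order matches how it is used below):

def first_order(p, q, initial_val):
    """Return sequence defined by s(n) = p * s(n-1) + q."""
    return it.accumulate(it.repeat(initial_val), lambda s, _: p*s + q)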
Using first_order(), you can build the sequences from above as follows:

>>> evens = first_order(p=1, q=2, initial_val=0)
>>> list(next(evens) for _ in range(5))
[0, 2, 4, 6, 8]

>>> odds = first_order(p=1, q=2, initial_val=1)
>>> list(next(odds) for _ in range(5))
[1, 3, 5, 7, 9]

>>> count_by_threes = first_order(p=1, q=3, initial_val=0)
>>> list(next(count_by_threes) for _ in range(5))
[0, 3, 6, 9, 12]

>>> count_by_fours = first_order(p=1, q=4, initial_val=0)
>>> list(next(count_by_fours) for _ in range(5))
[0, 4, 8, 12, 16]

>>> all_ones = first_order(p=1, q=0, initial_val=1)
>>> list(next(all_ones) for _ in range(5))
[1, 1, 1, 1, 1]

>>> all_twos = first_order(p=1, q=0, initial_val=2)
>>> list(next(all_twos) for _ in range(5))
[2, 2, 2, 2, 2]

>>> alternating_ones = first_order(p=-1, q=0, initial_val=1)
>>> list(next(alternating_ones) for _ in range(5))
[1, -1, 1, -1, 1]

Second Order Recurrence Relations

Generating sequences described by second order recurrence relations, like the Fibonacci sequence, can be accomplished using a similar technique as the one used for first order recurrence relations.

The difference here is that you need to create an intermediate sequence of tuples that keep track of the previous two elements of the sequence, and then map() each of these tuples to their first component to get the final sequence.

Here’s what it looks like:

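A sketch along the same lines, keeping (previous, current) pairs in an intermediate iterator and then projecting out the first component:

def second_order(p, q, r, initial_values):
    """Return sequence defined by s(n) = p * s(n-1) + q * s(n-2) + r."""
    intermediate = it.accumulate(
        it.repeat(initial_values),
        lambda s, _: (s[1], p*s[1] + q*s[0] + r)
    )
    return map(lambda x: x[0], intermediate)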
Using second_order(), you can generate the Fibonacci sequence like this:

>>> fibs = second_order(p=1, q=1, r=0, initial_values=(0, 1))
>>> list(next(fibs) for _ in range(8))
[0, 1, 1, 2, 3, 5, 8, 13]

Other sequences can be easily generated by changing the values of p, q, and r. For example, the Pell numbers and the Lucas numbers can be generated as follows:

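For instance, using the recurrences Pₙ = 2Pₙ₋₁ + Pₙ₋₂ and Lₙ = Lₙ₋₁ + Lₙ₋₂:

>>> pell = second_order(p=2, q=1, r=0, initial_values=(0, 1))
>>> list(next(pell) for _ in range(6))
[0, 1, 2, 5, 12, 29]

>>> lucas = second_order(p=1, q=1, r=0, initial_values=(2, 1))
>>> list(next(lucas) for _ in range(6))
[2, 1, 3, 4, 7, 11]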
You can even generate the alternating Fibonacci numbers:

>>> alt_fibs = second_order(p=-1, q=1, r=0, initial_values=(-1, 1))
>>> list(next(alt_fibs) for _ in range(6))
[-1, 1, -2, 3, -5, 8]

This is all really cool if you are a giant math nerd like I am, but step back for a second and compare second_order() to the fibs() generator from the beginning of this section. Which one is easier to understand?

This is a valuable lesson. The accumulate() function is a powerful tool to have in your toolkit, but there are times when using it could mean sacrificing clarity and readability.

Section Recap

You saw several itertools functions in this section. Let’s review those now.

itertools.count Example

count(start=0, step=1)

Return a count object whose .__next__() method returns consecutive values.

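For example:

>>> count(start=1, step=2)
1, 3, 5, 7, 9, ...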
itertools.repeat Example

repeat(object[, times])

Create an iterator which returns the object for the specified number of times. If not specified, returns the object endlessly.

>>> repeat(2)
2, 2, 2, 2, 2 ...
>>> repeat(2, 5)  # Stops after 5 repetitions.
2, 2, 2, 2, 2

itertools.cycle Example

cycle(iterable)

Return elements from the iterable until it is exhausted. Then repeat the sequence indefinitely.

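For example:

>>> cycle(['a', 'b'])
'a', 'b', 'a', 'b', 'a', ...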
itertools.accumulate Example

accumulate(iterable, func=operator.add)

Return series of accumulated sums (or other binary function results).

>>> accumulate([1, 2, 3])
1, 3, 6

Alright, let’s take a break from the math and have some fun with cards.

Dealing a Deck of Cards

Suppose you are building a Poker app. You’ll need a deck of cards. You might start by defining a list of ranks (ace, king, queen, jack, 10, 9, and so on) and a list of suits (hearts, diamonds, clubs, and spades):

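The exact representation is up to you; one simple choice is string ranks and single-letter suits:

ranks = ['A', 'K', 'Q', 'J', '10', '9', '8', '7', '6', '5', '4', '3', '2']
suits = ['H', 'D', 'C', 'S']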
You could represent a card as a tuple whose first element is a rank and second element is a suit. A deck of cards would be a collection of such tuples. The deck should act like the real thing, so it makes sense to define a generator that yields cards one at a time and becomes exhausted once all the cards are dealt.

One way to achieve this is to write a generator with a nested for loop over ranks and suits:

def cards():
    """Return a generator that yields playing cards."""
    for rank in ranks:
        for suit in suits:
            yield rank, suit

You could write this more compactly with a generator expression:

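cards = ((rank, suit) for rank in ranks for suit in suits)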
However, some might argue that this is actually more difficult to understand than the more explicit nested for loop. After all, the order in which the for fragments appear in the generator expression seems entirely unnatural.

It helps to view nested for loops from a mathematical standpoint—that is, as a Cartesian product of two or more iterables. In mathematics, the Cartesian product of two sets A and B is the set of all tuples of the form (a, b) where a is an element of A and b is an element of B.

Here’s an example with Python iterables: the Cartesian product of A = [1, 2] and B = ['a', 'b'] is [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')].

The itertools.product() function is for exactly this situation. It takes any number of iterables as arguments and returns an iterator over tuples in the Cartesian product:

it.product([1, 2], ['a', 'b'])  # (1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')

The product() function is by no means limited to two iterables. You can pass it as many as you like—they don’t even have to all be of the same size! See if you can predict what product([1, 2, 3], ['a', 'b'], ['c']) is, then check your work by running it in the interpreter.

Warning: The product() function is another “brute force” function and can lead to a combinatorial explosion if you aren’t careful.

Using product(), you can re-write the cards in a single line:

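Since product() already returns an iterator, there is no generator function to define:

cards = it.product(ranks, suits)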
This is all fine and dandy, but any Poker app worth its salt better start with a shuffled deck:

import random


def shuffle(deck):
    """Return iterator over shuffled deck."""
    deck = list(deck)
    random.shuffle(deck)
    return iter(tuple(deck))


cards = shuffle(cards)

Note: The random.shuffle() function uses the Fisher-Yates shuffle to shuffle a list (or any mutable sequence) in place in O(n) time. This algorithm is well-suited for shuffling cards because it produces an unbiased permutation—that is, all permutations of the iterable are equally likely to be returned by random.shuffle().

That said, you probably noticed that shuffle() creates a copy of its input deck in memory by calling list(deck). While this seemingly goes against the spirit of this article, this author is unaware of a good way to shuffle an iterator without making a copy.

As a courtesy to your users, you would like to give them the opportunity to cut the deck. If you imagine the cards being stacked neatly on a table, you have the user pick a number n and then remove the first n cards from the top of the stack and move them to the bottom.

If you know a thing or two about slicing, you might accomplish this like so:

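A list-slicing sketch might look like this:

def cut(deck, n):
    """Return an iterator over a deck of cards cut at index `n`."""
    deck = list(deck)
    if n < 0:
        raise ValueError('`n` must be a non-negative integer')
    return iter(deck[n:] + deck[:n])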
The cut() function first converts deck to a list so that you can slice it to make the cut. To guarantee your slices behave as expected, you’ve got to check that n is non-negative. If it isn’t, you better throw an exception so that nothing crazy happens.

Cutting the deck is pretty straightforward: the top of the cut deck is just deck[:n], and the bottom is the remaining cards, or deck[n:]. To construct the new deck with the top “half” moved to the bottom, you just append it to the bottom: deck[n:] + deck[:n].

The cut() function is pretty simple, but it suffers from a couple of problems. When you slice a list, you make a copy of the original list and return a new list with the selected elements. With a deck of only 52 cards, this increase in space complexity is trivial, but you could reduce the memory overhead using itertools. To do this, you’ll need three functions: itertools.tee(), itertools.islice(), and itertools.chain().

Let’s take a look at how those functions work.

The tee() function can be used to create any number of independent iterators from a single iterable. It takes two arguments: the first is an iterable inputs, and the second is the number n of independent iterators over inputs to return (by default, n is set to 2). The iterators are returned in a tuple of length n.

>>> iterator1, iterator2 = it.tee([1, 2, 3, 4, 5], 2)
>>> list(iterator1)
[1, 2, 3, 4, 5]
>>> list(iterator1)  # iterator1 is now exhausted.
[]
>>> list(iterator2)  # iterator2 works independently of iterator1.
[1, 2, 3, 4, 5]

While tee() is useful for creating independent iterators, it is important to understand a little bit about how it works under the hood. When you call tee() to create n independent iterators, each iterator is essentially working with its own FIFO queue.

When a value is extracted from one iterator, that value is appended to the queues for the other iterators. Thus, if one iterator is exhausted before the others, each remaining iterator will hold a copy of the entire iterable in memory. (You can find a Python function that emulates tee() in the itertools docs.)

For this reason, tee() should be used with care. If you are exhausting large portions of an iterator before working with the other returned by tee(), you may be better off casting the input iterator to a list or tuple.

The islice() function works much the same way as slicing a list or tuple. You pass it an iterable, a starting, and stopping point, and, just like slicing a list, the slice returned stops at the index just before the stopping point. You can optionally include a step value, as well. The biggest difference here is, of course, that islice() returns an iterator.

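A few illustrative calls (the last two show the truncating use mentioned next):

>>> list(it.islice('ABCDEFG', 2, 5))
['C', 'D', 'E']
>>> list(it.islice([1, 2, 3, 4, 5, 6, 7, 8], 1, 8, 2))
[2, 4, 6, 8]
>>> list(it.islice([1, 2, 3, 4, 5], 3))
[1, 2, 3]
>>> list(it.islice(it.count(), 5))
[0, 1, 2, 3, 4]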
The last two examples above are useful for truncating iterables. You can use this to replace the list slicing used in cut() to select the “top” and “bottom” of the deck. As an added bonus, islice() won’t accept negative indices for the start/stop positions and the step value, so you won’t need to raise an exception if n is negative.

The last function you need is chain(). This function takes any number of iterables as arguments and “chains” them together. For example:

>>> list(it.chain('ABC', 'DEF'))
['A', 'B', 'C', 'D', 'E', 'F']

>>> list(it.chain([1, 2], [3, 4, 5, 6], [7, 8, 9]))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Now that you’ve got some additional firepower in your arsenal, you can re-write the cut() function to cut the deck of cards without making a full copy of cards in memory:

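A sketch that combines the three functions:

def cut(deck, n):
    """Return an iterator over a deck of cards cut at index `n`."""
    deck1, deck2 = it.tee(deck, 2)
    top = it.islice(deck1, n)
    bottom = it.islice(deck2, n, None)
    return it.chain(bottom, top)


cards = cut(cards, 26)  # Cut the deck in half.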
Now that you have shuffled and cut the cards, it is time to deal some hands. You could write a function deal() that takes a deck, the number of hands, and the hand size as arguments and returns a tuple containing the specified number of hands.

You do not need any new itertools functions to write this function. See what you can come up with on your own before reading ahead.

Here’s one solution:

def deal(deck, num_hands=1, hand_size=5):
    iters = [iter(deck)] * hand_size
    return tuple(zip(*(tuple(it.islice(itr, num_hands)) for itr in iters)))

You start by creating a list of hand_size references to an iterator over deck. You then iterate over this list, removing num_hands cards at each step and storing them in tuples.

Next, you zip() these tuples up to emulate dealing one card at a time to each player. This produces num_hands tuples, each containing hand_size cards. Finally, you package the hands up into a tuple to return them all at once.

This implementation sets the default values for num_hands to 1 and hand_size to 5—maybe you are making a “Five Card Draw” app. Here’s how you would use this function, with some sample output:

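For instance, dealing three hands of five cards (the cards themselves depend on the shuffle):

>>> p1_hand, p2_hand, p3_hand = deal(cards, num_hands=3)
>>> len(p1_hand)
5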
What do you think the state of cards is now that you have dealt three hands of five cards?

>>> len(tuple(cards))
37

The fifteen cards dealt are consumed from the cards iterator, which is exactly what you want. That way, as the game continues, the state of the cards iterator reflects the state of the deck in play.

Section Recap

Let’s review the itertools functions you saw in this section.

itertools.product Example

product(*iterables, repeat=1)

Cartesian product of input iterables. Equivalent to nested for-loops.

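For example:

>>> product([1, 2], ['a', 'b'])
(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')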
itertools.tee Example

tee(iterable, n=2)

Create any number of independent iterators from a single input iterable.

>>> iter1, iter2 = it.tee(['a', 'b', 'c'], 2)
>>> list(iter1)
['a', 'b', 'c']
>>> list(iter2)
['a', 'b', 'c']

itertools.islice Example

islice(iterable, stop)
islice(iterable, start, stop, step=1)

Return an iterator whose __next__() method returns selected values from an iterable. Works like a slice() on a list but returns an iterator.

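For example:

>>> islice([1, 2, 3, 4], 3)
1, 2, 3
>>> islice([1, 2, 3, 4], 1, 2)
2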
itertools.chain Example

chain(*iterables)

Return a chain object whose __next__() method returns elements from the first iterable until it is exhausted, then elements from the next iterable, until all of the iterables are exhausted.

>>> chain('abc', [1, 2, 3])
'a', 'b', 'c', 1, 2, 3

Intermission: Flattening A List of Lists

In the previous example, you used chain() to tack one iterator onto the end of another. The chain() function has a class method .from_iterable() that takes a single iterable as an argument. The elements of the iterable must themselves be iterable, so the net effect is that chain.from_iterable() flattens its argument:

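For example:

>>> list(it.chain.from_iterable([[1, 2, 3], [4, 5, 6]]))
[1, 2, 3, 4, 5, 6]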
There’s no reason the argument of chain.from_iterable() needs to be finite. You could emulate the behavior of cycle(), for example:

>>> cycle = it.chain.from_iterable(it.repeat('abc'))
>>> list(it.islice(cycle, 8))
['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b']

The chain.from_iterable() function is useful when you need to build an iterator over data that has been “chunked.”

In the next section, you will see how to use itertools to do some data analysis on a large dataset. But you deserve a break for having stuck with it this far. Why not hydrate yourself and relax a bit? Maybe even play a little Star Trek: The Nth Iteration.

Back? Great! Let’s do some data analysis.

Analyzing the S&P500

In this example, you will get your first taste of using itertools to manipulate a large dataset—in particular, the historical daily price data of the S&P500 index. A CSV file SP500.csv with this data can be found here (source: Yahoo Finance). The problem you’ll tackle is this:

Determine the maximum daily gain, daily loss (in percent change), and the longest growth streak in the history of the S&P500.

To get a feel for what you’re dealing with, here are the first ten rows of SP500.csv:

As you can see, the early data is limited. The data improves for later dates, and, as a whole, is sufficient for this example.

The strategy for solving this problem is as follows:

  • Read data from the CSV file and transform it into a sequence gains of daily percent changes using the “Adj Close” column.
  • Find the maximum and minimum values of the gains sequence, and the date on which they occur. (Note that it is possible that these values are attained on more than one date; in that case, the most recent date will suffice.)
  • Transform gains into a sequence growth_streaks of tuples of consecutive positive values in gains. Then determine the length of the longest tuple in growth_streaks and the beginning and ending dates of the streak. (It is possible that the maximum length is attained by more than one tuple in growth_streaks; in that case, the tuple with the most recent beginning and ending dates will suffice.)

The percent change between two values x and y is given by the following formula:

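percent change = 100 × (y − x) / x,  where x is the earlier of the two values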
For each step in the analysis, it is necessary to compare values associated with dates. To facilitate these comparisons, you can subclass the namedtuple object from the collections module:

from collections import namedtuple


class DataPoint(namedtuple('DataPoint', ['date', 'value'])):
    __slots__ = ()

    def __le__(self, other):
        return self.value <= other.value

    def __lt__(self, other):
        return self.value < other.value

    def __gt__(self, other):
        return self.value > other.value

The DataPoint class has two attributes: date (a datetime.datetime instance) and value. The .__le__(), .__lt__() and .__gt__() dunder methods are implemented so that the <=, <, and > boolean comparators can be used to compare the values of two DataPoint objects. This also allows the max() and min() built-in functions to be called with DataPoint arguments.

Note: If you are not familiar with namedtuple, check out this excellent resource. The namedtuple implementation for DataPoint is just one of many ways to build this data structure. For example, in Python 3.7 you could implement DataPoint as a data class. Check out our Ultimate Guide to Data Classes for more information.

The following reads the data from SP500.csv to a tuple of DataPoint objects:
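
Here is that reader, as it appears in the full script at the end of this section:

import csv
from datetime import datetime


def read_prices(csvfile, _strptime=datetime.strptime):
    with open(csvfile) as infile:
        reader = csv.DictReader(infile)
        for row in reader:
            yield DataPoint(date=_strptime(row['Date'], '%Y-%m-%d').date(),
                            value=float(row['Adj Close']))


prices = tuple(read_prices('SP500.csv'))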

The read_prices() generator opens SP500.csv and reads each row with a csv.DictReader() object. DictReader() returns each row as an OrderedDict whose keys are the column names from the header row of the CSV file.

For each row, read_prices() yields a DataPoint object containing the values in the “Date” and “Adj Close” columns. Finally, the full sequence of data points is committed to memory as a tuple and stored in the prices variable.

Next, prices needs to be transformed to a sequence of daily percent changes:

gains = tuple(DataPoint(day.date, 100*(day.value/prev_day.value - 1.))
              for day, prev_day in zip(prices[1:], prices))

The choice of storing the data in a tuple is intentional. Although you could point gains to an iterator, you will need to iterate over the data twice to find the minimum and maximum values.

If you use tee() to create two independent iterators, exhausting one iterator to find the maximum will create a copy of all of the data in memory for the second iterator. By creating a tuple up front, you do not lose anything in terms of space complexity compared to tee(), and you may even gain a little speed.

Note: This example focuses on leveraging itertools for analyzing the S&P500 data. Those intent on working with a lot of time series financial data might also want to check out the Pandas library, which is well suited for such tasks.

Maximum Gain and Loss

To determine the maximum gain on any single day, you might do something like this:
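
A minimal sketch of such a loop, seeded with an “empty” DataPoint as the starting value (the same zero DataPoint described below):

max_gain = DataPoint(None, 0)
for data_point in gains:
    max_gain = max(data_point, max_gain)

print(max_gain)  # DataPoint(date='2008-10-28', value=11.58)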

You can simplify the for loop using the functools.reduce() function. This function accepts a binary function func and an iterable inputs as arguments, and “reduces” inputs to a single value by applying func cumulatively to pairs of objects in the iterable.

For example, functools.reduce(lambda x, y: x + y, [1, 2, 3, 4, 5]) will return the sum 1 + 2 + 3 + 4 + 5 = 15. (The binary function must accept two arguments, so the built-in sum(), which expects a single iterable, won’t work here.) You can think of reduce() as working in much the same way as accumulate(), except that it returns only the final value in the new sequence.
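
A quick check in the REPL:

>>> import functools as ft
>>> ft.reduce(lambda x, y: x + y, [1, 2, 3, 4, 5])
15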

Using reduce(), you can get rid of the for loop altogether in the above example:

import functools as ft

max_gain = ft.reduce(max, gains)
print(max_gain)  # DataPoint(date='2008-10-28', value=11.58)

The above solution works, but it isn’t equivalent to the for loop you had before. Do you see why? Suppose the data in your CSV file recorded a loss every single day. What would the value of max_gain be?

In the for loop, you first set max_gain = DataPoint(None, 0), so if there are no gains, the final max_gain value will be this empty DataPoint object. However, the reduce() solution returns the smallest loss. That is not what you want and could introduce a difficult to find bug.

for循环中,首先设置max_gain = DataPoint(None, 0) ,因此,如果没有增益,则最终的max_gain值将是此空的DataPoint对象。 但是, reduce()解决方案返回的损失最小。 那不是您想要的,可能会引入难以发现的错误。

This is where itertools can help you out. The itertools.filterfalse() function takes two arguments: a function that returns True or False (called a predicate), and an iterable inputs. It returns an iterator over the elements in inputs for which the predicate returns False.

Here’s a simple example:
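
One possibility, keeping only the positive numbers from a list (with itertools aliased as it, as elsewhere in this article):

>>> only_positives = it.filterfalse(lambda x: x <= 0, [0, 1, -1, 2, -2])
>>> list(only_positives)
[1, 2]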

You can use filterfalse() to filter out the values in gains that are negative or zero so that reduce() only works on positive values:

# Compare the percent change stored on each DataPoint, not the DataPoint itself.
max_gain = ft.reduce(max, it.filterfalse(lambda p: p.value <= 0, gains))

What happens if there are never any gains? Consider the following:
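
For instance, reducing over a filtered sequence that turns out to be empty (with it and ft aliased as before):

>>> ft.reduce(max, it.filterfalse(lambda x: x <= 0, [-1, -2, -3]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: reduce() of empty sequence with no initial value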

Well, that’s not what you want! But, it makes sense because the iterator returned by filterfalse() is empty. You could handle the TypeError by wrapping the call to reduce() with try...except, but there’s a better way.

The reduce() function accepts an optional third argument for an initial value. Passing 0 to this third argument gets you the expected behavior:

>>> ft.reduce(max, it.filterfalse(lambda x: x <= 0, [-1, -2, -3]), 0)
0

Applying this to the S&P500 example:
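
A sketch of that step, seeding reduce() with the same “zero” DataPoint (zdp) that appears in the full script below:

zdp = DataPoint(None, 0)  # zero DataPoint
max_gain = ft.reduce(max, it.filterfalse(lambda p: p <= zdp, gains), zdp)
print(max_gain)  # DataPoint(date='2008-10-28', value=11.58)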

Great! You’ve got it working just the way it should! Now, finding the maximum loss is easy:

max_loss = ft.reduce(min, it.filterfalse(lambda p: p > zdp, gains), zdp)
print(max_loss)  # DataPoint(date='2018-02-08', value=-20.47)

Longest Growth Streak

Finding the longest growth streak in the history of the S&P500 is equivalent to finding the largest number of consecutive positive data points in the gains sequence. The itertools.takewhile() and itertools.dropwhile() functions are perfect for this situation.

The takewhile() function takes a predicate and an iterable inputs as arguments and returns an iterator over inputs that stops at the first instance of an element for which the predicate returns False:
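
For example, mirroring the dropwhile() snippet below:

it.takewhile(lambda x: x < 3, [0, 1, 2, 3, 4])  # 0, 1, 2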

The dropwhile() function does exactly the opposite. It returns an iterator beginning at the first element for which the predicate returns False:

it.dropwhile(lambda x: x < 3, [0, 1, 2, 3, 4])  # 3, 4

In the following generator function, takewhile() and dropwhile() are composed to yield tuples of consecutive positive elements of a sequence:
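
Here is that function, as it appears in the full script at the end of this section:

def consecutive_positives(sequence, zero=0):
    def _consecutives():
        for itr in it.repeat(iter(sequence)):
            yield tuple(it.takewhile(lambda p: p > zero,
                                     it.dropwhile(lambda p: p <= zero, itr)))
    return it.takewhile(lambda t: len(t), _consecutives())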

The consecutive_positives() function works because repeat() keeps returning a pointer to an iterator over the sequence argument, which is being partially consumed at each iteration by the call to tuple() in the yield statement.

You can use consecutive_positives() to get a generator that produces tuples of consecutive positive data points in gains:

growth_streaks = consecutive_positives(gains, zero=DataPoint(None, 0))

Now you can use reduce() to extract the longest growth streak:
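
As in the full script below:

longest_streak = ft.reduce(lambda x, y: x if len(x) > len(y) else y,
                           growth_streaks)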

Putting the whole thing together, here’s a full script that will read data from the SP500.csv file and print out the max gain/loss and longest growth streak:

from collections import namedtuple
import csv
from datetime import datetime
import itertools as it
import functools as ft


class DataPoint(namedtuple('DataPoint', ['date', 'value'])):
    __slots__ = ()

    def __le__(self, other):
        return self.value <= other.value

    def __lt__(self, other):
        return self.value < other.value

    def __gt__(self, other):
        return self.value > other.value


def consecutive_positives(sequence, zero=0):
    def _consecutives():
        for itr in it.repeat(iter(sequence)):
            yield tuple(it.takewhile(lambda p: p > zero,
                                     it.dropwhile(lambda p: p <= zero, itr)))
    return it.takewhile(lambda t: len(t), _consecutives())


def read_prices(csvfile, _strptime=datetime.strptime):
    with open(csvfile) as infile:
        reader = csv.DictReader(infile)
        for row in reader:
            yield DataPoint(date=_strptime(row['Date'], '%Y-%m-%d').date(),
                            value=float(row['Adj Close']))


# Read prices and calculate daily percent change.
prices = tuple(read_prices('SP500.csv'))
gains = tuple(DataPoint(day.date, 100*(day.value/prev_day.value - 1.))
              for day, prev_day in zip(prices[1:], prices))

# Find maximum daily gain/loss.
zdp = DataPoint(None, 0)  # zero DataPoint
max_gain = ft.reduce(max, it.filterfalse(lambda p: p <= zdp, gains))
max_loss = ft.reduce(min, it.filterfalse(lambda p: p > zdp, gains), zdp)

# Find longest growth streak.
growth_streaks = consecutive_positives(gains, zero=DataPoint(None, 0))
longest_streak = ft.reduce(lambda x, y: x if len(x) > len(y) else y,
                           growth_streaks)

# Display results.
print('Max gain: {1:.2f}% on {0}'.format(*max_gain))
print('Max loss: {1:.2f}% on {0}'.format(*max_loss))
print('Longest growth streak: {num_days} days ({first} to {last})'.format(
    num_days=len(longest_streak),
    first=longest_streak[0].date,
    last=longest_streak[-1].date,
))

Running the above script produces the following output:

Section Recap

In this section, you covered a lot of ground, but you only saw a few functions from itertools. Let’s review those now.

itertools.filterfalse Example

filterfalse(pred, iterable)

Return those items of sequence for which pred(item) is false. If pred is None, return the items that are false.

>>> filterfalse(bool, [1, 0, 1, 0, 0])
0, 0, 0

itertools.takewhile Example

takewhile(pred, iterable)

Return successive entries from an iterable as long as pred evaluates to true for each entry.
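
For example, using the same input as the dropwhile() snippet below:

>>> takewhile(bool, [1, 1, 1, 0, 0, 1, 1, 0])
1, 1, 1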

itertools.dropwhile Example

dropwhile(pred, iterable)

Drop items from the iterable while pred(item) is true. Afterwards, return every element until the iterable is exhausted.

pred(item)为true时,从iterable中删除项目。 然后,返回每个元素,直到迭代器耗尽为止。

>>> dropwhile(bool, [1, 1, 1, 0, 0, 1, 1, 0])
0, 0, 1, 1, 0

You are really starting to master this whole itertools thing! The community swim team would like to commission you for a small project.

Building Relay Teams From Swimmer Data

In this example, you will read data from a CSV file containing swimming event times for a community swim team from all of the swim meets over the course of a season. The goal is to determine which swimmers should be in the relay teams for each stroke next season.

Each stroke should have an “A” and a “B” relay team with four swimmers each. The “A” team should contain the four swimmers with the best times for the stroke and the “B” team the swimmers with the next four best times.

The data for this example can be found here. If you want to follow along, download it to your current working directory and save it as swimmers.csv.

Here are the first 10 rows of swimmers.csv:

The three times in each row represent the times recorded by three different stopwatches, and are given in MM:SS:mmmmmm format (minutes, seconds, microseconds). The accepted time for an event is the median of these three times, not the average.

Let’s start by creating a subclass Event of the namedtuple object, just like we did in the SP500 example:

from collections import namedtuple


class Event(namedtuple('Event', ['stroke', 'name', 'time'])):
    __slots__ = ()

    def __lt__(self, other):
        return self.time < other.time

The .stroke property stores the name of the stroke in the event, .name stores the swimmer name, and .time records the accepted time for the event. The .__lt__() dunder method will allow min() to be called on a sequence of Event objects.

To read the data from the CSV into a tuple of Event objects, you can use the csv.DictReader object:
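
A sketch of that generator, assuming the named CSV columns are Event, Name, and Stroke (the three stopwatch columns are collected under 'Times' via restkey, as described below):

import csv
import statistics
from datetime import datetime


def read_events(csvfile, _strptime=datetime.strptime):
    def _median(times):
        return statistics.median((_strptime(time, '%M:%S:%f').time()
                                  for time in times))

    fieldnames = ['Event', 'Name', 'Stroke']
    with open(csvfile) as infile:
        reader = csv.DictReader(infile, fieldnames=fieldnames, restkey='Times')
        next(reader)  # Skip the header row.
        for row in reader:
            yield Event(row['Stroke'], row['Name'], _median(row['Times']))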

The read_events() generator reads each row in the swimmers.csv file into an OrderedDict object in the following line:

reader = csv.DictReader(infile, fieldnames=fieldnames, restkey='Times')

By assigning the 'Times' field to restkey, the “Time1”, “Time2”, and “Time3” columns of each row in the CSV file will be stored in a list on the 'Times' key of the OrderedDict returned by csv.DictReader.

For example, the first row of the file (excluding the header row) is read into the following object:

Next, read_events() yields an Event object with the stroke, swimmer name, and median time (as a datetime.time object) returned by the _median() function, which calls statistics.median() on the list of times in the row.

Since each item in the list of times is read as a string by csv.DictReader(), _median() uses the datetime.datetime.strptime() classmethod to instantiate a time object from each string.

Finally, a tuple of Event objects is created:

events = tuple(read_events('swimmers.csv'))

The first five elements of events look like this:

Now that you’ve got the data into memory, what do you do with it? Here’s the plan of attack:

  • Group the events by stroke.
  • For each stroke:
    • Group its events by swimmer name and determine the best time for each swimmer.
    • Order the swimmers by best time.
    • The first four swimmers make the “A” team for the stroke, and the next four swimmers make the “B” team.

The itertools.groupby() function makes grouping objects in an iterable a snap. It takes an iterable inputs and a key to group by, and returns an object containing iterators over the elements of inputs grouped by the key.

Here’s a simple groupby() example:

>>> data = [{'name': 'Alan', 'age': 34},
...         {'name': 'Catherine', 'age': 34},
...         {'name': 'Betsy', 'age': 29},
...         {'name': 'David', 'age': 33}]
>>> grouped_data = it.groupby(data, key=lambda x: x['age'])
>>> for key, grp in grouped_data:
...     print('{}: {}'.format(key, list(grp)))
...
34: [{'name': 'Alan', 'age': 34}, {'name': 'Catherine', 'age': 34}]
29: [{'name': 'Betsy', 'age': 29}]
33: [{'name': 'David', 'age': 33}]

If no key is specified, groupby() defaults to grouping by “identity”—that is, aggregating identical elements in the iterable:
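
For instance:

>>> for key, grp in it.groupby([1, 1, 2, 2, 2, 3]):
...     print('{}: {}'.format(key, list(grp)))
...
1: [1, 1]
2: [2, 2, 2]
3: [3]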

The object returned by groupby() is sort of like a dictionary in the sense that the iterators returned are associated with a key. However, unlike a dictionary, it won’t allow you to access its values by key name:

>>> grouped_data[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'itertools.groupby' object is not subscriptable

In fact, groupby() returns an iterator over tuples whose first components are keys and second components are iterators over the grouped data:
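
Unpacking one of those tuples makes this concrete (the object address will vary):

>>> grouped_data = it.groupby(['a', 'a', 'b'])
>>> next(grouped_data)
('a', <itertools._grouper object at 0x...>)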

One thing to keep in mind with groupby() is that it isn’t as smart as you might like. As groupby() traverses the data, it aggregates elements until an element with a different key is encountered, at which point it starts a new group:

>>> grouped_data = it.groupby([1, 2, 1, 2, 3, 2])
>>> for key, grp in grouped_data:
...     print('{}: {}'.format(key, list(grp)))
...
1: [1]
2: [2]
1: [1]
2: [2]
3: [3]
2: [2]

Compare this to, say, the SQL GROUP BY command, which groups elements regardless of their order of appearance.

When working with groupby(), you need to sort your data on the same key that you would like to group by. Otherwise, you may get unexpected results. This is so common that it helps to write a utility function to take care of this for you:
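
A minimal version of that utility, matching the sort_and_group() calls used below:

def sort_and_group(iterable, key=None):
    """Group sorted `iterable` on `key`."""
    return it.groupby(sorted(iterable, key=key), key=key)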

Returning to the swimmers example, the first thing you need to do is create a for loop that iterates over the data in the events tuple grouped by stroke:

for stroke, evts in sort_and_group(events, key=lambda evt: evt.stroke):

Next, you need to group the evts iterator by swimmer name inside of the above for loop:
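
Inside that loop, one way to do it, producing the events_by_name variable that the next snippet consumes:

    events_by_name = sort_and_group(evts, key=lambda evt: evt.name)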

To calculate the best time for each swimmer in events_by_name, you can call min() on the events in that swimmer’s group. (This works because you implemented the .__lt__() dunder method in the Event class.)

best_times = (min(evt) for _, evt in events_by_name)

Note that the best_times generator yields Event objects containing the best stroke time for each swimmer. To build the relay teams, you’ll need to sort best_times by time and aggregate the result into groups of four. To aggregate the results, you can use the grouper() function from The grouper() recipe section and use islice() to grab the first two groups.
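
A sketch of that aggregation, assuming the grouper() recipe defined earlier in the article is in scope:

sorted_by_time = sorted(best_times, key=lambda evt: evt.time)
teams = zip(('A', 'B'), it.islice(grouper(sorted_by_time, 4), 2))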

Now teams is an iterator over exactly two tuples representing the “A” and the “B” team for the stroke. The first component of each tuple is the letter “A” or “B”, and the second component is an iterator over Event objects containing the swimmers in the team. You can now print the results:

for team, swimmers in teams:
    print('{stroke} {team}: {names}'.format(
        stroke=stroke.capitalize(),
        team=team,
        names=', '.join(swimmer.name for swimmer in swimmers)
    ))

Here’s the full script:
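
A full-script sketch assembled from the pieces above (the grouper() recipe and the CSV column names are the assumptions noted earlier):

from collections import namedtuple
import csv
from datetime import datetime
import itertools as it
import statistics


class Event(namedtuple('Event', ['stroke', 'name', 'time'])):
    __slots__ = ()

    def __lt__(self, other):
        return self.time < other.time


def sort_and_group(iterable, key=None):
    """Group sorted `iterable` on `key`."""
    return it.groupby(sorted(iterable, key=key), key=key)


def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks (the grouper() recipe)."""
    iters = [iter(iterable)] * n
    return it.zip_longest(*iters, fillvalue=fillvalue)


def read_events(csvfile, _strptime=datetime.strptime):
    def _median(times):
        return statistics.median((_strptime(time, '%M:%S:%f').time()
                                  for time in times))

    fieldnames = ['Event', 'Name', 'Stroke']
    with open(csvfile) as infile:
        reader = csv.DictReader(infile, fieldnames=fieldnames, restkey='Times')
        next(reader)  # Skip the header row.
        for row in reader:
            yield Event(row['Stroke'], row['Name'], _median(row['Times']))


events = tuple(read_events('swimmers.csv'))

for stroke, evts in sort_and_group(events, key=lambda evt: evt.stroke):
    events_by_name = sort_and_group(evts, key=lambda evt: evt.name)
    best_times = (min(evt) for _, evt in events_by_name)
    sorted_by_time = sorted(best_times, key=lambda evt: evt.time)
    teams = zip(('A', 'B'), it.islice(grouper(sorted_by_time, 4), 2))
    for team, swimmers in teams:
        print('{stroke} {team}: {names}'.format(
            stroke=stroke.capitalize(),
            team=team,
            names=', '.join(swimmer.name for swimmer in swimmers)
        ))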

If you run the above code, you’ll get the following output:

Backstroke A: Sophia, Grace, Penelope, Addison
Backstroke B: Elizabeth, Audrey, Emily, Aria
Breaststroke A: Samantha, Avery, Layla, Zoe
Breaststroke B: Lillian, Aria, Ava, Alexa
Butterfly A: Audrey, Leah, Layla, Samantha
Butterfly B: Alexa, Zoey, Emma, Madison
Freestyle A: Aubrey, Emma, Olivia, Evelyn
Freestyle B: Elizabeth, Zoe, Addison, Madison

Where to Go From Here

If you have made it this far, congratulations! I hope you have enjoyed the journey.

itertools is a powerful module in the Python standard library, and an essential tool to have in your toolkit. With it, you can write faster and more memory efficient code that is often simpler and easier to read (although that is not always the case, as you saw in the section on second order recurrence relations).

If anything, though, itertools is a testament to the power of iterators and lazy evaluation. Even though you have seen many techniques, this article only scratches the surface.

So I guess this means your journey is only just beginning.

Free Bonus: Click here to get our itertools cheat sheet that summarizes the techniques demonstrated in this tutorial.

In fact, this article skipped two itertools functions: starmap() and compress(). In my experience, these are two of the lesser used itertools functions, but I urge you to read their docs and experiment with your own use cases!

Here are a few places where you can find more examples of itertools in action (thanks to Brad Solomon for these fine suggestions):

  • What is the Purpose of itertools.repeat()?
  • Fastest Way to Generate a Random-like Unique String With Random Length in Python 3
  • Write a Pandas DataFrame to a String Buffer with Chunking

Finally, for even more tools for constructing iterators, take a look at more-itertools.

Do you have any favorite itertools recipes/use-cases? We would love to hear about them in the comments!

Translated from: https://www.pybloggers.com/2018/05/itertools-in-python-3-by-example/
