Data Structures and Algorithms in Python

To Leonardo da Vinci

Introduction

The purpose of this article is to give you a panorama of data structures and algorithms in Python. This topic matters to data scientists because it helps them design and solve machine learning problems more effectively.

Together, with practical examples, we will go through the built-in data structures and the user-defined data structures. Last but not least, I will introduce some algorithms: traversal, sorting, and searching algorithms.

So, let’s get started!

Part I: Built-in Data Structures

As the name suggests, data structures allow us to organize, store, and manage data for efficient access and modification.

In this part, we look at Python's four built-in data structures: list, tuple, set, and dictionary.

List

A list is defined using square brackets and holds items separated by commas. Lists are mutable and ordered, and they can contain a mix of different data types.
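
The original code for this example did not survive, so here is a minimal sketch of indexing and in-place modification. The variable name `months` and the exact statements are assumptions, chosen to be consistent with the output shown next.

```python
# Hypothetical reconstruction: the name `months` is an assumption.
months = ['january', 'february', 'march', 'april', 'may', 'june', 'july']

print(months[0])   # indexing starts at 0, so this is the first element
print(months)

# Lists are mutable: replace an element and extend the list in place.
months[0] = 'birthday'
months.extend(['august', 'september', 'october', 'november', 'december'])
print(months)
```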

out:

january
['january', 'february', 'march', 'april', 'may', 'june', 'july']
['birthday', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september', 'october', 'november', 'december']

Below are some useful functions and methods for working with lists.
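
As a hedged illustration, the sketch below uses two standard operations, `str.join` and `list.append`. The variable names `question` and `artists` are assumptions, not the author's original code.

```python
# str.join concatenates a list of strings, placing the separator between them.
question = "-".join(["Who", "is", "your", "favourite", "artist", "?"])
print(question)

# list.append adds a single element to the end of the list, in place.
artists = ['Chagall', 'Kandinskij', 'Dalí', 'da Vinci', 'Picasso', 'Warhol']
artists.append('Basquiat')
print(artists)
```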

out:

Whatisyourfavouritepainting?
Who-is-your-favourite-artist-?

out:

['Chagall', 'Kandinskij', 'Dalí', 'da Vinci', 'Picasso', 'Warhol', 'Basquiat']

Tuple

A tuple is another container: a data type for immutable, ordered sequences of elements. Immutable means that you cannot add or remove elements, or sort the tuple in place.
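
A minimal sketch of tuple unpacking and immutability follows. The names `dimensions`, `length`, `width`, and `height` are assumptions made for illustration.

```python
dimensions = (7, 3, 1)               # a tuple literal; the parentheses are optional
length, width, height = dimensions   # tuple unpacking assigns each element to a name
message = "The dimensions are {} x {} x {}".format(length, width, height)
print(message)

# Tuples are immutable: item assignment raises a TypeError.
try:
    dimensions[0] = 10
except TypeError as err:
    print("cannot modify a tuple:", err)
```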

out:

The dimensions are 7 x 3 x 1

Set

A set is a mutable, unordered collection of unique elements. It lets us quickly remove duplicates from a list.
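
The sketch below shows deduplication, a membership test, and `set.pop`. The input values and names are assumptions. Because sets are unordered, `pop` removes an arbitrary element, so the element it returns can differ from run to run.

```python
numbers = [1, 2, 6, 3, 1, 1, 5]
unique_numbers = set(numbers)     # duplicates are dropped
print(unique_numbers)

artists = {'Chagall', 'Kandinskij', 'Dalí', 'da Vinci', 'Picasso', 'Warhol', 'Basquiat'}
print('Turner' in artists)        # membership test on a set

removed = artists.pop()           # removes and returns an arbitrary element
print(removed)
```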

out:

{1, 2, 3, 5, 6}
False
Basquiat

Dictionary

A dictionary is a mutable data structure that stores pairs of items (keys and values). Since Python 3.7 a dictionary preserves insertion order, but its elements are accessed by key rather than by position.

A dictionary can hold other containers as values, which makes it possible to create compound data structures.
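
Here is a small sketch of a nested dictionary. The dictionary name `music` and its keys are hypothetical, chosen only to illustrate lookups in a compound structure.

```python
# Hypothetical data: a dictionary whose values are themselves dictionaries.
music = {
    'jazz': {'composer': 'Duke Ellington', 'title': 'In a Sentimental Mood'},
    'classical': {'composer': 'Mozart', 'title': 'Lacrimosa'},
}

# Chain the keys to reach a value in the inner dictionary.
print(music['jazz']['title'])

# dict.get returns None instead of raising KeyError when a key is missing.
print(music.get('classical')['title'])
print(music.get('rock'))
```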

out:

In a Sentimental Mood
Lacrimosa

Part II: User-Defined Data Structures

Now I will introduce three user-defined data structures: the stack, the queue, and the tree. I assume that you have a basic knowledge of classes and functions.

Stack using arrays

The stack is a linear data structure in which elements are arranged sequentially. It follows the L.I.F.O. mechanism, which means last in, first out: the last element inserted is the first one removed. The operations are:

Push → inserting an element into the stack
Pop → deleting an element from the stack

The conditions to check:

overflow condition → occurs when we try to put one more element onto a stack that already holds its maximum number of elements.
underflow condition → occurs when we try to delete an element from an empty stack.
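
The operations and conditions above can be sketched as a small class backed by a Python list. The class and method names, and the choice to report overflow and underflow as returned strings, are assumptions rather than the author's original code.

```python
class Stack:
    """A fixed-capacity stack backed by a Python list (illustrative sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def is_full(self):
        return len(self.items) == self.capacity

    def is_empty(self):
        return len(self.items) == 0

    def push(self, item):
        # Overflow: the stack already holds `capacity` elements.
        if self.is_full():
            return "overflow"
        self.items.append(item)
        return None

    def pop(self):
        # Underflow: there is nothing to remove.
        if self.is_empty():
            return "underflow"
        return self.items.pop()   # the last element in is the first out


stack = Stack(5)
for value in [10, 23, 25, 27, 11]:
    stack.push(value)

print(len(stack.items))
print(stack.is_full())
print(stack.items)
print(stack.push(31))             # the stack is full, so this reports overflow
popped = [stack.pop() for _ in range(5)]
print(popped)
print(stack.pop())                # the stack is empty, so this reports underflow
```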

out:

5
True
[10, 23, 25, 27, 11]
overflow
11
27
25
23
10
underflow

Queue using arrays

The queue is a linear data structure in which elements are arranged sequentially. It follows the F.I.F.O. mechanism, which means first in, first out. Think of going to the cinema with your friends: the first of you to hand over a ticket is also the first to step out of the line. The mechanism of the queue is the same.

The following aspects characterize a queue.

Two ends:

front → points to the first element
rear → points to the last element

There are two operations:

enqueue → inserting an element into the queue. This is done at the rear.
dequeue → deleting an element from the queue. This is done at the front.

There are two conditions:

overflow → insertion into a queue that is full
underflow → deletion from an empty queue
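
A minimal sketch of a list-backed queue with these operations and conditions follows. The class name, method names, and capacity are assumptions. Note that `list.pop(0)` shifts every remaining element, so `collections.deque` is usually a better fit for large queues.

```python
class Queue:
    """A fixed-capacity queue backed by a Python list (illustrative sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def enqueue(self, item):
        # Overflow: the queue is already full.
        if len(self.items) == self.capacity:
            return "overflow"
        self.items.append(item)      # insert at the rear
        return None

    def dequeue(self):
        # Underflow: the queue is empty.
        if not self.items:
            return "underflow"
        return self.items.pop(0)     # remove from the front


queue = Queue(4)
for value in [2, 3, 4, 5]:
    queue.enqueue(value)
print(queue.items)

first = queue.dequeue()              # the first element in is the first out
print(queue.items)
```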

out:

[2, 3, 4, 5]
[3, 4, 5]

Tree (general tree)

Trees are used to define hierarchy. A tree starts with the root node and branches downward through child nodes; the nodes at the bottom, which have no children, are called leaf nodes.

In this article, I focus on the binary tree. A binary tree is a tree data structure in which each node has at most two children, referred to as the left child and the right child. The example uses a class called Node, whose objects represent the different nodes (A, B, C, D, and E).
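
Here is a minimal sketch of such a Node class. The attribute names `value`, `left`, and `right` are assumptions, and the shape of the tree (A at the root, B and C as its children, D and E as the children of B) is inferred from the traversal outputs later in the article.

```python
class Node:
    """A binary tree node holding a value and up to two children."""

    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


# Build the tree:      A
#                     / \
#                    B   C
#                   / \
#                  D   E
root = Node('A')
root.left = Node('B')
root.right = Node('C')
root.left.left = Node('D')
root.left.right = Node('E')

print(root.value, root.left.value, root.right.value)
```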

There are other user-defined data structures as well, such as linked lists and graphs.

Part III: Algorithms

The concept of the algorithm has existed since antiquity. In fact, the ancient Egyptians used algorithms to solve their problems. Then they taught this approach to the Greeks.

The word algorithm derives from the name of the 9th-century Persian mathematician Muḥammad ibn Mūsā al-Khwārizmī, whose name was Latinized as Algorithmi. Al-Khwārizmī was also an astronomer, a geographer, and a scholar in the House of Wisdom in Baghdad.

As you already know, an algorithm is a finite, ordered sequence of instructions for solving a problem.

When we write an algorithm, we have to know exactly what the problem is, determine where we need to start and stop, and formulate the intermediate steps.

There are three main approaches to designing algorithms:

Divide et Impera (also known as divide and conquer) → divides the problem into sub-parts and solves each one separately
Dynamic programming → divides the problem into sub-parts, remembers the results of the sub-parts, and reuses them for similar ones
Greedy algorithms → take the easiest step at each point while solving a problem, without worrying about the complexity of future steps
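
To make the dynamic programming idea concrete, here is a small sketch that remembers the results of sub-problems using `functools.lru_cache`. The Fibonacci problem is my own choice of illustration and does not come from the original article.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """Return the n-th Fibonacci number, caching each sub-result."""
    if n < 2:
        return n
    # Each fib(k) is computed once; later calls reuse the remembered value.
    return fib(n - 1) + fib(n - 2)


print(fib(30))
```

Without the cache, the same sub-problems would be recomputed many times; with it, each value from 0 to n is computed only once.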

Tree Traversal Algorithm

Trees are non-linear data structures, characterized by a root and nodes. I reuse the class I constructed earlier for the binary tree.

Tree Traversal refers to visiting each node present in the tree exactly once, in order to update or check them.

There are three types of tree traversals:

In-order traversal → visits the left subtree first, then the root, and then the right subtree.

Here D is the leftmost node, and its nearest root is B. To the right of root B is E. The left subtree is now complete, so I move up to the root node A and then to node C.
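
The walk described above can be sketched recursively. The Node class and the function name `in_order` are assumptions, and the tree shape is inferred from the text.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def in_order(node):
    """Return the values in left subtree, root, right subtree order."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


# A is the root, B and C are its children, D and E are the children of B.
root = Node('A', Node('B', Node('D'), Node('E')), Node('C'))
in_order_result = "".join(in_order(root))
print(in_order_result)
```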

out:

DBEAC

Pre-order traversal → visits the root node first, then the left subtree, and then the right subtree.

In this case, I start at the root node A, move to its left child B, and then to B's left child D. After that I visit node E and then C.
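
A recursive sketch of this order, under the same assumptions about the Node class and the tree shape (the function name `pre_order` is hypothetical):

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def pre_order(node):
    """Return the values in root, left subtree, right subtree order."""
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)


# A is the root, B and C are its children, D and E are the children of B.
root = Node('A', Node('B', Node('D'), Node('E')), Node('C'))
pre_order_result = "".join(pre_order(root))
print(pre_order_result)
```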

out:

ABDEC

Post-order traversal → visits the left subtree first, then the right subtree, and finally the root node.

I go to the leftmost node, D, and then to its sibling on the right, E. Then I visit their parent B, followed by the right node C. Finally, I reach the root node A.
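
The same recursive pattern gives a post-order sketch. As before, the Node class, the function name `post_order`, and the tree shape are assumptions.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def post_order(node):
    """Return the values in left subtree, right subtree, root order."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]


# A is the root, B and C are its children, D and E are the children of B.
root = Node('A', Node('B', Node('D'), Node('E')), Node('C'))
post_order_result = "".join(post_order(root))
print(post_order_result)
```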

out:

DEBCA

Sorting Algorithm

A sorting algorithm arranges data in a given order. Here we look at Merge Sort, Bubble Sort, and Insertion Sort.

Merge Sort → follows the divide et impera rule. The given list is first divided into smaller lists; adjacent sublists are then compared and merged back together in the desired order. In summary, unordered elements go in as input and ordered elements come out as output.
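
A minimal sketch of merge sort follows, with each step commented. The function name and the choice to return a new list rather than sort in place are assumptions.

```python
def merge_sort(values):
    """Return a new list with the elements of `values` in ascending order."""
    # Base case: a list of zero or one element is already sorted.
    if len(values) <= 1:
        return list(values)

    # Divide: split the list in half and sort each half recursively.
    mid = len(values) // 2
    left = merge_sort(values[:mid])
    right = merge_sort(values[mid:])

    # Conquer: merge the two sorted halves by repeatedly taking the smaller head.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1

    # One half is exhausted; append whatever remains of the other.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


unordered = [15, 1, 19, 93]
print(merge_sort(unordered))
```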

out:

input - unordered elements: 15 1 19 93
output - ordered elements: [1, 15, 19, 93]

Bubble Sort → compares adjacent elements and swaps them if they are not in the specified order.
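
Here is a short sketch of bubble sort. The function name and the input list are assumptions; the early exit when a pass makes no swaps is a common optimization, not something stated in the article.

```python
def bubble_sort(values):
    """Return a new list sorted in ascending order using bubble sort."""
    items = list(values)
    # After each pass, the largest remaining element has "bubbled" to `end`.
    for end in range(len(items) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if items[i] > items[i + 1]:
                # Adjacent elements are out of order, so swap them.
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:
            break   # no swaps in this pass: the list is already sorted
    return items


print(bubble_sort([9, 15, 1, 3]))
```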

out:

[1, 3, 9, 15]

Insertion Sort → picks one item of the given list at a time and places it at the exact spot where it belongs.
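
A minimal sketch of insertion sort, assuming ascending order; the function name and the input list are hypothetical.

```python
def insertion_sort(values):
    """Return a new list sorted in ascending order using insertion sort."""
    items = list(values)
    for i in range(1, len(items)):
        current = items[i]          # the item being placed
        j = i - 1
        # Shift larger elements of the sorted prefix one step to the right.
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current      # drop the item into its spot
    return items


print(insertion_sort([9, 15, 1, 3]))
```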

out:

[1, 3, 9, 15]

Other sorting algorithms include Selection Sort and Shell Sort.

Searching Algorithms

Searching algorithms are used to look for elements in a given dataset. There are many types of search algorithms, such as Linear Search, Binary Search, Exponential Search, and Interpolation Search. In this section, we will look at Linear Search and Binary Search.

Linear Search → in a one-dimensional array, we search for a particular key element. The input is the group of elements and the key element we want to find, so we compare the key with each element of the group in turn. In the following code, I look for element 27 in our list.
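
As a hedged sketch, a linear search can be written as below. The function name, the sample list, and the choice to return the index (or `None` when the key is absent) are assumptions and may differ from the author's original code.

```python
def linear_search(values, key):
    """Return the index of the first element equal to `key`, or None if absent."""
    for index, value in enumerate(values):
        if value == key:
            return index
    return None


numbers = [10, 23, 25, 11]          # hypothetical list that does not contain 27
result = linear_search(numbers, 27)
print(result if result is not None else "key is not in the list")
```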

out:

'not found'

Binary Search → this algorithm assumes that the list is in ascending order. If the search key is less than the element in the middle of the list, we narrow the interval to the lower half; otherwise, we narrow it to the upper half. We continue until the value is found or the interval is empty.
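
The narrowing procedure above can be sketched iteratively. The function name, the sample list, and the boolean return value are assumptions.

```python
def binary_search(sorted_values, key):
    """Return True if `key` is in `sorted_values` (which must be ascending)."""
    low, high = 0, len(sorted_values) - 1
    while low <= high:                  # the interval is not yet empty
        mid = (low + high) // 2
        if sorted_values[mid] == key:
            return True
        if key < sorted_values[mid]:
            high = mid - 1              # keep the lower half
        else:
            low = mid + 1               # keep the upper half
    return False


numbers = [1, 3, 9, 15, 19, 93]         # hypothetical sorted input
print(binary_search(numbers, 4))
print(binary_search(numbers, 15))
```

Each step halves the interval, so a list of n elements needs about log2(n) comparisons, whereas linear search needs up to n.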

out:

False
True

Conclusion

Now you have an overview of data structures and algorithms, so you can start building a deeper understanding of algorithms.

The beautiful image of the Vitruvian Man I have chosen for this article is no accident. The drawing relates the proportions of the ideal human body to geometry. In fact, for this representation, Leonardo da Vinci was inspired by Vitruvius, who described the human body as the ideal model for determining correct proportions in architecture.

As far as algorithms are concerned, the Vitruvian Man hides a secret algorithm used by artists for centuries to certify that their works were inspired by the divine proportion.

Sometimes I like to think that maybe Leonardo da Vinci, through his wonderful works, wanted to define the most important algorithm of all: the algorithm of life.

Translated from: https://towardsdatascience.com/data-structures-algorithms-in-python-68c8dbb19c90
