kinesis

Part 1 of a multi-part series on analysing Flanders’ traffic while leveraging the power of cloud components!


TL;DR — Using Kinesis Data Analytics to get real-time insights into Flanders’ traffic situation!


Table of Contents

  • Why Real-time Data Analytics?

  • Analysing Flanders' Traffic !?

  • Big Picture Architecture

  • Kinesis Data Analytics: Nuts And Bolts

  • Record preprocessor

  • Real time analytics with Streams and Pumps

  • Windowed queries

  • Using static reference data

  • Key Takeaways

Why Real-time Data Analytics?

Real-time analytics allows businesses to react on the spot: seize opportunities as they occur, avoid problems before they happen, or send out alerts immediately when necessary.


Traditionally, batch-processing analytics was generally considered adequate as most BI users could meet their business goals by looking at weekly or monthly business numbers.


Recently however, a shift has occurred towards low latency real-time analytics, where data is processed as it arrives. Driving this shift are several factors such as increased access to data sources and improved computing power resulting from cloud computing.


Examples where businesses could benefit from real-time analytics are:


  • The tracking of client behaviour on websites (a visitor who is hesitating to convert may be persuaded to buy by a final incentive).
  • Viewing orders as they happen, allowing for faster identification of and reaction to trends.
  • The monitoring of traffic data to improve traffic flows or generate alerts for traffic jams.

How Can We Achieve Real-Time Analytics?

We can use AWS Kinesis Data Analytics to author SQL code that continuously reads and processes data in near real time. When using data in real time we speak of the hot path for the data. "Hot", meaning that you want to do something with the data while it is still hot from just being produced.


For your hot path you want to think about your data access patterns upfront!


Correctly defining your data access patterns upfront (and structuring your data accordingly) is extremely important. It ensures the data you have available is compatible with the manner in which it is to be used, allowing for (near) real-time reactions to it. (Note: by data access pattern we mean the way you will interact with your data: which fields you will query on, which attributes you want to extract from nested data structures, and so on.)


Analysing Flanders' Traffic !?

For this blog post, we would like to explore the following practical example. In Flanders, the “Flemish Traffic Institute” is continuously monitoring traffic on the highways. The visualisation directly below shows all of the different measurement locations, where traffic is monitored.


All traffic measurement locations in Flanders

This data is made available, on a per minute basis, via an API. For a couple of critical locations in Flanders, we endeavored to set-up the following:


  • Send out an alert when a traffic jam occurs.
  • Analyse whether a traffic jam is emerging or disappearing.
  • Get a real-time status of the current situation.

Big Picture Architecture

The government publishes the traffic data every minute as a big blob of XML data, containing the information for all 4500 measurement locations in Flanders. We immediately convert this data into JSON format and divide it into per-location measurement events. This preprocessing is achieved using AWS Lambda. As the focus of this blog post is to discuss real-time analytics, we will not go deeper into the particulars of how we used Lambda to accomplish this.


The per-location measurement events are then streamed over Firehose. This Firehose is used as an input for our Kinesis Data Analytics application, which will provide real-time insights. Next, the results of our real-time analytics with Kinesis Data Analytics are sent to a Kinesis Data Stream, which can then be used by a Lambda to, for example, generate traffic jam alerts or save the results in DynamoDB.


The format of the data arriving on Firehose is shown below. For the non-native Dutch readers, this data contains:


  • A timestamp telling us when the measurement was taken.
  • A unique ID for the measurement location, which can be linked to a physical place in Flanders.
  • Information about the status of the measurement sensor.
  • Information about the speed of vehicles of certain classes. These classes represent the type of vehicle, e.g. motorcycle, truck, car. For future reference, remember that class2 is the class representing the cars.

Kinesis Data Analytics: Nuts And Bolts

Source / Incoming Data

Let’s dig deeper into the architecture. We’ll start with the source for our analytics application, which is a Kinesis Firehose stream.


Kinesis Firehose is a near real-time serverless service that can load data into your data lake or analytics tool and scales automatically.


Let’s dissect that definition:


  • Near real-time: data arrives on the stream and is flushed towards the stream's destination at a minimum interval of 60 seconds or once 1 MiB has been buffered.
  • Serverless: you don't have to manage this stream yourself.
  • Scales automatically: don't worry about sharding your stream; AWS takes care of this for you.

It is important to note that there are two main options to stream your data: either Kinesis Firehose or Kinesis Data Stream. We decided to use Kinesis Firehose, as we did not wish to handle sharding up or down ourselves, as is required when using Kinesis Data Stream. Firehose also allows 5000 writes per second, where Data Streams will throttle you at 1000 writes per second (per shard). Firehose comes with the extra advantage that it can land your original data on S3, allowing you to build a data lake for batch processing later on. The flip side is that Firehose makes you near real-time instead of real-time.


If you would like to know more about Firehose vs Data Streams visit this page on the Lumigo Blog.


The Kinesis Firehose/Data Stream that you choose as your input is your Streaming source. You point this streaming source to an in-application stream that is automatically created and will be named SOURCE_SQL_STREAM_001 by AWS.


Analytics

Now we dive into the heart of our real-time analytics flow, namely Kinesis Data Analytics.


Kinesis Data Analytics is a way to analyze streaming data in real-time using SQL or integrated Java applications. https://aws.amazon.com/kinesis/data-analytics/


In this case we chose to use SQL to write our real-time analytics. In our Analytics Application we'll use the Firehose as the source for our application.


Notice that:


  • The incoming data from Firehose will be available on the in-application stream named SOURCE_SQL_STREAM_001. As mentioned above, this stream is named by AWS.

  • There is a record preprocessor allowing you to process or filter the data before it enters the SQL analytics app.

Record preprocessor

This is a Lambda Function which receives batches of events and can transform them, drop them, or let them pass, on a record-by-record basis. The pseudo code below shows what the Lambda does:

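A minimal Python sketch of such a preprocessor. The record envelope (base64-encoded `data` plus a `recordId`, returned with a `result` of `Ok` or `Dropped`) is the standard Kinesis Data Analytics preprocessing contract; the payload field names (`sensorAvailable`, the per-class `...Class2` suffix) are illustrative assumptions, not the actual field names of the traffic API:

```python
import base64
import json

KEEP_SUFFIX = "Class2"  # class 2 represents cars; we keep only this class


def dropped_or_okay(payload):
    # Drop records from sensors that report themselves as unavailable
    # (assumed field name).
    return "Dropped" if payload.get("sensorAvailable") == "false" else "Ok"


def preprocess_payload(payload):
    # Remove the per-class measurement fields of every class except class 2,
    # so we don't store and process unnecessary data downstream.
    return {k: v for k, v in payload.items()
            if "Class" not in k or k.endswith(KEEP_SUFFIX)}


def handler(event, context):
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        data = base64.b64encode(
            json.dumps(preprocess_payload(payload)).encode()).decode()
        output.append({"recordId": record["recordId"],
                       "result": dropped_or_okay(payload),
                       "data": data})
    return {"records": output}
```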

We see that the Lambda uses the dropped_or_okay() method to filter records. A record that should be dropped gets the result Dropped; one that may pass gets the result Ok.


The preprocess_payload method is used to modify the payload. In the preprocess_payload method I remove some unnecessary fields from the payload. In our case, we are only interested in cars (vehicle class 2 in our payload), so we remove the data for the other vehicle classes in order to avoid storing and having to deal with unnecessary data.


Real time analytics with Streams and Pumps

Kinesis Data Analytics allows you to do real time analytics using SQL concepts.


Before I dive into the actual queries, there are two very important concepts to discuss:


  • In-application-SQL-Streams

    As mentioned above, the streaming source (a Kinesis Firehose in our case) is mapped to an in-application stream named SOURCE_SQL_STREAM_001. In-application streams are Amazon Kinesis Data Analytics concepts. The stream continuously receives data from your source. Think of it as basically a table that you can also query using SQL. Since a continuous stream of data is flowing over it, we call it an in-application stream.

  • In-application-SQL-pumps

    You can actually create multiple in-application streams. This means you need a way to move and insert data or query results from one stream to another. This is done by an SQL pump. AWS puts it as follows: "A pump is a continuously running insert query that moves data from one in-application stream to another in-application stream." (source: AWS docs)

In-application SQL streams and SQL pumps are the core concepts of a Kinesis Data Analytics Application.


Let’s see a basic example of what that looks like. Remember the structure of the streamed events shown above and the name of the source in-application stream: SOURCE_SQL_STREAM_001.


  1. I create an intermediate in-application stream INCOMING_STREAM.

  2. I create a pump to insert data towards the intermediate stream.

  3. I define the query that will yield the results which will be pumped towards the intermediate stream.
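The three steps above could be sketched as follows. Only the names INCOMING_STREAM and SOURCE_SQL_STREAM_001 come from the text; the column names and types are assumptions:

```sql
-- 1. The intermediate in-application stream (column names/types assumed).
CREATE OR REPLACE STREAM "INCOMING_STREAM" (
    "measurement_time" TIMESTAMP,
    "location_id"      INTEGER,
    "speed_class2"     INTEGER
);

-- 2 + 3. A pump: a continuously running INSERT query that moves the
-- selected columns from the source stream into the intermediate stream.
CREATE OR REPLACE PUMP "INCOMING_PUMP" AS
    INSERT INTO "INCOMING_STREAM"
    SELECT STREAM "measurement_time", "location_id", "speed_class2"
    FROM "SOURCE_SQL_STREAM_001";
```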

Windowed queries

So, now we know about (intermediary) in-application streams and the pumps which move data between those streams. Let’s now have a look at how we can put a window on our stream and aggregate results within that window. There are two kinds of windows, time-based and row-based, and three types: stagger, tumbling and sliding windows.


Concerning time- and row-based windows, the names say it all: you specify the window size either in terms of time or in number of rows.


Different types of windows:


  • Sliding Windows

    A continuously aggregating query, using a fixed time or row-count interval.

  • Tumbling Windows

    A continuously aggregating query, using definite time-based windows that open and close at regular intervals.

  • Staggering Windows

    Stagger windows can help with use cases where related records do not fall into the same (by ROWTIME) time-restricted window, a challenge which you cannot solve when using tumbling windows.

A detailed explanation and example of these windows can be found in the AWS docs. Originally it was hard for me to remember the syntax of each of these windows. Let me show you how you can recognise each type:


  • Sliding Windows

... WINDOW W1 AS (PARTITION BY ... RANGE INTERVAL 'x' MINUTE PRECEDING

  • Tumbling Windows

... GROUP BY ... , STEP("YOUR_IN_APPLICATION_STREAM".ROWTIME BY INTERVAL 'x' MINUTE)

  • Staggering Windows

... WINDOWED BY STAGGER (PARTITION BY FLOOR(EVENT_TIME TO MINUTE), ... RANGE INTERVAL 'x' MINUTE)

In our application we use the sliding window to find out what the average speed over the last x minutes was. Below you can recognise three windows, indicating the last 10 minutes, the last 2 minutes and the current timestamp.

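A sketch of such a sliding-window query, reusing the INCOMING_STREAM from earlier. The raw speed column gives the current value, while the two WINDOW clauses average over the preceding 10 and 2 minutes; stream and column names are illustrative assumptions:

```sql
CREATE OR REPLACE STREAM "AVERAGE_SPEED_STREAM" (
    "location_id"   INTEGER,
    "current_speed" INTEGER,
    "avg_speed_2m"  DOUBLE,
    "avg_speed_10m" DOUBLE
);

CREATE OR REPLACE PUMP "AVERAGE_SPEED_PUMP" AS
    INSERT INTO "AVERAGE_SPEED_STREAM"
    SELECT STREAM "location_id",
        "speed_class2"               AS "current_speed",
        AVG("speed_class2") OVER W2  AS "avg_speed_2m",
        AVG("speed_class2") OVER W10 AS "avg_speed_10m"
    FROM "INCOMING_STREAM"
    WINDOW
        -- Sliding windows: one per location, looking back 2 and 10 minutes.
        W2  AS (PARTITION BY "location_id" RANGE INTERVAL '2'  MINUTE PRECEDING),
        W10 AS (PARTITION BY "location_id" RANGE INTERVAL '10' MINUTE PRECEDING);
```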

When using a timestamp to start your window from, you can only use either ROWTIME or APPROXIMATE_ARRIVAL_TIME:


  • ROWTIME represents the time at which the event was inserted into the first in-application stream.

  • APPROXIMATE_ARRIVAL_TIME represents the time at which the event was added to the streaming source, that is, the source which is feeding data towards your Kinesis analytics application.

!! You cannot window by a timestamp that originates from a field in your event. This actually makes sense: since you are working with real-time data, the data should arrive in real time!


Using the LAG operator we can look back in our window and access the data of the previous event(s).


In the following example, I use LAG to look back in the current sliding window and extract the speed from the previous event. This allows me to output a new event containing both the current speed and the previous speed.

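Such a LAG query could look roughly like this (stream and column names are assumptions, reusing the INCOMING_STREAM from earlier):

```sql
CREATE OR REPLACE STREAM "SPEED_CHANGE_STREAM" (
    "location_id"    INTEGER,
    "current_speed"  INTEGER,
    "previous_speed" INTEGER
);

CREATE OR REPLACE PUMP "SPEED_CHANGE_PUMP" AS
    INSERT INTO "SPEED_CHANGE_STREAM"
    SELECT STREAM "location_id",
        "speed_class2"                 AS "current_speed",
        -- LAG(expr, 1) looks one event back within the window.
        LAG("speed_class2", 1) OVER W1 AS "previous_speed"
    FROM "INCOMING_STREAM"
    WINDOW W1 AS (PARTITION BY "location_id"
                  RANGE INTERVAL '10' MINUTE PRECEDING);
```

Comparing current_speed with previous_speed is what lets us judge whether a traffic jam is emerging or disappearing.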

Using static reference data

You can add reference data that you can use to enrich the query results of your application.


In our case the data of our application already contains an ID for the place where the data was measured. The names of the places themselves are not included in the data. However, the ID of a place is statically linked to the name of that measurement location and can thus be found using reference data. This reference data has the following format:


It’s a tab-delimited CSV file; you can also use another separator. This file must be located on S3. When your application starts, it reads the file from S3 and makes the data available as a table that you can use in your queries. Below, I join a query result on location ID to retrieve the name of the measurement location from the reference data.

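Such a join could be sketched as follows; "LOCATION_REFERENCE" stands for the in-application reference table backed by the CSV on S3, and all column names are assumptions:

```sql
CREATE OR REPLACE STREAM "NAMED_MEASUREMENT_STREAM" (
    "location_name" VARCHAR(64),
    "speed_class2"  INTEGER
);

CREATE OR REPLACE PUMP "NAMED_MEASUREMENT_PUMP" AS
    INSERT INTO "NAMED_MEASUREMENT_STREAM"
    -- Enrich each streamed measurement with the human-readable location
    -- name from the static reference table.
    SELECT STREAM ref."location_name", s."speed_class2"
    FROM "INCOMING_STREAM" AS s
    JOIN "LOCATION_REFERENCE" AS ref
      ON s."location_id" = ref."location_id";
```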

Your Kinesis analytics application outputs its results towards a destination.


We saw that our intermediary results are always pumped towards an in-application stream. To get these results out of our application we have to couple the in-application stream to an external data stream such as a Kinesis Firehose or a Kinesis Data Stream.


Mind that an in-application stream can only be coupled to one external data stream. If you want to output the same result towards two different destinations, you’ll have to create another in-application stream which receives the same data. That is also why you see two different in-application streams coupled to the two destinations in the architecture diagram.


What to do with the real time results

In the architecture diagram above, you will notice that the Kinesis data stream which receives the analytics results is coupled to a Lambda Function. That opens up possibilities: you could directly send out alerts based on the data that the function receives from the stream, or you could save the results in a real-time data store which you can query for the current situation at any time.


Here I chose the latter: I store the real-time results in DynamoDB. This table holds the current situation for each of the different measuring points in Belgium. I then provide an API through which a client can fetch the current traffic situation in Belgium for a certain point.


Another Lambda Function is listening on the change stream of this table. It monitors whether a traffic jam is present or not. If the traffic jam flag switches between True and False, we send out a Slack message to notify interested parties that a traffic jam has appeared or has dissolved.

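A minimal Python sketch of such an alerting Lambda, assuming the table's DynamoDB stream is configured with NEW_AND_OLD_IMAGES; the trafficJam and locationId attribute names and the Slack webhook URL are hypothetical:

```python
import json
import urllib.request

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/..."  # hypothetical


def flag_flipped(record):
    """Return the new value of the trafficJam flag if it changed, else None."""
    if record.get("eventName") != "MODIFY":
        return None
    old = record["dynamodb"]["OldImage"].get("trafficJam", {}).get("BOOL")
    new = record["dynamodb"]["NewImage"].get("trafficJam", {}).get("BOOL")
    return new if old != new else None


def handler(event, context):
    for record in event.get("Records", []):
        new_value = flag_flipped(record)
        if new_value is None:
            continue  # no change in the traffic jam status
        location = record["dynamodb"]["NewImage"]["locationId"]["S"]
        text = (f"Traffic jam appeared at {location}" if new_value
                else f"Traffic jam dissolved at {location}")
        req = urllib.request.Request(
            SLACK_WEBHOOK_URL,
            data=json.dumps({"text": text}).encode(),
            headers={"Content-Type": "application/json"})
        urllib.request.urlopen(req)
```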

Key Takeaways

Great, we just learned how we can use Kinesis Data Analytics to get real-time insights from our streamed data. In our case it gave us the possibility to get an on-demand view of the traffic jams in Belgium and to send out alerts for emerging traffic jams.


Kinesis Data Analytics is a great tool for real-time analytics. There are some knobs and twists which I think are really good to know!


Here are once again the key takeaways from this blog:


  • Completely separate the cold and hot flows of your data (real time vs batch).
  • Real-time data should be produced in real time and arrive in real time.
  • Think about your data access pattern upfront.
  • Mind the differences between Kinesis Firehose and Kinesis Data Streams when streaming your data.
  • Preprocess and/or filter your records before they go into your Kinesis analytics application by using a record preprocessor Lambda Function.
  • You can use windowing to aggregate or correlate results over a certain timespan.
  • You can only use the timestamps ROWTIME and APPROXIMATE_ARRIVAL_TIME.
  • Add static reference data to your application by making it available via S3.
  • The core SQL concepts of the Kinesis analytics app are SQL STREAMS and SQL PUMPS.

Get in touch with the authors:


  • Nick Van Hoof: https://twitter.com/TheNickVanHoof
  • David Smits

Triggered to team up with CloudWay? Interested in a collaboration? Don’t hesitate to get in touch and let’s talk!


Originally published at: https://medium.com/cloudway/real-time-data-processing-with-kinesis-data-analytics-ad52ad338c6d
