0. Data Lake Analytics(DLA)简介

关于Data Lake的概念,更多阅读可以参考:
https://en.wikipedia.org/wiki/Data_lake

以及AWS和Azure关于Data Lake的解读:
https://amazonaws-china.com/big-data/datalakes-and-analytics/what-is-a-data-lake/
https://azure.microsoft.com/en-us/solutions/data-lake/

终于,阿里云现在也有了自己的数据湖分析产品:https://www.aliyun.com/product/datalakeanalytics

可以点击申请使用(目前公测阶段还属于邀测模式,我们会尽快审批申请),体验本教程的TPC-H CSV数据格式的数据分析之旅。

产品文档:https://help.aliyun.com/product/70174.html

1. 开通Data Lake Analytics与OSS服务

如果您已经开通,可以跳过该步骤。如果没有开通,可以参考:https://help.aliyun.com/document_detail/70386.html
进行产品开通服务申请。

2. 下载TPC-H测试数据集

可以从这下载TPC-H 100MB的数据集:
https://public-datasets-cn-hangzhou.oss-cn-hangzhou.aliyuncs.com/tpch_100m_data.zip

3. 上传数据文件到OSS

登录阿里云官网的OSS控制台:https://oss.console.aliyun.com/overview
规划您要使用的OSS bucket,创建或选择好后,点击“文件管理”,因为有8个数据文件,为每个数据文件创建对应的文件目录:

创建好8个目录如下:

点击进入目录,上传相应的数据文件,例如,customer目录,则上传customer.tbl文件。

上传好后,如下图。然后,依次把其他7个数据文件也上传到对应的目录下。

至此,8个数据文件都上传到了您的OSS bucket中:

oss://xxx/tpch_100m/customer/customer.tbl
oss://xxx/tpch_100m/lineitem/lineitem.tbl
oss://xxx/tpch_100m/nation/nation.tbl
oss://xxx/tpch_100m/orders/orders.tbl
oss://xxx/tpch_100m/part/part.tbl
oss://xxx/tpch_100m/partsupp/partsupp.tbl
oss://xxx/tpch_100m/region/region.tbl
oss://xxx/tpch_100m/supplier/supplier.tbl

4. 登录Data Lake Analytics控制台

https://openanalytics.console.aliyun.com/
点击“登录数据库”,输入开通服务时分配的用户名和密码,登录Data Lake Analytics控制台。

5. 创建Schema和Table

输入创建SCHEMA的语句,点击“同步执行”。

CREATE SCHEMA tpch_100m with DBPROPERTIES(LOCATION = 'oss://test-bucket-julian-1/tpch_100m/',catalog='oss'
);

(注意:目前在同一个阿里云region,Data Lake Analytics的schema名全局唯一,建议schema名尽量根据业务定义,已有重名schema,在创建时会提示报错,则请换一个schema名字。)

Schema创建好后,在“数据库”的下拉框中,选择刚刚创建的schema。然后在SQL文本框中输入建表语句,点击同步执行。
建表语句语法参考:https://help.aliyun.com/document_detail/72006.html

TPC-H对应的8个表的建表语句如下,分别贴入文档框中执行(LOCATION子句中的数据文件位置请根据您的实际OSS bucket目录相应修改)。(注意:目前控制台中还不支持多个SQL语句执行,请单条语句执行。)

CREATE EXTERNAL TABLE nation (N_NATIONKEY INT, N_NAME STRING,N_ID STRING,N_REGIONKEY INT, N_COMMENT STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 'oss://test-bucket-julian-1/tpch_100m/nation';CREATE EXTERNAL TABLE lineitem (L_ORDERKEY INT, L_PARTKEY INT, L_SUPPKEY INT, L_LINENUMBER INT, L_QUANTITY DOUBLE, L_EXTENDEDPRICE DOUBLE, L_DISCOUNT DOUBLE, L_TAX DOUBLE, L_RETURNFLAG STRING, L_LINESTATUS STRING, L_SHIPDATE DATE, L_COMMITDATE DATE, L_RECEIPTDATE DATE, L_SHIPINSTRUCT STRING, L_SHIPMODE STRING, L_COMMENT STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 'oss://test-bucket-julian-1/tpch_100m/lineitem';CREATE EXTERNAL TABLE orders (O_ORDERKEY INT, O_CUSTKEY INT, O_ORDERSTATUS STRING, O_TOTALPRICE DOUBLE, O_ORDERDATE DATE, O_ORDERPRIORITY STRING, O_CLERK STRING, O_SHIPPRIORITY INT, O_COMMENT STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 'oss://test-bucket-julian-1/tpch_100m/orders';CREATE EXTERNAL TABLE supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, S_NATIONKEY INT, S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 'oss://test-bucket-julian-1/tpch_100m/supplier';CREATE EXTERNAL TABLE partsupp (PS_PARTKEY INT, PS_SUPPKEY INT, PS_AVAILQTY INT, PS_SUPPLYCOST DOUBLE, PS_COMMENT STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 'oss://test-bucket-julian-1/tpch_100m/partsupp';CREATE EXTERNAL TABLE customer (C_CUSTKEY INT, C_NAME STRING, C_ADDRESS STRING, C_NATIONKEY INT, C_PHONE STRING, C_ACCTBAL DOUBLE, C_MKTSEGMENT STRING, C_COMMENT STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 'oss://test-bucket-julian-1/tpch_100m/customer';CREATE EXTERNAL TABLE part (P_PARTKEY INT, P_NAME STRING, P_MFGR STRING, P_BRAND STRING, P_TYPE STRING, P_SIZE INT, P_CONTAINER STRING, P_RETAILPRICE DOUBLE, P_COMMENT STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 'oss://test-bucket-julian-1/tpch_100m/part';CREATE EXTERNAL TABLE region (R_REGIONKEY INT, R_NAME STRING, R_COMMENT STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 'oss://test-bucket-julian-1/tpch_100m/region';

建表完毕后,刷新页面,在左边导航条中能看到schema下的8张表。

6. 执行TPC-H查询

TPC-H总共22条查询,如下:
Q1:

SELECT   l_returnflag,l_linestatus,Sum(l_quantity)                                       AS sum_qty,Sum(l_extendedprice)                                  AS sum_base_price,Sum(l_extendedprice * (1 - l_discount))               AS sum_disc_price,Sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,Avg(l_quantity)                                       AS avg_qty,Avg(l_extendedprice)                                  AS avg_price,Avg(l_discount)                                       AS avg_disc,Count(*)                                              AS count_order
FROM     lineitem
WHERE    l_shipdate <= date '1998-12-01' - INTERVAL '93' day
GROUP BY l_returnflag,l_linestatus
ORDER BY l_returnflag,l_linestatus
LIMIT    1;

Q2:

SELECT   s_acctbal,s_name,n_name,p_partkey,p_mfgr,s_address,s_phone,s_comment
FROM     part,supplier,partsupp,nation,region
WHERE    p_partkey = ps_partkey
AND      s_suppkey = ps_suppkey
AND      p_size = 35
AND      p_type LIKE '%NICKEL'
AND      s_nationkey = n_nationkey
AND      n_regionkey = r_regionkey
AND      r_name = 'MIDDLE EAST'

Q3:

SELECT   l_orderkey,Sum(l_extendedprice * (1 - l_discount)) AS revenue,o_orderdate,o_shippriority
FROM     customer,orders,lineitem
WHERE    c_mktsegment = 'AUTOMOBILE'
AND      c_custkey = o_custkey
AND      l_orderkey = o_orderkey
AND      o_orderdate < date '1995-03-31'
AND      l_shipdate >  date '1995-03-31'
GROUP BY l_orderkey,o_orderdate,o_shippriority
ORDER BY revenue DESC,o_orderdate
LIMIT    10;

Q4:

SELECT   o_orderpriority,Count(*) AS order_count
FROM     orders,lineitem
WHERE    o_orderdate >= date '1997-10-01'
AND      o_orderdate <  date '1997-10-01' + INTERVAL '3' month
AND      l_orderkey = o_orderkey
AND      l_commitdate < l_receiptdate
GROUP BY o_orderpriority
ORDER BY o_orderpriority
LIMIT    1;

Q5:

SELECT   n_name,Sum(l_extendedprice * (1 - l_discount)) AS revenue
FROM     customer,orders,lineitem,supplier,nation,region
WHERE    c_custkey = o_custkey
AND      l_orderkey = o_orderkey
AND      l_suppkey = s_suppkey
AND      c_nationkey = s_nationkey
AND      s_nationkey = n_nationkey
AND      n_regionkey = r_regionkey
AND      r_name = 'ASIA'
AND      o_orderdate >= date '1995-01-01'
AND      o_orderdate <  date '1995-01-01' + INTERVAL '1' year
GROUP BY n_name
ORDER BY revenue DESC
LIMIT    1;

Q6:

SELECT sum(l_extendedprice * l_discount) AS revenue
FROM lineitem
WHERE l_shipdate >= date '1995-01-01'
AND l_shipdate < date '1995-01-01' + interval '1' year
AND l_discount between 0.04 - 0.01 AND 0.04 + 0.01
AND l_quantity < 24
LIMIT 1;

Q7:

SELECT   supp_nation,cust_nation,l_year,Sum(volume) AS revenue
FROM     (SELECT n1.n_name                          AS supp_nation,n2.n_name                          AS cust_nation,Extract(year FROM l_shipdate)      AS l_year,l_extendedprice * (1 - l_discount) AS volumeFROM   supplier,lineitem,orders,customer,nation n1,nation n2WHERE  s_suppkey = l_suppkeyAND    o_orderkey = l_orderkeyAND    c_custkey = o_custkeyAND    s_nationkey = n1.n_nationkeyAND    c_nationkey = n2.n_nationkeyAND    ( (n1.n_name = 'GERMANY'AND    n2.n_name = 'INDIA')OR     (n1.n_name = 'INDIA'AND    n2.n_name = 'GERMANY') )AND    l_shipdate BETWEEN date '1995-01-01' AND    date '1996-12-31' ) AS shipping
GROUP BY supp_nation,cust_nation,l_year
ORDER BY supp_nation,cust_nation,l_year
LIMIT    1;

Q8:

SELECT   o_year,Sum(CASEWHEN nation = 'INDIA' THEN volumeELSE 0end) / Sum(volume) AS mkt_share
FROM     (SELECT Extract(year FROM o_orderdate)     AS o_year,l_extendedprice * (1 - l_discount) AS volume,n2.n_name                          AS nationFROM   part,supplier,lineitem,orders,customer,nation n1,nation n2,regionWHERE  p_partkey = l_partkeyAND    s_suppkey = l_suppkeyAND    l_orderkey = o_orderkeyAND    o_custkey = c_custkeyAND    c_nationkey = n1.n_nationkeyAND    n1.n_regionkey = r_regionkeyAND    r_name = 'ASIA'AND    s_nationkey = n2.n_nationkeyAND    o_orderdate BETWEEN date '1995-01-01' AND    date '1996-12-31'AND    p_type = 'STANDARD ANODIZED STEEL' ) AS all_nations
GROUP BY o_year
ORDER BY o_year
LIMIT    1;

Q9:

SELECT   nation,o_year,Sum(amount) AS sum_profit
FROM     (SELECT n_name                                                          AS nation,Extract(year FROM o_orderdate) AS o_year,l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity AS amountFROM   part,supplier,lineitem,partsupp,orders,nationWHERE  s_suppkey = l_suppkeyAND    ps_suppkey = l_suppkeyAND    ps_partkey = l_partkeyAND    p_partkey = l_partkeyAND    o_orderkey = l_orderkeyAND    s_nationkey = n_nationkeyAND    p_name LIKE '%aquamarine%' ) AS profit
GROUP BY nation,o_year
ORDER BY nation,o_year DESC
LIMIT    1;

Q10:

SELECT   c_custkey,c_name,Sum(l_extendedprice * (1 - l_discount)) AS revenue,c_acctbal,n_name,c_address,c_phone,c_comment
FROM     customer,orders,lineitem,nation
WHERE    c_custkey = o_custkey
AND      l_orderkey = o_orderkey
AND      o_orderdate >= date '1994-08-01'
AND      o_orderdate <  date '1994-08-01' + INTERVAL '3' month
AND      l_returnflag = 'R'
AND      c_nationkey = n_nationkey
GROUP BY c_custkey,c_name,c_acctbal,c_phone,n_name,c_address,c_comment
ORDER BY revenue DESC
LIMIT    20;

Q11:

SELECT   ps_partkey,Sum(ps_supplycost * ps_availqty) AS value
FROM     partsupp,supplier,nation
WHERE    ps_suppkey = s_suppkey
AND      s_nationkey = n_nationkey
AND      n_name = 'PERU'
GROUP BY ps_partkey
HAVING   Sum(ps_supplycost * ps_availqty) >
(
SELECT Sum(ps_supplycost * ps_availqty) * 0.0001000000 as sum_valueFROM partsupp,supplier,nationWHERE  ps_suppkey = s_suppkeyAND    s_nationkey = n_nationkeyAND    n_name = 'PERU'
)
ORDER BY value DESC
LIMIT    1;

Q12:

SELECT l_shipmode, sum(case when o_orderpriority = '1-URGENT' or o_orderpriority = '2-HIGH' then 1else 0
end) AS high_line_count, sum(case when o_orderpriority <> '1-URGENT' and o_orderpriority <> '2-HIGH' then 1
else 0
end) AS low_line_count
FROM orders,lineitem
WHERE o_orderkey = l_orderkey
AND l_shipmode in ('MAIL', 'TRUCK')
AND l_commitdate < l_receiptdate
AND l_shipdate < l_commitdate
AND l_receiptdate >= date '1996-01-01'
AND l_receiptdate < date '1996-01-01' + interval '1' year
GROUP BY l_shipmode
ORDER BY l_shipmode
LIMIT 1;

Q13:

SELECT c_count, count(*) AS custdist
FROM (SELECT c_custkey, count(o_orderkey) AS c_countFROM customer,ordersWHERE c_custkey = o_custkeyAND o_comment NOT LIKE '%pending%accounts%'GROUP BY c_custkey ) AS c_orders
GROUP BY c_count
ORDER BY custdist DESC, c_count DESC
LIMIT 1;

Q14:

SELECT 100.00 * sum(case when p_type like 'PROMO%' then l_extendedprice * (1 - l_discount)else 0
end) / sum(l_extendedprice * (1 - l_discount)) AS promo_revenue
FROM lineitem,part
WHERE l_partkey = p_partkey
AND l_shipdate >= date '1996-01-01'
AND l_shipdate < date '1996-01-01' + interval '1' month
LIMIT 1;

Q15:

WITH revenue0 AS
(
SELECT l_suppkey AS supplier_no, sum(l_extendedprice * (1 - l_discount)) AS total_revenue
FROM lineitem
WHERE l_shipdate >= date '1993-01-01'
AND l_shipdate < date '1993-01-01' + interval '3' month
GROUP BY l_suppkey
)
SELECT s_suppkey, s_name, s_address, s_phone, total_revenue
FROM supplier, revenue0
WHERE s_suppkey = supplier_no
AND total_revenue IN (SELECT max(total_revenue)FROM revenue0 )
ORDER BY s_suppkey;

Q16:

SELECT p_brand, p_type, p_size, count(distinct ps_suppkey) AS supplier_cnt
FROM partsupp,part
WHERE p_partkey = ps_partkey
AND p_brand <> 'Brand#23'
AND p_type NOT LIKE 'PROMO BURNISHED%'
AND p_size IN (1, 13, 10, 28, 21, 35, 31, 11)
AND ps_suppkey NOT IN (SELECT s_suppkeyFROM supplierWHERE s_comment LIKE '%Customer%Complaints%' )
GROUP BY p_brand, p_type, p_size
ORDER BY supplier_cnt DESC, p_brand, p_type, p_size
LIMIT 1;

Q17:

SELECTsum(l_extendedprice) / 7.0 AS avg_yearly
FROMlineitem,part
WHERE p_partkey = l_partkeyAND p_brand = 'Brand#44'AND p_container = 'WRAP PKG'AND l_quantity < (SELECT0.2 * avg(l_quantity)FROMlineitem, partWHEREl_partkey = p_partkey);

Q18:

SELECT c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
FROM customer,orders,lineitem
WHERE o_orderkey IN (SELECT l_orderkeyFROM lineitemGROUP BY l_orderkeyHAVING sum(l_quantity) > 315 )
AND c_custkey = o_custkey
AND o_orderkey = l_orderkey
GROUP BY c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
ORDER BY o_totalprice DESC, o_orderdate
LIMIT 100;

Q19:

SELECT sum(l_extendedprice* (1 - l_discount)) AS revenue
FROM lineitem,part
WHERE ( p_partkey = l_partkey and p_brand = 'Brand#12'and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')and l_quantity >= 6 and l_quantity <= 6 + 10and p_size between 1 and 5and l_shipmode in ('AIR', 'AIR REG')and l_shipinstruct = 'DELIVER IN PERSON' )or ( p_partkey = l_partkey and p_brand = 'Brand#13'and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')and l_quantity >= 10 and l_quantity <= 10 + 10and p_size between 1 and 10and l_shipmode in ('AIR', 'AIR REG')and l_shipinstruct = 'DELIVER IN PERSON' )or ( p_partkey = l_partkey and p_brand = 'Brand#24'and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')and l_quantity >= 21 and l_quantity <= 21 + 10and p_size between 1 and 15and l_shipmode in ('AIR', 'AIR REG')and l_shipinstruct = 'DELIVER IN PERSON' )
LIMIT 1;

Q20:

with temp_table as
(select 0.5 * sum(l_quantity) as col1from lineitem,partsuppwhere l_partkey = ps_partkey and l_suppkey = ps_suppkeyand l_shipdate >= date '1993-01-01'and l_shipdate < date '1993-01-01' + interval '1' year
)
select s_name, s_address
from supplier,nation
where s_suppkey in (select ps_suppkeyfrom partsupp,temp_tablewhere ps_partkey in (select p_partkeyfrom partwhere p_name like 'dark%' )and ps_availqty > temp_table.col1 )and s_nationkey = n_nationkey and n_name = 'JORDAN'
order by s_name
limit 1;

Q21:

selects_name,count(*) as numwait
fromsupplier,lineitem l1,orders,nation
wheres_suppkey = l1.l_suppkeyand o_orderkey = l1.l_orderkeyand o_orderstatus = 'F'and l1.l_receiptdate > l1.l_commitdateand exists (select*fromlineitem l2wherel2.l_orderkey = l1.l_orderkeyand l2.l_suppkey <> l1.l_suppkey)and not exists (select*fromlineitem l3wherel3.l_orderkey = l1.l_orderkeyand l3.l_suppkey <> l1.l_suppkeyand l3.l_receiptdate > l3.l_commitdate)and s_nationkey = n_nationkeyand n_name = 'SAUDI ARABIA'
group bys_name
order bynumwait desc,s_name
limit 100;

Q22:

with temp_table_1 as
(select avg(c_acctbal) as avg_valuefrom customerwhere c_acctbal > 0.00 and substring(c_phone from 1 for 2)in ('33', '29', '37', '35', '25', '27', '43')
),
temp_table_2 as
(select count(*) as count1from orders, customerwhere o_custkey = c_custkey
)
select cntrycode, count(*) as numcust, sum(c_acctbal) as totacctbal
from (select substring(c_phone from 1 for 2) as cntrycode, c_acctbalfrom customer, temp_table_1, temp_table_2where substring(c_phonefrom 1for 2) in ('33', '29', '37', '35', '25', '27', '43')and c_acctbal > temp_table_1.avg_valueand temp_table_2.count1 = 0) as custsale
group by cntrycode
order by cntrycode
limit 1;

7. 异步执行查询

Data Lake Analytics支持“同步执行”模式和“异步执行”模式。“同步执行”模式下,控制台界面等待执行结果返回;“异步执行”模式下,立刻返回查询任务的ID。

点击“执行状态”,可以看到该异步查询任务的执行状态,主要分为:“RUNNING”,“SUCCESS”,“FAILURE”。

点击“刷新”,当STATUS变为“SUCCESS”时,表示查询成功,同时可查看查询耗时“ELAPSE_TIME”和查询扫描的数据字节数“SCANNED_DATA_BYTES”。

8. 查看查询历史

点击“执行历史”,可以看到您执行的查询的历史详细信息,包括:
1)查询语句;
2)查询耗时与执行具体时间;
3)查询结果返回行数;
4)查询状态;
5)查询扫描的字节数;
6)结果集回写到的目标OSS文件(Data Lake Analytics会将查询结果集保存用户的bucket中)。

查询结果文件自动上传到用户同region的OSS bucket中,其中包括结果数据文件和结果集元数据描述文件。

{QueryLocation}/{query_name}|Unsaved}/{yyyy}/{mm}/{dd}/{query_id}/xxx.csv
{QueryLocation}/{query_name}|Unsaved}/{yyyy}/{mm}/{dd}/{query_id}/xxx.csv.metadata

其中QueryLocation为:

aliyun-oa-query-results-<your_account_id>-<oss_region>

9. 后续

至此,本教程一步一步教您如何利用Data Lake Analytics云产品分析您OSS上的CSV格式的数据文件。除了CSV文件外,Data Lake Analytics还支持Parquet、ORC、json、RCFile、AVRO等多种格式文件的数据分析能力。特别是Parquet、ORC,相比CSV文件,有极大的性能和成本优势(同样内容的数据集,拥有更小的存储空间、更快的查询性能,这也意味着更低的分析成本)。
后续陆续会有更多教程和文章,手把手教您轻松使用Data Lake Analytics进行数据湖上数据分析和探索,开启您的云上低成本、即存即用的数据分析和探索之旅。
G

教程:使用Data Lake Analytics + OSS分析CSV格式的TPC-H数据集相关推荐

  1. Data Lake Analytics + OSS数据文件格式处理大全

    前言 Data Lake Analytics是Serverless化的云上交互式查询分析服务.用户可以使用标准的SQL语句,对存储在OSS.TableStore上的数据无需移动,直接进行查询分析. 目 ...

  2. Data Lake Analytics + OSS数据文件格式处理大全 1

    0. 前言 Data Lake Analytics是Serverless化的云上交互式查询分析服务.用户可以使用标准的SQL语句,对存储在OSS.TableStore上的数据无需移动,直接进行查询分析 ...

  3. Data Lake Analytics: 使用DataWorks来调度DLA任务

    2019独角兽企业重金招聘Python工程师标准>>> DataWorks作为阿里云上广受欢迎的大数据开发调度服务,最近加入了对于Data Lake Analytics的支持,意味着 ...

  4. Data Lake Analytics的Geospatial分析函数 1

    简介 为满足部分客户在云上做Geometry数据的分析需求,阿里云Data Lake Analytics(以下简称:DLA)支持多种格式的地理空间数据处理函数,符合Open Geospatial Co ...

  5. Data Lake Analytics中OSS LOCATION的使用说明

    2019独角兽企业重金招聘Python工程师标准>>> 前言 Data Lake Analytic(后文简称 DLA)可以帮助用户通过标准的SQL语句直接对存储在OSS.TableS ...

  6. 如何在Data Lake Analytics中使用临时表

    2019独角兽企业重金招聘Python工程师标准>>> 前言 Data Lake Analytics (后文简称DLA)是阿里云重磅推出的一款用于大数据分析的产品,可以对存储在OSS ...

  7. 【免费公测中】为数据赋予超能力,阿里云重磅推出Serverless数据分析引擎-Data Lake Analytics

    摘要: 近日,阿里云重磅推出Serverless数据分析引擎-Data Lake Analytics,Data Lake Analytics,帮助更多不具备分析能力的存储服务,赋予其分析的能力. 近日 ...

  8. 如何在阿里云上使用Data Lake Analytics分析Table Store数据

    0. Data Lake Analytics(简称DLA)介绍 数据湖(Data Lake)是时下热门的概念,更多阅读可以参考: https://en.wikipedia.org/wiki/Data_ ...

  9. Data Lake Analytics: 基于OSS文件自动推断建表

    背景 用户在使用Data Lake Analytics对OSS上的数据建表时,需要预先知道目标文件的格式和schema信息.当目标文件中每条记录对应的列数很多时,需要一个个手动匹配,很不方便. 在新版 ...

最新文章

  1. vivo应用商店电脑版_好车网APP|苹果商城、华为应用市场、腾讯应用宝、vivo应用商店、OPPO应用商店用户免费下载...
  2. Flex+ActionScript
  3. ofbiz mysql 序列_ofbiz+mysql安装求教
  4. Win-MASM64汇编语言-标志寄存器(EFLAGS/PSW/PUSHF/POPF)
  5. 用 50 年前 NASA 送阿波罗上天的计算机挖矿是什么体验?
  6. mysql 5.6 升级问题_mysql5.6升级5.7带来的问题
  7. C++中的->用于赋值
  8. Cloudera Manager和CDH4.1的安装
  9. python实现,excel随机抽取特定行到新表中(附上源码和桌面软件)
  10. Dezender的使用
  11. 蓝牙sbc怎么解决_从指标看蓝牙A2DP编码(SBC,AAC,APTX)
  12. 华为手机便签有扫描的功能么
  13. lol服务器维护8月7,LOL8月7日维护到几点 具体更新了什么
  14. 洛谷p1200题目c语言代码,洛谷P1200 [USACO1.1]你的飞碟在这儿
  15. 作品集十(平面设计)
  16. (附源码)计算机毕业设计SSM黄淮学院二手物品交易平台
  17. 入门的Android架构师需要掌握哪些技能?
  18. 前端移动端刻度尺实现
  19. HDOJ 1495非常可乐
  20. 网络版收银系统服务器更新是什么意思,POS收银系统也需要更新的五大理由

热门文章

  1. linux如何关闭udp端口,如何阻止linux RHEL7中的udp端口范围(How to block udp ports range in linux RHEL7)...
  2. SAP MM MIGO + 301 K 可以对供应商寄售库存跨工厂转库
  3. SAP SD 以PDF格式显示BILLING的输出格式
  4. AI教父争夺秘史:百度2.88亿天价求才,因中国身份惜败谷歌
  5. SAP HUM 嵌套HU初探 II
  6. 智能车联,一场「以人为核心」的智能化变革
  7. TinyML 机器学习简介
  8. 福布斯2020年AI领域10大预测:人工智能越来越“边缘化”!
  9. 跨国企业频繁并购,SAP咨询行业的非常利好
  10. 是固执成见还是步步为营,工程建筑行业如何追赶人工智能化潮流?