2021年大数据Flink(四十五):扩展阅读 双流Join
目录
扩展阅读 双流Join
介绍
Window Join
Interval Join
代码演示1
代码演示2
重点注意
扩展阅读 双流Join
介绍
https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/stream/operators/joining.html
https://zhuanlan.zhihu.com/p/340560908
https://blog.csdn.net/andyonlines/article/details/108173259
双流Join是Flink面试的高频问题。一般情况下说明以下几点就可以hold了:
- Join大体分类只有两种:Window Join和Interval Join。
- Window Join又可以根据Window的类型细分出3种:
Tumbling Window Join、Sliding Window Join、Session Widnow Join。
Windows类型的join都是利用window的机制,先将数据缓存在Window State中,当窗口触发计算时,执行join操作;
- interval join也是利用state存储数据再处理,区别在于state中的数据有失效机制,依靠数据触发数据清理;
目前Stream join的结果是数据的笛卡尔积;
Window Join
- Tumbling Window Join
执行翻滚窗口联接时,具有公共键和公共翻滚窗口的所有元素将作为成对组合联接,并传递给JoinFunction或FlatJoinFunction。因为它的行为类似于内部连接,所以一个流中的元素在其滚动窗口中没有来自另一个流的元素,因此不会被发射!
如图所示,我们定义了一个大小为2毫秒的翻滚窗口,结果窗口的形式为[0,1]、[2,3]、。。。。该图显示了每个窗口中所有元素的成对组合,这些元素将传递给JoinFunction。注意,在翻滚窗口[6,7]中没有发射任何东西,因为绿色流中不存在与橙色元素⑥和⑦结合的元素。
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;...
DataStream<Integer> orangeStream = ...DataStream<Integer> greenStream = ...
orangeStream.join(greenStream).where(<KeySelector>).equalTo(<KeySelector>).window(TumblingEventTimeWindows.of(Time.milliseconds(2))).apply (new JoinFunction<Integer, Integer, String> (){@Overridepublic String join(Integer first, Integer second) {return first + "," + second;}});
- Sliding Window Join
在执行滑动窗口联接时,具有公共键和公共滑动窗口的所有元素将作为成对组合联接,并传递给JoinFunction或FlatJoinFunction。在当前滑动窗口中,一个流的元素没有来自另一个流的元素,则不会发射!请注意,某些元素可能会连接到一个滑动窗口中,但不会连接到另一个滑动窗口中!
在本例中,我们使用大小为2毫秒的滑动窗口,并将其滑动1毫秒,从而产生滑动窗口[-1,0],[0,1],[1,2],[2,3]…。x轴下方的连接元素是传递给每个滑动窗口的JoinFunction的元素。在这里,您还可以看到,例如,在窗口[2,3]中,橙色②与绿色③连接,但在窗口[1,2]中没有与任何对象连接。
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
...
DataStream<Integer> orangeStream = ...DataStream<Integer> greenStream = ...
orangeStream.join(greenStream).where(<KeySelector>).equalTo(<KeySelector>).window(SlidingEventTimeWindows.of(Time.milliseconds(2) /* size */, Time.milliseconds(1) /* slide */)).apply (new JoinFunction<Integer, Integer, String> (){@Overridepublic String join(Integer first, Integer second) {return first + "," + second;}});
- Session Window Join
在执行会话窗口联接时,具有相同键(当“组合”时满足会话条件)的所有元素以成对组合方式联接,并传递给JoinFunction或FlatJoinFunction。同样,这执行一个内部连接,所以如果有一个会话窗口只包含来自一个流的元素,则不会发出任何输出!
在这里,我们定义了一个会话窗口连接,其中每个会话被至少1ms的间隔分割。有三个会话,在前两个会话中,来自两个流的连接元素被传递给JoinFunction。在第三个会话中,绿色流中没有元素,所以⑧和⑨没有连接!
import org.apache.flink.api.java.functions.KeySelector;import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows;import org.apache.flink.streaming.api.windowing.time.Time;...DataStream<Integer> orangeStream = ...DataStream<Integer> greenStream = ...orangeStream.join(greenStream).where(<KeySelector>).equalTo(<KeySelector>).window(EventTimeSessionWindows.withGap(Time.milliseconds(1))).apply (new JoinFunction<Integer, Integer, String> (){@Overridepublic String join(Integer first, Integer second) {return first + "," + second;}});
Interval Join
前面学习的Window Join必须要在一个Window中进行JOIN,那如果没有Window如何处理呢?
interval join也是使用相同的key来join两个流(流A、流B),
并且流B中的元素中的时间戳,和流A元素的时间戳,有一个时间间隔。
b.timestamp ∈ [a.timestamp + lowerBound; a.timestamp + upperBound]
or
a.timestamp + lowerBound <= b.timestamp <= a.timestamp + upperBound
也就是:
流B的元素的时间戳 ≥ 流A的元素时间戳 + 下界,且,流B的元素的时间戳 ≤ 流A的元素时间戳 + 上界。
在上面的示例中,我们将两个流“orange”和“green”连接起来,其下限为-2毫秒,上限为+1毫秒。默认情况下,这些边界是包含的,但是可以应用.lowerBoundExclusive()和.upperBoundExclusive来更改行为
orangeElem.ts + lowerBound <= greenElem.ts <= orangeElem.ts + upperBound
import org.apache.flink.api.java.functions.KeySelector;import org.apache.flink.streaming.api.functions.co.ProcessJoinFunction;import org.apache.flink.streaming.api.windowing.time.Time;...DataStream<Integer> orangeStream = ...DataStream<Integer> greenStream = ...orangeStream.keyBy(<KeySelector>).intervalJoin(greenStream.keyBy(<KeySelector>)).between(Time.milliseconds(-2), Time.milliseconds(1)).process (new ProcessJoinFunction<Integer, Integer, String(){@Overridepublic void processElement(Integer left, Integer right, Context ctx, Collector<String> out) {out.collect(first + "," + second);}});
代码演示1
- 需求
来做个案例:
使用两个指定Source模拟数据,一个Source是订单明细,一个Source是商品数据。我们通过window join,将数据关联到一起。
- 思路
1、Window Join首先需要使用where和equalTo指定使用哪个key来进行关联,此处我们通过应用方法,基于GoodsId来关联两个流中的元素。
2、设置5秒的滚动窗口,流的元素关联都会在这个5秒的窗口中进行关联。
3、apply方法中实现将两个不同类型的元素关联并生成一个新类型的元素。
package cn.lanson.extend;import com.alibaba.fastjson.JSON;
import lombok.Data;
import org.apache.flink.api.common.eventtime.*;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.RichSourceFunction;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.UUID;
import java.util.concurrent.TimeUnit;/*** Author lanson* Desc*/
public class JoinDemo01 {public static void main(String[] args) throws Exception {StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();// 构建商品数据流DataStream<Goods> goodsDS = env.addSource(new GoodsSource()).assignTimestampsAndWatermarks(new GoodsWatermark());// 构建订单明细数据流DataStream<OrderItem> orderItemDS = env.addSource(new OrderItemSource()).assignTimestampsAndWatermarks(new OrderItemWatermark());// 进行关联查询DataStream<FactOrderItem> factOrderItemDS = orderItemDS.join(goodsDS)// join条件:第一个流orderItemDS的GoodsId == 第二个流goodsDS的GoodsId.where(OrderItem::getGoodsId).equalTo(Goods::getGoodsId)//指定窗口.window(TumblingEventTimeWindows.of(Time.seconds(5)))//处理join结果.apply((OrderItem item, Goods goods) -> {FactOrderItem factOrderItem = new FactOrderItem();factOrderItem.setGoodsId(goods.getGoodsId());factOrderItem.setGoodsName(goods.getGoodsName());factOrderItem.setCount(new BigDecimal(item.getCount()));factOrderItem.setTotalMoney(goods.getGoodsPrice().multiply(new BigDecimal(item.getCount())));return factOrderItem;});factOrderItemDS.print();env.execute("滚动窗口JOIN");}//商品类@Datapublic static class Goods {private String goodsId;private String goodsName;private BigDecimal goodsPrice;public static List<Goods> GOODS_LIST;public static Random r;static {r = new Random();GOODS_LIST = new ArrayList<>();GOODS_LIST.add(new Goods("1", "小米12", new BigDecimal(4890)));GOODS_LIST.add(new Goods("2", "iphone12", new BigDecimal(12000)));GOODS_LIST.add(new Goods("3", "MacBookPro", new BigDecimal(15000)));GOODS_LIST.add(new Goods("4", "Thinkpad X1", new BigDecimal(9800)));GOODS_LIST.add(new Goods("5", "MeiZu One", new BigDecimal(3200)));GOODS_LIST.add(new Goods("6", "Mate 40", new BigDecimal(6500)));}public static Goods randomGoods() {int rIndex = r.nextInt(GOODS_LIST.size());return GOODS_LIST.get(rIndex);}public Goods() {}public Goods(String goodsId, String goodsName, BigDecimal goodsPrice) {this.goodsId = goodsId;this.goodsName = goodsName;this.goodsPrice = goodsPrice;}@Overridepublic String toString() {return JSON.toJSONString(this);}}//订单明细类@Datapublic static class OrderItem {private String itemId;private String goodsId;private Integer count;@Overridepublic String toString() {return JSON.toJSONString(this);}}//关联结果@Datapublic static class FactOrderItem {private String goodsId;private String goodsName;private BigDecimal count;private BigDecimal totalMoney;@Overridepublic String toString() {return JSON.toJSONString(this);}}//构建一个商品Stream源(这个好比就是维表)public static class GoodsSource extends RichSourceFunction<Goods> {private Boolean isCancel;@Overridepublic void open(Configuration parameters) throws Exception {isCancel = false;}@Overridepublic void run(SourceContext sourceContext) throws Exception {while(!isCancel) {Goods.GOODS_LIST.stream().forEach(goods -> sourceContext.collect(goods));TimeUnit.SECONDS.sleep(1);}}@Overridepublic void cancel() {isCancel = true;}}//构建订单明细Stream源public static class OrderItemSource extends RichSourceFunction<OrderItem> {private Boolean isCancel;private Random r;@Overridepublic void open(Configuration parameters) throws Exception {isCancel = false;r = new Random();}@Overridepublic void run(SourceContext sourceContext) throws Exception {while(!isCancel) {Goods goods = Goods.randomGoods();OrderItem orderItem = new OrderItem();orderItem.setGoodsId(goods.getGoodsId());orderItem.setCount(r.nextInt(10) + 1);orderItem.setItemId(UUID.randomUUID().toString());sourceContext.collect(orderItem);orderItem.setGoodsId("111");sourceContext.collect(orderItem);TimeUnit.SECONDS.sleep(1);}}@Overridepublic void cancel() {isCancel = true;}}//构建水印分配器,学习测试直接使用系统时间了public static class GoodsWatermark implements WatermarkStrategy<Goods> {@Overridepublic TimestampAssigner<Goods> createTimestampAssigner(TimestampAssignerSupplier.Context context) {return (element, recordTimestamp) -> System.currentTimeMillis();}@Overridepublic WatermarkGenerator<Goods> createWatermarkGenerator(WatermarkGeneratorSupplier.Context context) {return new WatermarkGenerator<Goods>() {@Overridepublic void onEvent(Goods event, long eventTimestamp, WatermarkOutput output) {output.emitWatermark(new Watermark(System.currentTimeMillis()));}@Overridepublic void onPeriodicEmit(WatermarkOutput output) {output.emitWatermark(new Watermark(System.currentTimeMillis()));}};}}//构建水印分配器,学习测试直接使用系统时间了public static class OrderItemWatermark implements WatermarkStrategy<OrderItem> {@Overridepublic TimestampAssigner<OrderItem> createTimestampAssigner(TimestampAssignerSupplier.Context context) {return (element, recordTimestamp) -> System.currentTimeMillis();}@Overridepublic WatermarkGenerator<OrderItem> createWatermarkGenerator(WatermarkGeneratorSupplier.Context context) {return new WatermarkGenerator<OrderItem>() {@Overridepublic void onEvent(OrderItem event, long eventTimestamp, WatermarkOutput output) {output.emitWatermark(new Watermark(System.currentTimeMillis()));}@Overridepublic void onPeriodicEmit(WatermarkOutput output) {output.emitWatermark(new Watermark(System.currentTimeMillis()));}};}}
}
代码演示2
1、通过keyBy将两个流join到一起
2、interval join需要设置流A去关联哪个时间范围的流B中的元素。此处,我设置的下界为-1、上界为0,且上界是一个开区间。表达的意思就是流A中某个元素的时间,对应上一秒的流B中的元素。
3、process中将两个key一样的元素,关联在一起,并加载到一个新的FactOrderItem对象中
package cn.lanson.extend;import com.alibaba.fastjson.JSON;
import lombok.Data;
import org.apache.flink.api.common.eventtime.*;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.ProcessJoinFunction;
import org.apache.flink.streaming.api.functions.source.RichSourceFunction;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.UUID;
import java.util.concurrent.TimeUnit;/*** Author lanson* Desc*/
public class JoinDemo02 {public static void main(String[] args) throws Exception {StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();// 构建商品数据流DataStream<Goods> goodsDS = env.addSource(new GoodsSource()).assignTimestampsAndWatermarks(new GoodsWatermark());// 构建订单明细数据流DataStream<OrderItem> orderItemDS = env.addSource(new OrderItemSource()).assignTimestampsAndWatermarks(new OrderItemWatermark());// 进行关联查询SingleOutputStreamOperator<FactOrderItem> factOrderItemDS = orderItemDS.keyBy(OrderItem::getGoodsId).intervalJoin(goodsDS.keyBy(Goods::getGoodsId)).between(Time.seconds(-1), Time.seconds(0))//.upperBoundExclusive().process(new ProcessJoinFunction<OrderItem, Goods, FactOrderItem>() {@Overridepublic void processElement(OrderItem left, Goods right, Context ctx, Collector<FactOrderItem> out) throws Exception {FactOrderItem factOrderItem = new FactOrderItem();factOrderItem.setGoodsId(right.getGoodsId());factOrderItem.setGoodsName(right.getGoodsName());factOrderItem.setCount(new BigDecimal(left.getCount()));factOrderItem.setTotalMoney(right.getGoodsPrice().multiply(new BigDecimal(left.getCount())));out.collect(factOrderItem);}});factOrderItemDS.print();env.execute("Interval JOIN");}//商品类@Datapublic static class Goods {private String goodsId;private String goodsName;private BigDecimal goodsPrice;public static List<Goods> GOODS_LIST;public static Random r;static {r = new Random();GOODS_LIST = new ArrayList<>();GOODS_LIST.add(new Goods("1", "小米12", new BigDecimal(4890)));GOODS_LIST.add(new Goods("2", "iphone12", new BigDecimal(12000)));GOODS_LIST.add(new Goods("3", "MacBookPro", new BigDecimal(15000)));GOODS_LIST.add(new Goods("4", "Thinkpad X1", new BigDecimal(9800)));GOODS_LIST.add(new Goods("5", "MeiZu One", new BigDecimal(3200)));GOODS_LIST.add(new Goods("6", "Mate 40", new BigDecimal(6500)));}public static Goods randomGoods() {int rIndex = r.nextInt(GOODS_LIST.size());return GOODS_LIST.get(rIndex);}public Goods() {}public Goods(String goodsId, String goodsName, BigDecimal goodsPrice) {this.goodsId = goodsId;this.goodsName = goodsName;this.goodsPrice = goodsPrice;}@Overridepublic String toString() {return JSON.toJSONString(this);}}//订单明细类@Datapublic static class OrderItem {private String itemId;private String goodsId;private Integer count;@Overridepublic String toString() {return JSON.toJSONString(this);}}//关联结果@Datapublic static class FactOrderItem {private String goodsId;private String goodsName;private BigDecimal count;private BigDecimal totalMoney;@Overridepublic String toString() {return JSON.toJSONString(this);}}//构建一个商品Stream源(这个好比就是维表)public static class GoodsSource extends RichSourceFunction<Goods> {private Boolean isCancel;@Overridepublic void open(Configuration parameters) throws Exception {isCancel = false;}@Overridepublic void run(SourceContext sourceContext) throws Exception {while (!isCancel) {Goods.GOODS_LIST.stream().forEach(goods -> sourceContext.collect(goods));TimeUnit.SECONDS.sleep(1);}}@Overridepublic void cancel() {isCancel = true;}}//构建订单明细Stream源public static class OrderItemSource extends RichSourceFunction<OrderItem> {private Boolean isCancel;private Random r;@Overridepublic void open(Configuration parameters) throws Exception {isCancel = false;r = new Random();}@Overridepublic void run(SourceContext sourceContext) throws Exception {while (!isCancel) {Goods goods = Goods.randomGoods();OrderItem orderItem = new OrderItem();orderItem.setGoodsId(goods.getGoodsId());orderItem.setCount(r.nextInt(10) + 1);orderItem.setItemId(UUID.randomUUID().toString());sourceContext.collect(orderItem);orderItem.setGoodsId("111");sourceContext.collect(orderItem);TimeUnit.SECONDS.sleep(1);}}@Overridepublic void cancel() {isCancel = true;}}//构建水印分配器,学习测试直接使用系统时间了public static class GoodsWatermark implements WatermarkStrategy<Goods> {@Overridepublic TimestampAssigner<Goods> createTimestampAssigner(TimestampAssignerSupplier.Context context) {return (element, recordTimestamp) -> System.currentTimeMillis();}@Overridepublic WatermarkGenerator<Goods> createWatermarkGenerator(WatermarkGeneratorSupplier.Context context) {return new WatermarkGenerator<Goods>() {@Overridepublic void onEvent(Goods event, long eventTimestamp, WatermarkOutput output) {output.emitWatermark(new Watermark(System.currentTimeMillis()));}@Overridepublic void onPeriodicEmit(WatermarkOutput output) {output.emitWatermark(new Watermark(System.currentTimeMillis()));}};}}//构建水印分配器,学习测试直接使用系统时间了public static class OrderItemWatermark implements WatermarkStrategy<OrderItem> {@Overridepublic TimestampAssigner<OrderItem> createTimestampAssigner(TimestampAssignerSupplier.Context context) {return (element, recordTimestamp) -> System.currentTimeMillis();}@Overridepublic WatermarkGenerator<OrderItem> createWatermarkGenerator(WatermarkGeneratorSupplier.Context context) {return new WatermarkGenerator<OrderItem>() {@Overridepublic void onEvent(OrderItem event, long eventTimestamp, WatermarkOutput output) {output.emitWatermark(new Watermark(System.currentTimeMillis()));}@Overridepublic void onPeriodicEmit(WatermarkOutput output) {output.emitWatermark(new Watermark(System.currentTimeMillis()));}};}}
}
重点注意
注意:后面项目中涉及到双流
接下来的内容面试常问
双流Join是Flink面试的高频问题。一般情况下说明以下几点就可以hold了: 1.Join大体分类只有两种:Window Join和Interval Join。 2.Window Join又可以根据Window的类型细分出3种: Tumbling 、Sliding 、Session Widnow Join。 3.Windows类型的join都是利用window的机制,先将数据缓存在Window State中,当窗口触发计算时,执行join操作; 4.interval join也是利用state存储数据再处理,区别在于state中的数据有失效机制,依靠数据触发数据清理;
看官网示例说明
https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/stream/operators/joining.html
2021年大数据Flink(四十五):扩展阅读 双流Join相关推荐
- 2021年大数据Flink(十五):流批一体API Connectors Kafka
目录 Kafka pom依赖 参数设置 参数说明 Kafka命令 代码实现-Kafka Consumer 代码实现-Kafka Producer 代码实现-实时ETL Kafka pom依赖 Flin ...
- 2021年大数据HBase(十五):HBase的Bulk Load批量加载操作
全网最详细的大数据HBase文章系列,强烈建议收藏加关注! 新文章都已经列出历史文章目录,帮助大家回顾前面的知识重点. 目录 系列历史文章 HBase的Bulk Load批量加载操作 一.Bulk L ...
- 2021年大数据Hadoop(十五):Hadoop的联邦机制 Federation
全网最详细的Hadoop文章系列,强烈建议收藏加关注! 后面更新文章都会列出历史文章目录,帮助大家回顾知识重点. 目录 本系列历史文章 前言 Hadoop的联邦机制 Federation 背景概述 F ...
- 2021年大数据ELK(十五):Elasticsearch SQL简单介绍
全网最详细的大数据ELK文章系列,强烈建议收藏加关注! 新文章都已经列出历史文章目录,帮助大家回顾前面的知识重点. 目录 Elasticsearch SQL简单介绍 一.SQL与Elasticsear ...
- 2021年大数据Flink(十四):流批一体API Connectors JDBC
目录 Connectors JDBC 代码演示 Connectors JDBC Apache Flink 1.12 Documentation: JDBC Connector 代码演示 package ...
- 2021年大数据Flink(十二):流批一体API Transformation
目录 Transformation 官网API列表 基本操作-略 map flatMap keyBy filter sum reduce 代码演示 合并-拆分 union和connect split. ...
- 2021年大数据Flink(十):流处理相关概念
目录 流处理相关概念 数据的时效性 流处理和批处理 流批一体API DataStream API 支持批执行模式 API 编程模型 流处理相关概念 数据的时效 ...
- 2021年大数据Flink(十八):Flink Window操作
目录 Flink-Window操作 为什么需要Window Window的分类 按照time和count分类 按照slide和size分类 总结 Window ...
- 2021年大数据Flink(十六):流批一体API Connectors Redis
目录 Redis API 使用RedisCommand设置数据结构类型时和redis结构对应关系 需求 代码实现 Redis API 通过flink 操作redis 其实我们可以通过传统的redis ...
- 2021年大数据Flink(十九):案例一 基于时间的滚动和滑动窗口
目录 案例一 基于时间的滚动和滑动窗口 需求 代码实现 案例一 基于时间的滚动和滑动窗口 需求 nc -lk 9999 有如下数据表示: 信号灯编号和通过该信号灯的车的数量 9,3 9,2 9,7 4 ...
最新文章
- mobile还有人用吗 spring_话说,苹果手机语音备忘录功能还有人用吗?
- Hibernate初探(二)
- extjs中Store和grid的刷新问题
- jsp调用struts,jsp调用action,action获取表单提交的参数
- 策略设计模式_设计模式之策略模式总结
- 基于已有集群动态发现方式部署 Etcd 集群
- matlab相机标定_综述 | 相机标定方法
- BeyondCompare This license key has been revoked:
- 疫期免费 “零接触”云迁移~工具替代人力! 人不聚,活儿继续!
- 能自动更新的万能周报模板,有手就会用!
- 2020互联网公司中秋礼盒大比拼(22家互联网厂商)
- Excel 组及分级显示制作教程
- python假设税前工资和税率如下_计算税后收入_税前税后工资计算公式,软件和手动计算哪个更有优势?...
- Pollard‘s rho大数分解算法
- java文件分割与合并
- 什么人不在生死簿_高人亲眼所见的“地狱、生死簿、三世因果”(转)阴间一直是世...
- 黑马程序员_JAVA相关基础知识
- 内网安全——穿透上线NgrokFrpNpsSpp
- Python装饰器应用实例
- python 工具变量回归_stata工具变量法一例:使用2SLS进行ivreg2估计及其检验