Flume-NG source code reading: SourceRunner, and how the selector and interceptors are executed

In the AbstractConfigurationProvider class, the loadSources method wraps every source in a SourceRunner and puts it into Map<String, SourceRunner> sourceRunnerMap. The relevant code is:

    Map<String, String> selectorConfig = context.getSubProperties(
        BasicConfigurationConstants.CONFIG_SOURCE_CHANNELSELECTOR_PREFIX);

    ChannelSelector selector = ChannelSelectorFactory.create(
        sourceChannels, selectorConfig);

    ChannelProcessor channelProcessor = new ChannelProcessor(selector);
    Configurables.configure(channelProcessor, context);
    source.setChannelProcessor(channelProcessor);
    sourceRunnerMap.put(sourceName,
        SourceRunner.forSource(source));

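Every source has a selector. The code above reads the selector settings for the source from the configuration file; constructs a ChannelSelector object, selector; wraps the selector in a ChannelProcessor object, channelProcessor; runs channelProcessor's configure method to configure it; sets it as the source's channel processor; and finally wraps the source in a SourceRunner and puts it into sourceRunnerMap under the source's name.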

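I. ChannelSelector selector = ChannelSelectorFactory.create(sourceChannels, selectorConfig) instantiates a ChannelSelector of the type named in the configuration file. There are two types: ReplicatingChannelSelector (replicating) and MultiplexingChannelSelector (multiplexing). If no type is given, the default is ReplicatingChannelSelector, so when no selector is configured every event is copied to each of the source's channels. The factory then calls selector.setChannels(channels) and configures the selector through configure(context). Both selectors implement three methods: getRequiredChannels(Event event), getOptionalChannels(Event event) and configure(Context context). The channels an event is sent to fall into two groups, RequiredChannels and OptionalChannels, one per getter.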

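(1) ReplicatingChannelSelector's configure(context) method reads the optional channels named under the "optional" key in the configuration file (several are allowed, separated by spaces). It initializes requiredChannels to the list of live channels for this source, then builds channelNameMap, which maps each channel's name to the channel. Each optional channel is added to optionalChannels and the corresponding channel is removed from requiredChannels. No validity check is made on the optional channels here, so names outside the channels assigned to this source can be configured. requiredChannels and optionalChannels must not overlap; any channel that appears in both is removed from requiredChannels. As a result, if the optional list in the configuration file is the same as the source's channel list, getOptionalChannels may return the whole list of live channels and data may be duplicated, so it is advisable for optional to name channels other than the ones assigned to this source (for example, channels of another source). getOptionalChannels simply returns the optionalChannels list. getRequiredChannels returns the requiredChannels list, or the full list of live channels if requiredChannels is null.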

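(2) MultiplexingChannelSelector's configure(context) first reads headerName, the name of the event header to match on; only one header name can be chosen. It then reads defaultChannels, the list of default channels (several may be given), and mapConfig, the sub-keys under "mapping." together with their channel names. These are stored in channelMapping, the structure that holds each header value and the list of channels to send to for that value (one mapping value can send to several channels, and one channel can appear under several mapping values). optionalChannels is the corresponding map from configured optional values to their channel lists. A channel that channelMapping already lists for a value may not appear again in optionalChannels for that value (to prevent duplicate data); if channelMapping has no channel list for the value (meaning the default list may be used), the channels that intersect the default list are filtered out instead. optionalChannels therefore stores, for each value of the header, the optional channels for events whose header equals that value.

getOptionalChannels(Event event) returns the optional channel list that optionalChannels holds for the event's value of the configured header. getRequiredChannels(Event event) returns the channel list that channelMapping holds for that value; if it is null (no channel matches the event's header), it returns the default list defaultChannels. Note that "default", "mapping." and "optional." sit at the same level in the selector configuration. The keys after "mapping." and "optional." are possible values of the header named by "header", and the default channel list is used only when the event's header value matches no configured value. Both getRequiredChannels(Event event) and getOptionalChannels(Event event) look up the event's header to find the channels to send to.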
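The two selection rules described in (1) and (2) can be sketched in a few lines. This is only an illustration under simplifying assumptions, not Flume's classes: channels are plain names, and SelectorSketch, replicatingSplit and multiplexingRequired are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the two selectors; channels are just names.
class SelectorSketch {

    // Replicating rule: start with all of the source's channels as required,
    // then move every channel named in the "optional" setting to the optional list.
    // Returns [required, optional].
    static List<List<String>> replicatingSplit(List<String> sourceChannels, String optionalSpec) {
        List<String> required = new ArrayList<>(sourceChannels);
        List<String> optional = new ArrayList<>();
        if (optionalSpec != null && !optionalSpec.trim().isEmpty()) {
            for (String name : optionalSpec.trim().split("\\s+")) {
                required.remove(name);           // a channel is never both required and optional
                if (!optional.contains(name)) {
                    optional.add(name);
                }
            }
        }
        return Arrays.asList(required, optional);
    }

    // Multiplexing rule: the value of one header picks the required channels;
    // if no mapping exists for that value, fall back to the default list.
    static List<String> multiplexingRequired(Map<String, String> headers, String headerName,
                                             Map<String, List<String>> mapping,
                                             List<String> defaults) {
        List<String> mapped = mapping.get(headers.get(headerName));
        return mapped != null ? mapped : defaults;
    }

    public static void main(String[] args) {
        System.out.println(replicatingSplit(Arrays.asList("c1", "c2", "c3"), "c3"));

        Map<String, List<String>> mapping = new HashMap<>();
        mapping.put("error", Arrays.asList("c1", "c2"));
        Map<String, String> headers = new HashMap<>();
        headers.put("level", "error");
        System.out.println(multiplexingRequired(headers, "level", mapping, Arrays.asList("c3")));
    }
}
```

In the sketch, the replicating rule gives the same answer for every event, while the multiplexing rule depends on each event's header.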

II. ChannelProcessor channelProcessor = new ChannelProcessor(selector) wraps the selector in a ChannelProcessor. The constructor stores the selector and creates an InterceptorChain object, interceptorChain. The ChannelProcessor class is responsible for managing the selector and the interceptors.

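III. channelProcessor.configure(Context) performs the necessary configuration. It calls channelProcessor.configureInterceptors(context) to load and configure the interceptors. configureInterceptors first reads the interceptor component names interceptorNames[] from the configuration file (there may be several), then reads all configuration under "interceptors." into interceptorContexts. It then iterates over interceptorNames, reads each interceptor's own configuration and type from the configuration file, builds an interceptor of that type, configures it with configure, and adds it to the interceptors list (which holds the instantiated interceptors). Finally the list is handed to interceptorChain. For more on interceptors, see the companion post in this series, "Flume-NG source code reading: Interceptor".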
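To make the role of the chain concrete, here is a minimal sketch of events passing through interceptors in order. It is a simplified model, not Flume's InterceptorChain: an event is reduced to its header map, each interceptor is modelled as a function on that map, and the names InterceptorChainSketch, intercept and demoChain are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical model: an "event" is only its header map, an "interceptor" is a function on it.
class InterceptorChainSketch {

    // Pass the headers through every interceptor in order.
    // An interceptor that returns null drops the event and ends the chain early.
    static Map<String, String> intercept(List<UnaryOperator<Map<String, String>>> chain,
                                         Map<String, String> headers) {
        Map<String, String> current = headers;
        for (UnaryOperator<Map<String, String>> interceptor : chain) {
            if (current == null) {
                return null;
            }
            current = interceptor.apply(current);
        }
        return current;
    }

    // Two header-adding interceptors, optionally with a dropping one in between.
    static List<UnaryOperator<Map<String, String>>> demoChain(boolean dropInMiddle) {
        List<UnaryOperator<Map<String, String>>> chain = new ArrayList<>();
        chain.add(h -> { h.put("host", "node-1"); return h; });
        if (dropInMiddle) {
            chain.add(h -> null);
        }
        chain.add(h -> { h.put("type", "log"); return h; });
        return chain;
    }

    public static void main(String[] args) {
        System.out.println(intercept(demoChain(false), new HashMap<>()));
        System.out.println(intercept(demoChain(true), new HashMap<>()));
    }
}
```

The order of the list matters here: a later interceptor sees the headers written by the earlier ones.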

IV. source.setChannelProcessor(channelProcessor) assigns the processor to the source. Each source obtains the processor through getChannelProcessor() and calls its processEventBatch(events) or processEvent(event) to deliver events to the channels.

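V. sourceRunnerMap.put(sourceName, SourceRunner.forSource(source)) wraps the source in a SourceRunner and puts it into sourceRunnerMap. SourceRunner.forSource chooses the runner according to the interface the source implements. There are two interfaces, PollableSource and EventDrivenSource: the former is driven by a dedicated thread and must implement the process method; the latter has no separate driving thread and no process method.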

    public static SourceRunner forSource(Source source) {
      SourceRunner runner = null;

      if (source instanceof PollableSource) {
        runner = new PollableSourceRunner();
        ((PollableSourceRunner) runner).setSource((PollableSource) source);
      } else if (source instanceof EventDrivenSource) {
        runner = new EventDrivenSourceRunner();
        ((EventDrivenSourceRunner) runner).setSource((EventDrivenSource) source);
      } else {
        throw new IllegalArgumentException("No known runner type for source "
            + source);
      }

      return runner;
    }

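(1) PollableSourceRunner's start() method obtains the source's ChannelProcessor and calls its initialize() method, which calls interceptorChain.initialize() to initialize the interceptors (it iterates over all interceptors and calls each one's initialize() method). It then calls source.start() to start the source, and starts a PollingRunner thread whose run method keeps calling source.process() and does some bookkeeping based on the returned status.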

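(2) EventDrivenSourceRunner's start() method obtains the source's ChannelProcessor and calls its initialize() method, which calls interceptorChain.initialize() to initialize the interceptors (it iterates over all interceptors and calls each one's initialize() method). It then calls source.start() to start the source.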
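The polling loop described in (1) can be sketched as follows. This is a toy model rather than Flume's PollingRunner: the source's process() is stood in for by a Supplier, the fixed 10 ms sleep on BACKOFF is an assumption chosen for brevity, and PollingLoopSketch, PollingLoop and runDemo are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical model of a runner thread that keeps polling a source.
class PollingLoopSketch {
    enum Status { READY, BACKOFF }

    static final class PollingLoop implements Runnable {
        final AtomicBoolean shouldStop = new AtomicBoolean(false);
        final AtomicInteger readyCount = new AtomicInteger();
        final AtomicInteger backoffCount = new AtomicInteger();
        private final Supplier<Status> process;   // stands in for source.process()

        PollingLoop(Supplier<Status> process) {
            this.process = process;
        }

        @Override
        public void run() {
            while (!shouldStop.get()) {
                if (process.get() == Status.BACKOFF) {
                    backoffCount.incrementAndGet();
                    try {
                        Thread.sleep(10);          // assumed fixed pause; the source had nothing to do
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                } else {
                    readyCount.incrementAndGet();
                }
            }
        }
    }

    // Runs the loop on its own thread until the fake source has been polled `polls` times.
    // Returns {readyCount, backoffCount}.
    static int[] runDemo(int polls) {
        AtomicInteger calls = new AtomicInteger();
        PollingLoop[] holder = new PollingLoop[1];
        holder[0] = new PollingLoop(() -> {
            int n = calls.incrementAndGet();
            if (n >= polls) {
                holder[0].shouldStop.set(true);    // ask the loop to stop after this poll
            }
            return n % 2 == 0 ? Status.BACKOFF : Status.READY;
        });
        Thread worker = new Thread(holder[0], "polling-runner-sketch");
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { holder[0].readyCount.get(), holder[0].backoffCount.get() };
    }

    public static void main(String[] args) {
        int[] counts = runDemo(4);
        System.out.println("ready=" + counts[0] + " backoff=" + counts[1]);
    }
}
```

An event-driven source needs no such loop, which is why its runner only initializes the processor and starts the source.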

That completes the assembly of sourceRunnerMap. In the startAllComponents method of Application, materializedConfiguration.getSourceRunners() returns all the SourceRunners, which are handed to supervisor.supervise for execution; this leads to SourceRunner.start(), described above, and the source is now running. To deliver a batch of events or a single event to the channels, the source calls ChannelProcessor.processEventBatch(List<Event> events) or ChannelProcessor.processEvent(Event event). Both methods begin by calling interceptorChain.intercept(events) or interceptorChain.intercept(event) to add headers to the events (if there are several interceptors, each event passes through all of them in turn). A source obtains its ChannelProcessor by calling getChannelProcessor() directly, which is implemented in AbstractSource, the parent class of all sources. Here is the code of processEventBatch(List<Event> events):

    public void processEventBatch(List<Event> events) {
      Preconditions.checkNotNull(events, "Event list must not be null");

      events = interceptorChain.intercept(events);

      // each required channel and the events to send to it
      Map<Channel, List<Event>> reqChannelQueue =
          new LinkedHashMap<Channel, List<Event>>();

      // each optional channel and the events to send to it
      Map<Channel, List<Event>> optChannelQueue =
          new LinkedHashMap<Channel, List<Event>>();

      for (Event event : events) {
        // all required channels for this event
        List<Channel> reqChannels = selector.getRequiredChannels(event);

        for (Channel ch : reqChannels) {
          List<Event> eventQueue = reqChannelQueue.get(ch);
          if (eventQueue == null) {
            eventQueue = new ArrayList<Event>();
            reqChannelQueue.put(ch, eventQueue);
          }
          eventQueue.add(event);        // queue the event for this channel
        }

        // all optional channels for this event
        List<Channel> optChannels = selector.getOptionalChannels(event);

        for (Channel ch: optChannels) {
          List<Event> eventQueue = optChannelQueue.get(ch);
          if (eventQueue == null) {
            eventQueue = new ArrayList<Event>();
            optChannelQueue.put(ch, eventQueue);
          }

          eventQueue.add(event);        // queue the event for this channel
        }
      }

      // Process required channels
      for (Channel reqChannel : reqChannelQueue.keySet()) {
        Transaction tx = reqChannel.getTransaction();    // obtain the transaction
        Preconditions.checkNotNull(tx, "Transaction object must not be null");
        try {
          tx.begin();

          List<Event> batch = reqChannelQueue.get(reqChannel);

          for (Event event : batch) {        // put each event on the required channel
            reqChannel.put(event);
          }

          tx.commit();
        } catch (Throwable t) {
          tx.rollback();    // roll back the transaction
          if (t instanceof Error) {
            LOG.error("Error while writing to required channel: " +
                reqChannel, t);
            throw (Error) t;
          } else {
            throw new ChannelException("Unable to put batch on required " +
                "channel: " + reqChannel, t);
          }
        } finally {
          if (tx != null) {
            tx.close();
          }
        }
      }

      // ... (the excerpt stops here; the optional channels in optChannelQueue are handled next)
    }

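The code above is not complicated. For every event it looks up the required channels and the optional channels, and for each channel it collects the events bound for that channel into a list mapped to the channel; it then walks both kinds of channel and sends each channel its events. The method could arguably be friendlier: since its parameter is already a list, one might drop the inner per-event bookkeeping and use the incoming List<Event> events directly as the eventQueue in reqChannelQueue.put(ch, eventQueue) and optChannelQueue.put(ch, eventQueue). That shortcut is only valid when the selector returns the same channels for every event, as ReplicatingChannelSelector does; with MultiplexingChannelSelector the channels depend on each event's header, so the per-event loop is still needed.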
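The grouping step of processEventBatch can be written compactly as below. This is a sketch under assumptions, not Flume code: events and channels are plain strings, the selector is a function from an event to channel names, and BatchGroupingSketch, groupByChannel and demo are hypothetical names. Because the selector is consulted once per event, different events may end up in different channel queues.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of building one event queue per channel.
class BatchGroupingSketch {

    // For each event, ask the selector for its channels and append the event
    // to each of those channels' queues. LinkedHashMap keeps first-seen channel order.
    static <E> Map<String, List<E>> groupByChannel(List<E> events,
                                                   Function<E, List<String>> selector) {
        Map<String, List<E>> queues = new LinkedHashMap<>();
        for (E event : events) {
            for (String channel : selector.apply(event)) {
                queues.computeIfAbsent(channel, k -> new ArrayList<>()).add(event);
            }
        }
        return queues;
    }

    // Demo selector: "error" events go to c1 and c2, everything else only to c2.
    static Map<String, List<String>> demo() {
        List<String> events = Arrays.asList("error: disk", "info: ok", "error: net");
        Function<String, List<String>> selector = e ->
            e.startsWith("error") ? Arrays.asList("c1", "c2") : Arrays.asList("c2");
        return groupByChannel(events, selector);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the demo, channel c1 receives only the two error events while c2 receives all three, which is the situation a multiplexing selector produces.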

The processEvent(Event event) method is simpler still: it sends the single event to every channel in both channel lists.

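Note that sending to a channel always involves obtaining a transaction (getTransaction()), beginning it (tx.begin()), committing it (tx.commit()), rolling it back on failure (tx.rollback()) and closing it (tx.close()); these steps are mandatory. In a sink they must be called explicitly, whereas on the source side they are wrapped inside processEvent and processEventBatch, so the source does not call them explicitly, but they are still called.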
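The begin / put / commit / rollback / close sequence can be illustrated with a toy channel. This is an assumption-laden sketch, not Flume's Channel or Transaction API: ToyChannel, putBatch and the capacity limit are hypothetical, and failure is reported with a boolean here rather than an exception.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy channel: puts are buffered and only become visible on commit.
class TransactionSketch {

    static final class ToyChannel {
        final List<String> committed = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();
        private final int capacity;
        int closeCount = 0;

        ToyChannel(int capacity) {
            this.capacity = capacity;
        }

        void begin() { pending.clear(); }

        void put(String event) {
            if (committed.size() + pending.size() >= capacity) {
                throw new IllegalStateException("channel full");
            }
            pending.add(event);
        }

        void commit() { committed.addAll(pending); pending.clear(); }

        void rollback() { pending.clear(); }

        void close() { closeCount++; }
    }

    // All-or-nothing delivery of a batch: commit if every put succeeds,
    // roll back otherwise, and always close.
    static boolean putBatch(ToyChannel channel, List<String> batch) {
        channel.begin();
        try {
            for (String event : batch) {
                channel.put(event);
            }
            channel.commit();
            return true;
        } catch (RuntimeException e) {
            channel.rollback();
            return false;
        } finally {
            channel.close();
        }
    }

    public static void main(String[] args) {
        ToyChannel roomy = new ToyChannel(2);
        System.out.println(putBatch(roomy, Arrays.asList("a", "b")) + " " + roomy.committed);

        ToyChannel tight = new ToyChannel(2);
        System.out.println(putBatch(tight, Arrays.asList("a", "b", "c")) + " " + tight.committed);
    }
}
```

The second call in main fails on the third put, so nothing from that batch is committed; this is the property the transaction wrapping provides.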

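That concludes the walk-through of how a SourceRunner is configured, initialized and executed. The interceptor and selector settings seen in the configuration file are all configured and executed here. With this understood, writing a custom source component should be that much easier.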

More to come, stay tuned!

Reposted from: https://www.cnblogs.com/lxf20061900/p/3751252.html
