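This is the third article in the live migration source analysis series. The previous article, "Openstack liberty instance migration source analysis: live migration 2", covered how nova-compute handles the prepare phase. This article walks through what happens in the execute phase.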

The execute phase

# nova/compute/manager.py: ComputeManager._do_live_migration
def _do_live_migration(self, context, dest, instance, block_migration,
                       migration, migrate_data):
    # The prepare-phase code is omitted here; see the previous post
    # for its analysis.

    # After the `prepare` phase completes it returns a dict like this:
    # pre_migration_data: {'graphics_listen_addrs': {}, 'volume': {},
    #                      'serial_listen_addr': {}}
    migrate_data['pre_live_migration_result'] = pre_migration_data

    # Update the `nova.migrations` table: set the status to running
    if migration:
        migration.status = 'running'
        migration.save()

    migrate_data['migration'] = migration
    try:
        # Call the virt driver (LibvirtDriver) to perform the
        # migration; analyzed below
        self.driver.live_migration(context, instance, dest,
                                   self._post_live_migration,
                                   self._rollback_live_migration,
                                   block_migration, migrate_data)
    except Exception:
        # Executing live migration
        # live_migration might raises exceptions, but
        # nothing must be recovered in this version.
        LOG.exception(_LE('Live migration failed.'), instance=instance)
        # The migration failed: set the status in `nova.migrations`
        # to failed and re-raise the exception
        with excutils.save_and_reraise_exception():
            if migration:
                migration.status = 'failed'
                migration.save()
---------------------------------------------------------------
# Continued from above: nova/virt/libvirt/driver.py: LibvirtDriver.live_migration
def live_migration(self, context, instance, dest,
                   post_method, recover_method, block_migration=False,
                   migrate_data=None):
    """Spawning live_migration operation for distributing
    high-load.
    """
    # 'dest' will be substituted into 'migration_uri' so ensure
    # it doesn't contain any characters that could be used to
    # exploit the URI accepted by libvirt

    # Validate the destination host name. Only word characters
    # (which include _), -, . and : are allowed
    if not libvirt_utils.is_valid_hostname(dest):
        raise exception.InvalidHostname(hostname=dest)

    # Analyzed below
    self._live_migration(context, instance, dest,
                         post_method, recover_method, block_migration,
                         migrate_data)
---------------------------------------------------------------
# Continued from above:
def _live_migration(self, context, instance, dest, post_method,
                    recover_method, block_migration,
                    migrate_data):
    """Do live migration.

    This fires off a new thread to run the blocking migration
    operation, and then this thread monitors the progress of
    migration and controls its operation
    """
    # Look up the instance's virDomain object through libvirt and
    # return the Guest object that wraps it
    guest = self._host.get_guest(instance)

    # TODO(sahid): We are converting all calls from a
    # virDomain object to use nova.virt.libvirt.Guest.
    # We should be able to remove dom at the end.
    dom = guest._domain

    # Start a new thread that runs the blocking migration operation;
    # analyzed below
    opthread = utils.spawn(self._live_migration_operation,
                           context, instance, dest,
                           block_migration,
                           migrate_data, dom)

    # Create an event and link it to the operation thread. The
    # monitor uses this event to learn that the thread has finished
    finish_event = eventlet.event.Event()

    def thread_finished(thread, event):
        LOG.debug("Migration operation thread notification",
                  instance=instance)
        event.send()
    opthread.link(thread_finished, finish_event)

    # Let eventlet schedule the new thread right away
    time.sleep(0)

    # Exception handling omitted: any exception is re-raised.
    # The monitor itself is analyzed below
    self._live_migration_monitor(context, instance, guest, dest,
                                 post_method, recover_method,
                                 block_migration, migrate_data,
                                 dom, finish_event)
    # Log completion
    LOG.debug("Live migration monitoring is all done",
              instance=instance)
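The worker-plus-monitor arrangement in `_live_migration` can be tried out in isolation. The following is a minimal sketch that uses the standard library's `threading` in place of eventlet, so `run_with_monitor`, `worker` and `poll_interval` are illustrative names and not part of Nova. It only shows how the finish event connects the two sides; the real monitor also polls libvirt for job information on every iteration.

```python
import threading
import time


def run_with_monitor(operation, poll_interval=0.01):
    """Run `operation` in a worker thread and poll until it signals completion."""
    finish_event = threading.Event()
    outcome = {}

    def worker():
        try:
            outcome["result"] = operation()
        except Exception as exc:
            # Keep the failure so the monitoring side can inspect it.
            outcome["error"] = exc
        finally:
            # Plays the role of thread_finished(): notify the monitor.
            finish_event.set()

    thread = threading.Thread(target=worker)
    thread.start()

    polls = 0
    # Plays the role of _live_migration_monitor(): loop until signalled.
    while not finish_event.is_set():
        polls += 1
        time.sleep(poll_interval)
    thread.join()
    return outcome, polls
```

Calling `run_with_monitor(lambda: 42)` returns the outcome dictionary together with the number of polling rounds the monitor needed.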

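Summary: the flow above is straightforward. Nova updates the migration status and validates the destination host name, then spawns a thread that runs the blocking migration operation and uses an event to find out when that thread has finished.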
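To make the host name check concrete, here is a small sketch of a validator that accepts only the characters listed in the analysis (word characters, `-`, `.` and `:`). The regular expression and the helper `build_migration_uri` are assumptions written for this illustration, and the URI template is just an example value; consult `libvirt_utils.is_valid_hostname` for the actual implementation.

```python
import re

# Assumed pattern: one or more word characters, '-', '.' or ':'.
_HOSTNAME_RE = re.compile(r"[\w\-.:]+")


def is_valid_hostname(hostname):
    """Return True when every character of `hostname` is in the allowed set."""
    return _HOSTNAME_RE.fullmatch(hostname) is not None


def build_migration_uri(template, dest):
    """Substitute `dest` into `template` only after it passed validation."""
    if not is_valid_hostname(dest):
        raise ValueError("invalid destination host name: %r" % dest)
    return template % dest
```

Validating before the `%` substitution matters because the destination ends up inside the URI handed to libvirt, so a value such as `host/evil` must be rejected up front.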

The blocking migration operation

As shown above, the function run by the migration thread is _live_migration_operation. Here it is in detail:

def _live_migration_operation(self, context, instance, dest,
                              block_migration, migrate_data, dom):
    """Invoke the live migration operation

    This method is intended to be run in a background thread and will
    block that thread until the migration is finished or failed.
    """
    guest = libvirt_guest.Guest(dom)

    # The surrounding try/except is omitted: when an exception occurs
    # it logs the error and re-raises.

    # Read the migration flags from the configuration. In my example
    # block_migration=False and
    # live_migration_flag="VIR_MIGRATE_UNDEFINE_SOURCE,
    # VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_PERSIST_DEST"
    if block_migration:
        flaglist = CONF.libvirt.block_migration_flag.split(',')
    else:
        flaglist = CONF.libvirt.live_migration_flag.split(',')
    # Map the flag names to libvirt constants and OR them together
    flagvals = [getattr(libvirt, x.strip()) for x in flaglist]
    logical_sum = six.moves.reduce(lambda x, y: x | y, flagvals)

    # pre_live_migrate_data was set during the `prepare` phase
    pre_live_migrate_data = (migrate_data or {}).get(
        'pre_live_migration_result', {})
    # Graphics (VNC/SPICE) listen addresses
    listen_addrs = pre_live_migrate_data.get('graphics_listen_addrs')
    # Volume information
    volume = pre_live_migrate_data.get('volume')
    # Serial console listen address
    serial_listen_addr = pre_live_migrate_data.get('serial_listen_addr')

    # Check whether libvirt supports VIR_DOMAIN_XML_MIGRATABLE
    migratable_flag = getattr(libvirt, 'VIR_DOMAIN_XML_MIGRATABLE', None)

    # Taken when VIR_DOMAIN_XML_MIGRATABLE is not supported, or when
    # there are no graphics listen addresses and no volumes
    if (migratable_flag is None or
            (listen_addrs is None and not volume)):
        # TODO(alexs-h): These checks could be moved to the
        # check_can_live_migrate_destination/source phase

        # Raise unless the configured VNC or SPICE listen address is
        # one of ('0.0.0.0', '127.0.0.1', '::', '::1')
        self._check_graphics_addresses_can_live_migrate(listen_addrs)
        # Make sure CONF.serial_console.enabled=False
        self._verify_serial_console_is_disabled()
        # libvirt carries out the migration
        dom.migrateToURI(CONF.libvirt.live_migration_uri % dest,
                         logical_sum,
                         None,
                         CONF.libvirt.live_migration_bandwidth)
    else:
        # Dump the migratable XML first, then merge in the volume,
        # graphics and serial information to build the new XML
        old_xml_str = guest.get_xml_desc(dump_migratable=True)
        new_xml_str = self._update_xml(old_xml_str,
                                       volume,
                                       listen_addrs,
                                       serial_listen_addr)
        try:
            # libvirt carries out the migration
            dom.migrateToURI2(CONF.libvirt.live_migration_uri % dest,
                              None,
                              new_xml_str,
                              logical_sum,
                              None,
                              CONF.libvirt.live_migration_bandwidth)
        except libvirt.libvirtError as ex:
            # NOTE(mriedem): There is a bug in older versions of
            # libvirt where the VIR_DOMAIN_XML_MIGRATABLE flag causes
            # virDomainDefCheckABIStability to not compare the source
            # and target domain xml's correctly for the CPU model.
            # We try to handle that error here and attempt the legacy
            # migrateToURI path, which could fail if the console
            # addresses are not correct, but in that case we have the
            # _check_graphics_addresses_can_live_migrate check in place
            # to catch it.
            # TODO(mriedem): Remove this workaround when
            # Red Hat BZ #1141838 is closed.

            # On VIR_ERR_CONFIG_UNSUPPORTED retry with the legacy
            # migrateToURI call; re-raise any other error
            error_code = ex.get_error_code()
            if error_code == libvirt.VIR_ERR_CONFIG_UNSUPPORTED:
                LOG.warn(_LW('An error occurred trying to live '
                             'migrate. Falling back to legacy live '
                             'migrate flow. Error: %s'), ex,
                         instance=instance)
                self._check_graphics_addresses_can_live_migrate(
                    listen_addrs)
                self._verify_serial_console_is_disabled()
                dom.migrateToURI(CONF.libvirt.live_migration_uri % dest,
                                 logical_sum,
                                 None,
                                 CONF.libvirt.live_migration_bandwidth)
            else:
                raise

    # The migration has ended; log it
    LOG.debug("Migration operation thread has finished",
              instance=instance)
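The flag handling at the top of `_live_migration_operation` turns a comma-separated string of constant names into a single bitmask. The sketch below reproduces that step without requiring libvirt to be installed: `fake_libvirt` is a stand-in namespace invented for this example, and the numeric values are assumed to mirror the corresponding `VIR_MIGRATE_*` constants in libvirt.

```python
from functools import reduce
from types import SimpleNamespace

# Stand-in for the real `libvirt` module (assumption for illustration).
fake_libvirt = SimpleNamespace(
    VIR_MIGRATE_LIVE=1,
    VIR_MIGRATE_PEER2PEER=2,
    VIR_MIGRATE_PERSIST_DEST=8,
    VIR_MIGRATE_UNDEFINE_SOURCE=16,
)


def flags_to_bitmask(flag_string, module=fake_libvirt):
    """Resolve each flag name on `module` and OR the values together."""
    names = [name.strip() for name in flag_string.split(",")]
    values = [getattr(module, name) for name in names]
    return reduce(lambda x, y: x | y, values)
```

Because the names are resolved with `getattr`, a misspelled flag in the configuration surfaces as an `AttributeError` instead of being silently ignored.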

Summary: the method assembles the parameters and runs the precondition checks, then hands the actual migration over to libvirt.
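The choice between the two libvirt calls comes down to one boolean condition. As a rough sketch, it can be isolated in a pure function; `choose_migration_call` is a hypothetical helper written for this post, and it returns the name of the call instead of performing it.

```python
def choose_migration_call(has_migratable_flag, listen_addrs, volume):
    """Return the name of the libvirt call the analyzed code would use first."""
    # Legacy path: no VIR_DOMAIN_XML_MIGRATABLE support, or nothing
    # (no listen addresses and no volumes) to rewrite in the domain XML.
    if not has_migratable_flag or (listen_addrs is None and not volume):
        return "migrateToURI"
    # Otherwise an updated XML is passed along through migrateToURI2.
    return "migrateToURI2"
```

Note that an empty dictionary for `listen_addrs` is not `None`, so the prepare result shown earlier (`'graphics_listen_addrs': {}`) still selects `migrateToURI2` when the flag is available.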

Monitoring the migration state

def _live_migration_monitor(self, context, instance, guest,
                            dest, post_method,
                            recover_method, block_migration,
                            migrate_data, dom, finish_event):
    # Amount of data to migrate: the memory size from the flavor plus
    # the disk size read from the instance. Cinder volumes on shared
    # storage back ends (e.g. NFS, RBD) do not need to be copied; only
    # local LVM block devices and local raw/qcow2 files do.
    data_gb = self._live_migration_data_gb(instance, guest,
                                           block_migration)
    # Steps for ramping up to the maximum permitted switchover downtime
    downtime_steps = list(self._migration_downtime_steps(data_gb))
    # Longest time the migration may run before it is aborted
    completion_timeout = int(
        CONF.libvirt.live_migration_completion_timeout * data_gb)
    # Longest time to wait for the progress to advance
    progress_timeout = CONF.libvirt.live_migration_progress_timeout

    # What follows is a long if/else chain that acts on the current
    # state of the migration
    n = 0
    start = time.time()
    progress_time = start
    progress_watermark = None
    while True:
        # Fetch the job information of the domain
        info = host.DomainJobInfo.for_domain(dom)
        if info.type == libvirt.VIR_DOMAIN_JOB_NONE:
            # This type can stand for three states:
            # 1. The migration has not started yet; this shows in the
            #    operation thread still running
            # 2. The migration stopped because it failed
            # 3. The migration stopped because it completed
            # States 2 and 3 are told apart by checking whether the
            # instance is still running on this host

            # The job has not started yet
            if not finish_event.ready():
                LOG.debug("Operation thread is still running",
                          instance=instance)
            else:
                # Querying the domain state may raise; see the except
                # branch below
                try:
                    # Still running on this host: the migration failed
                    if guest.is_active():
                        LOG.debug("VM running on src, migration failed",
                                  instance=instance)
                        info.type = libvirt.VIR_DOMAIN_JOB_FAILED
                    # Otherwise the migration completed
                    else:
                        LOG.debug("VM is shutoff, migration finished",
                                  instance=instance)
                        info.type = libvirt.VIR_DOMAIN_JOB_COMPLETED
                except libvirt.libvirtError as ex:
                    LOG.debug("Error checking domain status %(ex)s",
                              ex, instance=instance)
                    # "No such domain" means the migration completed
                    if ex.get_error_code() == libvirt.VIR_ERR_NO_DOMAIN:
                        LOG.debug("VM is missing, migration finished",
                                  instance=instance)
                        info.type = libvirt.VIR_DOMAIN_JOB_COMPLETED
                    # Any other error means the migration failed
                    else:
                        LOG.info(_LI("Error %(ex)s, migration failed"),
                                 instance=instance)
                        info.type = libvirt.VIR_DOMAIN_JOB_FAILED

        # The migration has not started yet
        if info.type == libvirt.VIR_DOMAIN_JOB_NONE:
            LOG.debug("Migration not running yet",
                      instance=instance)
        # The migration is in progress
        elif info.type == libvirt.VIR_DOMAIN_JOB_UNBOUNDED:
            now = time.time()
            elapsed = now - start
            abort = False

            # Record the new low-water mark whenever progress was made
            if ((progress_watermark is None) or
                    (progress_watermark > info.data_remaining)):
                progress_watermark = info.data_remaining
                progress_time = now

            # Abort when no progress was made for longer than the
            # configured timeout
            if (progress_timeout != 0 and
                    (now - progress_time) > progress_timeout):
                LOG.warn(_LW("Live migration stuck for %d sec"),
                         (now - progress_time), instance=instance)
                abort = True

            # Abort when the migration has run longer than permitted
            if (completion_timeout != 0 and
                    elapsed > completion_timeout):
                LOG.warn(_LW("Live migration not completed after %d sec"),
                         completion_timeout, instance=instance)
                abort = True

            # Abort the migration job
            if abort:
                try:
                    dom.abortJob()
                except libvirt.libvirtError as e:
                    LOG.warn(_LW("Failed to abort migration %s"),
                             e, instance=instance)
                    raise

            # See if we need to increase the max downtime. We
            # ignore failures, since we'd rather continue trying
            # to migrate
            if (len(downtime_steps) > 0 and
                    elapsed > downtime_steps[0][0]):
                downtime = downtime_steps.pop(0)
                LOG.info(_LI("Increasing downtime to %(downtime)d ms "
                             "after %(waittime)d sec elapsed time"),
                         {"downtime": downtime[1],
                          "waittime": downtime[0]},
                         instance=instance)
                try:
                    dom.migrateSetMaxDowntime(downtime[1])
                except libvirt.libvirtError as e:
                    LOG.warn(_LW("Unable to increase max downtime "
                                 "to %(time)d ms: %(e)s"),
                             {"time": downtime[1], "e": e},
                             instance=instance)

            # Every 5 s: update the progress and log at debug level
            if (n % 10) == 0:
                remaining = 100
                if info.memory_total != 0:
                    remaining = round(info.memory_remaining *
                                      100 / info.memory_total)
                instance.progress = 100 - remaining
                instance.save()

                # Every 30 s: log at info level instead
                lg = LOG.debug
                if (n % 60) == 0:
                    lg = LOG.info
                # The log statement itself is omitted here

            n = n + 1
        # The migration completed
        elif info.type == libvirt.VIR_DOMAIN_JOB_COMPLETED:
            # Call ComputeManager._post_live_migration to clean up;
            # analyzed in a later post
            post_method(context, instance, dest, block_migration,
                        migrate_data)
            break
        # The migration failed
        elif info.type == libvirt.VIR_DOMAIN_JOB_FAILED:
            # Call ComputeManager._rollback_live_migration to roll back
            recover_method(context, instance, dest, block_migration,
                           migrate_data)
            break
        # The migration was cancelled
        elif info.type == libvirt.VIR_DOMAIN_JOB_CANCELLED:
            # Call ComputeManager._rollback_live_migration to roll back
            recover_method(context, instance, dest, block_migration,
                           migrate_data)
            break
        else:
            LOG.warn(_LW("Unexpected migration job type: %d"),
                     info.type, instance=instance)

        # Sleep 0.5 s, then loop again
        time.sleep(0.5)
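The two abort conditions in the monitor loop are easier to reason about when separated from the libvirt calls. The sketch below is an illustrative re-statement of that logic with explicit timestamps; the class `ProgressTracker` and its method name are hypothetical and do not exist in Nova.

```python
class ProgressTracker:
    """Track the low-water mark of remaining data and decide when to abort."""

    def __init__(self, start, progress_timeout, completion_timeout):
        self.start = start
        self.progress_timeout = progress_timeout      # 0 disables the check
        self.completion_timeout = completion_timeout  # 0 disables the check
        self.watermark = None
        self.progress_time = start

    def should_abort(self, now, data_remaining):
        # Progress means the remaining data dropped below the watermark.
        if self.watermark is None or self.watermark > data_remaining:
            self.watermark = data_remaining
            self.progress_time = now

        stuck = (self.progress_timeout != 0 and
                 (now - self.progress_time) > self.progress_timeout)
        too_long = (self.completion_timeout != 0 and
                    (now - self.start) > self.completion_timeout)
        return stuck or too_long
```

With `ProgressTracker(0, 150, 800)`, a migration whose remaining data last dropped at t=100 is aborted once a poll arrives after t=250, and any migration is aborted after t=800 regardless of progress.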

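Summary: one large loop keeps polling the migration state and exits if an error occurs. When the migration completes it calls _post_live_migration to clean up; when the migration fails or is cancelled it calls _rollback_live_migration to roll back.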
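The trickiest part of the loop is interpreting VIR_DOMAIN_JOB_NONE, which covers "not started", "failed" and "completed". A minimal sketch of that decision follows, assuming a hypothetical `DomainGone` exception that stands in for a `libvirtError` carrying VIR_ERR_NO_DOMAIN; `resolve_job_none` is likewise an illustrative name.

```python
class DomainGone(Exception):
    """Stand-in for libvirtError with code VIR_ERR_NO_DOMAIN (assumption)."""


def resolve_job_none(thread_finished, is_active):
    """Map the ambiguous 'no job' state to not_started/failed/completed.

    `is_active` is a callable that reports whether the guest still runs
    on the source host; it may raise, as the libvirt query can.
    """
    if not thread_finished:
        return "not_started"
    try:
        # Still running on the source host means the migration failed.
        return "failed" if is_active() else "completed"
    except DomainGone:
        # The domain vanished from the source: it moved successfully.
        return "completed"
    except Exception:
        return "failed"
```

For example, `resolve_job_none(True, lambda: False)` yields `"completed"`, which is the case where the operation thread has finished and the guest is shut off on the source host.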

The next post will analyze the complete phase. Stay tuned!
