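System Upgrade Software Flow

This chapter walks through the Recovery system upgrade flow with reference to the source code. Technical difficulties and details that come up along the way are covered in separate articles, linked at the relevant points in the text.

The figure below shows the whole software flow, from the app detecting an OTA package pushed by the server to the device booting into the new system version.
The article explains each module that appears in the figure in detail.

Software flow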

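  • System Upgrade Software Flow
    • 1. App downloads the upgrade package and calls the RecoverySystem interface
    • 2. Framework RecoverySystem triggers the upgrade
      • 2.1 Build the Recovery upgrade command
      • 2.2 Write the upgrade command to the BCB through RECOVERY_SERVICE
      • 2.3 Trigger a device reboot through POWER_SERVICE
    • 3. BootLoader reads the BCB and boots into the Recovery system
    • 4. Kernel loads the ramdisk, starts init, and launches the recovery process
    • 5. Enter the Recovery upgrade flow
      • 5.1 Read the upgrade command from the BCB in the misc partition
      • 5.2 mmap the upgrade package into memory
      • 5.3 Verify the integrity and authenticity of the upgrade package
      • 5.4 Fork the update-binary child process to upgrade the system
      • 5.5 Exit the update-binary child process, save logs, and erase the BCB in the misc partition
      • 5.6 Reboot the device back to the main system
    • 6. BootLoader boots the main system
    • 7. Init starts the flash_recovery service to upgrade the recovery partition
      • 7.1 What is flash_recovery for?
      • 7.2 When is the flash_recovery service started?
      • 7.3 How does flash_recovery upgrade the recovery partition?
    • 8. Boot to the launcher; the upgrade flow ends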

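1. App downloads the upgrade package and calls the RecoverySystem interface

The logic for detecting an OTA push and downloading the upgrade package from the server is implemented by each OEM, so the analysis below starts from the point where the upgrade is triggered.
Once the app has downloaded the upgrade package, it calls the installPackage interface of the framework RecoverySystem class with the path of the downloaded package. The app's job ends there.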

// android.os.RecoverySystem
RecoverySystem.installPackage(Context context, File packageFile)

2. Framework RecoverySystem triggers the upgrade

Google AOSP RecoverySystem
Function installPackage()

// frameworks/base/core/java/android/os/RecoverySystem.java
public static void installPackage(Context context, File packageFile, boolean processed)
        throws IOException {
    synchronized (sRequestLock) {
        /* 1. Build the recovery upgrade command in its fixed format */
        LOG_FILE.delete();
        // Must delete the file in case it was created by system server.
        UNCRYPT_PACKAGE_FILE.delete();

        String filename = packageFile.getCanonicalPath();
        Log.w(TAG, "!!! REBOOTING TO INSTALL " + filename + " !!!");

        // If the package name ends with "_s.zip", it's a security update.
        boolean securityUpdate = filename.endsWith("_s.zip");

        // A package stored on the data partition needs special handling; the reason
        // and the mechanism are explained below.
        // If the package is on the /data partition, the package needs to
        // be processed (i.e. uncrypt'd). The caller specifies if that has
        // been done in 'processed' parameter.
        if (filename.startsWith("/data/")) {
            // If the package has already been processed, only check that the
            // output file exists.
            if (processed) {
                if (!BLOCK_MAP_FILE.exists()) {
                    Log.e(TAG, "Package claimed to have been processed but failed to find "
                            + "the block map file.");
                    throw new IOException("Failed to find block map file");
                }
            } else {
                // Preprocessing is done by the uncryptd service. Its input is the file
                // UNCRYPT_PACKAGE_FILE and its output is the file BLOCK_MAP_FILE; both
                // files are initialized here. uncryptd is described below.
                FileWriter uncryptFile = new FileWriter(UNCRYPT_PACKAGE_FILE);
                try {
                    uncryptFile.write(filename + "\n");
                } finally {
                    uncryptFile.close();
                }
                // UNCRYPT_PACKAGE_FILE needs to be readable and writable
                // by system server.
                if (!UNCRYPT_PACKAGE_FILE.setReadable(true, false)
                        || !UNCRYPT_PACKAGE_FILE.setWritable(true, false)) {
                    Log.e(TAG, "Error setting permission for " + UNCRYPT_PACKAGE_FILE);
                }

                BLOCK_MAP_FILE.delete();
            }

            // For a preprocessed package the argument becomes
            // "@" + BLOCK_MAP_FILE (/cache/recovery/block.map). This is purely a
            // convention; the mechanism is covered in the uncryptd discussion below.
            // If the package is on the /data partition, use the block map
            // file as the package name instead.
            filename = "@/cache/recovery/block.map";
        }

        final String filenameArg = "--update_package=" + filename + "\n";
        final String localeArg = "--locale=" + Locale.getDefault().toLanguageTag() + "\n";
        final String securityArg = "--security\n";

        String command = filenameArg + localeArg;
        if (securityUpdate) {
            command += securityArg;
        }

        /* 2. Write the upgrade command to the BCB (the head of the misc partition)
         *    through RECOVERY_SERVICE */
        RecoverySystem rs = (RecoverySystem) context.getSystemService(Context.RECOVERY_SERVICE);
        if (!rs.setupBcb(command)) {
            throw new IOException("Setup BCB failed");
        }

        /* 3. Trigger a reboot through POWER_SERVICE */
        // Having set up the BCB (bootloader control block), go ahead and reboot
        PowerManager pm = (PowerManager) context.getSystemService(Context.POWER_SERVICE);
        String reason = PowerManager.REBOOT_RECOVERY_UPDATE;

        // On TV, reboot quiescently if the screen is off
        if (context.getPackageManager().hasSystemFeature(PackageManager.FEATURE_LEANBACK)) {
            WindowManager wm = (WindowManager) context.getSystemService(Context.WINDOW_SERVICE);
            if (wm.getDefaultDisplay().getState() != Display.STATE_ON) {
                reason += ",quiescent";
            }
        }
        // Enter the shutdown/reboot flow
        pm.reboot(reason);

        throw new IOException("Reboot failed (no permissions?)");
    }
}

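In summary, installPackage does the following three things:

2.1 Build the Recovery upgrade command

The absolute path of the upgrade package tells installPackage whether the package sits on the data partition or on other storage (USB drive, TF card, and so on), and that decides whether the package must be preprocessed.
If preprocessing is needed, the value of the update_package argument is replaced with the fixed value "@/cache/recovery/block.map". The update_package, locale, and security arguments are then formatted into an upgrade command string with a fixed layout.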

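1) Why does uncryptd preprocess the upgrade package?
Data on the Android data partition is encrypted (FDE/FBE). AOSP recovery does not implement decryption for that partition, so recovery cannot read data-partition contents and cannot load the upgrade package directly from the data partition's file system. (In recovery mode, an FDE device cannot mount the data partition at all, and on an FBE device the file contents on the data partition appear as garbage.)

  • For this reason the package data is decrypted before the device enters recovery, and the package's storage layout on the medium is written to the fixed file /cache/recovery/block.map. Once in recovery, the package location is parsed from block.map and the package data can be loaded.
  • If the upgrade package is stored on an unencrypted TF card or USB drive, no extra processing is needed; recovery can load the data directly from that device's file system.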

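2) How does uncryptd preprocess the upgrade package?
Input: the file UNCRYPT_PACKAGE_FILE.
Output: the result is written to the file BLOCK_MAP_FILE (/cache/recovery/block.map).
At this point only the input and output files are prepared. The preprocessing itself runs when the device reboots in step 2.3, described below.
UNCRYPT_PACKAGE_FILE: holds the actual path of the upgrade package in the file system;
BLOCK_MAP_FILE: holds the block layout of the package data on the storage device (mainly the block numbers).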

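3) Each argument in the upgrade command string has the form --key=value or --key, and arguments are separated by newline characters.

4) Purpose of the processed parameter
uncryptd preprocesses (decrypts) the upgrade package during shutdown, and a large package makes shutdown noticeably slower. The package can therefore be preprocessed ahead of time and installPackage called with processed set to true. In that case uncryptd is not started during shutdown, so shutdown speed is unaffected.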
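
The string handling described in 2.1 can be sketched in a few lines of C++. This is only an illustration of the rules quoted above (the "_s.zip" suffix, the "/data/" prefix, and the newline-separated --key=value format); the helper name BuildRecoveryCommand and its signature are hypothetical and do not exist in AOSP.

```cpp
#include <string>

// Hypothetical helper mirroring the string logic of installPackage().
// The name and signature are illustrative only; they are not AOSP APIs.
std::string BuildRecoveryCommand(std::string package_path, const std::string& locale) {
  // A name ending in "_s.zip" marks a security update. The check is made on
  // the original path, before the path is rewritten below.
  const std::string suffix = "_s.zip";
  const bool security_update =
      package_path.size() >= suffix.size() &&
      package_path.compare(package_path.size() - suffix.size(), suffix.size(), suffix) == 0;

  // A package on /data is replaced by a reference to the block map file.
  if (package_path.rfind("/data/", 0) == 0) {
    package_path = "@/cache/recovery/block.map";
  }

  // One argument per line, each in --key=value or --key form.
  std::string command = "--update_package=" + package_path + "\n";
  command += "--locale=" + locale + "\n";
  if (security_update) {
    command += "--security\n";
  }
  return command;
}
```

For a package under /data the resulting string carries the block map reference rather than the real path, which is exactly what recovery later sees in the BCB.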

2.2 Write the upgrade command to the BCB through RECOVERY_SERVICE

The formatted command string is written to the BCB (bootloader control block) area at the head of the misc partition.

RecoverySystem

// frameworks/base/core/java/android/os/RecoverySystem.java
// 1. RecoverySystem calls setupBcb
public static void installPackage(Context context, File packageFile, boolean processed) {
    ...
    RecoverySystem rs = (RecoverySystem) context.getSystemService(Context.RECOVERY_SERVICE);
    rs.setupBcb(command);
    ...
}

// 2. Forward the call to the setupBcb interface of RecoverySystemService
private boolean setupBcb(String command) {
    return mService.setupBcb(command);
}

RecoverySystemService

// frameworks/base/services/core/java/com/android/server/recoverysystem/RecoverySystemService.java
// 1. Call setupOrClearBcb to write the upgrade command string into the BCB
public boolean setupBcb(String command) {
    if (DEBUG) Slog.d(TAG, "setupBcb: [" + command + "]");
    return setupOrClearBcb(true, command);
}

// 2. setupOrClearBcb starts the native service setup-bcb, which writes the
//    command string into the BCB
private boolean setupOrClearBcb(boolean isSetup, String command) {
    // 2.1 Check whether the uncrypt/setup-bcb/clear-bcb service is running.
    //     If one is still running, an earlier request is already in progress,
    //     so this operation is aborted.
    final boolean available = checkAndWaitForUncryptService();
    if (!available) {
        Slog.e(TAG, "uncrypt service is unavailable.");
        return false;
    }

    // 2.2 isSetup decides whether the BCB is written or erased, and a different
    //     service is started for each case. (uncrypt/setup-bcb/clear-bcb are in
    //     fact the same binary; different arguments select different tasks.
    //     See the explanation below.)
    if (isSetup) {
        mInjector.systemPropertiesSet("ctl.start", "setup-bcb");
    } else {
        mInjector.systemPropertiesSet("ctl.start", "clear-bcb");
    }

    // 2.3 After starting setup-bcb or clear-bcb, talk to it over a socket
    // Connect to the uncrypt service socket.
    UncryptSocket socket = mInjector.connectService();
    if (socket == null) {
        Slog.e(TAG, "Failed to connect to uncrypt socket");
        return false;
    }

    try {
        // When writing the BCB, send the upgrade command to setup-bcb over the socket
        // Send the BCB commands if it's to setup BCB.
        if (isSetup) {
            socket.sendCommand(command);
        }

        // Read the result of setup-bcb/clear-bcb from the socket
        // Read the status from the socket.
        int status = socket.getPercentageUncrypted();

        // Ack receipt of the status code. uncrypt waits for the ack so
        // the socket won't be destroyed before we receive the code.
        socket.sendAck();

        // setup-bcb/clear-bcb return 100 on success. (100 is simply the success
        // value defined by the service; here it carries no percentage meaning.)
        if (status == 100) {
            Slog.i(TAG, "uncrypt " + (isSetup ? "setup" : "clear")
                    + " bcb successfully finished.");
        } else {
            // Error in /system/bin/uncrypt.
            Slog.e(TAG, "uncrypt failed with status: " + status);
            return false;
        }
    } catch (IOException e) {
        Slog.e(TAG, "IOException when communicating with uncrypt:", e);
        return false;
    } finally {
        socket.close();
    }

    return true;
}

How the native services setup-bcb/clear-bcb write and erase the BCB data is not covered here; see the article: (to be continued)
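
As a rough picture of what ends up in the BCB, the sketch below fills an in-memory structure the way the recovery command is laid out: "boot-recovery" in the command field and "recovery\n" followed by one option per line in the recovery field. The field sizes are assumed from AOSP's bootloader_message.h (the Qualcomm bootloader shown later declares its own RecoveryMessage with different sizes), and MakeRecoveryBcb is a hypothetical helper, not the real setup-bcb code.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Field sizes are an assumption taken from AOSP's bootloader_message.h;
// vendor bootloaders may declare the BCB differently.
struct BootloaderMessage {
  char command[32];
  char status[32];
  char recovery[768];
  char stage[32];
  char reserved[1184];
};
static_assert(sizeof(BootloaderMessage) == 2048, "BCB is assumed to be 2 KiB");

// Hypothetical helper: builds the BCB image in memory from option strings.
// Writing it to the misc partition is left out of this sketch.
BootloaderMessage MakeRecoveryBcb(const std::vector<std::string>& options) {
  BootloaderMessage boot = {};  // zero-filled, so every field stays NUL-terminated
  std::strncpy(boot.command, "boot-recovery", sizeof(boot.command) - 1);

  std::string recovery = "recovery\n";
  for (const auto& opt : options) {
    recovery += opt;
    // Make sure every option ends with exactly one separator.
    if (opt.empty() || opt.back() != '\n') {
      recovery += '\n';
    }
  }
  std::strncpy(boot.recovery, recovery.c_str(), sizeof(boot.recovery) - 1);
  return boot;
}
```

The command field is what the bootloader inspects in section 3, and the recovery field is what the recovery process parses in section 5.1.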

2.3 Trigger a device reboot through POWER_SERVICE

The key point of this step is that uncryptd is started to preprocess the upgrade package.

RecoverySystem:

// frameworks/base/core/java/android/os/RecoverySystem.java
public static void installPackage(Context context, File packageFile, boolean processed) {
    ...
    PowerManager pm = (PowerManager) context.getSystemService(Context.POWER_SERVICE);
    String reason = PowerManager.REBOOT_RECOVERY_UPDATE;
    pm.reboot(reason);
}

PowerManager:

// frameworks/base/core/java/android/os/PowerManager.java
public void reboot(@Nullable String reason) {
    mService.reboot(false, reason, true);
}

PowerManagerService:

// frameworks/base/services/core/java/com/android/server/power/PowerManagerService.java
public void reboot(boolean confirm, @Nullable String reason, boolean wait) {
    shutdownOrRebootInternal(HALT_MODE_REBOOT, confirm, reason, wait);
}

private void shutdownOrRebootInternal(final @HaltMode int haltMode, final boolean confirm,
        @Nullable final String reason, boolean wait) {
    ...
    // Start the shutdown thread ShutdownThread
    Runnable runnable = new Runnable() {
        @Override
        public void run() {
            if (haltMode == HALT_MODE_REBOOT) {
                ShutdownThread.reboot(getUiContext(), reason, confirm);
            }
        }
    };

    // ShutdownThread must run on a looper capable of displaying the UI.
    Message msg = Message.obtain(UiThread.getHandler(), runnable);
    msg.setAsynchronous(true);
    UiThread.getHandler().sendMessage(msg);

    // PowerManager.reboot() is documented not to return so just wait for the inevitable.
    if (wait) {
        // wait() must be called while holding the monitor of the object
        synchronized (runnable) {
            while (true) {
                try {
                    runnable.wait();
                } catch (InterruptedException e) {
                }
            }
        }
    }
}

ShutdownThread:

// frameworks/base/services/core/java/com/android/server/power/ShutdownThread.java
public final class ShutdownThread extends Thread {
    private static final ShutdownThread sInstance = new ShutdownThread();

    // 1. Reboot the device
    public static void reboot(final Context context, String reason, boolean confirm) {
        mReboot = true;
        mRebootSafeMode = false;
        mRebootHasProgressBar = false;
        mReason = reason;
        shutdownInner(context, confirm);
    }

    // 2. Show the shutdown progress dialog (uncryptd takes a while to process the package)
    private static void shutdownInner(final Context context, boolean confirm) {
        // Pseudocode: the braces show what beginShutdownSequence() does internally
        beginShutdownSequence(context) {
            sInstance.mProgressDialog = showShutdownDialog(context);
            sInstance.start()
        }
    }

    // 3. As described above, if UNCRYPT_PACKAGE_FILE exists and BLOCK_MAP_FILE does not,
    //    uncryptd must preprocess the package. This reboot is then flagged as needing a
    //    progress dialog, and the same flag, mRebootHasProgressBar, later decides whether
    //    uncryptd is started.
    private static ProgressDialog showShutdownDialog(Context context) {
        // mReason could be "recovery-update" or "recovery-update,quiescent".
        if (mReason != null && mReason.startsWith(PowerManager.REBOOT_RECOVERY_UPDATE)) {
            // We need the progress bar if uncrypt will be invoked during the
            // reboot, which might be time-consuming.
            mRebootHasProgressBar = RecoverySystem.UNCRYPT_PACKAGE_FILE.exists()
                    && !(RecoverySystem.BLOCK_MAP_FILE.exists());
        }
        ...
    }

    // 4. The work done by the ShutdownThread thread
    /**
     * Makes sure we handle the shutdown gracefully.
     * Shuts off power regardless of radio state if the allotted time has passed.
     */
    public void run() {
        // Record the reason for this reboot
        {
            String reason = (mReboot ? "1" : "0") + (mReason != null ? mReason : "");
            SystemProperties.set(SHUTDOWN_ACTION_PROPERTY, reason);
        }

        // uncryptd starts preprocessing the upgrade package here
        if (mRebootHasProgressBar) {
            sInstance.setRebootProgress(MOUNT_SERVICE_STOP_PERCENT, null);

            // If it's to reboot to install an update and uncrypt hasn't been
            // done yet, trigger it now.
            uncrypt();
        }

        // Finally shut down or reboot
        rebootOrShutdown(mContext, mReboot, mReason);
    }

    // 5. Perform the low-level reboot
    public static void rebootOrShutdown(final Context context, boolean reboot, String reason) {
        if (reboot) {
            Log.i(TAG, "Rebooting, reason: " + reason);
            PowerManagerService.lowLevelReboot(reason);
            Log.e(TAG, "Reboot failed, will attempt shutdown instead");
            reason = null;
        }
        ...
    }
}

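Function uncrypt()
Starts uncryptd through RecoverySystem to preprocess the upgrade package, monitors the processing progress, and updates the progress bar shown in the dialog.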

// frameworks/base/services/core/java/com/android/server/power/ShutdownThread.java
private void uncrypt() {
    Log.i(TAG, "Calling uncrypt and monitoring the progress...");

    // Progress listener for uncryptd's preprocessing; it updates the value of the
    // shutdown progress bar
    final RecoverySystem.ProgressListener progressListener =
            new RecoverySystem.ProgressListener() {
        @Override
        public void onProgress(int status) {
            if (status >= 0 && status < 100) {
                // Scale down to [MOUNT_SERVICE_STOP_PERCENT, 100).
                status = (int)(status * (100.0 - MOUNT_SERVICE_STOP_PERCENT) / 100);
                status += MOUNT_SERVICE_STOP_PERCENT;
                CharSequence msg = mContext.getText(
                        com.android.internal.R.string.reboot_to_update_package);
                sInstance.setRebootProgress(status, msg);
            } else if (status == 100) {
                CharSequence msg = mContext.getText(
                        com.android.internal.R.string.reboot_to_update_reboot);
                sInstance.setRebootProgress(status, msg);
            } else {
                // Ignored
            }
        }
    };

    // Start uncryptd through the processPackage interface of RecoverySystem
    final boolean[] done = new boolean[1];
    done[0] = false;
    Thread t = new Thread() {
        @Override
        public void run() {
            RecoverySystem rs = (RecoverySystem) mContext.getSystemService(
                    Context.RECOVERY_SERVICE);
            String filename = null;
            try {
                filename = FileUtils.readTextFile(RecoverySystem.UNCRYPT_PACKAGE_FILE, 0, null);
                // Pass the UNCRYPT_PACKAGE_FILE prepared by RecoverySystem.installPackage
                // and the progress listener to processPackage. uncryptd uses the content
                // of UNCRYPT_PACKAGE_FILE as its input and reports progress through
                // progressListener.
                rs.processPackage(mContext, new File(filename), progressListener);
            } catch (IOException e) {
                Log.e(TAG, "Error uncrypting file", e);
            }
            done[0] = true;
        }
    };
    t.start();

    try {
        t.join(MAX_UNCRYPT_WAIT_TIME);
    } catch (InterruptedException unused) {
    }
    if (!done[0]) {
        Log.w(TAG, "Timed out waiting for uncrypt.");
        final int uncryptTimeoutError = 100;
        String timeoutMessage = String.format("uncrypt_time: %d\n" + "uncrypt_error: %d\n",
                MAX_UNCRYPT_WAIT_TIME / 1000, uncryptTimeoutError);
        try {
            FileUtils.stringToFile(RecoverySystem.UNCRYPT_STATUS_FILE, timeoutMessage);
        } catch (IOException e) {
            Log.e(TAG, "Failed to write timeout message to uncrypt status", e);
        }
    }
}

How uncryptd preprocesses the upgrade package is not covered here; see the article: (to be continued)
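
The progress scaling in onProgress() is easier to follow with concrete numbers. The sketch below reproduces the arithmetic in C++; it assumes MOUNT_SERVICE_STOP_PERCENT is 20, which is the value I believe ShutdownThread uses, and ScaleUncryptProgress is a hypothetical name introduced only for this illustration.

```cpp
// Assumption: ShutdownThread defines MOUNT_SERVICE_STOP_PERCENT as 20.
constexpr int kMountServiceStopPercent = 20;

// Hypothetical helper: maps the uncrypt progress [0, 100) onto the part of the
// shutdown progress bar that remains after the mount service has stopped.
// Returns -1 for values the listener would ignore.
int ScaleUncryptProgress(int status) {
  if (status >= 0 && status < 100) {
    // Same arithmetic as onProgress(): scale, truncate, then shift.
    int scaled = static_cast<int>(status * (100.0 - kMountServiceStopPercent) / 100);
    return scaled + kMountServiceStopPercent;
  }
  if (status == 100) {
    return 100;  // uncrypt finished; the bar is full
  }
  return -1;
}
```

Under that assumption an uncrypt progress of 50 is displayed as 60 on the shutdown progress bar, so the bar never moves backwards when uncrypt starts reporting.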

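3. BootLoader reads the BCB and boots into the Recovery system

The bootloader stage is not implemented in AOSP; its code is provided by the chip platform vendor. This section only outlines the BootLoader flow during an upgrade on the Qualcomm platform. Other platforms (MTK, Samsung Exynos) implement it differently, but the flow is essentially the same.
Function LinuxLoaderEntry (…)
The entry point from which the bootloader starts the kernel.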

// Entry point where the bootloader loads the Linux kernel
LinuxLoaderEntry (IN EFI_HANDLE ImageHandle, IN EFI_SYSTEM_TABLE *SystemTable) {
  // 1. Determine the boot mode from the boot reason.
  //    A plain "adb reboot recovery" is resolved to recovery mode here,
  //    whereas an upgrade is decided by step 2 below.
  Status = GetRebootReason (&BootReason);

  // 2. Read the BCB from the misc partition to decide whether to enter Recovery
  Status = RecoveryInit (&BootIntoRecovery);

  if (!BootIntoFastboot) {
    BootInfo Info = {0};
    // 3. Set the boot parameters.
    //    If BootIntoRecovery is true, boot into the Recovery system;
    //    otherwise boot into the main system.
    Info.MultiSlotBoot = MultiSlotBoot;
    Info.BootIntoRecovery = BootIntoRecovery;
    Info.BootReasonAlarm = BootReasonAlarm;
    // 4. Verify the signature of the partition image
    Status = LoadImageAndAuth (&Info);
    // 5. Load the kernel from storage into memory and jump to it
    BootLinux (&Info);
  }
}

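Function RecoveryInit (…)
Purpose: decide from the BCB content in the misc partition whether to boot into recovery mode.
Implementation: RecoveryInit reads the raw data at the head of the misc partition straight into a RecoveryMessage structure (RecoveryMessage is the in-memory representation of the BCB stored on the device). It then checks whether the command field begins with the string "boot-recovery" to choose between the recovery system and the main system. (As shown earlier, the BCB data at the head of the misc partition is written by the framework RecoverySystem class through the setup-bcb service.)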

#define RECOVERY_BOOT_RECOVERY "boot-recovery"

/* Recovery Message */
struct RecoveryMessage {
  CHAR8 command[32];
  CHAR8 status[32];
  CHAR8 recovery[1024];
};

EFI_STATUS
RecoveryInit (BOOLEAN *BootIntoRecovery)
{
  EFI_STATUS Status;
  struct RecoveryMessage *Msg = NULL;
  EFI_GUID Ptype = gEfiMiscPartitionGuid;
  MemCardType CardType = UNKNOWN;
  VOID *PartitionData = NULL;
  UINT32 PageSize;

  CardType = CheckRootDeviceType ();
  if (CardType == NAND) {
    Status = GetNandMiscPartiGuid (&Ptype);
    if (Status != EFI_SUCCESS) {
      return Status;
    }
  }

  GetPageSize (&PageSize);

  /* Get the first 2 pages of the misc partition.
   * If the device type is NAND then read the recovery message from page 1,
   * Else read from the page 0
   */
  Status = ReadFromPartition (&Ptype, (VOID **)&PartitionData, (PageSize * 2));
  if (Status != EFI_SUCCESS) {
    DEBUG ((EFI_D_ERROR, "Error Reading from misc partition: %r\n", Status));
    return Status;
  }

  if (!PartitionData) {
    DEBUG ((EFI_D_ERROR, "Error in loading Data from misc partition\n"));
    return EFI_INVALID_PARAMETER;
  }

  Msg = (CardType == NAND) ?
        (struct RecoveryMessage *) ((CHAR8 *) PartitionData + PageSize) :
        (struct RecoveryMessage *) PartitionData;

  // Ensure NULL termination
  Msg->command[sizeof (Msg->command) - 1] = '\0';
  if (Msg->command[0] != 0 && Msg->command[0] != 255)
    DEBUG ((EFI_D_VERBOSE, "Recovery command: %d %a\n", sizeof (Msg->command),
            Msg->command));

  if (!AsciiStrnCmp (Msg->command, RECOVERY_BOOT_RECOVERY,
                     AsciiStrLen (RECOVERY_BOOT_RECOVERY))) {
    *BootIntoRecovery = TRUE;
  }

  FreePool (PartitionData);
  PartitionData = NULL;
  Msg = NULL;

  return Status;
}

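Function BootLinux (…)
Loads the ramdisk and kernel, which are stored in different partitions on disk, into fixed memory regions and sets up the cmdline passed to the kernel. It then jumps to the kernel through a function pointer that holds the kernel's load address in memory. From this point the boot flow enters the kernel stage.

As described in the software architecture article, the kernel and ramdisk of the recovery system and of the main system are loaded into memory from different partitions: the recovery system loads them from the recovery partition, and the main system loads them from the boot partition. The two differ in the directory structure, configuration files, and executables packed into the ramdisk. The kernel itself is identical; only its runtime flow differs because the cmdline differs.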

EFI_STATUS
BootLinux (BootInfo *Info) {
  ....
  LinuxKernel = (LINUX_KERNEL) (UINT64)BootParamlistPtr.KernelLoadAddr;
  LinuxKernel ((UINT64)BootParamlistPtr.DeviceTreeLoadAddr, 0, 0, 0);
}

4. Kernel loads the ramdisk, starts init, and launches the recovery process

(To be continued)

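5. Enter the Recovery upgrade flow

Starting with Android Q, Google added fastbootd to recovery mode for flashing partitions such as system and vendor on devices that use dynamic partitions. The main function therefore calls StartFastboot or start_recovery to enter the corresponding sub-mode.

Function main()
Uses the arguments to choose between userspace fastboot mode (StartFastboot) and recovery mode (start_recovery). After fastboot/recovery mode exits, the return value decides whether the device reboots or powers off.

fastbootd
A service that opens a USB port in user space and implements the bootloader fastboot data transfer protocol. In this mode the fastboot host tool (fastboot.exe) can flash the device's partition images. It is not covered in detail in this article.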

// bootable/recovery/recovery_main.cpp
int main(int argc, char** argv) {
  // Initialize logging
  // We don't have logcat yet under recovery; so we'll print error on screen and log to stdout
  // (which is redirected to recovery.log) as we used to do.
  android::base::InitLogging(argv, &UiLogger);

  // Redirect the program's standard output to the temporary log file /tmp/recovery.log
  // redirect_stdio should be called only in non-sideload mode. Otherwise we may have two logger
  // instances with different timestamps.
  redirect_stdio(Paths::Get().temporary_log_file().c_str());

  // Load the partition information from fstab
  load_volume_table();

  // Read the upgrade command stored in the BCB of the misc partition into the vector args
  std::vector<std::string> args = get_args(argc, argv, &stage);

  while (true) {
    // We start adbd in recovery for the device with userdebug build or a unlocked bootloader.
    std::string usb_config =
        fastboot ? "fastboot" : IsRoDebuggable() || IsDeviceUnlocked() ? "adb" : "none";
    std::string usb_state = android::base::GetProperty("sys.usb.state", "none");
    if (usb_config != usb_state) {
      if (!SetUsbConfig("none")) {
        LOG(ERROR) << "Failed to clear USB config";
      }
      if (!SetUsbConfig(usb_config)) {
        LOG(ERROR) << "Failed to set USB config to " << usb_config;
      }
    }

    // The arguments in args select recovery mode; enter start_recovery and pass it the
    // upgrade command read from the BCB of the misc partition.
    auto ret = fastboot ? StartFastboot(device, args) : start_recovery(device, args);

    // Upgrade finished: power off, reboot, and so on
    switch (ret) {
      case Device::REBOOT:
        ui->Print("Rebooting...\n");
        Reboot("userrequested,recovery");
        break;
    }
  }

  // Should be unreachable.
  return EXIT_SUCCESS;
}

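5.1 Read the upgrade command from the BCB in the misc partition

Function get_args()
As the function's comment states, the upgrade command can come from three sources. They are read in order, and a later source is consulted only when the earlier ones yield no arguments.
get_args looks for the upgrade command in these three places, in order:

  1. The process's command-line arguments
  2. The BCB in the misc partition
  3. COMMAND_FILE (/cache/recovery/command)

In the upgrade flow the command is in practice always read from item 2, the BCB in the misc partition.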

// bootable/recovery/recovery_main.cpp
// Parses the command line argument from various sources; and reads the stage field from BCB.
// command line args come from, in decreasing precedence:
//   - the actual command line
//   - the bootloader control block (one per line, after "recovery")
//   - the contents of COMMAND_FILE (one per line)
static std::vector<std::string> get_args(const int argc, char** const argv, std::string* stage) {
  CHECK_GT(argc, 0);

  bootloader_message boot = {};
  std::string err;
  // 1. Fill the bootloader_message struct boot with the BCB data at the head of misc
  if (!read_bootloader_message(&boot, &err)) {
    LOG(ERROR) << err;
    // If fails, leave a zeroed bootloader_message.
    boot = {};
  }
  if (stage) {
    *stage = std::string(boot.stage);
  }

  std::string boot_command;
  if (boot.command[0] != 0) {
    if (memchr(boot.command, '\0', sizeof(boot.command))) {
      boot_command = std::string(boot.command);
    } else {
      boot_command = std::string(boot.command, sizeof(boot.command));
    }
    LOG(INFO) << "Boot command: " << boot_command;
  }

  if (boot.status[0] != 0) {
    std::string boot_status = std::string(boot.status, sizeof(boot.status));
    LOG(INFO) << "Boot status: " << boot_status;
  }

  // 2. Use the process's command-line arguments as the default upgrade command
  //    (usually empty)
  std::vector<std::string> args(argv, argv + argc);

  // 3. If no command-line arguments were given, take the upgrade command from the
  //    "recovery" field of the BCB in misc
  // --- if arguments weren't supplied, look in the bootloader control block
  if (args.size() == 1) {
    boot.recovery[sizeof(boot.recovery) - 1] = '\0';  // Ensure termination
    std::string boot_recovery(boot.recovery);
    std::vector<std::string> tokens = android::base::Split(boot_recovery, "\n");
    if (!tokens.empty() && tokens[0] == "recovery") {
      for (auto it = tokens.begin() + 1; it != tokens.end(); it++) {
        // Skip empty and '\0'-filled tokens.
        if (!it->empty() && (*it)[0] != '\0') args.push_back(std::move(*it));
      }
      LOG(INFO) << "Got " << args.size() << " arguments from boot message";
    } else if (boot.recovery[0] != 0) {
      LOG(ERROR) << "Bad boot message: \"" << boot_recovery << "\"";
    }
  }

  // 4. If nothing has been found so far, read the arguments from COMMAND_FILE
  // --- if that doesn't work, try the command file (if we have /cache).
  if (args.size() == 1 && HasCache()) {
    std::string content;
    if (ensure_path_mounted(COMMAND_FILE) == 0 &&
        android::base::ReadFileToString(COMMAND_FILE, &content)) {
      std::vector<std::string> tokens = android::base::Split(content, "\n");
      // All the arguments in COMMAND_FILE are needed (unlike the BCB message,
      // COMMAND_FILE doesn't use filename as the first argument).
      for (auto it = tokens.begin(); it != tokens.end(); it++) {
        // Skip empty and '\0'-filled tokens.
        if (!it->empty() && (*it)[0] != '\0') args.push_back(std::move(*it));
      }
      LOG(INFO) << "Got " << args.size() << " arguments from " << COMMAND_FILE;
    }
  }

  // 5. Write the arguments back to the BCB in misc. This targets commands that came
  //    from the command line or from COMMAND_FILE: the command stays in the misc
  //    partition until recovery exits normally, so after an interruption the device
  //    can resume and finish the command, until the work completes and the misc
  //    partition is erased on purpose.
  //    *** This makes the upgrade more reliable. ***
  // Write the arguments (excluding the filename in args[0]) back into the
  // bootloader control block. So the device will always boot into recovery to
  // finish the pending work, until FinishRecovery() is called.
  std::vector<std::string> options(args.cbegin() + 1, args.cend());
  if (!update_bootloader_message(options, &err)) {
    LOG(ERROR) << "Failed to set BCB message: " << err;
  }

  // Finally, if no arguments were specified, check whether we should boot
  // into fastboot or rescue mode.
  if (args.size() == 1 && boot_command == "boot-fastboot") {
    args.emplace_back("--fastboot");
  } else if (args.size() == 1 && boot_command == "boot-rescue") {
    args.emplace_back("--rescue");
  }

  return args;
}
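
To make step 3 of get_args() concrete, the sketch below applies the same rules to a plain string: split the recovery field on newlines, require the first token to be "recovery", and skip empty tokens. ParseBcbRecoveryField is a hypothetical stand-alone helper that uses only the standard library; it is not the AOSP implementation and it returns just the options, without argv[0].

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: extracts the options stored in the BCB "recovery" field.
// Returns an empty vector when the field does not start with "recovery".
std::vector<std::string> ParseBcbRecoveryField(const std::string& recovery_field) {
  // Split on '\n', keeping empty tokens so the filtering below is explicit.
  std::vector<std::string> tokens;
  std::size_t start = 0;
  while (true) {
    std::size_t end = recovery_field.find('\n', start);
    if (end == std::string::npos) {
      tokens.push_back(recovery_field.substr(start));
      break;
    }
    tokens.push_back(recovery_field.substr(start, end - start));
    start = end + 1;
  }

  std::vector<std::string> args;
  if (tokens.empty() || tokens[0] != "recovery") {
    return args;  // corresponds to the "Bad boot message" branch
  }
  for (std::size_t i = 1; i < tokens.size(); ++i) {
    // Skip empty and '\0'-filled tokens, as get_args() does.
    if (!tokens[i].empty() && tokens[i][0] != '\0') {
      args.push_back(tokens[i]);
    }
  }
  return args;
}
```

With the command built in section 2.1, the field "recovery\n--update_package=@/cache/recovery/block.map\n--locale=en-US\n" yields the two options --update_package=@/cache/recovery/block.map and --locale=en-US.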

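5.2 mmap the upgrade package into memory

The upgrade flow enters start_recovery(), which then uses mmap to map the upgrade package data from storage into the process's address space; see the source analysis below.
Function start_recovery()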

Device::BuiltinAction start_recovery(Device* device, const std::vector<std::string>& args) {
  // 1. Get the path of the upgrade package from the "update_package" argument
  static constexpr struct option OPTIONS[] = {
    { "update_package", required_argument, nullptr, 0 },
  };

  const char* update_package = nullptr;

  auto args_to_parse = StringVectorToNullTerminatedArray(args);
  // Parse everything before the last element (which must be a nullptr). getopt_long(3) expects a
  // null-terminated char* array, but without counting null as an arg (i.e. argv[argc] should be
  // nullptr).
  while ((arg = getopt_long(args_to_parse.size() - 1, args_to_parse.data(), "", OPTIONS,
                            &option_index)) != -1) {
    switch (arg) {
      ...
      case 0: {
        std::string option = OPTIONS[option_index].name;
        if (option == "install_with_fuse") {
          ...
        } else if (option == "update_package") {
          update_package = optarg;
        }
      }
    }
  }

  InstallResult status = INSTALL_SUCCESS;
  // next_action indicates the next target to reboot into upon finishing the install. It could be
  // overridden to a different reboot target per user request.
  Device::BuiltinAction next_action = shutdown_after ? Device::SHUTDOWN : Device::REBOOT;

  if (update_package != nullptr) {
    // It's not entirely true that we will modify the flash. But we want
    // to log the update attempt since update_package is non-NULL.
    save_current_log = true;

    if (int required_battery_level; retry_count == 0 && !IsBatteryOk(&required_battery_level)) {
      ui->Print("battery capacity is not enough for installing package: %d%% needed\n",
                required_battery_level);
      // Log the error code to last_install when installation skips due to low battery.
      log_failure_code(kLowBattery, update_package);
      status = INSTALL_SKIPPED;
    } else if (retry_count == 0 && bootreason_in_blacklist()) {
      // Skip update-on-reboot when bootreason is kernel_panic or similar
      ui->Print("bootreason is in the blacklist; skip OTA installation\n");
      log_failure_code(kBootreasonInBlacklist, update_package);
      status = INSTALL_SKIPPED;
    } else {
      // retry_count records whether the device has rebooted during the upgrade
      // It's a fresh update. Initialize the retry_count in the BCB to 1; therefore we can later
      // identify the interrupted update due to unexpected reboots.
      if (retry_count == 0) {
        set_retry_bootloader_message(retry_count + 1, args);
      }

      if (update_package[0] == '@') {
        ensure_path_mounted(update_package + 1);
      } else {
        ensure_path_mounted(update_package);
      }

      // 2. As the name CreateMemoryPackage suggests, mmap the upgrade package into
      //    memory; the object memory_package manages the mapped package.
      if (install_with_fuse) {
        ...
      } else if (auto memory_package = Package::CreateMemoryPackage(
                     update_package,
                     std::bind(&RecoveryUI::SetProgress, ui, std::placeholders::_1));
                 memory_package != nullptr) {
        // 3. InstallPackage: as the name suggests, start installing the upgrade package
        status = InstallPackage(memory_package.get(), update_package, should_wipe_cache,
                                retry_count, ui);
      } else {
        ...
      }

      if (status != INSTALL_SUCCESS) {
        ui->Print("Installation aborted.\n");

        // 4. I/O errors sometimes occur during an upgrade and can prevent it from
        //    continuing. Such errors usually disappear when the device reboots and
        //    writes the data again, so Google designed a mechanism that interrupts
        //    and then resumes the upgrade. Here, when the system hits an I/O error
        //    or similar, the device reboots and tries the upgrade again.
        // When I/O error or bspatch/imgpatch error happens, reboot and retry installation
        // RETRY_LIMIT times before we abandon this OTA update.
        static constexpr int RETRY_LIMIT = 4;
        if (status == INSTALL_RETRY && retry_count < RETRY_LIMIT) {
          copy_logs(save_current_log);
          // Increment retry_count. After the reboot this marker tells recovery that
          // the upgrade is a retry, which is how the resume mechanism takes effect.
          retry_count += 1;
          set_retry_bootloader_message(retry_count, args);
          // Print retry count on screen.
          ui->Print("Retry attempt %d\n", retry_count);

          // Reboot back into recovery to retry the update.
          Reboot("recovery");
        }
      }
    }
  }
  ...
}

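Package::CreateMemoryPackage
This method essentially calls mmap to map the upgrade package data into process memory. Recall, however, that the framework decrypted a package stored on the data partition and passed recovery the path "@/cache/recovery/block.map" instead of the real one.
This is a clever trick that will be explained separately; see: (to be continued).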
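
Until that article exists, the following sketch may help picture what recovery has to read from block.map. It assumes the text layout described in the comments of AOSP's uncrypt source as I recall it: the block device path on the first line, then the file size and block size, then the number of block ranges, then one half-open "start end" range per line. The BlockMap type and the ParseBlockMap function are hypothetical names, and the device path in the usage note is made up.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory form of /cache/recovery/block.map (layout assumed).
struct BlockMap {
  std::string block_device;                            // e.g. the userdata block device
  uint64_t file_size = 0;                              // package size in bytes
  uint32_t block_size = 0;                             // file system block size
  std::vector<std::pair<uint64_t, uint64_t>> ranges;   // half-open [start, end)
};

// Parses the assumed text format. Returns false on malformed input or when the
// listed ranges are too small to hold the whole file.
bool ParseBlockMap(const std::string& text, BlockMap* out) {
  std::istringstream in(text);
  std::size_t range_count = 0;

  if (!std::getline(in, out->block_device) || out->block_device.empty()) return false;
  if (!(in >> out->file_size >> out->block_size >> range_count)) return false;
  if (out->block_size == 0) return false;

  out->ranges.clear();
  uint64_t blocks = 0;
  for (std::size_t i = 0; i < range_count; ++i) {
    uint64_t start = 0, end = 0;
    if (!(in >> start >> end) || start >= end) return false;
    out->ranges.emplace_back(start, end);
    blocks += end - start;
  }
  // The ranges must cover at least file_size bytes.
  return blocks * out->block_size >= out->file_size;
}
```

Given such a map, recovery can map each block range of the raw block device in order and obtain the package bytes without mounting the encrypted file system, which is the reason the framework passes the "@"-prefixed path.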

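INSTALL_RETRY
This is a special upgrade-failure code. Google designed a mechanism for resuming an interrupted upgrade, so the upgrade can continue after an interruption, voluntary or not, such as a device reboot or the process being killed. Here, when a system I/O error occurs, recovery reboots the device on purpose and then tries the upgrade again. For the interrupt-and-resume mechanism see: (to be continued).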
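
The retry rule itself is small enough to state as a function. The sketch below restates the condition from start_recovery() using a reduced copy of the result codes; NextRetryCount is a hypothetical helper, and the enum here only lists the values that appear in this article.

```cpp
// Reduced, illustrative copy of the install result codes used in this article.
enum InstallResult {
  INSTALL_SUCCESS,
  INSTALL_ERROR,
  INSTALL_CORRUPT,
  INSTALL_SKIPPED,
  INSTALL_RETRY,
};

// Same limit as RETRY_LIMIT in start_recovery().
constexpr int kRetryLimit = 4;

// Hypothetical helper: returns the retry_count to store in the BCB before
// rebooting back into recovery, or -1 when the update must not be retried.
int NextRetryCount(InstallResult status, int retry_count) {
  if (status == INSTALL_RETRY && retry_count < kRetryLimit) {
    return retry_count + 1;
  }
  return -1;
}
```

Because a fresh update first stores retry_count = 1 in the BCB, an update that comes back with a non-zero count is known to have been interrupted or retried.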

5.3 Verify the integrity and authenticity of the upgrade package

InstallResult InstallPackage(Package* package, const std::string_view package_id,
                             bool should_wipe_cache, int retry_count, RecoveryUI* ui) {
  ...
  bool updater_wipe_cache = false;
  result = VerifyAndInstallPackage(package, &updater_wipe_cache, &log_buffer, retry_count,
                                   &max_temperature, ui);
  should_wipe_cache = should_wipe_cache || updater_wipe_cache;
  ...
}

static InstallResult VerifyAndInstallPackage(Package* package, bool* wipe_cache,
                                             std::vector<std::string>* log_buffer, int retry_count,
                                             int* max_temperature, RecoveryUI* ui) {
  // Verify package.
  if (!verify_package(package, ui)) {
    log_buffer->push_back(android::base::StringPrintf("error: %d", kZipVerificationFailure));
    return INSTALL_CORRUPT;
  }

  // Verify and install the contents of the package.
  ui->Print("Installing update...\n");
  if (retry_count > 0) {
    ui->Print("Retry attempt: %d\n", retry_count);
  }
  ui->SetEnableReboot(false);
  auto result = TryUpdateBinary(package, wipe_cache, log_buffer, retry_count, max_temperature, ui);
  ui->SetEnableReboot(true);
  ui->Print("\n");

  return result;
}

bool verify_package(Package* package, RecoveryUI* ui) {
  static constexpr const char* CERTIFICATE_ZIP_FILE = "/system/etc/security/otacerts.zip";
  std::vector<Certificate> loaded_keys = LoadKeysFromZipfile(CERTIFICATE_ZIP_FILE);
  if (loaded_keys.empty()) {
    return false;
  }

  int err = verify_file(package, loaded_keys);
  if (err != VERIFY_SUCCESS) {
    return false;
  }
  return true;
}

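This step checks whether the signature of the upgrade package is valid. In essence it is a public-key (typically RSA) signature verification of the package.

  1. The server signs the upgrade package with its private key and appends the signature block, which carries the signing certificate, to the tail of the package;
  2. During verification, recovery reads the signature block from the tail of the package;
  3. It loads the list of trusted keys stored on the device (/system/etc/security/otacerts.zip in the code above);
  4. Finally it verifies the signature against those trusted keys; the package is accepted only if one of them validates the signature.

For the technical details of package signature verification see: (to be continued)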
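
As a small taste of those details, the sketch below shows how the signature block can be located at the tail of the package. It assumes the 6-byte footer layout I understand AOSP's verifier to use: a 2-byte little-endian signature offset (counted back from the end of the file), the marker bytes 0xff 0xff, and a 2-byte little-endian ZIP comment size. The struct and function names are hypothetical, and no cryptography is performed here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: both values are measured from the end of the file.
struct SignatureFooter {
  std::size_t comment_size;     // length of the ZIP archive comment
  std::size_t signature_start;  // distance from end of file to the signature
};

// Parses the assumed footer: [sig_start lo][sig_start hi][0xff][0xff][size lo][size hi].
bool ParseSignatureFooter(const std::vector<uint8_t>& package, SignatureFooter* out) {
  constexpr std::size_t kFooterSize = 6;
  constexpr std::size_t kEocdHeaderSize = 22;  // fixed part of the ZIP end-of-central-directory
  if (package.size() < kFooterSize) return false;

  const uint8_t* footer = package.data() + package.size() - kFooterSize;
  if (footer[2] != 0xff || footer[3] != 0xff) return false;  // marker missing

  out->signature_start = static_cast<std::size_t>(footer[0]) |
                         (static_cast<std::size_t>(footer[1]) << 8);
  out->comment_size = static_cast<std::size_t>(footer[4]) |
                      (static_cast<std::size_t>(footer[5]) << 8);

  // The signature must lie inside the comment and in front of the footer.
  if (out->signature_start > out->comment_size) return false;
  if (out->signature_start <= kFooterSize) return false;
  // The file must be large enough to hold the EOCD record plus the comment.
  return out->comment_size + kEocdHeaderSize <= package.size();
}
```

The signed data is everything in front of the comment, so a verifier built on this would hash the package up to that point and check the signature found at signature_start against each trusted key.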

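5.4 Fork the update-binary child process to upgrade the system

Google's upgrade design contains many flexible and clever ideas, such as the interrupt-and-resume mechanism mentioned above. The update binary introduced next is another one. Throughout the upgrade, the recovery process (/system/bin/recovery) only controls the flow. The program that actually performs the upgrade is update-binary, which is packed into the upgrade package at the path META-INF/com/google/android/update-binary.

update-binary runs as follows:

  1. recovery mmaps the upgrade package into memory (described above);
  2. recovery calls TryUpdateBinary () -> SetUpNonAbUpdateCommands() to extract update-binary from the upgrade package to the device path /tmp/update-binary;
  3. recovery forks a child process to run update-binary and creates a pipe for inter-process communication with the child;
  4. update-binary calls mmap to map the upgrade package into its own address space, then reads data from the package and updates the block devices of the relevant partitions, upgrading the system;
  5. update-binary sends upgrade progress and other data to its parent process recovery through the pipe, and recovery updates the progress bar on screen;
  6. recovery calls waitpid(pid, &status, 0) to wait for the update-binary child process to finish, and finally decides from the exit status whether the upgrade succeeded.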
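
Steps 3, 5, and 6 above follow the classic fork / pipe / waitpid pattern. The sketch below demonstrates that pattern with plain POSIX calls. To stay self-contained the child does not exec a real update-binary; it just writes two protocol-style lines and exits, and RunFakeUpdater and ChildResult are hypothetical names introduced for the example.

```cpp
#include <cstdio>
#include <string>
#include <vector>

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct ChildResult {
  std::vector<std::string> lines;  // lines received from the child
  int exit_code = -1;              // -1 if the child did not exit normally
};

// Hypothetical stand-in for the recovery side of the updater handshake.
ChildResult RunFakeUpdater() {
  ChildResult result;
  int fds[2];
  if (pipe(fds) != 0) return result;

  pid_t pid = fork();
  if (pid == -1) {
    close(fds[0]);
    close(fds[1]);
    return result;
  }

  if (pid == 0) {
    // Child: real recovery would execv() /tmp/update-binary here and pass the
    // write end of the pipe as an argument. This sketch writes directly.
    close(fds[0]);
    const char msg[] = "progress 0.5 10\nui_print hello\n";
    ssize_t ignored = write(fds[1], msg, sizeof(msg) - 1);
    (void)ignored;
    close(fds[1]);
    _exit(0);
  }

  // Parent: close the unused write end so EOF arrives when the child exits.
  close(fds[1]);
  FILE* from_child = fdopen(fds[0], "r");
  if (from_child != nullptr) {
    char buffer[1024];
    while (fgets(buffer, sizeof(buffer), from_child) != nullptr) {
      std::string line(buffer);
      if (!line.empty() && line.back() == '\n') line.pop_back();
      result.lines.push_back(line);
    }
    fclose(from_child);
  } else {
    close(fds[0]);
  }

  // Wait for the child and translate its status, as recovery does.
  int status = 0;
  if (waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
    result.exit_code = WEXITSTATUS(status);
  }
  return result;
}
```

Closing the write end in the parent matters: without it the read loop would never see end-of-file, and the parent would block forever instead of reaching waitpid.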

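Benefits of packing update-binary into the upgrade package

An Android system upgrade is fairly complex, especially an incremental upgrade that patches storage blocks, and it is hard to guarantee that no bug slips in. A serious bug could leave users' devices unable to upgrade, which would have a large impact. Upgrades exist to fix system bugs, so it would be awkward if a bug in recovery itself prevented the device from upgrading.
Because update-binary is packed into the upgrade package, extracted at upgrade time, and then used to perform the upgrade, even a serious bug in update-binary can be fixed in the next upgrade package pushed to users. It does not stop the system from being upgraded to a new version.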

Function TryUpdateBinary()

// If the package contains an update binary, extract it and run it.
static InstallResult TryUpdateBinary(Package* package, bool* wipe_cache,
                                     std::vector<std::string>* log_buffer, int retry_count,
                                     int* max_temperature, RecoveryUI* ui) {
  std::map<std::string, std::string> metadata;
  auto zip = package->GetZipArchiveHandle();
  if (!ReadMetadataFromPackage(zip, &metadata)) {
    LOG(ERROR) << "Failed to parse metadata in the zip file";
    return INSTALL_CORRUPT;
  }

  bool is_ab = android::base::GetBoolProperty("ro.build.ab_update", false);
  if (is_ab) {
    CHECK(package->GetType() == PackageType::kFile);
  }

  // Verify against the metadata in the package first.
  if (is_ab && !CheckPackageMetadata(metadata, OtaType::AB)) {
    log_buffer->push_back(android::base::StringPrintf("error: %d", kUpdateBinaryCommandFailure));
    return INSTALL_ERROR;
  }

  ReadSourceTargetBuild(metadata, log_buffer);

  // The updater in child process writes to the pipe to communicate with recovery.
  android::base::unique_fd pipe_read, pipe_write;
  // Explicitly disable O_CLOEXEC using 0 as the flags (last) parameter to Pipe
  // so that the child updater process will recieve a non-closed fd.
  if (!android::base::Pipe(&pipe_read, &pipe_write, 0)) {
    PLOG(ERROR) << "Failed to create pipe for updater-recovery communication";
    return INSTALL_CORRUPT;
  }

  // The updater-recovery communication protocol.
  //
  //   progress <frac> <secs>
  //       fill up the next <frac> part of of the progress bar over <secs> seconds. If <secs> is
  //       zero, use `set_progress` commands to manually control the progress of this segment of the
  //       bar.
  //
  //   set_progress <frac>
  //       <frac> should be between 0.0 and 1.0; sets the progress bar within the segment defined by
  //       the most recent progress command.
  //
  //   ui_print <string>
  //       display <string> on the screen.
  //
  //   wipe_cache
  //       a wipe of cache will be performed following a successful installation.
  //
  //   clear_display
  //       turn off the text display.
  //
  //   enable_reboot
  //       packages can explicitly request that they want the user to be able to reboot during
  //       installation (useful for debugging packages that don't exit).
  //
  //   retry_update
  //       updater encounters some issue during the update. It requests a reboot to retry the same
  //       package automatically.
  //
  //   log <string>
  //       updater requests logging the string (e.g. cause of the failure).
  //

  std::string package_path = package->GetPath();

  std::vector<std::string> args;
  if (auto setup_result =
          is_ab ? SetUpAbUpdateCommands(package_path, zip, pipe_write.get(), &args)
                : SetUpNonAbUpdateCommands(package_path, zip, retry_count, pipe_write.get(), &args);
      !setup_result) {
    log_buffer->push_back(android::base::StringPrintf("error: %d", kUpdateBinaryCommandFailure));
    return INSTALL_CORRUPT;
  }

  pid_t pid = fork();
  if (pid == -1) {
    PLOG(ERROR) << "Failed to fork update binary";
    log_buffer->push_back(android::base::StringPrintf("error: %d", kForkUpdateBinaryFailure));
    return INSTALL_ERROR;
  }

  if (pid == 0) {
    umask(022);
    pipe_read.reset();

    // Convert the std::string vector to a NULL-terminated char* vector suitable for execv.
    auto chr_args = StringVectorToNullTerminatedArray(args);
    execv(chr_args[0], chr_args.data());
    // We shouldn't use LOG/PLOG in the forked process, since they may cause the child process to
    // hang. This deadlock results from an improperly copied mutex in the ui functions.
    // (Bug: 34769056)
    fprintf(stdout, "E:Can't run %s (%s)\n", chr_args[0], strerror(errno));
    _exit(EXIT_FAILURE);
  }
  pipe_write.reset();

  std::atomic<bool> logger_finished(false);
  std::thread temperature_logger(log_max_temperature, max_temperature, std::ref(logger_finished));

  *wipe_cache = false;
  bool retry_update = false;

  char buffer[1024];
  FILE* from_child = android::base::Fdopen(std::move(pipe_read), "r");
  while (fgets(buffer, sizeof(buffer), from_child) != nullptr) {
    std::string line(buffer);
    size_t space = line.find_first_of(" \n");
    std::string command(line.substr(0, space));
    if (command.empty()) continue;

    // Get rid of the leading and trailing space and/or newline.
    std::string args = space == std::string::npos ? "" : android::base::Trim(line.substr(space));

    if (command == "progress") {
      std::vector<std::string> tokens = android::base::Split(args, " ");
      double fraction;
      int seconds;
      if (tokens.size() == 2 && android::base::ParseDouble(tokens[0].c_str(), &fraction) &&
          android::base::ParseInt(tokens[1], &seconds)) {
        ui->ShowProgress(fraction * (1 - VERIFICATION_PROGRESS_FRACTION), seconds);
      } else {
        LOG(ERROR) << "invalid \"progress\" parameters: " << line;
      }
    } else if (command == "set_progress") {
      std::vector<std::string> tokens = android::base::Split(args, " ");
      double fraction;
      if (tokens.size() == 1 && android::base::ParseDouble(tokens[0].c_str(), &fraction)) {
        ui->SetProgress(fraction);
      } else {
        LOG(ERROR) << "invalid \"set_progress\" parameters: " << line;
      }
    } else if (command == "ui_print") {
      ui->PrintOnScreenOnly("%s\n", args.c_str());
      fflush(stdout);
    } else if (command == "wipe_cache") {
      *wipe_cache = true;
    } else if (command == "clear_display") {
      ui->SetBackground(RecoveryUI::NONE);
    } else if (command == "enable_reboot") {
      // packages can explicitly request that they want the user
      // to be able to reboot during installation (useful for
      // debugging packages that don't exit).
      ui->SetEnableReboot(true);
    } else if (command == "retry_update") {
      retry_update = true;
    } else if (command == "log") {
      if (!args.empty()) {
        // Save the logging request from updater and write to last_install later.
        log_buffer->push_back(args);
      } else {
        LOG(ERROR) << "invalid \"log\" parameters: " << line;
      }
    } else {
      LOG(ERROR) << "unknown command [" << command << "]";
    }
  }
  fclose(from_child);

  int status;
  waitpid(pid, &status, 0);

  logger_finished.store(true);
  finish_log_temperature.notify_one();
  temperature_logger.join();

  if (retry_update) {
    return INSTALL_RETRY;
  }
  if (WIFEXITED(status)) {
    if (WEXITSTATUS(status) != EXIT_SUCCESS) {
      LOG(ERROR) << "Error in " << package_path << " (status " << WEXITSTATUS(status) << ")";
      return INSTALL_ERROR;
    }
  } else if (WIFSIGNALED(status)) {
    LOG(ERROR) << "Error in " << package_path << " (killed by signal " << WTERMSIG(status) << ")";
    return INSTALL_ERROR;
  } else {
    LOG(FATAL) << "Invalid status code " << status;
  }

  return INSTALL_SUCCESS;
}
bool SetUpNonAbUpdateCommands(const std::string& package, ZipArchiveHandle zip, int retry_count,
                              int status_fd, std::vector<std::string>* cmd) {
  CHECK(cmd != nullptr);

  // In non-A/B updates we extract the update binary from the package.
  static constexpr const char* UPDATE_BINARY_NAME = "META-INF/com/google/android/update-binary";
  ZipEntry binary_entry;
  if (FindEntry(zip, UPDATE_BINARY_NAME, &binary_entry) != 0) {
    LOG(ERROR) << "Failed to find update binary " << UPDATE_BINARY_NAME;
    return false;
  }

  const std::string binary_path = Paths::Get().temporary_update_binary();
  unlink(binary_path.c_str());
  android::base::unique_fd fd(
      open(binary_path.c_str(), O_CREAT | O_WRONLY | O_TRUNC | O_CLOEXEC, 0755));
  if (fd == -1) {
    PLOG(ERROR) << "Failed to create " << binary_path;
    return false;
  }

  if (auto error = ExtractEntryToFile(zip, &binary_entry, fd); error != 0) {
    LOG(ERROR) << "Failed to extract " << UPDATE_BINARY_NAME << ": " << ErrorCodeString(error);
    return false;
  }

  // When executing the update binary contained in the package, the arguments passed are:
  //   - the version number for this interface
  //   - an FD to which the program can write in order to update the progress bar.
  //   - the name of the package zip file.
  //   - an optional argument "retry" if this update is a retry of a failed update attempt.
  *cmd = {
    binary_path,
    std::to_string(kRecoveryApiVersion),
    std::to_string(status_fd),
    package,
  };
  if (retry_count > 0) {
    cmd->push_back("retry");
  }
  return true;
}

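update-binary
Source path: bootable/recovery/updater/
update-binary is the program that actually performs the upgrade, and its internal flow is fairly complex. As the source below shows, update-binary takes the status fd and the package path from its command-line arguments, constructs an Updater object, initializes it with Updater::Init(), and then calls Updater::RunUpdate() to start the upgrade.
How update-binary carries out the system upgrade in detail is covered in a separate article: (to be continued).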

// bootable/recovery/updater/updater_main.cpp
static void UpdaterLogger(android::base::LogId /* id */, android::base::LogSeverity /* severity */,
                          const char* /* tag */, const char* /* file */, unsigned int /* line */,
                          const char* message) {
  fprintf(stdout, "%s\n", message);
}

int main(int argc, char** argv) {
  // Various things log information to stdout or stderr more or less
  // at random (though we've tried to standardize on stdout).  The
  // log file makes more sense if buffering is turned off so things
  // appear in the right order.
  setbuf(stdout, nullptr);
  setbuf(stderr, nullptr);

  // We don't have logcat yet under recovery. Update logs will always be written to stdout
  // (which is redirected to recovery.log).
  android::base::InitLogging(argv, &UpdaterLogger);

  // Run the libcrypto KAT(known answer tests) based self tests.
  if (BORINGSSL_self_test() != 1) {
    LOG(ERROR) << "Failed to run the boringssl self tests";
    return EXIT_FAILURE;
  }

  if (argc != 4 && argc != 5) {
    LOG(ERROR) << "unexpected number of arguments: " << argc;
    return EXIT_FAILURE;
  }

  char* version = argv[1];
  if ((version[0] != '1' && version[0] != '2' && version[0] != '3') || version[1] != '\0') {
    // We support version 1, 2, or 3.
    LOG(ERROR) << "wrong updater binary API; expected 1, 2, or 3; got " << argv[1];
    return EXIT_FAILURE;
  }

  int fd;
  if (!android::base::ParseInt(argv[2], &fd)) {
    LOG(ERROR) << "Failed to parse fd in " << argv[2];
    return EXIT_FAILURE;
  }

  std::string package_name = argv[3];

  bool is_retry = false;
  if (argc == 5) {
    if (strcmp(argv[4], "retry") == 0) {
      is_retry = true;
    } else {
      LOG(ERROR) << "unexpected argument: " << argv[4];
      return EXIT_FAILURE;
    }
  }

  // Configure edify's functions.
  RegisterBuiltins();
  RegisterInstallFunctions();
  RegisterBlockImageFunctions();
  RegisterDynamicPartitionsFunctions();
  RegisterDeviceExtensions();

  auto sehandle = selinux_android_file_context_handle();
  selinux_android_set_sehandle(sehandle);

  Updater updater(std::make_unique<UpdaterRuntime>(sehandle));
  if (!updater.Init(fd, package_name, is_retry)) {
    return EXIT_FAILURE;
  }

  if (!updater.RunUpdate()) {
    return EXIT_FAILURE;
  }

  return EXIT_SUCCESS;
}

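5.5 Exit the update-binary child process, save logs, and erase the misc partition BCB

As the function TryUpdateBinary above shows, after forking the update-binary child process the recovery process enters a while loop. The loop reads the data the child sends through the pipe, parses each line into a command, and performs the corresponding action.

The recovery process does not set O_NONBLOCK on the read end of the pipe, so its reads are blocking. As long as the child keeps the write end open, fgets() does not see end-of-file and the while loop does not exit: recovery is either blocked waiting for data, or it has read a line and is parsing and executing that command. This continues until the child closes the write end, which normally happens when it exits.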

FILE* from_child = android::base::Fdopen(std::move(pipe_read), "r");
while (fgets(buffer, sizeof(buffer), from_child) != nullptr) {
  std::string line(buffer);
  size_t space = line.find_first_of(" \n");
  std::string command(line.substr(0, space));
  if (command.empty()) continue;

  // Get rid of the leading and trailing space and/or newline.
  std::string args = space == std::string::npos ? "" : android::base::Trim(line.substr(space));

  if (command == "progress") {
    std::vector<std::string> tokens = android::base::Split(args, " ");
    double fraction;
    int seconds;
    if (tokens.size() == 2 && android::base::ParseDouble(tokens[0].c_str(), &fraction) &&
        android::base::ParseInt(tokens[1], &seconds)) {
      ui->ShowProgress(fraction * (1 - VERIFICATION_PROGRESS_FRACTION), seconds);
    } else {
      LOG(ERROR) << "invalid \"progress\" parameters: " << line;
    }
  }
  ...
}
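The blocking-read behaviour described above can be reproduced outside recovery with plain POSIX calls. The following is a minimal sketch, not the AOSP implementation: `ReadCommandsFromChild` is a hypothetical helper whose child process merely stands in for update-binary, and it only records the command word of each line instead of acting on it.

```cpp
#include <cassert>  // for the assert-based checks in the usage note
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child that writes `lines` into a pipe, then
// collects the command word of every line in the parent until EOF.
std::vector<std::string> ReadCommandsFromChild(const std::vector<std::string>& lines) {
  std::vector<std::string> commands;
  int fds[2];
  if (pipe(fds) != 0) return commands;

  pid_t pid = fork();
  if (pid == -1) {
    close(fds[0]);
    close(fds[1]);
    return commands;
  }
  if (pid == 0) {
    // Child (stand-in for update-binary): only the write end is needed.
    close(fds[0]);
    for (const auto& line : lines) {
      const std::string out = line + "\n";
      if (write(fds[1], out.data(), out.size()) < 0) _exit(EXIT_FAILURE);
    }
    _exit(EXIT_SUCCESS);  // Exiting closes the child's copy of the write end.
  }

  // Parent: close our own copy of the write end first. If it stayed open,
  // fgets() would never report EOF, even after the child has exited.
  close(fds[1]);
  FILE* from_child = fdopen(fds[0], "r");
  if (from_child == nullptr) {
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return commands;
  }

  char buffer[1024];
  // Blocks until a line arrives; returns nullptr once every write end is closed.
  while (fgets(buffer, sizeof(buffer), from_child) != nullptr) {
    const std::string line(buffer);
    const size_t space = line.find_first_of(" \n");
    const std::string command = line.substr(0, space);
    if (!command.empty()) commands.push_back(command);
  }
  fclose(from_child);

  int status = 0;
  waitpid(pid, &status, 0);  // Reap the child so it does not linger as a zombie.
  return commands;
}
```

Calling `ReadCommandsFromChild({"progress 0.5 10", "wipe_cache"})` returns once the child has exited and the pipe has drained. The parent's `close(fds[1])` plays the same role as `pipe_write.reset()` in TryUpdateBinary.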

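When the update-binary process finishes and exits, the write end of the pipe is closed. fgets() then returns nullptr, the recovery process leaves the while loop that listens for messages from the child, and execution continues with: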

  int status;
  waitpid(pid, &status, 0);

  logger_finished.store(true);
  finish_log_temperature.notify_one();
  temperature_logger.join();

  if (retry_update) {
    return INSTALL_RETRY;
  }
  if (WIFEXITED(status)) {
    if (WEXITSTATUS(status) != EXIT_SUCCESS) {
      LOG(ERROR) << "Error in " << package_path << " (status " << WEXITSTATUS(status) << ")";
      return INSTALL_ERROR;
    }
  } else if (WIFSIGNALED(status)) {
    LOG(ERROR) << "Error in " << package_path << " (killed by signal " << WTERMSIG(status) << ")";
    return INSTALL_ERROR;
  } else {
    LOG(FATAL) << "Invalid status code " << status;
  }

  return INSTALL_SUCCESS;

Here recovery calls waitpid(pid, &status, 0) to obtain the child's exit status and uses it to decide whether the upgrade succeeded. Control then returns from install/install.cpp to recovery.cpp.
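The way a wait status is decoded can be tried in isolation. The sketch below is illustrative only: `ChildResult`, `ClassifyWaitStatus` and `RunChildAndClassify` are hypothetical names, and the forked child simply exits with a given code or kills itself with a given signal, so that both branches (`WIFEXITED` and `WIFSIGNALED`) can be observed.

```cpp
#include <cassert>  // for the assert-based checks in the usage note
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical result type for this sketch (recovery itself uses INSTALL_* values).
enum class ChildResult { kSuccess, kExitedWithError, kKilledBySignal, kUnknown };

// Decodes a status value filled in by waitpid().
ChildResult ClassifyWaitStatus(int status) {
  if (WIFEXITED(status)) {
    return WEXITSTATUS(status) == 0 ? ChildResult::kSuccess : ChildResult::kExitedWithError;
  }
  if (WIFSIGNALED(status)) {
    return ChildResult::kKilledBySignal;
  }
  return ChildResult::kUnknown;
}

// Forks a child that either raises `signal_to_raise` (if non-zero) or exits
// with `exit_code`, then waits for it and classifies the outcome.
ChildResult RunChildAndClassify(int exit_code, int signal_to_raise) {
  pid_t pid = fork();
  if (pid == -1) return ChildResult::kUnknown;
  if (pid == 0) {
    if (signal_to_raise != 0) raise(signal_to_raise);
    _exit(exit_code);
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return ChildResult::kUnknown;
  return ClassifyWaitStatus(status);
}
```

For example, `RunChildAndClassify(3, 0)` yields `kExitedWithError`, which corresponds to the branch where TryUpdateBinary logs the status and returns INSTALL_ERROR.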

Device::BuiltinAction start_recovery(Device* device, const std::vector<std::string>& args) {
  ...
  // Determine the next action.
  //  - If the state is INSTALL_REBOOT, device will reboot into the target as specified in
  //    `next_action`.
  //  - If the recovery menu is visible, prompt and wait for commands.
  //  - If the state is INSTALL_NONE, wait for commands (e.g. in user build, one manually boots
  //    into recovery to sideload a package or to wipe the device).
  //  - In all other cases, reboot the device. Therefore, normal users will observe the device
  //    rebooting a) immediately upon successful finish (INSTALL_SUCCESS); or b) an "error" screen
  //    for 5s followed by an automatic reboot.
  if (status != INSTALL_REBOOT) {
    if (status == INSTALL_NONE || ui->IsTextVisible()) {
      auto temp = PromptAndWait(device, status);
      if (temp != Device::NO_ACTION) {
        next_action = temp;
      }
    }
  }

  // Save logs and clean up before rebooting or shutting down.
  FinishRecovery(ui);

  return next_action;
}

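The return values of InstallPackage() include INSTALL_SUCCESS, INSTALL_RETRY, INSTALL_ERROR, and INSTALL_CORRUPT; this is the value held in the status variable in the source above.

  • If the upgrade fails and the recovery text menu is visible: PromptAndWait() is entered, a message is shown on screen, and the flow continues only after the user confirms (this step matters little here and is not covered further). If the menu is not visible, then according to the comment in the source an ordinary user sees an error screen for 5 s, followed by an automatic reboot.
  • If the upgrade succeeds, or once PromptAndWait() returns: FinishRecovery() is entered to prepare for leaving recovery.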
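The branch condition in start_recovery can be written as a small pure function, which makes the cases easier to enumerate. This is a sketch under assumptions: `InstallState` and `ShouldPromptAndWait` are hypothetical names that mirror only the condition visible in the source excerpt, not the real AOSP types.

```cpp
#include <cassert>  // for the assert-based checks in the usage note

// Hypothetical stand-in for the INSTALL_* status values seen in the source.
enum class InstallState { kSuccess, kError, kCorrupt, kNone, kRetry, kReboot };

// Mirrors the condition in start_recovery():
//   status != INSTALL_REBOOT && (status == INSTALL_NONE || ui->IsTextVisible())
bool ShouldPromptAndWait(InstallState status, bool text_visible) {
  if (status == InstallState::kReboot) return false;
  return status == InstallState::kNone || text_visible;
}
```

With the text menu hidden, `ShouldPromptAndWait(InstallState::kError, false)` is false, so a failed install goes straight on to FinishRecovery() and the reboot; `ShouldPromptAndWait(InstallState::kNone, false)` is true, which is the manual "boot into recovery and wait for commands" case.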

Function FinishRecovery()

// Clear the recovery command and prepare to boot a (hopefully working) system,
// copy our log file to cache as well (for the system to read). This function is
// idempotent: call it as many times as you like.
static void FinishRecovery(RecoveryUI* ui) {
  std::string locale = ui->GetLocale();
  // Save the locale to cache, so if recovery is next started up without a '--locale' argument
  // (e.g., directly from the bootloader) it will use the last-known locale.
  if (!locale.empty() && HasCache()) {
    LOG(INFO) << "Saving locale \"" << locale << "\"";
    if (ensure_path_mounted(LOCALE_FILE) != 0) {
      LOG(ERROR) << "Failed to mount " << LOCALE_FILE;
    } else if (!android::base::WriteStringToFile(locale, LOCALE_FILE)) {
      PLOG(ERROR) << "Failed to save locale to " << LOCALE_FILE;
    }
  }

  copy_logs(save_current_log);

  // Reset to normal system boot so recovery won't cycle indefinitely.
  std::string err;
  if (!clear_bootloader_message(&err)) {
    LOG(ERROR) << "Failed to clear BCB message: " << err;
  }

  // Remove the command file, so recovery won't repeat indefinitely.
  if (HasCache()) {
    if (ensure_path_mounted(COMMAND_FILE) != 0 || (unlink(COMMAND_FILE) && errno != ENOENT)) {
      LOG(WARNING) << "Can't unlink " << COMMAND_FILE;
    }
    ensure_path_unmounted(CACHE_ROOT);
  }

  sync();  // For good measure.
}

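FinishRecovery performs two key operations:

  1. Copy the log file currently being written in memory, /tmp/recovery.log, to /cache/recovery.
    The recovery log is written to the in-memory file /tmp/recovery.log first, rather than directly under /cache/recovery, for the following reason. As the main function shows, the recovery log is streamed to a file in real time through redirection. If that file lived under /cache/recovery, it would conflict with a routine recovery action, wiping (formatting) the cache partition: when the cache wipe runs, the cache partition would be busy and could not be unmounted, so the wipe would fail.

  2. clear_bootloader_message erases the BCB data in the misc partition.
    The BCB in the misc partition must be erased promptly when the upgrade flow ends. Otherwise, on the next reboot the BootLoader would still find the recovery command in the BCB and boot into the recovery system again. It must not be erased too early either, because the BCB is a key part of the mechanism for resuming an interrupted upgrade.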
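To make the "erase the BCB" step concrete, here is a minimal sketch that writes and then clears a bootloader message in an ordinary file. Several things are assumptions: the field layout is the one I recall from AOSP's bootloader_message.h (2048 bytes in total) and should be checked against the target tree, the file path merely stands in for the misc partition, and `WriteBcb`, `ReadBcb` and `ClearBcb` are hypothetical helpers, not the real write_bootloader_message / clear_bootloader_message.

```cpp
#include <cassert>
#include <cstdio>   // std::remove, used in the usage note
#include <cstring>  // std::strncpy, used in the usage note
#include <fstream>
#include <string>

// Assumed layout of the bootloader control block (BCB); verify against
// bootloader_message.h in the actual source tree.
struct BootloaderMessage {
  char command[32];    // e.g. "boot-recovery"
  char status[32];
  char recovery[768];  // recovery arguments, one per line
  char stage[32];
  char reserved[1184];
};
static_assert(sizeof(BootloaderMessage) == 2048, "unexpected BCB size");

// Writes `msg` at offset 0 of `path`. A regular file stands in for the misc
// partition here, so truncating it is acceptable in this sketch.
bool WriteBcb(const std::string& path, const BootloaderMessage& msg) {
  std::ofstream out(path, std::ios::binary | std::ios::trunc);
  if (!out) return false;
  out.write(reinterpret_cast<const char*>(&msg), sizeof(msg));
  out.flush();
  return out.good();
}

// Reads a full BCB back from `path`.
bool ReadBcb(const std::string& path, BootloaderMessage* msg) {
  assert(msg != nullptr);
  std::ifstream in(path, std::ios::binary);
  if (!in) return false;
  in.read(reinterpret_cast<char*>(msg), sizeof(*msg));
  return in.gcount() == static_cast<std::streamsize>(sizeof(*msg));
}

// Clearing the BCB means overwriting it with an all-zero message.
bool ClearBcb(const std::string& path) {
  const BootloaderMessage empty{};
  return WriteBcb(path, empty);
}
```

After `ClearBcb(path)`, a subsequent `ReadBcb` returns a message whose `command` field is empty, which is the state the BootLoader is expected to see on the next boot.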

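5.6 Reboot the device back to the main system

On return from start_recovery, control goes back to recovery_main.cpp, which reboots into the target system according to the return value of start_recovery (normally the main system).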

// recovery_main.cpp
auto ret = fastboot ? StartFastboot(device, args) : start_recovery(device, args);

if (ret == Device::KEY_INTERRUPTED) {
  ret = action.exchange(ret);
  if (ret == Device::NO_ACTION) {
    continue;
  }
}
switch (ret) {
  case Device::SHUTDOWN:
    ui->Print("Shutting down...\n");
    Shutdown("userrequested,recovery");
    break;

  case Device::SHUTDOWN_FROM_FASTBOOT:
    ui->Print("Shutting down...\n");
    Shutdown("userrequested,fastboot");
    break;

  case Device::REBOOT_BOOTLOADER:
    ui->Print("Rebooting to bootloader...\n");
    Reboot("bootloader");
    break;

  case Device::REBOOT_FASTBOOT:
    ui->Print("Rebooting to recovery/fastboot...\n");
    Reboot("fastboot");
    break;

  case Device::REBOOT_RECOVERY:
    ui->Print("Rebooting to recovery...\n");
    Reboot("recovery");
    break;

  case Device::REBOOT_RESCUE: {
    // Not using `Reboot("rescue")`, as it requires matching support in kernel and/or
    // bootloader.
    bootloader_message boot = {};
    strlcpy(boot.command, "boot-rescue", sizeof(boot.command));
    std::string err;
    if (!write_bootloader_message(boot, &err)) {
      LOG(ERROR) << "Failed to write bootloader message: " << err;
      // Stay under recovery on failure.
      continue;
    }
    ui->Print("Rebooting to recovery/rescue...\n");
    Reboot("recovery");
    break;
  }

  case Device::ENTER_FASTBOOT:
    if (android::fs_mgr::LogicalPartitionsMapped()) {
      ui->Print("Partitions may be mounted - rebooting to enter fastboot.");
      Reboot("fastboot");
    } else {
      LOG(INFO) << "Entering fastboot";
      fastboot = true;
    }
    break;

  case Device::ENTER_RECOVERY:
    LOG(INFO) << "Entering recovery";
    fastboot = false;
    break;

  case Device::REBOOT:
    ui->Print("Rebooting...\n");
    Reboot("userrequested,recovery");
    break;

  case Device::REBOOT_FROM_FASTBOOT:
    ui->Print("Rebooting...\n");
    Reboot("userrequested,fastboot");
    break;

  default:
    ui->Print("Rebooting...\n");
    Reboot("unknown" + std::to_string(ret));
    break;
}

void Reboot(std::string_view target) {
  std::string cmd = "reboot," + std::string(target);
  // Honor the quiescent mode if applicable.
  if (target != "bootloader" && target != "fastboot" &&
      android::base::GetBoolProperty("ro.boot.quiescent", false)) {
    cmd += ",quiescent";
  }
  if (!android::base::SetProperty(ANDROID_RB_PROPERTY, cmd)) {
    LOG(FATAL) << "Reboot failed";
  }

  while (true) pause();
}

bool Shutdown(std::string_view target) {
  std::string cmd = "shutdown," + std::string(target);
  return android::base::SetProperty(ANDROID_RB_PROPERTY, cmd);
}
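The string that Reboot() hands to the reboot property can be computed separately from the property write. The sketch below isolates that step; `BuildRebootCommand` is a hypothetical helper, and the `quiescent` parameter replaces the `ro.boot.quiescent` property lookup so that the function has no Android dependency.

```cpp
#include <cassert>  // for the assert-based checks in the usage note
#include <string>

// Builds the value written to the reboot property, following the logic of
// Reboot() above. `quiescent` stands in for the ro.boot.quiescent property.
std::string BuildRebootCommand(const std::string& target, bool quiescent) {
  std::string cmd = "reboot," + target;
  // Quiescent mode is not appended for bootloader or fastboot targets.
  if (target != "bootloader" && target != "fastboot" && quiescent) {
    cmd += ",quiescent";
  }
  return cmd;
}
```

For the normal end of an upgrade, `BuildRebootCommand("userrequested,recovery", false)` gives `reboot,userrequested,recovery`.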

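6. BootLoader boots the Main System

This step is much the same as 3. BootLoader reads the BCB and boots into the Recovery System, except that the BCB in the misc partition was already erased when recovery exited. The BootLoader therefore loads the kernel from the boot partition, and the device boots into the main system.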
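The decision the BootLoader makes here can be pictured with a few lines of code. This is purely illustrative: real bootloaders are vendor-specific and also consider key combinations, reboot reasons and other inputs. `SelectBootPartition` is a hypothetical function, and treating `boot-recovery` as the only command that selects recovery is an assumption for this sketch.

```cpp
#include <cassert>  // for the assert-based checks in the usage note
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical sketch: pick the partition to boot from the BCB command field.
// The field is a fixed-size buffer that may not be NUL-terminated, so its
// length is bounded by `field_size`.
std::string SelectBootPartition(const char* command_field, std::size_t field_size) {
  const void* nul = std::memchr(command_field, '\0', field_size);
  const std::size_t len =
      nul != nullptr ? static_cast<std::size_t>(static_cast<const char*>(nul) - command_field)
                     : field_size;
  const std::string command(command_field, len);
  // Assumption: only "boot-recovery" redirects the boot to the recovery partition.
  return command == "boot-recovery" ? "recovery" : "boot";
}
```

With an all-zero command field (the state left behind by FinishRecovery) the function returns `boot`, which matches the behaviour described in this step.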

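7. Init starts the flash_recovery service to upgrade the recovery partition

The vendor_flash_recovery service is defined in an rc file, and its executable is /vendor/bin/install-recovery.sh. The definition of flash_recovery is not unique: some vendors define it under system instead. The task and the principle are the same either way, namely bringing the recovery partition up to the new version.

Taking AOSP as an example, flash_recovery is defined under vendor and renamed vendor_flash_recovery: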

# bootable/recovery/applypatch/vendor_flash_recovery.rc
service vendor_flash_recovery /vendor/bin/install-recovery.sh
    class main
    oneshot

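7.1 Why does flash_recovery exist?

Why not put the recovery partition image into the upgrade package and let recovery upgrade itself while it upgrades the system? There are two main reasons:

  • Upgrade stability
    Suppose the device reboots unexpectedly while the recovery partition is being written, with only part of the data written. The recovery partition is then corrupted, and the interrupted-upgrade recovery mechanism mentioned above can no longer work, because the device can no longer boot into the recovery system. The flash_recovery mechanism avoids this problem, since the recovery partition is rewritten from the main system rather than by recovery itself.
  • System security
    Enthusiasts on forums often flash a third-party recovery in order to flash third-party ROMs or extract data from a phone, which is very insecure. This mechanism mitigates the problem to some extent: on every boot, flash_recovery checks whether the SHA1 of the recovery partition data matches the expected value and, if it does not, restores the recovery partition data.

So how does flash_recovery upgrade or restore the recovery partition? See section 7.3.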

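7.2 When is the flash_recovery service started?

vendor_flash_recovery.rc shows that vendor_flash_recovery belongs to class main, so it starts whenever the services of class main are started.
init.rc shows that if the partition is not encrypted, class main is started when the nonencrypted trigger fires; otherwise startup is controlled by the property vold.decrypt, which is set during the decryption flow.

  • on nonencrypted is triggered by the function queue_fs_event in builtins.cpp.
  • The property vold.decrypt is set in system/vold/cryptfs.cpp.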
// system/core/init/builtins.cpp
static Result<void> queue_fs_event(int code, bool userdata_remount)

# system/core/rootdir/init.rc
on nonencrypted
    class_start main
    class_start late_start

on property:vold.decrypt=trigger_restart_min_framework
    # A/B update verifier that marks a successful boot.
    exec_start update_verifier
    class_start main

on property:vold.decrypt=trigger_restart_framework
    # A/B update verifier that marks a successful boot.
    exec_start update_verifier
    class_start_post_data hal
    class_start_post_data core
    class_start main
    class_start late_start
    setprop service.bootanim.exit 0
    start bootanim

on property:vold.decrypt=trigger_shutdown_framework
    class_reset late_start
    class_reset main
    class_reset_post_data core
    class_reset_post_data hal

7.3 How does flash_recovery upgrade the recovery partition?

Starting the vendor_flash_recovery service runs the script install-recovery.sh, which upgrades the recovery partition.

# bootable/recovery/applypatch/vendor_flash_recovery.rc
service vendor_flash_recovery /vendor/bin/install-recovery.sh
    class main
    oneshot

The content of the install-recovery.sh script is shown below:

#!/system/bin/sh
if ! applypatch --check EMMC:/dev/block/bootdevice/by-name/recovery:100663296:5859245349a4196c1e30f0ff21d727016e740e25; then
  applypatch  \
          --patch /system/recovery-from-boot.p \
          --source EMMC:/dev/block/bootdevice/by-name/boot:100663296:a362e080d203e34fbdcce47278cda2bda566409a \
          --target EMMC:/dev/block/bootdevice/by-name/recovery:100663296:5859245349a4196c1e30f0ff21d727016e740e25 && \
      log -t recovery "Installing new recovery image: succeeded" || \
      log -t recovery "Installing new recovery image: failed"
else
  log -t recovery "Recovery image already installed"
fi

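The script upgrades the recovery partition in the following steps:
1). Compute the SHA1 of the recovery partition and check whether it matches the expected value.

The expected SHA1 is hard-coded in the script. It is fixed at build time: when the build server generates the recovery partition image, it computes the SHA1 of the content and writes it directly into install-recovery.sh when generating the script.

2). If the SHA1 matches, the recovery partition has already been upgraded and the script ends.

 applypatch --check EMMC:/dev/block/bootdevice/by-name/recovery:100663296:5859245349a4196c1e30f0ff21d727016e740e25

"--check": tells applypatch to verify the partition data; it is followed by the partition path, the size, and the expected SHA1.

3). If the SHA1 does not match, the recovery partition has not been upgraded or its data is corrupted, so the recovery partition is upgraded.
applypatch loads the data from the boot partition, applies the patch (/system/recovery-from-boot.p), and writes the resulting data to the recovery partition.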

  applypatch  \
          --patch /system/recovery-from-boot.p \
          --source EMMC:/dev/block/bootdevice/by-name/boot:100663296:a362e080d203e34fbdcce47278cda2bda566409a \
          --target EMMC:/dev/block/bootdevice/by-name/recovery:100663296:5859245349a4196c1e30f0ff21d727016e740e25

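"--patch": the patch file. It is a binary diff generated at build time by a diff tool from the raw data of the boot and recovery images;
"--source": the source. As the command shows, the argument consists of the device path, the size, and the SHA1, so the integrity and validity of the source data are also verified when the patch is applied;
"--target": the target. The data produced by applying the patch to the source is written to the target. Its argument likewise consists of the device path, the size, and the SHA1, so the target data is checked once the upgrade finishes. Note that the "--target" argument is identical to the "--check" argument.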
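The three arguments above share one textual form, `EMMC:<device>:<size>:<sha1>`. As an illustration of how such a spec breaks down, here is a minimal parsing sketch. It is not applypatch's own parser: `PartitionSpec`, `ParsePartitionSpec` and `NeedsRecoveryUpdate` are hypothetical names, and the validation rules (decimal size, 40 hex digits for the SHA1) are assumptions drawn from the values shown in the script.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical representation of "EMMC:<device>:<size>:<sha1>".
struct PartitionSpec {
  std::string device;
  std::uint64_t size = 0;
  std::string sha1;
};

// Splits and validates a spec string; returns false on any malformed field.
bool ParsePartitionSpec(const std::string& spec, PartitionSpec* out) {
  assert(out != nullptr);
  std::vector<std::string> fields;
  std::string field;
  std::istringstream stream(spec);
  while (std::getline(stream, field, ':')) fields.push_back(field);

  if (fields.size() != 4 || fields[0] != "EMMC") return false;
  if (fields[1].empty() || fields[2].empty()) return false;

  std::uint64_t size = 0;
  for (char c : fields[2]) {
    if (!std::isdigit(static_cast<unsigned char>(c))) return false;
    size = size * 10 + static_cast<std::uint64_t>(c - '0');
  }

  // Assumption: the SHA1 is written as 40 hexadecimal digits.
  if (fields[3].size() != 40) return false;
  for (char c : fields[3]) {
    if (!std::isxdigit(static_cast<unsigned char>(c))) return false;
  }

  out->device = fields[1];
  out->size = size;
  out->sha1 = fields[3];
  return true;
}

// Step 2/3 of the script as a predicate: patch only when the digest of the
// current recovery data differs from the expected one.
bool NeedsRecoveryUpdate(const std::string& actual_sha1, const PartitionSpec& expected) {
  return actual_sha1 != expected.sha1;
}
```

Parsing the `--target` value from the script yields the device `/dev/block/bootdevice/by-name/recovery` and a size of 100663296 bytes, i.e. 96 MiB.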

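Why is the boot partition the source? Where do its size, its SHA1, and the patch file come from?

As described in the software architecture discussion, the content of the boot and recovery partitions is essentially kernel + ramdisk (the recovery image is sometimes kernel + ramdisk + dtb). Their kernels are basically identical, their ramdisks differ only slightly, and the dtb is small, so most of the data in the two partitions is the same. Once recovery has upgraded the system to the new version, there is therefore no need to keep a complete recovery image in the system partition just to upgrade the recovery partition. Instead, the data in the boot partition is reused: at build time a patch between the two images is stored in the system partition, and upgrading recovery only requires loading the boot partition data and applying the patch to reconstruct the recovery partition data. This also saves a good deal of space in the system partition.

The core program here, applypatch, is also fairly complex and is not covered in detail here. See: (to be continued)

8. Boot to the launcher; the upgrade flow ends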
