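LLVM IR sits at the core of the whole LLVM project: it is the middle stage of LLVM's three-phase design and connects the front end to the back end. Reading and writing LLVM bitcode comes up in both the front end and the back end, and it is just as central when doing custom development on top of LLVM.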

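The official LLVM documentation covers this area in detail, but as LLVM has grown, the documentation has grown heavier too. Sometimes a simple task means searching through several documents and reading the source alongside them. Neil Henning wrote an article on how to read and write LLVM bitcode that is easy to read and very thorough: it goes from installing LLVM, to the CMake file, to what each step of the code does. It is a good fit for anyone who just wants to solve this one problem, and it comes with sample code. Blog post: How to read & write LLVM bitcode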

GitHub repository: sheredom/llvm_bc_parsing_example

----------------------------------------------------------------------------------------------

In case the blog becomes unreachable, the full text of the post is reproduced below:

How to read & write LLVM bitcode

I’ve read multiple posts on social media now complaining about how scary LLVM is to get to grips with. It doesn’t help that the repository is ginormous, there are frequently hundreds of commits a day, the mailing list is nearly impossible to keep track of, and the resulting executables are topping 40Mb now…
Those tidbits aside – LLVM is super easy to work with once you get to grips with the beast that it is. To help aid people in using LLVM, I thought I’d put together the most trivial no-op example you can do with LLVM – parsing one of LLVM’s intermediate representation files (known as bitcode, file extension .bc) and then writing it back out.
Firstly, let's go through some high-level LLVM terms:

  • LLVM’s main abstraction for user code is the Module. It’s a class that contains all the functions, global variables, and instructions for the code you or other users write.
  • Bitcode files are effectively a serialization of an LLVM Module such that it can be reconstructed in a different program later.
  • LLVM uses MemoryBuffer objects to handle data that comes from files, stdin, or arrays.
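
To make the serialization point a little more concrete, here is a minimal sketch in plain C of how a program might check whether a byte buffer looks like bitcode before handing it to LLVM. It assumes only the magic numbers described in the LLVM bitcode file format documentation: a raw bitcode stream begins with the bytes 'B', 'C', 0xC0, 0xDE, and the optional wrapper header begins with the word 0x0B17C0DE stored little-endian. The function name looks_like_bitcode is made up for this illustration and is not part of LLVM.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an LLVM API: returns 1 if the buffer starts with
   a bitcode magic number, 0 otherwise. */
int looks_like_bitcode(const unsigned char *data, size_t size) {
  /* raw bitcode stream: 'B' 'C' 0xC0 0xDE */
  static const unsigned char raw_magic[4] = {'B', 'C', 0xC0, 0xDE};
  /* bitcode wrapper header: 0x0B17C0DE stored little-endian */
  static const unsigned char wrapper_magic[4] = {0xDE, 0xC0, 0x17, 0x0B};

  if (data == NULL || size < 4) {
    return 0;
  }
  return memcmp(data, raw_magic, 4) == 0 ||
         memcmp(data, wrapper_magic, 4) == 0;
}
```

A check like this only looks at the first four bytes, so it says nothing about whether the rest of the file is valid. The bitcode reader remains the real judge of that.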

For my example, we’ll use the LLVM C API, a more stable abstraction on top of LLVM’s core C++ headers. The C API is really useful if you’ve got code that you want to work with multiple versions of LLVM, because it’s significantly more stable than the LLVM C++ headers. (As an aside, I use LLVM extensively for my job and nearly every week some LLVM C++ header change will break our code. I’ve never had the C API break my code.)
First off, I’m going to assume you’ve pulled LLVM, built and installed it. Some simple steps to do this:
git clone https://git.llvm.org/git/llvm.git <llvm dir>
cd <llvm dir>
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=RELEASE -DCMAKE_INSTALL_PREFIX=install ..
cmake --build . --target install
After doing the above, you’ll have an LLVM install in <llvm dir>/build/install!
So for our little executable I’ve used CMake. CMake is by far the easiest way to integrate with LLVM as it is the build system LLVM also uses.
cmake_minimum_required(VERSION 3.4.3)
project(llvm_bc_parsing_example)

# option to allow a user to specify where an LLVM install is on the system
set(LLVM_INSTALL_DIR "" CACHE STRING "An LLVM install directory.")

if("${LLVM_INSTALL_DIR}" STREQUAL "")
  message(FATAL_ERROR "LLVM_INSTALL_DIR not set! Set it to the location of an LLVM install.")
endif()

# fixup paths to only use the Linux convention
string(REPLACE "\\" "/" LLVM_INSTALL_DIR ${LLVM_INSTALL_DIR})

# tell CMake where LLVM's module is
list(APPEND CMAKE_MODULE_PATH ${LLVM_INSTALL_DIR}/lib/cmake/llvm)

# include LLVM
include(LLVMConfig)

add_executable(llvm_bc_parsing_example main.c)

target_include_directories(llvm_bc_parsing_example PUBLIC ${LLVM_INCLUDE_DIRS})

target_link_libraries(llvm_bc_parsing_example PUBLIC LLVMBitReader LLVMBitWriter)

So now we’ve got our CMake setup, and we can use our existing LLVM install, we can now get working on our actual C code!
So to use the LLVM C API there is one header you basically always need:
#include <llvm-c/Core.h>
And two extra headers we need for our executable are the bitcode reader and writer:
#include <llvm-c/BitReader.h>
#include <llvm-c/BitWriter.h>
Now we create our main function. I’m assuming here that we always take exactly 2 command line arguments, the first being the input file, the second being the output file. LLVM has a system whereby if a file named '-' is provided, that means read from stdin or write to stdout, so I decided to support that too:
if (3 != argc) {
  fprintf(stderr, "Invalid command line!\n");
  return 1;
}

const char *const inputFilename = argv[1];
const char *const outputFilename = argv[2];
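
The '-' convention is checked twice in this program, once for the input and once for the output, so it could be factored into a small helper. This is only a sketch: is_stdio_name is a hypothetical name, not something from the LLVM C API or the original example. It is equivalent to the character-by-character comparison used in the snippets that follow.

```c
#include <string.h>

/* Hypothetical helper: true only for the exact one-character name "-",
   which by convention means stdin (for input) or stdout (for output). */
int is_stdio_name(const char *filename) {
  return filename != NULL && strcmp(filename, "-") == 0;
}
```

With this in place, the conditions could read `if (is_stdio_name(inputFilename))` and `if (is_stdio_name(outputFilename))`. Names such as "-x" or an empty string are treated as ordinary file names.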
So first we parse the input file. We’ll create an LLVM memory buffer object from either stdin, or a filename:
LLVMMemoryBufferRef memoryBuffer;

// check if we are to read our input file from stdin
if (('-' == inputFilename[0]) && ('\0' == inputFilename[1])) {
  char *message;
  if (0 != LLVMCreateMemoryBufferWithSTDIN(&memoryBuffer, &message)) {
    fprintf(stderr, "%s\n", message);
    // messages returned by the C API are released with LLVMDisposeMessage
    LLVMDisposeMessage(message);
    return 1;
  }
} else {
  char *message;
  if (0 != LLVMCreateMemoryBufferWithContentsOfFile(
               inputFilename, &memoryBuffer, &message)) {
    fprintf(stderr, "%s\n", message);
    LLVMDisposeMessage(message);
    return 1;
  }
}
So after this code, memoryBuffer will be usable to read our bitcode file into an LLVM module. So let’s create the module!
// now create our module using the memory buffer
LLVMModuleRef module;
if (0 != LLVMParseBitcode2(memoryBuffer, &module)) {
  fprintf(stderr, "Invalid bitcode detected!\n");
  LLVMDisposeMemoryBuffer(memoryBuffer);
  return 1;
}

// done with the memory buffer now, so dispose of it
LLVMDisposeMemoryBuffer(memoryBuffer);
Once we’ve got our module, we no longer need the memory buffer, so we can free up the memory straight away. And that’s it! We’ve managed to take an LLVM bitcode file and deserialize it into an LLVM module, which we could then fiddle with (I’m not going to in this blog post at least!). So let’s assume you’ve done all you wanted with the LLVM module, and want to write the sucker back out to a bitcode file again.
The approach mirrors the reading approach: we look for the special filename '-' and handle it accordingly:
// check if we are to write our output file to stdout
// (STDOUT_FILENO comes from <unistd.h> on POSIX systems)
if (('-' == outputFilename[0]) && ('\0' == outputFilename[1])) {
  if (0 != LLVMWriteBitcodeToFD(module, STDOUT_FILENO, 0, 0)) {
    fprintf(stderr, "Failed to write bitcode to stdout!\n");
    LLVMDisposeModule(module);
    return 1;
  }
} else {
  if (0 != LLVMWriteBitcodeToFile(module, outputFilename)) {
    fprintf(stderr, "Failed to write bitcode to file!\n");
    LLVMDisposeModule(module);
    return 1;
  }
}
Lastly, we should be good citizens and clean up our garbage, so also delete the module too:
LLVMDisposeModule(module);
And that’s it! You are now able to parse and then write an LLVM bitcode file. I’ve put the full example up here on GitHub – https://github.com/sheredom/llvm_bc_parsing_example.
Maybe I’ll do a follow-up post at some point taking you through the basics of how to do things to an LLVM module, but for now, adieu!
