Linux VM sockets in Go

1. Kernel version 4.8+;

2. QEMU version 2.8+;

3. qemu-system-x86_64 -m 4G -hda /home/matt/ubuntuvm0.img -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3 -vnc :0 --enable-kvm

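Each virtual machine must be assigned a unique context ID, which is similar to an IP address. vsock uses the context ID plus a port to communicate with the host. The host also has a context ID, 2 by default, so VM context IDs are generally assigned starting from 3;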

4. Kernel configuration options:

  CONFIG_VSOCKETS=y
  CONFIG_VIRTIO_VSOCKETS=y
  CONFIG_VIRTIO_VSOCKETS_COMMON=y
  CONFIG_VHOST_VSOCK=m

5. Run modprobe vhost_vsock, which by default creates the devices /dev/vhost-vsock and /dev/vsock.

During a recent discussion with coworkers, I discovered a new Linux socket family: VM sockets (AF_VSOCK address family). This new socket family enables bi-directional, many-to-one communication between a hypervisor and its virtual machines, using the classic BSD sockets API.

Although VM sockets were originally introduced by VMware, they can be used with QEMU+KVM virtual machines as well. This post will detail how VM sockets work and how they can be used.

If you’d like to see more examples and make use of VM sockets in your own Go applications, check out: github.com/mdlayher/vsock.

Introduction to VM sockets

VM sockets were added to the kernel to overcome some of the limitations that existing communication mechanisms faced:

  • Serial port communication is meant for one-to-one, not many-to-one communications.
  • Only 512 serial ports are available (relatively low limit).

Because VM sockets do not rely on the host’s networking stack at all, it is possible to configure VMs entirely without networking: only allowing communication using VM sockets.

VM sockets setup

To take advantage of VM sockets (using virtio-vsock), the Linux kernel (on both the hypervisor and guest) and QEMU must be fairly up-to-date. Kernel 4.8+ is required on both machines, and QEMU 2.8+ is required to execute the VM.

Once these components are in place, some setup must be done on the hypervisor to enable VM sockets communication.

First, the necessary kernel modules must be loaded on the hypervisor (with kernel 4.8+).

hypervisor $ uname -a
Linux hypervisor 4.8.0-39-generic #42~16.04.1-Ubuntu SMP Mon Feb 20 15:06:07 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
hypervisor $ sudo modprobe vhost_vsock

Once the kernel module is loaded, two special character devices will appear on the hypervisor.

hypervisor $ ls -l /dev/vhost-vsock
crw------- 1 root root 10, 53 May  4 11:55 /dev/vhost-vsock
hypervisor $ ls -l /dev/vsock
crw-rw-rw- 1 root root 10, 54 May  4 11:55 /dev/vsock

Next, QEMU must be started with a special vhost-vsock-pci device attached that enables VM sockets communication within the VM. Note that each VM on a hypervisor must have a unique “cid” (context ID). For this example, we’ve chosen guest-cid=3.

hypervisor $ sudo qemu-system-x86_64 -m 4G -hda /home/matt/ubuntuvm0.img -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3 -vnc :0 --enable-kvm
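
When several VMs are started, each needs its own guest-cid. The following is a minimal sketch of building the argument list shown above for a given context ID. The helper name qemuArgs is hypothetical, and the program only prints the command rather than running it.

```go
package main

import (
	"fmt"
	"strings"
)

// qemuArgs (hypothetical helper) builds the argument list from the command
// above for a given disk image and guest context ID.
func qemuArgs(image string, guestCID uint32) ([]string, error) {
	// Guest CIDs start at 3, and 0xffffffff is reserved.
	if guestCID < 3 || guestCID == 0xffffffff {
		return nil, fmt.Errorf("invalid guest CID %d", guestCID)
	}
	device := fmt.Sprintf("vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=%d", guestCID)
	return []string{
		"-m", "4G",
		"-hda", image,
		"-device", device,
		"-vnc", ":0",
		"--enable-kvm",
	}, nil
}

func main() {
	args, err := qemuArgs("/home/matt/ubuntuvm0.img", 3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Print the command rather than executing it.
	fmt.Println("qemu-system-x86_64", strings.Join(args, " "))
}
```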

Within the virtual machine, verify that the /dev/vsock device is available.

vm $ ls -l /dev/vsock
crw-rw-rw- 1 root root 10, 55 May  4 13:21 /dev/vsock

VM sockets addresses

A VM sockets address consists of a context ID and a port, just like an IP address and TCP/UDP port.

The context ID (CID) is analogous to an IP address, and is represented using an unsigned 32-bit integer. It identifies a given machine as either a hypervisor or a virtual machine. Several addresses are reserved, including 0, 1, and the maximum value for a 32-bit integer: 0xffffffff. The hypervisor is always assigned a CID of 2, and VMs can be assigned any CID between 3 and 0xffffffff - 1.
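
The CID ranges described above can be captured in a few lines of Go. This is only a sketch: the constant names and the describeCID function are illustrative and do not come from any particular library.

```go
package main

import "fmt"

// Values taken from the description above; the names are illustrative.
const (
	cidHost     uint32 = 2
	cidMaxValue uint32 = 0xffffffff
)

// describeCID classifies a context ID as reserved, host, or guest.
func describeCID(cid uint32) string {
	switch {
	case cid == 0, cid == 1, cid == cidMaxValue:
		return "reserved"
	case cid == cidHost:
		return "host"
	default:
		// Everything from 3 up to 0xffffffff - 1.
		return "guest"
	}
}

func main() {
	for _, cid := range []uint32{0, 1, 2, 3, 0xfffffffe, 0xffffffff} {
		fmt.Printf("%#x: %s\n", cid, describeCID(cid))
	}
}
```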

A port is analogous to a typical TCP or UDP port, and is represented using an unsigned 32-bit integer. Many different services can run on the same host by binding to different ports, and each port can serve multiple connections concurrently. As with IP ports, ports in the range 0-1023 are considered “privileged”, and only root or a user with CAP_NET_BIND_SERVICE may bind to these ports.
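
Putting the two halves together, an address is just a pair of unsigned 32-bit integers. The type below is a sketch for illustration: the addr type, its "cid:port" string format, and the privileged method are assumptions made for this example.

```go
package main

import "fmt"

// addr is an illustrative VM sockets address: a context ID plus a port.
type addr struct {
	CID  uint32
	Port uint32
}

// String formats the address as "cid:port" (a format chosen for this sketch).
func (a addr) String() string {
	return fmt.Sprintf("%d:%d", a.CID, a.Port)
}

// privileged reports whether the port falls in the 0-1023 range that
// requires elevated privileges to bind.
func (a addr) privileged() bool {
	return a.Port <= 1023
}

func main() {
	for _, a := range []addr{{CID: 2, Port: 1024}, {CID: 3, Port: 80}} {
		fmt.Printf("%s privileged=%t\n", a, a.privileged())
	}
}
```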

VM sockets API

Now that we are familiar with some of the basics of VM sockets, let’s dive into the API. As with other socket types on Linux, the BSD sockets API is used when configuring VM sockets. It appears that VM sockets can be used in both connection-oriented (like TCP) and connectionless (like UDP) modes, but this post will only cover the connection-oriented variant.

All pseudo-code in this post will make use of Go’s golang.org/x/sys/unix package.

If you have experience creating TCP sockets using system calls, this process should seem quite familiar. First, let’s start up a VM sockets server on the hypervisor.

// Retrieve host's context ID from /dev/vsock. More on this later.
cid := localContextID()
// Establish a connection-oriented VM socket.
socket, err := unix.Socket(unix.AF_VSOCK, unix.SOCK_STREAM, 0)
if err != nil {
	return err
}
// Bind socket to local context ID, port 1024.
sockaddr := &unix.SockaddrVM{
	CID:  cid,
	Port: 1024,
}
if err := unix.Bind(socket, sockaddr); err != nil {
	return err
}
// Listen with a backlog of up to 32 pending connections.
if err := unix.Listen(socket, 32); err != nil {
	return err
}
// Accept a connection; fd is a new socket for that client.
fd, _, err := unix.Accept(socket)
if err != nil {
	return err
}
// Use fd to read and write data to and from a VM.

Next, we can dial out to the server running on the hypervisor, from a client running in the VM.

// Establish a connection-oriented VM socket.
socket, err := unix.Socket(unix.AF_VSOCK, unix.SOCK_STREAM, 0)
if err != nil {
	return err
}
// Connect socket to hypervisor context ID, port 1024.
sockaddr := &unix.SockaddrVM{
	CID:  2,
	Port: 1024,
}
if err := unix.Connect(socket, sockaddr); err != nil {
	return err
}
// Use socket to read and write data to and from the hypervisor.

As you can see, VM sockets can more or less be used as a drop-in replacement for typical TCP sockets.

Retrieving the local context ID

When working with VM sockets, it can be useful to retrieve the local context ID of a given machine. This can be done by performing an ioctl() system call on /dev/vsock.

// Open /dev/vsock to perform ioctl().
f, err := os.Open("/dev/vsock")
if err != nil {
	return err
}
defer f.Close()
// Kernel constant for retrieving local CID from device.
const getLocalCID = 0x7b9
// Ask kernel to dereference a pointer to cid and place the local
// CID for this host in the uint32 value "cid".
var cid uint32
if err := ioctl(f.Fd(), getLocalCID, uintptr(unsafe.Pointer(&cid))); err != nil {
	return err
}

Because of the very versatile and somewhat dangerous nature of the ioctl() system call, it is not implemented directly in x/sys/unix. You can see how I’ve implemented it in package vsock here.
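
For readers who want to try this without any third-party package, here is a minimal sketch using the standard library's syscall package on Linux. It is not the implementation from package vsock. The path parameter on localContextID is an assumption added so the function can be exercised against any file, and the decomposition of 0x7b9 into type 7 and number 0xb9 follows the usual ioctl encoding on x86.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
	"unsafe"
)

// getLocalCID is the ioctl request number from the snippet above. On x86 it
// can be composed as (type << 8) | number, with type 7 and number 0xb9.
const getLocalCID = 7<<8 | 0xb9 // 0x7b9

// localContextID (sketch) opens the device at path and asks the kernel for
// the local CID.
func localContextID(path string) (uint32, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	var cid uint32
	// The unsafe.Pointer conversion must appear directly in the Syscall
	// call so that the pointer stays valid for the duration of the call.
	_, _, errno := syscall.Syscall(
		syscall.SYS_IOCTL,
		f.Fd(),
		uintptr(getLocalCID),
		uintptr(unsafe.Pointer(&cid)),
	)
	if errno != 0 {
		return 0, os.NewSyscallError("ioctl", errno)
	}
	return cid, nil
}

func main() {
	cid, err := localContextID("/dev/vsock")
	if err != nil {
		fmt.Println("could not read local CID:", err)
		return
	}
	fmt.Println("local CID:", cid)
}
```

On a machine without /dev/vsock the program reports the open error instead of a CID.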

Package vsock

To simplify the VM sockets setup process and enable code reuse, I have created a VM sockets package for Go: github.com/mdlayher/vsock.

Using package vsock, one can build client/server applications in Go using VM sockets in a rather straightforward and familiar way: using the net.Listener and net.Conn interfaces.

As an example, let’s create an “echo” service using VM sockets. A server listens for incoming connections, and when a message is received from a client, it is echoed back to the client.

Here’s the code for the server:

// Listen for VM sockets connections on port 1024.
l, err := vsock.Listen(1024)
if err != nil {
	return err
}
defer l.Close()
// Accept a single connection.
c, err := l.Accept()
if err != nil {
	return err
}
defer c.Close()
// Echo all data from the client back to the client.
if _, err := io.Copy(c, c); err != nil {
	return err
}

The code for the client is succinct as well:

// Dial a VM sockets connection to a process on the hypervisor
// bound to port 1024.
c, err := vsock.Dial(vsock.ContextIDHost, 1024)
if err != nil {
	return err
}
defer c.Close()
// Send a brief message to the hypervisor.
if _, err := c.Write([]byte("hello world")); err != nil {
	return err
}
// Read back the echoed response from the hypervisor.
b := make([]byte, 16)
n, err := c.Read(b)
if err != nil {
	return err
}
fmt.Println(string(b[:n]))

Summary

VM sockets are a very interesting new communication mechanism, but it may take some time for many production environments to deploy new enough versions of the Linux kernel and QEMU to take advantage of them.

Once widely deployed, they could become quite useful for offering additional services on Infrastructure-as-a-Service platforms. Guest agents running inside the VM could leverage services provided by the hypervisor in new and interesting ways. The possibilities are limitless!

If you’d like a more in-depth look at VM sockets, I recommend this excellent presentation by Stefan Hajnoczi. You may also be interested in Stefan’s proposed additions to the virtio specification for virtio socket devices. Finally, I’d like to thank Stefan for personally answering several of my questions about VM sockets terminology and architecture.

Thank you for reading, and I hope you’ve learned something new from this post. I encourage you to set up VM sockets in a development environment, and to try out package vsock as well!

If you enjoyed this post, you may also be interested in my series about using Netlink sockets in Go! Thank you for your time.

References

  • Features/Virtio-Vsock — QEMU: http://wiki.qemu.org/Features/VirtioVsock
