
PCIe passthru Bhyve

 ·  🎃 kr0m

As we saw in a previous article, Bhyve is a lightweight virtualization system with many features, including PCIe device passthrough. Passthrough has the following requirements:

  • The CPU supports the Intel IOMMU feature (a.k.a. VT-d):
    My processor is an Intel Core i7-6700K, which supports both VT-x and VT-d:

    dmesg

    CPU: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz (4007.99-MHz K8-class CPU)
      Origin="GenuineIntel"  Id=0x506e3  Family=0x6  Model=0x5e  Stepping=3
      Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
      Features2=0x7ffafbbf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
      AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
      AMD Features2=0x121<LAHF,ABM,Prefetch>
      Structured Extended Features=0x29c6fbf<FSGSBASE,TSCADJ,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,NFPUSG,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PROCTRACE>
      Structured Extended Features3=0xbc002e00<MCUOPT,MD_CLEAR,TSXFA,IBPB,STIBP,L1DFL,ARCH_CAP,SSBD>
      XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
      IA32_ARCH_CAPS=0xc04<RSBA>
      VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
      TSC: P-state invariant, performance statistics
    

    In my case, enabling VT-x in the BIOS wasn’t enough; I also had to go to Advanced -> System Agent Configuration -> VT-d: Enabled

    We verify that ACPI has the necessary device mapping information:

    acpidump -t | grep DMAR

      DMAR: Length=176, Revision=1, Checksum=45,
    
  • The PCI device (and its driver) supports MSI/MSI-X interrupts:

    pciconf -lc | grep MSI

        cap 05[90] = MSI supports 1 message
        cap 05[ac] = MSI supports 1 message
        cap 05[80] = MSI supports 8 messages, 64 bit enabled with 1 message
        cap 05[8c] = MSI supports 1 message, 64 bit
        cap 05[80] = MSI supports 1 message enabled with 1 message
        cap 05[80] = MSI supports 1 message
        cap 05[80] = MSI supports 1 message
        cap 05[80] = MSI supports 1 message
        cap 05[80] = MSI supports 1 message
        cap 05[80] = MSI supports 1 message
        cap 05[60] = MSI supports 1 message, 64 bit enabled with 1 message
        cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
        cap 05[68] = MSI supports 1 message, 64 bit enabled with 1 message
        cap 05[68] = MSI supports 1 message, 64 bit enabled with 1 message
        cap 05[50] = MSI supports 8 messages, 64 bit
        cap 11[68] = MSI-X supports 8 messages, enabled
        cap 05[50] = MSI supports 8 messages, 64 bit
        cap 11[68] = MSI-X supports 8 messages
        cap 05[50] = MSI supports 1 message, 64 bit
        cap 11[b0] = MSI-X supports 33 messages, enabled
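
A plain grep for MSI drops the device names, so it is hard to tell which capability belongs to which card. The following is a minimal sketch, assuming that pciconf -lc prints each device line starting in column 0 with its capability lines indented beneath it. The function name msi_devices is hypothetical, not part of any tool:

```shell
# msi_devices: read `pciconf -lc` output on stdin and print one line per
# MSI / MSI-X capability, prefixed with the device it belongs to.
msi_devices() {
    awk '
        # Device lines start in column 0, e.g. "xhci2@pci0:5:0:0:<TAB>class=..."
        /^[^ \t]/     { sub(/:$/, "", $1); dev = $1; next }
        # Capability lines are indented, e.g. "    cap 05[80] = MSI supports ..."
        /= MSI(-X)? / { sub(/^.*= /, ""); print dev ": " $0 }
    '
}

# Usage on the FreeBSD host:
#   pciconf -lc | msi_devices
```

Each output line then pairs a device selector with one capability, which makes it easy to confirm that the card we want to pass through is among them.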
    

We must take into account that passthrough has some important limitations, such as:

  • The device must be detached from the parent host before it can be used in virtual machines, so the host itself can no longer use it.
  • If it is a USB controller, individual ports cannot be assigned; the entire controller is assigned.

To avoid these limitations, I bought a dedicated PCIe USB controller, a YEELIYA card, that I use exclusively in the virtual machines. This way, the host keeps all of its own USB ports.


Parent host:

The best way to identify the card is by using the commands installed by vm-bhyve:

vm passthru

DEVICE     BHYVE ID     READY        DESCRIPTION
hostb0     0/0/0        No           Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers
pcib1      0/1/0        No           6th-10th Gen Core Processor PCIe Controller (x16)
vgapci1    0/2/0        No           HD Graphics 530
xhci0      0/20/0       No           100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller
none0      0/22/0       No           100 Series/C230 Series Chipset Family MEI Controller
ahci0      0/23/0       No           Q170/Q150/B150/H170/H110/Z170/CM236 Chipset SATA Controller [AHCI Mode]
pcib2      0/27/0       No           100 Series/C230 Series Chipset Family PCI Express Root Port
pcib3      0/27/3       No           100 Series/C230 Series Chipset Family PCI Express Root Port
pcib4      0/28/0       No           100 Series/C230 Series Chipset Family PCI Express Root Port
pcib5      0/28/7       No           100 Series/C230 Series Chipset Family PCI Express Root Port
pcib6      0/29/0       No           100 Series/C230 Series Chipset Family PCI Express Root Port
isab0      0/31/0       No           Z170 Chipset LPC/eSPI Controller
none1      0/31/2       No           100 Series/C230 Series Chipset Family Power Management Controller
hdac1      0/31/3       No           100 Series/C230 Series Chipset Family HD Audio Controller
ichsmb0    0/31/4       No           100 Series/C230 Series Chipset Family SMBus
em0        0/31/6       No           Ethernet Connection (2) I219-V
vgapci0    1/0/0        No           GP106 [GeForce GTX 1060 6GB]
hdac0      1/0/1        No           GP106 High Definition Audio Controller
ahci1      3/0/0        No           ASM1062 Serial ATA Controller
xhci1      4/0/0        No           ASM1142 USB 3.1 Host Controller
xhci2      5/0/0        No           ASM2142/ASM3142 USB 3.1 Host Controller
nvme0      6/0/0        No           NVMe SSD Controller SM981/PM981/PM983

The bus/slot/function identifier we are interested in is:

xhci2      5/0/0        No           ASM2142/ASM3142 USB 3.1 Host Controller
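
For scripting, the identifier can be pulled out of the table instead of being copied by hand. This is a small sketch, assuming the column layout shown above (device name first, BHYVE ID second). The helper name bhyve_id is hypothetical:

```shell
# bhyve_id DEVICE: read `vm passthru` output on stdin and print the
# BHYVE ID (bus/slot/function) of the named device.
bhyve_id() {
    awk -v dev="$1" '$1 == dev { print $2 }'
}

# Usage on the host:
#   vm passthru | bhyve_id xhci2
```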

We disable the card in the parent system. It is important to load vmm from the loader at boot time rather than later, so that the devices are reserved before the operating system attaches its own drivers to them. If vmm is loaded by rc as a dependency of vm-bhyve (or any other virtualization software) or through kld_list, the devices won’t be available for passthrough.

vi /boot/loader.conf

# PCIe passthru, USB-Card: 5/0/0
vmm_load="YES"
pptdevs="5/0/0"
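
If you prefer to script this step instead of editing the file by hand, the sketch below appends each line only when it is missing. The helper name ensure_line is hypothetical, and it only checks for an identical line: if loader.conf already contains a pptdevs entry with a different value, edit that entry manually instead.

```shell
# ensure_line FILE LINE: append LINE to FILE unless an identical line
# is already present, so repeated runs do not create duplicates.
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage on the host (as root):
#   ensure_line /boot/loader.conf 'vmm_load="YES"'
#   ensure_line /boot/loader.conf 'pptdevs="5/0/0"'
```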

We restart to load the configuration.

shutdown -r now

We check the list of devices again. The card now appears as ppt0 instead of xhci2 and is marked as READY:

vm passthru

DEVICE     BHYVE ID     READY        DESCRIPTION
hostb0     0/0/0        No           Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers
pcib1      0/1/0        No           6th-10th Gen Core Processor PCIe Controller (x16)
vgapci1    0/2/0        No           HD Graphics 530
xhci0      0/20/0       No           100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller
none0      0/22/0       No           100 Series/C230 Series Chipset Family MEI Controller
ahci0      0/23/0       No           Q170/Q150/B150/H170/H110/Z170/CM236 Chipset SATA Controller [AHCI Mode]
pcib2      0/27/0       No           100 Series/C230 Series Chipset Family PCI Express Root Port
pcib3      0/27/3       No           100 Series/C230 Series Chipset Family PCI Express Root Port
pcib4      0/28/0       No           100 Series/C230 Series Chipset Family PCI Express Root Port
pcib5      0/28/7       No           100 Series/C230 Series Chipset Family PCI Express Root Port
pcib6      0/29/0       No           100 Series/C230 Series Chipset Family PCI Express Root Port
isab0      0/31/0       No           Z170 Chipset LPC/eSPI Controller
none1      0/31/2       No           100 Series/C230 Series Chipset Family Power Management Controller
hdac1      0/31/3       No           100 Series/C230 Series Chipset Family HD Audio Controller
ichsmb0    0/31/4       No           100 Series/C230 Series Chipset Family SMBus
em0        0/31/6       No           Ethernet Connection (2) I219-V
vgapci0    1/0/0        No           GP106 [GeForce GTX 1060 6GB]
hdac0      1/0/1        No           GP106 High Definition Audio Controller
ahci1      3/0/0        No           ASM1062 Serial ATA Controller
xhci1      4/0/0        No           ASM1142 USB 3.1 Host Controller
ppt0       5/0/0        Yes          ASM2142/ASM3142 USB 3.1 Host Controller
nvme0      6/0/0        No           NVMe SSD Controller SM981/PM981/PM983
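
With a long table it is easy to miss the one row that changed. As a rough sketch, assuming the READY value is always the third whitespace-separated field as in the output above, the ready devices can be filtered like this (ready_devices is a hypothetical helper name):

```shell
# ready_devices: read `vm passthru` output on stdin and print the name and
# BHYVE ID of every device whose READY column says "Yes".
ready_devices() {
    awk 'NR > 1 && $3 == "Yes" { print $1, $2 }'
}

# Usage on the host:
#   vm passthru | ready_devices
```

An empty result after the reboot suggests that pptdevs was not picked up by the loader.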

Virtual machine:

We specify the devices to pass through:

cat << EOF >> /zroot/vm/ubuntu-cloud/ubuntu-cloud.conf
passthru0="5/0/0"
EOF
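
To pass more than one device, my understanding is that vm-bhyve expects consecutively numbered entries (passthru0, passthru1, and so on); treat that numbering as an assumption and check it against the vm-bhyve sample configuration. Under that assumption, the hypothetical helper below generates the lines from a list of IDs. Every ID must also be reserved in pptdevs on the host.

```shell
# passthru_lines ID...: print one numbered passthruN="ID" line per argument,
# starting at passthru0.
passthru_lines() {
    n=0
    for id in "$@"; do
        printf 'passthru%d="%s"\n' "$n" "$id"
        n=$((n + 1))
    done
}

# Usage on the host (4/0/0 is xhci1 from the table above, used only as an example):
#   passthru_lines 4/0/0 5/0/0 >> /zroot/vm/ubuntu-cloud/ubuntu-cloud.conf
```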

We can view the available devices in the virtual machine using lspci.
Before passthru:

00:00.0 Host bridge: Network Appliance Corporation Device 1275
00:04.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:04.1 SATA controller: Intel Corporation 82801HR/HO/HH (ICH8R/DO/DH) 6 port SATA Controller [AHCI mode]
00:05.0 Ethernet controller: Red Hat, Inc. Virtio network device
00:1f.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]

After passthru:

00:00.0 Host bridge: Network Appliance Corporation Device 1275
00:04.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:04.1 SATA controller: Intel Corporation 82801HR/HO/HH (ICH8R/DO/DH) 6 port SATA Controller [AHCI mode]
00:05.0 Ethernet controller: Red Hat, Inc. Virtio network device
00:06.0 USB controller: ASMedia Technology Inc. ASM2142 USB 3.1 Host Controller
00:1f.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
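
The comparison can also be automated inside the guest. This is a minimal sketch, assuming the lspci output was saved to two files before and after enabling passthrough; the file names and the helper name new_devices are hypothetical. It only works when the existing devices keep their slot numbers, as they do in the listings above.

```shell
# new_devices BEFORE AFTER: print the lines of AFTER that do not appear
# verbatim in BEFORE (both files hold saved `lspci` output).
new_devices() {
    grep -vxFf "$1" "$2"
}

# Usage inside the guest:
#   lspci > before.txt    # before adding passthru0 to the VM config
#   lspci > after.txt     # after restarting the VM with passthru enabled
#   new_devices before.txt after.txt
```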

Troubleshooting:

If something goes wrong, we can always check the Bhyve logs:

tail -f /zroot/vm/ubuntu-cloud/vm-bhyve.log

If we encounter the following error, it means that we don’t have VT-d enabled in the BIOS or our CPU doesn’t support it:

Oct 25 21:58:36: fatal; pci passthrough not supported on this system (no VT-d or amdvi)
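
When the log is long, it helps to show only the fatal entries. A small sketch, assuming fatal messages always carry the "fatal;" marker seen in the line above (fatal_entries is a hypothetical helper name):

```shell
# fatal_entries LOGFILE: print only the lines that contain the "fatal;" marker.
fatal_entries() {
    grep 'fatal;' "$1"
}

# Usage on the host:
#   fatal_entries /zroot/vm/ubuntu-cloud/vm-bhyve.log
```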

In my case, enabling VT-x in the BIOS wasn’t enough; I also had to go to Advanced -> System Agent Configuration -> VT-d: Enabled
