Thursday, February 14, 2019

GPU passthrough on KVM

 

A use case came up where multiple virtual machines needed access to various video cards.

 

We had the following for the host:

 

·      ESC8000 G4

·      2x Intel Xeon Gold 6140

·      4x NVIDIA QUADRO RTX 6000

·      4x NVIDIA TESLA Pascal P100

·      Ubuntu 16.04 (Xenial)

 

Before installing, confirm the host BIOS supports passthrough (IOMMU or VT-d) and that it is enabled.

 

In the BIOS it is typically listed under the CPU or chipset settings as Intel VT-d (or IOMMU).

 

And of course, make sure VMX is enabled as well.
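A quick, generic check that the virtualization extensions are actually visible to the OS (a nonzero count means VMX/SVM is on):

egrep -c '(vmx|svm)' /proc/cpuinfo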

 

 

Check whether IOMMU is already enabled in the kernel.
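One common way to check is to grep the kernel log (a generic command, nothing specific to this host):

dmesg | grep -e DMAR -e IOMMU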

DMAR: IOMMU enabled

 

Prepare the host kernel for passthrough

 

Add intel_iommu=on to the kernel command line in the GRUB config (/etc/default/grub):

 

GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"

 

Locate the PCI bus address of the GPUs that will be passed through

 

ubuntu@asus_gpu:~$ lspci | grep -i nvidia

1d:00.0 VGA compatible controller: NVIDIA Corporation Device 1e30 (rev a1)

1d:00.1 Audio device: NVIDIA Corporation Device 10f7 (rev a1)

1d:00.2 USB controller: NVIDIA Corporation Device 1ad6 (rev a1)

1d:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device 1ad7 (rev a1)

1e:00.0 VGA compatible controller: NVIDIA Corporation Device 1e30 (rev a1)

1e:00.1 Audio device: NVIDIA Corporation Device 10f7 (rev a1)

1e:00.2 USB controller: NVIDIA Corporation Device 1ad6 (rev a1)

1e:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device 1ad7 (rev a1)

1f:00.0 VGA compatible controller: NVIDIA Corporation Device 1e30 (rev a1)

1f:00.1 Audio device: NVIDIA Corporation Device 10f7 (rev a1)

1f:00.2 USB controller: NVIDIA Corporation Device 1ad6 (rev a1)

1f:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device 1ad7 (rev a1)

20:00.0 VGA compatible controller: NVIDIA Corporation Device 1e30 (rev a1)

20:00.1 Audio device: NVIDIA Corporation Device 10f7 (rev a1)

20:00.2 USB controller: NVIDIA Corporation Device 1ad6 (rev a1)

20:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device 1ad7 (rev a1)

21:00.0 3D controller: NVIDIA Corporation Device 15f8 (rev a1)

22:00.0 3D controller: NVIDIA Corporation Device 15f8 (rev a1)

23:00.0 3D controller: NVIDIA Corporation Device 15f8 (rev a1)

24:00.0 3D controller: NVIDIA Corporation Device 15f8 (rev a1)

ubuntu@asus_gpu:~$

 

 

Look at the kernel drivers in use and the vendor and device IDs.

 

For one of the Tesla P100s:

ubuntu@asus_gpu:~$ lspci -nn -k -s 21:00.0

21:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:15f8] (rev a1)

              Subsystem: NVIDIA Corporation Device [10de:118f]

              Kernel driver in use: nouveau

              Kernel modules: nouveau

 

For one of the RTX 6000s there are four entries, because each card also exposes audio, USB, and serial bus controller functions:

 

ubuntu@asus_gpu:~$ lspci -nn -k -s 1d:00.0

1d:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1e30] (rev a1)

                  Subsystem: NVIDIA Corporation Device [10de:12ba]

                  Kernel driver in use: nouveau

                  Kernel modules: nvidiafb, nouveau

ubuntu@asus_gpu:~$ lspci -nn -k -s 1d:00.1

1d:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f7] (rev a1)

                  Subsystem: NVIDIA Corporation Device [10de:12ba]

                  Kernel driver in use: snd_hda_intel

                  Kernel modules: snd_hda_intel

ubuntu@asus_gpu:~$ lspci -nn -k -s 1d:00.2

1d:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1ad6] (rev a1)

                  Subsystem: NVIDIA Corporation Device [10de:12ba]

                  Kernel driver in use: nouveau

ubuntu@asus_gpu:~$ lspci -nn -k -s 1d:00.3

1d:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1ad7] (rev a1)

                  Subsystem: NVIDIA Corporation Device [10de:12ba]

                  Kernel driver in use: nouveau

ubuntu@asus_gpu:~$

 

 

Update the file /etc/initramfs-tools/modules

 

with the following, filling in the vendor:device IDs of the devices you want to pass through:

vfio

vfio_iommu_type1

vfio_pci ids=10de:1e30,10de:10f7,10de:1ad6,10de:1ad7,10de:15f8
vhost-net

 

Update the /etc/modules file

 

vfio

vfio_iommu_type1

vfio_pci ids=10de:1e30,10de:10f7,10de:1ad6,10de:1ad7,10de:15f8
vhost-net

 

 

Because you updated GRUB and the initramfs configuration, run the following two commands and then reboot.

 

 

sudo update-grub

sudo update-initramfs -u

 

 

Once the host comes back up, confirm that passthrough is working and that the vfio-pci driver is bound to the devices.
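In addition to the lspci output below, you can confirm the vfio modules actually loaded with a generic check:

lsmod | grep vfio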

 

 

ubuntu@asus_gpu:~$ lspci -nn -k -s 21:00.0

21:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:15f8] (rev a1)

              Subsystem: NVIDIA Corporation Device [10de:118f]

              Kernel driver in use: vfio-pci

              Kernel modules: nvidiafb, nouveau

ubuntu@asus_gpu:~$

ubuntu@asus_gpu:~$ lspci -nn -k -s 1d:00.0

1d:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1e30] (rev a1)

              Subsystem: NVIDIA Corporation Device [10de:12ba]

              Kernel driver in use: vfio-pci

              Kernel modules: nvidiafb, nouveau

ubuntu@asus_gpu:~$ lspci -nn -k -s 1d:00.1

1d:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f7] (rev a1)

              Subsystem: NVIDIA Corporation Device [10de:12ba]

              Kernel driver in use: vfio-pci

              Kernel modules: snd_hda_intel

ubuntu@asus_gpu:~$ lspci -nn -k -s 1d:00.2

1d:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1ad6] (rev a1)

              Subsystem: NVIDIA Corporation Device [10de:12ba]

              Kernel driver in use: vfio-pci

ubuntu@asus_gpu:~$ lspci -nn -k -s 1d:00.3

1d:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1ad7] (rev a1)

              Subsystem: NVIDIA Corporation Device [10de:12ba]

                  Kernel driver in use: vfio-pci

 

 

If "Kernel driver in use:" shows something other than vfio-pci, double-check the device IDs you added.

 

Building the VM

 

Launch an Ubuntu KVM VM as you normally would. Once it is up and running, start attaching the devices you want by updating the VM's libvirt XML.

 

You can modify it manually using virsh edit, or attach a device definition from a separate XML file:

 

Locate the PCI address needed for the device; this is the bus number in hex format, and in my case it's bus='0x21'.
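If you are not sure of the exact node-device name to dump, you can list the host PCI devices first; this is standard virsh, nothing specific to this setup:

virsh nodedev-list --cap pci | grep 21_00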

 

ubuntu@asus_gpu:~$ virsh nodedev-dumpxml pci_0000_21_00_0

<device>

  <name>pci_0000_21_00_0</name>

  <path>/sys/devices/pci0000:17/0000:17:00.0/0000:18:00.0/0000:19:08.0/0000:21:00.0</path>

  <parent>pci_0000_19_08_0</parent>

  <driver>

    <name>vfio-pci</name>

  </driver>

  <capability type='pci'>

    <domain>0</domain>

    <bus>33</bus>

    <slot>0</slot>

    <function>0</function>

    <product id='0x15f8' />

    <vendor id='0x10de'>NVIDIA Corporation</vendor>

    <iommuGroup number='41'>

      <address domain='0x0000' bus='0x21' slot='0x00' function='0x0'/>

    </iommuGroup>

    <numa node='0'/>

    <pci-express>

      <link validity='cap' port='8' speed='8' width='16'/>

      <link validity='sta' speed='8' width='16'/>

    </pci-express>

  </capability>

</device>

 

 

ubuntu@asus_gpu:~$

 

 

Create an XML file with the following and plug in the address information you gathered earlier:

 

<hostdev mode='subsystem' type='pci' managed='yes'>

  <driver name='vfio'/>

  <source>

    <address domain='0x0000' bus='0x21' slot='0x00' function='0x0'/>

  </source>

  <alias name='hostdev0'/>

  <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>

</hostdev>

 

 

 

Attach it with the following command

 

virsh attach-device (VM Name) --file ~/(your_xml_file) --config

 

 

This can all be done manually or from the GUI using Virt Manager as well. The RTX 6000 will need all four of its functions attached to work properly.
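For example, to attach all four functions of the RTX 6000 on bus 0x1d, a rough sketch could loop over the functions (the VM name "myvm" and the file names are placeholders; the guest-side slot assignment is left to libvirt):

for fn in 0 1 2 3; do
  cat > rtx6000-fn${fn}.xml <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x1d' slot='0x00' function='0x${fn}'/>
  </source>
</hostdev>
EOF
  virsh attach-device myvm --file rtx6000-fn${fn}.xml --config
done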

 

Shut down the VM and then start it back up.

 

Log in to the VM, run lspci | grep -i nvidia, and confirm you see the GPU.

 

 

NOTE:

 

Some Tesla GPUs will function without defining the CPU model for the VM; some will not.

The ones that do not will still show up in the VM and look like they should work, but they will throw arbitrary errors when you attempt to use the GPU.

 

Errors like:

clCreateContext(): CL_OUT_OF_RESOURCES

or

code=46(cudaErrorDevicesUnavailable) "cudaEventCreate(&start)"

 

strace will show the process reading the device memory, but it stops when attempting to write.

 

Update the CPU definition in the KVM domain XML and set it to something other than the default hypervisor model:

 

<cpu mode='custom' match='exact'>

  <model fallback='allow'>Broadwell-IBRS</model>

</cpu>
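If you are unsure which CPU model names your host's libvirt knows about, newer libvirt builds can list them (an optional check, not required for the passthrough itself):

virsh cpu-models x86_64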

 

Shut down and restart the VM for the changes to take effect.

 

Install the NVIDIA drivers and CUDA in your VM and have fun.

 

It's always a hassle for me to install drivers, so just paste the following if you want to do it quickly.

 

sudo apt update

 

sudo apt install wget -y

 

 

# Download the files to install

# Nvidia 410.79 driver

wget http://us.download.nvidia.com/tesla/410.79/nvidia-diag-driver-local-repo-ubuntu1604-410.79_1.0-1_amd64.deb

# Nvidia Cuda 10 installer

wget https://developer.nvidia.com/compute/cuda/10.0/Prod/local_installers/cuda-repo-ubuntu1604-10-0-local-10.0.130-410.48_1.0-1_amd64

 

sudo dpkg -i cuda-repo-ubuntu1604-10-0-local-10.0.130-410.48_1.0-1_amd64

sudo apt-key add /var/cuda-repo-10-0-local-10.0.130-410.48/7fa2af80.pub

sudo apt-get update

sudo apt-get install cuda -y

sudo apt install nvidia-cuda-toolkit -y

# reboot
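After the reboot, a quick way to confirm the driver sees the GPU inside the VM:

nvidia-smi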

 

All of this can be easily scripted; feel free to hit me up for it, but there is a good chance you might be ignored since I don't log on that often. The next chapter will cover multiple containers concurrently accessing the GPU.

Monday, February 13, 2017

Updating glance image path in OpenStack CentOS 7

In my lab, I loaded OpenStack Mitaka on my CentOS 7 box. Everything was pretty much loaded in the root partition because I did not have enough drives and space at the time. Eventually the root partition filled to 100% and started throwing all sorts of errors in OpenStack and other places in CentOS. Since I have additional space now, I have to tell OpenStack to start using it.

 

The directory using most of the space was the images directory in the glance folder. I wanted to keep things easy without changing the current path configured in the glance-api.conf file, which is /var/lib/glance.

 

First I had to create the physical volume and the volume group, then I created the logical volume with the following command.
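For reference, the physical volume and volume group steps looked roughly like this; /dev/sdb is a hypothetical device name, so substitute your actual disk:

pvcreate /dev/sdb
vgcreate cinder-volumes /dev/sdb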

 

lvcreate -L 200G -n glance cinder-volumes

 

Then formatted it

 

mkfs.xfs /dev/cinder-volumes/glance

 

I tarred up the current images folder, temporarily moved the archive into my home directory, and then removed the files from their original location.

 

tar -cvf images.tar images

mv images.tar /home/

rm -drf /var/lib/glance/images/

 

Back up and edit the /etc/fstab file to mount the new file system by adding the following line:

 

/dev/cinder-volumes/glance  /var/lib/glance/images  xfs  defaults  0 0

 

Mount the new filesystem (mount -a), extract the files you archived earlier into the newly mounted /var/lib/glance/images directory, and update the permissions and ownership.

 

tar -xvf images.tar                              

chown -R glance:glance /var/lib/glance

 

SELinux contexts will need to be updated to allow Glance to use the newly created path. I put the commands in a shell script, which can be reused if you need to do the same for other directories.

 

#!/bin/bash

set -eu

 

[ -x /usr/sbin/semanage ] || exit 0

 

semanage fcontext -a -t glance_var_lib_t "/var/lib/glance(/.*)?"

restorecon -Rv /var/lib/glance

 

semanage fcontext -a -t glance_log_t "/var/log/glance(/.*)?"

restorecon -Rv /var/log/glance

 

 

Restart the appropriate Glance services and reboot, then make sure you can add and update images and that the new directory is being populated.
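A quick sanity check after the restart (generic commands; the second one needs your OpenStack credentials sourced):

df -h /var/lib/glance/images
glance image-list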

Tuesday, December 13, 2016

Python and NETCONF to manage HP Comware 7

Enterprise networking is evolving and much of it is heading toward automation. One thing I have been coming across regularly is the use of APIs and scripts to assist in building networks. One of the vendors, HP, allows users to manage Comware 7 using the NETCONF protocol.

There is a fair amount of documentation on HP NETCONF, but here is a basic example of how to read a hostname and then modify it using Python and NETCONF.

 

The NETCONF server needs to be enabled on the HP switch, which can be done with the following command:

 

netconf ssh server enable 

 

Port 830 is the default port it will communicate over.

 

Once this is enabled (along with a user account), you can read from and write to the device.

 

The module used in Python 2.7 is called ncclient and can be installed via pip.
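A minimal install, assuming pip is already available:

pip install ncclient

With that installed, here is my output from the interactive shell: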

 

 

from ncclient import manager

 

Define the XML path of the information you need to obtain

 

filterget = '<top xmlns="http://www.hp.com/netconf/data:1.0"><Device><Base><HostName></HostName></Base></Device></top>'

 

Build the connection to the device with the connection parameters:

 

sodarocks = manager.connect(host='192.168.11.15',port=830,username='admin',password='admin',hostkey_verify=False,allow_agent=False,look_for_keys=False, device_params={'name':'hpcomware'})

 

>>> sodarocks.get(('subtree', filterget))

 

The return value will be a string in XML format:

 

<?xml version="1.0" encoding="UTF-8"?><rpc-reply xmlns:config="http://www.hp.com/netconf/config:1.0" xmlns:data="http://www.hp.com/netconf/data:1.0" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="urn:uuid:4486954e-2142-4896-a977-b66a294f4fec"><data><top xmlns="http://www.hp.com/netconf/data:1.0"><Device><Base><HostName>sodas5900</HostName></Base></Device></top></data></rpc-reply>

 

This return value helps you determine the path you need to build the config XML; in my case, I am looking to change the hostname.

 

I will create a new filter with the modifications I want to the hostname:

 

filterchange = '<config xmlns="urn:ietf:params:xml:ns:netconf:base:1.0"><top xmlns="http://www.hp.com/netconf/config:1.0"><Device><Base><HostName>notsodas5900</HostName></Base></Device></top></config>'

 

Then execute the modification using the new filter and the type of operation performed:

 

sodarocks.edit_config(target='running', config=filterchange, default_operation='replace')

 

You should get an RPC reply containing <ok/>:

 

>>> sodarocks.edit_config(target='running', config=filterchange, default_operation='replace')

 

<?xml version="1.0" encoding="UTF-8"?><rpc-reply xmlns:config="http://www.hp.com/netconf/config:1.0" xmlns:data="http://www.hp.com/netconf/data:1.0" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="urn:uuid:03b4f996-d025-4aa7-be5e-782d0a4ffed3"><ok/></rpc-reply>

 

Some additional resources you can look at are the API documentation that comes with each release of Comware on the HP support site, and the ncclient reference on GitHub.

 

Wednesday, September 28, 2016

Setting up Windows Server 2008 R2 RADIUS authentication with Juniper SRX

Under Windows Network Policy Server (NPS)

Create a shared secret template, name it something like SRXpassword, and set a password.

Create a new network policy and name it; leave the network access server type unspecified.

 

Click next, select Windows Groups, and add the group(s) you want to have access to the device.

Click next and select access granted

For the authentication method, click add and select MS-CHAPv2.

Do not change anything under the constraint page and click next.

Remove everything from the Standard RADIUS attributes and select the Vendor Specific section. Click add and select Vendor-Specific.

Enter the Juniper vendor code 2636 and select "Yes, it conforms".

Put in vendor-assigned attribute number 1, select String as the attribute format, and type in su.

 

Click OK to close it, and back at the menu select the encryption types. Uncheck everything except strongest encryption, then click next and finish.

 

Create the new RADIUS client and populate it with the information for your firewall. Select the shared secret template you created earlier.

 

On the Juniper SRX Firewall

 

Type in the following and fill in your server IP and password.

set system authentication-order [ password radius ]

set system radius-server 192.168.1.2 secret WhatEverPasswordYouMade

set system radius-options password-protocol mschap-v2

set system login user su class super-user

commit
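A quick way to review what was committed (standard Junos show commands):

show configuration system authentication-order
show configuration system radius-server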

 

Monday, September 5, 2016

Remotely Downloading Torrents

Let's say you only have an SSH connection to your server and you want to download torrents.


Things you will need:


Remote system:

PuTTY or SecureCRT

An SFTP client such as WinSCP: https://winscp.net/eng/docs/free_sftp_client_for_windows


Linux commands installed:

screen

transmission-cli


1. Download your torrent file.

2. Use sftp to send the file to your BitTorrent file folder.

3. Login to your system

4. Start a screen session by typing

 $screen

  This will start a new terminal session; run man screen for more information: http://www.tecmint.com/screen-command-examples-to-manage-linux-terminals/

5. Start the torrent client

$transmission-cli Example.of.Torrent.file.torrent -w /location/to/save/file

6. Detach from screen session to log off.

hold Ctrl+a  then hit the d key to detach.
This will output a session id <id>.fqdn

7. When you login later

$screen -ls
111111.hostname1
222222.hostname2

8. Connect to screen
$screen -r 111111

Linux job commands:
Ctrl+z will put the job(process) on pause
$bg
Will put the job in the background
$fg
will put the job in the foreground
Ctrl+c will quit the program.

10. After checking the status you can Ctrl+c to end the torrent or Ctrl+a d to detach from screen.

11. To start another screen just type in
$screen
Detach (Ctrl+a d)
$screen -ls
$screen -r <id#>

Please revise for transmission-remote if possible
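If transmission-daemon is available, a rough transmission-remote equivalent would look like this (an untested sketch; adjust paths and any authentication to your setup):

transmission-daemon
transmission-remote -a Example.of.Torrent.file.torrent -w /location/to/save/file
transmission-remote -l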

Friday, September 2, 2016

Virtual firewalls in OpenStack not routing packets properly

So I ran into an issue after loading up a vSRX and a virtual F5 BIG-IP in OpenStack. I set up the virtual appliances as the gateway for my other instances, but their packets were not traversing the other networks when they were routed through the vSRX or vBIG-IP.

To eliminate all other factors, I went ahead and enabled ping on all the interfaces and moved them all into the trusted zone of the SRX. Then I attempted to ping the other interfaces of the firewall that were in the other subnets. They all failed, but the SRX showed that it received the packets and sent the replies. I experienced the same thing on the virtual instance of BIG-IP. That led me to believe it was failing on the virtual switch in OpenStack.

 

It seems port security is enabled by default and needs to be disabled in Neutron. To do this, perform the following.

Delete any virtual objects that you have created that will need packets routed across your virtual firewall and remove any security groups you have applied to the instance.

Modify the file /etc/neutron/plugins/ml2/ml2_conf.ini

Right below [ml2], you will need to add extension_drivers = port_security

# An ordered list of extension driver entrypoints to be loaded from the

# neutron.ml2.extension_drivers namespace. For example: extension_drivers =

# port_security,qos (list value)

extension_drivers = port_security

After this is completed, restart neutron

systemctl restart neutron-server

 

Start creating the new tenant networks and attaching the interfaces to your virtual firewall and other instances. Find out what port IDs they have been assigned; this is easy to locate using the GUI.
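If you prefer the CLI over the GUI, the port IDs attached to an instance can also be listed (assuming the nova client is installed and credentials are sourced):

nova interface-list <instance-name-or-uuid>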

 

 

Once you get the port ID, execute the following to turn off port security

neutron port-update  2bf6b77b-627e-4fd0-8cd9-69dc0b27d65e --port-security-enabled=False

 

You can check to see if it took with the following command

neutron port-show 1ee02bbe-4f87-4cb4-91e0-ced0ef691e1c

 

Once completed on all interfaces required, that should resolve the routing issue.

 

Disabling port security will prevent you from using security groups. Firewalls don't really need security groups enabled, but if you want more restriction you can use allowed address pairs instead.

 

neutron port-update 'Port UUID' --allowed_address_pairs list=true type=dict ip_address='ip or CIDR'

 

Apply this on all ports that will be used to route traffic.