Docker Raw Format
A Virtual Machine (VM) is an environment on a host computer that can be used as if it were a separate physical computer. VMs can be used to run multiple operating systems simultaneously on a single computer. Operating systems running inside a VM see emulated virtual hardware rather than the actual hardware of the host computer. This provides more isolation than Jails, although there is additional overhead. A portion of system RAM is assigned to each VM, and each VM uses a zvol for storage. While a VM is running, these resources are not available to the host computer or other VMs.
FreeNAS® VMs use the bhyve(8) virtual machine software. This type of virtualization requires an Intel processor with Extended Page Tables (EPT) or an AMD processor with Rapid Virtualization Indexing (RVI) or Nested Page Tables (NPT).
To verify that an Intel processor has the required features, use Shell to run grep VT-x /var/run/dmesg.boot. If the EPT and UG features are shown, this processor can be used with bhyve.
To verify that an AMD processor has the required features, use Shell to run grep POPCNT /var/run/dmesg.boot. If the output shows the POPCNT feature, this processor can be used with bhyve.
By default, new VMs have the bhyve(8) -H option set. This causes the virtual CPU thread to yield when a HLT instruction is detected, and prevents idle VMs from consuming all of the host's CPU.
AMD K10 “Kuma” processors include POPCNT but do not support NRIPS, which is required for use with bhyve. Production of these processors ceased in 2012 or 2013.
14.1. Creating VMs
Select VMs ‣ Add VM for the Add VM dialog shown in Figure 14.1.1:
VM configuration options are described in Table 14.1.1.
|VM Type||drop-down menu||Choose between a standard VM or a specialized Docker VM.|
|Name||string||Enter a name to identify the VM.|
|Description||string||Enter a short description of the VM or its purpose.|
|Virtual CPUs||integer||Select the number of virtual CPUs to allocate to the VM. The maximum is 16 unless the host CPU limits the maximum. The VM operating system might also have operational or licensing restrictions on the number of CPUs.|
|Memory Size (MiB)||integer||Allocate the amount of RAM in mebibytes for the VM.|
|Boot Method||drop-down menu||Select UEFI for newer operating systems, or UEFI-CSM (Compatibility Support Mode) for older operating systems that only understand BIOS booting.|
|Autostart||checkbox||Set to start the VM automatically when the system boots.|
14.2. Adding Devices to a VM
After creating the VM, click it to select it, then click Devices and Add Device to add virtual hardware to it:
Select the name of the VM from the VM drop-down menu, then select the Type of device to add. These types are available:
A Docker VM does not support VNC connections.
Figure 14.2.2 shows the fields that appear when Network Interface is the selected Type.
14.2.1. Network Interfaces
The default Adapter Type emulates an Intel e82545 (e1000) Ethernet card for compatibility with most operating systems. VirtIO can provide better performance when the operating system installed in the VM supports VirtIO paravirtualized network drivers.
If the system has multiple physical network interface cards, use the Nic to attach drop-down menu to specify which physical interface to associate with the VM.
By default, the VM receives an auto-generated random MAC address. To override the default with a custom value, enter the desired address into the MAC Address field.
To check which interface is attached to a VM, start the VM and go to the Shell. Type ifconfig and find the tap interface that shows the name of the VM in the description.
14.2.2. Disk Devices
Zvols are typically used as virtual hard drives. After creating a zvol, associate it with the VM by selecting Add device.
Choose the VM, select a Type of Disk, select the created zvol, then set the Mode:
- AHCI emulates an AHCI hard disk for best software compatibility. This is recommended for Windows VMs.
- VirtIO uses paravirtualized drivers and can provide better performance, but requires the operating system installed in the VM to support VirtIO disk devices.
If a specific sector size is required, enter the number of bytes into Disk sector size. The default of 0 uses an autotune script to determine the best sector size for the zvol.
14.2.3. Raw Files
Raw Files are similar to Zvol disk devices, but the disk image comes from a file. These are typically used with existing read-only binary images of drives, like an installer disk image file meant to be copied onto a USB stick.
After obtaining and copying the image file to the FreeNAS® system, select Add device, choose the VM, select a Type of Raw File, browse to the image file, then set the Mode:
- AHCI emulates an AHCI hard disk for best software compatibility.
- VirtIO uses paravirtualized drivers and can provide better performance, but requires the operating system installed in the VM to support VirtIO disk devices.
A Docker VM also has a password field. This is the login password for the Docker VM.
If a specific sector size is required, enter the number of bytes into Disk sector size. The default of 0 uses an autotuner to find and set the best sector size for the file.
14.2.4. CD-ROM Devices
Adding a CD-ROM device makes it possible to boot the VM from a CD-ROM image, typically an installation CD. The image must be present on an accessible portion of the FreeNAS® storage. In this example, a FreeBSD installation image is shown:
VMs from other virtual machine systems can be recreated for use in FreeNAS®. Back up the original VM, then create a new FreeNAS® VM with virtual hardware as close as possible to the original VM. Binary-copy the disk image data into the zvol created for the FreeNAS® VM with a tool that operates at the level of disk blocks, like dd(1). For some VM systems, it is best to back up data, install the operating system from scratch in a new FreeNAS® VM, and restore the data into the new VM.
14.2.5. VNC Interface
VMs set to UEFI booting are also given a VNC (Virtual Network Computing) remote connection. A standard VNC client can connect to the VM to provide screen output and keyboard and mouse input. Each standard VM can have a single VNC device. A Docker VM does not support VNC devices.
Using a non-US keyboard with VNC is not yet supported. As a workaround, select the US keymap on the system running the VNC client, then configure the operating system running in the VM to use a keymap that matches the physical keyboard. This will enable passthrough of all keys regardless of the keyboard layout.
Figure 14.2.6 shows the fields that appear when VNC is the selected Type.
The Resolution drop-down menu can be used to modify the default screen resolution used by the VNC session.
The VNC port can be set to 0, left empty for FreeNAS® to assign a port when the VM is started, or set to a fixed, preferred port number.
Select the IP address for VNC to listen on with the Bind to drop-down menu.
Set Wait to boot to indicate that the VNC client should wait until the VM has booted before attempting the connection.
To automatically pass the VNC password, enter it into the Password field. Note that the password is limited to 8 characters.
To use the VNC web interface, set VNC Web.
If a RealVNC 5.X Client shows the error RFB protocol error: invalid message type, disable the Adapt to network speed option and move the slider to Best quality. On later versions of RealVNC, select File ‣ Preferences, click Expert, Protocol Version, then select 4.1 from the drop-down menu.
14.2.6. Virtual Serial Ports
VMs automatically include a virtual serial port.
- /dev/nmdm1B is assigned to the first VM
- /dev/nmdm2B is assigned to the second VM
And so on. These virtual serial ports allow connecting to the VM console from the Shell.
The nmdm device is dynamically created. The actual nmdm name can differ on each system.
To connect to the first VM:
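For example, assuming the first VM's serial port is at /dev/nmdm1B as listed above (the device name is illustrative and varies per system), a cu session could be opened from the Shell like this:

```shell
# Attach to the first VM's virtual serial console.
# Check the actual nmdm device name on your own system first.
cu -l /dev/nmdm1B
```

Type ~. at the start of a line to end the cu session.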
See cu(1) for more information on operating cu.
14.3. Running VMs
Select VMs to see a list of configured VMs. Configuration and control buttons appear at the bottom of the screen when an individual VM is selected with a mouse click:
The name, description, running state, VNC port (if present), and other configuration values are shown. Click on an individual VM for additional options.
Some standard buttons are shown for all VMs:
- Edit changes VM settings.
- Delete removes the VM.
- Devices is used to add and remove devices to this VM.
When a VM is not running, these buttons are available:
- Start starts the VM.
- Clone clones or copies the VM to a new VM. The new VM is given the same name as the original, with _cloneN appended.
When a VM is already running, these buttons are available:
- Stop shuts down the VM.
- Power off immediately halts the VM, equivalent to disconnecting the power on a physical computer.
- Restart restarts the VM.
- Vnc via Web starts a web VNC connection to the VM. The VM must have a VNC device and VNC Web enabled in that device.
14.4. Deleting VMs
A VM is deleted by clicking the VM, then Delete at the bottom of the screen. A dialog shows any related devices that will also be deleted and asks for confirmation.
Zvols used in disk devices and image files used in raw file devices are not removed when a VM is deleted. These resources can be removed manually after it is determined that the data in them has been backed up or is no longer needed.
14.5. Docker VM
Docker is open source software for automating application deployment inside containers. A container provides a complete filesystem, runtime, system tools, and system libraries, so applications always see the same environment.
Rancher is a web-based tool for managing Docker containers.
FreeNAS® runs the Rancher web interface within the Docker VM.
14.5.1. Docker VM Requirements
The system BIOS must have virtualization support enabled for a Docker VM to work properly. On Intel systems this is typically an option called VT-x. AMD systems generally have an SVM option.
20 GiB of storage space is required for the Docker VM. For setup, theSSH service must be enabled.
The Docker VM requires 2 GiB of RAM while running.
14.5.2. Create the Docker VM
Figure 14.5.1 shows the window that appears after going to the VMs page, clicking Add VM, and selecting Docker VM as the VM Type.
|VM Type||drop-down menu||Choose between a standard VM or a specialized Docker VM.|
|Name||string||A descriptive name for the Docker VM.|
|Description||string||A description of this Docker VM.|
|Virtual CPUs||integer||Number of virtual CPUs to allocate to the Docker VM. The maximum is 16 unless the host CPU also limits the maximum. The VM operating system can also have operational or licensing restrictions on the number of CPUs.|
|Memory Size (MiB)||integer||Allocate this amount of RAM in MiB for the Docker VM. A minimum of 2048 MiB of RAM is required.|
|Autostart||checkbox||Set to start this Docker VM when the FreeNAS® system boots.|
|Root Password||string||Enter a password to use with the Docker VM root account. The password cannot contain a space.|
|Docker Disk File||string||Browse to the location to store a new raw file. Add /, a unique name to the end of the path, and .img to create a new raw file with that name. Example: /mnt/pool1/rancherui.img|
|Size of Docker Disk File (GiB)||integer||Allocate storage size in GiB for the new raw file. 20 is the minimum recommendation.|
Recommendations for the Docker VM:
- Enter Rancher UI VM for the Description.
- Leave the number of Virtual CPUs at 1.
- Enter 2048 for the Memory Size.
- Leave 20 as the Size of Docker Disk File (GiB).
Click OK to create the virtual machine.
To make any changes to the raw file after creating the Docker VM, click on the Devices button for the VM to show the devices attached to that VM. Click on the RAW device to select it, then click Edit. Figure 14.5.2 shows the options for editing the Docker VM raw file.
Fig. 14.5.2 Changing the Docker VM Password
The raw file options section describes the options in this window.
14.5.3. Start the Docker VM
Click VMs, then click on the Docker VM line to select it. Click the Start button and Yes to start the VM.
14.5.4. SSH into the Docker VM
It is possible to SSH into a running Docker VM. Go to the VMs page and find the Docker VM. The Info column shows the Docker VM Com Port. In this example, /dev/nmdm12B is used.
Use an SSH client to connect to the FreeNAS® server. Remember this also requires the SSH service to be running. Depending on the FreeNAS® system configuration, it might also require changes to the SSH service settings, like setting Login as Root with Password.
At the FreeNAS® console prompt, connect to the running Docker VM with cu, replacing /dev/nmdm12B with the value from the Docker VM Com Port:
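With the example Com Port shown above, the connection command would look like this (the device name varies per system):

```shell
# Attach to the Docker VM's serial console; substitute the
# Com Port value shown in the Info column for this VM
cu -l /dev/nmdm12B
```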
If the terminal does not show a login prompt after a few moments, press Enter. The Docker VM can take several minutes to start and display the login prompt.
14.5.5. Installing and Configuring the Rancher Server
Using the Docker VM to install and configure the Rancher Server is done from the command line. Open the Shell and connect to the Docker VM with cu, where /dev/nmdm12B is the Com Port value in the Info column for the Docker VM.
If the terminal does not show a rancher login: prompt after a few moments, press Enter. Enter rancher as the username, press Enter, then type the password that was entered when the raw file was created above and press Enter again. After logging in, a [rancher@rancher ~]$ prompt is displayed.
Ensure Rancher has functional networking and can ping an outside website.
If ping returns an error, adjust the VM Network Interface and reboot the Docker VM.
Download and install the Rancher server withsudo docker run -d --restart=unless-stopped -p 8080:8080 rancher/server.
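That is, at the Docker VM prompt (the image and flags are taken directly from the text above):

```shell
# Run the Rancher server container detached, publishing the web UI
# on port 8080, and restart it automatically unless stopped by hand
sudo docker run -d --restart=unless-stopped -p 8080:8080 rancher/server
```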
If a Cannot connect to the Docker daemon error is shown, enter sudo dockerd and try sudo docker run -d --restart=unless-stopped -p 8080:8080 rancher/server again. Installation time varies with processor and network connection speed. A [rancher@rancher ~]$ prompt is shown when the installation is finished.
Enter ifconfig eth0 | grep 'inet addr' to view the Rancher IP address. Enter the IP address followed by :8080 into a web browser to connect to the Rancher web interface. For example, if the IP address is 10.231.3.208, enter 10.231.3.208:8080 in the browser.
The Rancher web interface takes a few minutes to start. The web browser might show a connection error while the web interface starts. If a connection has timed out error is shown, wait one minute and refresh the page.
When the Rancher web interface loads, click Add a host from the banner across the top of the screen. Verify that This site's address is chosen and click Save.
Follow the steps shown in the Rancher web interface and copy the full sudo docker run command from the text box. Paste it in the Docker VM shell. The Docker VM will finish configuring Rancher. A [rancher@rancher ~]$ prompt is shown when the configuration is complete.
Verify that the configuration is complete. Go to the Rancher web interface and click INFRASTRUCTURE ‣ Hosts. When a host with the Rancher IP address is shown, configuration is complete and Rancher is ready to use.
For more information on Rancher, see the Rancher documentation.
14.5.6. Configuring Persistent NFS-Shared Volumes
Rancher supports using a single persistent volume with multiple containers. This volume can also be shared with FreeNAS® using NFS. FreeNAS® must be configured with specific NFS permissions, and a Rancher NFS server must have a properly configured stack scoped volume.
A stack scoped volume is data that is managed by a single Rancher stack. The volume is shared by all services that reference it in the stack.
Configure NFS sharing for a stack scoped volume by setting specific options in the command line of the Rancher NFS server and the FreeNAS® system:
- Log in to the Rancher NFS server and modify /etc/exports. Add an entry for the NFS shared directory, typically /nfs, with several permissions options: /nfs IP(rw,sync,no_root_squash,no_subtree_check). IP is the IP address of the client and can also be set to the wildcard *.
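As a sketch, the resulting /etc/exports entry on the Rancher NFS server might look like this (the client IP address is illustrative):

```
# /etc/exports — export /nfs read-write to one client,
# with root squashing and subtree checking disabled
/nfs 10.20.0.10(rw,sync,no_root_squash,no_subtree_check)
```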
- In the FreeNAS® web interface, go to Services ‣ NFS Settings. Set Enable NFSv4 and NFSv3 ownership model for NFSv4. Click SAVE and restart the NFS service.
- Add :nocopy to the end of the pool to be mounted.
Somewhat following on from my previous post about running containers in non-root environments I’ve been spending some more time reading up on Capabilities, so thought it would be worth making some notes.
What are Capabilities?
Linux capabilities have been around in the kernel for some time. The idea is to break up the monolithic root privilege that Linux systems have had, so that smaller more specific privileges can be provided where they’re required. This helps reduce the risk that by compromising a single process on a host an attacker is able to fully compromise it.
One point to note is that capabilities are only needed to carry out privileged actions on a host. If your process only needs to carry out actions that an ordinary user could without the use of sudo, su or setuid root binaries, then your process doesn't need any capabilities assigned to it.
To provide a concrete example, take the ping program which ships with most Linux distributions. Traditionally this program has been setuid root because it needs to send raw network packets, and this privilege is not available to ordinary users. With a capability-aware system this can be broken down, and only the CAP_NET_RAW privilege can be assigned to the file. This means that an attacker who was able to compromise the ping binary would only get a small additional level of privilege, and not full access to the host, as might have been possible when it was setuid root.
Practical use of capabilities
So how do we actually manipulate capabilities on a Linux system? The most basic way of handling this (without writing custom code) is to use the setcap and getcap binaries, which come with the libcap2-bin package on Debian-derived systems.
If you use getcap on a file which has capabilities, you'll see something like this:
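For instance (the path and the exact output format are illustrative; newer libcap versions print the set slightly differently):

```shell
# Query the file capabilities recorded on the arping binary
getcap /usr/sbin/arping
# older libcap output style: /usr/sbin/arping = cap_net_raw+ep
```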
We can see here that the arping file has cap_net_raw with +ep at the end of it, so what does that mean? The e here refers to the effective capability of the file and the p to the permitted capability. For file capabilities, the effective flag is needed where the binary isn't "capability aware", i.e. it's not written with capabilities in mind (which is usually the case). For practical purposes, if you're assigning capabilities to files, you'll use +ep most of the time.
So if you want to assign a capability, for example to apply cap_net_raw to an nmap binary:
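A sketch of that, assuming nmap is installed at /usr/bin/nmap (adjust the path for your system):

```shell
# Grant the raw-socket capability to the nmap binary.
# Requires root and an xattr-capable filesystem.
sudo setcap cap_net_raw+ep /usr/bin/nmap

# Confirm the capability was applied
getcap /usr/bin/nmap
```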
It's important to note that you can't set capabilities on symlinks; it has to be the binary itself. You also can't set capabilities on shell scripts (well, unless you have a super-recent kernel).
Some More background - Inheritable and Bounded
If you look at capability sets for files and processes, you'll run across two additional terms which bear looking at: Inheritable and Bounded.
Inheritable capabilities are capabilities that can be passed from one program to another.
Bounded capabilities are, to quote the Man page for capabilities
The capability bounding set acts as a limiting superset for the capabilities that a thread can add to its inheritable set using capset(2).
So they restrict which capabilities can be inherited by a process.
Back to the practical - Auditing capabilities
This is all well and good, but how do we audit capabilities?
There are a number of ways of reviewing what capabilities a process or file has. From a low-level perspective, we can review the contents of /proc/[pid]/status. This will contain some information that looks like this:
This set was for a user-level process (using the command cat /proc/self/status). As you can see, the CapPrm and CapEff entries are all zeros, indicating that I don't have any capabilities assigned.
If I then switch to a root user using sudo bash and run the same command, I get the following, which is quite the difference: here CapPrm and CapEff have a lot more content, as I'm a privileged user.
If we try the same in a Docker process using the command docker run alpine:latest cat /proc/self/status, we get quite a different result. In this container we were running as root, so you might have guessed that we'd have the same permissions as we did in the root shell before. However, as Docker limits the available default permissions, we don't get as much.
Of course, these long hex strings aren't exactly the most friendly way of viewing capabilities. Luckily there are ways of making this a bit more readable. If we use capsh (which comes with libcap2-bin on Debian-derived systems) we can work out what's meant here. Running capsh --decode=0000003fffffffff shows that our root shell run outside basically had all the capabilities. If we run capsh --decode=00000000a80425fb we can see what Docker provides by default, which corresponds to the list in the Docker source code.
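The two decodes can be reproduced like this, assuming capsh is installed (the exact output formatting differs between libcap versions):

```shell
# Decode the full-capability mask seen in the root shell
capsh --decode=0000003fffffffff

# Decode Docker's default capability set; expect something like:
#   0x00000000a80425fb=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,
#   cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,
#   cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
capsh --decode=00000000a80425fb
```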
So which of these capabilities are a concern from a security perspective? Well to an extent that’s going to depend on how you’re using them. However there are some good starting points to look at, firstly there’s this post from the grsecurity forum that goes into the risks of allowing various capabilities. You can also look at the default list of capabilities that Docker allows (link above), as the ones they block are things that have been determined as dangerous in the context of containers.
There are some other utilities which are handy for doing things like auditing capabilities. The libcap-ng-utils package has the very handy pscap and filecap programs, which can be used to review capabilities on all processes and all files on a system by default. There's also captest, which will review capabilities in the context of the current process.
Also, if you're running containers and want a nice quick way to assess capabilities amongst other things, you could use Jessie Frazelle's amicontained.
Capabilities and Containers
So what has all this to do with containers? Well, it's worth noting what was mentioned earlier in this post: if you have a container which will run as a non-root user and which has no setuid or setgid root programs in it, you should be good to drop all capabilities. This adds another layer of hardening to the container, which can be helpful in preventing container breakout issues.
If you're running with root containers, then it's well worth reviewing the default list of capabilities that is provided by your container runtime and ensuring that you're happy that these are needed.
Specifically, there are ones like CAP_NET_RAW in the default Docker set which could be dangerous (see here for more details).
There are some gotchas to be aware of when using capabilities. First up is that, to use file capabilities, the filesystem you're running from needs to have extended attribute (xattr) support. A notable exception here is some versions of aufs that ship with some versions of Debian and Ubuntu. This can impact Docker installs, as they'll use aufs by default.
Another one is that where you're manipulating files, you need to make sure that the tools you're using understand capabilities. For example, when backing up files with tar, you need to use switches that preserve extended attributes. In practice, for tar you'll likely want to use --xattrs --xattrs-include=security.capability to make backups of files with capabilities.
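A sketch of a capability-preserving backup and restore with GNU tar (the paths are illustrative):

```shell
# Create an archive that records the security.capability xattr
tar --xattrs --xattrs-include=security.capability -cpf bin-backup.tar /usr/sbin/arping

# Restore, reapplying the stored capability (root is needed to set it back)
sudo tar --xattrs --xattrs-include=security.capability -xpf bin-backup.tar
```

Without the --xattrs switches, the extracted binary silently loses its capability set.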