Singularity – NIH HPC Systems

Posted: June 4, 2021 at 4:08 pm


Singularity containers let users run applications in a Linux environment of their choosing.

Possible uses for Singularity on Biowulf:

Web sites

Additional Learning Resources

Example definition files written by the NIH HPC staff

These definition files can all be found on GitHub, and the containers built from them are hosted on Singularity Hub.

Additionally, a large number of staff-maintained definition files and associated helper scripts can be found at this GitHub repo. These are files that staff members use to install containerized apps on the NIH HPC systems.

Please Note: Singularity gives you the ability to install and run applications in your own Linux environment with your own customized software stack. With this ability comes the added responsibility of managing your own Linux environment. While the NIH HPC staff can provide guidance on how to create and use Singularity containers, we do not have the resources to manage containers for individual users. If you decide to use Singularity, it is your responsibility to build and manage your own containers.

Creating Singularity containers

To use Singularity on Biowulf, you either need to create your own Singularity container, or use one created by someone else. You have several options to build Singularity containers:

You can find information about installing Singularity on your Linux build system here for 2.x and here for the current 3.x series.

In addition to your own Linux environment, you will also need a definition file to build a Singularity container from scratch. You can find some simple definition files for a variety of Linux distributions in the /example directory of the source code. You can also find a small list of definition files containing popular applications at the top of this page. Detailed documentation about building Singularity container images is available at the Singularity website.
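A minimal definition file looks like the following. This is a generic sketch, not one of the staff-maintained files; the base image and package names are arbitrary illustrations:

```
Bootstrap: docker
From: ubuntu:20.04

%post
    # commands run inside the container at build time
    apt-get update && apt-get install -y curl

%environment
    export LC_ALL=C

%runscript
    # executed when the container image is run
    echo "Hello from inside the container"
```

The `%post` section customizes the software stack, while `%runscript` defines what happens when the image is executed directly.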

Troubleshooting containers that hang when they run out of memory

A few containers have caused issues on Biowulf by triggering a kernel level bug described in detail here and here. These include fmriprep and nanodisco. The problems follow a predictable pattern:

Binding external directories

Binding a directory to your Singularity container allows you to access files in a host system directory from within your container. By default, Singularity will bind your $HOME directory (along with a few other directories such as /tmp and /dev). You can also bind other directories into your Singularity container yourself. The process is described in detail in the Singularity documentation.

While $HOME is bound to the container by default, there are several filesystems on the NIH HPC systems that you may also want to include. Furthermore, if you are running a job and have allocated local scratch space, you might like to bind mount your lscratch directory to /tmp in the container.

The following command opens a shell in a container while bind-mounting your data directory, /fdb, /scratch, and /lscratch into the same path inside the container. If you have access to shared data directories, you'll want to add them to the list as well (for example, /data/$USER,/data/mygroup1,/data/mygroup2,/fdb,...).
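A sketch of such a session (the image name my-container.sif is a placeholder; substitute your own group directories in the bind list):

```
[user@cn1234 ~]$ singularity shell --bind /data/$USER,/fdb,/scratch,/lscratch my-container.sif
```

To map your local scratch space to /tmp inside the container, as mentioned above, you would instead use a bind specification like `/lscratch/$SLURM_JOB_ID:/tmp`.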

NIH HPC Staff recommendations for bind mounts on Biowulf

When building containers for use on Biowulf, there are two steps you should take to ensure that users of the container can read and write data to all normal directories. These are the same steps that NIH HPC staff take when containerizing applications for general use.

Step 1. Add directories and symlinks at build time

Data directories hosted on the GPFS file system rely on a series of symbolic links. Singularity can't follow these symlinks if they don't exist within the container, so you need to create them at build time. Other data directories can be automatically created within the container by recent versions of Singularity, but it's still a good idea to create them.

Include the following lines in the %post section of your definition file.
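For example, something like the following. The mount points and the symlink below are illustrative placeholders; consult the staff definition files on GitHub for the current list of directories and GPFS link targets:

```
%post
    # create mount points for the NIH HPC filesystems
    mkdir -p /gpfs /spin1 /data /scratch /fdb /lscratch
    # recreate a GPFS symlink so bound paths resolve inside the container
    # (this link target is a placeholder, not the actual Biowulf layout)
    ln -s /gpfs/gsfs0/users /gs0
```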

Step 2. Set the --bind option or the SINGULARITY_BIND variable appropriately at run time

Use the following option/argument pair when you run your container.
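For example, the environment-variable form looks like this (the directory list is illustrative and should match the mount points created at build time; passing the same comma-separated list to `--bind` on the singularity command line is equivalent):

```shell
# Bind the NIH HPC filesystems into the container at run time.
export SINGULARITY_BINDPATH="/gpfs,/spin1,/data,/scratch,/fdb,/lscratch"
```

With this variable exported, every subsequent `singularity exec/run/shell` invocation in the session picks up the same bind list.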

Singularity as an Installation Medium: faking a native installation

One use case of Singularity is to transparently use software in a container as though it were directly installed on the host system. To accomplish this on our systems, you need to be aware of the shared filesystem locations and bind mount the corresponding directories inside the container, which is more complicated than it seems because we use symbolic links to refer to some of our network storage systems. As a result, you will need to specify some directories in addition to the ones you use directly to ensure that the symbolic link destinations are also bound into the container.

If you wanted to take advantage of a Debian package this way and use it to install software into your home directory, for example samtools and bcftools, you would use a definition file, Singularity, with these contents:
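A sketch of such a definition file (the base image tag is an assumption; the tool versions you get are whatever the Debian repositories provide):

```
Bootstrap: docker
From: debian:stable

%post
    # pull samtools and bcftools from the Debian package repositories
    apt-get update
    apt-get install -y samtools bcftools
```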

After finalizing the definition file, you can proceed to build the container (of course, on a system where you have sudo or root access):
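For example (the image name hts.sif is arbitrary):

```
[user@buildsys ~]$ sudo singularity build hts.sif Singularity
```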

You can then set up your installation prefix (here, it's $HOME/opt/hts) as follows, making use of symbolic links and a wrapper script:
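One way to arrange this is a small setup script like the following. This is a sketch: the prefix $HOME/opt/hts comes from the text, while the image name hts.sif, the wrapper name, and the bind list are assumptions:

```shell
#!/bin/bash
# Set up an installation prefix whose bin/ entries transparently run
# programs from the container image.
PREFIX="$HOME/opt/hts"
mkdir -p "$PREFIX/bin"

# A generic wrapper script: it runs the program of the same name inside
# the container image, binding the usual filesystems (list illustrative).
cat > "$PREFIX/bin/.container-exec" <<'EOF'
#!/bin/bash
exec singularity exec --bind /data,/scratch,/fdb \
    "$HOME/opt/hts/hts.sif" "$(basename "$0")" "$@"
EOF
chmod +x "$PREFIX/bin/.container-exec"

# Symlink each containerized program to the wrapper; the wrapper
# dispatches on the name it was invoked as.
for prog in samtools bcftools; do
    ln -sf .container-exec "$PREFIX/bin/$prog"
done
```

After running this, `$HOME/opt/hts/bin/samtools` and `$HOME/opt/hts/bin/bcftools` both point at the wrapper, which launches the matching program inside the container.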

So if you have added the installation prefix $HOME/opt/hts/bin to your PATH, then calling samtools or bcftools will run those programs from within your container. And because we have arranged to bind mount all the necessary filesystems into the container, the path names you provide for input and output into the programs will be available to the container in the same way.

Interactive Singularity containers

Singularity cannot be run on the Biowulf login node.

To run a Singularity container image on Biowulf interactively, you need to allocate an interactive session, and load the Singularity module. In this sample session (user input in bold), an Ubuntu 16.04 Singularity container is downloaded and run from Docker Hub. If you want to run a local Singularity container instead of downloading one, just replace the DockerHub URL with the path to your container image file.
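The session looks roughly like this (node names are illustrative):

```
[user@biowulf ~]$ sinteractive
[user@cn1234 ~]$ module load singularity
[user@cn1234 ~]$ singularity shell docker://ubuntu:16.04
Singularity> cat /etc/os-release
```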


Singularity containers in batch

In this example, singularity will be used to run a TensorFlow example in an Ubuntu 16.04 container. (User input in bold).

First, create a container image on a machine where you have root privileges. These commands were run on a Google Cloud VM instance running an Ubuntu 16.04 image, and the Singularity container was created using this definition file that includes a TensorFlow installation.

Next, copy the TensorFlow script that you want to run into your home directory, or another directory that will be visible from within the container at runtime. (See 'binding external directories' above). In this case, this example script from the TensorFlow website was copied to /home/$USER, and the container was moved to the user's data directory.

Then ssh to Biowulf and write a batch script to run the singularity command similar to this:
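Such a batch script might look like this (the script name, image path, and Python script path are placeholders):

```
#!/bin/bash
# myjob.sh -- run the TensorFlow example inside the container
module load singularity
singularity exec /data/$USER/tensorflow.sif python /home/$USER/mnist_example.py
```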

Submit the job like so:
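For example (the script name is a placeholder):

```
[user@biowulf ~]$ sbatch myjob.sh
```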

After the job finishes executing, check the slurm*.out file for the TensorFlow output.


Singularity containers on GPU nodes

With the release of Singularity v2.3 it is no longer necessary to install NVIDIA drivers into your Singularity container to access the GPU on a host node. If you still want the deprecated gpu4singularity script that was used to install NVIDIA drivers within containers for use on our GPU nodes you can find it on GitHub.

Now, you can simply use the --nv option to grant your containers GPU support at runtime. Consider the following example in which we will download some TensorFlow models to the user's home directory and then run the latest TensorFlow container from DockerHub to train a model on the MNIST handwritten digit data set using a GPU node.
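A session might look like this (the GPU type, image tag, and script path are illustrative assumptions):

```
[user@biowulf ~]$ sinteractive --gres=gpu:k80:1
[user@cn1234 ~]$ module load singularity
[user@cn1234 ~]$ singularity exec --nv docker://tensorflow/tensorflow:latest-gpu \
      python models/tutorials/image/mnist/convolutional.py
```

The `--nv` flag makes the host's NVIDIA driver and GPU device files available inside the container, so no driver installation is needed in the image itself.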


Using Docker containers with Singularity

Singularity can import, bootstrap, and even run Docker images directly from Docker Hub. For instance, the following commands will start an Ubuntu container running on a compute node with no need for a definition file or container image! And, of course, we remember to set SINGULARITY_BINDPATH appropriately to be able to access all our files.
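For example (the image tag and bind list are illustrative):

```
[user@cn1234 ~]$ module load singularity
[user@cn1234 ~]$ export SINGULARITY_BINDPATH="/data,/scratch,/fdb"
[user@cn1234 ~]$ singularity shell docker://ubuntu:latest
```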

In this example, we will create a Singularity container image starting from the official continuumio miniconda container on Docker Hub. Then we'll install a number of RNASeq tools. This would allow us to write a pipeline with, for example, Snakemake and distribute it along with the image to create an easily shared, reproducible workflow. This definition file also installs a runscript enabling us to treat our container like an executable.
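A sketch of such a definition file (the specific tool list is illustrative, and the runscript here simply passes its arguments through to the requested program):

```
Bootstrap: docker
From: continuumio/miniconda3

%post
    # install RNASeq tools into the base conda environment
    conda install -y -c bioconda samtools salmon hisat2

%runscript
    # run whichever tool (and arguments) the caller asked for
    exec "$@"
```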

Assuming this file is called rnaseq.def, we can create a Singularity container called rnaseq on our build system with the following commands:
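For example:

```
[user@buildsys ~]$ sudo singularity build rnaseq rnaseq.def
```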

This image contains miniconda and our rnaseq tools and can be called directly as an executable like so:
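For example, assuming a runscript that execs its arguments as sketched above:

```
[user@buildsys ~]$ ./rnaseq samtools --version
```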

After copying the image to the NIH HPC systems, allocate an sinteractive session and test it there.

This could be used with a Snakemake file like this:
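A sketch of such a Snakefile, using the container as the executable in each rule's shell command (the rule, tool, index, and file names are all illustrative):

```
rule quantify:
    input: "sample.fq.gz"
    output: "quant/quant.sf"
    shell: "./rnaseq salmon quant -i index -l A -r {input} -o quant"
```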

