====== Creating a custom Singularity container for R ======

With a custom Singularity container on TSD you can create a custom R environment that is essentially independent of what the TSD infrastructure supports natively. A Singularity container can circumvent some of the ''<nowiki>gcc</nowiki>''-compiler or software dependency issues.

In this tutorial we'll create a Singularity container (a ''<nowiki>.sif</nowiki>''-file) on a local Mac (these steps might work for Windows too, but I don't have access to one to try it out). 

First of all, set up the following directory structure:

<code>
.
|- run_vagrant.sh
|- from_docker_image.sh
|- rcontainer/
    |- Dockerfile
|- scripts/
    |- apt_get_essential.sh
</code>

The name of the ''<nowiki>rcontainer/</nowiki>'' directory will also be the name of your Singularity container. You can rename it to anything you'd like (e.g. ''<nowiki>cerebellum-project</nowiki>''). Inside this directory is your ''<nowiki>Dockerfile</nowiki>'', which we'll go over later. You'll also need the ''<nowiki>run_vagrant.sh</nowiki>'' and ''<nowiki>from_docker_image.sh</nowiki>'' scripts. Lastly, you'll need another directory called ''<nowiki>scripts</nowiki>'' that contains the ''<nowiki>apt_get_essential.sh</nowiki>'' script. You can download these scripts (and a template directory tree) {{ rcontainer_template.zip?linkonly |here}}.
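
If you'd rather build the directory tree by hand, the layout above can be sketched as follows. This is only a minimal sketch: ''<nowiki>rcontainer_project</nowiki>'' is a hypothetical name for the top-level directory, and the files it creates are empty placeholders that you still need to replace with the scripts from the template.

<code bash>
#!/usr/bin/env bash
# Create the directory layout used in this tutorial.
# ASSUMPTION: "rcontainer_project" is a hypothetical top-level directory name.
PROJECT_DIR="${PROJECT_DIR:-rcontainer_project}"

# The two sub-directories: one for the Dockerfile, one for helper scripts
mkdir -p "$PROJECT_DIR/rcontainer" "$PROJECT_DIR/scripts"

# Empty placeholders; touch does not overwrite files that already exist
touch "$PROJECT_DIR/run_vagrant.sh" \
      "$PROJECT_DIR/from_docker_image.sh" \
      "$PROJECT_DIR/rcontainer/Dockerfile" \
      "$PROJECT_DIR/scripts/apt_get_essential.sh"

# Show what was created
find "$PROJECT_DIR" | sort
</code>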


==== STEP 1 - Installing software ====

In order for this tutorial to work you need to have **Singularity**, **Docker**, **Vagrant**, and **VirtualBox** installed. You can install Singularity on Mac using the following command:

<code>
brew install singularity
</code>

You can install Docker from [[https://docs.docker.com/docker-for-mac/install/|here]]. 

You can install Vagrant and a necessary plugin like this:

<code>
brew install vagrant
vagrant plugin install vagrant-vbguest
vagrant plugin update vagrant-vbguest
</code>

You will need VirtualBox installed as well. Vagrant only manages the virtual machines; it needs a provider to actually run them, and VirtualBox is Vagrant's default provider (the ''<nowiki>vagrant-vbguest</nowiki>'' plugin above is also specific to VirtualBox). You can download the installation file for VirtualBox [[https://www.virtualbox.org/wiki/Downloads|here]]. VirtualBox is also available via the UiO Software Center if you have a university computer.

You need to have these programs installed in order to use the steps in this tutorial.
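
A quick way to see whether everything is in place is to check that each tool is on your ''<nowiki>PATH</nowiki>''. The snippet below is a small sketch of such a check. The function name ''<nowiki>check_tools</nowiki>'' is made up for this example, and ''<nowiki>VBoxManage</nowiki>'' is the command-line tool that ships with VirtualBox.

<code bash>
#!/usr/bin/env bash
# Report which of the given command-line tools can be found on the PATH.
# ASSUMPTION: check_tools is a hypothetical helper, not part of the template.
check_tools() {
    local tool
    local missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Return non-zero if at least one tool was not found
    return "$missing"
}

check_tools singularity docker vagrant VBoxManage \
    || echo "Install the missing tools before continuing."
</code>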


==== STEP 2 - Dockerfile ====

The Dockerfile is where you specify the details of your container.

== R version ==

The first line of the Dockerfile looks something like this:

<code>
FROM rocker/verse:4.1.0
</code>

You specify the R version after the colon. In this case, the R version for the container is 4.1.0.

== Software dependencies ==

The next few lines contain some technical parts. They also include a call to the ''<nowiki>scripts/apt_get_essential.sh</nowiki>'' script. If you open the script, you'll see that it installs a few essential programs such as git, make, pandoc, and wget.


== R packages ==

The next part is where you install the R packages you want to use in the container. The format for installing a package goes like this.

<code>
RUN R -e 'install.packages("tidyverse")'
</code>

You can also use helper functions from libraries to install packages that are not on CRAN. You can for instance install from GitHub like this:

<code>
RUN R -e 'install.packages("remotes")'
RUN R -e 'remotes::install_github("norment/normentR")'
</code>

Or packages from Bioconductor like this:

<code>
RUN R -e 'install.packages("BiocManager")'
RUN R -e 'BiocManager::install("enrichplot")'
</code>

Or packages from a different source like this:

<code>
RUN R -e 'install.packages("http://cnsgenomics.com/software/gsmr/static/gsmr_1.0.9.tar.gz", repos = NULL, type = "source")'
</code>

etc.


== Install additional tools ==
If you installed a package that requires some system-level data or functions, you can specify it here. For instance, the ''<nowiki>{gsmr}</nowiki>'' package requires some files on the system. For this you need the ''<nowiki>install_gcta.sh</nowiki>'' script, which you can download [[https://github.com/comorment/gwas/tree/main/scripts|here]]. The same page has a number of other scripts for installing different tools. In general, the syntax looks something like this:

<code>
WORKDIR /tools/gcta
COPY /scripts/install_gcta.sh /tmp
RUN chmod +x /tmp/install_gcta.sh
RUN bash /tmp/install_gcta.sh
</code>

== Other ==

The last line of the Dockerfile says:

<code>
WORKDIR /tools
</code>

This sets the default working directory inside the image. It is not very important here, since Singularity normally starts in the directory you launch it from.
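
To see how the fragments from this step fit together, the sketch below writes a small example Dockerfile. Treat it as an illustration only: the Dockerfile in the template may well look different, the two lines that call ''<nowiki>apt_get_essential.sh</nowiki>'' are a guess modelled on the ''<nowiki>install_gcta.sh</nowiki>'' fragment above, and ''<nowiki>Dockerfile.example</nowiki>'' is a hypothetical file name chosen so that your real Dockerfile is not overwritten.

<code bash>
#!/usr/bin/env bash
# Write an example Dockerfile assembled from the fragments in this step.
# ASSUMPTION: Dockerfile.example is a hypothetical file name.
mkdir -p rcontainer

cat > rcontainer/Dockerfile.example <<'EOF'
# Base image; the tag after the colon is the R version
FROM rocker/verse:4.1.0

# ASSUMPTION: modelled on the install_gcta.sh fragment, the template may differ
COPY /scripts/apt_get_essential.sh /tmp
RUN bash /tmp/apt_get_essential.sh

# R packages from CRAN and GitHub
RUN R -e 'install.packages("tidyverse")'
RUN R -e 'install.packages("remotes")'
RUN R -e 'remotes::install_github("norment/normentR")'

WORKDIR /tools
EOF

# Print the result for inspection
cat rcontainer/Dockerfile.example
</code>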

==== STEP 3 - Build the container ====

Next, we'll use the ''<nowiki>run_vagrant.sh</nowiki>'' script to turn the Dockerfile into a Singularity container (''<nowiki>.sif</nowiki>''-file). This script runs with no issues on my Mac. If you run into any issues, you can try to run each line in this script one-by-one and debug from there. Note that the ''<nowiki>run_vagrant.sh</nowiki>'' script needs to be run with root privileges.

<code>
sudo bash run_vagrant.sh rcontainer
</code>

How long it takes to build the Singularity container depends on the number of packages and tool dependencies you include in the Dockerfile. Generally, it should take about 20 to 30 minutes overall. The script will provide you with plenty of feedback. 

When the command above finishes running, you'll have a file called ''<nowiki>rcontainer.sif</nowiki>'' in the current directory. You can upload this file to TSD, where you can run the container with this command:

<code>
singularity exec --home $PWD:/home:/cluster rcontainer.sif R
</code>

==== (OPTIONAL) STEP 4 - Using the container on TSD ====

My recommended way to use the Singularity R container on TSD while you're writing and editing your R scripts is to have two terminal windows open, both in the same directory. In one, you can have e.g. gedit or vim open to edit your R or RMarkdown scripts; in the other, you run R inside the Singularity container and copy-paste the code into it. When you've finished the script, you can of course run it as illustrated in the documentation elsewhere.
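
For a finished script, one possible pattern is to call ''<nowiki>Rscript</nowiki>'' through ''<nowiki>singularity exec</nowiki>'' instead of starting an interactive R session. The sketch below wraps that in a small function. The function name ''<nowiki>run_in_container</nowiki>'', the ''<nowiki>DRY_RUN</nowiki>'' switch, and the script name ''<nowiki>analysis.R</nowiki>'' are all made up for this example, and on TSD you may need to add the same options (such as ''<nowiki>--home</nowiki>'') that you use for the interactive command.

<code bash>
#!/usr/bin/env bash
# Run an R script non-interactively inside a Singularity container.
# ASSUMPTION: run_in_container and DRY_RUN are hypothetical helpers.
run_in_container() {
    local image="$1"
    shift
    # Everything after the image name is passed on to Rscript
    local cmd=(singularity exec "$image" Rscript "$@")
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print the command, e.g. on a machine without Singularity
        echo "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}

# Prints: singularity exec rcontainer.sif Rscript analysis.R
DRY_RUN=1 run_in_container rcontainer.sif analysis.R
</code>

Leave out ''<nowiki>DRY_RUN=1</nowiki>'' to actually run the script.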

**DISCLAIMER** - This is not a bullet-proof script. Depending on the setup on your Mac, you may need additional steps and configuration to make it work. I can help with some of the stuff in this script. For the technical parts about virtual machines, Singularity etc., Google is your friend.

**DISCLAIMER PT.II** - I haven't gotten RStudio to work in a container yet. Bayram has gotten it to work for JupyterLab, but RStudio is still out of reach, so for now you can only use the command line interface. More importantly, I haven't gotten X11 forwarding to work either, so you can only view figures or plots by saving them and then opening the image file from your node.
