Uploading Data from your Local Computer
Uploading to DNAnexus from your laptop should not be done over the VPN — ensure you are disconnected before continuing with this guide!
DNAnexus provides a command line utility, dx, to enable users to upload input data, download results, and run workflows in the cloud.
Click here to learn about all of the commands included in dx.
The dx command line tool is written in Python.
It can be installed onto your computer using pip install dxpy.
However, many computers ship with multiple versions of Python, which can lead to confusing errors (e.g., one version of Python trying to load libraries installed for a different version).
Thus, we highly recommend the use of conda environments to create an isolated Python environment and install dx there.
If you are interested in learning more about the features provided by conda, their getting started guide is quite good. This is not strictly necessary for continuing with this guide.
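To see whether a machine has several Python installations competing on the PATH, a quick check along these lines can help. This is a minimal sketch: show_tool is a hypothetical helper name, and it only reports the first match for each command.

```shell
# show_tool is a hypothetical helper: print where a command resolves on PATH.
show_tool() {
  if location=$(command -v "$1" 2>/dev/null); then
    printf '%s -> %s\n' "$1" "$location"
  else
    printf '%s -> not found\n' "$1"
  fi
}

# If python3 and pip3 resolve to different installations, a package
# installed with pip may not be importable by the python you actually run.
for tool in python python3 pip pip3; do
  show_tool "$tool"
done
```

If the reported locations point into different directories, that is a sign the isolated environment described in the next section is worth setting up.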
Installing via conda
To install conda and start creating isolated Python environments, you can visit the conda download documentation and complete the install instructions for your operating system.
Be sure to select "Python 3.X version" when choosing which version to download.
Once you complete installation, you should be able to use the conda command in your terminal.
conda can be used to create multiple, independent Python environments.
To leverage it to use dx, you'll need to do two things:
- a one-time creation of a Python 3 environment with dx installed.
- an environment activation step for every new terminal you open. Note that the environment we create will not be accessible without this explicit activation step.
To create an isolated Python 3 environment with dxpy installed, use the following command (you can give your environment any name; here we name it dx-env).
# conda create -n [environment-name] python=3 dxpy -y
conda create -n dx-env python=3 dxpy -y
Once the environment is created, run this command each time you open a new terminal to ensure that the environment is active.
# conda activate [environment-name]
conda activate dx-env
You should now be able to use the dx command line tool in your terminal.
dx --help
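If the dx command is not found, the most likely cause is that the environment was not activated in the current terminal. The sketch below checks for this; have_cmd is a hypothetical helper name, and dx-env is the environment name used in this guide.

```shell
# have_cmd is a hypothetical helper: succeed if a command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd dx; then
  # Print the installed dx version as a quick sanity check.
  dx --version
else
  # CONDA_DEFAULT_ENV is set by conda to the name of the active environment.
  echo "dx not found (active conda env: ${CONDA_DEFAULT_ENV:-none})." >&2
  echo "Try: conda activate dx-env" >&2
fi
```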
Authenticating
Next, you'll need to configure the dx command line tool with access to your St. Jude Cloud account.
Rather than exposing your username and password, best practice is to generate an authentication token that is valid for only a short period of time.
You can do so by following this guide on how to generate a DNAnexus authentication token.
Replace <auth-token> with your own token in the example below.
dx login --token <auth-token> --noprojects
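If you log in from a script, one option is to read the token from an environment variable so that it is never written into the script itself. The following is a sketch under stated assumptions: dx_login_from_env is a hypothetical wrapper, and MY_DX_TOKEN is a variable name chosen for this example, not something dx reads on its own.

```shell
# dx_login_from_env is a hypothetical wrapper; MY_DX_TOKEN is a variable name
# chosen for this example, not something dx itself reads.
dx_login_from_env() {
  if [ -z "${MY_DX_TOKEN:-}" ]; then
    echo "MY_DX_TOKEN is not set; export your token first." >&2
    return 1
  fi
  # Log in, then print the logged-in user name to confirm it worked.
  dx login --token "$MY_DX_TOKEN" --noprojects && dx whoami
}
```

After setting MY_DX_TOKEN in your terminal, run dx_login_from_env; it returns a non-zero status without calling dx if the variable is empty.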
Upload and download files
With the DNAnexus toolkit installed and configured, files can be transferred between St. Jude Cloud and your local computer by running dx upload and dx download.
To get acquainted with these commands, you can view the relevant help messages.
dx upload -h
dx download -h
To upload a file sample.1.bam to the /test/ folder in the project-alpha cloud project, you could use the following command:
dx upload sample.1.bam --destination "project-alpha:/test/"
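When there are several files to upload, the same command can be wrapped in a loop. This is a minimal sketch: dx_dest is a hypothetical helper that builds the destination string, and the sample.*.bam pattern is only an illustration based on the file name above.

```shell
# dx_dest is a hypothetical helper: build a "project:/folder/" destination.
dx_dest() {
  folder=${2#/}        # drop one leading slash, if present
  folder=${folder%/}   # drop one trailing slash, if present
  if [ -z "$folder" ]; then
    printf '%s:/\n' "$1"
  else
    printf '%s:/%s/\n' "$1" "$folder"
  fi
}

dest=$(dx_dest project-alpha /test/)

# Upload every matching BAM file in the current directory, one at a time.
for f in sample.*.bam; do
  [ -e "$f" ] || continue   # the glob matched nothing; skip
  dx upload "$f" --destination "$dest"
done
```

Uploading one file per command keeps a failure on a single file easy to spot and retry.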
To download all files in the /results/ folder in the project-alpha cloud project to the current working directory, you could use the following command:
dx download -r "project-alpha:/results/"
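To keep downloaded results out of the current working directory, the -o option of dx download can point at a local directory instead. The sketch below assumes a local directory named results-download, which is a hypothetical name; check dx download -h to confirm how -o behaves with recursive downloads in your version.

```shell
outdir="results-download"   # hypothetical local directory name
mkdir -p "$outdir"

if command -v dx >/dev/null 2>&1; then
  # -r downloads folders recursively; -o sets the local output location.
  dx download -r "project-alpha:/results/" -o "$outdir"
else
  echo "dx not found; activate the environment where it is installed." >&2
fi
```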
The dx command line utility and its upload/download subcommands have many options you can configure based on your use case.
We recommend you view the help messages or reach out to us at support@stjude.cloud for more information.