If you have a job that reads the same data file many times, or makes many “random” accesses to a data file, it may be more efficient to have that data locally on a node than compete with other users to access the file server.
Each node has almost 1TB of space mounted on /tmp. This /tmp space is local to each node.
So, you could copy your data to the node and access it locally, then, when your job is done, copy the results back to your home directory.
Note: If what your program does is read a file strictly sequentially just once, this copy is unlikely to help.
There are a few options for doing the copy.
1) Do it directly in your script.
cp /home/username/mydata.fastq /tmp
... Run your process on the data in /tmp ...
rm /tmp/mydata.fastq
(Really you would use mktemp to get a unique name to avoid clashes.)
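The copy, process, and clean-up pattern above could be sketched as a small shell function. This is a minimal sketch rather than a definitive implementation: the function name stage_and_run, the result.txt file name, and the wc -l step (a stand-in for your real program) are assumptions made for illustration. It uses mktemp -d to create a uniquely named directory under /tmp so that concurrent jobs do not clash.

```shell
#!/bin/bash
# Hypothetical helper: stage one input file on node-local /tmp, process it
# there, copy the result back, and always remove the scratch directory.
stage_and_run() {
    local src="$1"        # input file on the shared file server
    local dest="$2"       # existing directory that receives the result
    local workdir input status

    # mktemp -d creates a uniquely named directory, avoiding name clashes.
    workdir=$(mktemp -d /tmp/stage.XXXXXX) || return 1

    if cp "$src" "$workdir/"; then
        input="$workdir/$(basename "$src")"
        # Placeholder for the real work: count the lines of the local copy.
        wc -l < "$input" > "$workdir/result.txt" &&
            cp "$workdir/result.txt" "$dest/"
        status=$?
    else
        status=1
    fi

    rm -rf "$workdir"     # clean up /tmp whether or not the job succeeded
    return "$status"
}
```

For example, stage_and_run /home/username/mydata.fastq /home/username/results would leave result.txt in the results directory, assuming that directory already exists.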
Be careful if you have multiple copies of your script running on a node: each one would copy the data, so the same file could end up being copied multiple times.
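One way to avoid repeated copies is to let the first job on a node do the copy while later jobs wait on a lock and then reuse the file. The sketch below assumes the flock utility (part of util-linux) is available on the nodes. The function name cache_once and the cache directory layout are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: copy a file into a shared node-local cache directory
# at most once, even when several jobs on the same node call it concurrently.
# Prints the path of the cached copy on success.
cache_once() {
    local src="$1"        # input file on the shared file server
    local cache="$2"      # node-local cache directory, e.g. /tmp/$USER-cache
    local target

    mkdir -p "$cache" || return 1
    target="$cache/$(basename "$src")"

    (
        flock 9 || exit 1 # block until this job holds the lock on fd 9
        if [ ! -f "$target" ]; then
            # Copy under a temporary name, then rename, so a partial copy
            # is never mistaken for the finished file.
            cp "$src" "$target.part" && mv "$target.part" "$target"
        fi
    ) 9> "$cache/.lock" || return 1

    printf '%s\n' "$target"
}
```

A job would then call something like local_copy=$(cache_once /home/username/mydata.fastq "/tmp/$USER-cache") and read from "$local_copy". Because the cached copy is shared between jobs, no single job should delete it when it finishes; you need to decide separately when to remove it.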
2) Use Secure Copy (SCP) or Secure FTP (SFTP)
For a detailed explanation, refer to "How to use SCP and SFTP to securely transfer files".
scp filename user@192.168.1.3:/tmp/
$ sftp username@192.168.1.3
sftp> put /etc/filename /tmp/
3) Use Globus
Use Globus to transfer files through a graphical interface or to transfer very large files.
Globus is a web-based file transfer application that allows resilient, unattended file transfers between two Globus endpoints. Start the transfer and Globus ensures it completes successfully and sends an email when the transfer is done. Globus may be preferable to SCP or SFTP when transferring very large files because it runs unattended, in the background, with status checking and fault tolerance.
There are two ways to use the “Globus Connect Personal” client on the BRC cluster. The steps below explain the text-mode version, which requires both a web browser and a Unix terminal connected to the BRC cluster.
1) Load the latest Globus module
module load globuspersonal/3.2.2
2) Set up the client
globusconnectpersonal -setup
The program creates a URL that you need to copy and paste into the browser on your personal computer. In your browser, follow the instructions to log in and authenticate to your Globus account.
At some point, it will show a page with an authorization code. Copy the code from the web browser and paste it into the Linux SSH terminal window at the prompt that says 'Enter the auth code: '.
It will then ask for a name for your new Globus endpoint, with the prompt “Input a value for the Endpoint Name: ”.
You can choose any name that makes sense when referring to the BRC cluster. We recommend entering: BRC Cluster.
The program then exits and returns to the Linux command line.
3) Using the endpoint to transfer files
At this point you can start the client at any time by running the following:
module load globuspersonal/3.2.2
globusconnectpersonal -start
While the client is running on the BRC cluster, you can access and transfer your files from the Globus web interface by searching for the endpoint in the search bar. You can type BRC Cluster, or navigate until you see “Your Collections” and choose the BRC Cluster endpoint.
When transferring large amounts of data, it is better to keep the Globus Connect client running in the background, so that it survives the end of your terminal session, by executing the following:
nohup globusconnectpersonal -start &
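The background start-up could also be wrapped in a small function that checks the client is on your PATH before launching it. This is only a sketch: the function name start_globus_client and the log file location are assumptions, and it relies only on the -start option shown above.

```shell
#!/bin/bash
# Hypothetical helper: start Globus Connect Personal in the background,
# sending its output to a log file instead of nohup.out.
start_globus_client() {
    local log="${1:-$HOME/globusconnectpersonal.log}"   # assumed log location

    # The client is only on PATH after the module has been loaded.
    if ! command -v globusconnectpersonal >/dev/null 2>&1; then
        echo "globusconnectpersonal not found; run 'module load globuspersonal/3.2.2' first" >&2
        return 1
    fi

    # nohup keeps the client alive after the terminal session ends.
    nohup globusconnectpersonal -start > "$log" 2>&1 &
    echo "Globus Connect Personal started in the background (PID $!), log: $log"
}
```

If your installed version provides the -status and -stop options, they can be used afterwards to check on the client or shut it down once the transfer has finished.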
Remember that the setup only needs to be done once. After that, you only need to start the client.
Please refer to the resources below to learn how to transfer files using Globus.