Cloud for Dummies - Data Transfer

You understand the basics of the Cloud (if not, first read the Cloud for Dummies tutorial on how to start a VM), but now you have to transfer a fairly large amount of data to your VM. This could be a custom VM image that you want to run, or some input data for your VM. The manual talks about GridFTP, but you have no idea what that is. Here is how to do it.

We will use GridFTP and Globus Transfer: it is wicked easy!

1. It all starts with a free (as in free-of-charge and also free-of-obligations) Globus Transfer ID. Surf to

    https://www.globusonline.eu/

    and click on “Log In” (although you do not yet have a Globus account).

[Screenshot: Globus Online]

2. At this point you have to choose your institution so that the service can verify your identity. LRZ is not in the list, so you should fall back to the "catch-all" organisation Globus ID and create a new account. To do that, click on the Globus ID link at the top or look for the entry in the drop-down menu (you can also type here to filter out irrelevant results).

[Screenshot: alternate login]

A login page is now presented. Since you do not have an account yet, choose the Sign Up option in the top-right corner. If this is not your first login, just enter your credentials and skip to step 4.

[Screenshot: Globus login page]

3. If this is the first time you use Globus Transfer, you have to create a Globus account. The following page will pop up automatically. Please fill in the details. It might be a good idea to use your LRZ user ID (but you should use a different, complicated throwaway password). Do not forget to accept the Terms of Service and to click on Create ID to finish the procedure.

[Screenshot: Globus account creation form]

4. In order to transfer data you need two so-called “endpoints” between which the data will be transferred. LRZ has already set up one endpoint for the Cloud, lrz#ONE, but you have to set up a second endpoint on your own computer, where the data that you want to transfer to the LRZ Cloud (e.g., the VM image) are located. This is what we will do now. You only have to do this once; after that you are all set forever. You can transfer data from a Mac, a Linux machine, or a Windows machine into the Cloud. To do this, you have to get “Globus Connect Personal”: click on "Manage Data", then on "Transfer Files", and then on "Get Globus Connect Personal".

[Screenshot: Transfer data]

Follow the two simple steps that are explained in the pop-up window. You can name your endpoint anything you like.

5. Now we connect to the two endpoints.

[Screenshot: connect to the two endpoints]

In the left endpoint field, start typing LRZ’s endpoint name: lrz#ONE. You will see that it auto-completes after a few characters (or you can select it from the drop-down menu). Hit return.

Then type YOUR Globus Connect endpoint name in the right field. Again it will auto-complete, so you don’t have to type it all.

6. To transfer a VM image to the LRZ Cloud, select the destination on the left side: double-click “scratch” and then the directory named after your LRZ ID.

On the right side, select the VM image you want to transfer over. Then click the big blue left-pointing arrow in the middle.

[Screenshot: Transfer Files]

That’s it! Once the transfer is done, you will receive an email from Globus Transfer telling you that your data transfer has finished.

7. Now you have to import that image from the staging zone (the loading dock, so to speak) into your datastore (the permanent warehouse, so to speak). You do this by creating a new image: select "Virtual Resources" in the left menu, then "Images", then press the green + sign. Give it a name, maybe add a description, select the correct type (CDROM or OS), and make it the default. Now come the important steps: specify that the image can be found via a path, and then enter the same path (under /media/scratch/...) that you used to upload the image in the previous step. Finally, click on the green "Create" button.

[Screenshot: image]

Then you can select the image in the ONE web interface.
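If you tend to mistype paths, it can help to write down both locations from steps 6 and 7 before you click anything. The following Python snippet is only a small sketch of that bookkeeping. It assumes that the folder you browse on the lrz#ONE endpoint (scratch, then your LRZ ID) shows up with the same layout below /media/scratch when you create the image; the class name `ImageUpload`, the LRZ ID `ab12cde`, and the file name `my-image.qcow2` are made up for illustration.

```python
import posixpath
from dataclasses import dataclass

# Prefix the text gives for the path field when creating the image.
SCRATCH_MOUNT = "/media/scratch"


def _check_component(name: str, value: str) -> None:
    """Reject empty values and values that would change the directory level."""
    if not value or "/" in value or value in (".", ".."):
        raise ValueError(f"{name} must be a single path component, got {value!r}")


@dataclass(frozen=True)
class ImageUpload:
    """One image file uploaded to the scratch area (hypothetical helper)."""

    lrz_id: str    # your LRZ ID, i.e. the directory name below "scratch"
    filename: str  # name of the image file you transferred

    def __post_init__(self) -> None:
        _check_component("lrz_id", self.lrz_id)
        _check_component("filename", self.filename)

    def endpoint_path(self) -> str:
        """Folder clicks from step 6: scratch, then the LRZ ID directory."""
        return posixpath.join("scratch", self.lrz_id, self.filename)

    def image_path(self) -> str:
        """ASSUMPTION: the same layout appears below /media/scratch in step 7."""
        return posixpath.join(SCRATCH_MOUNT, self.lrz_id, self.filename)


if __name__ == "__main__":
    upload = ImageUpload(lrz_id="ab12cde", filename="my-image.qcow2")
    print("Upload to :", upload.endpoint_path())
    print("Image path:", upload.image_path())
```

If the path you actually see in the Globus file browser differs from this layout, trust the file browser and adapt the path you enter in the image dialog accordingly.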

Similarly, if you want to stage in some data files, put them into the stagein folder instead of the scratch folder. Select the ID number of the VM to which you want to stage the data, and do the transfer! After a few days your data will be auto-deleted again, so copy them to a safe place in the Cloud! See also the Q&A section on "How can I stage in files directly into the VM?"
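When you stage in several files at once, a quick list of where each file is supposed to end up makes it easier to check the transfer afterwards. The sketch below assumes that the stagein folder contains one sub-folder per VM ID, which is how the description above reads; the function name `stagein_destinations`, the VM ID, and the file names are hypothetical.

```python
import posixpath
from pathlib import PurePath
from typing import Dict, Iterable


def stagein_destinations(vm_id: int, local_files: Iterable[str]) -> Dict[str, str]:
    """Map each local file to its assumed destination stagein/<VM ID>/<name>."""
    if isinstance(vm_id, bool) or not isinstance(vm_id, int) or vm_id < 0:
        raise ValueError("vm_id must be a non-negative integer")
    destinations: Dict[str, str] = {}
    for local in local_files:
        name = PurePath(local).name  # keep only the file name, drop local folders
        if not name:
            raise ValueError(f"not a file path: {local!r}")
        # ASSUMPTION: one sub-folder per VM ID below the stagein folder.
        destinations[local] = posixpath.join("stagein", str(vm_id), name)
    return destinations


if __name__ == "__main__":
    files = ["/home/me/data/input.tar.gz", "/home/me/data/params.txt"]
    for src, dst in stagein_destinations(1234, files).items():
        print(f"{src} -> {dst}")
```

Since the staged data are auto-deleted after a few days, such a list also doubles as a reminder of what still has to be copied to a safe place.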

FYI: you just used Grid computing tools for your data transfer! Not really complicated, is it?