Use of/feedback on the Terradue platform for the Solar Pilot


#1

Hello,

I successfully used the Terradue (T2) platform to create a WPS for our model, McClear. The WPS uses the platform's WPS system and the Hadoop backend for the computation. To help the NextGEOSS pilot developers, I describe my experience with each platform step in the following sections:
1 - Account creation,
2 - Setup VPN and SSH
3 - Virtual machine setup
4 - Making real business

1 - Account creation.

The account creation was not as smooth as I expected, because of two mistakes at the very start. The first was that I did not enter my email address correctly, and it took several days to track the problem down. Once we figured it out, T2 support updated my account with the correct email address and things went better, although I still had to fix every occurrence of the misspelled address in the account. The second mistake was that I used my email address as the login name. This does not work, because you will later need a login name that is valid as a Unix login, and email addresses are not. So once again the T2 team updated my account, this time with a better login name. The first mistake was obviously my fault, but the second is more a platform issue: the sign-up should check that the login is valid and give hints to help the user choose one. In the end I have a usable account.
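To illustrate the kind of check I have in mind, here is a minimal Python sketch. The pattern is an assumption: it is a conservative rule that many Linux systems accept for user names, not the rule the T2 platform actually applies, and both function names are hypothetical.

```python
import re

# Assumption: conservative user-name rule accepted by many Linux systems:
# a lowercase letter or underscore first, then lowercase letters, digits,
# underscores or hyphens, with at most 32 characters in total.
LOGIN_PATTERN = re.compile(r"[a-z_][a-z0-9_-]{0,31}")


def is_unix_friendly_login(name: str) -> bool:
    """Return True if the whole name matches the conservative pattern."""
    return LOGIN_PATTERN.fullmatch(name) is not None


def suggest_login(email: str) -> str:
    """Derive a candidate login from the local part of an email address."""
    local_part = email.split("@", 1)[0].lower()
    # Drop every character the pattern does not allow (dots, plus signs, ...).
    candidate = re.sub(r"[^a-z0-9_-]", "", local_part)
    # Make sure the first character is a letter or underscore.
    if not candidate or not re.match(r"[a-z_]", candidate):
        candidate = "u" + candidate
    return candidate[:32]


print(is_unix_friendly_login("jane.doe@example.org"))
print(suggest_login("Jane.Doe@example.org"))
```

A sign-up form could run a check like this and offer the suggestion as a hint, which would have saved me the second round trip with support.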

2 - Setup VPN and SSH.

This step is quite technical, but you just have to follow the instructions here [1]. I customized the setup a little, because I already use SSH keys for other purposes. Be careful not to overwrite an existing SSH key when following those instructions. Personally, I installed my SSH key files as $HOME/.ssh/id_rsa.terradue{.pub} and changed $HOME/.ssh/config to use that key for T2 machines. The user name to use on the virtual machine is the login name of your account.
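For those who have not edited $HOME/.ssh/config before, here is a small Python sketch of the kind of entry I mean, plus a check that a key file name is still free before generating a new key. The host name "my-sandbox" and the login "jdoe" are placeholders I made up; use your own virtual machine address and account login. Host, User, IdentityFile and IdentitiesOnly are standard OpenSSH client options.

```python
from pathlib import Path


def ssh_host_stanza(host_pattern: str, user: str, key_file: str) -> str:
    """Build an ssh_config entry that binds a dedicated key to some hosts."""
    lines = [
        f"Host {host_pattern}",
        f"    User {user}",
        f"    IdentityFile {key_file}",
        # Only offer the key named above, not every key the agent holds.
        "    IdentitiesOnly yes",
    ]
    return "\n".join(lines) + "\n"


def key_is_free(private_key: Path) -> bool:
    """True if neither the private key nor its .pub companion exists yet."""
    public_key = private_key.with_name(private_key.name + ".pub")
    return not private_key.exists() and not public_key.exists()


key = Path.home() / ".ssh" / "id_rsa.terradue"
print("safe to create key:", key_is_free(key))
# Placeholder host and login; replace them with your own values.
print(ssh_host_stanza("my-sandbox", "jdoe", "~/.ssh/id_rsa.terradue"))
```

The sketch only prints the entry. I prefer to paste it into the config file by hand, so that nothing already in there gets overwritten.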

For the VPN, I followed the instructions here [1]. I went to the OpenVPN web page and downloaded client.ovpn, since I use Linux and my distribution provides OpenVPN. I gave this file to NetworkManager, the software that manages the network on my computer, and everything worked on my personal network. At this point I asked my IT department for authorization to connect to the VPN from the MINES ParisTech network. VPNs raise security concerns because their main feature is to connect two networks together. After some negotiation, and after our network routers were set to allow VPN over UDP, I finally got a working VPN.

3 - Virtual machine setup.

This step was easy and smooth. I followed the instructions in the support tickets [2] and a few minutes later I got a new virtual machine. Since I had a proper login name set up in my account, an SSH key, and the VPN up and running, I could connect to the machine without any password.

4 - Making real business.

Now it’s time to compute something. My first choice was to implement a WPS of McClear, because this model does not use a large amount of data (about 25 GB to cover a 2-year timespan), it is nearly a standalone executable, and it uses almost every technology we will use during the development of our Energy pilots. First I gathered and uploaded all the required data and sources. This step was quite smooth because we have a 30 MB upload bandwidth at MINES ParisTech, but it may be difficult with a low upload rate. Next I had to build the sources. For this step I needed to install some libraries and software with 'sudo yum', nothing uncommon for developers. The main issue was that the OS distribution is quite old and provides old "autotools" and an old compiler. For the "autotools", I updated the scripts and configuration files to work with the older version; in that case the gap wasn’t large. For the compiler, however, I had to build gcc-4.9 from source, as it is the minimum version of gcc that compiles our code. These are things I’m used to doing, and in the end I got a working McClear executable.

The next step was to set up the Hadoop scripts and files to provide the WPS service and call my executable. To do so, I went through the "hands-on" tutorials provided here [3] to get used to Hadoop. I ran into an issue with the Firefox browser, which did not display the manager platform correctly, so I had to switch to Google Chrome. Once on Google Chrome everything was OK and my testing nodes worked. Finally, I customized the WPS "hands-on" example to get my own WPS running the McClear model.

I still have some issues to deal with, such as the Git repository and the upload of about 500 GB of data, but the T2 platform is promising.
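To give an idea of what the upload means in practice, here is a rough Python estimate. It assumes decimal units (1 GB = 1000 MB), a link that is fully available for the transfer, and no protocol overhead. Since a "30 MB" link can be read as megabytes or megabits per second, the sketch computes both readings; the function name is my own.

```python
def upload_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to send size_gb gigabytes at rate_mb_per_s megabytes/s."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    seconds = size_gb * 1000.0 / rate_mb_per_s  # decimal units: 1 GB = 1000 MB
    return seconds / 3600.0


# Two readings of a "30 MB" link: megabytes/s, or megabits/s (divide by 8).
RATES = (("30 MB/s", 30.0), ("30 Mbit/s", 30.0 / 8.0))

for size_gb in (25, 500):
    for label, rate in RATES:
        print(f"{size_gb} GB at {label}: {upload_hours(size_gb, rate):.1f} h")
```

Under these assumptions, 500 GB takes between roughly 4.6 hours and about 37 hours depending on the reading, so anyone with a slower uplink should plan the data transfer well in advance.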

[1] http://docs.terradue.com/developer-sandbox/start/laboratory/
[2] https://support.terradue.com/
[3] http://docs.terradue.com/developer-sandbox/developer/index.html