The following sections describe the concept of the CEF. Whether the CEF is used is controlled by the system parameter useCEF, which accepts the values FALSE and TRUE. If useCEF is FALSE, the whole calculation is executed in the PSE on the local machine. The scientist can use all features of the PSE, such as debugging and printing, but must wait until the calculation has finished. Without parallel extensions a PSE uses only one core, so long-running calculations can take considerable time, depending on the power of the computer used. If the scientist sets useCEF to TRUE, the CEF is used: the Code Execution Controller (CEC) starts the required number of VMs, transmits the calculation to the VMs, executes the calculations, and generates the combined result.
The administrator of the CEF must define which cloud platforms (e.g. Amazon EC2, Eucalyptus) are used. For each cloud platform he/she must set (a) which machine types should be used (e.g. m1.small, m1.xlarge), (b) how many instances can be started simultaneously, (c) the shutdown behavior (e.g. shut down immediately after all waiting calculations are finished, or just before the researcher would have to pay for another hour for an idle machine), and (d) the total daily/monthly budget available for this cloud platform.
The CEC can call the WorkerNodeStatus Web service of each VM to request the number of available cores, the core usage, and the total and available memory. At the moment the CEC starts up to the maximum available number of virtual machines if required; the maximum cost boundary is not yet implemented. The CEC stores all started VMs in a queue. If a calculation is waiting, the first free VM is used for its execution. In terms of security, a worker node (VM) only accepts requests from the CEC that started it.
When a calculation at a worker node has finished or failed, the result and log information are sent to the CEC, and afterwards all files from this calculation are deleted immediately. In the future, the CEC will send only sub-calculations from a single user to a worker node at any one time, even if multiple cores are available. It will then be impossible to spy on the data of other users by executing malicious PSE code.
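The per-platform settings and the two shutdown behaviors described above can be illustrated with a minimal sketch. All names here (CloudPlatformConfig, ShutdownPolicy, should_shut_down, the five-minute safety margin) are hypothetical and chosen for illustration only; they are not taken from the CEF implementation, and hourly billing is assumed as in the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ShutdownPolicy(Enum):
    # Stop the VM as soon as no calculation is waiting.
    IMMEDIATELY = "immediately"
    # Keep an idle VM until the already paid hour is nearly over.
    BEFORE_NEXT_BILLING_HOUR = "before_next_billing_hour"


@dataclass
class CloudPlatformConfig:
    name: str                  # e.g. "Amazon EC2" or "Eucalyptus"
    machine_types: List[str]   # e.g. ["m1.small", "m1.xlarge"]
    max_instances: int         # instances that may run simultaneously
    shutdown_policy: ShutdownPolicy
    monthly_budget_eur: float  # stored only; the text notes it is not yet enforced


def should_shut_down(config, waiting_calculations, minutes_into_billing_hour,
                     safety_margin_minutes=5):
    """Decide whether an idle VM should be stopped now (illustrative rule)."""
    if waiting_calculations > 0:
        return False  # there is still work this VM could take
    if config.shutdown_policy is ShutdownPolicy.IMMEDIATELY:
        return True
    # Otherwise wait until shortly before the next billing hour would start.
    return minutes_into_billing_hour >= 60 - safety_margin_minutes
```

Under the second policy an idle VM stays available for newly arriving calculations during the time that has already been paid for, which avoids the start-up delay of a fresh instance.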
The advantages for the scientist are that (a) the result is available much faster than when running locally, (b) the client computer can be used for other purposes in the meantime, (c) the status of the calculation can be looked up at the CEF-Portal, (d) the result can be downloaded to another computer, and (e) other PSE code can be executed even if the required PSE is not installed locally.
Figure 6 shows an overview of the whole CEF. It provides a framework for executing code from different PSEs, including MATLAB, R, and Octave. The system consists of four main parts: (a) the Code Execution Controller (CEC) Web application, (b) the different client libraries, (c) the Cloud infrastructure, and (d) the required Code Execution Framework virtual machine. Components depicted in color represent third-party libraries that are being reused.
Components description
In this section we describe the components in detail.
Code Execution Controller (CEC)
The CEC consists of the Code Execution Java Library, two different Web service groups (client and worker node services), and a MySQL database that stores all calculations and sub-calculations. The Java library is at the heart of the CEF. It provides methods to produce sub-calculations, to start and stop virtual machines, to copy the code to be executed into S3 (AWS) or Walrus (Eucalyptus), and to monitor running calculations and virtual machines. The CEF supports parallel code execution at the method level: the same method is executed in parallel with different parameter sets.
The client Web services support online execution of functions, methods, and scripts written in different PSEs (e.g. MATLAB, R, Octave). The user of the system communicates with the client Web services, while the worker node services are used only internally. These two groups of services are described in the following.
● Client services - These services must be invoked by one of the clients (e.g. Octave, MATLAB, R, Liferay and Taverna). They include
– Execute Calculation - this service can be used to start a new calculation. In order to execute a new calculation, all parameters for parallelization needed in the code are passed as comma-separated values to the service. At the beginning, this service generates parallel executable sub-calculations (same method with different parameters). Afterwards the PSE code will be stored at S3/Walrus to reduce time and data transfer for parallel execution. Finally, the sub-calculations will be transmitted to a free worker node virtual machine (VM) to be executed.
– Calculation Status - this service allows for the monitoring of the code execution by requesting the current status, which can either be compiling, waiting, running, finished, or error. The status can be requested either for the entire calculation or for each sub-calculation.
– Load Calculation Results - this service loads the results from either the entire calculation or from each sub-calculation.
– Load Calculation Logs - this service loads the logs from sub-calculations. This includes all output on the console from the used PSE.
– Load Calculation Code - this service can be used to download already executed code from the CEC. The download is a compressed zip file that contains the source code and, in the case of MATLAB code, the compiled code as well. It can be reused as the code for additional executions with different parameter sets. If the file already contains compiled MATLAB code, the CEC recognizes it and skips the compilation step, so the execution is faster than without the compiled code; depending on the amount of code, compilation can take from a few seconds up to a couple of minutes.
– Load All Available Calculations - this service returns all accessible calculations of the authenticated user. The Code Execution Liferay [37] portlet uses this method to show an overview of the calculations.
● Worker Node services - These services will be invoked by the worker node VM. They include
– Calculation Finished - this service informs about successfully finished sub-calculations and receives the calculation results and logs from the VM.
– Calculation Finished with Failure - in case the calculation finished with errors, then this service receives the calculation logs from the VM.
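As an illustration of the Execute Calculation and Calculation Status services listed above, the following minimal sketch generates one sub-calculation per row of a comma-separated parameter set and derives a status for the entire calculation. The names SubCalculation, generate_sub_calculations, and overall_status are hypothetical, and the aggregation rule is an assumption; the text only states which status values exist.

```python
import csv
import io
from dataclasses import dataclass

# Status values named in the text. "compiling" applies before any
# sub-calculation exists, so it is not produced by the functions below.
STATUSES = ("compiling", "waiting", "running", "finished", "error")


@dataclass
class SubCalculation:
    index: int
    parameters: list          # one row of the parameter set, as strings
    status: str = "waiting"


def generate_sub_calculations(parameter_csv):
    """Create one sub-calculation per non-empty row of the CSV parameter set."""
    rows = (row for row in csv.reader(io.StringIO(parameter_csv)) if row)
    return [SubCalculation(index, row) for index, row in enumerate(rows)]


def overall_status(sub_calculations):
    """Assumed aggregation rule for the status of the entire calculation."""
    states = {sub.status for sub in sub_calculations}
    if "error" in states:
        return "error"
    if states == {"finished"}:
        return "finished"
    if "running" in states or "finished" in states:
        return "running"
    return "waiting"


subs = generate_sub_calculations("1000,1\n1000,2\n1000,3\n")
subs[0].status = "finished"
print(len(subs), overall_status(subs))
```

Each sub-calculation calls the same method with a different parameter row, which is the level of parallelism the CEF supports.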
Supported clients
The CEF is easily accessible from different clients. Each user is able to communicate with the CEF from within R/Octave/MATLAB, from the workflow engine Taverna, or even from the Web without needing to install any specific environment. To support Taverna we implemented a Taverna activity that is able to use the Web services of the CEF. We provide several R/Octave/MATLAB code examples (e.g. PI calculation, recursive CEF invocation, downloading code and re-executing the downloaded code). All Web services described above can be used with these client libraries, which have been tested on Windows, Linux, and OS X. Additionally, a researcher is able to start new calculations or monitor running calculations within our Web portal (Liferay). Each client/toolbox communicates with the client CEC Web services.
Cloud infrastructure
The CEF uses the EC2 API to communicate with the cloud infrastructure. The controller needs to start and stop instances in the cloud and to store data in the data storage (Walrus/S3). All these steps can be done with the AWS SDK for Java and the JetS3t library.
Code Execution Framework virtual machine
We provide a specific worker node virtual machine (Amazon EC2 and Eucalyptus) for the execution of the different PSE code. On this VM all three PSEs (R, Octave, MATLAB Component Runtime) are installed and a Tomcat application server is running, hosting Code Execution Services of the CEF.
The worker node Web application provides several different Web services for the CEC. They include:
● Execute Calculation - this service can be used to start a new calculation at the specific worker node. In order to execute a new calculation, all parameters needed in the code are passed as comma-separated values to the service. The worker node downloads the required PSE code from Walrus/S3. All information and error output is stored in files during the whole calculation. After the calculation has finished or failed, the result and log information are sent back to the CEC and all files are deleted.
● Worker Node status - this service returns information about the worker node, such as total and used memory, number of available cores, used cores, etc. The worker node uses the SIGAR (System Information Gatherer And Reporter) Java library to request the required values from the machine.
● Load Calculation status - this service returns information about one specific sub-calculation, such as used memory, used CPU, etc.
● Load Calculation Logs - this service returns the log of a running calculation.
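The worker-side behavior of Execute Calculation described above (run the code, write all console output to a file, return result and log, delete all files) could be sketched as follows. This is not the CEF implementation, which is a Java Web application; the function name run_sub_calculation, the file names console.log and result.csv, and the default timeout are assumptions made for illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_sub_calculation(command, timeout_seconds=3600):
    """Run one command in a private directory, then delete that directory."""
    workdir = Path(tempfile.mkdtemp(prefix="cef_"))
    log_path = workdir / "console.log"
    try:
        # All console output (stdout and stderr) is stored in a log file.
        with open(log_path, "w") as log:
            completed = subprocess.run(
                command,
                cwd=workdir,
                stdout=log,
                stderr=subprocess.STDOUT,
                timeout=timeout_seconds,
            )
        result_path = workdir / "result.csv"
        result = result_path.read_text() if result_path.exists() else None
        # The contents are read into memory before the files are removed.
        return {
            "succeeded": completed.returncode == 0 and result is not None,
            "result_csv": result,
            "log": log_path.read_text(),
            "workdir": str(workdir),
        }
    finally:
        # Deleting everything frees disk space and leaves no user data behind.
        shutil.rmtree(workdir, ignore_errors=True)
```

Placing the cleanup in a finally clause means the files are removed whether the calculation finishes, fails, or exceeds the timeout, which matches the requirement that no data of a previous calculation remains on the worker node.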
Execution sketches
In this section we walk through a complete execution sketch.
Figure 7 shows the whole calculation process in more detail. The arrows show the direction of the communication between the involved systems. At the moment, the CEF can exchange CSV data. To be more generically usable in the future, we are planning to support HDF5 [38] as well. The whole code execution workflow can be started from within a supported PSE, Taverna, or the Web. Each client has to prepare the code and the parameter data. In the first step the client converts the parameter set (e.g. MATLAB cells or arrays) to a CSV string and zips the required code files (step 1). The zip file contains the PSE code and a text file (Java properties file) that includes information about the PSE used, the compilation status, the function name, and its input/output parameters. The maximum number of sub-calculations that can be executed in parallel is the number of rows of the parameter set. At the moment, the CEC starts one sub-calculation per row on idle VMs. In the future, the CEC will be able to execute several sub-calculations with one Web service invocation at one worker node VM to reduce the transfer and Web service overhead. The number of VMs started depends on (a) the number of available worker nodes, and (b) the duration of a single sub-calculation.
After the data preparation the client invokes the executeCalculation Web service at the CEC (step 2). The CEC (a) stores the received data on disk, (b) compiles the MATLAB source code, if required (for further information see Section ‘MATLAB Component Runtime approach’), (c) generates the sub-calculations, (d) adds all sub-calculations to the calculation queue (step 3), and (e) starts additional Code Execution VMs, if required (steps 4 and 5). A dedicated thread processes the calculation queue. For each calculation the code is sent once to Walrus or S3, depending on the cloud infrastructure used (step 6). This reduces the amount of transmitted data and thus the required time and costs.
Afterwards the sub-calculation is executed on an idle Code Execution VM (step 7). The worker node (a) requests the code from Walrus/S3 (step 8), (b) executes the code in the shell (step 9), (c) generates the result CSV, (d) sends the result back to the CEC (step 10), and (e) deletes all generated files. Step (e) is important for keeping enough disk space free; otherwise a new instance would have to be started whenever a worker node runs out of disk space for further calculations. The deletion is also required for security reasons. At the end of the execution, the CEC checks the received data and updates the status information of the calculation. The researcher is able to request the status of the calculation (e.g., running, finished) and the results. To download the result, the client invokes the loadCalculationResult Web service method with the id of the calculation (step 11). The CEC (a) authorizes the user, (b) checks whether the calculation is finished, and (c) generates the result CSV. Finally, the client converts the received CSV result set to the internal data structure of the corresponding PSE.
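The client-side data preparation (step 1) can be sketched as follows: the parameter set is converted to a CSV string, and the code files are zipped together with a properties file. The function names, the file name calculation.properties, and the property keys (pse, compiled, function) are assumptions for illustration; the text states only which kind of information the properties file carries, not its exact layout.

```python
import csv
import io
import zipfile


def parameters_to_csv(rows):
    """Convert a parameter set (one row per sub-calculation) to a CSV string."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def build_code_archive(code_files, properties):
    """Zip the code plus a Java-style properties file and return the zip bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, source in code_files.items():
            archive.writestr(name, source)
        # Hypothetical key=value layout of the properties file.
        lines = [f"{key}={value}" for key, value in properties.items()]
        archive.writestr("calculation.properties", "\n".join(lines) + "\n")
    return buffer.getvalue()


# Three rows, hence at most three parallel sub-calculations.
parameter_csv = parameters_to_csv([[1], [2], [3]])
archive_bytes = build_code_archive(
    {"square.m": "function y = square(x)\ny = x * x;\nend\n"},
    {"pse": "octave", "compiled": "false", "function": "square"},
)
```

The CSV string and the archive bytes are the two payloads a client would then pass to the executeCalculation Web service in step 2.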
MATLAB Component Runtime approach
The MATLAB Component Runtime (MCR) enables a cloud node to execute compiled MATLAB methods without the need for a costly MATLAB license. In [39] MathWorks writes “All deployed components and applications can be distributed free of charge. The deployment products support the MATLAB language, most MATLAB toolboxes, and user-developed GUIs.” In order to use the MCR, the MATLAB method needs to be compiled into a standalone application, which can then run without the MATLAB interpreter. The following text segment, taken from the MATLAB Compiler toolbox documentation, clearly shows the drawback of this approach: “... the components generated by the MATLAB Compiler product cannot be moved from platform to platform as is.” In order to deploy a MATLAB method to a machine whose operating system differs from that of the machine used to develop the method, it is necessary to rebuild the program on the targeted platform. To solve this problem we created and deployed a MATLAB compiler Web service on another machine with the same operating system as our worker node VM (Ubuntu 11.04). For this compile service we need a MATLAB license with all required toolboxes and, additionally, the MATLAB Compiler toolbox. The administrator of the MATLAB compiler Web service must determine which toolboxes are installed. If a user nevertheless tries to use a MATLAB toolbox that is not installed, the compile step (the first step) fails and a corresponding error is reported to the user. In our online test installation no additional toolboxes are installed. With this setup, every user of the CEF is able to execute MATLAB source files without having to buy a MATLAB license.
MathWorks products license example
To calculate the license cost with and without the CEF, the following six assumptions are made: (1) the company is allowed to use the academic price list (2013); (2) five researchers of the company are using MATLAB on their computers (individual licenses); (3) all researchers must have all six Computational Finance toolboxes (Financial Toolbox, Econometrics Toolbox, Datafeed Toolbox, Database Toolbox, Spreadsheet Link EX, and Financial Instruments Toolbox); (4) the license for MATLAB itself costs € 500 (single named user or single computer); (5) all Computational Finance toolboxes cost € 200 each; (6) the MATLAB Compiler toolbox costs € 500.
With these assumptions, the total license costs without the CEF are € 8500 (for each of the five users, the MATLAB license plus all six Computational Finance toolboxes). In the best case, the total license costs with the CEF are € 2200 (one MATLAB license for a single machine, all six Computational Finance toolboxes, and the MATLAB Compiler toolbox). It must be taken into account, however, that without a valid MATLAB license for each user the development process is more complicated (e.g. no debugging, no GUI, no auto-completion).
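The two totals follow directly from the six assumptions, as the short calculation below shows. The variable names are ours; the prices and counts are the ones stated above.

```python
MATLAB_LICENSE = 500      # EUR, assumption (4)
TOOLBOX_PRICE = 200       # EUR per toolbox, assumption (5)
FINANCE_TOOLBOXES = 6     # assumption (3)
COMPILER_TOOLBOX = 500    # EUR, assumption (6)
RESEARCHERS = 5           # assumption (2)

# One fully equipped seat: MATLAB plus all six toolboxes.
per_seat = MATLAB_LICENSE + FINANCE_TOOLBOXES * TOOLBOX_PRICE

# Without the CEF every researcher needs a full seat.
cost_without_cef = RESEARCHERS * per_seat

# With the CEF, in the best case, only the compile service needs a seat,
# plus the MATLAB Compiler toolbox.
cost_with_cef = per_seat + COMPILER_TOOLBOX

print(per_seat, cost_without_cef, cost_with_cef)  # prints: 1700 8500 2200
```

Under these assumptions the best-case difference is € 6300.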