Parallel Programming Contest at CESGA
In 2017 the Spanish Parallel Programming Contest will run on CESGA systems.
The contest takes place on September 20th, from 15:30 to 19:30 (Spanish time) in Málaga, coinciding with the Jornadas Sarteco. Unofficial, off-site participation (exhibition) is also allowed.
The participation rules can be found on the website of the Spanish Parallel Programming Contest.
Registration is open until September 14th, through the Mooshak system at CESGA. Ask for an account in the test “warmup+registration”, group “inscription2017” for regular participation, or group “inscriptionExhibition” for exhibition participation. You should receive an e-mail with the account information within a few hours. If it does not work (it fails for the mail addresses of some universities), send an e-mail through CONTACT on the website of the Spanish Parallel Programming Contest.
Contests “practice2011” and "2017ClaA" are open in the Mooshak system at CESGA. "practice2011" contains problems from the 2011 contest, and "2017ClaA" contains the problems of the qualification test. Both can be used for practice. To request an account, connect to Mooshak and register in the group “practice”.
General rules for the test warmup+registration, and for the final contest:
The test "warmup+registration" contains an example of a simple matrix multiplication, with example implementations in OpenMP, MPI, CUDA and XeonPhi (with OpenMP in offload mode).
There are problems to be solved with OpenMP (gcc v 6.1.0, C or C++), MPI (gcc v 6.1.0 + openmpi with MPI 2.0), CUDA (7.5) and XeonPhi (Intel compiler, with OpenMP in offload mode).
OpenMP solutions use a multicore node with two Haswell 2680v3 processors (24 cores in total). MPI solutions use a maximum of four nodes of this type. CUDA solutions use an NVIDIA Tesla K80. XeonPhi solutions use an Intel Xeon Phi 7120P.
After entering the selected test, there are five options:
-A problem can be selected and viewed. It is presented together with:
The scheme Mooshak uses for input/output, timing and speed-up calculation. This file is not modified by the contestants.
A file with the sequential solution provided by the organization. The contestants should modify the function which solves the problem, parallelizing it to obtain the maximum speed-up with respect to the sequential version. Speed-up can be obtained through parallelization or optimization of the sequential version. In the contest “warmup+registration”, a parallelization example is provided for each problem.
An example input file, similar in problem size and number of problems to the one used to assess the problem.
-The file where the contestants have written their solutions should be selected and submitted.
-Questions can be sent to the organizers with the Ask option.
-The Help option gives general information on Mooshak (it is not of use for our contest).
-Submissions, Ranking, and Questions can be viewed.
The number of nodes and MPI processes and the problem type are indicated with comments at the beginning of the file sec.cpp (or sec.c, or sec.cu for CUDA programs). The problem we are working with is indicated in the comments and by selecting the corresponding problem when file sec.cpp is sent to the evaluation system.
The comments at the beginning of sec.cpp include five variables:
CPP_CONTEST=warm2017 or 2017 (do not modify)
CPP_PROBLEM=Name of the problem. It must not be modified.
CPP_LANG= C+OPENMP or CPP+OPENMP for OpenMP problems, C+MPI or CPP+MPI for MPI problems, CUDA for CUDA problems, and XEONPHI for problems on XeonPhi.
CPP_NUM_NODES=Number of nodes. 1 for OpenMP, CUDA or XeonPhi, and from 1 to 4 for MPI.
CPP_PROCESSES_PER_NODE=Number of MPI processes per node. 1 for OpenMP, CUDA or XeonPhi.
In OpenMP problems, the 24 cores of the reserved node can be used. The number of threads should be indicated in the code (sec.cpp), for example with omp_set_num_threads.
In MPI problems, the number of MPI processes to be started in each node is indicated. All the cores in the reserved nodes can be used. The program is compiled without the openmp option, so hybrid MPI+OpenMP parallelism is not allowed.
In CUDA and XeonPhi problems, a node is reserved, but only one GPU or XeonPhi in the node is used, and only one core of the CPU is used.
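For the OpenMP case described above, a minimal sketch of what a sec.cpp might look like follows. The layout of the header comments, the placeholder problem name and the function name matmul are assumptions made for illustration only. The file supplied by the organizers defines the real comment syntax and the signature of the function to parallelize. The _OPENMP guard is there only so that the file still builds when the OpenMP flag is absent.

```cpp
// Header comments in an ASSUMED layout; copy the real ones from the provided file.
// CPP_CONTEST=warm2017
// CPP_PROBLEM=<name set by the organizers, do not modify>
// CPP_LANG=CPP+OPENMP
// CPP_NUM_NODES=1
// CPP_PROCESSES_PER_NODE=1

#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical solver: computes C = A * B for n x n matrices stored row-major.
void matmul(const std::vector<double>& a, const std::vector<double>& b,
            std::vector<double>& c, int n) {
    const std::size_t cells = static_cast<std::size_t>(n) * static_cast<std::size_t>(n);
    assert(a.size() == cells && b.size() == cells && c.size() == cells);

#ifdef _OPENMP
    // The thread count is set in the code; 24 matches the cores of the reserved node.
    omp_set_num_threads(24);
    // Rows of C are independent, so the outer loop is shared among the threads.
    #pragma omp parallel for
#endif
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            double sum = 0.0;
            for (int k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}
```

Using fewer than 24 threads can be worth trying for small inputs, where the cost of starting and synchronizing threads may outweigh the gain.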
A submission which solves the corresponding problem appears with the time (milliseconds) spent on the solution. Other answers can be Compilation Error, Wrong Answer, Time Limit Exceeded, Evaluating. Evaluating means the submission has not yet gained access to the requested resources. Currently, some submissions queue up. In the qualifying contest and in the final contest access will be guaranteed. Duplicate submissions with exactly the same code are not accepted (you can modify the submission by adding a comment). Sometimes, after a submission, the system says “can't read errorCode: no such variable”. Just resend.
In Classification, the score for each problem and contestant and the total score of the contestant are shown. For each problem, the score is shown followed, in brackets, by the execution time, the speed-up minus 1, the number of submissions counted for penalization and the total number of submissions. Submissions with compilation errors are not penalized. For each problem, 1 point is subtracted from the speed-up for each submission over 10.
The score awarded is calculated on the basis of the speed-up achieved with respect to the sequential solution given by the organizing committee. A problem is awarded zero points when no correct solution is obtained or when the speed-up is lower than one.
The inputs used for testing do not lead to long execution times (less than 40 seconds).
A maximum score is established for each problem. When the speed-up is higher than this value, the score for this problem is the maximum value, and the scores of the other contestants are calculated by interpolation.
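The scoring rules above can be sketched as a small function. This is only an illustration under stated assumptions, not the formula the organizers actually use: the function name problem_score and its parameters are hypothetical, the score is taken to be the penalized speed-up itself (the text does not say whether 1 is subtracted first), a negative value after penalization is clamped to zero, and the interpolation applied to the other contestants when someone exceeds the maximum is left out because its form is not specified.

```cpp
#include <algorithm>
#include <cassert>

// Sketch of a per-problem score under the assumptions stated in the text above.
// t_seq_ms: time of the organizers' sequential solution, in milliseconds.
// t_par_ms: time of the contestant's best correct submission, in milliseconds.
// counted_submissions: submissions counted for penalization
//                      (compilation errors are excluded).
// max_score: maximum score established for the problem.
// solved: whether a correct solution was obtained at all.
double problem_score(double t_seq_ms, double t_par_ms, int counted_submissions,
                     double max_score, bool solved) {
    if (!solved) {
        return 0.0;  // no correct solution: zero points
    }
    assert(t_seq_ms > 0.0 && t_par_ms > 0.0);

    const double speedup = t_seq_ms / t_par_ms;
    if (speedup < 1.0) {
        return 0.0;  // slower than the sequential version: zero points
    }

    // One point is subtracted for each counted submission over 10.
    const int penalty = std::max(0, counted_submissions - 10);
    const double penalized = speedup - penalty;

    // ASSUMPTION: clamp below at zero, and cap at the maximum score.
    return std::min(std::max(penalized, 0.0), max_score);
}
```

For instance, under these assumptions a solution that runs in 1000 ms against a sequential time of 8000 ms, reached after 12 counted submissions, gives a speed-up of 8 with a penalty of 2 points.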
Particularities of the practice2011 contest:
The comments at the beginning of sec.cpp are those used in the 2011 contests. To practise in the system at CESGA, they should be modified according to the instructions given above.
The user “secuencial” was used to run the sequential versions, and the user “records” for the best solutions for each problem.