NPRG058: Advanced Programming in Parallel Environment
Lab practice 01 - C++ sort
In this lab, we will focus on brushing up on your basic C++ skills. We will implement a trivial parallel sort using C++ threads only (or whatever native C++ standard library facilities you can remember).
Setup
First, we need to remember how to work with the cluster.
- Log in to parlab.ms.mff.cuni.cz (SSH runs on port 42222) using your LDAP credentials.
- Use SLURM tools like sinfo or squeue to look around. Check out the documentation at https://gitlab.mff.cuni.cz/mff/hpc/clusters if necessary.
- Make sure you can run jobs via srun. To get you started: the argument -p selects the partition (job queue), -A is your account, -n 1 makes sure your job is started only once (that's important if you use a single core, -c 1), and -c allocates the given number of cores (regular workers have 32 physical cores with HT, so 64 is the maximum). For instance:
    $> srun -p mpi-homo-short -A nprg058s -n 1 -c 64 ./your-app
- Set up your workspace so that you can easily edit the code and run it on the cluster. Please do not use the Remote SSH VSCode extension! It consumes an awful lot of memory on the parlab head node (which may lead to bad situations when many students do this simultaneously). A few tips:
  - Use WinSCP if you run Windows. You can open a file for remote editing (in VSCode or any other IDE you prefer), it is copied to local tmp and then synced back with every save transparently (just be patient after saving).
  - Use rsync or scp (on Linux or using WSL/MSYS/Cygwin) to sync your code with parlab. You can set a script or configure your IDE to run it after every save.
  - Edit files directly on parlab remotely via SSH/PuTTY (using Linux console IDEs like Vim).
Remember to run all your code on the workers, including the compilation (make). The head node parlab does not even have all the compilers/libraries installed, and heavy computations on the head node slow down the work of other users.
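To see how many hardware threads a program thinks it can use, a tiny C++ helper can be compiled and started through srun with different -c values. This is only a minimal sketch: the name worker_count is made up for illustration, and whether std::thread::hardware_concurrency() reflects the -c allocation or the whole machine depends on the system configuration, so treat the value as a hint.

```cpp
#include <cstddef>
#include <thread>

// Hypothetical helper (not part of the lab code): how many worker
// threads to start. hardware_concurrency() is only a hint and may
// return 0 when the value cannot be determined, so fall back to 1.
std::size_t worker_count()
{
    const unsigned int hw = std::thread::hardware_concurrency();
    return hw == 0 ? 1 : static_cast<std::size_t>(hw);
}
```

Printing worker_count() from main() and comparing the result for -c 1 and -c 64 shows whether the hint follows the allocation on your setup. If it does not, passing the thread count as a command-line argument is a simple alternative.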
Task
- Take the code prepared in /home/_teaching/advpara/labs/01-cpp-sort and copy it into your home folder so you can edit it.
- Modify the sort implementation to run concurrently, utilizing as many CPU cores as possible. Do not be afraid to use C++ std::sort() in your threads. The recommended approach is to use MergeSort or another divide-and-conquer sorting algorithm, but anything goes.
- Stretch goal: think about how to execute even the merges in MergeSort in parallel so that the level of parallelism does not decrease in the last steps of the algorithm.
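One possible shape of the recommended approach is sketched below. It is not the prepared lab code: the function name parallel_sort and its signature are assumptions made for illustration. Each thread sorts one chunk with std::sort(), and neighbouring sorted runs are then merged pairwise with std::inplace_merge(), one thread per merge.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper, not part of the lab skeleton.
// Sorts `data` in ascending order using up to `num_threads` threads.
template <typename T>
void parallel_sort(std::vector<T>& data, std::size_t num_threads)
{
    const std::size_t n = data.size();
    if (num_threads > n) num_threads = n;   // no point in empty chunks
    if (n < 2 || num_threads < 2) {         // nothing to parallelize
        std::sort(data.begin(), data.end());
        return;
    }

    // Chunk i covers the index range [bounds[i], bounds[i + 1]).
    std::vector<std::size_t> bounds(num_threads + 1);
    for (std::size_t i = 0; i <= num_threads; ++i)
        bounds[i] = i * n / num_threads;

    // Phase 1: every thread sorts its own chunk independently.
    {
        std::vector<std::thread> workers;
        for (std::size_t i = 0; i < num_threads; ++i)
            workers.emplace_back([&data, &bounds, i] {
                std::sort(data.begin() + bounds[i], data.begin() + bounds[i + 1]);
            });
        for (auto& w : workers) w.join();
    }

    // Phase 2: merge neighbouring sorted runs pairwise. Merges within one
    // level touch disjoint ranges, so they can run concurrently.
    for (std::size_t width = 1; width < num_threads; width *= 2) {
        std::vector<std::thread> workers;
        for (std::size_t i = 0; i + width < num_threads; i += 2 * width) {
            const std::size_t lo = bounds[i];
            const std::size_t mid = bounds[i + width];
            const std::size_t hi = bounds[std::min(i + 2 * width, num_threads)];
            workers.emplace_back([&data, lo, mid, hi] {
                std::inplace_merge(data.begin() + lo, data.begin() + mid,
                                   data.begin() + hi);
            });
        }
        for (auto& w : workers) w.join();
    }
}
```

A call such as parallel_sort(values, 64) would match the 64 hardware threads of a regular worker. Note that the number of concurrent merges halves at every level and the final merge runs on a single thread, which is exactly the loss of parallelism the stretch goal asks you to address. The sketch also skips error handling: if creating a thread throws, the already started threads are not joined.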
Note: If you wish to implement and debug the assignment locally, you will need bpplib (a header-only C++ library) to make it work. On Linux, just fix the path to bpplib in the Makefile. On Windows (Visual Studio), set a BPP_LIB environment variable that holds the path to bpplib's include subdirectory.