Advanced High Performance Computing

[307SM]
a.a. 2025/2026

2nd year of course - First semester

Attendance: not mandatory

  • 6 CFU
  • 48 hours
  • English
  • Trieste
  • Compulsory
  • Standard teaching
  • Oral Exam
  • SSD INF/01
  • Advanced concepts and skills
Curricula: HIGH PERFORMANCE COMPUTING AND DATA ENGINEERING
Syllabus

Knowledge and understanding: a complete understanding of modern architectures and how to exploit them efficiently with serial codes; a firm understanding of advanced concepts in HPC, parallelism, real-world parallelism, message passing with MPI and shared memory with OpenMP; an understanding of how to decompose a computational problem into its essential traits and to express its inherent parallelism; an understanding of the parallel versions of some widely used algorithms.

Applying knowledge and understanding: the ability to write a complex parallel code that couples different algorithms; to profile a code and understand its pitfalls in terms of computational and parallel efficiency; to optimize a code; to conduct a real-world project from initial planning to the deployment of an efficient code.

Making judgments: the ability to identify the parallel potential of a complex problem and to express it in the message-passing paradigm, the shared-memory paradigm, or both; the ability to evaluate the scalability and efficiency of complex codes and to understand how to optimize them.

Communication skills: the ability to explain to a technical audience, both verbally and graphically, advanced HPC concepts and a personal analysis of a problem; the ability to propose and discuss how to parallelize or optimize a compute-intensive problem.

Learning skills: the ability to learn independently more sophisticated or specialized HPC topics across different parallel paradigms; the ability to read, analyze and understand papers and books on parallel algorithms and their implementations.

Students must have passed the course 319SM-2 “High Performance Computing”, or an equivalent. Students must be comfortable with the basics of modern computer architecture, HPC, the message-passing and shared-memory paradigms, and basic MPI and OpenMP concepts.
It is also recommended that they have attended a course on algorithms.

1. Recap of basic concepts of computer architecture, HPC and parallelism
2. Advanced topics in computer architecture and code optimization
- advanced techniques on memory hierarchy
- vectorization
- exploitation of pipelines
- exploitation of OoO and ILP
- accessing performance counters
3. Advanced topics in parallelism
- parallelism and parallel programming models
- examples on parallel implementation of some classical algorithms
- designing parallel algorithms: partitioning, domain and functional decomposition, communication strategies, parallel granularity, processes and threads mapping, modularity and concurrency
- parallel performance and performance modelling
- profiling a code and evaluating results
4. Advanced topics in MPI
- details on all blocking, non-blocking and buffered communications, both point-to-point and collective
- overview and details of all collective communications
- advanced use of communicators, cartesian and graph communicators
- RMA (remote memory access) via MPI memory windows
- MPI shared-memory windows and explicit NUMA awareness
5. Advanced topics in OpenMP
- beyond for loops: how to decompose the workload, with some examples
- OpenMP tasks and their usage
- advanced affinity control and thread placement
- vectorization
6. Elements of GPU offloading

For computer architecture and code optimization
“Computer Architecture: A Quantitative Approach”, Hennessy and Patterson, Morgan Kaufmann, 6th ed. 2019
Chap. 1-5
“Computer Organization and Design”, Patterson and Hennessy, Morgan Kaufmann, 5th ed. 2014
Chap. 1, 4, 5
“Computer Systems: A Programmer’s Perspective”, Bryant and O’Hallaron, Pearson, 3rd ed. 2016
Chap. 1-3, 7, 9 as a general recap; Chap. 5, 6 on optimization
“Introduction to High Performance Computing for Scientists and Engineers”, Hager and Wellein, CRC Press, 2011
Chap. 1-3

For parallel programming
“Designing and Building Parallel Programs”, Ian Foster, 2003, available at http://www-unix.mcs.anl.gov/dbpp/text/book.html
Part 1
“High Performance Computing: Modern Systems and Practices”, Sterling, Anderson and Brodowicz, Morgan Kaufmann, 2018
Chap. 1-4, Chap. 6-8, Chap. 9-11, 13-14, 17-18, 21
“Introduction to High Performance Computing for Scientists and Engineers”, Hager and Wellein, CRC Press, 2011
Chap. 4-11, App. A
“Parallel Programming for Science and Engineering”, Victor Eijkhout, publicly available at https://web.corral.tacc.utexas.edu/CompEdu/pdf/pcse/

Lectures on theory topics, with the aid of slides projected in the classroom and/or the blackboard.
Tutorials conducted by lecturers and tutors on how to make the best use of HPC platforms (one of the available platforms will be chosen, and accounts will be provided for all students).
Lab sessions on different exercises proposed by lecturers and tutors.
Possibly, a few seminars by external guests.

Students are required to bring their own laptop and to have a production-ready environment (a Linux machine is strongly recommended) that allows them to edit and compile C/C++ code with OpenMP and MPI support.

1. The examination consists of two parts:

a. The implementation of a parallel code that solves a given problem (it may be possible to choose one project among several), and the production of a report that explains the problem analysis in detail, justifies the algorithmic and implementation strategies, and assesses the scalability and efficiency of the code.
b. An oral examination that starts from the analysis of the code and the report provided by the student. The discussion then expands to relevant topics presented during the course.

2. Every topic presented in the course may be discussed during the oral examination.

3. The following elements are evaluated:
a. The quality of the report (the metrics that define the quality of the report will be clearly explained), of the code and of the analysis provided.
b. The candidate’s mastery of the presented code and analysis.
c. The candidate’s mastery of any concept presented in the course that is discussed in the oral examination.
d. The candidate’s ease in presenting concepts and connecting different ideas, and in tackling new problems at first glance if they arise during the discussion.