TDT4200/exercise2/report.md

title: theory questions
date: 2025-09-16
author: fredrik robertsen

1) speed-up

the maximum speed-up from parallelization is given by Amdahl's law:

$$\text{speed-up}_{\max} = \frac{1}{\text{serial fraction}}$$

such that if a program is 20% serial, you will get at most a 5x speed-up by parallelizing, no matter how many processors you add.

i did not get any speed-up. in fact, my program is blazingly slow: creating an image at twice the original XSIZE and YSIZE took closer to 5 seconds with my MPI program, whereas the provided CPU program did it in about 3 seconds. this is probably down to several issues: i only ran the MPI program on my own laptop, so it likely did not leverage much actual parallel processing, and MPI overhead (process startup, communication) adds to the discrepancy. i might have been able to break even if i had run it on snotra. i also don't think my implementation is particularly fast, mostly because of the I/O part.

2) weak and strong scaling

  • strong scaling refers to increasing the number of processors while keeping the problem size fixed.
  • weak scaling refers to increasing both the problem size and the number of processors, so that the amount of work per processor stays fixed.

a thought scenario (from an education-mode AI): adding one or more people to the task of washing 10 dishes or 100 dishes. both examples are strong scaling, because the number of dishes stays the same even as we add people. the comparison highlights overhead, though: it might be harder to work together on 10 dishes than on 100. if you instead hired one person per 10 dishes to be washed, then we would be talking about weak scaling.

the main difference is that strong scaling only pays off once the problem size is big enough that the coordination overhead becomes negligible. weak scaling works at any problem size, because the problem is scaled proportionally to the number of processors.

3) SIMD vs SPMD

MPI is SPMD, single program multiple data.

this is apparent from the fact that a single program is launched (e.g. via mpirun) as multiple processes, each handling its own part of the data. "program" refers to how every process runs the same program code, using if-statements on the rank to differentiate behaviour.

SIMD is different in that it handles multiple data at the instruction level through vectorization: a single instruction performs arithmetic on many elements of an array simultaneously, rather than spinning up separate processes to do it in parallel. this requires hardware support in the CPU architecture, i.e. vector registers. i don't know much about this (yet), as we've focused on SPMD so far.