---
title: theory questions
date: 2025-09-16
author: fredrik robertsen
---
## 1) speed-up
the **maximum speed-up** from amdahl's law is given by
$$\text{speed-up}_{\max} = \frac{1}{\text{serial fraction}}$$
such that if a program is 20% serial, you will at most get a 5x speed-up from
parallelizing it, no matter how many processors you add.
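
for completeness, the full form of amdahl's law with $p$ processors and serial
fraction $s$ is shown below; the formula above is just its $p \to \infty$ limit.

$$S(p) = \frac{1}{s + \frac{1 - s}{p}} \;\xrightarrow{\;p \to \infty\;}\; \frac{1}{s}$$

with $s = 0.2$ and 4 processors this gives $S(4) = \frac{1}{0.2 + 0.8/4} = 2.5$,
well below the 5x ceiling.
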
i did not get any speed-up. in fact, my program is blazingly slow: creating an
image at twice the original XSIZE and YSIZE took closer to 5 seconds for my MPI
program, whereas the provided CPU program did it in about 3 seconds. this is
probably due to several issues, such as only running the MPI program on my own
laptop and thus likely not getting any real parallelism out of it. that,
combined with general MPI overhead, explains the discrepancy. i might have been
able to break even if i had run it on snotra. i just don't think my
implementation is particularly fast, mostly because of the I/O part.
## 2) weak- and strong scaling
- **strong scaling** refers to scaling by increasing the number of processors
while keeping the problem size fixed.
- **weak scaling** refers to scaling by increasing both the problem size and the
number of processors, so as to keep the amount of work per processor fixed.

thought scenario from education-mode AI: adding one or more people to the task
of washing 10 dishes, or to the task of washing 100 dishes. both examples are
strong scaling, because the number of dishes stays the same even as we add
people. it does, however, highlight overhead: it might be harder to work
together on 10 dishes than on 100. if you instead hire one extra person for
every 10 dishes you have to wash, so that the pile grows with the number of
washers, then we are talking about weak scaling.
the main difference is that strong scaling only pays off when the problem size
is large enough to begin with, such that the coordination overhead stays small
relative to the work. weak scaling works at any problem size, because the
problem is scaled proportionally to the number of processors.
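
to make the distinction concrete, here is a minimal C/MPI sketch of how the two
setups differ; `BASE_WORK` and the work split are made up for illustration, not
taken from my actual program.

```c
#include <mpi.h>
#include <stdio.h>

#define BASE_WORK 1000000L  /* hypothetical amount of work per processor */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* strong scaling: total problem size is fixed,
       so each rank's share shrinks as processors are added */
    long strong_total    = BASE_WORK;
    long strong_per_rank = strong_total / size;

    /* weak scaling: total problem size grows with the number of ranks,
       so each rank's share stays constant */
    long weak_total    = BASE_WORK * size;
    long weak_per_rank = weak_total / size;   /* always BASE_WORK */

    if (rank == 0) {
        printf("strong scaling: %ld work items per rank\n", strong_per_rank);
        printf("weak scaling:   %ld work items per rank\n", weak_per_rank);
    }

    MPI_Finalize();
    return 0;
}
```
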
## 3) SIMD vs SPMD
MPI is **SPMD**, single program multiple data.
this is apparent from the fact that we write a single program, and mpirun
launches several processes of it that each handle their own part of the data.
"single program" refers to how every process runs the same program code, using
only if-statements on the rank to differentiate what the different ranks do.
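
to illustrate, a minimal sketch of the SPMD pattern (not my exercise code, just
the bare structure): every rank runs the exact same source, and the rank check
is what splits the roles.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* same program everywhere; the if-statement decides who does what */
    if (rank == 0) {
        printf("rank 0 of %d: i would collect results and do the I/O\n", size);
    } else {
        printf("rank %d of %d: i would compute my own slice of the data\n",
               rank, size);
    }

    MPI_Finalize();
    return 0;
}
```

run with e.g. `mpirun -np 4 ./a.out` and the same executable prints four
different lines.
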
**SIMD** is different in that it handles multiple data at the instruction level
through vectorization, i.e. it can perform arithmetic on many elements of an
array simultaneously with a single instruction, rather than spinning up
separate processes to do it in parallel. this requires hardware support in the
cpu architecture, i.e. vector registers. though i don't know much about this
(yet), as we've focused on SPMD so far.
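
as a sketch of what SIMD looks like in practice, here is a small example
assuming an x86 cpu with AVX support (the `_mm256_*` intrinsics and the 8-wide
vectors are specific to that instruction set; compile with `-mavx`):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void)
{
    float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[8] = {8, 7, 6, 5, 4, 3, 2, 1};
    float c[8];

    /* one add instruction operates on eight floats at once,
       using 256-bit vector registers */
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    __m256 vc = _mm256_add_ps(va, vb);
    _mm256_storeu_ps(c, vc);

    for (int i = 0; i < 8; i++)
        printf("%.0f ", c[i]);   /* prints 9 eight times */
    printf("\n");
    return 0;
}
```
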