ex2: theory questions report
@@ -22,6 +22,15 @@ clean:
show: $(OUTDIR)/mandel.bmp
	feh $<

pdf: report.md
	pandoc report.md -o $(OUTDIR)/report.pdf --pdf-engine=typst

zip: $(OUTDIR)/report.pdf Makefile mandel_mpi.c bench.py
	zip $(OUTDIR)/handin.zip $^

unzip: $(OUTDIR)/handin.zip
	unzip $< -d $(OUTDIR)/handin

# --- cpu-based ---
$(OUT): $(SRC)
	$(CC) $(CFLAGS) -o $(OUT) $<
@@ -0,0 +1,56 @@
---
title: theory questions
date: 2025-09-16
author: fredrik robertsen
---

## 1) speed-up

the **maximum speed-up** you can get from parallelizing is bounded by the
serial portion of the program (Amdahl's law):

$$\text{speed-up} \le \frac{1}{\text{serial fraction}}$$

such that if a program is 20% serial, you will at most get a 5x speed-up by
parallelizing.
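
for completeness, the general form of Amdahl's law with $N$ processors and
serial fraction $s$ (textbook formula, not something i measured) reduces to the
bound above as $N \to \infty$:

$$S(N) = \frac{1}{s + \frac{1 - s}{N}}, \qquad \lim_{N \to \infty} S(N) = \frac{1}{s} = \frac{1}{0.2} = 5$$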

i did not get any speed-up. in fact, my program is blazingly slow: creating an
image at twice the original XSIZE and YSIZE took closer to 5 seconds for my MPI
program, whereas the provided CPU program did it in about 3 seconds. this is
probably due to several issues, such as only running the MPI program on my own
laptop and thus likely not leveraging much actual parallelism; that, together
with other MPI overhead, accounts for the discrepancy. i might have broken even
if i had run it on snotra. i just don't think my implementation is particularly
fast, mostly because of the I/O part.

## 2) weak and strong scaling

- **strong scaling** refers to increasing the number of processors while
  keeping the problem size fixed.
- **weak scaling** refers to increasing both the problem size and the number
  of processors so as to keep the amount of work per processor fixed.

thought scenario from education-mode AI: adding one or more people to the task
of washing 10 dishes or 100 dishes. both examples are strong scaling, because
the number of dishes stays the same even as we add people. it does highlight
overhead, though: it might be harder to work together on 10 dishes than on 100.
if, instead, you hired one extra person for every 10 extra dishes you had to
wash, then we would be talking about weak scaling.

the main difference is that strong scaling only becomes worthwhile once the
problem is big enough to begin with, such that the coordination overhead stays
small relative to the useful work. weak scaling works at any problem size,
because the problem is scaled proportionally to the number of processors.
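
a tiny sketch of the difference in C (the `BASE_YSIZE` constant and the numbers
are made up for illustration, they are not taken from my mandel_mpi.c): under
strong scaling each rank's share of the rows shrinks as ranks are added, under
weak scaling the total grows so the share per rank stays constant.

```c
/* illustration only: per-rank share of image rows under strong vs weak
   scaling. BASE_YSIZE is an invented constant, not from mandel_mpi.c. */
#include <stdio.h>

int main(void) {
    const int BASE_YSIZE = 2048;
    for (int nprocs = 1; nprocs <= 8; nprocs *= 2) {
        int strong_rows = BASE_YSIZE / nprocs;  /* total fixed, share shrinks */
        int weak_rows   = BASE_YSIZE;           /* share fixed, total grows   */
        printf("np=%d  strong: %4d rows/rank of %d total   "
               "weak: %d rows/rank of %d total\n",
               nprocs, strong_rows, BASE_YSIZE,
               weak_rows, weak_rows * nprocs);
    }
    return 0;
}
```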

## 3) SIMD vs SPMD

MPI is **SPMD**, single program multiple data.

this is apparent from the fact that we have only a single program: it is
launched as several processes (ranks) that each run the same program code,
using if-statements on the rank to differentiate what the relevant ranks
actually do with their share of the data.
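
a minimal sketch of what i mean (generic MPI boilerplate, not the actual
structure of my mandel_mpi.c): every rank executes this exact same code, and
only the if-statement on the rank decides who does what.

```c
/* sketch: one program, many processes; behaviour differs only by rank */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        /* e.g. rank 0 could collect results and write the image */
        printf("rank 0 of %d: i would gather and write output\n", size);
    } else {
        /* e.g. the other ranks could compute their slice of the image */
        printf("rank %d of %d: i would compute my slice\n", rank, size);
    }

    MPI_Finalize();
    return 0;
}
```

running it with e.g. `mpirun -np 4 ./a.out` starts four copies of the same
binary, which is exactly the "single program, multiple data" part.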

**SIMD** is different in that it handles multiple data at the instruction
level, through vectorization: it can perform arithmetic on many elements of an
array simultaneously with a single instruction, rather than spinning up
separate processes to do the work in parallel. this requires hardware support
in the cpu architecture, i.e. vector registers. though i don't know much about
this (yet), as we've focused on SPMD so far.
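
just to make the contrast concrete, here is a small SSE example (x86-only
intrinsics i looked up, completely separate from the assignment code): a single
`_mm_add_ps` instruction adds four floats at once.

```c
/* sketch: one SSE instruction operates on four floats at a time (x86 only) */
#include <xmmintrin.h>
#include <stdio.h>

int main(void) {
    float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    float c[4];

    __m128 va = _mm_loadu_ps(a);     /* load four floats into a vector register */
    __m128 vb = _mm_loadu_ps(b);
    __m128 vc = _mm_add_ps(va, vb);  /* four additions with one instruction */
    _mm_storeu_ps(c, vc);

    printf("%.1f %.1f %.1f %.1f\n", c[0], c[1], c[2], c[3]);
    return 0;
}
```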