ex5: finish report
5
exercise5/report/Makefile
Normal file
@@ -0,0 +1,5 @@
all: report.pdf show
report.pdf: report.md
	pandoc $< -o $@ --pdf-engine=typst
show: report.pdf
	zathura report.pdf
70
exercise5/report/report.md
Normal file
@@ -0,0 +1,70 @@
---
title: theory questions
author: fredrik robertsen
date: 2025-10-20
---

## 1. why no border exchange in pthreads?

threads operate on shared memory, as opposed to the isolated processes of MPI
and the like, so there is no need to communicate the border values: the threads
can simply read them. instead, we need to be careful about placing barriers, so
that the threads access the memory at the correct times, i.e. only after the
data has been computed and is ready to be read for further processing.

we want to avoid serializing the program too much, so excessive barriering is
bad for performance.
## 2. OpenMP vs MPI

they may sound similar, but they are fundamentally different: openmp uses
threads, while mpi uses processes, as mentioned in 1. both have their strengths
and weaknesses. openmp is good for abstract, higher-level parallelism using
threads, essentially acting as a "go faster" button for computationally
intensive code, abstracting away the ceremony of pthreads; mpi computes in
parallel with processes, relying on message passing between them.

if you wish to compute at a large scale, for example on a cluster with many
nodes available, mpi is likely more performant and more logical to use, whereas
on a single computer, threading is a good way to gain some speed-up. just be
careful about your data locality in either case!
## 3. pthreads vs OMP barrier vs OMP workshare

the provided openmp barrier implementation is almost identical to my pthreads
solution. it manually tracks the thread ids and divides the workload
dynamically based on how many threads are spawned, using careful barriers for
correctness in a near-identical way to my pthreads version. instead of a
modulo/fixed-step division of work, my pthreads version gives each thread a
contiguous block of rows, for better cache locality: each thread's data is
close together in memory, rather than the thread having to jump a row at every
step. in short, the pthreads version wins because of row-major cache locality.

my openmp workshare implementation parallelizes the for loop using the magical
`#pragma omp parallel for`, known as the "free speed-up" button. it splits the
workload between the threads, giving each thread a share of the loop
iterations. this is a higher-level implementation similar to both of the other
two, but without any of the boilerplate. sweet!
## 4. parallelizing recursion problems with OpenMP

recursion can be seen as a tree structure, where nested function calls create
nodes of stack frames that remember the state of their parent nodes. if you
handle the race conditions properly using locks or atomic semantics, you can
spawn threads for each recursive call, creating a situation where subthreads
create more threads. similar to infinite-recursion problems and OS fork-bombs,
this must be handled carefully.

openmp can streamline this for us using task-oriented semantics. you create a
task, which is then queued to be computed by a thread in the thread pool; the
pool typically contains as many threads as you have cores available, though
this is compiler-specific or user-specified. this can be done using

```c
// a task takes optional clauses, e.g. data dependencies or private copies:
#pragma omp task depend(in: var) firstprivate(x)
{ /* work queued for the thread pool */ }

#pragma omp taskwait   // wait for the child tasks spawned so far
#pragma omp taskgroup  // or: wait for all descendant tasks of a region
{ /* ... */ }
```

by using a thread pool, we avoid the dangers of spawning too many threads,
avoiding a fork-bomb.