BMDFM: Parallel Processing on Multicore Computers Using Binary Modular Dataflow Machine

This demo shows how applications can run implicitly in parallel on all available processor cores.
By the end of the demo you will see how the dataflow machine runs these applications in parallel and reduces their execution time.

Binary Modular Dataflow Machine (BMDFM) is a software layer that runs an application in parallel on shared-memory symmetric multiprocessing (SMP) computers, using multiple processors to speed up the execution of a single application. BMDFM automatically identifies and exploits parallelism through static and, mainly, dynamic scheduling of the dataflow instruction sequences derived from the formerly sequential program.

The BMDFM dynamic scheduling subsystem performs a symmetric multiprocessing (SMP) emulation of a tagged-token dataflow machine to provide the transparent dataflow semantics for the applications. No directives for parallel execution are needed.
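To make the idea of data-driven execution concrete, here is a toy sketch in Python. It is not BMDFM code and it is not a tagged-token machine; it only illustrates instructions firing as soon as their operands exist, with no parallel directives in the program itself. The graph, the node names, and `run_dataflow` are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

# Hypothetical toy program: node name -> (function, names of the inputs it needs).
# None of this is BMDFM API; it only illustrates data-driven firing.
GRAPH = {
    "a": (lambda: 6, []),
    "b": (lambda: 7, []),
    "c": (lambda a, b: a * b, ["a", "b"]),  # needs a and b
    "d": (lambda a: a + 1, ["a"]),          # independent of b and c
    "e": (lambda c, d: c - d, ["c", "d"]),  # needs c and d
}

def run_dataflow(graph, workers=4):
    """Fire every node as soon as all of its input values are available."""
    values = {}            # results produced so far
    pending = dict(graph)  # nodes that have not fired yet
    running = {}           # future -> node name
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending or running:
            # Collect the nodes whose operands are all present, then submit them.
            ready = [name for name, (_, deps) in pending.items()
                     if all(d in values for d in deps)]
            for name in ready:
                fn, deps = pending.pop(name)
                args = [values[d] for d in deps]
                running[pool.submit(fn, *args)] = name
            if not running:
                raise ValueError("some nodes can never receive their inputs")
            # Block until at least one running node has produced its value.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for fut in done:
                values[running.pop(fut)] = fut.result()
    return values

result = run_dataflow(GRAPH)
print(result["e"])  # prints 35
```

The order in which `c` and `d` run is decided at run time by operand availability rather than by the order in which they are written, which is the property the description above refers to.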

0:00 Welcome.
0:13 Spoiler.
0:38 Intro/Background.
2:15 Test environment.
3:14 BMDFM on the internet.
3:44 Download and setup.
5:30 Test application.
6:58 GMP library.
7:39 BMDFM configuration.
9:11 Test running on a single core. (644 sec.)
9:50 Test running automatically on all cores. (8 sec.)
12:12 Conclusions.
Comments

When a runtime error occurs in a program, it is clear how to get the location of the error in my code: simply start with the command-line option "-sd|--showDebugInfo".
But suppose I type an expression that causes a runtime error directly into the BMDFM Server console, e.g.: '(PROGN (DEFUN FOO (++ A)) (FOO))'
I get: "[RunTimeErrCode=10] Variable_getval: variable was not initialized before use! (Fnc=Main:FOO; Dbg=136; Var=A)"
That is fine. But how do I find out exactly where in my expression the error is (a typed expression might be quite long)?

DimaBlonsky

Just tested on an Ampere Altra 80-core server:

Single thread - 200 sec.
BMDFM multi-thread - 3 sec.

$ lscpu
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 80
On-line CPU(s) list: 0-79
Thread(s) per core: 1
Core(s) per socket: 80
Socket(s): 1
Vendor ID: ARM
Model: 1
Model name: Neoverse-N1
CPU max MHz: 3000.0000
CPU min MHz: 1000.0000
BogoMIPS: 50.00
L1d cache: 10 MiB
L1i cache: 10 MiB
L2 cache: 80 MiB
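
As a rough reading of the two timings above, the speedup and per-core efficiency can be worked out as follows. The timings are rounded to whole seconds, so the figures are only approximate; `speedup` and `efficiency` are illustrative helper names, not part of any tool mentioned here.

```python
def speedup(t_single, t_parallel):
    """Ratio of single-thread time to multi-thread time."""
    return t_single / t_parallel

def efficiency(t_single, t_parallel, cores):
    """Speedup divided by the number of cores (1.0 would be ideal scaling)."""
    return speedup(t_single, t_parallel) / cores

# Figures reported above: 200 sec. single thread, 3 sec. multi-thread, 80 cores.
s = speedup(200, 3)
e = efficiency(200, 3, 80)
print(f"{s:.1f}x speedup, {e:.0%} efficiency")  # prints: 66.7x speedup, 83% efficiency
```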

tr.trinity

Sequential accumulation of mpf_sum is non-parallel:

(setq mpf_sum (mpf (padl "0.0" mpf_precision)))
(for k 0 1 iterations (progn
  # . . .
  (setq mpf_sum (mpf_add mpf_sum mpf_f))
))


Parallelism can be achieved with a simple reduction, in which mpf_sum is composed of multiple partial sums mpf_sum_ that are computed in parallel:

(setq mpf_sum (mpf (padl "0.0" mpf_precision)))
(setq rsize (n_cpuproc))
(for k_ 0 rsize iterations (progn
  (setq mpf_sum_ (mpf (padl "0.0" mpf_precision)))
  (setq kk (-- (if (> (+ k_ rsize) iterations) iterations (+ k_ rsize))))
  (for k k_ 1 kk (progn
    # . . .
    (setq mpf_sum_ (mpf_add mpf_sum_ mpf_f))
  ))
  (setq mpf_sum (mpf_add mpf_sum mpf_sum_))
))
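
The same chunked-reduction pattern can be sketched outside BMDFM, here in Python, to check that the partial sums add up to the sequential result. This is only an illustration under assumptions: `term` is a hypothetical stand-in for the elided computation of mpf_f, exact fractions replace GMP floats, and Python's half-open ranges may not match the loop-bound semantics of the BMDFM `for`.

```python
from concurrent.futures import ThreadPoolExecutor
from fractions import Fraction

def term(k):
    """Hypothetical stand-in for the elided per-iteration value (mpf_f)."""
    return Fraction(1, (k + 1) * (k + 1))

def sequential_sum(iterations):
    """One accumulator: every addition depends on the previous one."""
    total = Fraction(0)
    for k in range(iterations):
        total += term(k)
    return total

def chunked_sum(iterations, chunk):
    """Independent partial sums per chunk, combined in a final reduction."""
    def partial(start):
        stop = min(start + chunk, iterations)  # clamp the last chunk
        acc = Fraction(0)
        for k in range(start, stop):
            acc += term(k)
        return acc

    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial, range(0, iterations, chunk)))
    return sum(partials, Fraction(0))

# With exact arithmetic the two orderings agree.
assert chunked_sum(1000, 8) == sequential_sum(1000)
```

Python threads are used here only to show the structure, since they do not run this kind of work on several cores at once. With fixed-precision floating-point values such as GMP's mpf type, regrouping the additions can change the last digits of the result, because floating-point addition is not associative.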

tr.trinity

We tried to run BMDFM under z/OS on our old IBM mainframe. It seems that BMDFM cannot process ordinary files containing our programs; we only saw garbage error messages.

DimaBlonsky