6. Multicore Programming

MIT 6.172 Performance Engineering of Software Systems, Fall 2018
Instructor: Julian Shun

This lecture covers modern multi-core processors, the need to utilize parallel programming for high performance, and how Cilk abstracts processor cores, handles synchronization and communication protocols, and performs provably efficient load balancing.

License: Creative Commons BY-NC-SA
Comments

*My takeaways:*
1. Why we need multicore processors 1:13
- End of Moore's law
- End of frequency scaling
2. Abstract multicore architecture 7:40
3. Shared-memory hardware
- Cache coherence 10:05
- MSI protocol 13:04
4. Concurrency platforms 19:21
- Pthreads (and WinAPI Threads) 24:45
- Threading Building Blocks 39:48
- OpenMP 45:27
- Cilk 50:55

leixun

I can take a class like this while lying around at home scratching my back? I love the 21st century.

rudnfehdgus

This was the most articulate programming lecture I've ever seen.

elijahjflowers

In the old days of Moore's Law, each doubling of transistor density, allowing 2X transistors at a given die size with a new process, brought an expected performance gain of about 1.4X over the previous architecture on the same process. That, coupled with the 1.4X transistor switching frequency, was the foundation of the original Moore's Law. To a degree, this was roughly the case for 386 -> 486 -> Pentium -> Pentium Pro. From that point on, it was difficult to 1) continue scaling performance at the single-core level, and 2) coordinate the size of the team necessary to design more complex processors/cores. After Pentium and Pentium Pro, Intel started adding SIMD instructions, which improved performance for code that could use the vector registers and instructions. The two Pentium 4s, Willamette and Prescott, probably represented the final steps of the heavy push toward single-core complexity.
Afterwards, most effort/die area went to increasing core count, though the superscalar width did increase (from 4 to 6 to 8 and 10?). In going to two cores, the expected throughput gain for a threaded workload could be 2X, i.e., almost linear with transistor count, versus square root of 2 for 2X transistors, but obviously no gain for a single-threaded workload. The reason we had focused on increasing core complexity for 1.4X was that, in theory, it automatically benefits most software, whereas increasing core count (and vector instructions) only benefits code written to match the capabilities of recent processor architectures. It's been twenty-five years since multi-processor systems became generally accessible, yet many people still cannot write good multi-threaded code.

joechang

Good introduction
And no laziness in lecturing
We are all stupid
Unless somebody like you
Enlightens us with this knowledge
Thank you for the effort
I will make a donation
Once I have some money

footballCartoon

The pthread example could have been written to execute on more than two threads by starting the pthread in the fib function instead of in the main function.

Decapitorr

This is my biggest criticism of this class, which is otherwise perfect: why Cilk?! Sure, it is the professor's baby, but it is obsolete. GCC doesn't even support it anymore.

asterx_obelix

The examples are not consistent: sometimes one of the fib() calls is evaluated in the main thread and the other in a spawned thread, while sometimes two threads are spawned, one to compute each fib().

slbewsp

This is a great presentation; it would be more fruitful if a timing or quantitative comparison were also provided.

eee

"Favourite textbook". Yes. Most definitely.

kitten-inside

This has a decent introduction to multicore history. A few comments.

The pthread code could easily be cleaned up and generalized. Other pthread-based libraries and preprocessors are merely syntactic sugar.

A platform that requires a special language processor or operating environment will face adoption difficulties.

Multithreaded programming is difficult to master, but starting to learn with a sugar-coated language is not a good way to master the skill, especially in a formal computer science curriculum.

The lack of coverage of synchronization basics like mutexes, semaphores, lock-free techniques, etc. at an early stage is kind of misguided. Perhaps these are covered later in the course, but they are core concepts, not afterthoughts.

It might be better to first teach how to deal with pthreads and related foundational libraries, then move on to higher-level abstractions and syntactic aids.

videoplumber

@16:15
Don't hide anything! Explain the protocol. I am assuming there might be some opcode, which could be represented by a signed integer, or maybe some constant number (i.e., a flag), defining the invalidity of the particular processor's cache state.

footballCartoon

I love the sarcasm... Especially when he's talking about the reasons for the plateau in 2004/2005.

ynoT

Isn't it called multithreaded programming, to be more exact? The cores are the physical processors, while the threads are the logical processors, which can basically be used as a "core" as described in the video.

Ryan-xqkl

11:06 ~ 11:28 Are you sure about this part, if the green rectangle indicates private cache memory?

abytebit

How have only 2 people seen the Fibonacci program? The last lecture was literally based around fib.

asterx_obelix

I don't get the pthreads example of fib(n). Wouldn't we want to use as many threads as we have processor cores, not just 2 threads? E.g., on a 16-core machine, couldn't the program be written so that for 45 >= n >= 30 we start a new thread, and do that in fib() rather than in main()? We could keep track of how many threads we have already started with a mutex, limit that to some const MAX_THREADS, and calculate the rest sequentially. No? Yes?

pschneider

Julian Shun I would like to meet you and ask you some questions :)

eqnkzsz

@1:5:26 : Don't reducers need to be commutative as well?

valeriuok

I was waiting for assembly level of detail in this.

filipecotrimmelo