7/25/2019 Pemrosesan Parale2l
1/27
Pemrosesan ParalelPemrosesan Paralel
Kudang B. SeminarKudang B. Seminar
7/25/2019 Pemrosesan Parale2l
2/27
Kebutuhan Komputer BerkinerjaKebutuhan Komputer Berkinerja
TinggiTinggi
Peramalan cuacaPeramalan cuaca
AerodinamikAerodinamik
Kercerdasan buatan: robotikKercerdasan buatan: robotik
Rekayasa genetikRekayasa genetik
Contoh aplikasi di atasmelibatkan komputasiintensif dan memerlukan
7/25/2019 Pemrosesan Parale2l
3/27
!ample ": #eather Prediction!ample ": #eather Prediction
Area$ segmentsArea$ segments% &'''(&'''("" cubic miles&'''(&'''("" cubic miles
% ."(."(." cubic mile: ) "'."(."(." cubic mile: ) "'""""
segmentssegmentsT*o day predictionT*o day prediction
% hal+ hour periods: ) "'' periodshal+ hour periods: ) "'' periods
,omputation per segment,omputation per segment% Temp$ Pressure$ -umidity$ #ind speed$ #indTemp$ Pressure$ -umidity$ #ind speed$ #ind
directiondirection
% Assume ) "'' /0PsAssume ) "'' /0Ps
7/25/2019 Pemrosesan Parale2l
4/27
Per+ormance: #eatherPer+ormance: #eather
PredictionPrediction
,omputational re1uirement: "',omputational re1uirement: "'"2"2
Serial supercomputer: "'Serial supercomputer: "'33 instr4secinstr4sec
Total serial time: "'Total serial time: "'55sec 6sec 6 78'78'hourshours
9ot too good +or9ot too good +or 88hour *eatherhour *eatherpredictionprediction
7/25/2019 Pemrosesan Parale2l
5/27
Parallel #eather PredictionParallel #eather Prediction
" K *orkstations$ grid connected" K *orkstations$ grid connected% "'"'88segment computations per processorsegment computations per processor
% "'"'88instructions per secondinstructions per second
% "'' instructions per segment computation"'' instructions per segment computation
% "'' time steps: "'"'' time steps: "'seconds 6seconds 6 )& hours)& hours ;uch more acceptable;uch more acceptable
% Assumption:Assumption: ,ommunication not a problem here,ommunication not a problem here
;ore *orkstations:;ore *orkstations:%
7/25/2019 Pemrosesan Parale2l
6/27
!ample 7: 9 body problem!ample 7: 9 body problem
Astronomy: bodies in spaceAstronomy: bodies in space% Attract each other: =ra>itational +orce 9e*tonsAttract each other: =ra>itational +orce 9e*tons
la*la*
% 0?n(n@ calculations per snapshot0?n(n@ calculations per snapshot =ala!y: ) "'=ala!y: ) "'""""bodies CD ) "'bodies CD ) "'7777calculationscalculations ,alculation " micro sec,alculation " micro sec Snapshot: "'Snapshot: "'"5"5secs 6 )"'secs 6 )"'"""" days 6 ) &("'days 6 ) &("'88yearsyears
% Es parallelism going to help usF 90Es parallelism going to help usF 90% #hat#hat doesdoes helpF Better algorithm: Barnes -uthelpF Better algorithm: Barnes -ut
Gi>ides the space in 1uad treeGi>ides the space in 1uad treeTreats +arH a*ay 1uads as one bodyTreats +arH a*ay 1uads as one body
7/25/2019 Pemrosesan Parale2l
7/27
0ther ,hallenging0ther ,hallenging
ApplicationsApplications Satellite data ac1uisition: billions o+ bits 4 secSatellite data ac1uisition: billions o+ bits 4 sec Satellite data processingSatellite data processing
% Pollution le>els$ Remote sensing o+ materialsPollution le>els$ Remote sensing o+ materials
% Emage recognitionEmage recognition
Giscrete optimiIation problemsGiscrete optimiIation problems% Planning$ Scheduling$ J/SE designPlanning$ Scheduling$ J/SE design
;aterial modeling;aterial modeling 9uclear *eapons modeling ?AS,E@9uclear *eapons modeling ?AS,E@ Airplane4Satellite4Jehicle designAirplane4Satellite4Jehicle design
7/25/2019 Pemrosesan Parale2l
8/27
Application Speci
7/25/2019 Pemrosesan Parale2l
9/27
ASE,S contHASE,S contH
-o* much +aster than =eneral purposeF-o* much +aster than =eneral purposeF% !ample: "G "'7 T!ample: "G "'7 T
=eneral purpose machine ?=@: 72 micro secs=eneral purpose machine ?=@: 72 micro secs
ASE, de>ice ?;ET /incoln /abs@: &7 nano secsASE, de>ice ?;ET /incoln /abs@: &7 nano secs ASE, de>ice uses 7' milli*atts ?"'' ( less po*er@ASE, de>ice uses 7' milli*atts ?"'' ( less po*er@
uture designs:uture designs:% 7 tera ops in small ? M cubic +t @ de>ice7 tera ops in small ? M cubic +t @ de>ice
% Target applicationsTarget applications
TT inite Empulse Response ?ER@ iltersinite Empulse Response ?ER@ ilters ;atri! multiply;atri! multiply NR decompositionNR decomposition
7/25/2019 Pemrosesan Parale2l
10/27
,ontoh 9yata,ontoh 9yata Peramalan cuacaPeramalan cuaca 2424 jam di UK melibatkan sekitarjam di UK melibatkan sekitar 10101212
operasioperasiuntuk dieksekusi. ni memerlukan !aktuuntuk dieksekusi. ni memerlukan !aktu 2." hours2." hourspada mesin Cray#1 $berkemampuanpada mesin Cray#1 $berkemampuan 1010%%operasi per detikoperasi per detik&.&.
'erapa operasi untuk peramalanmingguan( bulanan( tahunan)
*enurut +instein kecepatan cahaya,*enurut +instein kecepatan cahaya, - 10- 10%%m/dtm/dt. ua. uaperalatan elektronik yang masing#masing mampuperalatan elektronik yang masing#masing mampumelakukanmelakukan 10101212operasi/detikoperasi/detik dan terpisah dengan jarakdan terpisah dengan jarak 0.0.mm.mm. alam hal ini akan lebih lama !aktu yang diperlukanalam hal ini akan lebih lama !aktu yang diperlukanbagi sinyal melakukan perjalanan antar dua peralatanbagi sinyal melakukan perjalanan antar dua peralatantersebut daripada !aktu yang diperlukan untuk melakukantersebut daripada !aktu yang diperlukan untuk melakukaneksekusi operasieksekusi operasi$10$10#12#12 detik&detik& oleh salah satu peralatanoleh salah satu peralatan
elektronik tersebut.elektronik tersebut.adi faktor pembatasnyaadalah kecepatan cahaya.
35U3: mendayagunakan
paralelisme
7/25/2019 Pemrosesan Parale2l
11/27
;oti>ation o+ Parallel;oti>ation o+ Parallel
,omputing,omputing Parallel ,omputing is cost eOecti>eParallel ,omputing is cost eOecti>e
% 0O the shel+$ commodity processors are >ery +ast0O the shel+$ commodity processors are >ery +ast
% ;emory is >ery cheap;emory is >ery cheap
% Building a processor that is a small +actor +asterBuilding a processor that is a small +actor +astercosts an order o+ magnitude morecosts an order o+ magnitude more
% 9o# is the time9o# is the time ,heapest *ay to get more per+ormance: multiprocessor,heapest *ay to get more per+ormance: multiprocessor 9o#: 9et*orks o+ *orkstations9o#: 9et*orks o+ *orkstations
#orkstation can be an S;P#orkstation can be an S;P S;P: Symmetric ;ulti ProcessorS;P: Symmetric ;ulti Processor
% Shared memoryShared memory
% BusBus
7/25/2019 Pemrosesan Parale2l
12/27
#ile . ,oyoteHs Parallel#ile . ,oyoteHs Parallel
,omputer,omputer =et a lot o+ the +astest processors=et a lot o+ the +astest processors
=et a lot o+ memory per processor=et a lot o+ memory per processor
=et the +astest net*ork=et the +astest net*ork
-ook it all together-ook it all together
And then *hat FFFAnd then *hat FFF
7/25/2019 Pemrosesan Parale2l
13/27
9o* you need to program9o* you need to program
ititParallel programming introduces:Parallel programming introduces:
%Task partitioning$ task schedulingTask partitioning$ task scheduling
% Gata partitioningGata partitioning% SynchroniIationSynchroniIation
% /oad balancing/oad balancing
% /atency issues/atency issueshidinghiding
tolerancetolerance
7/25/2019 Pemrosesan Parale2l
14/27
Problem *ith #ile . ,oyoteProblem *ith #ile . ,oyote
ArchitectureArchitecture
Jon 9eumann ;achines not built +or 44ismJon 9eumann ;achines not built +or 44ismTo get high speed$ processors ha>eTo get high speed$ processors ha>e lots o+ statelots o+ state
% ,ache$ stack$ global memory,ache$ stack$ global memory
To tolerate latency$ *e needTo tolerate latency$ *e need +ast conte!t s*itch. #-Q+ast conte!t s*itch. #-QFF 9o +ree lunch:9o +ree lunch: canHt ha>e bothcanHt ha>e both
% ,ertainly not i+ the processor *as not designed +or both,ertainly not i+ the processor *as not designed +or both
;emory *all: memory gets slo*er and slo*er;emory *all: memory gets slo*er and slo*er% in terms o+ number o+ cycles it takes to accessin terms o+ number o+ cycles it takes to access
;emory hierarchy gets more and more comple!;emory hierarchy gets more and more comple!
;emory accesses block;emory accesses block% 9o split phase memory access9o split phase memory access
7/25/2019 Pemrosesan Parale2l
15/27
Se1uential >s ParallelSe1uential >s Parallel
AlgorithmsAlgorithms
cient Parallel Algorithmscient Parallel Algorithms% ;a!imiIe parallelism;a!imiIe parallelism
%;inimiIe synchroniIation$ remote accesses;inimiIe synchroniIation$ remote accesses
% ciency is Architecture Gependentciency is Architecture Gependent
cient Se1uential Algorithmscient Se1uential Algorithms% ;inimiIe time$ space;inimiIe time$ space
% ciency is portableciency is portable cient , program on Pentium ) cient , program oncient , program on Pentium ) cient , program on
AlphaAlpha
7/25/2019 Pemrosesan Parale2l
16/27
SpeedupSpeedup
Edeal: n processorsEdeal: n processors n +old speed upn +old speed up% Edeal not al*ays possible.Edeal not al*ays possible. #-QF#-QF
% Tasks are data dependentTasks are data dependent
% 9ot all processors are al*ays busy9ot all processors are al*ays busy% Remote dataRemote data
Super linear speedup: Dn speedupSuper linear speedup: Dn speedup% 9onsense Because *e can e!ecute the +aster9onsense Because *e can e!ecute the +aster
parallel program se1uentiallyparallel program se1uentially% 9o nonsense Because parallel computers do not9o nonsense Because parallel computers do not
just ha>e more processors$ they ha>e more cachesjust ha>e more processors$ they ha>e more caches
7/25/2019 Pemrosesan Parale2l
17/27
Parallel ProgrammingParallel Programming
Parallel Programming ParadigmsParallel Programming Paradigms% Super compilersSuper compilers
7' years o+ paralleliIing compilers and *hat do *e getF7' years o+ paralleliIing compilers and *hat do *e getF
..not much: *e understand loops ?a [email protected] much: *e understand loops ?a bit@
% ;ultithreading;ultithreading
Pthreads$ Solaris threads$ not much diOerencePthreads$ Solaris threads$ not much diOerence
% ;essage Passing;essage Passing
;PE rules$ ..*ell$ there is PJ; ?parallel >irtual machine@;PE rules$ ..*ell$ there is PJ; ?parallel >irtual machine@
% Gata parallel programmingGata parallel programming
9iche *ork$ but important9iche *ork$ but important
7/25/2019 Pemrosesan Parale2l
18/27
Emplicit >s !plicit 44ismEmplicit >s !plicit 44ism
Emplicit:Emplicit: super compilerssuper compilers% !tract parallelism +rom se1uential program!tract parallelism +rom se1uential program
% The general case is too hardThe general case is too hard pointers$ aliases$ recursion$ separate compilationpointers$ aliases$ recursion$ separate compilation dynamic dependence distances in array re+erencesdynamic dependence distances in array re+erences
!plicit Parallelism:!plicit Parallelism: threads or messagesthreads or messages% ,omplicates programming,omplicates programming
creation$ allocation$ scheduling o+ processescreation$ allocation$ scheduling o+ processes data partitioningdata partitioning SynchroniIation ? locks$ messages @SynchroniIation ? locks$ messages @
7/25/2019 Pemrosesan Parale2l
19/27
Pemrosesan Sekuensial Pemrosesan Sekuensial
ParalelParalel
- lebihcepatdari
7/25/2019 Pemrosesan Parale2l
20/27
Klasi
7/25/2019 Pemrosesan Parale2l
21/27
SESG ,omputersSESG ,omputers
Untuk operasi a16 a2 6 a-6 7 6 an
memerlukan sebanyak nakses kememori oleh prosesor dan sebanyak n#1operasi penjumlahan. adi kompleksitas!aktu operasi adalah $n&.
7/25/2019 Pemrosesan Parale2l
22/27
>on 9eumann Architecture>on 9eumann Architecture
,omputer,omputer
7/25/2019 Pemrosesan Parale2l
23/27
;ESG ,omputers;ESG ,omputers8prosesor yang memiliki unit kontrol pribadi( berbagi gunamemori bersama $shared memori&.
Parallelisme diperoleh dengan menugaskan semua prosesormengerjakan operasi4tugas yang berbeda secara simultan padadata yang sama.
7/25/2019 Pemrosesan Parale2l
24/27
SE;G ,omputersSE;G ,omputers8 prosesor beroperasi diba!ah kendali aliraninstruksi tunggal yang dikeluarkan oleh unit
kontrol pusat.
The processors operate synchronously and a
global clock is used to ensure lockstepoperation.
7/25/2019 Pemrosesan Parale2l
25/27
;E;G ,omputers;E;G ,omputers
7/25/2019 Pemrosesan Parale2l
26/27
Potensi dari kelasPotensi dari kelas
komputerkomputer
7/25/2019 Pemrosesan Parale2l
27/27
SP;G ,omputersSP;G ,omputersProgram yang sama dieksekusi pada prosesor komputer**.
3P* bukan merupakan paradigma hard!are( ini adalahsoft!are ekui9alen dari 3*( namun bersifatasynchronous.
Perhatikan instruksi : ; < 0 =>+8 31 +53+ 32
?sumsikan ; < 0pada prosesorP1( dan untuk ; @< 0padaprosesor P2
Proses P1 mengeksekusi 31 paralel dengan prosesorP7mengeksekusi 32 $ ini tidak dapat terjadi pada 3* &