Multithreading
PRACE
Supercomputer infrastructure...
Desislava Ivanova
PRACE
- provides access to PRACE resources
- Schrödinger equation...
- 1024*4 PowerPC processors - the Bulgarian computer (Blue Gene/P ... (in Bulgaria)), 27.85 TFLOPS
Resources available to us
- Blue Gene ... (Blue Gene/P) in Bulgaria
- Jugene (Blue Gene/P)
- Curie
- Hermit
- SuperMUC
- CRAY XE6
- Jurora
- NIIFI - ClusterGrid
- CINECA
Members
- Austria
- ... many European countries
To participate you must show that you have a problem that uses 512 cores and is of general human significance...
Types
Fine granularity - one input/output system - many cores: fine-grained parallelization
Message Passing Interface - coarse-grained parallelization ...
Grid computing, cloud computing vs. parallel computing....
Isomorphic system network ... the way the individual nodes are connected; OMNeT++
6-dimensional topology
Clusters
- Homogeneous
- Heterogeneous
BlueVision
Bulgarian idea -- analysis of satellite images -- forecasting soil salinity, predicting fires, polluted waters, floods ... - based on multispectral analysis
Latency, throughput
pattern ...
flit
SAN - system area network; hybrid design - 3D torus + fat tree
William James Dally, Principles and Practices of Interconnection Networks (The Morgan Kaufmann Series in Computer Architecture and Design)
routing algorithm <-> topology (packet size)
store-and-forward .... wormhole .... gossip, reduce ... traffic testing, collective network, 3-dimensional torus, low-latency global barrier and interrupt
fat tree - Benes network
Decomposition
Rendezvous
Point-to-point communication: counterpart <=> processor
Collective communication
- Broadcast, non-personalized communication (one to all)
- All to All
- Gossiping - personalized global communication (the heaviest mode)
- Reduction - all send to one (the root becomes a hot spot)
Networks
torus, fat tree
Every program imposes a communication pattern. The network has a maximum capacity.
Offered vs. accepted traffic (offered / delivered)
Divide and Conquer
Composition and decomposition
- Coarse granularity - message-passing approach, MPI
- Shared memory => fine granularity, OpenMP
- Hybrid - MPI + OpenMP
...
MPI
- MPI_Init
- MPI_Comm_rank
- MPI_Comm_size
- MPI_Finalize
- MPI_Barrier - waits until all processes reach it: mpi_err = MPI_Barrier(MPI_COMM_WORLD);
- MPI_Wtime - wall-clock time, used to measure the program's execution time
- MPI_Wtick - resolution of the MPI_Wtime timer
/* C example */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);               /* start MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* get the id of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* get the number of processes */
    printf("Hello world from process %d of %d\n", rank, size);
    MPI_Finalize();                       /* shut MPI down */
    return 0;
}
Message-passing model
system network - full connectivity
A parallel program is represented only as a graph, not as a flowchart
Distributed-memory system
Memory + CPU
Memory + CPU
...
Program types
- statically parallel - the number of processes is constant
- dynamically parallel - e.g. search engines (the required resources are not known in advance and must be allocated at run time)
MPI_COMM_WORLD
Compiling
% mpicc -o myprog ....
Collective communication
MPI_Reduce
MPI_REDUCE combines the elements provided in the input buffer of each process in the group, using the operation op, and returns the combined value in the output buffer of the process with rank root. The input buffer is defined by the arguments sendbuf, count and datatype; the output buffer is defined by the arguments recvbuf, count and datatype; both have the same number of elements, with the same type. The routine is called by all group members using the same arguments for count, datatype, op, root and comm. Thus, all processes provide input buffers and output buffers of the same length, with elements of the same type. Each process can provide one element, or a sequence of elements, in which case the combine operation is executed element-wise on each entry of the sequence. For example, if the operation is MPI_MAX and the send buffer contains two elements that are floating point numbers (count = 2 and datatype = MPI_FLOAT), then recvbuf(1) = global max(sendbuf(1)) and recvbuf(2) = global max(sendbuf(2)).
Speedup
the goal of parallel programming
superlinear speedup
... ...
underload, overhead, superlinear speedup
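The terms above can be tied together with the usual textbook definitions. The symbols are assumptions introduced here, not taken from the lecture: $T_1$ is the run time on one process, $T_p$ the run time on $p$ processes, and $T_o$ an overhead term in a deliberately simple model.

```latex
\begin{align*}
  S(p) &= \frac{T_1}{T_p}
       && \text{speedup on } p \text{ processes} \\
  E(p) &= \frac{S(p)}{p}
       && \text{efficiency} \\
  T_p  &= \frac{T_1}{p} + T_o
       \;\Longrightarrow\;
       S(p) = \frac{p}{1 + p\,T_o / T_1} < p
       && \text{for any overhead } T_o > 0 \\
  S(p) &> p
       && \text{superlinear speedup}
\end{align*}
```

Under this simple overhead model the speedup always stays below $p$, so a superlinear result has to come from an effect the model leaves out; a commonly cited one is that each process's share of the data starts to fit in cache.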
Sieve of Eratosthenes
counting the prime numbers
- mark the even numbers
- mark those divisible by three
- those divisible by 5 are marked
Block data decomposition
SPMD
functional decomposition
Literature
http://geco.mines.edu/workshop/class2/examples/mpi/index.html