Multi threading

From Ilianko


Supercomputer infrastructure...

Desislava Ivanova


  • provides access to the PRACE resources
    • Schrödinger equation...
    • 1024*4 PowerPC processors: the Bulgarian computer (Blue Gene/P ... (in Bulgaria)), 27.85 TFLOPS

Resources available to us

  • Blue Gene ... (Blue Gene/P) in Bulgaria
  • Jugene (Blue Gene/P)
  • Curie
  • Hermit
  • SuperMUC
  • CRAY XE6
  • Jurora
  • NIIFI - ClusterGrid


  • Austria
  • ... many European countries

To participate, you must show that you have a problem that uses 512 cores and is of general human significance...


Fine granularity: one input/output system, many cores. Fine-grained parallelization

Message Passing Interface (MPI): coarse-grained parallelization ...

Grid computing, Cloud Computing vs Parallel Computing....

Isomorphic system network ... the way the individual nodes are connected. OMNeT++

6-dimensional topology


  • Homogeneous
  • Heterogeneous


Bulgarian idea: analysis of satellite images to forecast soil salinity and predict fires, polluted waters, floods ..., based on multispectral analysis

Gantt chart

Latency, throughput

pattern ...


YARC chip -

SAN (system area network): hybrid design, 3D torus + fat tree
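To make the torus part of this design concrete, here is a minimal sketch of how the six neighbours of a node in a 3D torus can be computed. Every dimension wraps around, so nodes on a face are still directly linked to the opposite face. The type coord3 and the functions wrap and torus_neighbours are hypothetical names chosen for this illustration; they are not taken from any particular machine or library.

```c
/* Coordinates of a node in a 3D torus (hypothetical helper type). */
typedef struct { int x, y, z; } coord3;

/* Wrap a coordinate into [0, n): -1 maps to n-1 and n maps to 0. */
static int wrap(int v, int n)
{
    return ((v % n) + n) % n;
}

/* Fill out[0..5] with the neighbours of node c in an nx * ny * nz torus,
 * in the order -x, +x, -y, +y, -z, +z. */
void torus_neighbours(coord3 c, int nx, int ny, int nz, coord3 out[6])
{
    for (int i = 0; i < 6; i++)
        out[i] = c;
    out[0].x = wrap(c.x - 1, nx);
    out[1].x = wrap(c.x + 1, nx);
    out[2].y = wrap(c.y - 1, ny);
    out[3].y = wrap(c.y + 1, ny);
    out[4].z = wrap(c.z - 1, nz);
    out[5].z = wrap(c.z + 1, nz);
}
```

In a 4 * 4 * 4 torus, for example, the -x neighbour of node (0, 3, 2) is (3, 3, 2), and its +y neighbour is (0, 0, 2).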

Principles and Practices of Interconnection Networks (The Morgan Kaufmann Series in Computer Architecture and Design), William James Dally

Routing algorithm <-> topology (packet size)

Store-and-forward .... wormhole .... gossip, reduce ... traffic testing, collective network, 3-dimensional torus, low-latency global barrier and interrupt
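The difference between store-and-forward and wormhole switching can be illustrated with the usual zero-load latency model. This is a simplified sketch that ignores contention; the function and parameter names are assumptions made for this example. With store-and-forward, every hop receives the whole packet before sending it on, so the serialization time is paid once per hop. With wormhole switching, the packet is pipelined through the routers and the serialization time is paid roughly once.

```c
/* Zero-load latency model (no contention); all names are illustrative.
 *   hops       number of routers traversed
 *   t_router   per-hop routing delay, in seconds
 *   length     packet length, in bits
 *   bandwidth  channel bandwidth, in bits per second
 */

/* Each hop waits for the complete packet before forwarding it. */
double store_and_forward_latency(int hops, double t_router,
                                 double length, double bandwidth)
{
    return hops * (t_router + length / bandwidth);
}

/* The header advances hop by hop while the body streams behind it. */
double wormhole_latency(int hops, double t_router,
                        double length, double bandwidth)
{
    return hops * t_router + length / bandwidth;
}
```

Under this model the gap between the two grows with both packet size and hop count, which is one reason the routing scheme, the topology, and the packet size have to be considered together.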

Fat tree, Benes network



Point-to-point communication: counterpart <=> processor

Collective communication

  • Broadcast: non-personalized communication (one to all)
  • All to All
  • Gossiping: personalized global communication (the heaviest mode)
  • Reduction: all send to one (the root becomes a hot spot)


Torus, fat tree

Every program imposes a communication pattern. The network has a maximum capacity. Offered vs. accepted traffic (offered, delivered)

Divide and Conquer

Composition and decomposition

  • Coarse granularity: message-passing approach (MPI)
  • Shared memory => fine granularity (OpenMP)
  • Hybrid: MPI + OpenMP



  • MPI_Init
  • MPI_Comm_rank
  • MPI_Comm_size
  • MPI_Finalize
  • MPI_Barrier: waits until all processes have reached it, e.g. mpi_err = MPI_Barrier(MPI_COMM_WORLD);
  • MPI_Wtime: wall-clock time, used to measure the program's execution time
  • MPI_Wtick: resolution of the MPI_Wtime timer, in seconds
/* C Example */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
  int rank, size;

  MPI_Init(&argc, &argv);               /* starts MPI */
  MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* get current process id */
  MPI_Comm_size(MPI_COMM_WORLD, &size); /* get number of processes */
  printf("Hello world from process %d of %d\n", rank, size);
  MPI_Finalize();                       /* shuts down MPI */
  return 0;
}

Message-passing model

System network: full connectivity

A parallel program is represented only as a graph, not as a flowchart

Distributed-memory system


Memory + CPU


Types of programs

  • statically parallel: the number of processes is constant
  • dynamically parallel: for example search engines (the required resources are not known in advance and must be allocated at run time)




%mpicc -o myprog ....

Collective communication


MPI_REDUCE combines the elements provided in the input buffer of each process in the group, using the operation op, and returns the combined value in the output buffer of the process with rank root. The input buffer is defined by the arguments sendbuf, count and datatype; the output buffer is defined by the arguments recvbuf, count and datatype; both have the same number of elements, with the same type. The routine is called by all group members using the same arguments for count, datatype, op, root and comm. Thus, all processes provide input buffers and output buffers of the same length, with elements of the same type. Each process can provide one element, or a sequence of elements, in which case the combine operation is executed element-wise on each entry of the sequence. For example, if the operation is MPI_MAX and the send buffer contains two elements that are floating point numbers (count = 2 and datatype = MPI_FLOAT), then recvbuf(1) = global max(sendbuf(1)) and recvbuf(2) = global max(sendbuf(2)).
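The element-wise behaviour in the MPI_MAX example can be modelled serially. The sketch below is not an MPI call. It only mimics what the root would find in its receive buffer if each "process" contributed one row of input. The function name reduce_max and its parameters are hypothetical, and at least one input buffer (nprocs >= 1) is assumed.

```c
#include <stddef.h>

/* Serial model of an element-wise MAX reduction.
 * sendbufs[p] is the input buffer of process p (nprocs buffers, each
 * holding count floats); recvbuf receives the result seen by the root.
 * Assumes nprocs >= 1. */
void reduce_max(const float *const *sendbufs, size_t nprocs,
                size_t count, float *recvbuf)
{
    for (size_t i = 0; i < count; i++) {
        float best = sendbufs[0][i];
        for (size_t p = 1; p < nprocs; p++) {
            if (sendbufs[p][i] > best)
                best = sendbufs[p][i];
        }
        recvbuf[i] = best;   /* maximum of entry i over all processes */
    }
}
```

With three processes holding {1, 9}, {5, 2} and {3, 4}, the result is {5, 9}: each position is reduced independently of the others.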


Goals of parallel programming

Superlinear speedup

... ...

Underload, overhead, superlinear speedup
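The terms above are usually made precise with the standard definitions of speedup and efficiency. The symbols are assumptions introduced here: T_1 is the run time of the sequential program, T_p the run time on p processors.

```latex
\[
  S(p) = \frac{T_1}{T_p}, \qquad
  E(p) = \frac{S(p)}{p}, \qquad
  \text{superlinear speedup: } S(p) > p \iff E(p) > 1 .
\]
```

Communication and synchronization overhead normally keep S(p) below p. Superlinear speedup is the exceptional case, often attributed to effects such as each processor's share of the data fitting into cache.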


Counting the prime numbers

  1. mark the even numbers
  2. mark those divisible by three
  3. mark those divisible by 5
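The marking steps above appear to describe the sieve of Eratosthenes. A minimal sequential sketch follows, intended for modest values of n; the function name count_primes is an assumption made for this example. It marks the multiples of each surviving number and counts the numbers that are never marked.

```c
#include <stdlib.h>

/* Count the primes in [2, n] with the sieve of Eratosthenes.
 * Returns -1 if the working array cannot be allocated. */
long count_primes(long n)
{
    if (n < 2)
        return 0;

    /* marked[i] != 0 means i is known to be composite. */
    unsigned char *marked = calloc((size_t)n + 1, 1);
    if (marked == NULL)
        return -1;

    long count = 0;
    for (long k = 2; k <= n; k++) {
        if (marked[k])
            continue;
        count++;                 /* k was never marked, so it is prime */
        if (k <= n / k) {        /* only sieve when k * k <= n */
            for (long m = k * k; m <= n; m += k)
                marked[m] = 1;   /* mark the multiples of k */
        }
    }
    free(marked);
    return count;
}
```

For instance, count_primes(100) evaluates to 25. In a parallel version the array of marks is what gets split among the processes.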

Block data decomposition
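One common way to implement block data decomposition is to give each of p processes a contiguous range of the n elements, with block sizes that differ by at most one. The helper names below are hypothetical, and the formulas assume 0 <= id < p and element indices 0..n-1.

```c
/* Block decomposition of n elements (indices 0..n-1) over p processes.
 * Process id owns the contiguous range [block_low, block_high]. */

/* First index owned by process id. */
long block_low(long id, long p, long n)
{
    return id * n / p;
}

/* Last index owned by process id. */
long block_high(long id, long p, long n)
{
    return block_low(id + 1, p, n) - 1;
}

/* Number of elements owned by process id. */
long block_size(long id, long p, long n)
{
    return block_low(id + 1, p, n) - block_low(id, p, n);
}

/* Rank of the process that owns element index. */
long block_owner(long index, long p, long n)
{
    return (p * (index + 1) - 1) / n;
}
```

With n = 10 and p = 4 the ranges are 0..1, 2..4, 5..6 and 7..9, so no process holds more than one element above the smallest share.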


Functional decomposition
