Page 239 - Invited Paper Session (IPS) - Volume 1
IPS 151 Rubén C.
    for (i = 0, j = blockIdx.x * n; i < n; i = i + 1, j = j + 1) {
        /* Scale the pseudo-random number to an index into the data. */
        index = prand[j] * n;
        sx = sx + v1[index];
        sy = sy + v2[index];
        sx2 = sx2 + v1[index] * v1[index];
        sy2 = sy2 + v2[index] * v2[index];
        sxy = sxy + v1[index] * v2[index];
    }
    /* Pearson coefficient of the sample drawn by this block. */
    ro = (sxy - sx * sy / (float) n) /
         sqrtf((sx2 - sx * sx / (float) n) * (sy2 - sy * sy / (float) n));
    rhodev[blockIdx.x] = ro;
}
Figure 2: GPU Code
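The last expression in Figure 2 can be read as the sum-based form of the sample Pearson coefficient. As a sketch, writing x_i^* and y_i^* for the resampled observations v1[index] and v2[index] (this notation is introduced here and is not in the original):

```latex
\[
r^{*} \;=\;
\frac{\displaystyle \sum_{i=1}^{n} x_i^{*} y_i^{*}
      \;-\; \frac{1}{n}\Big(\sum_{i=1}^{n} x_i^{*}\Big)\Big(\sum_{i=1}^{n} y_i^{*}\Big)}
     {\sqrt{\Big(\displaystyle \sum_{i=1}^{n} x_i^{*2}
            - \frac{1}{n}\Big(\sum_{i=1}^{n} x_i^{*}\Big)^{2}\Big)
            \Big(\displaystyle \sum_{i=1}^{n} y_i^{*2}
            - \frac{1}{n}\Big(\sum_{i=1}^{n} y_i^{*}\Big)^{2}\Big)}}
\]
```

Here the five sums correspond to the accumulators sxy, sx, sy, sx2 and sy2 in the code.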
Figure 2 shows the code executed on the GPU. The variable blockIdx.x is
the index of the thread block that executes the code. From it, each block
locates its own portion of the pseudo-random numbers and the position in
the output array where its Pearson coefficient is stored.
3. Result
To compare the different versions of the bootstrap implementation for
calculating the Pearson correlation coefficient, the maximum size of the
data that the graphics card could handle was determined first. It
corresponds to two variables with 49,000 observations each, with 9,999
bootstrap iterations. On the computer with the graphics card, three
versions were executed: the sequential version, the parallel version
implemented with the pthread library, and the parallel version on the GPU.
The details of the hardware and software used are shown in Table 3. On the
cluster of 24 computers, each with 4 cores, the distributed version
implemented with the mpich library was executed. The hardware and software
details of the cluster are shown in Table 4.
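A rough memory estimate may help explain why this problem size is the maximum. The sketch below assumes 4-byte floats and that all pseudo-random numbers are resident on the card at once, as the indexing prand[blockIdx.x * n + i] in Figure 2 suggests. The function name device_bytes is hypothetical, and the estimate ignores any other allocations.

```c
#include <assert.h>
#include <stddef.h>

/* Estimated bytes of device memory for n observations and b replicates.
   Assumption: prand holds b * n floats, v1 and v2 hold n floats each,
   and rhodev holds b floats. */
unsigned long long device_bytes(unsigned long long n, unsigned long long b)
{
    unsigned long long floats;

    assert(n > 0 && b > 0);
    floats = b * n      /* prand: one random number per draw  */
           + 2ULL * n   /* v1 and v2: the two data variables  */
           + b;         /* rhodev: one coefficient per block  */
    return floats * (unsigned long long)sizeof(float);
}
```

With n = 49,000 and b = 9,999 this gives 1,960,235,996 bytes on a platform with 4-byte floats, about 1.83 GiB, which is just under the 2 GiB of card memory listed in Table 3.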
Table 3: Hardware & Software of the Computer with GPU
Processor         Intel Core i7-4790 CPU @ 3.60 GHz x 8
Graphics Card     GeForce GTX 980 Ti – 2 GiB of Memory – 2048 cores
RAM Size          7.5 GiB
Operating System  Ubuntu 16.04 LTS - 64 bit
C Language        gcc 4.8.4

Table 4: Hardware & Software of the Cluster
Processor         (Intel Core i5-3470 CPU @ 3.20 GHz x 4) x 24
RAM Size          3.9 GiB
Operating System  Ubuntu 16.04 LTS - 64 bit
C Language        gcc 4.8.4
228 | I S I W S C 2 0 1 9