Comparing CPU vs GPU Versions

A simulation model of production wells was used to compare parallel computing performance on a CPU and a GPU.

The hardware was chosen from widely available consumer components: an Intel Core i7 CPU and an Nvidia GeForce GTX Titan graphics card.

Specification	Intel Core i7-3770	Nvidia GeForce GTX Titan
Cores	4	2688 (CUDA cores)
Base clock	3.4 GHz	836 MHz
Boost clock	3.9 GHz	876 MHz
Thermal design power	77 W	250 W
Recommended price	$250	$1300

The three-dimensional model was discretized with several different spatial steps, producing meshes of approximately 2 million, 4 million, 8 million and 16 million cells. Each mesh was run on 1 core of the Intel Core i7, on 4 cores of the Intel Core i7, and on the GeForce GTX Titan graphics card. The results below are for a two-year simulation forecast.

Number of cells	Processing time, 1 core of Intel Core i7 (single-core CPU version)	Processing time, 4 cores of Intel Core i7 (multi-core CPU version)	Processing time, GeForce GTX Titan (GPU version)	Speedup, 4 cores vs 1 core of Intel Core i7	Speedup, GeForce GTX Titan vs 4 cores of Intel Core i7	Speedup, GeForce GTX Titan vs 1 core of Intel Core i7
2 000 000	9.62 h (34,632 s)	5.97 h (21,504 s)	34.11 min (2,047 s)	1.61x	10.50x	16.91x
4 000 000	18.16 h (65,388 s)	10.63 h (38,287 s)	57.65 min (3,459 s)	1.70x	11.06x	18.90x
8 000 000	34.33 h (123,600 s)	19.22 h (69,221 s)	1.62 h (5,844 s)	1.78x	11.84x	21.14x
16 000 000	61.14 h (220,104 s)	32.98 h (118,736 s)	2.62 h (9,456 s)	1.85x	12.55x	23.27x

The performance of 1 core of the Intel Core i7 is the baseline and corresponds to a speedup factor of 1x.
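
The speedup factors in the table are ratios of the measured processing times. As a minimal sketch, the Python snippet below recomputes them from the times in seconds. The names `TIMINGS_S`, `truncate2` and `speedups` are illustrative, not part of the original study. The assumption that the table cuts values to two decimals rather than rounding them is inferred from the published numbers, and the 4-core efficiency (speedup divided by 4) is an added derived quantity.

```python
import math

# Wall-clock times in seconds, copied from the results table:
# number of cells -> (1 CPU core, 4 CPU cores, GPU)
TIMINGS_S = {
    2_000_000: (34_632, 21_504, 2_047),
    4_000_000: (65_388, 38_287, 3_459),
    8_000_000: (123_600, 69_221, 5_844),
    16_000_000: (220_104, 118_736, 9_456),
}


def truncate2(x):
    """Cut a positive number to two decimals without rounding.

    Assumption: this matches how the table values were produced.
    """
    return math.floor(x * 100) / 100


def speedups(t_1core, t_4core, t_gpu):
    """Return the three speedup ratios plus the 4-core parallel efficiency."""
    cpu_ratio = t_1core / t_4core
    return {
        "4 cores vs 1 core": truncate2(cpu_ratio),
        "GPU vs 4 cores": truncate2(t_4core / t_gpu),
        "GPU vs 1 core": truncate2(t_1core / t_gpu),
        # Fraction of the ideal 4x speedup actually achieved on 4 cores.
        "4-core efficiency": truncate2(cpu_ratio / 4),
    }


for cells, times in TIMINGS_S.items():
    print(cells, speedups(*times))
```

Under these assumptions, the 4-core efficiency grows with mesh size but stays below one half, which agrees with the modest 1.61x to 1.85x multi-core speedups in the table.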

When comparing computational speed on multi-core architectures, note that the following model parameters significantly affect the speedup:
– the number of materials;
– the number of boundary conditions;
– mesh uniformity;
– whether the number of mesh cells is a multiple of the number of computational cores;
– how similar the thermophysical properties of the materials are.
This means that the maximum speedup on parallel architectures is achieved on the simplest models, with a uniform computational mesh and a minimal number of materials and boundary conditions. Practical computational models are more complicated, so this speed analysis was based on the production well simulation model to obtain more representative results.
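
Two of these factors can be illustrated with a toy load-balancing model: how evenly the cells divide among the cores, and how uniform the cost per cell is. The sketch below is not the scheme used in the study. It assumes a static split of the cells into contiguous, equally sized chunks, one per worker, and the per-cell costs are hypothetical numbers. The function name `ideal_speedup` is illustrative.

```python
def ideal_speedup(cell_costs, workers):
    """Best-case speedup for a static, contiguous split of cells among workers.

    The parallel run time is set by the most heavily loaded worker, so the
    speedup is the total cost divided by the largest chunk cost.
    """
    n = len(cell_costs)
    chunk = -(-n // workers)  # ceiling division: cells per worker
    chunk_costs = [sum(cell_costs[i:i + chunk]) for i in range(0, n, chunk)]
    return sum(cell_costs) / max(chunk_costs)


# Uniform cost, cell count divisible by the worker count: the full 4x.
print(ideal_speedup([1] * 8, 4))

# Uniform cost, 10 cells on 4 workers: one worker is mostly idle.
print(ideal_speedup([1] * 10, 4))

# Two cells in a hypothetical "expensive" material overload one worker.
print(ideal_speedup([1, 1, 1, 1, 1, 1, 3, 3], 4))
```

In this simplified model, any imbalance in cell count or per-cell cost lowers the achievable speedup even before communication and memory effects are considered.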

Conclusions:

Computational algorithms with a low degree of parallelization run inefficiently on multi-core processors and graphics accelerators.
The major engineering analysis software packages on the market contain a large share of serial code, which significantly limits the speedup they can gain from parallel computing. This is largely because they rely on dated solver algorithms that were developed before technologies such as CUDA existed and were therefore not designed to exploit parallel hardware.
Mathematical algorithms in the latest generation of CAE software are designed around parallel processing. Moving the computation from one CPU core to a many-core graphics accelerator then yields a speedup of more than tenfold (17x to 23x in the tests above).
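
The effect of serial code on achievable speedup is commonly estimated with Amdahl's law, S = 1 / ((1 - p) + p / n), where p is the parallelizable fraction of the run time and n is the number of cores. The source does not use this model; it is brought in here only as an illustration, and the function names are hypothetical. The inverse estimate attributes the entire shortfall to serial code, which ignores memory bandwidth and other hardware limits, so it should be read as a rough indication only.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup predicted by Amdahl's law for a given parallelizable fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def implied_parallel_fraction(speedup, cores):
    """Invert Amdahl's law: the parallel fraction that would explain a speedup.

    Assumption: serial code is the only source of lost speedup.
    """
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


# A code that is only 50% parallel cannot exceed 2x, however many cores it gets.
for cores in (4, 64, 2688):
    print(cores, amdahl_speedup(0.5, cores))

# A code that is 99% parallel keeps gaining from additional cores.
for cores in (4, 64, 2688):
    print(cores, amdahl_speedup(0.99, cores))

# Parallel fraction that would explain a measured 4-core speedup of 1.61x.
print(implied_parallel_fraction(1.61, 4))
```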

Simmakers Limited
