conquest/benchmarks/matrix_multiply
David Bowler b2ede974d6
Update README.md
Correct number of processes with 64 atoms per process
2023-12-20 16:16:29 +00:00
Conquest_input Write output to stdout in all benchmarks 2023-10-27 10:17:52 +01:00
README.md Update README.md 2023-12-20 16:16:29 +00:00
Si.ion Add performance tests weve profiled so far 2023-08-30 10:40:17 +01:00
coords.dat Add performance tests weve profiled so far 2023-08-30 10:40:17 +01:00
si_222.xtl Add documentation and coordinates files to be used for weak scaling with matrix_multiply. 2023-09-14 10:41:23 +01:00
si_422.xtl Add documentation and coordinates files to be used for weak scaling with matrix_multiply. 2023-09-14 10:41:23 +01:00
si_442.xtl Add documentation and coordinates files to be used for weak scaling with matrix_multiply. 2023-09-14 10:41:23 +01:00
si_444.xtl Add documentation and coordinates files to be used for weak scaling with matrix_multiply. 2023-09-14 10:41:23 +01:00
si_844.xtl Add documentation and coordinates files to be used for weak scaling with matrix_multiply. 2023-09-14 10:41:23 +01:00
si_884.xtl Add documentation and coordinates files to be used for weak scaling with matrix_multiply. 2023-09-14 10:41:23 +01:00
si_888.xtl Add documentation and coordinates files to be used for weak scaling with matrix_multiply. 2023-09-14 10:41:23 +01:00
si_1688.xtl Extended scaling up to 262,144 atom case 2023-12-19 13:50:13 +00:00
si_16168.xtl Extended scaling up to 262,144 atom case 2023-12-19 13:50:13 +00:00
si_161616.xtl Extended scaling up to 262,144 atom case 2023-12-19 13:50:13 +00:00
si_321616.xtl Extended scaling up to 262,144 atom case 2023-12-19 13:50:13 +00:00
si_323216.xtl Extended scaling up to 262,144 atom case 2023-12-19 13:50:13 +00:00
si_323232.xtl Extended scaling up to 262,144 atom case 2023-12-19 13:50:13 +00:00

README.md

Testing different matrix multiplication kernels

This system can be used to profile different matrix multiplication kernels. The kernel is chosen with the MULT_KERN variable in system.make.

The additional coordinate files si_XYZ.xtl can be used to test weak scaling as the number of nodes increases. si_222.xtl is the same as coords.dat and has 64 atoms, so it runs well on anything from 2 MPI processes with 4 OpenMP threads to 8 MPI processes with 1 OpenMP thread. Each subsequent xtl file doubles the number of atoms, so the number of processes should be doubled as well.

We now have systems from 64 atoms (222) to 262,144 atoms (323232). These scale from 8 to 32,768 MPI processes (1 OpenMP thread) with 8 atoms per process, or from 1 to 4,096 MPI processes with 64 atoms per process.
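
The process counts for each coordinate file can be tabulated with a short script. This is only a sketch and not part of the benchmark. It assumes each file name encodes the X, Y, Z supercell repeats and that each cell holds 8 atoms, which is consistent with 64 atoms for 222 and 262,144 for 323232. The names REPEATS, ATOMS_PER_CELL and scaling_table are hypothetical.

```python
# Supercell repeats (X, Y, Z), one entry per si_XYZ.xtl file in this directory.
REPEATS = [
    (2, 2, 2), (4, 2, 2), (4, 4, 2), (4, 4, 4),
    (8, 4, 4), (8, 8, 4), (8, 8, 8), (16, 8, 8),
    (16, 16, 8), (16, 16, 16), (32, 16, 16), (32, 32, 16), (32, 32, 32),
]

# Assumption: 8 atoms per cell, which reproduces 64 atoms for si_222.xtl.
ATOMS_PER_CELL = 8


def atoms(repeats):
    """Number of atoms in a supercell with the given (X, Y, Z) repeats."""
    x, y, z = repeats
    return ATOMS_PER_CELL * x * y * z


def processes(n_atoms, atoms_per_process):
    """MPI processes needed for a fixed number of atoms per process."""
    if n_atoms % atoms_per_process != 0:
        raise ValueError("atoms do not divide evenly among processes")
    return n_atoms // atoms_per_process


def scaling_table():
    """Rows of (file name, atoms, processes at 8 atoms each, processes at 64 atoms each)."""
    rows = []
    for rep in REPEATS:
        name = "si_{}{}{}.xtl".format(*rep)
        n = atoms(rep)
        rows.append((name, n, processes(n, 8), processes(n, 64)))
    return rows


if __name__ == "__main__":
    print(f"{'file':<16}{'atoms':>8}{'8/proc':>8}{'64/proc':>8}")
    for name, n, p8, p64 in scaling_table():
        print(f"{name:<16}{n:>8}{p8:>8}{p64:>8}")
```

Each row doubles the atom count of the previous one, so keeping the atoms per process fixed doubles the process count from file to file.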