Questions and discussions regarding the use of Qbox
Qbox seems to use the FFTW2/FFTW3 implementations inefficiently.
For example, the loop at line 884 (FourierTransform.C; Qbox 1.60.4) is less efficient than the following call.
How do I submit a patch for review?
Hi Evgueni,
Thanks for your post. Could you post some timing information (e.g. using the examples in the test directory) showing the change in performance?
Thanks.
Francois
According to our runs, Qbox linked to Intel MKL spends 12-13% of CPU cycles in the FFT copy routines.
Rewriting the loop is expected to halve the number of CPU cycles in the FFT copy routines.
The expected overall speedup is about 5% if Qbox is linked to Intel MKL.