Inefficient use of FFTW

Questions and discussions regarding the use of Qbox
espetrov
Posts: 2
Joined: Mon Sep 29, 2014 6:05 am

Inefficient use of FFTW

Post by espetrov »

Dear Qbox Developers,

Qbox seems to use the FFTW2/FFTW3 interfaces inefficiently.
For example, the loop at line 884 of FourierTransform.C (Qbox 1.60.4) is less efficient than the single call shown in the code below.
How can I submit a patch for review?

Thank you.
Evgueni.

Code:

fftw_threads(nthreads, bwplan1, np0_, (FFTW_COMPLEX*)&val[ibase], np0_, one,
             (FFTW_COMPLEX*)0, 0, 0);
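
For readers less familiar with the FFTW 2 interface: after the thread count and the plan, the arguments of the call above are howmany, in, istride, idist, followed by the output triple. Element j of transform k is read at in[k*idist + j*istride], so one call can cover many strided 1-D transforms. The sketch below uses plain C++ with no FFTW dependency and only lists the offsets such a call would touch. The helper name batched_offsets and the small sizes are hypothetical, and it assumes that `one` in the call above equals 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offsets (relative to the `in` pointer) touched by a strided, batched 1-D
// transform: element j of transform k sits at k*idist + j*istride.
// This mirrors the addressing rule of the FFTW 2 advanced interface.
// The function name is hypothetical and is not part of Qbox or FFTW.
std::vector<std::vector<std::size_t>> batched_offsets(
    std::size_t n,        // length of each transform (fixed by the plan)
    std::size_t howmany,  // number of transforms done by one call
    std::size_t istride,  // distance between consecutive elements of one transform
    std::size_t idist)    // distance between the first elements of consecutive transforms
{
  assert(n > 0 && howmany > 0);
  std::vector<std::vector<std::size_t>> offsets(
      howmany, std::vector<std::size_t>(n));
  for (std::size_t k = 0; k < howmany; ++k)
    for (std::size_t j = 0; j < n; ++j)
      offsets[k][j] = k * idist + j * istride;
  return offsets;
}
```

With a hypothetical np0_ of 4 and a transform length of 3, batched_offsets(3, 4, 4, 1) places transform 0 at offsets 0, 4, 8 and transform 1 at offsets 1, 5, 9, so a single call walks every column of one plane without an outer loop.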
fgygi
Site Admin
Posts: 151
Joined: Tue Jun 17, 2008 7:03 pm

Re: Inefficient use of FFTW

Post by fgygi »

Hi Evgueni,
Thanks for your post. Could you post some timing information (e.g. using the examples in the test directory) showing the change in performance?
Thanks.
Francois
espetrov
Posts: 2
Joined: Mon Sep 29, 2014 6:05 am

Re: Inefficient use of FFTW

Post by espetrov »

Hi Francois,

According to our runs, Qbox linked to Intel MKL spends 12-13% of CPU cycles in the FFT copy routines.
Rewriting the loop is expected to halve the number of CPU cycles in the FFT copy routines.
The expected overall speedup is about 5% if Qbox is linked to Intel MKL.
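
As a rough consistency check on these figures, and not a measurement, the overall gain can be estimated with Amdahl's law. The sketch assumes that run time is proportional to CPU cycles and that nothing outside the copy routines changes. The function name overall_speedup is hypothetical.

```cpp
#include <cassert>

// Overall speedup when a fraction f of the run time is made s times faster
// (Amdahl's law): 1 / ((1 - f) + f / s).
// The function name is hypothetical; f and s come from the figures quoted above.
double overall_speedup(double f, double s)
{
  assert(f >= 0.0 && f <= 1.0 && s > 0.0);
  return 1.0 / ((1.0 - f) + f / s);
}
```

For f between 0.12 and 0.13 and s = 2 this gives a factor of roughly 1.06 to 1.07, which is the same order as the estimate of about 5%.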

Thank you.
Evgueni.