Qbox hangs when running more than 8 mpi processes.
-
- Posts: 5
- Joined: Thu Jan 31, 2013 5:19 pm
Qbox hangs when running more than 8 mpi processes.
Qbox Wizards,
I have built Qbox-1.56.2 on a Sandy Bridge cluster with an InfiniBand interconnect. I used the Intel compilers (13.1), MKL (11.0), fftw-2.1.5, mvapich2 (1.9) and xerces-2.8. My makefile is included in the attached zip file.
I was able to run all of the tests provided with the software on more than 8 MPI processes.
However, when I try to run my case on more than 8 MPI processes, Qbox hangs while reading the pseudopotential files. Qbox runs fine on 8 MPI processes.
The input for my case is in the attached zip file. Any help in resolving this issue would be appreciated.
Thanks,
John J. Low
Math and Computer Science
Argonne National Laboratory
Argonne, Illinois
- Attachments
-
- blues_icc.zip
- Input file which cause qbox to hang and makefile used to build qbox are in this zip file.
- (1.17 KiB) Downloaded 915 times
-
- Site Admin
- Posts: 167
- Joined: Tue Jun 17, 2008 7:03 pm
Re: Qbox hangs when running more than 8 mpi processes.
It seems that the input file "cristobolite.i" has CRLF line terminators. This is likely to confuse the Qbox line interpreter, which expects Unix ASCII text (although this may just be the result of cutting and pasting on a non-Unix machine).
Also it seems that the name of the Na potential file has a typo in it.
After fixing these errors, I was able to run that script with 8 MPI tasks. It uses about 245 MB per task.
I could also run it on 16 tasks on an AMD cluster with Infiniband (4 tasks/node). The input and output files are attached (4 iterations only).
Could you attach the output file up to and including the point where it hangs?
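For anyone wanting to rule out the CRLF issue mentioned above, here is a minimal Python sketch that checks a file's raw bytes for CRLF terminators and converts them to Unix line endings. The function names are hypothetical and nothing here is part of Qbox; it only inspects bytes.

```python
def has_crlf(data: bytes) -> bool:
    """Return True if the byte string contains at least one CRLF terminator."""
    return b"\r\n" in data


def to_unix(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF, leaving all other bytes as-is."""
    return data.replace(b"\r\n", b"\n")


def check_file(path: str) -> bool:
    """Read a file in binary mode (so no newline translation occurs) and test it."""
    with open(path, "rb") as f:
        return has_crlf(f.read())
```

Calling `check_file("cristobolite.i")` on the server where the job runs would show whether the terminators are present there or were introduced in transit.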
- Attachments
-
- gs1.tgz
- Unix gzipped tar file containing input file gs1.i and output file gs1.r
- (3.64 KiB) Downloaded 875 times
-
- Posts: 5
- Joined: Thu Jan 31, 2013 5:19 pm
Re: Qbox hangs when running more than 8 mpi processes.
fgygi,
Something must have gotten corrupted between the server, my Windows desktop and the Qbox list. I don't see any CRLF line terminators in my input files on the server.
My original input data is essentially the same as yours. I have a comment card in my input which is missing in your input.
I get the same error with your input when I run on more than 8 MPI processes. This case runs on fewer than 8 MPI processes.
I have attached all the files from a test with the input attached in your previous post.
Thanks for your help.
John
- Attachments
-
- fgygi_test.tar.gz
- This contains the input and output files generated by fgygi's input.
- (4.34 KiB) Downloaded 924 times
-
- Site Admin
- Posts: 167
- Joined: Tue Jun 17, 2008 7:03 pm
Re: Qbox hangs when running more than 8 mpi processes.
I see that all 16 tasks are running on the same node in your test. What is the memory available on that node? It could be a problem with this calculation since it uses a large plane wave cutoff. However I would expect that this might cause a problem later in the execution, not when defining the species.
It appears that the hang occurs where Qbox uses the Xerces XML parser to read the species file. I don't see, though, how this could fail on more than 8 tasks yet work properly on fewer than 8 tasks.
Could you attach the output you get in the case where it works (with 8 tasks)?
-
- Posts: 5
- Joined: Thu Jan 31, 2013 5:19 pm
Re: Qbox hangs when running more than 8 mpi processes.
Fgygi,
Each node has 16 cores (two eight-core Sandy Bridge processors) and 62 gigabytes of memory.
Are you suggesting I try to run eight cores per node on more than one node?
I have attached the output from a run which completed on 8 cores on one node.
John
- Attachments
-
- test.log.gz
- gzipped output from a successful run on 8 cores.
- (502 Bytes) Downloaded 891 times
-
- Site Admin
- Posts: 167
- Joined: Tue Jun 17, 2008 7:03 pm
Re: Qbox hangs when running more than 8 mpi processes.
It seems that the attached file contains the output of the unsuccessful test on 16 cores.
Regarding memory usage, 62 GB is more than enough for this run (by a lot!).
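The arithmetic behind that statement can be sketched as follows. It uses the figures quoted in this thread (about 245 MB per task, 62 GB per node) and assumes, as a rough upper bound, that the per-task footprint does not grow when going from 8 to 16 tasks. That assumption is mine and is not stated in the thread.

```python
MB_PER_TASK = 245      # per-task usage reported for the 8-task run
NODE_MEMORY_GB = 62    # memory reported per node


def estimated_total_gb(tasks: int, mb_per_task: float = MB_PER_TASK) -> float:
    """Estimate total memory in GB for `tasks` MPI tasks on one node.

    Assumes a constant per-task footprint and 1 GB = 1024 MB.
    """
    return tasks * mb_per_task / 1024.0


total = estimated_total_gb(16)
print(f"16 tasks: about {total:.2f} GB of {NODE_MEMORY_GB} GB available")
```

Under these assumptions 16 tasks need under 4 GB, a small fraction of the node's memory.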
-
- Posts: 5
- Joined: Thu Jan 31, 2013 5:19 pm
Re: Qbox hangs when running more than 8 mpi processes.
Fgygi,
The attached file contains the output from a successful 8-core run.
I did not include the huge "sample" xml file because it takes too long to upload.
John
- Attachments
-
- 8proc.tar.gz
- (4.96 KiB) Downloaded 900 times
-
- Site Admin
- Posts: 167
- Joined: Tue Jun 17, 2008 7:03 pm
Re: Qbox hangs when running more than 8 mpi processes.
John,
Thanks. I looked at the output and I can't see anything wrong with it. At this point I can only think that there could be a problem with the way Qbox was compiled. I enclose the makefile for my cluster, where I built with Intel icc and the MKL libraries, in case it helps identify a problem.
Francois
#-------------------------------------------------------------------------------
#
# pencil.mk
#
#-------------------------------------------------------------------------------
#
PLT=x86_64
#-------------------------------------------------------------------------------
MPIDIR=/usr/mpi/qlogic
XERCESCDIR=$(HOME)/software/xerces/xerces-c-src_2_8_0
PLTOBJECTS = readTSC.o
CXX=icc
LD=$(CXX)
PLTFLAGS += -DIA32 -DUSE_FFTW -D_LARGEFILE_SOURCE \
-D_FILE_OFFSET_BITS=64 -DUSE_MPI -DSCALAPACK -DADD_ \
-DAPP_NO_THREADS -DXML_USE_NO_THREADS -DUSE_XERCES
FFTWDIR=$(HOME)/software/fftw/x86_64/fftw-2.1.5/fftw
INCLUDE = -I$(MPIDIR)/include -I$(FFTWDIR) -I$(XERCESCDIR)/include
CXXFLAGS= -g -O3 -vec-report1 -D$(PLT) $(INCLUDE) $(PLTFLAGS) $(DFLAGS)
LIBPATH = -L$(MPIDIR)/lib64 -L$(FFTWDIR)/.libs -L$(XERCESCDIR)/lib
LIBS = $(PLIBS) \
-lmkl_intel_lp64 \
-lmkl_lapack95_lp64 -lmkl_sequential -lmkl_core \
-lirc -lifcore -lsvml \
-lmpich -lfftw -luuid $(XERCESCDIR)/lib/libxerces-c.a -lpthread
# Parallel libraries
PLIBS = -lmkl_scalapack_lp64 -lmkl_blacs_lp64
LDFLAGS = $(LIBPATH) $(LIBS)
#-------------------------------------------------------------------------------
-
- Posts: 5
- Joined: Thu Jan 31, 2013 5:19 pm
Re: Qbox hangs when running more than 8 mpi processes.
Francois,
I get the same behavior with your .mk file as with my makefile. Qbox runs to completion on 8 or fewer processors.
On more than eight processors, Qbox appears to enter an infinite loop while creating the first species and runs (generating no output) until I stop it with <ctrl-c>.
The energies for this test (on 8 cores) computed on my server are different from yours.
Could you tell me which versions of the Intel compilers and MKL you are using?
I have attached results for my cristobalite test from qbox built with pencil.mk (your makefile).
John
- Attachments
-
- log.tar.gz
- (3.62 KiB) Downloaded 851 times
-
- Site Admin
- Posts: 167
- Joined: Tue Jun 17, 2008 7:03 pm
Re: Qbox hangs when running more than 8 mpi processes.
John,
The results in file gs1.r were obtained using 16 MPI tasks. The energies differ from your results obtained on 8 MPI tasks because the random initialization of the wave functions depends on the number of tasks, and after only 4 iterations the energy is far from converged. I have rerun the same input on 8 MPI tasks and got the exact same energies as in your run 8proc.log (see attached file gs5.tar). Of course, all energies, when converged to the ground state, are independent of the number of tasks.
As a side comment, I note that this test uses PBE pseudopotentials but the input file does not specify the xc functional, which is therefore by default LDA. In order to get consistent physical quantities, make sure to add "set xc PBE" to the input file. Conversely, if you want to use LDA, you should use the LDA version of the pseudopotentials, and use the default xc value (LDA).
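A quick way to see which functional an input file will actually use is to scan it for a "set xc" line and fall back to LDA when none is found, as described above. The sketch below is a hypothetical helper, not part of Qbox. It assumes that "#" starts a comment in the input file and that the last "set xc" line wins; both are assumptions to verify against the Qbox documentation.

```python
def xc_setting(input_text: str, default: str = "LDA") -> str:
    """Return the functional named by the last 'set xc <name>' line.

    Falls back to `default` (LDA, per the post above) when no such line exists.
    Assumption: text after '#' on a line is a comment and is ignored.
    """
    xc = default
    for line in input_text.splitlines():
        tokens = line.split("#", 1)[0].split()
        if len(tokens) >= 3 and tokens[0] == "set" and tokens[1] == "xc":
            xc = tokens[2]
    return xc


def check_input(path: str, expected: str = "PBE") -> bool:
    """Report whether the input file selects the functional the potentials expect."""
    with open(path, "r") as f:
        return xc_setting(f.read()) == expected
```

For the input discussed here, `check_input(path, "PBE")` would return False until "set xc PBE" is added.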
I am wondering about the possibility that there is a problem with your MPI setup. Which flavor of MPI do you use? Is there a file defining the nodes on which the program can run (i.e. "machinefile"), and possibly where a maximum number of tasks is defined?
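To check whether a machinefile limits the number of tasks, one can count the slots it declares. This is only a sketch: it assumes one entry per line in the form "hostname" or "hostname:count", with "#" starting a comment. The actual format depends on the MPI launcher in use, so treat the parsing rules as hypothetical.

```python
def count_slots(machinefile_text: str) -> int:
    """Count task slots in a machinefile.

    Assumed format: 'hostname' (one slot) or 'hostname:count' per line;
    blank lines and text after '#' are ignored.
    """
    total = 0
    for line in machinefile_text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        first = entry.split()[0]            # ignore any trailing fields
        _host, _, count = first.partition(":")
        total += int(count) if count else 1
    return total


def slots_in_file(path: str) -> int:
    """Read a machinefile from disk and return its total slot count."""
    with open(path, "r") as f:
        return count_slots(f.read())
```

If the total is smaller than the number of tasks requested, the launcher configuration is worth a closer look before suspecting Qbox itself.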
I use icc 12.1.3 and MKL 10.3 update 9.
Francois
- Attachments
-
- gs5.tar
- output using 8 MPI tasks
- (17.5 KiB) Downloaded 895 times