Manually computing effective polarizabilities for water molecules

frankhu · Post by **frankhu** » Mon Jun 03, 2024 10:38 pm

Hello,

I am trying to compute the effective molecular polarizability for individual water molecules in a periodic system of 64 water molecules. I am using the following procedure based off the documentation found at http://qboxcode.org/doc/html/usage/vari ... larization:

Apply a small electric field perturbation in one direction to the system.
Compute the MLWFs using this perturbation.
Map the centers of the MLWFs onto the nearest water molecules, keeping in mind the periodic boundary conditions of the system (4 per molecule).
Compute, for each water molecule, a rough proxy to the dipole as mu = -2 * \sum_i w_i, where w_i are the positions of the MLWF centers. I do not factor in the positions of the O and H atoms weighted by their atomic numbers since these positions do not change between perturbation calculations and so they cancel out when computing a dipole difference.
Repeat this procedure six times, with a +/- perturbation in each direction.
Compute the finite centered difference in the dipole components and divide by the electric field perturbation to get the individual elements of the polarizability tensor.

Below is an excerpt from one of my input files where I apply an electric field perturbation:

Code: Select all

# Frame 0 polarizability
species oxygen   O_HSCV_PBE-1.0.xml
species hydrogen H_HSCV_PBE-1.0.xml
atom O1    oxygen       6.135376  49.088181  12.041645
... #More atom lines ...
set cell 23.470398 0 0 0 23.470398 0 0 0 23.470398
set ecut 85.000000
set wf_dyn PSDA
set xc PBE
set scf_tol 1.e-8
randomize_wf 0.01
set polarization MLWF
set e_field   0.000100   0.000000   0.000000
run 0 40 10
compute_mlwf

And I would have five additional input files of this format, for a +/- perturbation in each direction. I am using the version of Qbox compiled on the NERSC supercomputer, version 1.76.3.

My questions are as follows:

Is there a more direct method within Qbox to compute the polarizability tensors of individual molecules and not just the entire system? Specifically, something like what was done in this communications paper: https://doi.org/10.1063/1.5044636
Am I applying the electric field perturbation correctly by using the set e_field variable?
Would it be correct to specify six separate e-field perturbations and calling the run command six separate times in one file (i.e., in one file doing set e_field, then run, then compute_mlwf, and repeating that six times)? Or is it better to have six separate files with a different perturbation each time?
For the method of computing the system polarizability tensor specified at http://qboxcode.org/doc/html/usage/vari ... larization, how small are the perturbations to the electric field?

I am a very new user to Qbox so any help would be greatly appreciated.

Thank you very much!
Frank

Post by **fgygi** » Tue Jun 04, 2024 3:12 pm

Hello Frank,

The procedure you describe and your input file are correct. Yes, if you want to be able to associate polarizabilities with molecules, the only way is to apply a +/- electric field in the three directions and associate the dipoles from the MLWF centers with individual molecules.

Modifications I would suggest to your input file are:
1) You may use the O_ONCV_PBE-1.2.xml and H_ONCV_PBE-1.2.xml pseudopotentials which are more accurate and converge faster in terms of plane wave cutoff. You can find them at http://www.quantum-simulation.org/poten ... _oncv/xml/ .
2) You may use set polarization MLWF_REF instead of set polarization MLWF. This is more accurate since it adds a correction described by Stengel and Spaldin, Phys. Rev. B73, 075121 (2006). The MLWF_REF value converges faster to the exact value in the limit of large unit cell.

Choosing the value of the electric field takes a bit of trial and error. You want to use a value that is not so small as to cause numerical errors when computing the finite differences of the dipoles. On the other hand, you want to have it not too large to avoid including non-linear polarizability. The value of 0.0001 is a good choice for water. You may want to test that your results are not too sensitive to that choice.

It may save a bit of time to first compute the ground state without a field, and then turn on the field and redo the ground state.

It is possible to put all calculations in a single input file. I tried the following and it seems to work: (the file h2o64.i contains the definition of the unit cell, species, and atomic positions)

Code: Select all

h2o64.i
#species oxygen   O_HSCV_PBE-1.0.xml
#species hydrogen H_HSCV_PBE-1.0.xml
#atom O1    oxygen       6.135376  49.088181  12.041645
#... #More atom lines ...
#set cell 23.470398 0 0 0 23.470398 0 0 0 23.470398
set ecut 85.000000
set wf_dyn PSDA
set xc PBE
set scf_tol 1.e-8
randomize_wf 0.01
run 0 40 5 
set polarization MLWF_REF
set e_field   0.000100   0.000000   0.000000
run 0 40 5
set e_field  -0.000100   0.000000   0.000000
run 0 40 5
set e_field   0.000000   0.000100   0.000000
run 0 40 5
set e_field   0.000000  -0.000100   0.000000
run 0 40 5
set e_field   0.000000   0.000000   0.000100
run 0 40 5
set e_field   0.000000   0.000000  -0.000100
run 0 40 5

Note that the compute_mlwf command is not necessary since the dipoles and MLWF centers are printed at the end of each run command. I include in attachment the corresponding input and output files.

Don't hesitate to ask for help if you have any problem running these calculations.

Best,
Francois

frankhu · Post by **frankhu** » Tue Jun 04, 2024 9:00 pm

Hi Francois,

Thank you very much for your helpful reply!

A few followup questions based on the files you provided:

The MLWF blocks output by Qbox now have two entries for each mlwf center, both an mlwf element (with center and spread) and a mlwf_ref element (with only center). Which of these center coordinates should I use in my calculation of the molecular polarizability?
Is the state of the system retained between multiple calls to run in one calculation? For instance, when you do the first run call, presumably the electronic wave functions are computed so you have some state x_0. If you then apply an e-field perturbation to this and call run again, does that alter the state to some new set of electronic wave functions x_1? And then subsequent calls to run start from this point? Ideally, wouldn't we want to start with a fresh set of wave functions for each perturbation? Or does each run call "start from scratch", so to speak? I was uncertain about this, and that was why I structured my initial attempts around 6 separate input files, as I figured that gave the best guarantee that I was not unintentionally propagating a state throughout the calculation.
What value should the nstb parameter be set to for maximum efficiency? I am currently using a value of 2.

As always, thank you for your time!
Frank

Post by **fgygi** » Wed Jun 05, 2024 7:28 am

Hello Frank,

The MLWF_REF center is a "refined" estimate of the position of the MLWF center that tries to include the effects of the finite size of the cell. You may consider that as an improved value obtained by a perturbation theory correction. See the paper of Stengel and Spaldin http://dx.doi.org/10.1103/PhysRevB.73.075121 for details. Thus, the MLWF center and the MLWF_REF center are slightly different. The spread is the same for both since the wave function used to compute these center positions is the same. So to answer the question of which one to use, I would use the MLWF_REF value which is supposed to be more accurate, but comparing results with MLWF is useful as it gives a measure of the uncertainty due to the finite size of the cell.

The state of the system (i.e. the wave functions) is retained after every run command. This makes it possible to make changes in the parameters of the Kohn-Sham hamiltonian (e.g. moving atoms using the move command, or changing the value of the electric field) and update the ground state without restarting from scratch. Therefore there is no need to restart an entirely new calculation for each value of e_field. This exploits the fact that the modified ground state wave function is near the previously calculated one. Note that this is the same concept used in Born-Oppenheimer MD: wave functions are not recomputed from scratch every time atoms are moved. Instead they are just updated to reach the new ground state.

The -nstb command line argument is only an optimization parameter (the results should not depend on it). The optimal value can be found by running timing tests. It determines the shape of the MPI process grid used in the calculation (printed as MPIdata::comm in the output). The choice of the -nstb value depends on how many MPI tasks you are using for the calculation, and on the size of the Fourier transform grid used in the z direction (printed in output as <np2v>). If the number of MPI tasks is comparable or larger than <np2v> it is usually better to use -nstb > 1. For this 64-molecule H2O system, np2v=144. Assuming you are running on a single node of Perlmutter, I would recommend using 128 tasks and -nstb 2.

The corresponding job script for Perlmutter (pol4.job) is:

Code: Select all

#!/bin/bash -l
#SBATCH --time=00:30:00
#SBATCH --constraint cpu
#SBATCH --qos=debug
#SBATCH --nodes=1
#SBATCH --ntasks=128
#SBATCH --tasks-per-node=128
#SBATCH --cpus-per-task=2

export OMP_NUM_THREADS=1
export OMP_PLACES=threads
export OMP_PROC_BIND=spread

# default job name is job script file name
# To change job name, submit with: sbatch -n 64 -J jobname file.job
module load PrgEnv-gnu cray-fftw
PROJECTDIR=/global/cfs/projectdirs/qbox
exe=$PROJECTDIR/bin/qbox-1.76.4_prl
export XERCES_C_DIR=$PROJECTDIR/software/xerces/xerces-c-3.1.4_gnu
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$XERCES_C_DIR/src/.libs
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PROJECTDIR/software/AMD/amd-libflame/lib/LP64:$PROJECTDIR/software/AMD/amd-blis/lib/LP64

file=$(basename ${SLURM_JOB_NAME%.*} )
infile=$file.i
outfile=$file.r

export QBOX_OPTS=''

srun --cpu_bind=cores $exe $QBOX_OPTS $infile > $outfile

Best,
Francois

frankhu · Post by **frankhu** » Wed Jun 12, 2024 1:28 am

Hi Francois,

Thank you again for all of your incredibly helpful suggestions on this! I have another question, this time related more to post-processing and understanding the molecular polarizability tensors I have now computed.

In this paper https://pubs.acs.org/doi/full/10.1021/acs.jpcb.0c10732 , Figure 2, it seems that the components of the polarizability tensor shown there have the axes defined relative to the molecule, using the HOH bisector, a vector perpendicular to the bisector in the molecular plane, and an out of plane vector as the axes. This is similar to what was done in this paper here: https://pubs.aip.org/aip/jcp/article/14 ... -condensed , Figure 1, where the x direction is taken along the bisector, the y direction is perpendicular to the bisector in the plane, and the z direction is the out of plane component.

Right now, I apply the electric field perturbations along the Cartesian directions of the total unit cell and the dipole for each molecule is computed from the resulting MLWF centers, which I then use to compute a molecular polarizability tensor as per my initial post. However, I would like to visualize the distribution of my polarizability tensor components (specifically, alpha_xx, alpha_yy, and alpha_zz) in such a way where x, y, and z are now defined as the water bisector, a vector perpendicular to the bisector in the plane, and a vector out of the plane of the water molecule, respectively. Essentially, I would like to recover the trends shown for the effective polarizabilities in https://pubs.acs.org/doi/full/10.1021/acs.jpcb.0c10732 . I can compute these three mutually orthogonal vectors directly for each water molecule, as well as the polarizability tensor for each molecule in terms of the unit cell axes, but it's not clear to me how the polarizability is projected onto these molecular axes.

As always, any help on how to do this with my current calculation workflow would be greatly appreciated.

Thank you very much!

Frank

Post by **fgygi** » Wed Jun 12, 2024 9:16 pm

Hello Frank,

It should be reasonably easy to compute the polarizability tensor in the molecular frame. Assume you have defined the three basis vectors defining the molecular frame, rbis (bisector), rplane (in-plane, perpendicular to rbis), and rperp (perpendicular to rbis and rplane). After normalizing these vectors so that their 2-norm is 1, you can build the 3x3 rotation matrix R defined as:

R = [ rbis, rplane, rperp ]

i.e. packing the 3 column vectors to form a 3x3 matrix. This is the matrix transforming the (x,y,z) unit vectors into the molecular frame vectors. Assuming the polarizability tensor associated with that molecule in the (x,y,z) frame is the 3x3 symmetric matrix A, then the polarizability tensor in the molecular frame is

B = transpose(R) * A * R

Note that the tensor A should be symmetric, although due to numerical errors (e.g. finite differencing, incomplete convergence, etc.) it usually has a small antisymmetric component. The difference between elements A(i,j) and A(j,i) gives a measure of the accuracy of the calculation. You can of course force symmetry by replacing A by (A + transpose(A))/2.

Note also that there may be non-zero off-diagonal elements in B, due to the fact that the molecules extracted from an MD simulation do not have the full symmetry (C2v) of the isolated molecule at rest. In particular, the O-H1 and O-H2 distances differ in general.

Best,
Francois

Qbox List

Manually computing effective polarizabilities for water molecules

Manually computing effective polarizabilities for water molecules

Re: Manually computing effective polarizabilities for water molecules

Re: Manually computing effective polarizabilities for water molecules

Re: Manually computing effective polarizabilities for water molecules

Re: Manually computing effective polarizabilities for water molecules

Re: Manually computing effective polarizabilities for water molecules