mirror of https://gitlab.com/QEF/q-e.git
183 lines
10 KiB
Plaintext
183 lines
10 KiB
Plaintext
|
|
Hybrid Hartree-Fock+DFT functionals are a still evolving feature in PWscf.
|
|
Some of the following notes may be obsolete.
|
|
|
|
WHICH FUNCTIONALS ARE IMPLEMENTED ?
|
|
The following hybrid functionals are implemented: Hartree-Fock, PBE0,
|
|
B3LYP, HSE (see Modules/functionals.f90 for updated info and more details).
|
|
The GAU-PBE functional (free from divergences at q->0) is also implemented.
|
|
Usually in PWscf the functional to be used is read from pseudopotential
|
|
files but we do not have so far a pseudopotential generator for hybrid
|
|
functionals so one needs to use pseudopotentials generated with some other
|
|
functionals (eg. LDA, PBE, BLYP) and force the usage of a hybrid functional
|
|
using input variable "input_dft" in system namelist; for instance,
|
|
input_dft="pbe0" will force the usage of PBE0 irrespective of the
|
|
functional used in the pseudopotential generation.
|
|
|
|
HOW DOES THE ALGORITHM WORK ?
|
|
The algorithm is quite standard: see for instance Chawla and Voth,
|
|
JCP {bf 108}, 4697 (1998); Sorouri, Foulkes and Hine, JCP {\bf 124},
|
|
064105 (2006); Spencer and Alavi, PRB {\bf 77}, 193110 (2008).
|
|
Basically, one generates auxiliary densities $\rho_{-q}=\phi^{*}_{k+q}*\psi_k$
|
|
in real space and transforms them to reciprocal space using FFT;
|
|
the Poisson equation is solved and the resulting potential is transformed
|
|
back to real space using FFT, then multiplied by $\phi_{k+q}$ and the
|
|
results are accumulated.
|
|
The only tricky point is the treatment of the $q\rightarrow 0$ limit,
|
|
which is described below, and in the Appendix A.5 of the QE paper
|
|
(note the reference to the Gygi and Baldereschi paper).
|
|
See J. Comp. Chem. {\bf 29}, 2098 (2008); JACS {\bf 129}, 10402 (2007)
|
|
for examples of applications.
|
|
|
|
HOW DOES SELF-CONSISTENCY WORK ?
|
|
The usage of hybrid functionals is VERY expensive (see later). Moreover
|
|
self-consistency should be reached on the density-matrix, instead of the
|
|
charge density as in traditional DFT. This is not feasible with plane waves.
|
|
The strategy used here is to consider an auxiliary set of wavefunctions psi
|
|
in addition to the usual set phi and to minimize the auxiliary functional
|
|
(let us focus on HF for simplicity):
|
|
E[phi,psi] = T[phi] + E_ext[phi] + E_Hartree[phi] +
|
|
<phi|Vx[psi]|phi> - 0.5*<psi|Vx[psi]|psi>
|
|
where Vx[psi] is the fock operator defined with the auxiliary function psi.
|
|
Taking the functional derivatives w.r.t. phi it can be shown that the scf
|
|
condition for phi are the HF equation with fixed Fock operator, so Vx does
|
|
not enter in the scf procedure and one can mix density as usual. The
|
|
minimum condition w.r.t. psi is simply psi=phi so when both psi and phi
|
|
are minimized the standard HF energy is obtained.
|
|
Actually one can show that the functional E[phi,psi] above is
|
|
E[phi,psi] = E_HF[phi] + dexx[phi,psi]
|
|
where dexx is a positive definite addition to E_HF .
|
|
The scf procedure goes as follow.
|
|
0) a normal scf (with LDA or similar functionals) is performed
|
|
1) hybrid functional is switched on and psi = phi (the current best wfcs)
|
|
2) a new scf is performed w.r.t phi, keeping fixed Vx[psi]
|
|
3) dexx[phi,psi] is computed and if it exceeds the required tolerance
|
|
the proceedure is repeated from point 1)
|
|
HF may require several phi-scf cycles to reach full convergence. B3LYP
|
|
and PBE0, due to the smaller fraction of HF exchange included, require
|
|
usually a smaller number of phi-scf cycles
|
|
|
|
HOW EXPENSIVE IS THE CALCULATION ?
|
|
Very expensive.
|
|
Applying the Fock operator on a single vawefunction (phi_k,v) requires
|
|
the calculation of an integral over the whole BZ and all psi bands.
|
|
For each needed pair psi_k+q,v' and phi_k,v an auxiliary charge density
|
|
rho(-q+G) is built in real space and then FFT to reciprocal space where
|
|
the corresponding Poisson equation is solved. This auxiliary potential is
|
|
FFT back in real space where it is multiplied by psi_k+q,v' and added to
|
|
Vx[psi]phi...
|
|
The cost of the operation is therefore roughly NBND * NQS * ( 2 * FFT + ... )
|
|
where NQS is the number of q-points chosen to represent the BZ integration,
|
|
and depends in general on the localization of the Wannier functions of the
|
|
system.
|
|
For comparison non-local pseudopotentials in the KB formulation (without
|
|
exploiting the locality of the KB projetors) cost NKB * (2 * NPW) where
|
|
NKB is typically of the order of NBND but NPW cost at least an order
|
|
of magnitude less than an FFT.
|
|
Therefore even when one can take NQS=1 (for large non-metallic system
|
|
this should be ok) hybrid functionals will require at least an order of
|
|
magnitude more resources that a standard calculation.
|
|
For Gamma-only calculations, some speedup can be obtained by computing
|
|
auxiliary densities $\rho_q$ on a "custom" FFT grid corresponding to a
|
|
cutoff (variable "ecutfock") smaller than the usual value 4*ecutwfc.
|
|
|
|
HOW CAN I CHOSE NQS IN INPUT ?
|
|
In the system namelist there are three variables nqx1,nqx2,nqx3 that
|
|
define the regular q-grid in the BZ in a way similar to the automatic
|
|
k-points generation. Their value must be compatible with the k-points
|
|
used (that is k+q must be equivalent to some other k in the k-points list)
|
|
Their default value are nqx1=1,nqx2=1,nqx3=1 (BZ integration is
|
|
approximated by gamma point value only).
|
|
|
|
DIVERGENCE AT q->0
|
|
The BZ integral to be performed has a diverging kernel when (q+G)->0
|
|
(except GAU-PBE, for which there is no divergent term: set
|
|
exxdiv_treatment='none', x_gamma_extrapolation=.false.)
|
|
This is dealt with by adding and subtracting a term with the same
|
|
divergence that can be integrated analytically and performing
|
|
numerically the integration for the non divergent residue
|
|
(variable exxdiv_treatment='gygi-baldereschi', default)
|
|
[Gygi-Baldereschi, PRB 34, 4405 (1986)]
|
|
One problem is left: the now non divergent q=0 term is not easily determined
|
|
since it is a 0/0 (non analytic) limit. Several options have been considered:
|
|
1) just discard it ... this is not a good idea in general because it
|
|
induces an error proportional to 1/(NQS*Omega) in the total energy
|
|
where Omega is the volume of the Wigner-Seitz cell of the crystal.
|
|
As one wish to keep NQS as small as possible this may be large.
|
|
2) exploit the fact that the term has the above dependence and extract
|
|
it from a calculation with a given nqx1,nqx2,nqx3 and the one with
|
|
a grid twice as coarse in each direction. One does not really need to
|
|
perform two calculations but can do it internally (even when nqx? are
|
|
not even numbers...). This seems to work and it is set as the default.
|
|
In order to disable this feature [and get back to option 1)] set
|
|
x_gamma_extrapolation = .false.
|
|
3) perform calculations in q-grids that are shifted away from gamma so that
|
|
the 0/0 term is not needed. This create some extra complication in the
|
|
coding and cannot be used with Gamma-only k-point integration.
|
|
In some tests it didn't seem superior to option 2) ... it was never
|
|
fully implemented and now it has been removed.
|
|
4) use the value at small (q+G) to estimate the (q+G)->0 limit. This
|
|
again has been tried and found to offer, for low order numerical
|
|
differentiation, no better results that option 2). It is possible
|
|
than higher-order formulas yield better results but this has not been
|
|
explored. This option is currently not implemented but it would be
|
|
easy to re-implement it.
|
|
5) use a spherical cutoff for coulomb potential (exxdiv_treatement='vcut_spheric')
|
|
|
|
In the case of strongly anisotropic supercells, such that one (or two) of the
|
|
products nki*ai (nki=number of k-points along axis i, ai=cell length along i axis)
|
|
is much larger than the others (for instance: nkz*az=50 A >> nky*ay=nkx*ax=10 A),
|
|
it can be shown that the fourier transform of 1/(q+G)**2 does not behave as 1/|r-r'|
|
|
for small q+G, thus producing instabilities with respect to the k point sampling.
|
|
In order to avoid this problem you have 2 possibilities:
|
|
1) change your supercell to a cubic one (all nki*ai of similar value)
|
|
2) use a real-space Wigner-Seitz cutoff. For this you have to turn on
|
|
exxdiv_treatment="vcut_ws" and converge your results with respect to
|
|
ecutvcut (reciprocal space cutoff for the correction, i.e.:
|
|
coulomb=1/(q+G)**2 for (q+G)**2> ecutvcut,
|
|
coulomb=cutoffed_coulomb for (q+G)**2 < ecutvcut).
|
|
Typical values for ecutvcut range from 0.7 to 2.0.
|
|
|
|
OTHER LIMITATIONS
|
|
So far NORM-CONSERVING and ULTRASOFT pseudopotentials are implemented.
|
|
The latter case requires the computation of density-like objects, in
|
|
principle defined on the complete FFT grid (cutoff "ecutrho > 4*ecutwfc).
|
|
In order to spare CPU time, this is done instead on the "smooth" FFT grid
|
|
(cutoff = 4*ecutwfc) in either G-space (very slow!) or in real space
|
|
(faster: option "tqr"). Note that "ecutfock" is not currently implemented
|
|
for US pseudopotentials. Hybrids with PAW are not currently functional.
|
|
|
|
PARALLEL IMPLEMENTATION ?
|
|
At present, both plane-wave and k-point parallelization have been
|
|
implemented. This is what is mostly needed for large systems.
|
|
An experimental parallelization on the band structure is also available
|
|
(pw.x -nbgrp N)
|
|
|
|
WHAT PROPERTIES CAN I COMPUTE ?
|
|
Energy and forces (thanks to Hellmann-Feynman theorem forces do not
|
|
require extra calculations). In principle also stresses but the
|
|
corresponding formulas have not yet been coded.
|
|
So structural optimization is OK if the cell shape is kept fixed.
|
|
Band structure ? yes and no. Obviously one computes wfc during the scf
|
|
cycle and their eigenvalues are printed in output.
|
|
This can be sufficient to draw a band structure or a DOS, but the problem
|
|
arises when one wishes non-scf calculations in k-points different from
|
|
those computed during the scf cycle. At present it is not possible
|
|
because this would require the knowledge of all bands at k+q that we do
|
|
not have. I do not know how to by-pass this problem.
|
|
|
|
ELECTRIC FIELD
|
|
I did not dig into this issue but Paolo Umari is using EXX with
|
|
electric field. For details it would be better to ask him directly.
|
|
|
|
AN EXAMPLE
|
|
run_example script in this directory performs two series of calculations:
|
|
1) total energy of Silicon using different values for nqx,
|
|
2) calculation of binding energy of o2,co,n2 from calculations in a
|
|
12 au cubic box and gamma sampling.
|
|
Running it will generate directory "results" to be compared with directory
|
|
"reference"
|
|
|
|
Please report problems and suggestions to the mailing list of QE developers:
|
|
<q-e-developers@qe-forge.org>
|