qmcpack/manual/gaussian_orbitals_solids.tex

\chapter{Periodic LCAO for solids}
\label{chap:LCAO}

\section{Introduction}

QMCPACK implements linear combination of atomic orbitals (LCAO) and Gaussian
basis sets in periodic boundary conditions. This method uses orders of
magnitude less memory than the real space spline wavefunction. While
the spline scheme enables very fast evaluation of the wavefunction, it may
require too much on-node memory for a large complex cell. The periodic
Gaussian evaluation provides a fallback that will definitely fit in
available memory, but at significantly increased computational
expense. Well designed Gaussian basis sets should be used to accurately
represent the wavefunction, typically
including both diffuse and high angular momentum functions.

The initial implementation is limited to the $\Gamma$-point using
trial wavefunctions generated by PySCF\cite{Sun2018}, but other codes such as
Crystal can be interfaced on request.

%\subsection{Single Particle Orbitals}
%
%In QMC the many-body trial wavefunction is expressed as the product of an antisymmetric part and a correlating Jastrow factor:
% \begin{equation}
%\Psi_T(\vec{R}) = \mathcal{A}(\vec{R}) \exp\left[\sum_i J_i(\vec{R})\right]
%\end{equation}
%
%Where $\Psi_T(\vec{R})$ is the trial wave function, $\vec{R}$ is a space spin coordinates, $J(\vec{R})$ the jastrow function and $\mathcal{A}(\vec{R})$  the antisymmetric wavefunction. $\mathcal{A}(\vec{R})$  is traditionally obtained from methods such as DFT, Hartree Fock, MCSCF or CI expansion.  Many trial-wavefunctions forms have been explored, but the most popular and effective general form remains the Slater Jastrow form
% \begin{equation}
%\Psi_T(\vec{R}) = \exp\left[\sum_i J_i(\vec{R})\right]\sum_k^M C_kD_k^{\uparrow}(\varphi)D_k^{\downarrow}(\varphi)
%\end{equation}
%Where $D_k^{\downarrow}(\varphi)$ is a slater determinant expressed in terms of single particle orbitals (SPO) $\varphi_i=\sum^{N_b}_l C_l ^i \Phi_l$ . The choice of SPO representation is crucial for QMC as the cost of computing $\Phi_l$ scales linearly with the number of basis functions evaluation.  The scaling grows with the system size and the total evaluation of the N SPOs scales as  $ \mathcal{O}(N)^3$ per Monte Carlo step. In the QMCPACK parallelization scheme, SPOs are stored in read only memory replicated on each node or GPU, limiting the size of the systems to the available memory per node.
%
%In real space QMC methods it is standard to use a real space b-spline scheme or a closely related method, due to the considerable speedup over plane-waves while retaining simple convergence properties. Use of atomic orbitals and Gaussians that include more physics or chemistry results in much more efficient basis sets, but gives up easy convergence properties.
%
%\subsubsection{B-splines}
%  3D tricubic B-splines provide a basis in which only
%64 elements are nonzero at any given point in space.
%The one-dimensional cubic B-spline is given by,
%\begin{equation}
%f(x) = \sum_{i'=i-1}^{i+2} b^{i'\!,3}(x)\,\,  p_{i'},
%\label{eq:SplineFunc}
%\end{equation}
%where $b^{i}(x)$ are $p_i$ the piecewise cubic polynomial basis functions
%and $i = \text{floor}(\Delta^{-1} x)$ is the index of
%the first grid point $\le x$.  Constructing a tensor product in each Cartesian
%direction, we can represent a 3D orbital as
%\begin{equation}
%  \phi_n(x,y,z) =
%  \!\!\!\!\sum_{i'=i-1}^{i+2} \!\! b_x^{i'\!,3}(x)
%  \!\!\!\!\sum_{j'=j-1}^{j+2} \!\! b_y^{j'\!,3}(y)
%  \!\!\!\!\sum_{k'=k-1}^{k+2} \!\! b_z^{k'\!,3}(z) \,\, p_{i', j', k',n}.
%\label{eq:TricubicValue}
%\end{equation}
%This allows the rapid evaluation of each orbital in constant time.
%Furthermore, this basis is systematically improvable with a single spacing
%parameter, so that accuracy is not compromised and convergence checks are simple.
%
%Trial wavefunctions for materials are commonly produced using plane wave codes such as Quantum Espresso. The conversion to real space b-splines is straightforward. Compared to directly evaluating Fourier series, b-splines are approximately one order of magnitude faster, with the speedup increasing with system size.
%
%\subsubsection{Linear Combination of Atomic Orbitals (LCAO)}

LCAO schemes use physical considerations to construct a highly
efficient basis set compared to plane waves. Typically only a few tens
of basis functions per atom are required compared to thousands of
plane-waves. Many forms of LCAO schemes exist and are being
implemented in QMCPACK. The details of the already implemented methods
will be described in the following section of the manual.

\noindent \textbf{Gaussian Trial Orbitals (GTOs):}
 The Gaussian basis functions follow a radial-angular decomposition
\begin{equation}
     \phi (\mathbf{r} )=R_{l}(r)Y_{lm}(\theta ,\phi )
\end{equation}
where $ Y_{{lm}}(\theta ,\phi )$ is a spherical harmonic, $l$ and $m$
are the angular momentum and its $z$ component, and $r,\theta ,\phi $
are spherical coordinates. In practice they are atom centered and the
$l$ expansion typically includes 1-3 additional channels compared to
the formally occupied states of the atom. e.g. 4-6 for a nickel atom with
occupied $s$, $p$, and $d$ electron shells.

The evaluation of GTOs within PBC differs slightly from evaluating
GTOs in Open Boundary Conditions (OBC).  The orbitals are evaluated at
a distance $r$ in the primitive cell (similar to OBC) and then the
contributions of the periodic images are added by evaluating the
orbital at a distance $r+T$ where T is a translation of the cell
lattice vector. This requires loops over the periodic images until the
contributions are orbitals $\Phi$. In the current implementation, the
number of periodic images is an input parameter named
\textit{PBCimages}, which takes three integers corresponding to the
number of periodic images along the supercell axes (X, Y and Z axes
for a cubic cell). By default these parameters are set to
\textit{PBCimages= 5 5 5} but they \textbf{require manual convergence
  checks}. Convergence checks can be performed by checking the total
energy convergence with respect to \textit{PBCimages}, similar to checks
performed for plane wave cutoff energy and b-spline grids. Use of
diffuse Gaussians may require these parameters to be increased, while
sharply localized Gaussians may permit a decrease. The cost of
evaluating the wavefunction increases sharply as \textit{PBCimages} is
increased. This input parameter will be replaced by a tolerance
factor and numerical screening in future.

\section{Generating and using periodic gaussian trial wavefunctions
  using PySCF}

Similar to any QMC calculation, using periodic GTOs requires the
generation of a periodic trial wavefunction. QMCPACK is currently
interfaced to PySCF which is a multipurpose electronic structure
written mainly in Python with key numerical functionality implemented
via optimized C and C++ libraries\cite{Sun2018}. Such a wavefunction
can be generated following the example for a 2x1x1 supercell,
below. Note that the current implementation and examples cover only
the use of k-points where symmetry allows real coefficients to be
used.  This allows calculation at $\Gamma$) and, e.g., some high
symmetry k-points at the Brillouin zone edges.  More general k-points
requiring complex coefficients will be supported in future releases.

\begin{lstlisting}[caption=Example PySCF input for single k-point calculation for a 2x1x1 Carbon supercell.]
#!/usr/bin/env python

import numpy
from pyscf.pbc import gto, scf, dft
from mpi4pyscf.pbc import df
from pyscf.pbc.tools.pbc import super_cell

nmp = [2, 1, 1]

cell = gto.Cell()

cell.a = '''
         3.37316115       3.37316115       0.00000000
         0.00000000       3.37316115       3.37316115
         3.37316115       0.00000000       3.37316115'''
cell.atom = '''
   C        0.00000000       0.00000000       0.00000000
   C        1.686580575      1.686580575      1.686580575
            '''
cell.basis='bfd-vtz'
cell.ecp = 'bfd'

cell.unit='B'
cell.drop_exponent=0.1

cell.verbose = 5
cell.build()


supcell = super_cell(cell, nmp)
mydf = df.FFTDF(supcell)
mydf.auxbasis = 'weigend'

mf = dft.RKS(supcell)
mf.xc = 'lda'

mf.exxdiv = 'ewald'
mf.with_df = mydf

e_scf=mf.kernel()


print 'e_scf',e_scf

kpts=[]
title="C_Diamond"
from PyscfToQmcpack import savetoqmcpack
savetoqmcpack(supcell,mf,title=title,kpts=kpts)

\end{lstlisting}

Note that the last 4 lines of the file
\begin{lstlisting}
kpts=[]
title="C_Diamond"
from PyscfToQmcpack import savetoqmcpack
savetoqmcpack(supcell,mf,title=title,kpts=kpts)
\end{lstlisting}

contains an empty list of k-points (since this is a gamma point
calculation) and a title. The title variable will be the name of the
HDF5 file where all the data needed by QMCPACK will be stored.  The
function \textit{savetoqmcpack} will be called at the end of the
calculation and will generate the HDF5 similarly to the non-periodic
PySCF calculation in section~\ref{sec:convert4qmc} (convert4qmc). The
function is distributed with QMCPACK and located in the
qmcpack/src/QMCTools directory under the name
\textit{PyscfToQmcpack.py}. In order for the script to work, you need
to specify the path to the file in your PYTHONPATH such as

\begin{lstlisting}
export PYTHONPATH=QMCPACK_PATH/src/QMCTools:$PYTHONPATH
\end{lstlisting}

When using multiple k-points, it is necessary to expand the kpoints into the equivalent supercell, adjust for the phase factor in the coefficient's value due to the translation by the lattice vector and order the molecular coefficients from each k-point according to their occupation. These operations are all automated in the \textit{savetoqmcpack()} function.\\

The following example corresponds to the same carbon system (2x1x1) however, in this case, we use a primitive simulation cell and a 2x1x1 kpoint mesh.

\begin{lstlisting}[caption=Example PySCF input for single k-point calculation for a 2x1x1 Carbon supercell.]
#!/usr/bin/env python

import numpy
from pyscf.pbc import gto, scf, dft,df
kmesh = [2, 1, 1]

cell = gto.Cell()
cell.a = '''
         3.37316115       3.37316115       0.00000000
         0.00000000       3.37316115       3.37316115
         3.37316115       0.00000000       3.37316115'''
cell.atom = '''
   C        0.00000000       0.00000000       0.00000000
   C        1.686580575      1.686580575      1.686580575
            '''
cell.basis='bfd-vtz'
cell.ecp = 'bfd'

cell.unit='B'
cell.drop_exponent=0.1

cell.verbose = 5

cell.build()

kpts = cell.make_kpts(kmesh)
kpts -= kpts[0]

mydf = df.GDF(cell,kpts)
mydf.auxbasis = 'weigend'
mf = scf.KRHF(supcell,kpts).density_fit()

mf.exxdiv = 'ewald'
mf.with_df = mydf
e_scf=mf.kernel()


title="C_Diamond-211"

from PyscfToQmcpack import savetoqmcpack
savetoqmcpack(supcell,mf,title=title,kpts=kpts,kmesh=kmesh)

\end{lstlisting}


Note the difference between the 2 input files where:\\
\begin{lstlisting}
kmesh=[2,1,1]  #k-point mesh
\end{lstlisting}

\begin{lstlisting}
kpts = cell.make_kpts(kmesh)
kpts -= kpts[0]
\end{lstlisting}
Will generate k-points centered around the $\Gamma$-point and will insure the molecular coefficients are real.\\

\begin{lstlisting}
mf = scf.KRHF(supcell,kpts).density_fit()
\end{lstlisting}
The Computational algorithm chosen in PySCF is \textit{KRHF} instead of \textit{RHF}.

Finally, to generate the HDF5 file needed by \qmcpack we call the \textit{savetoqmcpack} function\\
\begin{lstlisting}
from PyscfToQmcpack import savetoqmcpack
savetoqmcpack(supcell,mf,title=title,kpts=kpts,kmesh=kmesh)
\end{lstlisting}
In this call, we simply specify the k-point mesh used in order to force the converter to generate the desired cell. Note that if the parameter \textit{kmesh} is omitted, the converter will still try to ``guess'' it.


In order to generate QMCPACK input files, you will need to run for both cases \textit{convert4qmc} exactly as specified in section ~\ref{sec:convert4qmc};
\begin{lstlisting}
convert4qmc -pyscf C_Diamond.h5
\end{lstlisting}

This tool can be used with any option described in convert4qmc. Since
the HDF5 contains all the information needed, there is no need to
specify any other specific tag for periodicity.A supercell at
$\Gamma$-point or using multiple k-points will work without further
modification..

Running convert4qmv will generate 3 input files;\\
\begin{lstlisting}[caption=CDiamond.structure.xml. This file contains the geometry of the system.]
<?xml version="1.0"?>
<qmcsystem>
  <simulationcell>
    <parameter name="lattice">
  6.74632230000000e+00  6.74632230000000e+00  0.00000000000000e+00
  0.00000000000000e+00  3.37316115000000e+00  3.37316115000000e+00
  3.37316115000000e+00  0.00000000000000e+00  3.37316115000000e+00
</parameter>
    <parameter name="bconds">p p p</parameter>
    <parameter name="LR_dim_cutoff">15</parameter>
  </simulationcell>
  <particleset name="ion0" size="4">
    <group name="C">
      <parameter name="charge">4</parameter>
      <parameter name="valence">4</parameter>
      <parameter name="atomicnumber">6</parameter>
    </group>
    <attrib name="position" datatype="posArray">
  0.0000000000e+00  0.0000000000e+00  0.0000000000e+00
  1.6865805750e+00  1.6865805750e+00  1.6865805750e+00
  3.3731611500e+00  3.3731611500e+00  0.0000000000e+00
  5.0597417250e+00  5.0597417250e+00  1.6865805750e+00
</attrib>
    <attrib name="ionid" datatype="stringArray">
 C C C C
</attrib>
  </particleset>
  <particleset name="e" random="yes" randomsrc="ion0">
    <group name="u" size="8">
      <parameter name="charge">-1</parameter>
    </group>
    <group name="d" size="8">
      <parameter name="charge">-1</parameter>
    </group>
  </particleset>
</qmcsystem>
  \end{lstlisting}

  As one can see, that for both examples the 2 atom primitive cell has been expanded to contain 4 atoms in a 2x1x1 carbon cell.
\begin{lstlisting}[caption=CDiamond.wfj-Twist0.xml. This file contains the trial wavefunction.]
<?xml version="1.0"?>
<qmcsystem>
  <wavefunction name="psi0" target="e">
    <determinantset type="MolecularOrbital" name="LCAOBSet" source="ion0" transform="yes" twist="0  0  0" href="C_Diamond.h5" PBCimages="5  5  5">
      <slaterdeterminant>
        <determinant id="updet" size="8">
          <occupation mode="ground"/>
          <coefficient size="116" spindataset="0"/>
        </determinant>
        <determinant id="downdet" size="8">
          <occupation mode="ground"/>
          <coefficient size="116" spindataset="0"/>
        </determinant>
      </slaterdeterminant>
    </determinantset>
    <jastrow name="J2" type="Two-Body" function="Bspline" print="yes">
      <correlation size="10" speciesA="u" speciesB="u">
        <coefficients id="uu" type="Array"> 0 0 0 0 0 0 0 0 0 0</coefficients>
      </correlation>
      <correlation size="10" speciesA="u" speciesB="d">
        <coefficients id="ud" type="Array"> 0 0 0 0 0 0 0 0 0 0</coefficients>
      </correlation>
    </jastrow>
    <jastrow name="J1" type="One-Body" function="Bspline" source="ion0" print="yes">
      <correlation size="10" cusp="0" elementType="C">
        <coefficients id="eC" type="Array"> 0 0 0 0 0 0 0 0 0 0</coefficients>
      </correlation>
    </jastrow>
  </wavefunction>
</qmcsystem>
 \end{lstlisting}
This files contains information related to the trial wavefunction. It is identical to the input file from an Open Boundary Conditions calculation to the exception of the following tags:\\
\begin{table}[h]
\begin{center}
\begin{tabularx}{\textwidth}{l l l l l }
\hline
\multicolumn{5}{l}{*.wfj.xml specific tags} \\
\hline
%\multicolumn{2}{l}{Outputfiles}  & \multicolumn{3}{l}{}\\
   &   \bfseries tag     & \bfseries tag type & \bfseries default   & \bfseries description \\
   &   \texttt{twist             } &  3 doubles  & Gamma ( 0 0 0)& coordinate of the twist to compute\\
   &   \texttt{href             } &  string  & default& name of the HDF5 file generated by\\
   &                              &          &        &  PySCF and used for convert4qmc\\
   &   \texttt{PBCimages            } &  3 Integer   & 5 5 5  & Number of periodic images to evaluate the orbitals\\
    \hline
    \end{tabularx}
\end{center}
\end{table}

Other files containing QMC methods (such as optimization, VMC and DMC blocks) will be generated and will behave in a similar fashion regardless of the type of SPO in the trial wavefunction.