\documentclass[12pt,a4paper]{article}
\def\version{CVS}

\usepackage{epsfig}
\usepackage{html}
%\def\htmladdnormallink#1#2{#1}

\begin{document}

\author{}
\date{}
\title{
% PWscf and Democritos logos, raise the latter to align
\epsfig{figure=pwscf,width=4cm}\hfill%
\raisebox{0.5cm}{\epsfig{figure=democritos,width=8cm}}
\vspace{1.5cm}
\\
% title
\huge User's Guide for Quantum-ESPRESSO v.\version
}
\maketitle

\tableofcontents

\clearpage

\section{Introduction}

This guide covers the installation and usage of Quantum ($\nu$)
ESPRESSO (opEn-Source Package for Research in Electronic Structure,
Simulation, and Optimization), version \version. The $\nu-$ESPRESSO
package contains the
following codes for the calculation of electronic-structure properties
within Density-Functional Theory, using a Plane-Wave basis set and
pseudopotentials:
\begin{itemize}
\item PWscf (Plane-Wave Self-Consistent Field).
\item FPMD (First Principles Molecular Dynamics).
\item CP (Car-Parrinello).
\end{itemize}
It also contains the following auxiliary codes:
\begin{itemize}
\item PWgui (Graphical User Interface for PWscf): a graphical
interface for producing input data files for PWscf.
\item atomic: a program for atomic calculations and generation of
pseudopotentials.
\end{itemize}
Documentation, in addition to what is provided in this guide, can be
found in:
\begin{itemize}
\item the \texttt{Doc/} directory of the $\nu-$ESPRESSO distribution

In particular the \texttt{INPUT\_*} files contain the detailed
listing of available input variables and cards.
\item the various \texttt{README} files found in the distribution
\item the PWscf web site
(\htmladdnormallink{\texttt{http://www.pwscf.org/}}%
{http://www.pwscf.org/})
\item the Pw\_forum mailing list
(\htmladdnormallink{\texttt{pw\_forum@pwscf.org}}%
{mailto:pw_forum@pwscf.org})

See the PWscf web site for instructions on how to subscribe
and how to browse and search the archives of the mailing list.
Please search the archives before posting to the list: your
question might already have been answered.
\item the ``Scientific Software'' page of the Democritos web site
\hfill\break
(\htmladdnormallink%
{\texttt{http://www.democritos.it/scientific.php}}%
{http://www.democritos.it/scientific.php})
\end{itemize}
The $\nu-$ESPRESSO codes work on many different types of Unix machines,
including parallel machines using the Message Passing Interface (MPI).
Running $\nu-$ESPRESSO on MS-Windows is possible, but not supported:
see section \ref{installation}, ``Installation''.

\subsection{Codes}

PWscf can currently perform the following kinds of calculations:

\begin{itemize}
\item ground-state energy and one-electron (Kohn-Sham) orbitals
\item atomic forces, stresses, and structural optimization
\item molecular dynamics on the ground-state Born-Oppenheimer
surface, also with variable cell
\item Nudged Elastic Band (NEB) and Fourier String Method Dynamics (SMD)
for energy barriers and reaction paths
\item phonon frequencies and eigenvectors at a generic wave vector,
using Density-Functional Perturbation Theory
\item effective charges and dielectric tensors
\item electron-phonon interaction coefficients for metals
\item interatomic force constants in real space
\item third-order anharmonic phonon lifetimes
\item infrared and (nonresonant) Raman cross sections
\item macroscopic polarization via Berry Phase
\end{itemize}
All of the above work for both insulators and metals, in any crystal
structure, for many exchange-correlation functionals (including spin
polarization), for both norm-conserving (Hamann-Schl\"uter-Chiang)
pseudopotentials in separable form, and, with very few exceptions,
for Ultrasoft (Vanderbilt) pseudopotentials. Noncollinear
magnetism and spin-orbit interactions are also implemented, although
at an experimental stage. Various postprocessing programs are
available.

FPMD and CP can currently perform the following kinds of calculations:

\begin{itemize}
\item Car-Parrinello molecular dynamics simulation
\item geometry optimization by damped dynamics
\item constant-temperature simulation with Nos\'e thermostats
\item variable-cell (Parrinello-Rahman) dynamics
\item Nudged Elastic Band (NEB) for energy barriers and reaction
paths
\item String Method Dynamics (in real space) (CP only)
\item dynamics with Wannier functions (CP only)
\end{itemize}
Spin-polarized calculations and (for FPMD only) calculations with
multiple k-points can be performed.
CP works with both norm-conserving and Ultrasoft pseudopotentials,
while FPMD is currently limited to norm-conserving ones.
The restart files of the two programs are compatible: you can run FPMD
with a restart file from CP, and vice versa.

\subsection{People}

\hyphenation{gian-noz-zi}
The maintenance and further development of the $\nu-$ESPRESSO code is
promoted by the DEMOCRITOS National Simulation Center of the Italian
INFM
(\htmladdnormallink{\texttt{http://www.democritos.it/}}%
{http://www.democritos.it/})
under the coordination of Paolo Giannozzi
(\htmladdnormallink{\texttt{giannozz@nest.sns.it}}%
{mailto:giannozz@nest.sns.it})
(Scuola Normale Superiore, Pisa), with the strong support of the
CINECA National Supercomputing Center in Bologna
(\htmladdnormallink{\texttt{http://www.cineca.it/}}%
{http://www.cineca.it/}),
under the responsibility of Carlo Cavazzoni\break
(\htmladdnormallink{\texttt{c.cavazzoni@cineca.it}}%
{mailto:c.cavazzoni@cineca.it}).
Currently active developers include
Gerardo Ballabio (CINECA),
Stefano Fabris, Adriano Mosca Conte, Carlo Sbraccia
(SISSA, Trieste), and
Anton Kokalj (Jo\v{z}ef Stefan Institute, Ljubljana).

The PWscf package was originally developed by Stefano Baroni, Stefano
de Gironcoli, Andrea Dal Corso (SISSA), Paolo Giannozzi, and others.
The web site for PWscf and related codes is:
\htmladdnormallink{\texttt{http://www.pwscf.org/}}%
{http://www.pwscf.org/}

The FPMD and CP codes are both based on the original code written by
Roberto Car and Michele Parrinello.

FPMD was developed by
Carlo Cavazzoni, Gerardo Ballabio (CINECA),
Sandro Scandolo (ICTP, Trieste),
Guido Chiarotti (SISSA),
Paolo Focher,
and others.

CP was developed by
Alfredo Pasquarello (IRRMA, Lausanne),
Kari Laasonen (Oulu),
Andrea Trave (LLNL),
Roberto Car (Princeton),
Nicola Marzari (MIT),
Paolo Giannozzi,
and others.

PWgui was written by Anton Kokalj and is based on his GUIB concept
(\htmladdnormallink{\texttt{http://www-k3.ijs.si/kokalj/guib/}}%
{http://www-k3.ijs.si/kokalj/guib/}).

The pseudopotential generation package ``atomic'' was written by
Andrea Dal Corso and is the result of many additions to the
original code by Paolo Giannozzi.

A list of further contributors includes:
Dario Alf\`e,
Francesco Antoniella,
Mauro Boero,
Claudia Bungaro,
Paolo Cazzato,
Gabriele Cipriani,
Matteo Cococcioni,
Alberto Debernardi,
Gernot Deinzer,
Oswaldo Dieguez,
Guido Fratesi,
Ralph Gebauer,
Martin Hilgeman,
Yosuke Kanai,
Axel Kohlmeyer,
Konstantin Kudin,
Michele Lazzeri,
Kurt Maeder,
Francesco Mauri,
Nicolas Mounet,
Pasquale Pavone,
Mickael Profeta,
Guido Roma,
Manu Sharma,
Alexander Smogunov,
Kurt Stokbro,
Pascal Thibaudeau,
Antonio Tilocca,
Renata Wentzcovitch,
Yudong Wu,
Xiaofei Wang,
and let us apologize to everybody we have forgotten.

This guide was written (mostly) by Paolo Giannozzi, Gerardo Ballabio,
and Carlo Cavazzoni.

\subsection{Terms of use}

$\nu-$ESPRESSO is free software, released under the GNU General Public
License
(\htmladdnormallink{\texttt{http://www.pwscf.org/License.txt}}%
{http://www.pwscf.org/License.txt},
or the file \texttt{License} in the distribution).

All trademarks mentioned in this guide belong to their respective
owners.

We would greatly appreciate it if scientific work done using this code
contained an explicit acknowledgment and a reference to the
$\nu-$ESPRESSO web page.
Our preferred form for the acknowledgment is the following:

\begin{quote}
\emph{Acknowledgments:}
\par\noindent
Calculations in this work have been done using the $\nu-$ESPRESSO package
[\emph{ref}].
\par\noindent
\emph{Bibliography:}
\par\noindent
[\emph{ref}]
S.~Baroni, A.~Dal Corso, S.~de Gironcoli, P.~Giannozzi, % PWscf
C.~Cavazzoni, G.~Ballabio, S.~Scandolo, G.~Chiarotti, P.~Focher, % FPMD
A.~Pasquarello, K.~Laasonen, A.~Trave, R.~Car, N.~Marzari, % CP
A.~Kokalj, % PWgui
\texttt{http://www.pwscf.org/}.
\end{quote}

\clearpage

\section{Installation}
\label{installation}

Presently, the $\nu-$ESPRESSO package is only distributed in source form;
some precompiled executables (binary files) are provided only for
PWgui.
Providing binaries for $\nu-$ESPRESSO would require too much effort and
would work only for a small number of machines anyway.

To install $\nu-$ESPRESSO, you need working C and Fortran 95 compilers
(Fortran 90 is not sufficient, but most ``Fortran 90'' compilers
are actually Fortran 95 compliant). You will also need basic Unix
facilities: a shell, and the \texttt{make} and \texttt{awk} utilities.
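
As a quick sanity check before configuring, you can verify that these
tools are found in your \texttt{PATH}. This is only a sketch: the
compiler names \texttt{cc} and \texttt{f95} are assumptions, to be
replaced with the names actually used on your system.
\begin{verbatim}
which make awk cc f95
\end{verbatim}
Any tool that is not reported must be installed, or added to your
\texttt{PATH}, before you proceed.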

The latest stable release of the $\nu-$ESPRESSO source package (currently
version \version) can be downloaded from this URL:
\medskip

\htmladdnormallink{\texttt{http://www.pwscf.org/download.htm}}%
{http://www.pwscf.org/download.htm}
\medskip

\noindent
To uncompress and unpack it, move it to an empty directory of your
choice, \texttt{cd} to that directory, and run the command:
\medskip

\texttt{tar zxvf pw.\version.tgz}
\medskip

\noindent
If your version of \texttt{tar} doesn't recognize the \texttt{z} flag,
use this instead:
\medskip

\texttt{gunzip -c pw.\version.tgz | tar xvf -}
\medskip

\noindent
The bravest may access the (unstable) development version via anonymous
CVS (Concurrent Versions System): see the file \texttt{README.cvs}
contained in the distribution.

To install $\nu-$ESPRESSO, you must:
\begin{enumerate}
\item configure the source package for your system, compilers and
libraries;
\item compile some or all of the executables in the package.
\end{enumerate}
For the impatient:
\begin{verbatim}
./configure
make all
\end{verbatim}
Executable programs (actually, symlinks to them) will be placed in the
\texttt{bin/} directory.

If you have problems or would like to tweak the default settings, read
the detailed instructions below.

\subsection{Configure}

To configure the $\nu-$ESPRESSO source package, run the \texttt{configure}
script. It will (try to) detect compilers and libraries available on
your machine, and set up things accordingly.
Presently it is expected to work on most Linux 32- and 64-bit (Itanium
and Opteron) PCs and clusters, IBM SP machines, SGI Origin, some
HP-Compaq Alpha machines, Cray X1, and Mac OS X. It may also work, with some
assistance, on other architectures (see below).

Cross-compilation is theoretically supported, but has never been
tested; you have to specify the target machine with the
\texttt{--host} option (see below).

Specifically, \texttt{configure} generates the following files:
\begin{quote}
\texttt{make.sys}: compilation settings and flags\\
\texttt{make.rules}: compilation rules\\
\texttt{*/make.depend}: dependencies, per source directory
\end{quote}

\texttt{make.depend} files are actually generated by the
\texttt{makedeps.sh} shell script, which \texttt{configure} invokes.
If you modify the program sources, you might have to rerun it.

You should always be able to compile the $\nu-$ESPRESSO suite of programs
without having to edit any of the generated files. However you may
have to tune \texttt{configure} by specifying appropriate environment
variables and/or command-line options.
Usually the trickiest part is to get external libraries recognized
and used: see section \ref{libraries}, ``Libraries'', for details and
hints.

Environment variables may be set in any of these ways:
\begin{verbatim}
export VARIABLE=value          # sh, bash, ksh
./configure

setenv VARIABLE value          # csh, tcsh
./configure

./configure VARIABLE=value     # any shell
\end{verbatim}
Some environment variables that are relevant to \texttt{configure} are:
\begin{quote}
\texttt{ARCH}:
label identifying the machine type (see below)\\
\texttt{F90}, \texttt{F77}, \texttt{CC}:
names of Fortran 95, Fortran 77, and C compilers\\
\texttt{CPP}:
source file preprocessor (defaults to \texttt{\$CC -E})\\
\texttt{LD}: linker (defaults to \texttt{\$F90})\\
\texttt{CFLAGS}, \texttt{FFLAGS}, \texttt{F90FLAGS},
\texttt{CPPFLAGS}, \texttt{LDFLAGS}:
compilation flags\\
\texttt{LIBDIRS}:
extra directories to search for libraries (see below)
\end{quote}
For example, the following command line:
\begin{verbatim}
./configure F90=ifort FFLAGS="-Vaxlib -O2 -assume byterecl" \
            CC=gcc CFLAGS=-O3 LDFLAGS="-Vaxlib -static"
\end{verbatim}
instructs \texttt{configure} to use \texttt{ifort} as the Fortran 95
compiler with flags \texttt{"-Vaxlib -O2 -assume byterecl"},
\texttt{gcc} as the C compiler with flags \texttt{"-O3"}, and to use flags
\texttt{"-Vaxlib -static"} when linking. Note that the values
of \texttt{FFLAGS} and \texttt{LDFLAGS} must be quoted, because they
contain spaces.

If your machine type is unknown to \texttt{configure}, you may use the
\texttt{ARCH} variable to suggest an architecture among the supported
ones. Try the one that looks most similar to your machine type
(you'll probably have to do some additional tweaking).
Currently supported architectures are:
\begin{quote}
\texttt{linux64}: Linux 64-bit machines (Itanium, Opteron)\\
\texttt{linux32}: Linux PCs\\
\texttt{aix}: IBM AIX machines\\
\texttt{mips}: SGI MIPS machines\\
\texttt{alpha}: HP-Compaq alpha machines\\
\texttt{sparc}: Sun SPARC machines\\
\texttt{crayx1}: Cray X1 machines\\
\texttt{mac}: Apple PowerPC running Mac OS X
\end{quote}
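For instance, on a 64-bit Linux machine that \texttt{configure} fails
to recognize, one might try the following (a sketch only: whether this
label really fits your machine is an assumption that you have to
verify by compiling):
\begin{verbatim}
./configure ARCH=linux64
\end{verbatim}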
Finally, \texttt{configure} recognizes the following command-line
options:
\begin{quote}
\texttt{--disable-parallel}:
compile serial code, even if a parallel environment is available\\
\texttt{--disable-shared}:
don't use shared libraries: generate static executables\\
\texttt{--enable-shared}:
use shared libraries\\
\texttt{--host=}\emph{target}:
specify target machine for cross-compilation.\break
\emph{Target} must be a string identifying the architecture that
you want to compile for; you can obtain it by running
\texttt{config.guess} on the target machine.
\end{quote}
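Options can be combined. For example, to build serial, statically
linked executables on a machine where a parallel environment is
installed, something like the following should do (a sketch using only
the options listed above; adapt it to your needs):
\begin{verbatim}
./configure --disable-parallel --disable-shared
\end{verbatim}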
If you want to modify the \texttt{configure} script (advanced users
only!), you'll need GNU Autoconf
(\htmladdnormallink{\texttt{http://www.gnu.org/software/autoconf/}}%
{http://www.gnu.org/software/autoconf/}).
Edit the source file \texttt{configure.ac}, then run Autoconf to
regenerate \texttt{configure}. If you edit \texttt{configure}
directly, all changes will be lost when you regenerate it.
You may also want to edit \texttt{make.sys.in} and
\texttt{make.rules.in}.
For more information, see \texttt{README.configure}.
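The regeneration step might look like this, assuming that Autoconf is
installed and that the commands are run from the directory containing
\texttt{configure.ac} (an assumption: check \texttt{README.configure}
for the actual procedure):
\begin{verbatim}
autoconf       # regenerates configure from configure.ac
./configure    # rerun the regenerated script
\end{verbatim}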

\subsubsection{Libraries}
\label{libraries}

$\nu-$ESPRESSO makes use of the following external libraries:
\begin{itemize}
\item BLAS
(\htmladdnormallink{\texttt{http://www.netlib.org/blas/}}%
{http://www.netlib.org/blas/})
and LAPACK\hfill\break
(\htmladdnormallink{\texttt{http://www.netlib.org/lapack/}}%
{http://www.netlib.org/lapack/})
for linear algebra
\item FFTW
(\htmladdnormallink{\texttt{http://www.fftw.org/}}%
{http://www.fftw.org/})
for Fast Fourier Transforms
\end{itemize}
A copy of the needed routines is provided with the distribution.
However, when available, optimized vendor-specific libraries can be
used instead: this often yields huge performance gains.

$\nu-$ESPRESSO can use the following architecture-specific replacements for
BLAS and LAPACK:
\begin{quote}
\texttt{essl} for IBM machines\\
\texttt{complib.sgimath} for SGI Origin\\
\texttt{scilib} for Cray/T3e\\
\texttt{sunperf} for Sun\\
\texttt{MKL} for Intel Linux PCs\\
\texttt{ACML} for AMD Linux PCs\\
\texttt{cxml} for HP-Compaq Alphas.
\end{quote}
If none of these is available, we suggest that you use the optimized
ATLAS library
(\htmladdnormallink{\texttt{http://math-atlas.sourceforge.net/}}%
{http://math-atlas.sourceforge.net/}).
Note that ATLAS is not a complete replacement for LAPACK: it contains
all of the BLAS, plus the LU code, plus the full storage Cholesky
code. Follow the instructions in the ATLAS distribution to produce a
full LAPACK replacement.

Axel Kohlmeyer maintains a set of ATLAS libraries,
containing all of LAPACK and no external reference to Fortran
libraries:\hfill\break
\htmladdnormallink%
{{\small\texttt{http://www.theochem.rub.de/\~{}axel.kohlmeyer/%
cpmd-linux.html\#atlas}}}%
{http://www.theochem.rub.de/~axel.kohlmeyer/cpmd-linux.html\#atlas}

Sergei Lisenkov reported success and good performance with the
optimized BLAS by Kazushige Goto.
They can be downloaded freely (but not redistributed!) from:
\htmladdnormallink%
{\texttt{http://www.cs.utexas.edu/users/flame/goto/}}%
{http://www.cs.utexas.edu/users/flame/goto/}

The FFTW library can also be replaced by vendor-specific FFT
libraries, when available, or you can link to a precompiled FFTW
library. Please note that you must use FFTW version 2. Support for
version 3 is in progress: contact the developers if you want to try.

The \texttt{configure} script attempts to find optimized libraries,
but may fail if they have been installed in non-standard places.
You should examine the final value of \texttt{LIBS} (either in the
output of \texttt{configure}, or in the generated \texttt{make.sys})
to check whether it found all the libraries that you intend to use.
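One quick way to inspect the generated file is sketched below; it
simply assumes that the relevant settings appear in \texttt{make.sys}
on lines containing the string \texttt{LIBS}:
\begin{verbatim}
grep LIBS make.sys
\end{verbatim}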

If any libraries weren't found, you can specify a list of directories
to search in the environment variable \texttt{LIBDIRS}, and rerun
\texttt{configure}; directories in the list must be separated by
spaces. For example:
\begin{verbatim}
./configure LIBDIRS="/opt/intel/mkl70/lib/32 /usr/lib/math"
\end{verbatim}
If this still fails, you may set the environment variable
\texttt{LIBS} manually and retry. For example:
\begin{verbatim}
./configure LIBS="-L/usr/lib/math -lfftw -lf77blas -latlas"
\end{verbatim}
Beware that in this case, you must specify \emph{all} the libraries
that you want to link to. \texttt{configure} will blindly accept the
specified value, and won't search for any extra libraries. (This is
so that if \texttt{configure} finds any library that you don't want to
use, you can override it.)

If you want to use a precompiled FFTW library, the corresponding
\texttt{fftw.h} include file is also required.
If \texttt{configure} wasn't able to find it, you may specify its
location in the \texttt{INCLUDEFFTW} environment variable.
For example:
\begin{verbatim}
./configure INCLUDEFFTW="/usr/lib/fftw-2.1.3/fftw"
\end{verbatim}
If everything else fails, you'll have to write the \texttt{make.sys}
file manually: see section \ref{manualconf}, ``Manual configuration''.

\paragraph{Please Note:}

If you change any settings after a previous (successful or failed)
compilation, you must run \texttt{make clean} before recompiling,
unless you know exactly which routines are affected by the changed
settings and how to force their recompilation.
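For example, after deciding to point \texttt{configure} at a different
library directory, a clean rebuild might look like this (a sketch: the
\texttt{LIBDIRS} value is purely illustrative):
\begin{verbatim}
make clean
./configure LIBDIRS="/usr/lib/math"
make all
\end{verbatim}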

\subsubsection{Manual configuration}
\label{manualconf}

To configure $\nu-$ESPRESSO manually, you have to write working
\texttt{make.sys} and \texttt{make.rules} files, and generate the
\texttt{*/make.depend} files yourself.

For \texttt{make.sys}, several templates (each for a different machine
type) to start with are provided in the \texttt{install/} directory:
they have names of the form \texttt{Make.}\emph{system}, where
\emph{system} is a string identifying the architecture and compiler.
Currently available systems are:
\begin{quote}
\texttt{alpha}: HP-Compaq alpha workstations\\
\texttt{alphaMPI}: HP-Compaq alpha parallel machines\\
\texttt{altix}: SGI Altix 350/3000 with Linux, Intel compiler\\
\texttt{cygwin}: Windows PC, Intel compiler (see below)\\
\texttt{fujitsu}: Fujitsu vector machines\\
\texttt{hitachi}: Hitachi SR8000\\
\texttt{hp}: HP PA-RISC workstations\\
\texttt{hpMPI}: HP PA-RISC parallel machines\\
\texttt{ia64}: HP Itanium workstations\\
\texttt{irix}: SGI workstations\\
\texttt{pc\_abs}: Linux PCs, Absoft compiler\\
\texttt{pc\_lahey}: Linux PCs, Lahey compiler\\
\texttt{pc\_pgi}: Linux PCs, Portland compiler\\
\texttt{sun}: Sun workstations\\
\texttt{sunmpi}: Sun parallel machines\\
\texttt{sxcross}: NEC SX-6\\
\texttt{t3e}: Cray T3E
\end{quote}
The \texttt{install/} directory also contains files \texttt{Rules.cpp}
and \texttt{Rules.nocpp}, which are templates for \texttt{make.rules}.
The former is to be used with Fortran compilers that support
the preprocessing of source files; otherwise you must use the latter.
They'll usually work without further editing.

To select the appropriate templates, you can run:
\medskip

\texttt{./configure.old} \emph{system}
\medskip

\noindent
where \emph{system} is the best match to your configuration;
\texttt{configure.old} with no arguments prints the up-to-date list of
available systems.

That will copy \texttt{Make.}\emph{system} to \texttt{make.sys}, and
one of the \texttt{Rules.*} files to \texttt{make.rules}; it will usually
pick the right one.
In addition, it'll run the \texttt{makedeps.sh} script to generate
\texttt{*/make.depend} files.
(If you don't run the \texttt{configure.old} script, you'll have to do
that yourself.)
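
As an illustration, on a Linux PC with the Portland compiler (an
assumed setup, chosen only because it appears in the list above), the
two steps would be:
\begin{verbatim}
./configure.old            # prints the list of available systems
./configure.old pc_pgi     # selects the templates for that system
\end{verbatim}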

Most probably (and even more so if there isn't an exact match to your
machine type), you'll have to tweak \texttt{make.sys} by hand until
you obtain successful compilation.
In particular, you must specify the full list of libraries that
you intend to link to.
You'll also have to set the \texttt{MYLIB} variable to:
\begin{quote}
\texttt{blas\_and\_lapack} to compile BLAS and LAPACK from source;\\
\texttt{lapack\_mkl} to use the Intel MKL library;\\
\texttt{lapack\_t3e} to use the LAPACK for Cray T3E;\\
otherwise, leave it empty.
\end{quote}
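In \texttt{make.sys} this is an ordinary \texttt{make} variable
assignment. A minimal sketch, assuming that the template you started
from uses the usual \texttt{NAME = value} syntax, for compiling BLAS
and LAPACK from source:
\begin{verbatim}
MYLIB = blas_and_lapack
\end{verbatim}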

\paragraph{Note for HP PA-RISC users:}

The Makefile for HP PA-RISC workstations and parallel machines is
based on a Makefile contributed by Sergei Lysenkov.
It assumes that you have the HP compiler with MLIB libraries installed on
a machine running HP-UX.

\paragraph{Note for MS-Windows users:}

The Makefile for Windows PCs is based on a Makefile written for an
earlier version of PWscf (1.2.0), contributed by Lu Fu-Fa, CCIT,
Taiwan.
Since there have been many changes to the installation procedure, the
provided Makefile, which has never been tested, may not work.
You will need the Cygwin package (a Unix environment for PCs that runs
under Windows).
The provided Makefile assumes that you have the Intel compiler with
MKL libraries installed.

Another possibility is to install Linux, either in dual-boot mode, or
running from a CD-ROM. You will need to create a partition for Linux
and to install a boot loader (LILO, GRUB). The latter step is not
necessary if you boot from CD-ROM. The former step could also be
avoided in principle (distributions like Knoppix run directly from the
CD-ROM) but for serious use you will need to have disk access.

\subsection{Compile}

There are a few adjustable parameters in
\texttt{Modules/parameters.f90}.
The present values will work for most cases. All other variables are
dynamically allocated: you do not need to recompile your code for a
different system.

At your option, you may compile the complete $\nu-$ESPRESSO suite of
programs (with \texttt{make all}), or only some specific programs.

\texttt{make} with no arguments yields a list of valid compilation
targets.
Here is a list:

\begin{itemize}
\item
\texttt{make pw} produces \texttt{PW/pw.x} and
\texttt{PW/memory.x}.

\texttt{pw.x} calculates electronic structure, structural
optimization, molecular dynamics, barriers with NEB.
\texttt{memory.x} is an auxiliary program that checks the input of
\texttt{pw.x} for correctness and yields a rough (under-) estimate
of the required memory.
\item
\texttt{make ph} produces \texttt{PH/ph.x}.

\texttt{ph.x} calculates phonon frequencies and displacement
patterns, dielectric tensors, effective charges (uses data
produced by \texttt{pw.x}).
\item
\texttt{make d3} produces \texttt{D3/d3.x}.

\texttt{d3.x} calculates anharmonic phonon lifetimes (third-order
derivatives of the energy), using data produced by \texttt{pw.x}
and \texttt{ph.x}.
\item
\texttt{make gamma} produces \texttt{Gamma/phcg.x}.

\texttt{phcg.x} is a version of \texttt{ph.x} that calculates
phonons at $\mathbf{q}=0$ using conjugate-gradient minimization of
the density functional expanded to second order.
Only the $\Gamma$ ($\mathbf{q}=0$) point is used for Brillouin
zone integration.
It is faster and takes less memory than \texttt{ph.x}, but does
not support Ultrasoft pseudopotentials.
\item
\texttt{make raman} produces \texttt{Raman/ram.x}.

\texttt{ram.x} calculates nonresonant Raman tensor coefficients
(derivatives of the polarizability with respect to atomic displacements)
using the $(2n+1)$ theorem.
\item
\texttt{make pp} produces several codes for data postprocessing, in
\texttt{PP/} (see list below).
\item
\texttt{make tools} produces several utility programs, mostly for
phonon calculations, in \texttt{pwtools/} (see list below).
\item
\texttt{make pwcond} produces \texttt{PWCOND/pwcond.x}, for
ballistic conductance calculations (experimental).
\item
\texttt{make pwall} produces all of the above.
\item
\texttt{make ld1} produces the code \texttt{atomic/ld1.x} for
pseudopotential generation (see the specific
documentation in \texttt{atomic\_doc/}).
\item
\texttt{make upf} produces utilities for pseudopotential
conversion in directory \texttt{upftools/} (see section
\ref{pseudopotentials}, ``Pseudopotentials'').
\item
\texttt{make cp} produces the Car-Parrinello code CP in
\texttt{CPV/cp.x}.
\item
\texttt{make fpmd} produces the Car-Parrinello code FPMD in
\texttt{CPV/fpmd.x} and the postprocessing code
\texttt{CPV/fpmdpp.x}.
\item
\texttt{make all} produces all of the above.
\end{itemize}
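For instance, to build only the PWscf and phonon codes and then check
the result, something along these lines should work (a sketch: the
\texttt{ls} step merely relies on the symlinks that are placed in the
\texttt{bin/} directory):
\begin{verbatim}
make pw
make ph
ls bin/
\end{verbatim}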
For the setup of the GUI, refer to the
\texttt{PWgui-}\emph{X.Y.Z}\texttt{/INSTALL} file, where \emph{X.Y.Z}
stands for the version number of the GUI (presently 0.6.2).
If you are using the CVS sources, then see the \texttt{GUI/README}
file instead.

The codes for data postprocessing in \texttt{PP/} are:
\begin{itemize}
\item \texttt{pp.x} extracts the specified data from files
produced by \texttt{pw.x} for further processing
\item \texttt{bands.x} extracts eigenvalues from files produced
by \texttt{pw.x} for band structure plotting
\item \texttt{projwfc.x} calculates projections of wavefunctions
onto atomic orbitals, performs L\"owdin population
analysis and calculates projected density of states
\item \texttt{chdens.x} plots data produced by \texttt{pp.x},
writing them into a format that is suitable for several
plotting programs
\item \texttt{plotrho.x} reads the output of \texttt{chdens.x},
produces PostScript 2-d contour plots
\item \texttt{plotband.x} reads the output of \texttt{bands.x},
produces band structure PostScript plots
\item \texttt{average.x} calculates planar averages of
potentials
\item \texttt{voronoy.x} divides the charge density into Voronoi
polyhedra (obsolete, use at your own risk)
\item \texttt{dos.x} calculates electronic Density of States
(DOS).
\item \texttt{pw2wan.x}: interface with code WanT for calculation of
transport properties via Wannier (also known as Boys)
functions: see\hfill\break
\htmladdnormallink%
{\texttt{http://www.wannier-transport.org/}}%
{http://www.wannier-transport.org/}
\item \texttt{pw2casino.x}: interface with CASINO code for Quantum
Monte Carlo calculation
(\htmladdnormallink%
{\texttt{http://www.tcm.phy.cam.ac.uk/\~{}mdt26/casino.html}}%
{http://www.tcm.phy.cam.ac.uk/~mdt26/casino.html}).
\end{itemize}

The utility programs in \texttt{pwtools/} are:
\begin{itemize}
\item \texttt{dynmat.x} calculates LO-TO splitting at
$\mathbf{q}=0$ in insulators, IR cross sections, from the
dynamical matrix produced by \texttt{ph.x}
\item \texttt{q2r.x} calculates Interatomic Force Constants in
real space from dynamical matrices produced by
\texttt{ph.x} on a regular \textbf{q}-grid
\item \texttt{matdyn.x} produces phonon frequencies at a generic
wave vector using the Interatomic Force Constants
calculated by \texttt{q2r.x}; may also calculate phonon
DOS
\item \texttt{fqha.x} for quasi-harmonic calculations
\item \texttt{lambda.x} calculates the electron-phonon coefficient
$\lambda$ and the function $\alpha^2F(\omega)$
\item \texttt{dist.x} calculates distances and angles between
atoms in a cell, taking into account periodicity
\item \texttt{ev.x} fits energy-vs-volume data to an equation of
state
\item \texttt{kpoints.x} produces lists of k-points
\item \texttt{pwi2xsf.sh}, \texttt{pwo2xsf.sh} process
respectively input and output files (not data files!) for
\texttt{pw.x} and produce an XSF-formatted file suitable
for plotting with XCrySDen, a powerful crystalline and
molecular structure visualization program
(\texttt{http://www.xcrysden.org/}).
BEWARE: the \texttt{pwi2xsf.sh} shell script requires the
\texttt{pwi2xsf.x} executable to be located somewhere in
your \texttt{\$PATH}.
\item \texttt{band\_plot.x}: undocumented and possibly obsolete
\item \texttt{bs.awk}, \texttt{mv.awk} are scripts that process
the output of \texttt{pw.x} (not data files!).
Usage:
\begin{verbatim}
awk -f bs.awk < my-pw-file > myfile.bs
awk -f mv.awk < my-pw-file > myfile.mv
\end{verbatim}
The files so produced are suitable for use with
\texttt{xbs}, a very simple X-windows utility to display
molecules, available at:\hfill\break
\htmladdnormallink%
{\texttt{http://www.ccl.net/cca/software/X-WINDOW/xbsa/README.shtml}}%
{http://www.ccl.net/cca/software/X-WINDOW/xbsa/README.shtml}
\end{itemize}

\subsection{Run examples}
|
|
\label{runexamples}
|
|
|
|
As a final check that compilation was successful, you may want to run
|
|
some or all of the examples contained within the \texttt{examples}
|
|
directory of the $\nu-$ESPRESSO distribution.
|
|
Those examples try to exercise all the programs and features of the
|
|
$\nu-$ESPRESSO package: for details, see the \texttt{README} file in each
|
|
example's directory.
|
|
If you find that any relevant feature isn't being tested, please
|
|
contact us (or even better, write and send us a new example
|
|
yourself!).
|
|
|
|
If you haven't downloaded the full $\nu-$ESPRESSO distribution and don't
|
|
have the examples, you can get them from the Test and Examples Page of
|
|
the $\nu-$ESPRESSO web site
|
|
(\htmladdnormallink{\texttt{http://www.pwscf.org/tests.htm}}%
|
|
{http://www.pwscf.org/tests.htm}).
|
|
The necessary pseudopotentials are included.
|
|
|
|
To run the examples, you should follow this procedure:
|
|
|
|
\begin{enumerate}
|
|
\item
|
|
Go to the \texttt{examples} directory and edit the
|
|
\texttt{environment\_variables} file, setting the following variables
|
|
as needed:
|
|
\begin{quote}
|
|
\texttt{BIN\_DIR=} directory where $\nu-$ESPRESSO executables reside\\
|
|
\texttt{PSEUDO\_DIR=} directory where pseudopotential files reside\\
|
|
\texttt{TMP\_DIR=} directory to be used as temporary storage area
|
|
\end{quote}
|
|
If you have downloaded the full $\nu-$ESPRESSO distribution, you may set
|
|
\texttt{BIN\_DIR=\$TOPDIR/bin} and
|
|
\texttt{PSEUDO\_DIR=\$TOPDIR/pseudo}, where \texttt{\$TOPDIR} is the
|
|
root of the $\nu-$ESPRESSO source tree.
|
|
|
|
The \texttt{PSEUDO\_DIR} directory must contain the following files:
|
|
\begin{quote}
|
|
\begin{flushleft}
|
|
%
|
|
% to regenerate this list:
|
|
% grep UPF */run_example | grep -v PSEUDO_LIST | grep -o "[^ ]*UPF" | \
|
|
% sed 's/_/\\_/g' | sort | uniq | awk '{print " \\texttt{" $0 "},"}'
|
|
%
|
|
\texttt{Al.vbc.UPF},
|
|
\texttt{As.gon.UPF},
|
|
\texttt{C.pz-rrkjus.UPF},
|
|
\texttt{Cu.pz-d-rrkjus.UPF},
|
|
\texttt{Fe.pz-nd-rrkjus.UPF},
|
|
\texttt{H.fpmd.UPF},
|
|
\texttt{H.vbc.UPF},
|
|
\texttt{N.BLYP.UPF},
|
|
\texttt{Ni.pbe-nd-rrkjus.UPF},
|
|
\texttt{NiUS.RRKJ3.UPF},
|
|
\texttt{O.BLYP.UPF},
|
|
\texttt{O.LDA.US.RRKJ3.UPF},
|
|
\texttt{O.pbe-rrkjus.UPF},
|
|
\texttt{O.vdb.UPF},
|
|
\texttt{OPBE\_nc.UPF},
|
|
\texttt{Pb.vdb.UPF},
|
|
\texttt{Ptrel.RRKJ3.UPF},
|
|
\texttt{Si.vbc.UPF},
|
|
\texttt{SiPBE\_nc.UPF},
|
|
\texttt{Ti.vdb.UPF}
|
|
\end{flushleft}
|
|
\end{quote}
|
|
|
|
If any of these are missing, you may not be able to run some of the
|
|
examples. You can download them from the Pseudopotentials Page of the
|
|
$\nu-$ESPRESSO web site
|
|
(\htmladdnormallink{\texttt{http://www.pwscf.org/pseudo.htm}}%
|
|
{http://www.pwscf.org/pseudo.htm}).
|
|
|
|
\texttt{TMP\_DIR} must be a directory you have read and write access
|
|
to, with enough available space to host the temporary files produced
|
|
by the example runs, and possibly offering high I/O performance (i.e.,
|
|
don't use an NFS-mounted directory).
|
|
|
|
\item
|
|
If you have compiled the parallel version of $\nu-$ESPRESSO (that is the
|
|
default), you'll usually have to specify a driver program (such as
|
|
\texttt{poe} or \texttt{mpiexec}) and the number of processors: read
|
|
section \ref{runparallel}, ``Running on parallel machines'' for
|
|
details.
|
|
|
|
In order to do that, edit again the \texttt{environment\_variables}
|
|
file and set the \texttt{PARA\_PREFIX} and \texttt{PARA\_POSTFIX}
|
|
variables as needed.
|
|
Parallel executables will be run by a command like this:
|
|
\begin{verbatim}
|
|
$PARA_PREFIX pw.x $PARA_POSTFIX < file.in > file.out
|
|
\end{verbatim}
|
|
|
|
For example, if the command line is like this (as for an IBM SP4):
|
|
\begin{verbatim}
|
|
poe pw.x -procs 4 < file.in > file.out
|
|
\end{verbatim}
|
|
you should set \texttt{PARA\_PREFIX="poe"},
|
|
\texttt{PARA\_POSTFIX="-procs 4"}.
|
|
|
|
Furthermore, if your machine does not support interactive use, you
|
|
must run the commands specified below through the batch queueing
|
|
system installed on that machine.
|
|
Ask your system administrator for instructions.
|
|
|
|
\item
|
|
To run a single example, go to the corresponding directory (for
|
|
instance, \texttt{examples/example01}) and execute:
|
|
\begin{verbatim}
|
|
./run_example
|
|
\end{verbatim}
|
|
This will create a subdirectory \texttt{results}, containing the input
|
|
and output files generated by the calculation.
|
|
|
|
Some examples take only a few seconds to run, while others may require
|
|
several minutes depending on your system.
|
|
|
|
To run all the examples in one go, execute:
|
|
\begin{verbatim}
|
|
./run_all_examples
|
|
\end{verbatim}
|
|
from the \texttt{examples} directory.
|
|
On a single-processor machine, this typically takes one to three
|
|
hours.
|
|
|
|
The \texttt{make\_clean} script cleans the examples tree, by removing
|
|
all the \texttt{results} subdirectories. However, if additional
|
|
subdirectories have been created, they aren't deleted.
|
|
|
|
\item
|
|
In each example's directory, the \texttt{reference} subdirectory
|
|
contains verified output files, that you can check your results
|
|
against.
|
|
They were generated on a 1.7 GHz Pentium IV using Intel compiler
|
|
(\texttt{ifc}) v.6 and MKL libraries v.5.1.
|
|
|
|
On different architectures the precise numbers could be slightly
|
|
different, in particular if different FFT dimensions are automatically
|
|
selected. For this reason, a plain \texttt{diff} of your results
|
|
against the reference data doesn't work, or at least, it requires
|
|
human inspection of the results.
|
|
|
|
Instead, you can run the \texttt{check\_example} script in the
|
|
\texttt{examples} directory:
|
|
\medskip
|
|
|
|
\quad\texttt{./check\_example} \emph{example\_dir}
|
|
\medskip
|
|
|
|
\noindent
|
|
where \emph{example\_dir} is the directory of the example that you
|
|
want to check (e.g., \texttt{./check\_example example01}).
|
|
You can specify multiple directories.
|
|
|
|
Note: at the moment \texttt{check\_example} is in early development
|
|
and (should be) guaranteed to work only on examples 01 to 04.
|
|
\end{enumerate}
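
As a rough illustration of steps 1 and 2 above, the relevant lines of
\texttt{environment\_variables} might look as follows. This is only a
sketch under stated assumptions, not the file as distributed: the
installation root \texttt{\$HOME/espresso}, the scratch directory
\texttt{/scratch/\$USER/tmp}, and the choice of \texttt{mpirun} on 4
processors are hypothetical and must be adapted to your system.
\begin{verbatim}
# hypothetical root of the source tree (assumption)
TOPDIR=$HOME/espresso
BIN_DIR=$TOPDIR/bin
PSEUDO_DIR=$TOPDIR/pseudo
# hypothetical local scratch area, not NFS-mounted (assumption)
TMP_DIR=/scratch/$USER/tmp
# launcher and its arguments; both may be left empty for serial runs
PARA_PREFIX="mpirun -np 4"
PARA_POSTFIX=""
\end{verbatim}
With these settings, the parallel command line described in step 2
would expand to \texttt{mpirun -np 4 pw.x < file.in > file.out}.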
|
|
|
|
\subsection{Installation issues}
|
|
\label{installissues}
|
|
|
|
The main development platforms are IBM SP and Intel/AMD PC with Linux
|
|
and Intel compiler. For other machines, we rely on users' feedback.
|
|
|
|
\paragraph{All machines}
|
|
|
|
Working fortran-95 and C compilers are needed in order to compile
|
|
$\nu-$ESPRESSO. Most so-called ``fortran-90'' compilers implement the
|
|
fortran-95 standard, but older versions may not be fortran-95
|
|
compliant.
|
|
|
|
If you get ``Compiler Internal Error'' or similar messages, try to
|
|
lower the optimization level, or to remove optimization, just for the
|
|
routine that has problems. If it doesn't work, or if you experience
|
|
weird problems, try to install patches for your version of the
|
|
compiler (most vendors release at least a few patches for free), or to
|
|
upgrade to a more recent version.
|
|
|
|
If you get an error in the loading phase that looks like ``ld: file
|
|
XYZ.o: unknown (unrecognized, invalid, wrong, missing, \dots) file
|
|
type'', or ``While processing relocatable file XYZ.o, no relocatable
|
|
objects were found'' (T3E), one of the following things has happened:
|
|
|
|
\begin{enumerate}
|
|
\item you have leftover object files from a compilation with another
|
|
compiler: run \texttt{make clean} and recompile.
|
|
\item \texttt{make} does not stop at the first compilation error (it
|
|
happens with some compilers).
|
|
Remove file XYZ.o and look for the compilation error.
|
|
\end{enumerate}
|
|
|
|
If many symbols are missing in the loading phase, you did not specify
|
|
the location of all needed libraries (LAPACK, BLAS, FFTW,
|
|
machine-specific optimized libraries). If you did, but symbols are
|
|
still missing, see below (for Linux PC).
|
|
|
|
\paragraph{SGI machines with MIPS compiler}
|
|
|
|
Many versions of the MIPS compiler yield compilation errors in
|
|
conjunction with \texttt{FORALL} constructs. There is no
|
|
known solution other than editing the \texttt{FORALL} construct
|
|
that gives a problem, or replacing it with an equivalent
|
|
\texttt{DO...END DO} construct.
|
|
|
|
\paragraph{Linux Alphas with Compaq compiler}
|
|
|
|
If at linking stage you get error messages like: ``undefined reference
|
|
to `for\_check\_mult\_overflow64' '' with Compaq/HP fortran compiler
|
|
on Linux Alphas, check the following page:
|
|
\htmladdnormallink%
|
|
{\texttt{http://linux.iol.unh.edu/linux/fortran/faq/cfal-X1.0.2.html}}%
|
|
{http://linux.iol.unh.edu/linux/fortran/faq/cfal-X1.0.2.html}.
|
|
|
|
\paragraph{Linux PC}
|
|
|
|
The web site of Axel Kohlmeyer contains a very informative section
|
|
on compiling and running CPMD on Linux.
|
|
Most of its contents applies to the $\nu-$ESPRESSO code as well:\hfill\break
|
|
\htmladdnormallink%
|
|
{\texttt{http://www.theochem.rub.de/\~{}axel.kohlmeyer/cpmd-linux.html}}%
|
|
{http://www.theochem.rub.de/~axel.kohlmeyer/cpmd-linux.html}.
|
|
|
|
On newer Linux machines, even statically linked binaries will try
|
|
to open some shared libraries, which will lead to crashes
|
|
if libc/libm/libpthreads are not linked dynamically. Machines
|
|
using glibc-2.2.4 and older seem ok: compile on these machines
|
|
if you want to share precompiled binaries. Crashes due to multithreading
|
|
(e.g. when using a multithreaded ATLAS or MKL) on machines with
|
|
the newer threads (nptl) can be worked around by setting the
|
|
environment variable \texttt{LD\_ASSUME\_KERNEL} to '2.2.5'. For
|
|
the newest Intel compilers, \texttt{-static-libcxa} does the
|
|
trick most of the time. (info from Axel Kohlmeyer)
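
For instance, in a Bourne-type shell the \texttt{LD\_ASSUME\_KERNEL}
workaround mentioned above could be applied as sketched here, before
launching the executable. The file names \texttt{file.in} and
\texttt{file.out} are placeholders, and whether the setting helps
depends on your system.
\begin{verbatim}
# select the older threading model for this shell session only
LD_ASSUME_KERNEL=2.2.5
export LD_ASSUME_KERNEL
pw.x < file.in > file.out
\end{verbatim}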
|
|
|
|
Since there is no standard compiler for Linux, different compilers
|
|
have different ideas about the right way to call external libraries.
|
|
As a consequence you may have a mismatch between what your compiler
|
|
calls (``symbols'') and the actual name of the required library call.
|
|
Use the \texttt{nm} command to determine the name of a library call,
|
|
as in the following examples:%
|
|
\begin{verbatim}
|
|
nm /usr/local/lib/libblas.a | grep T | grep -i daxpy
|
|
nm /usr/local/lib/liblapack.a | grep T | grep -i zhegv
|
|
\end{verbatim}
|
|
where typical locations and names of the libraries are assumed.
|
|
Most precompiled libraries have lowercase names with one or two
|
|
underscores (\_) appended. \texttt{configure} should select the
|
|
appropriate preprocessing options in \texttt{make.sys}, but in
|
|
case of trouble, be aware that:
|
|
\begin{itemize}
|
|
\item the Absoft compiler is case-sensitive (like C and unlike
|
|
other Fortran compilers) and does not add an underscore
|
|
to symbol names (note that if your libraries contain
|
|
uppercase or mixed case names, you are out of luck:
|
|
you must either recompile your own libraries, or change
|
|
the \texttt{\#define}'s in \texttt{include/f\_defs.h});
|
|
\item both Portland compiler (pgf90) and Intel compiler (ifort/ifc)
|
|
are case insensitive and add an underscore to symbol names.
|
|
\end{itemize}
|
|
|
|
With some precompiled lapack libraries, you may need to add
|
|
\texttt{-lg2c} or \texttt{-lm} or both.
|
|
|
|
\paragraph{Linux PCs with Portland Group compiler (pgf90)}
|
|
|
|
$\nu-$ESPRESSO does not work reliably, or not at all, with some versions of
|
|
the Portland Group compiler. In particular, with some versions PWscf
|
|
works only for small systems, but not for larger systems. We think
|
|
that this is a compiler bug. Use the latest version of each release
|
|
of the compiler, with patches if available: see the Portland Group web
|
|
site,\hfill\break
|
|
\htmladdnormallink%
|
|
{\texttt{http://www.pgroup.com/faq/install.htm\#release\_info}}%
|
|
{http://www.pgroup.com/faq/install.htm\#release\_info}
|
|
|
|
\paragraph{Linux PCs (Pentium) with Intel compiler (ifort, formerly
|
|
ifc)}
|
|
|
|
If \texttt{configure} doesn't find the compiler, or if you get ``Error
|
|
loading shared libraries...'' at run time, you have forgotten to
|
|
execute the script that sets up the correct path and library path.
|
|
Unless your system manager has done this for you, you should execute
|
|
the appropriate script --- located in the directory containing the
|
|
compiler executable --- in your initialization files.
|
|
Consult the documentation provided by Intel.
|
|
|
|
Each major release of the Intel compiler differs a lot from
|
|
the previous one. Do not mix compiled objects from different releases:
|
|
they are incompatible. Intel compiler v.~7 and later use a different
|
|
method to locate where modules are with respect to v.~$< 7$: if you
|
|
are using the manual configuration, choose the appropriate line
|
|
\texttt{MODULEFLAG=...} in \texttt{make.sys}.
|
|
|
|
Some releases of Intel compiler v.~7 and 8 yield ``Compiler Internal
|
|
Error''.
|
|
Update to the latest version (presently 7.1.41, 8.0.046 or
|
|
8.1.018, respectively), available via Intel Premier support
|
|
(registration free of charge for Linux):
|
|
\htmladdnormallink%
|
|
{\texttt{http://developer.intel.com/software/products/support/\#premier}}%
|
|
{http://developer.intel.com/software/products/support/\#premier}
|
|
|
|
Note that \texttt{pwcond.x} does not work with some (but not all)
|
|
releases of Intel compiler v.~7 and 8, for no apparent good reason.
|
|
|
|
Warnings ``size of symbol ... changed ...'' are produced by ifc 7.1 at
|
|
the loading stage.
|
|
These seem to be harmless, but they may cause the loader to stop,
|
|
depending on your system configuration.
|
|
If this happens and no executable is produced, add the following to
|
|
\texttt{LDFLAGS}: \texttt{-Xlinker --noinhibit-exec}.
|
|
|
|
On Intel CPUs, it is very convenient to use Intel MKL libraries.
|
|
If \texttt{configure} doesn't find them, try
|
|
\texttt{configure --enable-shared}.
|
|
MKL also contains optimized FFT routines, but they are
|
|
presently not supported: use FFTW instead. Note that Intel
|
|
compiler v.~8 fails to load with MKL v.~5.2 or earlier versions,
|
|
because some symbols that are referenced by MKL are missing. There
|
|
is a fix for this (info from Konstantin Kudin): add libF90.a from
|
|
ifc 7.1 at the linking stage, as the last library.
|
|
Note that some combinations of not-so-recent versions of MKL
|
|
and ifc may yield a lot of ``undefined references'' when statically
|
|
loaded: use \texttt{configure --enable-shared},
|
|
or remove the \texttt{-static} option in \texttt{make.sys}.
|
|
|
|
When using/testing/benchmarking MKL on SMP (multiprocessor)
|
|
machines, one should set the environment variable
|
|
\texttt{OMP\_NUM\_THREADS} to 1, unless the OpenMP
|
|
parallelization is desired. MKL by default sets the
|
|
variable to the number of CPUs installed and thus gives
|
|
the impression of a much better performance, as the CPU time
|
|
is only measured for the master thread (info from Axel Kohlmeyer).
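
A minimal sketch of how this might be done in a Bourne-type shell is
given below (with \texttt{csh}-type shells, \texttt{setenv
OMP\_NUM\_THREADS 1} plays the same role). The file names are
placeholders.
\begin{verbatim}
# force MKL to run single-threaded, so that timings are comparable
OMP_NUM_THREADS=1
export OMP_NUM_THREADS
pw.x < file.in > file.out
\end{verbatim}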
|
|
|
|
The I/O libraries used by the Intel compiler ifc are incompatible
|
|
with those called by most precompiled BLAS/LAPACK libraries
|
|
(including ATLAS): you get error messages at linking stage.
|
|
A workaround is to recompile BLAS/LAPACK with ifc, or (better) to
|
|
replace the BLAS routine \texttt{xerbla} and LAPACK routine
|
|
\texttt{dlamch} (the only two containing I/O calls) with recompiled
|
|
objects:
|
|
\begin{verbatim}
|
|
ifc -c xerbla.f
|
|
ifc -O0 -c dlamch.f
|
|
\end{verbatim}
|
|
(do not forget \texttt{-O0} --- \texttt{dlamch.f} \emph{must} be
|
|
compiled without optimization) and replace them into the library, as
|
|
in the following example:
|
|
\begin{verbatim}
|
|
ar rv libatlas.a xerbla.o dlamch.o
|
|
\end{verbatim}
|
|
(assuming that the library and the two object files are in the same
|
|
directory). See also Axel Kohlmeyer's web site.
|
|
|
|
Linux distributions using glibc 2.3 or later (e.g., RedHat 9)
|
|
may be incompatible with ifc 7.0 and 7.1.
|
|
The incompatibility shows up in the form of messages ``undefined
|
|
reference to `errno' '' at linking stage.
|
|
A workaround is available: see
|
|
\htmladdnormallink%
|
|
{\texttt{http://newweb.ices.utexas.edu/misc/ctype.c}}%
|
|
{http://newweb.ices.utexas.edu/misc/ctype.c}.
|
|
|
|
\paragraph{AMD CPUs, Intel Itanium}
|
|
|
|
AMD Athlon CPUs can be basically treated like Intel Pentium CPUs.
|
|
You can use the Intel compiler and MKL with Pentium-3 optimization.
|
|
|
|
Konstantin Kudin reports that the best results in terms of
|
|
performance are obtained with ATLAS-optimized BLAS/LAPACK
|
|
libraries, using AMD Core Math Library (ACML) for the missing
|
|
libraries. ACML can be freely downloaded from AMD web site.
|
|
Beware: some versions of ACML -- i.e. the GCC version with SSE2 --
|
|
crash PWscf. The ``\_nosse2'' version appears to be stable.
|
|
Load first ATLAS, then ACML, then \texttt{-lg2c}, as in the
|
|
following example (replace what follows \texttt{-L} with
|
|
something appropriate to your configuration):
|
|
\begin{verbatim}
|
|
-L/location/of/fftw/lib/ -lfftw \
|
|
-L/location/of/atlas/lib -lf77blas -llapack -lcblas -latlas \
|
|
-L/location/of/gnu32_nosse2/lib -lacml -lg2c
|
|
\end{verbatim}
|
|
64-bit CPUs like the AMD Opteron and the Intel Itanium are
|
|
supported and should work both in 32-bit emulation and in
|
|
64-bit mode (in the latter case, \texttt{-D\_\_LINUX64} is
|
|
needed among the preprocessing flags). Both the PGI and the
|
|
Intel compiler (v8.1 EM64T-edition, available via Intel Premier
|
|
support) should work. 64-bit executables can address a
|
|
much larger memory space, but apparently they are not especially
|
|
faster than 32-bit executables. The Intel compiler has been
|
|
reported to be more reliable and to produce faster executables
|
|
than the PGI compiler.
|
|
|
|
\paragraph{Linux PC clusters with MPI}
|
|
|
|
PC clusters running some version of MPI are a very popular
|
|
computational platform nowadays. Two major MPI implementations
|
|
(MPICH, LAM-MPI) are available. The number of possible
|
|
configurations, in terms of type and version of the MPI
|
|
libraries, kernels, system libraries, compilers, is very large.
|
|
$\nu-$ESPRESSO compiles and works on all non-buggy, properly configured
|
|
configurations. You may have to recompile MPI libraries in order
|
|
to be able to use them with the Intel compiler. See Axel Kohlmeyer's
|
|
web site for precompiled versions of the MPI library.
|
|
|
|
If $\nu-$ESPRESSO does not work for some reason on a PC cluster, first check
whether it works in serial execution. If the problem is clearly related to
|
|
parallelism, it is likely that your MPI libraries are buggy or not
|
|
properly configured: see Axel Kohlmeyer's web site for help.
|
|
A frequent problem is that $\nu-$ESPRESSO does not read from standard
|
|
input: see section \ref{runparallel}, ``Running on parallel machines''.
|
|
|
|
If you are dissatisfied with the performance in parallel
|
|
execution, read the ``Parallelization issues'' section.
|
|
|
|
\paragraph{T3E}
|
|
|
|
The following workaround is needed: in files \texttt{PW/bp\_zgefa.f}
|
|
and \texttt{PW/bp\_zgedi.f}, replace all occurrences of
|
|
\texttt{zscal}, \texttt{zaxpy}, \texttt{zswap}, \texttt{izamax} with
|
|
\texttt{cscal}, \texttt{caxpy}, \texttt{cswap}, \texttt{icamax}.
|
|
Also, in \texttt{PP/dist.f} you need to comment the call to
|
|
\texttt{getarg} and uncomment the call to \texttt{pxfgetarg}.
|
|
|
|
If you have a T3E with ``benchlib'' installed, you may want to use it
|
|
by adding \texttt{-D\_\_BENCHLIB} to preprocessing flags.
|
|
If you get errors at loading because symbols \texttt{LPUTP},
|
|
\texttt{LGETV}, \texttt{LSETV} are undefined, you either need to link
|
|
``benchlib'', or to remove \texttt{-D\_\_BENCHLIB} and recompile
|
|
(after a \texttt{make clean}).
|
|
|
|
\clearpage
|
|
|
|
\section{Running on parallel machines}
|
|
\label{runparallel}
|
|
|
|
Parallel execution is strongly system- and installation-dependent.
|
|
Typically one has to specify:
|
|
|
|
\begin{itemize}
|
|
\item a launcher program, such as \texttt{poe}, \texttt{mpirun}, or
|
|
\texttt{mpiexec};
|
|
\item the number of processors, typically as an option to the
|
|
launcher program, but in some cases \emph{after} the program
|
|
to be executed;
|
|
\item the program to be executed, with the proper path if needed:
|
|
for instance, \texttt{pw.x}, or \texttt{./pw.x}, or
|
|
\texttt{\$HOME/bin/pw.x}, or whatever applies;
|
|
\item the number of ``pools'' into which processors are to be
|
|
grouped (see section \ref{parissues}, ``Parallelization
|
|
Issues'', for an explanation of what a pool~is).
|
|
\end{itemize}
|
|
|
|
The last item is optional and is read by the code.
|
|
The first and second items are machine- and installation-dependent,
|
|
and may be different for interactive and batch execution.
|
|
|
|
\paragraph{Please note:}
|
|
Your machine might be configured so as to disallow interactive
|
|
execution: if in doubt, ask your system administrator.
|
|
\bigskip
|
|
|
|
For illustration, here's how to run \texttt{pw.x} on 16 processors
|
|
partitioned into 8 pools (2 processors each), for several typical
|
|
cases.
|
|
For convenience, we also give the corresponding values of
|
|
\texttt{PARA\_PREFIX}, \texttt{PARA\_POSTFIX} to be used in running
|
|
the examples distributed with $\nu-$ESPRESSO (see section \ref{runexamples},
|
|
``Run examples'').
|
|
|
|
\begin{description}
|
|
\item [IBM SP machines,] batch:
|
|
\begin{verbatim}
|
|
pw.x -npool 8 < input
|
|
|
|
PARA_PREFIX="", PARA_POSTFIX="-npool 8"
|
|
\end{verbatim}
|
|
This should also work interactively, with environment variables
|
|
\texttt{NPROC} set to 16, \texttt{MP\_HOSTFILE} set to the file
|
|
containing a list of processors.
|
|
\item [IBM SP machines,] interactive, using \texttt{poe}:
|
|
\begin{verbatim}
|
|
poe pw.x -procs 16 -npool 8 < input
|
|
|
|
PARA_PREFIX="poe", PARA_POSTFIX="-procs 16 -npool 8"
|
|
\end{verbatim}
|
|
\item [SGI Origin and PC clusters] using \texttt{mpirun}:
|
|
\begin{verbatim}
|
|
mpirun -np 16 pw.x -npool 8 < input
|
|
|
|
PARA_PREFIX="mpirun -np 16", PARA_POSTFIX="-npool 8"
|
|
\end{verbatim}
|
|
\item [PC clusters] using \texttt{mpiexec}:
|
|
\begin{verbatim}
|
|
mpiexec -n 16 pw.x -npool 8 < input
|
|
|
|
PARA_PREFIX="mpiexec -n 16", PARA_POSTFIX="-npool 8"
|
|
\end{verbatim}
|
|
\item [Cray T3E] (old):
|
|
\begin{verbatim}
|
|
mpprun -n 16 pw.x -npool 8 < input
|
|
|
|
PARA_PREFIX="mpprun -n 16", PARA_POSTFIX="-npool 8"
|
|
\end{verbatim}
|
|
\end{description}
|
|
|
|
Note that each processor writes its own set of temporary wavefunction
|
|
files during the calculation.
|
|
If \texttt{wf\_collect=.true.} (in namelist \texttt{control}), the
|
|
final result is collected into a single file, whose format is
|
|
independent of the number of processors; otherwise, one wavefunction
|
|
file per processor is left on the disk.
|
|
In the latter case, the files are readable only by a job running on
|
|
the same number of processors and pools, and if all files are on a
|
|
file system that is visible to all processors (i.e., you cannot use
|
|
local scratch directories: there is presently no way to ensure that
|
|
the distribution of processes on processors will follow the same
|
|
pattern for different jobs).
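
As an illustration, collection of the wavefunctions into a single
file would be requested with an input fragment along these lines. Only
the \texttt{control} namelist is sketched here; the value of
\texttt{calculation} is just an example, and all other variables are
omitted.
\begin{verbatim}
&control
   calculation = 'scf'
   wf_collect  = .true.
/
\end{verbatim}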
|
|
|
|
Some implementations of the MPI library may have problems with
|
|
input redirection in parallel.
|
|
If this happens, use the option \texttt{-in} (or \texttt{-inp} or
|
|
\texttt{-input}), followed by the input file name.
|
|
Example: \texttt{pw.x -in input -npool 4 > output}.
|
|
|
|
Please note that all postprocessing codes \emph{not} reading data
|
|
files produced by \texttt{pw.x} --- that is, \texttt{chdens.x},
|
|
\texttt{average.x}, \texttt{voronoy.x}, \texttt{dos.x} --- the
|
|
plotting codes \texttt{plotrho.x}, \texttt{plotband.x}, and all
|
|
executables in \texttt{pwtools/}, should be executed on just one
|
|
processor.
|
|
Unpredictable results may follow if those codes are run on more than
|
|
one processor.
|
|
|
|
\clearpage
|
|
|
|
\section{Pseudopotentials}
|
|
\label{pseudopotentials}
|
|
|
|
Currently PWscf and CP support both Ultrasoft (US) Vanderbilt
|
|
pseudopotentials (PPs) and Norm-Conserving (NC)
|
|
Hamann-Schl\"uter-Chiang PPs in separable Kleinman-Bylander form.
|
|
Note however that calculation of third-order derivatives is not (yet)
|
|
implemented with US PPs. Presently FPMD supports only NC PPs.
|
|
|
|
The $\nu-$ESPRESSO package uses a unified pseudopotential format (UPF)
|
|
(\htmladdnormallink{\texttt{http://www.pwscf.org/format.htm}}%
|
|
{http://www.pwscf.org/format.htm})
|
|
for all types of PPs, but still accepts a number of other formats
|
|
(\htmladdnormallink{\texttt{http://www.pwscf.org/oldformat.htm}}%
|
|
{http://www.pwscf.org/oldformat.htm}):
|
|
\begin{enumerate}
|
|
\item the ``old PWscf'' format for NC PPs,
|
|
\item the ``old CP'' format for NC PPs,
|
|
\item the ``old FPMD'' format for NC PPs,
|
|
\item the ``new PWscf'' format for NC and US PPs,
|
|
\item the ``Vanderbilt'' format (formatted, not binary) for NC and
|
|
US PPs.
|
|
\end{enumerate}
|
|
Note however that PWscf accepts the first, fourth and fifth in the
|
|
above list; CP the second, fourth and fifth; FPMD the third only.
|
|
|
|
PPs for selected elements can be downloaded from the Pseudopotentials
|
|
Page of the $\nu-$ESPRESSO web site
|
|
(\htmladdnormallink{\texttt{http://www.pwscf.org/pseudo.htm}}%
|
|
{http://www.pwscf.org/pseudo.htm}).
|
|
If you do not find there the PP you need (because there is no PP for
|
|
the atom you need or you need a different exchange-correlation
|
|
functional or a different core-valence partition or for whatever
|
|
reason may apply), it may be taken, if available, from published
|
|
tables, such as:
|
|
\begin{itemize}
|
|
\item G.B. Bachelet, D.R. Hamann and M. Schl\"uter, Phys. Rev. B
|
|
\textbf{26}, 4199 (1982)
|
|
\item X. Gonze, R. Stumpf, and M. Scheffler, Phys. Rev. B
|
|
\textbf{44}, 8503 (1991)
|
|
\item S. Goedecker, M. Teter, and J. Hutter, Phys. Rev. B
|
|
\textbf{54}, 1703 (1996)
|
|
\end{itemize}
|
|
or otherwise it must be generated. Since version 2.1, $\nu-$ESPRESSO
|
|
includes a PP generation package, in the
|
|
directory \texttt{atomic/} (sources) and \texttt{atomic\_doc/}
|
|
(documentation, tests and examples).
|
|
The package can generate both NC and US PPs in UPF format (older formats
are also available, but not recommended).
|
|
We refer to its documentation for instructions on how to generate PPs
|
|
with the \texttt{atomic/} code.
|
|
|
|
Other PP generation packages are available on-line:
|
|
|
|
\begin{itemize}
|
|
\item
|
|
David Vanderbilt's code (UltraSoft PPs):\hfill\break
|
|
\htmladdnormallink%
|
|
{\texttt{http://www.physics.rutgers.edu/\~{}dhv/uspp/index.html}}%
|
|
{http://www.physics.rutgers.edu/~dhv/uspp/index.html}
|
|
\item
|
|
The Fritz-Haber-Institut code \texttt{fhi98PP} (Norm-Conserving PPs):\hfill\break
|
|
\htmladdnormallink%
|
|
{\texttt{http://www.fhi-berlin.mpg.de/th/fhi98md/fhi98PP}}%
|
|
{http://www.fhi-berlin.mpg.de/th/fhi98md/fhi98PP}
|
|
\item
|
|
Jos\'e-Lu\'\i{}s Martins' code (Norm-Conserving PPs):\hfill\break
|
|
\htmladdnormallink%
|
|
{\texttt{http://bohr.inesc-mn.pt/\~{}jlm/pseudo.html}}%
|
|
{http://bohr.inesc-mn.pt/~jlm/pseudo.html}
|
|
\end{itemize}
|
|
|
|
The first two codes produce PPs in UPF format, or in a format that
|
|
can be converted to unified format using the utilities of directory
|
|
\texttt{upftools/}.
|
|
|
|
Finally, other electronic-structure packages (CAMPOS, ABINIT)
|
|
provide tables of PPs that can be freely downloaded, but need
|
|
to be converted into a suitable format for use with $\nu-$ESPRESSO.
|
|
|
|
Remember: \emph{always} test the PPs on simple test systems before
|
|
proceeding to serious calculations.
|
|
|
|
\clearpage
|
|
|
|
\section{Using PWscf}
|
|
|
|
Input files for the PWscf codes may be either written by hand (the
|
|
good old way), or produced via the ``PWgui'' graphical user interface
|
|
by Anton Kokalj, included in the $\nu-$ESPRESSO distribution.
|
|
See \texttt{PWgui-}\emph{x.y.z}\texttt{/INSTALL} (where \emph{x.y.z}
|
|
is the version number) for more info on PWgui, or \texttt{GUI/README}
|
|
if you are using CVS sources.
|
|
|
|
You may take the examples distributed with $\nu-$ESPRESSO as templates for
|
|
writing your own input files: see section \ref{runexamples}, ``Run
|
|
examples''. In the following, whenever we mention ``Example N'', we
|
|
refer to those.
|
|
Input files are those in the \texttt{results} directories, with names
|
|
ending in \texttt{.in} (they'll appear after you've run the examples).
|
|
|
|
Note about exchange-correlation: the type of exchange-correlation used
|
|
in the calculation is read from PP files.
|
|
All PPs must have been generated using the same exchange-correlation functional.
|
|
|
|
\subsection{Electronic and ionic structure calculations}
|
|
|
|
Electronic and ionic structure calculations are performed by program
|
|
\texttt{pw.x}.
|
|
|
|
\subsubsection{Input data}
|
|
|
|
The input data is organized as several namelists, followed by other
|
|
fields introduced by keywords.
|
|
|
|
The namelists are
|
|
\begin{quote}
|
|
\texttt{\&CONTROL}: general variables controlling the run\\
|
|
\texttt{\&SYSTEM}: structural information on the system under
|
|
investigation\\
|
|
\texttt{\&ELECTRONS}: electronic variables: self-consistency,
|
|
smearing\\
|
|
\texttt{\&IONS} (optional): ionic variables: relaxation,
|
|
dynamics\\
|
|
\texttt{\&CELL} (optional): variable-cell dynamics\\
|
|
\texttt{\&PHONON} (optional): information required to produce
|
|
data for phonon calculations
|
|
\end{quote}
|
|
|
|
Optional namelists may be omitted if the calculation to be performed
|
|
does not require them.
|
|
This depends on the value of variable \texttt{calculation} in namelist
|
|
\texttt{\&CONTROL}.
|
|
Most variables in namelists have default values.
|
|
Only the following variables in \texttt{\&SYSTEM} must always be
|
|
specified:
|
|
\begin{quote}
|
|
\texttt{ibrav} (integer): bravais-lattice index\\
|
|
\texttt{celldm} (real, dimension 6): crystallographic constants\\
|
|
\texttt{nat} (integer): number of atoms in the unit cell\\
|
|
\texttt{ntyp} (integer): number of types of atoms in the unit cell\\
|
|
\texttt{ecutwfc} (real): kinetic energy cutoff (Ry) for
|
|
wavefunctions.
|
|
\end{quote}
|
|
For metallic systems, you have to specify how metallicity
|
|
is treated by setting variable \texttt{occupations}. If you choose
|
|
\texttt{occupations='smearing'}, you have to specify the
|
|
smearing width \texttt{degauss} and optionally the smearing
|
|
type \texttt{smearing}. If you choose \texttt{occupations='tetrahedra'},
|
|
you need to specify a suitable uniform k-point grid (card
|
|
\texttt{K\_POINTS} with option \texttt{automatic}).
|
|
Spin-polarized systems must be treated as metallic systems,
|
|
except in the special case of a single k-point, for which
|
|
occupancies can be fixed (\texttt{occupations='from\_input'}
|
|
and card \texttt{OCCUPATIONS}).
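
For a metal treated with smearing, the \texttt{\&SYSTEM} namelist
might look like the following sketch. The numerical values are
illustrative placeholders for a one-atom fcc cell, not converged or
recommended settings; the allowed values of \texttt{smearing} should
be checked in \texttt{INPUT\_PW}.
\begin{verbatim}
&system
   ibrav = 2, celldm(1) = 7.50, nat = 1, ntyp = 1,
   ecutwfc = 15.0,
   occupations = 'smearing',
   smearing = 'methfessel-paxton',
   degauss = 0.05
/
\end{verbatim}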
|
|
|
|
Explanations for the meaning of variables \texttt{ibrav} and
|
|
\texttt{celldm} are in file \texttt{INPUT\_PW}.
|
|
Please read them carefully.
|
|
There is a large number of other variables, having default values,
|
|
which may or may not fit your needs.
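
As an illustration of the variables discussed above, a
\texttt{\&SYSTEM} namelist for a simple fcc metal might look like the
following sketch. The numerical values (lattice parameter, cutoff,
smearing width) and the choice of smearing type are placeholders
chosen for illustration only and must be checked for a real system;
the authoritative list of variables and allowed values is in file
\texttt{INPUT\_PW}.
\begin{verbatim}
 &system
    ibrav = 2,                  ! fcc Bravais lattice
    celldm(1) = 7.50,           ! lattice parameter (illustrative)
    nat = 1,
    ntyp = 1,
    ecutwfc = 15.0,             ! Ry (illustrative)
    occupations = 'smearing',
    smearing = 'methfessel-paxton',
    degauss = 0.05              ! Ry (illustrative)
 /
\end{verbatim}
For an insulator the last three variables would simply be omitted.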
|
|
|
|
After the namelists, you have several fields introduced by keywords
with self-explanatory names:

\begin{quote}
  \texttt{ATOMIC\_SPECIES}\\
  \texttt{ATOMIC\_POSITIONS}\\
  \texttt{K\_POINTS}\\
  \texttt{CELL\_PARAMETERS} (optional)\\
  \texttt{OCCUPATIONS} (optional)\\
  \texttt{CLIMBING\_IMAGES} (optional)
\end{quote}
|
|
|
|
The keywords may be followed on the same line by an option.
|
|
Unknown fields (including some that are specific to CP and FPMD codes)
|
|
are ignored by PWscf.
|
|
See file \texttt{Doc/INPUT\_PW} for a detailed explanation of the
|
|
meaning and format of the various fields.
|
|
|
|
Note about k-points:
the k-point grid can be either automatically generated or manually
provided as a list of k-points with their weights, restricted to the
Irreducible Brillouin Zone of the \emph{Bravais lattice} of the crystal.
If the symmetry of the system is lower than the symmetry of the
Bravais lattice, the code generates all required additional k-points
and weights, unless instructed not to do so (see variable
\texttt{nosym}).
|
|
The automatic generation of k-points follows the convention of
|
|
Monkhorst and Pack.
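
The two ways of providing k-points can be sketched as follows. In the
manual form the card gives the number of points, followed by the
coordinates and the weight of each point; in the automatic form the
three grid dimensions are followed by three offsets (0 or 1). The
numerical values below are illustrative assumptions for an fcc
lattice, and the exact syntax and units should be checked against
file \texttt{INPUT\_PW}.
\begin{verbatim}
K_POINTS
  2
  0.25 0.25 0.25  1.0
  0.25 0.25 0.75  3.0
\end{verbatim}
\begin{verbatim}
K_POINTS automatic
  4 4 4  1 1 1
\end{verbatim}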
|
|
|
|
\subsubsection{Typical cases}
|
|
|
|
We may distinguish the following typical cases for \texttt{pw.x}:
|
|
|
|
\begin{description}
|
|
|
|
\item [single-point (fixed-ion) SCF calculation.]
|
|
|
|
Set \texttt{calculation='scf'} (this is the default).

Namelists \texttt{\&IONS} and \texttt{\&CELL} need not be
present. See Example~01.
|
|
|
|
\item [band structure calculation.]
|
|
|
|
First perform a SCF calculation as above; then do a non-SCF
|
|
calculation specifying \texttt{calculation='nscf'}, with the
|
|
desired k-point grid and number \texttt{nbnd} of bands.
|
|
|
|
Specify \texttt{nosym=.true.} to avoid generation of additional
|
|
k-points in low symmetry cases. Variables \texttt{prefix} and
|
|
\texttt{outdir}, which determine the names of input or output
|
|
files, should be the same in the two runs. See Example~01.
|
|
|
|
\item [structural optimization.]
|
|
|
|
\hyphenation{name-list}
|
|
Specify \texttt{calculation='relax'} and add namelist \texttt{\&IONS}.
|
|
|
|
All options for a single SCF calculation apply, plus a few others.
|
|
You may follow a structural optimization with a non-SCF
|
|
band-structure calculation, but do not forget to update the input
|
|
ionic coordinates. See Example 03.
|
|
|
|
\item [molecular dynamics.]
|
|
|
|
Specify \texttt{calculation='md'} and time step \texttt{dt}.
|
|
|
|
Use variable \texttt{ion\_dynamics} in namelist \texttt{\&IONS}
|
|
for a fine-grained control of the kind of dynamics. Other options
|
|
for setting the initial temperature and for thermalization using
|
|
velocity rescaling are available. Remember: this is MD on the
|
|
electronic ground state, not Car-Parrinello MD. See Example 04.
|
|
|
|
\item [polarization via Berry Phase.]
|
|
|
|
See Example 10, its \texttt{README}, and the documentation in the
|
|
header of \texttt{PW/bp\_c\_phase.f90}.
|
|
|
|
\item [Nudged Elastic Band calculation.]
|
|
|
|
\hfill Specify \texttt{calculation='neb'} and add namelist
|
|
\texttt{\&IONS}.
|
|
|
|
All options for a single SCF calculation apply, plus a few others.
|
|
In the namelist \texttt{\&IONS} the number of images used to
|
|
discretize the elastic band must be specified. All other
|
|
variables have a default value. Coordinates of the initial and
|
|
final image of the elastic band have to be specified in the
|
|
\texttt{ATOMIC\_POSITIONS} card. A detailed description of all
|
|
input variables is contained in the file \texttt{Doc/INPUT\_PW}.
|
|
See also Example 17.
|
|
|
|
\end{description}
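
As a sketch of the band-structure case described above, the two
\texttt{pw.x} runs differ only in a few variables. Only the variables
that change or that must match are shown; the value of
\texttt{prefix}, the path in \texttt{outdir} and the number of bands
are hypothetical placeholders, and all other namelists and cards are
assumed to be as in a complete input file.

First run (self-consistent):
\begin{verbatim}
 &control
    calculation = 'scf',
    prefix = 'si',
    outdir = '/tmp/'
 /
\end{verbatim}
Second run (non self-consistent, same \texttt{prefix} and
\texttt{outdir}, new k-point list in card \texttt{K\_POINTS}):
\begin{verbatim}
 &control
    calculation = 'nscf',
    prefix = 'si',
    outdir = '/tmp/'
 /
 &system
    nbnd = 8
 /
\end{verbatim}
The \texttt{\&system} fragment of the second run shows only the added
variable \texttt{nbnd}; the structural variables must be repeated
unchanged.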
|
|
|
|
The output data files are written in the directory specified by
|
|
variable \texttt{outdir}, with names specified by variable
|
|
\texttt{prefix} (a string that is prepended to all file names,
|
|
whose default value is: \texttt{prefix='pwscf'}).
|
|
|
|
The execution stops if you create a file \texttt{prefix.EXIT} in the
|
|
working directory. Note that just killing the process may leave the
|
|
output files in an unusable state.
|
|
|
|
\subsection{Phonon calculations}
|
|
|
|
The phonon code \texttt{ph.x} calculates normal modes at a given
|
|
\textbf{q}-vector, starting from data files produced by \texttt{pw.x}.
|
|
|
|
If $\mathbf{q}=0$, the data files can be produced directly by a simple
|
|
SCF calculation.
|
|
For phonons at a generic \textbf{q}-vector, you need to perform first
|
|
a SCF calculation, then a band-structure calculation (see above)
|
|
with
|
|
\texttt{calculation = 'phonon'}, specifying the \textbf{q}-vector
|
|
in variable \texttt{xq} of namelist \texttt{\&PHONON}.
|
|
|
|
The output data files appear in the directory specified by variable
\texttt{outdir}, with names specified by variable \texttt{prefix}.
After the output file(s) have been produced (do not remove any of the
files, unless you know which are used and which are not), you can run
\texttt{ph.x}.
|
|
|
|
The first input line of \texttt{ph.x} is a job identifier.
|
|
At the second line the namelist \texttt{\&INPUTPH} starts.
|
|
The meaning of the variables in the namelist (most of them having a
|
|
default value) is described in file \texttt{INPUT\_PH}.
|
|
Variables \texttt{outdir} and \texttt{prefix} must be the same as in
|
|
the input data of \texttt{pw.x}.
|
|
Presently you must also specify \texttt{amass} (real, dimension
|
|
\texttt{ntyp}): the atomic mass of each atomic type.
|
|
|
|
After the namelist you must specify the \textbf{q}-vector of the
|
|
phonon mode.
|
|
This must be the same \textbf{q}-vector given in the input of
|
|
\texttt{pw.x}.
|
|
|
|
A sample phonon calculation is performed in Example 02.
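
The sequence described above can be sketched as follows for a phonon
at a generic \textbf{q}-vector. The values of \texttt{prefix},
\texttt{outdir}, \texttt{amass}, the file name given in
\texttt{fildyn} and the \textbf{q}-vector itself are hypothetical
placeholders; the complete list of variables is in files
\texttt{INPUT\_PW} and \texttt{INPUT\_PH}.

Fragments of the \texttt{pw.x} input for the preparatory run (the
other namelists and cards, omitted here, are as in the SCF run):
\begin{verbatim}
 &control
    calculation = 'phonon',
    prefix = 'si',
    outdir = '/tmp/'
 /
\end{verbatim}
\begin{verbatim}
 &phonon
    xq(1) = 1.0, xq(2) = 0.0, xq(3) = 0.0
 /
\end{verbatim}
Input for \texttt{ph.x} (job identifier, namelist, then the same
\textbf{q}-vector):
\begin{verbatim}
phonons of Si at X
 &inputph
    prefix = 'si',
    outdir = '/tmp/',
    amass(1) = 28.086,
    fildyn = 'si.dynX'
 /
1.0 0.0 0.0
\end{verbatim}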
|
|
|
|
\subsubsection{Calculation of interatomic force constants in real space}
|
|
|
|
First, dynamical matrices are calculated and saved for a suitable
|
|
uniform grid of \textbf{q}-vectors.
|
|
Only the \textbf{q}-vectors in the Irreducible Brillouin Zone of the
|
|
crystal are needed.
|
|
If the system is an insulator, effective charges and dielectric tensor
must be calculated (variable \texttt{epsil=.true.}) at $\mathbf{q}=0$.
|
|
|
|
Second, all dynamical matrices are given as input to code
|
|
\texttt{q2r.x}.
|
|
The $\mathbf{q}=0$ file must be the first in the list.
|
|
This produces a file of Interatomic Force Constants in real space, up
|
|
to a distance that depends on the size of the grid of
|
|
\textbf{q}-vectors.
|
|
Program \texttt{matdyn.x} may be used to produce phonon modes and
|
|
frequencies at any \textbf{q} using the Interatomic Force Constants
|
|
file as input.
|
|
Note that if you want to calculate LO-TO splitting and IR cross
|
|
sections in insulators at $\mathbf{q}=0$ you should use program
|
|
\texttt{dynmat.x} instead.
|
|
|
|
See Example 06.
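
The relation between the two representations can be summarized as a
discrete Fourier transform. The following is a schematic sketch: the
symbols $\tilde{C}$, $C$, $N_q$ and the atom/direction indices are
notation introduced here for illustration, and sign and normalization
conventions may differ from those used in the code.
$$
  C_{s\alpha,t\beta}(\mathbf{R}) = \frac{1}{N_q} \sum_{\mathbf{q}}
  e^{-i \mathbf{q} \cdot \mathbf{R}} \,
  \tilde{C}_{s\alpha,t\beta}(\mathbf{q}),
  \qquad
  \tilde{C}_{s\alpha,t\beta}(\mathbf{q}') = \sum_{\mathbf{R}}
  e^{i \mathbf{q}' \cdot \mathbf{R}} \, C_{s\alpha,t\beta}(\mathbf{R})
$$
Here $\tilde{C}(\mathbf{q})$ are the dynamical matrices (apart from
mass factors) on a uniform grid of $N_q = n_1 \cdot n_2 \cdot n_3$
\textbf{q}-vectors, and $\mathbf{R}$ runs over the lattice vectors of
the corresponding $n_1 \times n_2 \times n_3$ supercell. This is why
the range of the Interatomic Force Constants is limited by the size of
the grid, and why the second relation can be evaluated at any
$\mathbf{q}'$ once the real-space constants are known.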
|
|
|
|
\subsubsection{Calculation of electron-phonon interaction coefficients}
|
|
|
|
The calculation of electron-phonon coefficients in metals is made
|
|
difficult by the slow convergence of the sum at the Fermi energy.
|
|
It is convenient to calculate phonons, for each \textbf{q}-vector of a
|
|
suitable grid, using a smaller k-point grid, saving the dynamical
|
|
matrix and the self-consistent first-order variation of the potential
|
|
(variable \texttt{fildvscf}).
|
|
Then a non-SCF calculation with a larger k-point grid is performed.
|
|
Finally the electron-phonon calculation is performed by specifying
|
|
\texttt{elph=.true.}, \texttt{trans=.false.}, and the input files
|
|
\texttt{fildvscf}, \texttt{fildyn}.
|
|
The electron-phonon coefficients are calculated using several values
of Gaussian broadening (see \texttt{PH/elphon.f90}) because this
quickly shows whether results are converged or not with respect to the
k-point grid and Gaussian broadening. See Example 07.
|
|
|
|
All of the above must be repeated for all desired \textbf{q}-vectors
|
|
and the final result is summed over all \textbf{q}-vectors, using
|
|
\texttt{pwtools/lambda.x}. The input data for the latter is
|
|
described in the header of \texttt{pwtools/lambda.f90}.
|
|
|
|
\subsection{Post-processing}
|
|
|
|
There are a number of auxiliary codes performing postprocessing tasks
|
|
such as plotting, averaging, and so on, on the various quantities
|
|
calculated by \texttt{pw.x}.
|
|
Such quantities are saved by \texttt{pw.x} into the output data
|
|
file(s).
|
|
|
|
The main postprocessing code \texttt{pp.x} reads data file(s) and may
|
|
produce on output either the projection of wavefunctions on atomic
|
|
wavefunctions, or another file containing one of the following
|
|
quantities:
|
|
|
|
\begin{quote}
  charge\\
  spin polarization\\
  various potentials\\
  local density of states at $E_F$\\
  local density of electronic entropy\\
  STM images\\
  wavefunction squared\\
  electron localization function\\
  planar averages\\
  integrated local density of states
\end{quote}
|
|
|
|
See file \texttt{INPUT\_PP} for a detailed description of the input
|
|
for code \texttt{pp.x}.
|
|
|
|
The file(s) produced by \texttt{pp.x} are processed by program
|
|
\texttt{chdens.x} for plotting.
|
|
The type of plotting (along a line, on a plane, three-dimensional,
|
|
polar) and the output format must be specified here.
|
|
The output file can be directly read by the free plotting system
|
|
Gnuplot (1D or 2D plots), or by code \texttt{plotrho.x} that comes
|
|
with PWscf (2D plots), or by advanced plotting software XCrySDen and
|
|
gOpenMol (3D plots).
|
|
More details on the input data are contained in file
|
|
\texttt{INPUT\_CHDENS}.
|
|
See Example 05 for a charge density plot.
|
|
|
|
The postprocessing code \texttt{bands.x} reads data file(s), extracts
eigenvalues, and regroups them into bands (the algorithm used to order
bands and to resolve crossings may not work in all circumstances,
though).
The output is written to a file in a simple format that can be
directly read by plotting program \texttt{plotband.x}.
Unpredictable plots may result if \textbf{k}-points are not in
sequence along lines.
|
|
See Example 05 for a simple band plot.
|
|
|
|
The postprocessing code \texttt{projwfc.x} calculates projections of
wavefunctions onto atomic orbitals.
|
|
The atomic wavefunctions are those contained in the pseudopotential
|
|
file(s).
|
|
The L\"owdin population analysis (similar to Mulliken analysis) is
|
|
presently implemented.
|
|
The projected DOS (the DOS projected onto atomic orbitals) can also be
|
|
calculated.
|
|
More details on the input data are found in the header of file
|
|
\texttt{PP/projwfc.f90}.
|
|
The total electronic DOS is instead calculated by code
|
|
\texttt{PP/dos.x}.
|
|
See Example 08 for total and projected electronic DOS calculations.
|
|
|
|
The postprocessing code \texttt{path\_int.x} is intended to be used in
|
|
the framework of NEB calculations.
|
|
It is a tool to generate a new path (what is actually generated is the
|
|
restart file) starting from an old one through interpolation (cubic
|
|
splines).
|
|
The new path can be discretized with a different number of images
(this is its main purpose). The images are equispaced, and the
interpolation can also be performed on a subsection of the old path.
The input file needed by \texttt{path\_int.x} can be easily set up
with the help of the self-explanatory \texttt{path\_int.sh} shell
script.
|
|
|
|
\clearpage
|
|
|
|
\section{Using FPMD and CP}
|
|
|
|
This section is intended to explain how to perform basic
|
|
Car-Parrinello (CP) simulations using the FPMD and CP codes.
|
|
|
|
It is important to understand that a CP simulation is a sequence of
different runs, some of them used to ``prepare'' the initial state
of the system, and others performed to collect statistics
or to modify the state of the system itself, e.g. to change the
temperature or the pressure.
|
|
|
|
To prepare and run a CP simulation you should:
|
|
|
|
\begin{enumerate}
|
|
\item
|
|
define the system:
|
|
\begin{enumerate}
|
|
\item atomic positions
|
|
\item system cell
|
|
\item pseudopotentials
|
|
\item number of electrons and bands
|
|
\item cut-offs
|
|
\item FFT grids (CP code only)
|
|
\end{enumerate}
|
|
|
|
\item
|
|
The first run, when starting from scratch, is always an electronic
minimization, with fixed ions and cell, to bring the electronic
system to the ground state (GS) relative to the starting atomic
configuration.
Example of input file (benzene molecule):
|
|
\begin{verbatim}
&control
  title = ' Benzene Molecule ',
  calculation = 'cp',
  restart_mode = 'from_scratch',
  ndr = 51,
  ndw = 51,
  nstep = 100,
  iprint = 10,
  isave = 100,
  tstress = .TRUE.,
  tprnfor = .TRUE.,
  dt = 5.0d0,
  etot_conv_thr = 1.d-9,
  ekin_conv_thr = 1.d-4,
  prefix = 'c6h6',
  pseudo_dir='/scratch/acv0/benzene/',
  outdir='/scratch/acv0/benzene/Out/'
/
&system
  ibrav = 14,
  celldm(1) = 16.0,
  celldm(2) = 1.0,
  celldm(3) = 0.5,
  celldm(4) = 0.0,
  celldm(5) = 0.0,
  celldm(6) = 0.0,
  nat = 12,
  ntyp = 2,
  nbnd = 15,
  nelec = 30,
  ecutwfc = 40.0,
  nr1b= 10, nr2b = 10, nr3b = 10,
  xc_type = 'BLYP'
/
&electrons
  emass = 400.d0,
  emass_cutoff = 2.5d0,
  electron_dynamics = 'sd',
/
&ions
  ion_dynamics = 'none',
/
&cell
  cell_dynamics = 'none',
  press = 0.0d0,
/
ATOMIC_SPECIES
C 12.0d0 c_blyp_gia.pp
H 1.00d0 h.ps
ATOMIC_POSITIONS (bohr)
C     2.6  0.0  0.0
C     1.3 -1.3  0.0
C    -1.3 -1.3  0.0
C    -2.6  0.0  0.0
C    -1.3  1.3  0.0
C     1.3  1.3  0.0
H     4.4  0.0  0.0
H     2.2 -2.2  0.0
H    -2.2 -2.2  0.0
H    -4.4  0.0  0.0
H    -2.2  2.2  0.0
H     2.2  2.2  0.0
\end{verbatim}
|
|
|
|
You can find the description of the input variables in files
|
|
\texttt{INPUT.FPMD}, \texttt{INPUT.HOWTO} and \texttt{INPUT}.
|
|
|
|
\item
|
|
Sometimes a single run is not enough to reach the GS.
In this case, you need to re-run the electronic minimization
stage.
Use the input of the first run, changing
\texttt{restart\_mode = 'from\_scratch'} to
\texttt{restart\_mode = 'restart'}.

Important: unless you are already experienced with the system you
are studying or with the code internals, you usually need to tune
some input parameters, like \texttt{emass}, \texttt{dt}, and
cut-offs.
For this purpose, a few trial runs can be useful: you can
perform short minimizations (say, 10 steps), changing and adjusting
these parameters to your needs.
|
|
|
|
You can specify the degree of convergence with these two
thresholds:

\texttt{etot\_conv\_thr}: total energy difference between two
consecutive steps

\texttt{ekin\_conv\_thr}: value of the fictitious kinetic energy
of the electrons

Usually we consider the system to be in the GS when
\texttt{ekin\_conv\_thr}${} < \sim 10^{-5}$.
You can check the value of the fictitious kinetic energy on the
standard output (column EKINC).
|
|
|
|
Different strategies are available to minimize electrons, but the
most used ones are:
\begin{itemize}
\item
  steepest descent:
\begin{verbatim}
electron_dynamics = 'sd'
\end{verbatim}
\item
  damped dynamics:
\begin{verbatim}
electron_dynamics = 'damp',
electron_damping = 0.1,
\end{verbatim}
  See the input description for how to compute the damping factor;
  the value is usually between 0.1 and 0.5.
\end{itemize}
|
|
|
|
\item
|
|
Once your system is in the GS, depending on how you have prepared
the starting atomic configuration, you should do one of the
following:
\begin{itemize}
\item
  if you have set the atomic positions ``by hand'' and/or from a
  classical code, check the forces on atoms. If they are
  large ($\sim 0.1 - 1.0$ atomic units), you should perform an
  ionic minimization, otherwise the system could break up during
  the dynamics.
\item
  if you have taken the positions from a previous run or a
  previous ab-initio simulation, check the forces. If they
  are too small ($\sim 10^{-4}$ atomic units), this means that
  atoms are already in equilibrium positions and, even if left
  free, they will not move.
  Then you need to randomize positions a little bit; see below.
\end{itemize}
|
|
|
|
\item
|
|
Minimize ionic positions.
|
|
|
|
As pointed out in step 4, if the interatomic forces are too large,
the system could ``explode'' when the ionic dynamics is switched on.
To avoid that we need to relax the system.

Again there are different strategies to relax the system, but the
most used are again steepest descent or damped dynamics for ions
and electrons.
You can also mix electronic and ionic minimization schemes
freely, e.g. steepest descent for ions and damped dynamics for
electrons, or vice versa.
|
|
|
|
\begin{enumerate}
|
|
\item
|
|
Suppose we want to perform a steepest descent for ions.
Then we should specify the following section for ions:
\begin{verbatim}
&ions
  ion_dynamics = 'sd',
/
\end{verbatim}
Change also the ionic masses to accelerate the minimization:
\begin{verbatim}
ATOMIC_SPECIES
C 2.0d0 c_blyp_gia.pp
H 2.00d0 h.ps
\end{verbatim}
while leaving the other input parameters unchanged.

Note that if the forces are really high ($> 1.0$ atomic
units), you should always use steepest descent for the first
relaxation steps ($\sim 100$).
|
|
|
|
\item
|
|
As the system approaches the equilibrium positions, the
steepest descent scheme slows down, so it is better to switch to
damped dynamics:
\begin{verbatim}
&ions
  ion_dynamics = 'damp',
  ion_damping = 0.2,
  ion_velocities = 'zero',
/
\end{verbatim}
A value of \texttt{ion\_damping} between 0.05 and 0.5 is
usually used for many systems.
It is also better to restart with zero ionic and
electronic velocities, since we have changed the masses.
Change the ionic masses further to accelerate the
minimization:
\begin{verbatim}
ATOMIC_SPECIES
C 0.1d0 c_blyp_gia.pp
H 0.1d0 h.ps
\end{verbatim}
|
|
|
|
\item
|
|
When the system is really close to equilibrium, the damped
dynamics slows down too, mainly because electrons and ions are
moved together, so that the ionic forces are not fully accurate.
It is then often better to perform one ionic
step every $N$ electronic steps, or to move the ions only when
the electrons are in their GS (within the chosen threshold).

This can be specified by adding the \texttt{ion\_nstepe} parameter
to the ionic section, which then becomes:
\begin{verbatim}
&ions
  ion_dynamics = 'damp',
  ion_damping = 0.2,
  ion_velocities = 'zero',
  ion_nstepe = 10,
/
\end{verbatim}
Then we specify in the control input section:
\begin{verbatim}
etot_conv_thr = 1.d-6,
ekin_conv_thr = 1.d-5,
forc_conv_thr = 1.d-3
\end{verbatim}
As a result, the code checks every 10 electronic steps whether
the electronic system satisfies the two thresholds
\texttt{etot\_conv\_thr}, \texttt{ekin\_conv\_thr}: if it
does, the ions are advanced by one step.
The process thus continues until the forces become smaller
than \texttt{forc\_conv\_thr}.

Note that to fully relax the system you need many runs and
different strategies, which you should mix and change in order
to speed up the convergence.
The process is not automatic, but relies strongly on
experience and on trial and error.

Remember also that the convergence to the equilibrium
positions depends on the energy threshold for the electronic
GS: correct forces (required to move the ions toward the
minimum) are obtained only when the electrons are in their GS.
A small threshold on forces therefore cannot be satisfied
unless you require an even smaller threshold on total energy.
|
|
\end{enumerate}
|
|
|
|
\item
|
|
randomization of positions.
|
|
|
|
If you have relaxed the system, or if the starting system is
already in the equilibrium positions, then you need to move the ions
away from the equilibrium positions, otherwise they will not move in a
dynamics simulation.
After the randomization you should bring the electrons to the GS
again, in order to start the dynamics with the correct forces and
with the electrons in the GS.
To do so, switch off the ionic dynamics and activate the
randomization for each species, specifying the amplitude of the
randomization itself.
This can be done with the following ionic input section:
\begin{verbatim}
&ions
  ion_dynamics = 'none',
  tranp(1) = .TRUE.,
  tranp(2) = .TRUE.,
  amprp(1) = 0.01
  amprp(2) = 0.01
/
\end{verbatim}
In this way a random displacement (of at most 0.01 a.u.) is added to
the atoms of species 1 and 2.
All other input parameters can remain the same.
|
|
|
|
Note that the difference in the total energy (\texttt{etot})
between relaxed and randomized positions can be used to estimate
the temperature that will be reached by the system.
In fact, starting with zero ionic velocities, all the difference
is potential energy, but in a dynamics simulation the energy will
be equipartitioned between kinetic and potential. To estimate
the temperature, take the difference in energy (de), convert it to
kelvin, divide by the number of atoms and multiply by 2/3.

Randomization can also be useful while we are relaxing the
system, especially when we suspect that the ions are in a local
minimum or on an energy plateau.
|
|
|
|
\item
|
|
Start the Car-Parrinello dynamics.
|
|
|
|
At this point, having minimized the electrons and with the ions
displaced from their equilibrium positions, we are ready to start
a CP dynamics.
We need to specify \texttt{'verlet'} both in ionic and electronic
dynamics.
The thresholds in the control input section will be ignored, like any
parameter related to the minimization strategy.
The first time we perform a CP run after a minimization, it is
always better to set the velocities to zero, unless we have
velocities from a previous simulation to specify in the input
file.
Restore the proper masses for the ions.
In this way we will sample the microcanonical ensemble.
The input sections change as follows:
\begin{verbatim}
&electrons
  emass = 400.d0,
  emass_cutoff = 2.5d0,
  electron_dynamics = 'verlet',
  electron_velocities = 'zero',
/
&ions
  ion_dynamics = 'verlet',
  ion_velocities = 'zero',
/
ATOMIC_SPECIES
C 12.0d0 c_blyp_gia.pp
H 1.00d0 h.ps
\end{verbatim}
If you want to specify the initial velocities for ions, you have
to set \texttt{ion\_velocities = 'from\_input'}, and add the
\texttt{IONIC\_VELOCITIES}\break
card, with the list of velocities in atomic units.
|
|
|
|
IMPORTANT: when restarting the dynamics after the first CP run,
remember to remove or comment out the velocity parameters:
\begin{verbatim}
&electrons
  emass = 400.d0,
  emass_cutoff = 2.5d0,
  electron_dynamics = 'verlet',
  ! electron_velocities = 'zero',
/
&ions
  ion_dynamics = 'verlet',
  ! ion_velocities = 'zero',
/
\end{verbatim}
otherwise you will quench the system, interrupting the sampling of
the microcanonical ensemble.
|
|
|
|
\item
|
|
Changing the temperature of the system.
|
|
|
|
It is possible to change the temperature of the system or to
sample the canonical ensemble by fixing the average temperature. This
is done using the Nos\'e thermostat.
To activate this thermostat for ions you have to specify in the
ions input section:
\begin{verbatim}
&ions
  ion_dynamics = 'verlet',
  ion_temperature = 'nose',
  fnosep = 60.0,
  tempw = 300.0,
  ! ion_velocities = 'zero',
/
\end{verbatim}
where \texttt{fnosep} is the frequency of the thermostat in THz.
It should be chosen to be comparable with the center of the
vibrational spectrum of the system, in order to excite as many
vibrational modes as possible.
\texttt{tempw} is the desired average temperature in kelvin.

It is also possible to specify a thermostat for the electrons.
This is usually activated in metals or in systems where there is a
transfer of energy between ionic and electronic degrees of
freedom.
|
|
\end{enumerate}
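
The temperature estimate described in the randomization step can be
written compactly. As a sketch, assume that the energy difference
$\Delta E$ between randomized and relaxed configurations is expressed
in Hartree atomic units and that $N_{at}$ is the number of atoms; the
symbols and the sample numbers are introduced here for illustration
only, and $1\,\mathrm{Ha}/k_B \simeq 3.158 \times 10^5$~K is the
standard conversion factor:
$$
  T \simeq \frac{2}{3} \, \frac{\Delta E}{N_{at} \, k_B}
$$
For example, $\Delta E = 0.01$~Ha and $N_{at} = 12$ give
$T \simeq \frac{2}{3} \cdot 3158\,\mathrm{K} / 12 \simeq 175$~K.
If only half of $\Delta E$ ends up as kinetic energy, as a strict
equipartition between kinetic and potential energy would imply, the
temperature actually reached is about half of this value, so the
figure is best read as an upper estimate.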
|
|
|
|
\clearpage
|
|
|
|
\section{Performance issues (PWscf)}
|
|
\label{performance}
|
|
|
|
\subsection{CPU time requirements}
|
|
|
|
The following holds for code \texttt{pw.x} and for non-US PPs.
For US PPs there are additional terms to be calculated.
For phonon calculations, each of the $3 N_{at}$ modes requires a CPU
time of the same order as that required by a self-consistent
calculation in the same system.
|
|
|
|
The computer time required for the self-consistent solution at fixed
ionic positions, $T_{scf}$, is:
$$
  T_{scf} = N_{iter} \cdot T_{iter} + T_{init}
$$
where $N_{iter}=\mathtt{niter}=$ number of self-consistency
iterations, $T_{iter}=$ CPU time for a single iteration,
$T_{init}=$ initialization time.
Usually $T_{init} \ll N_{iter} \cdot T_{iter}$.
|
|
|
|
The time required for a single self-consistency iteration
$T_{iter}$ is:
$$
  T_{iter} = N_k \cdot T_{diag} + T_{rho} + T_{scf}
$$
where $N_k=$ number of k-points, $T_{diag}=$ CPU time per Hamiltonian
iterative diagonalization, $T_{rho}=$ CPU time for charge density
calculation, $T_{scf}=$ CPU time for Hartree and exchange-correlation
potential calculation.
|
|
|
|
The time for a Hamiltonian iterative diagonalization $T_{diag}$ is:
$$
  T_{diag} = N_h \cdot T_h + T_{orth} + T_{sub}
$$
where $N_h=$ number of $H\psi$ products needed by iterative
diagonalization, $T_h=$ CPU time per $H\psi$ product, $T_{orth}=$ CPU
time for orthonormalization, $T_{sub}=$ CPU time for subspace
diagonalization.
|
|
|
|
The time $T_h$ required for a $H\psi$ product is
$$
  T_h = a_1 \cdot M \cdot N
      + a_2 \cdot M \cdot N_1 \cdot N_2 \cdot N_3 \cdot
        \log(N_1 \cdot N_2 \cdot N_3)
      + a_3 \cdot M \cdot P \cdot N.
$$
The first term comes from the kinetic term and is usually much smaller
than the others.
The second and third terms come respectively from local and nonlocal
potential.
$a_1$, $a_2$, $a_3$ are prefactors, $M=$ number of valence bands,
$N=$ number of plane waves (basis set dimension),
$N_1$, $N_2$, $N_3=$ dimensions of the FFT grid for wavefunctions
($N_1 \cdot N_2 \cdot N_3 \sim 8N$), $P=$ number of projectors for PPs
(summed on all atoms, on all values of the angular momentum $l$, and
$m=1,\dots,2l+1$).
|
|
|
|
The time $T_{orth}$ required by orthonormalization is
$$
  T_{orth} = b_1 \cdot M_x^2 \cdot N
$$
and the time $T_{sub}$ required by subspace diagonalization is
$$
  T_{sub} = b_2 \cdot M_x^3
$$
where $b_1$ and $b_2$ are prefactors, $M_x=$ number of trial
wavefunctions (this will vary between $M$ and a few times $M$,
depending on the algorithm).
|
|
|
|
The time $T_{rho}$ for the calculation of charge density from
wavefunctions is
$$
  T_{rho} = c_1 \cdot M \cdot Nr_1 \cdot Nr_2 \cdot Nr_3 \cdot
            \log(Nr_1 \cdot Nr_2 \cdot Nr_3)
          + c_2 \cdot M \cdot Nr_1 \cdot Nr_2 \cdot Nr_3 + T_{us}
$$
where $c_1$, $c_2$ are prefactors,
$Nr_1$, $Nr_2$, $Nr_3=$ dimensions of the FFT grid for charge density
($Nr_1 \cdot Nr_2 \cdot Nr_3 \sim 8N_g$, where $N_g=$ number of
G-vectors for the charge density), and $T_{us}=$ CPU time required by
ultrasoft contribution (if any).
|
|
|
|
The time $T_{scf}$ for calculation of potential from charge density is
$$
  T_{scf} = d_1 \cdot Nr_1 \cdot Nr_2 \cdot Nr_3
          + d_2 \cdot Nr_1 \cdot Nr_2 \cdot Nr_3 \cdot
            \log(Nr_1 \cdot Nr_2 \cdot Nr_3)
$$
|
|
where $d_1$, $d_2$ are prefactors.
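
The formulas above imply a simple scaling with system size. As a rough
sketch, assume that the cutoff is kept fixed, so that $N$ and the FFT
grid sizes grow linearly with the cell volume and hence with the
number of atoms $N_{at}$, and that $M$, $M_x$ and $P$ also grow
linearly with $N_{at}$ (these proportionalities are assumptions made
here for illustration). Dropping the smaller kinetic term, one gets
$$
  T_h \sim a_2 \cdot N_{at}^2 \log N_{at} + a_3 \cdot N_{at}^3,
  \qquad
  T_{orth} \sim b_1 \cdot N_{at}^3,
  \qquad
  T_{sub} \sim b_2 \cdot N_{at}^3 .
$$
Under these assumptions, doubling the number of atoms in the cell
multiplies the cost of the cubic terms by about 8, while the FFT term
grows by a factor slightly larger than 4. Which term dominates in
practice depends on the prefactors, which are machine- and
implementation-dependent.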
|
|
|
|
\subsection{Memory requirements}
|
|
|
|
A typical self-consistency or molecular-dynamics run requires
a maximum memory of the order
of $O$ double precision complex numbers, where
$$
  O = m \cdot M \cdot N + P \cdot N + p \cdot N_1 \cdot N_2 \cdot N_3
    + q \cdot Nr_1 \cdot Nr_2 \cdot Nr_3
$$
with $m$, $p$, $q=$ small factors; all other variables have the same
meaning as above.
Note that if only the $\Gamma$ point ($\mathbf{k}=0$) is used to
sample the Brillouin Zone, the value of $N$ is halved.
|
|
|
|
Code \texttt{memory.x} yields a rough estimate of the memory required
|
|
by \texttt{pw.x} and checks for the validity of the input data file as
|
|
well. Use it exactly as \texttt{pw.x}.
|
|
|
|
The memory required by the phonon code follows the same patterns,
|
|
with somewhat larger factors $m$, $p$, $q$.
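
As a worked illustration of the memory estimate for \texttt{pw.x}, the
individual terms can be converted to bytes using 16 bytes per double
precision complex number. The sizes used below ($M = 100$,
$N = 20000$, $P = 200$, and FFT grids of $64^3$ points) are arbitrary
values assumed for the example, and the factors $m$, $p$, $q$ are left
symbolic:
$$
  M \cdot N = 2 \times 10^6 \;\rightarrow\; 32\ \mathrm{MB}, \qquad
  P \cdot N = 4 \times 10^6 \;\rightarrow\; 64\ \mathrm{MB}, \qquad
  64^3 = 262144 \;\rightarrow\; 4.2\ \mathrm{MB}
$$
so that the total is roughly $(32\,m + 64 + 4.2\,(p+q))$~MB. For a
system of this size the wavefunction and projector terms dominate over
the FFT grids.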
|
|
|
|
\subsection{File space requirements}
|
|
|
|
A typical \texttt{pw.x} run will require an amount of temporary disk
space of the order of $O$ double precision complex numbers:
$$
  O = N_k \cdot M \cdot N + q \cdot Nr_1 \cdot Nr_2 \cdot Nr_3
$$
|
|
where $q=2 \cdot \mathtt{mixing\_ndim}$ (number of iterations used in
|
|
self-consistency, default value $=8$) if \texttt{disk\_io} is set to
|
|
\texttt{'high'} or not specified;
|
|
$q=0$ if \texttt{disk\_io='low'} or \texttt{'minimal'}.
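
As a worked illustration, take the arbitrary sizes $N_k = 10$,
$M = 100$, $N = 20000$ and a charge-density FFT grid of $64^3$ points
(values assumed for the example only), with the default
$q = 2 \cdot 8 = 16$ and 16 bytes per double precision complex number:
$$
  N_k \cdot M \cdot N = 2 \times 10^7 \;\rightarrow\; 320\ \mathrm{MB},
  \qquad
  q \cdot 64^3 \simeq 4.2 \times 10^6 \;\rightarrow\; 67\ \mathrm{MB}
$$
for a total of about 390~MB. With \texttt{disk\_io='low'} only the
first term remains.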
|
|
|
|
\subsection{Parallelization issues}
|
|
\label{parissues}
|
|
|
|
\texttt{pw.x} can run in principle on any number of processors (up to
|
|
\texttt{maxproc}, presently fixed at 128 in \texttt{PW/para.f90}).
|
|
The $N_p$ processors can be divided into $N_{pk}$ pools of $N_{pr}$
|
|
processors, $N_p=N_{pk}*N_{pr}$.
|
|
The k-points are divided across $N_{pk}$ pools (``k-point
|
|
parallelization''), while both R- and G-space grids are divided across
|
|
the $N_{pr}$ processors of each pool (``PW parallelization'').
|
|
A third level of parallelization, on the number of bands, is
|
|
currently confined to the calculation of a few quantities that
|
|
would not be parallelized at all otherwise.
|
|
A fourth level of parallelization, on the number of NEB images,
|
|
is available for NEB calculation only.
|
|
|
|
The effectiveness of parallelization depends on the size and type of
|
|
the system and on a judicious choice of the $N_{pk}$ and $N_{pr}$:
|
|
|
|
\begin{itemize}
|
|
\item
|
|
k-point parallelization is very effective if $N_{pk}$ is a divisor
|
|
of the number of k-points (linear speedup guaranteed), \emph{but}
|
|
it does not reduce the amount of memory per processor taken by the
|
|
calculation.
|
|
As a consequence, large systems may not fit into memory.
|
|
The same applies to parallelization over NEB images.
|
|
\item
|
|
PW parallelization works well if $N_{pr}$ is a divisor of both
dimensions along the $z$ axis of the FFT grids, $N_3$ and $Nr_3$
(which may coincide).
It does not scale as well as k-point parallelization, but it
reduces both CPU time \emph{and} memory (the latter almost linearly).
\item
Optimal serial performance is achieved when the data are kept in
the cache as much as possible.
As a side effect, one can achieve better than linear scaling with
the number of processors, thanks to the increase in serial speed
coming from the reduction of data size (making it easier for the
machine to keep data in the cache).
|
|
\end{itemize}
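
As a worked illustration of the pool arithmetic described above (all
numbers are assumptions chosen for the example), consider $N_p = 16$
processors split into $N_{pk} = 4$ pools of $N_{pr} = 4$ processors
each:
$$
  N_p = N_{pk} \cdot N_{pr} = 4 \cdot 4 = 16 .
$$
A calculation with 8 k-points assigns $8/4 = 2$ k-points to each pool,
and FFT grids with $N_3 = Nr_3 = 64$ divide evenly into $64/4 = 16$
sections along $z$ per processor. Running the same job on 15
processors as $3 \times 5$ would satisfy neither condition, since 8 is
not a multiple of 3 and 64 is not a multiple of 5, so the load would
be expected to be less well balanced.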
|
|
|
|
Note that for each system there is an optimal range of number of
processors on which to run the job.
Too large a number of processors will degrade performance,
or may cause the parallelization algorithm to fail in distributing
R- and G-space grids properly.
|
|
|
|
Note also that Beowulf-style machines (PC clusters) may have
|
|
disappointing parallelization performances unless they have a decent
|
|
communication hardware (at least Gigabit ethernet).
|
|
Do not expect good scaling with cheap hardware: plane-wave
|
|
calculations are not at all an "embarrassing parallel" problem.
|
|
Note that multiprocessor motherboards for Intel Pentium CPUs typically
|
|
have just one memory bus for all processors.
|
|
This dramatically slows down any code doing massive access to memory
|
|
(as most codes in the $\nu-$ESPRESSO package do) that runs on processors of
|
|
the same motherboard.
|
|
|
|
\clearpage
|
|
|
|
\section{Troubleshooting (PWscf)}

Almost all problems in PWscf arise from incorrect input data and
result in error stops. Error messages should be self-explanatory,
but unfortunately this is not always true. If the code issues a
warning message and continues, pay attention to it but do not
assume that something is necessarily wrong in your calculation:
most warning messages signal harmless problems.

Note for PC Linux clusters in parallel execution: in at least some
versions of MPICH, the current directory is set to the directory where
the \emph{executable code} resides, instead of being set to the
directory where the code is executed.
This MPICH weirdness may cause unexpected failures in some
postprocessing codes (e.g., \texttt{chdens.x}) that expect a data file
in the current directory.
Workaround: use symbolic links, or copy the executable to the current
directory.

Typical \texttt{pw.x} and/or \texttt{ph.x} (mis-)behavior:
\paragraph{\texttt{pw.x} yields a message like ``error while loading
shared libraries: \dots{} cannot open shared object file''
and does not start.}

Possible reasons:

\begin{itemize}
\item
If you are running on the same machines on which the code was
compiled, this is a library configuration problem.
The solution is machine-dependent.
On Linux, find the path to the missing libraries; then either add
it to file \texttt{/etc/ld.so.conf} and run \texttt{ldconfig}
(must be done as root), or add it to variable
\texttt{LD\_LIBRARY\_PATH} and export it.
Another possibility is to link the static versions of the libraries
(ending with \texttt{.a}) instead of the shared ones (ending with
\texttt{.so}).
\item
If you are \emph{not} running on the same machines on which the
code was compiled: you need either to have the same shared
libraries installed on both machines, or to link all
libraries statically (using appropriate \texttt{configure} or loader options).
The same applies to Beowulf-style parallel machines: the needed
shared libraries must be present on all PCs.
\end{itemize}
\paragraph{errors in examples with parallel execution}

If you get error messages in the example scripts (that is, not errors
in the codes) on a parallel machine, such as
``\texttt{run\_example: -n: command not found}'',
you have forgotten the double quotes around the values of
\texttt{PARA\_PREFIX} and \texttt{PARA\_POSTFIX}.
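
For instance, assuming that the parallel launcher is called
\texttt{mpirun} and takes the number of processes via \texttt{-n}
(both are assumptions: use whatever your machine requires), the
definition should look like the second form below, not the first:

\begin{verbatim}
# wrong: the shell sets PARA_PREFIX=mpirun, then tries to run "-n"
PARA_PREFIX=mpirun -n 4
# right: the whole command line is the value of the variable
PARA_PREFIX="mpirun -n 4"
\end{verbatim}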
\paragraph{\texttt{pw.x} stops with error in reading.}

There is an error in the input data.
Usually it is a misspelled namelist variable, or an empty input file.
Note that out-of-bound indices in dimensioned variables read in the
namelist may cause the code to crash with really mysterious error
messages.
Also note that input data files containing \texttt{\^{}M} (Control-M)
characters at the end of lines (typically, files coming from a Windows
PC) may yield an error in reading.
If none of the above applies and the code stops at the first namelist
(``control'')
and you are running on a PC cluster: your communication library
(MPI) might not be properly configured to allow input
redirection (so that you are effectively reading an empty file).
See section ``Running on parallel machines'', or inquire with your
local computer wizard (if any).
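
Stray Control-M characters can be detected and removed before running
the code. The following Python sketch is one way to do it; the function
name and the choice of rewriting the file in place are made here for
illustration only (utilities such as \texttt{dos2unix} do the same job).

\begin{verbatim}
def strip_carriage_returns(path):
    """Remove all Control-M (carriage return) bytes from a file, in place.

    Returns the number of bytes removed; the file is left untouched
    if it contains none.
    """
    with open(path, "rb") as f:
        data = f.read()
    removed = data.count(b"\r")
    if removed:
        with open(path, "wb") as f:
            f.write(data.replace(b"\r", b""))
    return removed
\end{verbatim}

A nonzero return value tells you that the input file had Windows-style
line endings.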
\paragraph{\texttt{pw.x} mumbles something like ``cannot recover'' or
``error reading recover file''.}

You are trying to restart from a previous job that either produced
corrupted files, or did not do what you think it did. No luck:
you have to restart from scratch.
\paragraph{\texttt{pw.x} stops with error in cdiagh or cdiaghg.}

Possible reasons:
\begin{itemize}
\item
serious error in data, such as bad atomic positions or bad crystal
structure/supercell;
\item
a bad PP (for instance, one with a ghost state);
\item
a failure of the algorithm performing subspace diagonalization.
The LAPACK algorithms used by cdiagh or cdiaghg are very robust
and extensively tested. Still, such algorithms may occasionally
fail. In at least one case the failure was tracked
to the S matrix appearing in the US-PP
formalism not being positive definite. In other cases, the error is
found to be non-reproducible
on different architectures and to disappear if the calculation
is repeated with even minimal changes in its parameters.
In both cases, the reasons for such behavior are unclear and
the only advice is to use conjugate-gradient diagonalization
(\texttt{diagonalization='cg'}), a slower but very robust
algorithm, and see what happens.
\item
HP-Compaq alphas with \texttt{cxml} libraries: try to use compiled
BLAS and LAPACK (or better, ATLAS) instead of those contained in
\texttt{cxml} (just load them before \texttt{cxml}).
\end{itemize}
\paragraph{\texttt{pw.x} crashes with ``floating invalid'' or ``floating divide by zero''.}

If this happens on HP-Compaq Tru64 Alpha machines with an old
version of the compiler: the compiler is most likely buggy.
Otherwise, move to the next item.
\paragraph{\texttt{pw.x} crashes with no error message at all.}

This happens quite often in parallel execution, or under a batch
queue, or if you are writing the output to a file.
When the program crashes, part of the output, including the error
message, may be lost, or hidden in error files that nobody looks
at.
It is the fault of the operating system, not of the code.
Try to run interactively and to write to the screen.
If this doesn't help, move to the next item.
\paragraph{\texttt{pw.x} crashes with ``segmentation fault'' or
similarly obscure messages.}

Possible reasons:
\begin{itemize}
\item
nonexistent or inaccessible {\tt outdir}.
Note that in parallel execution, {\tt outdir} must exist and be
accessible to all active processors.
\item
too much RAM requested (see next item).
\item
if you are using highly optimized mathematical libraries, verify
that they are designed for your hardware.
In particular, for the Intel compiler and MKL libraries, verify that
you loaded the correct set of CPU-specific MKL libraries.
\item
buggy compiler.
If you are using Portland or Intel compilers on Linux PCs or
clusters, see section \ref{installissues}, ``Installation
issues''.
\end{itemize}
\paragraph{\texttt{pw.x} works for simple systems, but not for large
systems or whenever more RAM is needed.}

Possible solutions:
\begin{itemize}
\item
increase the amount of RAM you are authorized to use (which may be
much smaller than the available RAM).
Ask your system administrator if you don't know what to do.
\item
reduce \texttt{nbnd} to the strict minimum, or reduce the cutoffs,
or the cell size.
\item
use conjugate-gradient (\texttt{diagonalization='cg'}: slow
but very robust) or DIIS (\texttt{diagonalization='diis'}:
fast but not very robust):
both require less memory than the default Davidson algorithm.
\item
in parallel execution, use more processors, or use the same number
of processors with fewer pools.
Remember that parallelization with respect to k-points (pools)
does not distribute memory: parallelization with respect to
\textbf{R}- (and \textbf{G}-) space does.
\item
IBM only (32-bit machines): if you need more than 256 MB you must
specify it at link time (option \texttt{-bmaxdata}).
\item
buggy compiler.
Some versions of the Portland compiler on Linux PCs or clusters have
this problem.
\end{itemize}
\paragraph{\texttt{pw.x} runs but nothing happens.}

Possible reasons:
\begin{itemize}
\item
in parallel execution, the code died on just one processor.
Unpredictable behavior may follow.
\item
in serial execution, the code encountered a floating-point error
and goes on producing NaNs (Not a Number) forever unless
exception handling is on (and usually it isn't).
In both cases, look for one of the reasons given above.
\item
maybe your calculation will take more time than you expect.
\end{itemize}
\paragraph{\texttt{pw.x} yields weird results.}

Possible solutions:
\begin{itemize}
\item
if this happens after a change in the code or in compilation or
preprocessing options, try \texttt{make clean} and recompile.
The \texttt{make} command should take care of all dependencies,
but do not rely too heavily on it.
If the problem persists, \texttt{make clean} and recompile with
a reduced optimization level.
\item
maybe your input data are weird.
\end{itemize}
\paragraph{\texttt{pw.x} stops with error message ``the system is
metallic, specify occupations''.}

You did not specify state occupations, but you need to, since your
system appears to have an odd number of electrons.
The variable controlling how metallicity is treated is
\texttt{occupations} in namelist \texttt{\&SYSTEM}.
The default, \texttt{occupations='fixed'}, occupies the lowest
\texttt{nelec/2} states and works only for insulators with a gap.
In all other cases, use \texttt{'smearing'} or \texttt{'tetrahedra'}.
See file \texttt{INPUT\_PW} for more details.
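
A fragment of the \texttt{\&SYSTEM} namelist for a metallic system
might look as follows. The smearing type and the broadening value are
illustrative assumptions, not recommendations, and all other required
variables are omitted; check \texttt{INPUT\_PW} for the names and values
accepted by your version of the code.

\begin{verbatim}
&SYSTEM
   ! other required variables omitted in this sketch
   occupations = 'smearing'
   smearing    = 'gaussian'
   degauss     = 0.02
/
\end{verbatim}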
\paragraph{\texttt{pw.x} stops with ``unexpected error'' in
\texttt{efermi}.}

Possible reasons:
\begin{itemize}
\item
serious error in data, such as bad number of electrons,
insufficient number of bands, absurd value of broadening, or too
few tetrahedra;
\item
the Fermi energy is found by bisection assuming that the
integrated DOS $N(E)$ is an increasing function of the energy.
This is {\em not} guaranteed for Methfessel-Paxton smearing of
order 1 and can give problems when very few k-points are used.
Use some other smearing function: simple Gaussian broadening or,
better, Marzari-Vanderbilt ``cold smearing''.
\end{itemize}
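
The non-monotonic behavior mentioned in the second item can be seen
directly from the occupation function. The Python sketch below uses the
standard first-order Methfessel-Paxton expression, written in terms of
$x=(E_F-\epsilon)/\sigma$; the function names are hypothetical, and the
formula is quoted from the literature rather than taken from the PWscf
source.

\begin{verbatim}
import math


def occ_gauss(x):
    """Gaussian-broadened occupation, with x = (E_F - eps) / sigma."""
    return 0.5 * math.erfc(-x)


def occ_mp1(x):
    """First-order Methfessel-Paxton occupation for the same x."""
    return occ_gauss(x) + x * math.exp(-x * x) / (2.0 * math.sqrt(math.pi))


if __name__ == "__main__":
    for x in (-3.0, -1.5, 0.0, 1.5, 3.0):
        print(f"{x:5.1f}  {occ_gauss(x):9.6f}  {occ_mp1(x):9.6f}")
\end{verbatim}

The Gaussian occupation grows steadily with $x$, while the
Methfessel-Paxton one overshoots 1 and undershoots 0. A sum of such
terms over very few k-points therefore need not be an increasing
function of $E_F$, which is what the bisection assumes.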
\paragraph{the FFT grids in \texttt{pw.x} are machine-dependent.}

Yes, they are!
The code automatically chooses the smallest grid that is compatible
with the specified cutoff in the specified cell, \emph{and} is an
allowed value for the FFT library used.
Most FFT libraries are implemented, or perform well, only with
dimensions that factor into products of small numbers (2, 3, 5
typically, sometimes 7 and 11).
Different FFT libraries follow different rules and thus different
dimensions can result for the same system on different machines (or
even on the same machine, with a different FFT).
See function \texttt{allowed} in \texttt{Modules/fft\_scalar.f90}.
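
The idea can be sketched in a few lines of Python. This is not the
\texttt{allowed} function itself: the function names are hypothetical,
and the set of accepted prime factors is an assumption that in practice
depends on the FFT library.

\begin{verbatim}
def is_good_fft_dim(n, primes=(2, 3, 5)):
    """True if n factors completely into the given small primes."""
    if n < 1:
        return False
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1


def next_good_fft_dim(n, primes=(2, 3, 5)):
    """Smallest dimension >= n accepted by is_good_fft_dim."""
    while not is_good_fft_dim(n, primes):
        n += 1
    return n
\end{verbatim}

Under these assumptions a minimum dimension of 41 becomes 45, or 42 if
a factor of 7 is also accepted, which is how the same input can yield
different grids with different FFT libraries.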

As a consequence, the energy may be slightly different on different
machines.
The only piece that depends explicitly on the grid parameters is the
XC part of the energy that is computed numerically on the grid.
The differences should be small, though, especially for LDA
calculations.

Manually setting the FFT grids to a desired value is possible, but
slightly tricky, using input variables \texttt{nr1, nr2, nr3} and
\texttt{nr1s, nr2s, nr3s}.
The code will still increase them if not acceptable.
Automatic FFT grid dimensions are slightly overestimated, so one may
try, very carefully, to reduce them a little bit.
The code will stop if the requested values are too small; it will
waste CPU time and memory if they are too large.

Note that in parallel execution, it is very convenient to have FFT
grid dimensions along z that are a multiple of the number of
processors.
\paragraph{``warning: symmetry operation \# N not allowed''.}

This is not an error.
\texttt{pw.x} first determines the symmetry operations (rotations)
of the Bravais lattice; then it checks which of these are symmetry
operations of the system (including if needed fractional
translations).
This is done by rotating (and translating if needed) the atoms in
the unit cell and verifying whether the rotated unit cell coincides
with the original one.

If a symmetry operation contains a
fractional translation that is incompatible with the FFT grid,
it is discarded in order to prevent problems with symmetrization.
Typical fractional translations are 1/2 or 1/3 of a lattice
vector. If the FFT grid dimension along that direction is not
divisible respectively by 2 or by 3, the symmetry operation will
not transform the FFT grid into itself.
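
As a worked example (the numbers are hypothetical): a fractional
translation of $1/3$ of the third lattice vector, on a grid with 40
points along that direction, corresponds to a shift of
\[
  \frac{40}{3} \approx 13.33
\]
grid points. This is not an integer, so the operation is discarded.
With 45 points along the same direction the shift is $45/3 = 15$ grid
points, and the operation is kept.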
\paragraph{\texttt{pw.x} doesn't find all the symmetries you
expected.}

See above to learn how PWscf finds symmetry operations.
Some of them might be missing because:
\begin{itemize}
\item
the number of significant figures in the atomic positions is not
large enough.
In file \texttt{PW/eqvect.f90}, the variable \texttt{accep} is
used to decide whether a rotation is a symmetry operation.
Its current value ($10^{-5}$) is quite strict: a rotated atom must
coincide with another atom to 5 significant digits.
You may change the value of \texttt{accep} and recompile.
\item
they are not acceptable symmetry operations of the Bravais
lattice.
This is the case for C$_{60}$, for instance: the $I_h$ icosahedral
group of C$_{60}$ contains 5-fold rotations that are incompatible
with translation symmetry.
\item
the system is rotated with respect to the symmetry axes.
For instance: a C$_{60}$ molecule in the fcc lattice will have 24
symmetry operations ($T_h$ group) only if the double bond is
aligned along one of the crystal axes; if C$_{60}$ is rotated in
some arbitrary way, \texttt{pw.x} may not find any symmetry, apart
from inversion.
\item
they contain a fractional translation that is incompatible with
the FFT grid (see previous paragraph).
Note that if you change cutoff or unit cell volume, the
automatically computed FFT grid changes, and this may explain
changes in symmetry (and in the number of k-points as a
consequence) for no apparent good reason (only if you have
fractional translations in the system, though).
\item
a fractional translation, without rotation, is a symmetry
operation of the system. This means that the cell is actually
a supercell. In this case, all symmetry operations containing
fractional translations are disabled.
The reason is that in this rather exotic case there is no simple
way to select those symmetry operations forming a true group, in
the mathematical sense of the term.
\end{itemize}
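
The effect of the tolerance in the first item can be illustrated with a
short Python sketch. It assumes that positions are compared in crystal
coordinates, modulo lattice translations, component by component; it
only mimics the kind of test described above, with a hypothetical
function name, and is not a transcription of \texttt{PW/eqvect.f90}.

\begin{verbatim}
def coincide(a, b, accep=1.0e-5):
    """True if points a and b differ by a lattice vector within accep.

    a, b are sequences of three crystal (fractional) coordinates.
    """
    for x, y in zip(a, b):
        d = x - y
        # distance of d from the nearest integer
        if abs(d - round(d)) >= accep:
            return False
    return True
\end{verbatim}

With this tolerance, a coordinate of 1/3 typed as \texttt{0.3333} does
not match its exact image, whereas \texttt{0.333333} does.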
\paragraph{the CPU time is time-dependent!}

Yes it is!
On most machines and on most operating systems, depending on machine
load, on communication load (for parallel machines), on various other
factors (including maybe the phase of the moon), reported CPU times
may vary quite a lot for the same job.
Also note that what is printed is supposed to be the CPU time per
process, but with some compilers it is actually the wall time.
\paragraph{in parallel execution, \texttt{pw.x} stops complaining that
``some processors have no planes'' or ``smooth planes'' or
some other strange error.}

Your system does not require that many processors: reduce the number
of processors to a more sensible value.
In particular, both $N_3$ and $Nr_3$ must be $\geq N_{pr}$ (see
section \ref{performance}, ``Performance Issues'', and in particular
section \ref{parissues}, ``Parallelization issues'', for the meaning
of these variables).
\paragraph{``warning : N eigenvectors not converged ...''}

This is a warning message that can be safely ignored if it
is not present in the last steps of self-consistency. If it
is still present in the last steps of self-consistency, and
if the number of unconverged eigenvectors is a significant
part of the total, it may signal serious trouble in self-consistency
(see ``self-consistency is slow or does not converge'' below) or
something badly wrong in input data.
\paragraph{``warning : negative or imaginary charge...'', or
``...core charge ...''}

This is a warning message that can be safely ignored
unless the negative or imaginary charge is sizable,
let us say ${\cal O}(0.1)$. If it is, something seriously
wrong is going on. Otherwise, the origin of the negative
charge is the following. When one transforms a positive
function in real space to Fourier space and truncates at
some finite cutoff, the positive function is no longer
guaranteed to be positive when transformed back to real
space. This happens only with core corrections and with
ultrasoft pseudopotentials. In some cases it may be a
source of trouble (see next point) but it is usually
solved by increasing the cutoff for the charge density.
\paragraph{self-consistency is slow or does not converge.}

Reduce \texttt{mixing\_beta} from the default value (0.7) to
$\sim 0.1$--$0.3$ or smaller, or try a different \texttt{mixing\_mode}.
You may also try to increase \texttt{mixing\_ndim} to more than 8
(default value).
Beware: the larger \texttt{mixing\_ndim}, the larger the amount of
memory you need.
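
Why a smaller mixing factor helps can be seen on a toy fixed-point
problem. The Python sketch below applies plain linear mixing to a
scalar map; it is only meant to illustrate the role of the mixing
factor and is not the mixing scheme implemented in \texttt{pw.x}.
All names are hypothetical.

\begin{verbatim}
def mix_to_self_consistency(f, x0, beta, tol=1.0e-8, max_iter=50):
    """Iterate x <- x + beta * (f(x) - x) until |f(x) - x| < tol.

    Returns (x, converged).
    """
    x = x0
    for _ in range(max_iter):
        residual = f(x) - x
        if abs(residual) < tol:
            return x, True
        x += beta * residual
    return x, False


def toy_map(x):
    """Toy 'output' as a function of the 'input'; fixed point at x = 1."""
    return 3.0 - 2.0 * x
\end{verbatim}

The toy map has slope $-2$ at its fixed point, so each step multiplies
the error by $1-3\beta$: the iteration diverges for $\beta=0.7$ and
converges quickly for $\beta=0.3$.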

If the above doesn't help: verify whether your system is metallic or is
close to a metallic state, especially if you have few k-points.
If the highest occupied and lowest unoccupied state(s) keep exchanging
place during self-consistency, forget about reaching convergence. A
typical sign of such behavior is that the self-consistency error
goes down, down, down, then all of a sudden up again, and so on.
Usually one can solve the problem by adding a few empty bands and a
broadening.

Specific to US PP: the presence of negative charge density regions due
to either the pseudization procedure of the augmentation part or to
truncation at finite cutoff may give convergence problems.
Raising the \texttt{ecutrho} cutoff for charge density will usually
help, especially in gradient-corrected calculations.
\paragraph{structural optimization is slow or does not converge.}

Typical structural optimizations, based on the BFGS algorithm, converge to
the default thresholds (\texttt{etot\_conv\_thr} and
\texttt{forc\_conv\_thr}) in 15--25 BFGS steps (depending on the starting
configuration). This may not happen when your system is characterized by
``floppy'' low-energy modes, which make it very difficult (and of little
use anyway) to reach a well converged structure, no matter what. Other
possible reasons for a problematic convergence are listed below.

Close to convergence the self-consistency error in forces may become
large with respect to the value of forces. The resulting mismatch
between forces and energies may confuse the line minimization
algorithm, which assumes consistency between the two. The code
reduces the starting self-consistency threshold
\texttt{conv\_thr} when approaching the minimum energy configuration,
up to a factor defined by \texttt{upscale}. Reducing
\texttt{conv\_thr} (or increasing \texttt{upscale}) yields a smoother
structural optimization, but if \texttt{conv\_thr} becomes too small,
electronic self-consistency may not converge. You may also increase
variables \texttt{etot\_conv\_thr} and
\texttt{forc\_conv\_thr} that determine the threshold for convergence
(the default values are quite strict).

A limitation to the accuracy of forces comes from the absence of
perfect translational invariance. If we had only the Hartree
potential, our PW calculation would be translationally invariant to
machine precision. The presence of an exchange-correlation potential
introduces Fourier components in the potential that are not in our
basis set. This loss of precision (more serious for
gradient-corrected functionals) translates into a slight but
detectable loss of translational invariance (the energy changes if all
atoms are displaced by the same quantity, not commensurate with the
FFT grid). This puts a limit to the accuracy of forces. The
situation improves somewhat by increasing the \texttt{ecutrho} cutoff.
\paragraph{\texttt{ph.x} stops with ``error reading file''.}

The data file produced by \texttt{pw.x} is bad or incomplete or
produced by an incompatible version of the code.
In parallel execution: if you did not set \texttt{wf\_collect=.true.},
the number of processors and pools for the phonon run should be the
same as for the self-consistent run; all files must be visible to all
processors.
\paragraph{\texttt{ph.x} mumbles something like ``cannot recover'' or
``error reading recover file''.}

You have a bad restart file from a preceding failed execution.
Remove all files \texttt{recover*} in \texttt{outdir}.
\paragraph{\texttt{ph.x} says ``occupation numbers probably wrong''
and continues; or ``phonon + tetrahedra not implemented'' and stops.}

You have a metallic or spin-polarized system but occupations are not
set to ``smearing''. Note that the correct way to calculate occupancies
must be specified in the input data of the non-selfconsistent
calculation, if the phonon code reads data from it. The non-selfconsistent
calculation will not use this information but the phonon code will.
\paragraph{\texttt{ph.x} does not yield acoustic modes with $\omega=0$
at $\mathbf{q}=0$.}

This may not be an error: the Acoustic Sum Rule (ASR) is never exactly
satisfied, because the system is never exactly translationally
invariant as it should be (see the discussion above).
The calculated frequency of the acoustic mode is typically less than
10 cm$^{-1}$, but in some cases it may be much higher, up to 100
cm$^{-1}$.
The ultimate test is to diagonalize the dynamical matrix with program
\texttt{dynmat.x}, imposing the ASR.
If you obtain an acoustic mode with a much smaller $\omega$ (let's say
$<1$ cm$^{-1}$) with all other modes virtually unchanged, you
can trust your results.
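
The logic of this test can be illustrated on a toy model: two equal
unit masses in one dimension, with a force-constant matrix that
slightly violates the ASR. The Python sketch below imposes the sum rule
by correcting the diagonal elements only, which is one simple way to do
it and not necessarily what \texttt{dynmat.x} does; all names and
numbers are hypothetical.

\begin{verbatim}
import math


def impose_asr(c):
    """Return a copy of the 2x2 force-constant matrix c with rows summing to 0.

    Only the diagonal elements are modified.
    """
    return [[-c[0][1], c[0][1]],
            [c[1][0], -c[1][0]]]


def eigenvalues_sym2(c):
    """Eigenvalues (ascending) of a symmetric 2x2 matrix."""
    mean = 0.5 * (c[0][0] + c[1][1])
    half_gap = math.hypot(0.5 * (c[0][0] - c[1][1]), c[0][1])
    return mean - half_gap, mean + half_gap


if __name__ == "__main__":
    k, err = 1.0, 0.01   # spring constant and small ASR violation
    c = [[k + err, -k], [-k, k + err]]
    print(eigenvalues_sym2(c))              # acoustic eigenvalue is not zero
    print(eigenvalues_sym2(impose_asr(c)))  # acoustic eigenvalue is zero
\end{verbatim}

In this toy case the acoustic eigenvalue drops from 0.01 to zero while
the other one changes only from 2.01 to 2.00, about 0.5\%: this is the
situation described above, in which the results can be trusted.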
\paragraph{\texttt{ph.x} yields really lousy phonons, with bad or
negative frequencies or wrong symmetries or gross ASR
violations.}

Possible reasons:
\begin{itemize}
\item
wrong data file read.
\item
wrong atomic masses given in input will yield wrong frequencies
(but the content of file {\tt fildyn} should be valid, since the
force constants, not the dynamical matrix, are written to file).
\item
convergence threshold for either SCF ({\tt conv\_thr}) or phonon
calculation ({\tt tr2\_ph}) too large (try to reduce them).
\item
maybe your system \emph{does} have negative or strange phonon
frequencies, with the approximations you used.
A negative frequency signals a mechanical instability of the
chosen structure.
Check that the structure is reasonable, and check the following
parameters:
\begin{itemize}
\item The cutoff for wavefunctions, \texttt{ecutwfc}
\item For US PP: the cutoff for the charge density,
\texttt{ecutrho}
\item The k-point grid, especially for metallic systems!
\end{itemize}
\end{itemize}
\paragraph{``Wrong degeneracy'' error in star\_q.}

Verify the \textbf{q}-point for which you are calculating phonons.
In order to check whether a symmetry operation belongs to the small
group of \textbf{q}, the code compares \textbf{q} and the rotated
\textbf{q}, with an acceptance tolerance of $10^{-5}$ (set in file
\texttt{PW/eqvect.f90}).
You may run into trouble if your \textbf{q}-point differs from a
high-symmetry point by an amount of that order of magnitude.

\end{document}