diff --git a/docs/groundstate.rst b/docs/groundstate.rst
index 9596a366..7867340a 100644
--- a/docs/groundstate.rst
+++ b/docs/groundstate.rst
@@ -157,6 +157,40 @@ where ``Diag.MPOrder`` specifies the order of the Methfessel-Paxton
 expansion.  It is recommended to start with the lowest order and
 increase gradually, testing the effects.
 
+Go to :ref:`top <groundstate>`.
+
+.. _gs_pad:
+
+Padding Hamiltonian matrix by setting block size
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+By default, the size of the Hamiltonian and overlap matrices
+is set by the total number of support functions.
+This number can be prime, in which case diagonalisation can be very slow
+because the matrix cannot be divided evenly into smaller blocks.
+
+Padding changes the size of the Hamiltonian matrix to improve
+the efficiency of the diagonalisation. To set the block size
+of the matrix, specify the following two variables.
+
+ ::
+
+  Diag.BlockSizeR       20
+  Diag.BlockSizeC       20
+
+Note that these two numbers should be the same when padding
+(and when using ELPA, which will be introduced to CONQUEST soon).
+We suggest a value between 20 and 200, but
+this should be tested.
+
+Padding was introduced after v1.2. To disable it,
+set the following variable.
+
+ ::
+
+  Diag.PaddingHmatrix              F 
+
+
 Go to :ref:`top <groundstate>`.
 
 .. _gs_on:
diff --git a/docs/input_tags.rst b/docs/input_tags.rst
index e14203d5..c416d421 100644
--- a/docs/input_tags.rst
+++ b/docs/input_tags.rst
@@ -657,29 +657,27 @@ Diag.GammaCentred (*boolean*)
 
     *default*: F
     
-Diag.ProcRows (*integer*)
+Diag.PaddingHmatrix (*boolean*)
+    Pads the Hamiltonian and overlap matrices so that a suitable block
+    size can be used (see below). Padding was introduced after v1.2.
+    Setting this to F disables padding.
 
-    *default*:
-
-Diag.ProcCols (*integer*)
-
-    *default*:
+    *default*: T
 
 Diag.BlockSizeR (*integer*)
+    Block size for rows (see next).
 
-    *default*:
+    *default*: Determined automatically
 
 Diag.BlockSizeC (*integer*)
     R ... rows, C ... columns
-    These are ScaLAPACK parameters, and can be set heuristically by the code. Blocks
-    are sub-divisions of matrices, used to divide up the matrices between processors.
+    These are ScaLAPACK parameters, and can be set heuristically by the code. 
+    Blocks are sub-divisions of matrices, used to divide up the matrices between processors.
     The block sizes need to be factors of the square matrix size
     (i.e. :math:`\sum_{\mathrm{atoms}}\mathrm{NSF(atom)}`). A value of 64 is considered
-    optimal by the ScaLAPACK user’s guide. The rows and columns need to multiply
-    together to be less than or equal to the number of processors. If ProcRows
-    :math:`\times` ProcCols :math:`<` number of processors, some processors will be left idle.
+    optimal by the ScaLAPACK user’s guide. 
 
-    *default*:
+    *default*: Determined automatically
 
 Diag.MPShift[X/Y/Z] (*real*)
     Specifies the shift *s* of k-points along the x(y,z) axis, in fractional
@@ -742,7 +740,10 @@ Diag.ProcRows (*integer*)
 
 Diag.ProcCols (*integer*)
     Number of columns in the processor grid for SCALAPACK within each k-point
-    processor group 
+    processor group. The product of ProcRows and ProcCols must be
+    less than or equal to the number of processors. If ProcRows
+    :math:`\times` ProcCols :math:`<` number of processors, some processors will be left idle.
+
 
     *default*: Determined automatically