diff --git a/jbmc/src/java_bytecode/README.md b/jbmc/src/java_bytecode/README.md index b590319830..5b13ce8c7a 100644 --- a/jbmc/src/java_bytecode/README.md +++ b/jbmc/src/java_bytecode/README.md @@ -25,7 +25,88 @@ To be documented. \subsection java-class-section How a java program / class is represented in a .class -To be documented. +Every Java class is compiled into a .class file. Inner classes, anonymous +classes or classes for tableswitches are also compiled into their own .class +files. + +There exists an [official +specification](https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html) + +Each class files contains information about its version, the constant pool, +information about the contained class, its super-class, its implemented +interfaces, its fields, methods and finally additional attributes, such as +information about lambda functions used in the methods of the class or inner +classes the it contains. + +The content of a .class file can be inspected via the `javap` tool which is part +of the JDK installation. Useful options are `-c` for code, `-p` to display +protected / private methods `-l` for line numbers and local variable +information, as well as `-verbose` which prints a lot of additional information. + +In general, all variable length entries in a class file are represented by first +having an integer `n` that specifies the number of entries and then an array of +`n` such entries in the file. All variable length attribute also contain +information about their length in bytes. + +The integer format used in class files are unsigned integers of size 8/16/32 +bit, which are named as `u1`/`u2`/`u4`. + +\subsection java-class-access-flags Access Flags + +The JVM specification defines different access flags, e.g., `final`, `static`, +`protected`, `private` etc. where different ones are applicable to the class +itself, its fields or methods. All access flags are represented as bits, the set +of bits that are defined for one entity is represented as disjunction of those +values. Each of these values is defined as a constant with a name prefixed with +`ACC_` in JBMC, e.g., as `#define ACC_PUBLIC 0x0001` or `#define ACC_ENUM +0x4000`. + +\subsection java-class-constant-pool Constant Pool + +The constant pool contains all strings and referred values that are used in the +.class. This includes the names of the class itself and its super-class, as well +as the names and signatures of all fields and methods. All strings in the +constant pool are in UTF-16 format. + +\subsection java-class-fields Fields + +Each member variable of a class has a field entry with a corresponding field +information structure. This contains the name of the field, its raw JVM type +(called the descriptor) and an optional signature. + +A signature is an extension to the Java raw types and contains information about +the generic type of an object if applicable. + +The name of the field, the descriptor and the signature are all represented as +indices into the constant pool of the class file. + +\subsection java-class-methods Methods + +Methods are represented in a similar way as fields. Each method has an +associated name, descriptor and optional signature entry in the constant pool +table. + +An implemented method also has several attributes. One is the `Code` attribute +that stores the actual bytecode instructions. There is also an optional +`LocalVariableTable` which contains the names of local variables and +parameters. In addition to this there is also an optional +`LocalVariableTypeTable` that contains information about generic local variables +and parameters. Finally the exception table is defined as entries in the +`Exceptions` attribute. + +Note: most information about generic types is optional and exists mainly as +debugger information. This is because the Java compiler ensures that typing is +correct and creates code accordingly. The same holds true for information on +local variables. It is therefore advisable to compile Java projects with the +`-g` option that adds debugging information in all cases. + +\subsection java-class-attributes Attributes + +The last section contains additional attributes, e.g., `SourceFile` which +specified from which source file the .class was compiled, `BootstrapMethods` +which is used for lambda functions or `InnerClasses` which refers to inner +classes of the class in question. + \section java-bytecode-runtime-exceptions Adding runtime exceptions (java_bytecode_instrument)