Reading Java Bytecode – Reverse Engineering
In Java, we can read .class files with java.io.DataInputStream and write them with java.io.DataOutputStream. The Java API describes DataInputStream as follows:
“A data input stream lets an application read primitive Java data types from an underlying input stream in a machine-independent way. An application uses a data output stream to write data that can later be read by a data input stream.”
The Java class file is organized around its own data types: u1, u2, and u4. These represent one-, two-, and four-byte unsigned quantities. Since DataInputStream lets us read primitive data types, we can use its methods to read part or all of a .class file: readByte() reads 8 bits (1 byte), readUnsignedShort() reads 16 bits (2 bytes), and readInt() reads 32 bits (4 bytes). Note that readByte() and readInt() return signed values; readUnsignedByte() returns a u1 as a value from 0 to 255.
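As a minimal sketch of that mapping, the helpers below read one u1, u2, and u4 from a stream. The class name UnsignedReads and the helper names readU1, readU2, and readU4 are hypothetical, chosen only for illustration; the sample bytes are hand-built rather than taken from a real class file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper class: maps the class file types u1/u2/u4
// onto DataInputStream calls.
class UnsignedReads {

    // u1: one unsigned byte, 0..255
    static int readU1(DataInputStream in) throws IOException {
        return in.readUnsignedByte();
    }

    // u2: two bytes, big-endian, 0..65535
    static int readU2(DataInputStream in) throws IOException {
        return in.readUnsignedShort();
    }

    // u4: readInt() is signed, so widen to long and mask off the sign extension
    static long readU4(DataInputStream in) throws IOException {
        return in.readInt() & 0xFFFFFFFFL;
    }

    public static void main(String[] args) throws IOException {
        // Hand-built sample: a u4, then a u2, then a u1
        byte[] sample = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x34,
            (byte) 0xFF
        };
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(sample))) {
            System.out.printf("u4 = 0x%X%n", readU4(in)); // prints u4 = 0xCAFEBABE
            System.out.println("u2 = " + readU2(in));     // prints u2 = 52
            System.out.println("u1 = " + readU1(in));     // prints u1 = 255
        }
    }
}
```

Returning a long for u4 is one possible design choice; for fields such as the magic number, comparing the raw int from readInt() against 0xCAFEBABE works just as well.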
The structure of the class file is defined as follows:
ClassFile {
    u4 magic;
    u2 minor_version;
    u2 major_version;
    u2 constant_pool_count;
    cp_info constant_pool[constant_pool_count-1];
    u2 access_flags;
    u2 this_class;
    u2 super_class;
    u2 interfaces_count;
    u2 interfaces[interfaces_count];
    u2 fields_count;
    field_info fields[fields_count];
    u2 methods_count;
    method_info methods[methods_count];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
Following the class file structure in the JVM specification, we call the following methods in sequence to reach the constant pool:
readInt();           // magic number, defined as 0xCAFEBABE
readUnsignedShort(); // minor version of the class file format
readUnsignedShort(); // major version of the class file format
readUnsignedShort(); // constant_pool_count: number of constant pool entries plus one
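The sequence above can be sketched as a small, self-contained program. The names ClassHeaderReader, Header, and readHeader are hypothetical, and the ten input bytes are hand-built for the example; with a real file you would pass a FileInputStream instead.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: reads the fixed-size prefix of a class file
// (magic, minor_version, major_version, constant_pool_count).
class ClassHeaderReader {

    static final int MAGIC = 0xCAFEBABE;

    // Simple value holder for the fields read so far
    static final class Header {
        final int minorVersion;
        final int majorVersion;
        final int constantPoolCount;

        Header(int minorVersion, int majorVersion, int constantPoolCount) {
            this.minorVersion = minorVersion;
            this.majorVersion = majorVersion;
            this.constantPoolCount = constantPoolCount;
        }
    }

    static Header readHeader(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        int magic = in.readInt();              // u4 magic
        if (magic != MAGIC) {
            throw new IOException("Not a class file, magic = 0x" + Integer.toHexString(magic));
        }
        int minor = in.readUnsignedShort();    // u2 minor_version
        int major = in.readUnsignedShort();    // u2 major_version
        int cpCount = in.readUnsignedShort();  // u2 constant_pool_count
        return new Header(minor, major, cpCount);
    }

    public static void main(String[] args) throws IOException {
        // Hand-built header: magic, minor 0, major 52, constant_pool_count 16
        byte[] bytes = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x00,
            0x00, 0x34,
            0x00, 0x10
        };
        Header h = readHeader(new ByteArrayInputStream(bytes));
        System.out.println("minor=" + h.minorVersion
                + " major=" + h.majorVersion
                + " constant_pool_count=" + h.constantPoolCount);
    }
}
```

Checking the magic number first is a cheap way to reject input that is not a class file before interpreting any further bytes.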
Now we have the initial contents of the class file and the size of the constant pool, and the real work begins. Because the constant pool holds several different kinds of structures (string constants, class names, field names, and so on), we must first determine which kind each entry is. Each kind is identified by a tag value, defined as follows:
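Constant Type | Value
CONSTANT_Class | 7
CONSTANT_Fieldref | 9
CONSTANT_Methodref | 10
CONSTANT_InterfaceMethodref | 11
CONSTANT_String | 8
CONSTANT_Integer | 3
CONSTANT_Float | 4
CONSTANT_Long | 5
CONSTANT_Double | 6
CONSTANT_NameAndType | 12
CONSTANT_Utf8 | 1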
The entries are stored in cp_info structures, each of which begins with a one-byte tag that evaluates to one of the above values. We can read the tag with the DataInput method readByte().
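As a rough sketch of that step, the code below skips the header and reads the tag of the first constant pool entry. The names ConstantPoolTags, tagName, and readFirstTag are hypothetical. The tag values are those of the JVM specification's original eleven constant types; tags added in later class file versions are assumed out of scope here and reported as unknown. The sketch uses readUnsignedByte(), which returns the same value as readByte() for these small tags.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical sketch: names the tag of the first constant pool entry.
class ConstantPoolTags {

    // Maps a cp_info tag to its constant type name (original eleven types only)
    static String tagName(int tag) {
        switch (tag) {
            case 1:  return "CONSTANT_Utf8";
            case 3:  return "CONSTANT_Integer";
            case 4:  return "CONSTANT_Float";
            case 5:  return "CONSTANT_Long";
            case 6:  return "CONSTANT_Double";
            case 7:  return "CONSTANT_Class";
            case 8:  return "CONSTANT_String";
            case 9:  return "CONSTANT_Fieldref";
            case 10: return "CONSTANT_Methodref";
            case 11: return "CONSTANT_InterfaceMethodref";
            case 12: return "CONSTANT_NameAndType";
            default: return "unknown tag " + tag;
        }
    }

    // Skips the header fields, then returns the first entry's tag
    static int readFirstTag(DataInputStream in) throws IOException {
        in.readInt();            // magic
        in.readUnsignedShort();  // minor_version
        in.readUnsignedShort();  // major_version
        in.readUnsignedShort();  // constant_pool_count
        return in.readUnsignedByte();
    }

    public static void main(String[] args) throws IOException {
        // Hand-built input: header with constant_pool_count 2, then one
        // CONSTANT_Utf8 entry (tag 1, length 4, bytes "Code")
        byte[] bytes = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x00, 0x00, 0x34, 0x00, 0x02,
            0x01, 0x00, 0x04, 'C', 'o', 'd', 'e'
        };
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int tag = readFirstTag(in);
            System.out.println("first tag = " + tag + " (" + tagName(tag) + ")");
        }
    }
}
```

Reading the remaining entries requires knowing how many bytes follow each tag, since the entries vary in size; this sketch deliberately stops at the first tag.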
Continued next week…