Reading the JVM | Hacking the JVM Part 4 - Arithmetic Operations and Constant Pool

This is a continuation from last time. You can find the previous article here:

https://zenn.dev/peyang/articles/reading-jvm-chapter-03-2

This series is designed as a guide to deciphering the JVM specification.
Since the JVM specification is extremely long and often complex, I will summarize the key points for each section.
By understanding the internal structure and operating principles of the JVM, we aim to gain a deeper understanding of Java's performance, security, and memory management mechanisms.

You can find the series here:

https://zenn.dev/peyang/articles/reading-jvm-chapter-00

Chapter 3: Compiling for the Java Virtual Machine

Chapter 3 of the JVM specification is titled "Compiling for the Java Virtual Machine."
This chapter explains how to compile Java source code into bytecode that the JVM can execute.

From this article onward, let's look at specific examples to understand JVM instructions and how they operate.

In this post, we will learn how the JVM performs arithmetic on the operand stack and how constants are loaded from the run-time constant pool.

You can find the handbook here:

https://zenn.dev/peyang/articles/reading-jvm-chapter-03-1

3.3 Arithmetic

The JVM typically performs arithmetic operations using the operand stack (the only exception is the iinc instruction, which directly manipulates the value of a local variable).

The following align2grain method takes int arguments i and grain and returns i rounded up to the nearest multiple of grain, where grain is assumed to be a power of 2.

int align2grain(int i, int grain) {
    return ((i + grain - 1) & ~(grain - 1));
}

When converted to JVM bytecode, it looks like this:

int align2grain(int i, int grain) {
    // i + grain - 1
    iload_1             // Push the value of argument i onto the stack
    iload_2             // Push the value of argument grain onto the stack
    iadd                // Add them and push the result onto the stack
    iconst_1            // Push constant 1 onto the stack
    isub                // Subtract these and push the result onto the stack
    // grain - 1
    iload_2             // Push the value of argument grain onto the stack
    iconst_1            // Push constant 1 onto the stack
    isub                // Subtract these and push the result onto the stack
    // ~(grain - 1)
    iconst_m1           // Push constant -1 onto the stack
    ixor                // Take the bitwise exclusive OR (XOR) and push the result onto the stack
    iand                // Take the bitwise AND and push the result onto the stack
    ireturn             // Return the value at the top of the stack
}
Example of javap-style instructions
Method align2grain(int, int)
0   iload_1             // Push the value of argument i onto the stack
1   iload_2             // Push the value of argument grain onto the stack
2   iadd                // Add them and push the result onto the stack
3   iconst_1            // Push constant 1 onto the stack
4   isub                // Subtract these and push the result onto the stack
5   iload_2             // Push the value of argument grain onto the stack
6   iconst_1            // Push constant 1 onto the stack
7   isub                // Subtract these and push the result onto the stack
8   iconst_m1           // Push constant -1 onto the stack
9   ixor                // Take the bitwise exclusive OR (XOR) and push the result onto the stack
10  iand                // Take the bitwise AND and push the result onto the stack
11  ireturn             // Return the value at the top of the stack

The following diagram shows the flow of this sequence of instructions.

Operands for instructions that perform arithmetic operations are popped from the operand stack, and the results of those operations are pushed back onto the operand stack. From this, we can see that the result of one arithmetic operation can be used as an operand for another arithmetic instruction (without even having to store it in a local variable).

In other words, when performing multiple arithmetic operations in sequence, you can obtain the result simply by lining them up (which effectively creates a nested structure). For example, the ~(grain - 1) part can be calculated with the following instructions:

// Calculate grain - 1
0: iload_2         // Push grain onto the stack
1: iconst_1        // Push 1 onto the stack
2: isub            // Subtract these and push the result (grain - 1) onto the stack

// Calculate ~(grain - 1)
3: iconst_m1       // Push -1 onto the stack
4: ixor            // Take the exclusive OR (XOR) of these and push the result (~(grain - 1)) onto the stack

The following diagram shows the flow of this sequence of instructions.

The initial grain - 1 is obtained by subtracting the constant 1, generated by the iconst_1 instruction, from the contents of local variable 2. These operands are pushed onto the operand stack and subtracted by the isub instruction. The result is then pushed back onto the operand stack.

The value of grain - 1 obtained here is immediately used to perform an exclusive OR (XOR) with the constant -1 generated by the subsequent iconst_m1 instruction. (This leverages the property that ~x == -1 ^ x.)

Similarly, the result of this XOR becomes an operand for the subsequent iand instruction.
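To make the pop-and-push flow concrete, here is a minimal sketch that replays the align2grain instruction sequence on an explicit stack. It is not how the JVM is implemented; OperandStackSketch, binary, and align2grainOnStack are names invented for this illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.IntBinaryOperator;

class OperandStackSketch {
    // Pops value2 (top of stack) and value1, then pushes op(value1, value2),
    // mirroring how iadd, isub, ixor, and iand treat their operands.
    static void binary(Deque<Integer> stack, IntBinaryOperator op) {
        int value2 = stack.pop();
        int value1 = stack.pop();
        stack.push(op.applyAsInt(value1, value2));
    }

    // Hypothetical helper: replays the align2grain bytecode step by step.
    static int align2grainOnStack(int i, int grain) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(i);                        // iload_1
        stack.push(grain);                    // iload_2
        binary(stack, (a, b) -> a + b);       // iadd
        stack.push(1);                        // iconst_1
        binary(stack, (a, b) -> a - b);       // isub  -> i + grain - 1
        stack.push(grain);                    // iload_2
        stack.push(1);                        // iconst_1
        binary(stack, (a, b) -> a - b);       // isub  -> grain - 1
        stack.push(-1);                       // iconst_m1
        binary(stack, (a, b) -> a ^ b);       // ixor  -> ~(grain - 1)
        binary(stack, (a, b) -> a & b);       // iand
        return stack.pop();                   // ireturn
    }

    public static void main(String[] args) {
        int direct = (13 + 8 - 1) & ~(8 - 1);
        System.out.println(align2grainOnStack(13, 8) + " " + direct); // 16 16
    }
}
```

No intermediate result is ever given a name: each binary step consumes the two values on top of the stack and leaves its result there for the next instruction.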

Applying this concept

This concept is used throughout JVM instructions; for example, it is applied similarly in method calls and field accesses.

For example, the following getObjectVoice method calls the toString method of the Object class and returns the result.

String getObjectVoice() {
    Object obj = new Object();
    return obj.toString();
}

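When converted to JVM bytecode, it looks like this (javac would also store the reference into local variable 1 with astore_1 and reload it with aload_1 between the two invoke instructions; those two instructions are omitted here to keep the focus on the operand stack):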

String getObjectVoice() {
    new java/lang/Object
    dup
    invokespecial java/lang/Object-><init>()V
    invokevirtual java/lang/Object->toString()Ljava/lang/String;
    areturn
}

In this example, an instance of the Object class is created with the new instruction, and its reference is pushed onto the operand stack. Next, the reference is duplicated with the dup instruction and pushed onto the operand stack again. The duplicated reference is used to call the constructor via the invokespecial instruction, and the original reference is used to call the toString method via the invokevirtual instruction.

In this way, JVM instructions use the operand stack to efficiently perform arithmetic operations and method calls.
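The listing above keeps the new object's reference on the operand stack throughout. As a rough illustration, the Java below expresses the same idea at source level by never naming a local variable. Compiling it with javac and inspecting it with javap -c should show the same new, dup, invokespecial, invokevirtual, areturn shape, though that is worth checking on your own toolchain. The class and method names are made up for this sketch.

```java
class ChainedCallSketch {
    // Same behavior as getObjectVoice, but the new object is never stored
    // in a local variable: its reference only lives on the operand stack.
    static String getObjectVoiceInline() {
        return new Object().toString();
    }

    public static void main(String[] args) {
        // Prints something of the form java.lang.Object@<hash>
        System.out.println(getObjectVoiceInline());
    }
}
```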

3.4 Accessing the Run-Time Constant Pool

Many numeric constants, objects, fields, and even methods are stored in the runtime constant pool of the current class and are accessed from there.

Values of types int, long, float, and double, as well as references to strings, can be retrieved using the ldc family of instructions (ldc, ldc_w, and ldc2_w). (References to objects will be discussed later.)

The ldc and ldc_w instructions are used to retrieve Category 1 constants (constants other than double or long) and string references. Usually, ldc is used, but if the constant pool has a very large number of entries and the index exceeds 255, ldc_w is used instead.

On the other hand, the ldc2_w instruction is used to retrieve Category 2 constants (constants of type double or long).
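The choice among these instructions can be sketched as a small decision function. This is only an approximation of what a compiler such as javac typically does, not its actual implementation; chooseIntPush, chooseLongPush, and the poolIndex parameter are hypothetical names introduced for this example.

```java
class ConstantPushSketch {
    // Hypothetical helper: names the instruction a compiler would typically
    // pick to push an int constant. poolIndex is the assumed constant pool
    // index of the value, used only when no shorter form applies.
    static String chooseIntPush(int value, int poolIndex) {
        if (value >= -1 && value <= 5) {
            return value == -1 ? "iconst_m1" : "iconst_" + value;
        }
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return "bipush " + value;
        }
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return "sipush " + value;
        }
        // ldc carries a one-byte index operand, ldc_w a two-byte one.
        return (poolIndex <= 255 ? "ldc #" : "ldc_w #") + poolIndex;
    }

    // Hypothetical helper: long constants other than 0 and 1 go through
    // ldc2_w, which always carries a two-byte index.
    static String chooseLongPush(long value, int poolIndex) {
        if (value == 0L || value == 1L) {
            return "lconst_" + value;
        }
        return "ldc2_w #" + poolIndex;
    }

    public static void main(String[] args) {
        System.out.println(chooseIntPush(100, 1));       // bipush 100
        System.out.println(chooseIntPush(1000000, 1));   // ldc #1
        System.out.println(chooseIntPush(1000000, 300)); // ldc_w #300
        System.out.println(chooseLongPush(1L, 6));       // lconst_1
    }
}
```

The only thing that separates ldc from ldc_w here is the width of the index operand, which is why the 255 boundary matters.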

Compilation is straightforward in any case, as shown below.

void useManyNumeric() {
    int i = 100;
    int j = 1000000;
    long l1 = 1;
    long l2 = 0xffffffff;
    double d = 2.2;
    
    // The parts using these values are omitted.
}

When this code is converted into JVM bytecode, it looks like the following.
Note that, for explanatory purposes, the following is not written in JAL; it simply shows the output in the style of the javap command.

Method void useManyNumeric()
0   bipush 100      // Use bipush for values within the byte range
2   istore_1
3   ldc #1          // Use ldc for large integer values (1,000,000)
5   istore_2
6   lconst_1        // Use lconst_<i> for small long values (0, 1)
7   lstore_3
8   ldc2_w #6       // Use ldc2_w for other long values (here 0xffffffff, an int literal equal to -1, widened to long)
                    // Note that any long value can be handled by the ldc2_w instruction
11  lstore 5
13  ldc2_w #8       // Use ldc2_w for double constants (2.200000)
                    // Note that any double value can be handled by the ldc2_w instruction
16  dstore 7

// The parts using these values are omitted.
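The l2 line in useManyNumeric deserves a second look. Because 0xffffffff has no L suffix, it is an int literal with the value -1, and that value is then widened to long. The short sketch below (class and field names are invented for illustration) contrasts it with the suffixed form.

```java
class HexLiteralSketch {
    // int literal 0xffffffff is -1; widening to long keeps it -1.
    static final long WIDENED = 0xffffffff;
    // long literal 0xffffffffL is 4294967295.
    static final long SUFFIXED = 0xffffffffL;

    public static void main(String[] args) {
        System.out.println(WIDENED);   // -1
        System.out.println(SUFFIXED);  // 4294967295
    }
}
```

Either way the value is neither 0 nor 1, so no lconst_<i> form applies and the constant is loaded with ldc2_w.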

Summary

How was it?
In this article, we learned about JVM arithmetic operations and how to access the constant pool.
We also saw how the operand stack lets the result of one instruction feed directly into the next, for arithmetic and method calls alike.

Next time, we will learn about the various ways to handle control variables.
Until then, have a great bytecode life!

Next Article Link

https://zenn.dev/peyang/articles/reading-jvm-chapter-03-4

  • Lindholm, T., Yellin, F., Bracha, G., & Smith, W. M. D. (2025). The Java® Virtual Machine Specification: Java SE 24 Edition.
  • Lindholm, T., & Yellin, F. (1999). The Java™ Virtual Machine Specification (2nd ed.). Addison-Wesley. ISBN 978-0-201-43294-7
  • Otavio, S. (2024). Mastering the Java Virtual Machine. Packt Publishing. ISBN 978-1-835-46796-1
  • Godfrey, N., & Koichi, M. (2010). Decompiling Java: Reverse-analysis Techniques and Code Obfuscation. ISBN 978-4-87311-449-1