Variables

Imagine we have a container labeled favStory, declared to store a Book that represents my favourite story.

Would it be right for us to (try to) store an instance of a Person in this container?

Absolutely not!

It is therefore important for us to appreciate a few things about variables:

  • A variable is a container that can store a value.
  • A variable is declared to be of a certain type:
    • Its type determines the values you may place inside the variable (container).
    • There are two categories of types:
      • Primitive types, and
      • Object reference types.
  • A variable is declared with a non-conflicting name:
    • Its name allows us to access what’s inside the variable (container).
  • Different values have different types, and different types need different amounts of space.
    • Therefore, the size of the variable depends on the size of its declared type.


Primitive variables

As you already know, containers come in different sizes.

In programming languages, variables also come in different sizes depending on their type.

The first category of types is known as primitives:

  • Primitives are predefined types that are already built into Java.
  • There are eight primitive types, and each has a size and range for the possible values:
Type      Size      Min to max values (inclusive)
byte      8 bits    -128 to 127
short     16 bits   -32,768 to 32,767
int       32 bits   -2^31 to 2^31 - 1
long      64 bits   -2^63 to 2^63 - 1
float     32 bits   varies
double    64 bits   varies
boolean   varies    true or false
char      16 bits   0 to 65,535


As you can see, types allow for:

  • Different kinds of data to be represented:
    • e.g. true/false logic vs decimals vs whole numbers,
  • Larger ranges when required:
    • e.g. if you are richer than the maximum value of an int, use a long
  • Different sizes to save space:
    • e.g. if you only need very small numbers, then byte will take up less space.
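
The range point above can be sketched in a few lines. This is only an illustration: the class name RangeDemo and its two helper methods are made up for this example, while MIN_VALUE and MAX_VALUE are constants provided by Java's standard wrapper classes.

```java
class RangeDemo {
    // Adds one using int arithmetic: past the maximum, the result wraps around.
    static int addOneAsInt(int value) {
        return value + 1;
    }

    // Adds one using long arithmetic: the int is widened first, so there is room.
    static long addOneAsLong(int value) {
        return value + 1L;
    }

    public static void main(String[] args) {
        System.out.println(Byte.MIN_VALUE + " to " + Byte.MAX_VALUE);        // -128 to 127
        System.out.println(Integer.MIN_VALUE + " to " + Integer.MAX_VALUE);  // -2147483648 to 2147483647
        System.out.println(addOneAsInt(Integer.MAX_VALUE));                  // -2147483648 (wrapped around)
        System.out.println(addOneAsLong(Integer.MAX_VALUE));                 // 2147483648
    }
}
```

Notice that Java does not warn you when an int overflows, so choosing a type with a large enough range is up to you.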

Here are just a few examples:

int testScore = 98;
boolean isSunny = true;
double pi = 3.14159265;
char myGrade = 'A';

Which you can visualise as follows:

For more information on primitive types, see the Java documentation about Primitive Data Types.

Another important thing that the documentation says is:

“Primitive values do not share state with other primitive values.”

Here’s what this means:
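
A minimal sketch of the idea, using two int variables named num1 and num2. The values 10 and 25, the class name PrimitiveCopyDemo and the helper method are assumptions chosen purely for illustration.

```java
class PrimitiveCopyDemo {
    // Copies num1 into num2, changes num2, and reports what num1 holds afterwards.
    static int originalAfterChangingCopy() {
        int num1 = 10;
        int num2 = num1;   // the value 10 is copied into num2
        num2 = 25;         // only num2 changes
        System.out.println("num2 = " + num2);  // num2 = 25
        return num1;       // still 10
    }

    public static void main(String[] args) {
        System.out.println("num1 = " + originalAfterChangingCopy());  // num1 = 10
    }
}
```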


In this example, we see that num1 and num2 do not share state.

Even though the value of num2 was copied over from the value in num1, changing one later on will not affect the other.

Type Casting for Primitive Types

Since these different types have different ranges of values that they can store, we need to be aware when we are assigning the value of one to the other.

This is an easy read about Java Type Casting.

There are two types of casting in Java: widening, and narrowing.

Widening

This is easy to understand, and you don’t need to worry about it as Java handles it automatically.

In a nutshell, we are converting a smaller type to a larger type.

For example:

int anInt = 12345;

long aLong = anInt;         // automatically widened
double aDouble = anInt;     // automatically widened

Narrowing

It’s when you are moving in the other direction that you need to be careful, as you will potentially be losing information about the data saved.

In other words, we will be converting a larger type to a smaller type.

It’s not about the value stored (even if “it will obviously fit”), it’s about the type.

For example:

double aDouble = 23.0;     

int anInt = (int) aDouble;     // explicit narrowing 

You will see that we have to explicitly narrow from a double to an int type, even though 23.0 is the same as 23 and clearly that fits inside an int!

This is also required because the compiler wants you to explicitly acknowledge that you will potentially lose some information in the process of narrowing (i.e., if aDouble has the value 99.99, then anInt will have the value 99, losing the decimal component).
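
Here is a small sketch of what narrowing can do to a value. The class name NarrowingDemo and its helper methods are made up for this example; the casts themselves are standard Java.

```java
class NarrowingDemo {
    // double -> int: the fractional part is dropped (truncated toward zero).
    static int toInt(double value) {
        return (int) value;
    }

    // int -> byte: only the lowest 8 bits are kept, so large values wrap around.
    static byte toByte(int value) {
        return (byte) value;
    }

    public static void main(String[] args) {
        System.out.println(toInt(23.0));    // 23
        System.out.println(toInt(99.99));   // 99
        System.out.println(toInt(-99.99));  // -99
        System.out.println(toByte(127));    // 127
        System.out.println(toByte(130));    // -126 (130 does not fit in a byte)
    }
}
```

The last line shows why the compiler insists on the explicit cast: the result can be a completely different number, not just a less precise one.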


Object reference variables

In Java, if it’s not a primitive type – then it’s an object reference type!

Joke time: Back to the Person class

Java is a very ethically-conscious programming language…

It will never store a person inside a container. Variables aren’t big enough to contain them!


But seriously, jokes aside:

  • As we’ve seen with classes, we can create a custom type with plenty of different field types.
  • This means that instances need to be large enough to store the values of all their fields.

Instances are placed in a special part of the memory known as the heap (more on this very soon).

So, if the instance itself is not inside the variable – then what is inside the variable?

It’s a reference to where the instance is stored in memory!

A reference is effectively the address of that memory location!

Therefore, all non-primitive variables are actually object reference variables:

So if we have the following code:

Person bob = new Person("Bob");       // located at 0x837
Person jenny = new Person("Jenny");   // located at 0x819
Person vicky = new Person("Vicky");   // located at 0x792
Person joe = new Person("Joe");       // located at 0x810
Person alex = new Person("Alex");     // located at 0x846

Then in reality this is what we end up with:

Notice that the instances in the heap aren’t necessarily laid out in the order they were declared or instantiated.

It’s completely up to the Java runtime to decide where instances are stored.

All we care about is getting the reference (i.e. location) of where they are as soon as the instance is created.

This is the job of the new keyword.

Consider the following code:

public static void main(String[] args) {
    Person jenny = new Person("Jenny", "Jones", 20);
    Person bob;
}

We can visualise it as follows:

The points of interest are:

A variable named jenny is declared of type Person:

  • The variable is an object reference type.
  • The variable is placed on the stack, within the stack frame corresponding to the main() method.

The new Person(...) bit is executed:

  • The Java runtime allocates space in the heap.
  • The amount of space allocated needs to fit all the corresponding information for an instance of Person.
  • The constructor is executed, which assigns the fields based on the arguments passed in.
  • The instance happens to be placed at 0x819.
  • This reference (address) is “returned” from the constructor.

The = assignment operator is executed:

  • The reference value (i.e. 0x819) that is returned from the constructor is copied into the jenny variable.
  • If we don’t store the reference from the constructor, then we won’t have any idea where the instance was placed!

A variable named bob is declared of type Person:

  • Like jenny, the variable is an object reference type.
  • The variable is placed on the stack, within the stack frame corresponding to the main() method.

There is no value assigned inside the bob object reference variable:

  • As there is no = used here, there is no value stored inside the bob variable.
  • We say the variable is uninitialised.
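  • As this is a local variable, Java does not give it a default value, and the compiler will not let us read it until it has been assigned (a field of an object reference type, by contrast, defaults to null).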
  • If this is what you intended (i.e. not to initialise it just yet), it’s always good practice to explicitly initialise it to null, like so:

      // explicitly initialise to null
      Person bob = null;
    


Object reference values share state

Java always copies by value.

We saw this above, namely how the primitive values did not share state when copied.

Believe it or not, object reference variables are also copied by value.

  • In other words, the address (i.e. reference) is copied.
  • But since the address is copied, then in effect the object instance itself (i.e. the value) is “shared”.

Let’s see this in action by extending our code example above with an extra line of code:

public static void main(String[] args) {
    // code as we had before
    Person jenny = new Person("Jenny", "Jones", 20);
    Person bob;
    
    // extra line of code
    bob = jenny;
}

We can visualise it as follows:

The points of interest are:

We read the value stored inside the jenny object reference variable:

  • The value is 0x819.
  • We don’t need to go into the heap or engage with the actual instance in any way.

The value is copied:

  • The job of the = assignment operator is to copy. It’s copying the value 0x819.

We write to the bob object reference variable:

  • The value we are writing is 0x819.
  • We save it into the bob variable.
  • We never went to the heap to interact with any of the fields etc.

What we have done is copy the reference (not copy the instance).

So, in effect, we have two object reference variables that refer to the exact same instance:

  • We only ever used the new keyword once (the first line of code).
  • There is therefore only one instance in existence.
  • Two variables are pointing to it.
  • If we change anything about the instance through one variable, we will see the changes through the other variable.
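
The last point can be sketched as follows. The Person class here is a cut-down stand-in written for this example; in particular, the setAge() method is an assumption and may not match the Person class used in the course.

```java
// Hypothetical, minimal Person class for illustration only.
class Person {
    private String firstName;
    private String lastName;
    private int age;

    Person(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }

    int getAge() { return age; }

    void setAge(int age) { this.age = age; }
}

class SharedStateDemo {
    public static void main(String[] args) {
        Person jenny = new Person("Jenny", "Jones", 20);
        Person bob = jenny;                   // copies the reference, not the instance

        bob.setAge(21);                       // change the instance through bob...
        System.out.println(jenny.getAge());   // 21 (...and see the change through jenny)
        System.out.println(bob == jenny);     // true (both hold the same reference)
    }
}
```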


The low level of data storage: the “gridded wall”

  • Think of memory as a massive wall, with a single column on it.
  • There are many rows stacked up on top of each other to form this tall wall.
    • Every row in the wall is uniquely addressed, starting from zero.
    • Every row in the wall has exactly the same size as any other row.
    • If you had a bigger wall, the size of each row remains the same.
  • Within each row, there are 8 tiny storage boxes (or bits):
    • You store a value in one of these boxes/bits by painting it,
    • You can paint each of those boxes/bits as either black (0) or white (1),
    • The value of the row depends on the b/w combination of those 8 boxes/bits.
  • A single row can therefore store 2^8 (i.e. 256) different value combinations:
    • If you need fewer combinations of values, fewer boxes/bits (even just one) would suffice.
    • If you need more unique values, you will need to use more boxes/bits.
  • But recall that the addressing only happens at the row level.
    • In other words, the smallest addressable storage space is 1 byte (8 bits).

For example, if I wanted to store the value of whether Nasser is happy in this wall, I would need only 1 box (as there are only 2 possible value combinations: I’m either happy, or I’m not).

  • I would paint all bits of the corresponding row as bbbbbbbw (i.e. 00000001) to represent true, or bbbbbbbb (i.e. 00000000) to represent false.
  • I would also record the address of that row on the wall, so that I can recall or update the value later on.
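
To get a feel for the numbers, here is a small sketch that counts the combinations a given number of bits can store, and prints a value as one 8-bit row. The class name BitsDemo and its helper methods are made up for this example; Integer.toBinaryString() and String.format() are standard Java.

```java
class BitsDemo {
    // Number of distinct values that the given number of bits can represent (2^bits).
    static long combinations(int bits) {
        return 1L << bits;
    }

    // The 8-bit pattern of a value between 0 and 255, padded with leading zeros.
    static String asRow(int value) {
        String bits = Integer.toBinaryString(value & 0xFF);
        return String.format("%8s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(combinations(1));  // 2
        System.out.println(combinations(8));  // 256
        System.out.println(asRow(1));         // 00000001 (true, in the example above)
        System.out.println(asRow(0));         // 00000000 (false, in the example above)
        System.out.println(asRow(255));       // 11111111
    }
}
```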


The high level of data storage: Heap and Stack

  • At a higher level of abstraction, an important distinction in memory is:
    • Data stored on the heap, and
    • Data stored on the stack.
  • Within the stack, or within the heap, we still have rows and boxes as above.
  • What data goes to heap, and what goes to stack?

The Stack and Stack Frames

The stack is composed of stack frames stacked on top of each other.

Each stack frame is a portion of the memory temporarily being used, allocated to store data needed for a given method execution/call.

When a method is called, a stack frame such as this is created for it:

Each stack frame has space to store the following details to support method invocation:

  • Local variables (if any) that are declared within the method,
  • Parameters (if any) that are passed into the method,
  • The return address of the next instruction to execute (when the method ends),
    • This way, the program knows which statement to continue executing when it returns to the method that called the current method.
  • The return value (if any) that is being returned by the method once it ends.

A newly-created stack frame is always added to the top of the stack.

The data stored in the stack frame will persist until the method finishes executing.

When the method execution finishes, the stack frame associated with that particular method execution is trashed (i.e. removed from the stack).

All the data corresponding to the local variables will be removed with it.

  • However, note that object values will not be affected (as they are stored in the heap—see below)

The method completes when:

  • A return statement is encountered inside the method, or
  • The closing } is reached at the end of the method, or
  • An exception ends the method (more on exceptions later).

If another method is called, then a new stack frame is allocated to store the data associated with the new method execution.

In summary, stack frames are what make it possible to pass values/data back and forth between methods (i.e. as parameters and return values).

The call stack

public static Person bar() {
    Person p = new Person("Bob", "Jones", 18);
    return p; 
}
 
public static int foo(int a, int b) {
    int c = 2*a;
    Person d = bar();
    return c + b + d.getAge();
}

public static void main(String[] args) { 	
    int v1 = 5;
    int v2 = 7;
    int a = foo(v1, v2);
    System.out.println(a);
}


The Heap

The heap allows us to store data that we wish to be more persistent across method calls.

In Java, all object instances go on the heap.

We see this in the example immediately above:

  • The bar() method created an instance of Person.
  • The local variable p (an object reference variable) is stored on bar()’s stack frame.
    • However, the object instance itself (i.e. the value) is stored on the heap.
  • bar() copies the reference inside p, and returns it to foo().
    • It is saved inside the local variable d.
  • The object reference variables p and d are both referring to the same Person instance stored in the heap.
    • There is therefore only one Person instance.

Using the new keyword along with the constructor of a class will create an instance of that class, and place it on the heap.

The size of the instance (in the heap) will depend on the instance fields declared in the class.

If we are going to pass object references as parameters, the same thing happens:

  • The object instances are always on the heap, and
  • Only the reference is copied over into the stack frame (never the instance itself).
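
Here is a sketch of what this means in practice when an object reference is passed as a parameter. The Person class is a cut-down stand-in (its setAge() method is an assumption), and the method names haveBirthday() and replace() are invented for this example.

```java
// Hypothetical, minimal Person class for illustration only.
class Person {
    private String firstName;
    private String lastName;
    private int age;

    Person(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }

    int getAge() { return age; }

    void setAge(int age) { this.age = age; }
}

class ParameterDemo {
    // p holds a copy of the caller's reference, so this changes the caller's instance.
    static void haveBirthday(Person p) {
        p.setAge(p.getAge() + 1);
    }

    // Only the local copy of the reference is repointed; the caller's variable is untouched.
    static void replace(Person p) {
        p = new Person("Someone", "Else", 99);
    }

    public static void main(String[] args) {
        Person jenny = new Person("Jenny", "Jones", 20);

        haveBirthday(jenny);
        System.out.println(jenny.getAge());  // 21

        replace(jenny);
        System.out.println(jenny.getAge());  // 21 (jenny still refers to the original instance)
    }
}
```

When replace() finishes, its stack frame (including p) is removed. The new instance it created is no longer referred to by any variable.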




Equality versus identity

Primitive types

For primitive types, it’s easy to understand what makes one primitive “equal” to another:

int secretNumber = 123;
int yourGuess = 123;

if (secretNumber == yourGuess) {
    System.out.println("Correct guess!");
} else {
    System.out.println("Sorry, game over.");
}

Even though they are two completely different variables, they are still clearly equal (because their values, 123, are the same).

When it comes to dealing with the primitive types in Java, there’s really no other (mis)interpretation possible. Nice and simple.

Object reference types

The following is a very brief overview of the core concept at play here:

When it comes to object reference variables, it is important to understand that “equal” can mean one of two things:

Identity

Note: Identity is sometimes also called “reference equality”. But we’ll stick to using the term “identity” as this avoids confusion.

Identity is when we have two object reference variables (or even the same object reference variable), and they are both pointing to the exact same instance on the heap.

In other words, the address stored is the same in both object reference variables:

In particular, have a close look at the Colour class above. You will see there’s nothing really in there, and that identity really only depends on the memory addresses.

Here’s how we visualise this situation:


Despite c1 and c2 having exactly the same values for all their respective fields, they are still referring to two different instances (and therefore the reference values are different). So we say they do not have the same identity.
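
A minimal sketch of this situation is shown below. The Colour class here is a bare-bones stand-in with assumed field names (name, red, green, blue) and no equals() method, and the variable c3 is added for illustration.

```java
// Hypothetical, minimal Colour class: just fields and a constructor.
class Colour {
    private String name;
    private int red;
    private int green;
    private int blue;

    Colour(String name, int red, int green, int blue) {
        this.name = name;
        this.red = red;
        this.green = green;
        this.blue = blue;
    }
}

class IdentityDemo {
    public static void main(String[] args) {
        Colour c1 = new Colour("red", 255, 0, 0);
        Colour c2 = new Colour("red", 255, 0, 0);  // same field values, but a second instance
        Colour c3 = c1;                            // a copy of the reference stored in c1

        System.out.println(c1 == c2);  // false (two different instances)
        System.out.println(c1 == c3);  // true  (same instance)
        System.out.println(c1 == c1);  // true  (an instance is always identical to itself)
    }
}
```

For object reference variables, == only compares the reference values. It never looks at the fields.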

Get your hands dirty!

Edit the code above to make sure you’ve understood the concept of identity well.

Make a few more instances, with same and different field values, also see what happens when you compare an instance to itself.


Equality

Equality is when we have two object reference variables and, regardless of whether they point to the same instance or to different instances on the heap, the attributes of the instances are such that we conceptually deem the two instances “equal” to each other. In other words, by our design, the (possibly different) instances are deemed to represent the same thing.

Before looking at the code, let’s look at this visualisation:

It is clear to see that we have three different Colour instances—and so they have a different identity from each other.

But, let me ask you: According to the concept of colours, aren’t they strictly speaking “the same colour”? Even in the case of being French (r3) versus English (r1 and r2). Their meaning still represents “the same” equal colour!

This is what is known as equality (also known as object equality).

We, as the designers of our software, get to define (and dictate) what equality means—in a manner that makes sense!

In the example above, we might decide that the values of the red, green, and blue fields are all that matters in deeming two colours to be meaningfully the same—and that the name field doesn’t actually matter to us.

To do this, we need to define the equals() method for the Colour class:

public class Colour {
    ...
    @Override
    public boolean equals(Object obj) {
        // ...
    }
}

Notice this is an instance method. It takes an instance (of type Object) as parameter, and it should return a boolean that represents our definition of whether obj is deemed to be equal to this or not.

There are a few concepts at play here that make all this magic work. But we will keep it simple for now, as we haven’t covered all the concepts just yet.

First, let’s list out all possible situations in determining if one Colour instance is meaningfully equivalent to another Colour instance:

  1. If the two instances are actually the exact-same instance (i.e. they have the same identity), then this is an immediate true!
  2. If one of the instances is null, but the other is not null, then this is an immediate false. They can’t be equal if one exists and the other doesn’t!
    • In the context of the equals() method, one of the instances (namely this) clearly can’t be null (otherwise we wouldn’t be inside the equals() method for that instance)!
    • So, we only need to check to see if the Object parameter is null (since this will never be null).
  3. If the two instances are instances of different class types, then we can also immediately return false.
  4. Otherwise, we have a situation where we have two different non-null instances of the same class type. We therefore perform whatever checks we deem to make sense. In our example, we will say they are equal if they have:
    • The same value for the red field, and
    • The same value for the green field, and
    • The same value for the blue field.
    • We don’t bother checking the name field, because we have decided that it doesn’t distinguish one colour from another—and therefore shouldn’t be part of our definition of what makes one colour equal another.

The resulting code is as follows—have a look in the Colour.java file:

You will notice that we have also written some logic for the hashCode() method. This is beyond the scope of what you need to understand at this stage, but it is done here as a reminder that we always need to implement it in addition to the equals() method. In a nutshell, hash codes help improve performance in some data structures (second half of the course). The main point to be aware of is that two instances that are equal according to equals() must also return the exact-same hashCode() value. It is not so much a problem (at least in terms of correctness) if instances that are not equal also return the same hashCode().
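
As a rough guide, the four situations listed above might be sketched like this. It is one possible implementation under assumed field names, and not necessarily identical to the Colour.java file used in the course. Objects.hash() is a helper from the standard java.util package.

```java
import java.util.Objects;

// Hypothetical Colour class: equality depends on red, green and blue only.
class Colour {
    private String name;
    private int red;
    private int green;
    private int blue;

    Colour(String name, int red, int green, int blue) {
        this.name = name;
        this.red = red;
        this.green = green;
        this.blue = blue;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;                    // 1. same identity
        if (obj == null) return false;                   // 2. this exists, obj doesn't
        if (getClass() != obj.getClass()) return false;  // 3. different class types
        Colour other = (Colour) obj;                     // 4. now safe to cast and compare
        return red == other.red && green == other.green && blue == other.blue;
    }

    @Override
    public int hashCode() {
        // Uses exactly the fields that equals() compares, so equal colours share a hash code.
        return Objects.hash(red, green, blue);
    }
}

class EqualityDemo {
    public static void main(String[] args) {
        Colour r1 = new Colour("red", 255, 0, 0);
        Colour r2 = new Colour("red", 255, 0, 0);
        Colour r3 = new Colour("rouge", 255, 0, 0);
        Colour g1 = new Colour("green", 0, 255, 0);

        System.out.println(r1 == r2);                        // false (different identity)
        System.out.println(r1.equals(r2));                   // true
        System.out.println(r1.equals(r3));                   // true (name is ignored)
        System.out.println(r1.equals(g1));                   // false
        System.out.println(r1.equals(null));                 // false
        System.out.println(r1.hashCode() == r3.hashCode());  // true
    }
}
```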

Get your hands dirty!

Edit the code above to make sure you’ve understood the concept of equality well:

  • Make a few more instances, with same and different field values of name, red, green and blue fields.
  • Observe the differences of == compared to equals().
  • What happens if you compare an instance to itself? i.e. r1.equals(r1)
  • What happens when you use null for the object references?
  • Add print statements in the equals() method to trace the code when it is executed (you might want to make fewer comparisons from main()).
  • Change the logic of the equals() method to make colours equivalent only if their names are the same.
    • Hint: String isn’t a primitive type! So you should be using equals() for comparing one String to another!


Did you know?

You can actually get the IDE (e.g. VS Code) to help you implement the equals() and hashCode() methods!

  • Right-click inside the class you are working with (in the code text area), then
  • Source Action… > Generate hashCode() and equals()…

For the purposes of this course, you still need to know how to implement the equals() method yourself. Even if you want to use VS Code to generate these, you need to understand the concepts above otherwise it will generate an incorrect implementation!

For this part of the course, we don’t need to understand the details of the hashCode() method yet.



Want to practice more? Let’s get rich with Money!

Here’s another exercise you can play with to practice the concept of equality and building your own class.

Get your hands dirty!

Edit the code above to play with constructors, fields, getters, printing, equality, etc:

  • First, think about the fields you need to represent a Money instance. For starters, let’s do this:
    • String for the currency (e.g. “NZD” for New Zealand Dollars)
    • int for the dollar component of money (e.g., the 10 of $10.50)
    • int for the cent component of money (e.g., the 50 of $10.50)
  • Next, implement the corresponding constructors:
    • One that takes in all three fields
    • One that takes in only the currency and the dollar component, and sets the cent component to 0
    • One that takes in only the currency, and sets the dollar and cent components to 0
  • Did you notice some redundancy in the constructors? We can delegate to another constructor to reduce the amount of code we need to write. For example, the constructor that takes in only the currency and the dollar component can delegate to the constructor that takes in all three fields.
    • Hint: use this(...) as the first statement of the constructor to do this.
  • Next, implement the toString() method to print out the money in the format of NZD$10.50.
    • What about the case where the cent component is single digit? i.e., 1 dollar and 5 cents is NZD$1.05 — not NZD$1.5.
  • Next, implement the getters getCents() and getDollars().
  • Next, implement the addCents(int cents) method to add the given number of cents to the current instance.
    • What if it is more than 100?
    • What if the current instance has a cent component of 99, and we want to add 2 cents to it? What should the new cent and dollar components be?
    • In fact, this issue should strictly speaking be handled during the constructor stage also. Namely, creating a Money instance with a cent component of 150 should be treated as 1 dollar and 50 cents.
  • Rather than dealing with the headache of separate cent and dollar fields, how about storing only the total amount of cents in the Money instance? Refactor the code to only have a single int field for the total amount of cents. Update the getters and addCents() method accordingly.
    • Note: The purpose of this step is to emphasise the point of encapsulation in OOP: the internal private implementation changes to a single “cents” field (without “dollars”), yet at the public level there still appear to be cents and dollars components through getCents() and getDollars().
  • Next, implement the add(Money other) method to combine the two Money instances into one.
    • Only the same currencies can be added together. If they are different, return false to signal that the addition failed.
    • The this instance is updated to have the sum of the two Money instances.
    • The other instance is not updated.
    • The add(Money other) method returns true to signal that the addition was successful.
  • Next, implement the equals() method to compare this instance with the Object obj instance.
    • Ultimately we want to compare that two Money instances are equal if they have the same currency, and the same total amount of cents. The first Money instance is this, and the second Money instance (should be) the obj. However, there’s nothing stopping the user from passing in a String or Integer instance instead of a Money instance for Object obj. So we need to check that obj is actually a Money instance before we can cast it to a Money instance.
    • If this instance and the obj instance are the same instance, then they are equal.
    • If obj is null, then they are not equal (since this is not null).
    • If obj is not a Money instance, then they are not equal (since they are different types).
    • If obj is a Money instance, we can safely compare the two Money instances in terms of their currency and total amount of cents.
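
If you get stuck on the first few steps, here is one possible starting point. It is only a sketch: it assumes non-negative amounts, already uses the single "total cents" field, and deliberately leaves addCents(), add() and equals() for you to write.

```java
// A partial, hypothetical starting point for the Money exercise.
class Money {
    private String currency;
    private int totalCents;   // internal representation: cents only

    Money(String currency, int dollars, int cents) {
        this.currency = currency;
        this.totalCents = dollars * 100 + cents;  // e.g. 0 dollars and 150 cents is stored as 150
    }

    Money(String currency, int dollars) {
        this(currency, dollars, 0);   // delegate to the three-argument constructor
    }

    Money(String currency) {
        this(currency, 0);            // delegate to the two-argument constructor
    }

    int getDollars() { return totalCents / 100; }

    int getCents() { return totalCents % 100; }

    @Override
    public String toString() {
        // %02d pads the cents with a leading zero when needed (5 becomes 05).
        return currency + "$" + getDollars() + "." + String.format("%02d", getCents());
    }
}

class MoneyDemo {
    public static void main(String[] args) {
        System.out.println(new Money("NZD", 10, 50));  // NZD$10.50
        System.out.println(new Money("NZD", 1, 5));    // NZD$1.05
        System.out.println(new Money("NZD", 0, 150));  // NZD$1.50
        System.out.println(new Money("NZD"));          // NZD$0.00
    }
}
```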