Sunday, May 31, 2009

Strings & Java

Perhaps one of the most frequently used classes in Java is the String class. A major reason Strings are so powerful is the ease with which they interact with other objects.

Strings in Java possess several characteristics that set them apart from other classes. For example, it is a well-known fact that Strings are immutable. For those who do not know what that means: 'String objects, once created, can never be changed.' An example should make this clearer.

String s1 = "Hello World";
Here, s1 holds the reference to the String object on the heap containing the value "Hello World". Now, if we type

String s2 = s1.concat(", here I come");
Now, s2 refers to another object on the heap with the value "Hello World, here I come". But this doesn't change the object that s1 refers to. Instead, concat() creates a new object holding the concatenated value, and its reference is stored in s2.

If we had typed,

s1.concat(", here I come");
then a new object with the value "Hello World, here I come" is still created, but it is lost and cannot be referenced because we have not stored its reference anywhere. Here too, the object s1 refers to remains the same. It cannot be changed.
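
To see both cases side by side, here is a minimal runnable sketch. The class name ImmutabilityDemo and the helper concatResult are my own names for illustration, not anything from the JDK.

```java
// Hypothetical demo class; the names are assumptions made for illustration.
class ImmutabilityDemo {

    // Returns a new String; the receiver 'base' is left untouched.
    static String concatResult(String base, String suffix) {
        return base.concat(suffix);
    }

    public static void main(String[] args) {
        String s1 = "Hello World";
        String s2 = concatResult(s1, ", here I come"); // new object, reference kept in s2
        s1.concat(", here I come");                    // new object, result discarded

        System.out.println(s1);       // Hello World
        System.out.println(s2);       // Hello World, here I come
        System.out.println(s1 == s2); // false: two distinct objects
    }
}
```

The discarded call on the third line of main compiles and runs without complaint, which is exactly why forgetting to store the result is such an easy mistake to make.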

It is also worth remembering that, to use memory more efficiently, the Java runtime keeps string literals in a "String constant pool" instead of creating a separate object for every literal.
When the runtime resolves a string literal, it first searches the String constant pool for an identical string.
  • If a match is found, the reference to the already existing identical string is returned for the new request.
  • If a match is not found, a new String is added to the pool and its reference is returned. This sharing is safe only because Strings are immutable.
An example can again smooth things out here:

String s1 = "Agraj";
String s2 = "Agraj";
System.out.println(s1 == s2);              // prints true !!
System.out.println(s1.equals(s2));     // prints true as expected

String s1 = "Agraj";
String s2 = new String("Agraj");
System.out.println(s1 == s2);              // prints false because of 'new'
System.out.println(s1.equals(s2));     // prints true as expected

The first example prints true because both literals resolve to the same pooled object. In the second example, the new operator forces the runtime to create a separate String object for s2, even though an identical string already exists in the pool.
It would not hurt to remember that the '==' operator compares references, while the equals() method compares the string values (character by character) stored inside the objects.
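
The two examples above can be combined into one runnable sketch. The class name StringPoolDemo and the helper sameObject are my own illustrative names. The last line uses String.intern(), which the examples above do not use; it is a standard String method that returns the pooled copy of a string, and I include it only to show that the pooled object is still reachable from a string created with new.

```java
// Hypothetical demo class; the names are assumptions made for illustration.
class StringPoolDemo {

    // True only when both references point at the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal1 = "Agraj";
        String literal2 = "Agraj";
        String fresh = new String("Agraj");

        System.out.println(sameObject(literal1, literal2));       // true: same pooled object
        System.out.println(sameObject(literal1, fresh));          // false: 'new' made a separate object
        System.out.println(literal1.equals(fresh));               // true: same characters
        System.out.println(sameObject(literal1, fresh.intern())); // true: intern() returns the pooled copy
    }
}
```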

On some pondering, one might think that if identical strings share the same memory, then changing the string through one reference would also change what every other reference sees. For example,

String s1 = "Agraj";
String s2 = "Agraj";
s1 = s1.toUpperCase();       // s1 = "AGRAJ" but s2="Agraj"

One might expect s2 to become "AGRAJ" as well, which would not be desirable. But since Strings are immutable, toUpperCase() creates a new string rather than modifying the original (it cannot modify the original string, nothing can). That new string is assigned to s1, while s2 still refers to the original string "Agraj".

StringBuffer and StringBuilder 

A person can argue that a module doing heavy String manipulation would create an excess of new Strings that are never used at the end, leading to wastage of memory. And you see, this person is not entirely wrong. Strings, if used carelessly, can cause a huge performance bottleneck.
So when an application does a lot of String manipulation, it is advisable to use classes such as StringBuffer and StringBuilder. Both provide the same API as each other and hold a sequence of characters just as a String does, the difference being that they are mutable. Thus, you can change a StringBuffer / StringBuilder object once created.

StringBuilder str1 = new StringBuilder("Hello");
str1.append(" World");

would change the existing object (which held "Hello") in place, and would not create a new object as an ordinary String would have done.
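
A minimal sketch of this in-place behaviour follows. The class name BuilderMutationDemo and the helper joinWithBuilder are my own illustrative names. It relies on the documented fact that append() returns a reference to the same builder it was called on.

```java
// Hypothetical demo class; the names are assumptions made for illustration.
class BuilderMutationDemo {

    // Joins all parts using one mutable builder instead of many temporary Strings.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part); // mutates sb in place
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StringBuilder str1 = new StringBuilder("Hello");
        StringBuilder same = str1.append(" World"); // append returns the same builder

        System.out.println(str1);         // Hello World
        System.out.println(same == str1); // true: no new builder was created
        System.out.println(joinWithBuilder(new String[] {"a", "b", "c"})); // abc
    }
}
```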

Difference between StringBuilder & StringBuffer
Both provide the same API and are thus similar, except that StringBuffer is thread-safe while StringBuilder is not.
Thus it is advisable to use StringBuilder wherever thread-safety is not an issue (which is most commonly the case), because StringBuilder is generally faster than its thread-safe twin. This is because it skips the synchronization that StringBuffer's methods perform.
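
To make the thread-safety point concrete, here is a rough sketch in which two threads append to one shared StringBuffer. The class name BufferThreadSafetyDemo and the method appendFromTwoThreads are my own illustrative names, and the lambda assumes Java 8 or later. Because StringBuffer's append is synchronized, no append is lost and the final length is exactly twice the per-thread count. Swapping in StringBuilder would remove that guarantee, so I do not show an expected result for it.

```java
// Hypothetical demo class; the names are assumptions made for illustration.
class BufferThreadSafetyDemo {

    // Two threads each append 'perThread' characters to one shared StringBuffer.
    static int appendFromTwoThreads(int perThread) {
        StringBuffer buffer = new StringBuffer();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                buffer.append('x'); // synchronized, so concurrent appends are not lost
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return buffer.length();
    }

    public static void main(String[] args) {
        System.out.println(appendFromTwoThreads(10_000)); // 20000
    }
}
```

In single-threaded code none of this locking buys anything, which is why StringBuilder is the usual default.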

There is much more to String usage in Java than can be covered in a blog entry, but I guess I have covered some of the important points that are often overlooked by novices like me.