Yohan Liyanage

     

    24 Oct

    Know the JVM Series -3- When Weaker is Better: Understanding Soft, Weak and Phantom References

    By Yohan Liyanage | Coding, Java, Know the JVM Series | 12 Comments

    How many times have we created various object instances, and assigned those to reference variables? We all know very well that Java has automatic garbage collection; so we just play around with the reference variables, and once those variables are assigned null or fall out of scope, the JVM takes care of it. No need to worry about ‘free’ as in C / C++. It’s a headache-less approach, which minimizes the risk of introducing memory leaks to our programs, and it works out great day in day out in billions of Java applications running out there 24×7. Kudos to John McCarthy for inventing GC for Lisp, and to all the folks who implemented the concept in Java.

    But there are times when we need a little more control over the process of garbage collection. I’m not talking about the dark art of tuning the garbage collector (which I might cover in a later article). This is about programmatic situations where we expect some object instances to become eligible for garbage collection, to release unwanted memory that might accumulate over time. Well, the classic solution of explicitly assigning null could help us out, given that the particular object is referred to only through that particular ref variable. What if assigning null doesn’t work out for the problem at hand?

    Consider a scenario where you are required to implement an object cache. You have some objects which are pretty expensive to build. We would like to keep the objects in the cache as long as we can (just in case a component of the application needs them), but we want unused objects to be released from the cache when we need memory for other important operations of the application. This is quite difficult to implement using standard (strong) references. The moment we add an object to a collection, we hold a (strong) reference to the instance, making it ineligible for garbage collection. If the cache continues to grow, we could run out of memory, making the cache a memory leak point for the application. The obvious solution is to limit the size of the cache and to drop older objects from it, making those instances eligible for GC. That is the mechanism used by many cache implementations out there, and it works fine. The drawback is that the cache is limited by size, so even when more free memory is available, we cannot make use of it. Also, since the cache will always hold some references, it will keep a significant block of memory occupied for the lifetime of the application (e.g. if the cache is fixed at 1024 entries, the memory consumed by those 1024 instances will not be released). Yes, there are ways to address each of these problems (e.g. dynamically growing / shrinking the cache), but that requires a fair deal of coding. This was a practical issue that I came across in one of my projects in the past.
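
    For comparison, the size-limited approach described above can be sketched in a few lines on top of LinkedHashMap. This is only an illustration: BoundedCache and maxEntries are names made up here, the eviction policy (least recently used) is an assumption, and it is not the code from the project mentioned.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical size-limited cache: once more than maxEntries are stored,
// the least recently used entry is dropped and becomes eligible for GC.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true orders entries from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Invoked by put(); returning true removes the eldest entry
        return size() > maxEntries;
    }
}
```

    With new BoundedCache<String, Object>(1024), the map never holds more than 1024 entries, but it also never releases those 1024 instances on its own, which is exactly the drawback discussed above.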

    If only there was a better (and easier) way to get this done…

    The solution has been part of the JDK for a long time, from the days of Java 2. Meet the java.lang.ref package, where Soft, Weak and Phantom references can be used to resolve such problems. The references that we create using the assignment operator are known as strong references, because the instance is strongly referred to by the application, making it ineligible for garbage collection.

    Object obj = new Object(); // Strong Ref

    Soft, Weak and Phantom references are the weaker counterparts of referencing, where the garbage collection algorithm is allowed to mark an instance to be garbage collected, even though such a reference exists. What this means is that, even though you hold a weak reference to a particular instance, the JVM can sweep it out of the memory if it needs to. This works out great for the problem we discussed before, since instances in our cache will be automatically released if the JVM thinks it needs more memory for other parts.

    A weak reference can be created to an instance as follows (all the reference types are available in java.lang.ref package).

    WeakReference<Object> weakRef = new WeakReference<Object> (obj);

    When we create a weak reference like this, the instance referred to by the ref variable obj becomes eligible for garbage collection as soon as no strong reference to it exists. But, if some part of the application needs to use this particular object instance, we can get a strong reference back to it as follows (given that it was not gc’d in the meantime).

    Object strongRef = weakRef.get();

    If the referenced instance has already been garbage collected, calling the get method will return null.
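
    Because get() can return null at any moment, it is safest to call it once, keep the result in a local (strong) variable, and check that variable. A minimal sketch of the idiom follows, assuming Java 8 or later for Supplier; WeakHolder and factory are hypothetical names used only for this illustration, and the class is not thread-safe.

```java
import java.lang.ref.WeakReference;
import java.util.function.Supplier;

// Hypothetical holder that rebuilds its value whenever the GC has cleared it
class WeakHolder<T> {

    private WeakReference<T> ref = new WeakReference<>(null);
    private final Supplier<T> factory;

    WeakHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        // Call get() exactly once; the local variable is a strong reference,
        // so the instance cannot disappear between the check and the use
        T value = ref.get();
        if (value == null) {
            value = factory.get();
            ref = new WeakReference<>(value);
        }
        return value;
    }
}
```

    Writing if (ref.get() != null) followed by a second ref.get() would be a bug waiting to happen, since the instance may be collected between the two calls.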

    Below is a fully working example of using weak references to demonstrate what we have covered so far.

    import java.lang.ref.WeakReference;

    public class TestRef {

        public static void main(String[] args) {

            // Initial strong reference
            Object obj = new Object();

            System.out.println("Instance : " + obj);

            // Create a weak reference to the same instance
            WeakReference<Object> weakRef = new WeakReference<Object>(obj);

            // Drop the strong reference; the instance is now only weakly reachable
            obj = null;

            // Get a strong reference again (null if already collected).
            // While strongRef is set, the instance is not eligible for GC.
            Object strongRef = weakRef.get();

            System.out.println("Instance : " + strongRef);

            // Make the instance eligible for GC again
            strongRef = null;

            // Keep your fingers crossed: System.gc() is only a request
            System.gc();

            // Should be null if the GC collected the instance
            System.out.println("Instance : " + weakRef.get());
        }
    }

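    And the output of the program would be similar to the following (the hash code will vary, and the last line is null only if the GC actually collected the instance):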

    Instance : java.lang.Object@a90653
    Instance : java.lang.Object@a90653
    Instance : null

    Now that we have covered why we need weak references, and a practical example of using weak references, let’s cover some theory behind the five degrees of reachability. The following is based on the JDK API Docs.

    1. Strongly Reachable – If we have a strong reference to a particular instance, then it is said to be strongly reachable. Hence, it is not eligible for garbage collection.
    2. Softly Reachable – If we do not have a strong reference to an instance, but we can access the object through a SoftReference (more on that later) to it, then the instance is said to be softly reachable.
    3. Weakly Reachable – If we have neither a strong reference nor a soft reference, but the object can be accessed through a WeakReference, then the instance is said to be weakly reachable.
    4. Phantomly Reachable – If we don’t have any strong, soft or weak references to a particular instance, the instance has already been finalized, and we do have a PhantomReference (explained in a while) to it, then the instance is said to be phantomly reachable.
    5. Unreachable – If we do not have any of the above references to an instance, then it is unreachable from the program.

    At this point, you must be wondering about the difference, and the need, to have three different levels of weaker referencing mechanisms. In the order of strength, the references can be arranged as,

    Strong References > Soft References > Weak References > Phantom References

    Each of these referencing mechanisms serves a specific purpose. We will look at each of these references, and some related constructs in the API next.

    1. Soft References

    According to the Java API Specification, JVM implementations are encouraged not to clear a soft reference as long as the JVM has enough memory. That is, if free heap space is available, chances are that a softly referenced instance will not be freed during a garbage collection cycle (so it survives the GC). However, all soft references to softly reachable instances are guaranteed to be cleared before the JVM throws an OutOfMemoryError. This makes Soft References ideal for implementing memory-sensitive caches (as in our example problem).

    Consider the following example.

    import java.lang.ref.SoftReference;

    public class TestSoftRef {

        public static void main(String[] args) {

            // Initial strong reference
            Object obj = new Object();
            System.out.println("Instance : " + obj);

            // Create a soft reference to the same instance
            SoftReference<Object> softReference = new SoftReference<Object>(obj);

            // Drop the strong reference; the instance is now only softly reachable
            obj = null;

            // Request a GC cycle (only a request)
            System.gc();

            // Null only if the GC cleared the soft reference
            System.out.println("Instance : " + softReference.get());
        }
    }

    And the output will be…

    Instance : java.lang.Object@de6ced
    Instance : java.lang.Object@de6ced

    As we expected, since the JVM had enough memory, it did not reclaim the memory consumed by our softly referenced instance.
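
    Coming back to the cache problem from the beginning of the article, a memory-sensitive cache built on soft references could look roughly like the sketch below. It assumes Java 8 or later; SoftCache and loader are hypothetical names chosen for this illustration, and a production cache would need more care (for example, two threads may build the same value at the same time).

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache whose values are held only through soft references
class SoftCache<K, V> {

    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            // Never cached, or cleared by the GC under memory pressure: rebuild
            value = loader.apply(key);
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }

    int size() {
        return map.size();
    }
}
```

    Note that a cleared entry still leaves its key and an empty SoftReference in the map until it is overwritten. The Reference Queues discussed in section 4 are the usual way to purge such stale entries.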

    2. Weak References

    Unlike Soft References, instances that are only weakly referenced can be reclaimed by the JVM during a GC cycle even when plenty of free memory is available. Our first example on weaker reference models was based on Weak References. As long as the instance has not been collected, we can retrieve a strong reference out of a weak reference by calling the ref.get() method.
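
    One detail worth spelling out: a weak reference is cleared only when the instance has no strong (or soft) references left, not merely because a GC cycle ran. The small sketch below illustrates this; WeakVsStrong and its method name are made up for the example.

```java
import java.lang.ref.WeakReference;

class WeakVsStrong {

    // Returns true: a weak reference is not cleared while a strong one exists
    static boolean survivesWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);

        // Only a request, but even a full GC cycle must not clear 'weak' here
        System.gc();

        // 'strong' is still in use on this line, so the instance stays reachable
        return weak.get() == strong;
    }
}
```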

    3. Phantom References

    Phantom references are the weakest form of referencing. Instances that are referred to via a phantom reference cannot be accessed through the reference at all: unlike Soft / Weak references, the get() method of a PhantomReference always returns null.

    Instead, we need to rely on Reference Queues to make use of Phantom References. We will take a look at reference queues in a while. One use case of Phantom references is to keep track of instances that are active within an application, and to find out when those instances get garbage collected. If we used strong references for this, the instances would never become eligible for GC because of the strong references we maintain. Instead, we can rely on a phantom reference with the support of a reference queue to handle the situation. An example of Phantom References is provided under Reference Queues below.
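
    The behaviour of get() is easy to check, even while the instance is still strongly reachable. PhantomGet and its method are hypothetical names used only for this check.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

class PhantomGet {

    // Returns true: get() on a phantom reference never hands out the instance
    static boolean phantomGetIsAlwaysNull() {
        Object obj = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(obj, queue);

        // obj is still strongly reachable here, yet get() returns null
        return phantom.get() == null && obj != null;
    }
}
```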

    4. Reference Queues

    ReferenceQueue is the mechanism provided by the JVM for being notified when a referenced instance is no longer reachable and is being garbage collected: the garbage collector appends the reference object itself to the queue. A Reference Queue can be used with all of the reference types by passing it to the constructor. For a PhantomReference this is not optional, since its only constructor takes a ReferenceQueue argument.

    The use of reference queue is as follows.

    import java.lang.ref.PhantomReference;
    import java.lang.ref.Reference;
    import java.lang.ref.ReferenceQueue;

    public class TestPhantomRefQueue {

        public static void main(String[] args) throws InterruptedException {

            Object obj = new Object();
            final ReferenceQueue<Object> queue = new ReferenceQueue<Object>();

            // Keep pRef itself reachable; a reference object that is collected
            // before its referent is never enqueued
            PhantomReference<Object> pRef = new PhantomReference<Object>(obj, queue);

            // Drop the only strong reference to the instance
            obj = null;

            new Thread(new Runnable() {
                public void run() {
                    try {
                        System.out.println("Awaiting for GC");

                        // Blocks until the GC enqueues the phantom reference
                        Reference<? extends Object> enqueued = queue.remove();

                        System.out.println("Referenced GC'd");

                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }).start();

            // Wait for the second thread to start
            Thread.sleep(2000);

            System.out.println("Invoking GC");
            System.gc();
        }
    }

    The output would be

    Awaiting for GC
    Invoking GC
    Referenced GC'd
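
    Reference queues are not limited to phantom references. A common pattern is to subclass a reference type so that each reference object remembers which entry it belongs to, and then to drain the queue with the non-blocking poll() method to clean up. Below is a rough sketch of that idea, assuming Java 8 or later; QueueCleanup, KeyedRef and purge are hypothetical names, and the class is not thread-safe.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

class QueueCleanup {

    // Hypothetical reference subclass that remembers its own map key
    static final class KeyedRef extends WeakReference<Object> {
        final String key;

        KeyedRef(String key, Object referent, ReferenceQueue<Object> queue) {
            super(referent, queue);
            this.key = key;
        }
    }

    final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    final Map<String, KeyedRef> entries = new HashMap<>();

    void put(String key, Object value) {
        entries.put(key, new KeyedRef(key, value, queue));
    }

    // Non-blocking: removes the entries whose references have been enqueued
    int purge() {
        int removed = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            KeyedRef keyed = (KeyedRef) ref;
            // remove(key, value) leaves the entry alone if the key was re-put
            if (entries.remove(keyed.key, keyed)) {
                removed++;
            }
        }
        return removed;
    }
}
```

    Calling purge() at the start of every put or get keeps the map free of stale entries without a dedicated thread. For unit tests, Reference.enqueue() can be called on a reference to simulate what the GC would do.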

    5. WeakHashMap

    java.util.WeakHashMap is a special version of the HashMap, which holds its keys through weak references. Therefore, when a particular key is not in use anymore and gets garbage collected, the corresponding entry will magically disappear from the map. The magic relies on the ReferenceQueue mechanism explained before to identify when a particular weakly referenced key has been garbage collected. This is useful when you want to build a cache based on weak references. Keep in mind that the values are still held strongly, so a value must not refer back to its own key. For more sophisticated requirements, it is better to write your own cache implementation.
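
    A small sketch of this behaviour is given below. WeakMapDemo is a hypothetical name; the first method is deterministic, while what main prints after the key is dropped depends on whether the GC honours the request, so no output is listed here.

```java
import java.util.Map;
import java.util.WeakHashMap;

class WeakMapDemo {

    // Returns 1: the entry stays as long as the key is strongly reachable
    static int sizeWhileKeyHeld() {
        Map<Object, String> map = new WeakHashMap<>();
        Object key = new Object();
        map.put(key, "metadata");
        System.gc();
        int size = map.size();
        // key is used on this line, so it was reachable during size()
        return map.containsKey(key) ? size : -1;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Object, String> map = new WeakHashMap<>();
        Object key = new Object();
        map.put(key, "metadata");
        System.out.println("Key held    : size = " + map.size());

        // Drop the only strong reference to the key
        key = null;

        // Ask for a GC a few times; the entry usually vanishes, but this
        // is not guaranteed by the specification
        for (int i = 0; i < 10 && !map.isEmpty(); i++) {
            System.gc();
            Thread.sleep(50);
        }
        System.out.println("Key dropped : size = " + map.size());
    }
}
```

    One pitfall: keys that stay strongly reachable from elsewhere, such as String literals, are never collected, so their entries never disappear.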

    In this rather long article, we have covered the basics of the Referencing API provided by the Java Specification. This is only an introduction, and you might find it helpful to glance through the Java Docs for the java.lang.ref package for the details.

    Tagged as: Java, Know the JVM

    12 Comments

    1. Great write up!
      But I think your example is a little bit incorrect, I don’t think a call to System.gc(); will trigger the GC immediately. It depends on the JVM implementation and depends on the parameters passed to the JVM at startup. Moreover, the javadoc of System.gc() says:
      ——-
      Runs the garbage collector.
      Calling the gc method suggests that the Java Virtual Machine expend effort toward recycling unused objects in order to make the memory they currently occupy available for quick reuse. When control returns from the method call, the Java Virtual Machine has made a best effort to reclaim space from all discarded objects.
      ——
      Another point is the order (based on “strong reachability”) of those kinds of references
      You said: Strong References > Soft References > Weak References > Phantom References
      It’s only correct on Sun JVM. On JRockit, soft, weak and phantom are treated the same (IIRC).

      By: Truong Xuan Tinh
      October 25, 2010 at 12:45 pm
      • Hi Truong,

        “Great write up!”

        Thanks !

        “I don’t think a call to System.gc(); will trigger the GC immediately. It depends on the JVM implementation and depends on the parameters passed to the JVM at startup”

        Yes, like you have mentioned, System.gc() does not guarantee that the JVM will do a GC cycle. That’s why I mentioned in the comment ‘keep your fingers crossed’ :) . But generally, the JVM tries to honour the request, so for a small application like this, it is highly likely to happen (depending on the VM implementation again; a VM can even ignore such a request, according to the spec). So it works in most cases (if the VM is not busy with something else). That’s why I used it as an example.

        “Another point is the order (based on “strong reachability”) of those kinds of references
        You said: Strong References > Soft References > Weak References > Phantom References
        It’s only correct on Sun JVM. On JRockit, soft, weak and phantom are treated the same.”

        You are correct again. The specification does not mandate that a VM should wait until all the memory has been consumed before clearing softly referenced instances. So VM implementers do not have to follow that, and in such cases, they can treat soft references the same as weak references. The Java Doc for Soft References reflects that. I was aware of the relaxed nature of the specification, but I wasn’t aware of specific implementations that ignored this (like you have mentioned, JRockit). Thanks for sharing this information.

        By: yohan (Author)
        October 25, 2010 at 1:22 pm
    2. Usually I don’t comment on blogs, but I’ve just finished the book JRockit – The Definitive Guide, and it clearly covers everything you wrote here. I just wanted to emphasize some specific JVM implementations. Things like these are corner cases for JVM implementors, and they vary by vendor.
      Anyway, many thanks for a great article.

      By: Truong Xuan Tinh
      October 25, 2010 at 6:35 pm
    3. Thanks for this and your previous posts in this series. There are a lot of interesting things in the JVM I’m not aware of yet.
      Keep up the good work!

      Greetings Marcus

      By: Marcus Mattern
      October 26, 2010 at 12:12 pm
      • Thanks Marcus.

        By: Yohan Liyanage (Author)
        October 26, 2010 at 1:52 pm
    4. Hm, I haven’t programmed any thing related to these special references yet in my projects, but your article is exciting me to do that, :)

      Can you suggest some more areas where it or WeakHashMap can be taken advantage of?

      Too much thanks for this article.

      –Deepak

      By: Deepak Srivastava
      November 7, 2010 at 6:47 pm
    5. Hi Yohan Liyanage

      I have read all of your Know the JVM Series and found it very useful. Many thanks for such a wonderful series. Your way of explaining is crisp and clear. Looking forward to your upcoming articles.

      -Anil

      By: Anil Chandra
      November 18, 2010 at 2:31 pm
      • Thanks Anil.

        By: Yohan Liyanage (Author)
        November 19, 2010 at 12:22 pm
    6. Hi Truong,

      Thanks for the further updates and sharing your knowledge

      By: Anil Chandra
      November 18, 2010 at 2:32 pm
