Java garbage collection interview questions and answers
From stop-the-world events to the impact of JVM pause times, these are the Java garbage collection questions and answers developers need to know before an interview.
If you're seeking a DevOps or developer position in which you will handle the runtime management of Java applications, you'll need to answer important Java garbage collection interview questions to land the job.
Here are 10 of the most common and important Java garbage collection interview questions that any technical DevOps engineer or developer applicant must be ready to answer.
1. Why is garbage collection necessary in Java?
In many programming languages, such as C and C++, when an object is no longer needed by a program, the developer must take programmatic steps to reclaim any space the object was allocated in memory.
This approach can be incredibly efficient when implemented properly. However, history has shown that when this process is done poorly, memory leaks can occur and crash an application.
When the Java language was created, Sun engineers decided that developers should not be responsible for managing the memory used by the objects they create. Instead, a garbage collection routine would be part of the JVM; this routine identifies objects that are no longer in use and deletes them from memory.
2. When does a Java object become available for garbage collection?
An object becomes available for garbage collection when no live part of the application can reach it any longer. That typically happens when the last reference to it is set to null, when the variable holding it goes out of scope, or when the only objects that still refer to it are themselves unreachable. In simple terms, a Java object becomes available for garbage collection when it is no longer in use by the application.
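The rule above can be observed with a small sketch. It assumes a `WeakReference` as a probe (a weak reference does not keep its target alive), and the class and method names are made up for illustration. The first check is reliable because the object is still strongly referenced; the second only reports what the JVM chose to do with the request.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {

    /** While 'holder' still refers to the object, it cannot be collected. */
    static boolean isAliveWhileReferenced() {
        Object holder = new Object();
        WeakReference<Object> probe = new WeakReference<>(holder);
        System.gc(); // only a request; the object is strongly reachable anyway
        // 'holder' is used after the request, so it stayed reachable throughout.
        return probe.get() == holder;
    }

    public static void main(String[] args) {
        System.out.println("alive while referenced: " + isAliveWhileReferenced());

        Object data = new Object();
        WeakReference<Object> probe = new WeakReference<>(data);
        data = null;  // last strong reference dropped: the object is now eligible
        System.gc();  // request a collection; timing is up to the JVM
        // Usually true on common JVMs, but not guaranteed.
        System.out.println("cleared after request: " + (probe.get() == null));
    }
}
```

Eligibility and collection are separate events: dropping the last reference makes the object eligible immediately, but the memory is reclaimed only when a collector actually runs.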
3. What does mark-and-sweep mean?
You can break garbage collection in Java into two major stages. The first is the mark stage, where the JVM starts from a set of root references, such as local variables on thread stacks and static fields, and traces every object it can reach. Reachable objects are marked as live; anything left unmarked is no longer needed.
The sweep is the second stage, where the JVM reclaims the memory held by the unmarked objects.
Garbage collection algorithms that employ this sequence of events are known as mark-and-sweep garbage collectors.
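To make the two stages concrete, here is a toy simulation. It is not how any real JVM collector is implemented; `Node`, `mark` and `sweep` are illustrative names, and the "heap" is just a list. It does show why a cycle of objects that nothing else points at is still garbage.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ToyMarkSweep {

    /** Stand-in for a heap object that can point at other objects. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Mark: walk the graph from the roots and record everything reachable. */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {          // first visit only
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    /** Sweep: remove every unmarked entry from the heap; return the names reclaimed. */
    static List<String> sweep(List<Node> heap, Set<Node> marked) {
        List<String> reclaimed = new ArrayList<>();
        heap.removeIf(n -> {
            if (marked.contains(n)) return false;
            reclaimed.add(n.name);
            return true;
        });
        return reclaimed;
    }

    static List<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);   // a -> b
        c.refs.add(d);   // c -> d and d -> c: a cycle that no root reaches
        d.refs.add(c);
        List<Node> heap = new ArrayList<>(List.of(a, b, c, d));
        List<Node> roots = List.of(a);
        return sweep(heap, mark(roots));
    }

    public static void main(String[] args) {
        System.out.println("reclaimed: " + demo()); // prints "reclaimed: [c, d]"
    }
}
```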
4. What is the drawback to garbage collection?
The primary drawback to garbage collection is that it can freeze all application threads while memory is reclaimed. How long that stop-the-world pause lasts depends on the collector, the size of the heap and the number of live objects.
A full garbage collection cycle can run for several seconds -- or potentially even several minutes on a very large heap. Furthermore, garbage collection routines can't be scheduled. Imagine you're running a high-volume trading program. Now imagine a garbage collection routine happening two minutes before the stock market closes. A stop-the-world event on the JVM at that moment in time would lead to a large number of unhappy application users.
Poorly timed garbage collection can make an enterprise system look unpredictable and unreliable. Understandably, a great deal of work has been done in the Java garbage collection arena in order to minimize the impact a Java garbage collection cycle has on active systems.
5. What is generational garbage collection?
The JVM splits allocated memory into four separate spaces:
- eden
- survivor
- tenured
- metaspace
Class metadata, such as the definitions of loaded classes, is stored in the metaspace, which sits in native memory outside the Java heap. This space goes relatively unchanged over time. When people talk about garbage collection, the focus is typically on the eden, survivor and tenured spaces.
When an object is first created, it is placed in the eden space. If garbage collection occurs and the object is still referenced, it gets moved to the survivor space. If enough garbage collections happen and an object in the survivor space never gets collected, it is then moved to the tenured space.
The young generation, made up of eden and the survivor space, is collected far more often than the tenured space. This helps to improve performance, as the weak generational hypothesis tells us that most objects die young, while the few that survive long enough to be tenured are likely to remain active. Repeatedly inspecting those long-lived objects for garbage collection eligibility would be a waste of time.
Furthermore, objects in the eden space are more likely to be short-lived and eligible for removal, so a scan of the eden space is more likely to free up a large block of memory.
Dividing JVM memory into eden, survivor, tenured and metaspace areas lets the garbage collector concentrate its effort where most garbage is found, which greatly improves JVM performance.
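One way to see these spaces on a running JVM is to ask the standard `java.lang.management` API for its memory pools. The sketch below is illustrative; the class and method names are invented here, and the pool names it prints are chosen by the JVM and vary with the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {

    /** One line per memory pool: its type (heap or non-heap) and its name. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            lines.add(pool.getType() + ": " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

On a HotSpot JVM the list usually includes eden, survivor and old-generation pools reported as heap memory, plus Metaspace reported as non-heap memory.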
6. What's the difference between a minor, major and full garbage collection?
There is no official specification that defines how minor, major and full garbage collection cycles differ. However, it is commonly understood that:
- A minor garbage collection cleans the young generation, meaning the eden and survivor spaces.
- A major garbage collection cleans the tenured space.
- A full garbage collection cleans the entire heap, young and tenured spaces together.
Since an event that triggers a collection of the tenured space will normally trigger a sweep of the eden and survivor spaces too, the terms major and full are often used interchangeably. A full garbage collection cycle is often said to include the metaspace as well, even though the metaspace sits outside the Java heap.
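Because the terminology is informal, it can help to look at what the running JVM itself reports. This sketch uses the standard `GarbageCollectorMXBean` interface to list each collector, how often it has run and which memory pools it manages; the `CollectorReport` class is a made-up name for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Arrays;

class CollectorReport {

    /** Builds one line per collector registered with the running JVM. */
    static String report() {
        StringBuilder out = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.append(gc.getName())
               .append(": collections=").append(gc.getCollectionCount())
               .append(", totalMillis=").append(gc.getCollectionTime())
               .append(", pools=").append(Arrays.toString(gc.getMemoryPoolNames()))
               .append(System.lineSeparator());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

On HotSpot the output typically shows one bean for young-generation collections and another for old-generation or full collections, though the names and groupings vary by collector and JDK version.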
7. How does a Java memory leak affect garbage collection?
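A memory leak in Java occurs when objects that are no longer needed remain referenced, so the garbage collector cannot reclaim them. Memory consumption climbs, and the garbage collector is forced to run more often to clear space for new objects. Garbage collection routines will run more frequently, and free up less memory each time they run, until eventually there is no heap space left and the JVM throws an OutOfMemoryError.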
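A common shape for such a leak is a long-lived collection that only ever grows. The sketch below is a hypothetical example, not taken from any real system: every buffer added to the static list stays reachable, so no collector can free it, however often it runs.

```java
import java.util.ArrayList;
import java.util.List;

class LeakyRegistry {

    // A static collection lives as long as the class is loaded,
    // so everything stored in it remains reachable.
    private static final List<byte[]> CACHE = new ArrayList<>();

    /** Simulates handling a request that forgets to release its buffer. */
    static void handleRequest() {
        byte[] scratch = new byte[1024];
        CACHE.add(scratch); // the bug: added on every call, never removed
    }

    /** Number of buffers the garbage collector is unable to reclaim. */
    static int retained() {
        return CACHE.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            handleRequest();
        }
        System.out.println("buffers still reachable: " + retained());
    }
}
```

The usual fix is to remove entries once they are no longer needed, or to bound the collection's size so old entries are evicted.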
8. When would you choose the parallel garbage collector (GC) over Concurrent Mark Sweep (CMS) or the G1 garbage collector?
The G1 garbage collector works best when a system can dedicate a large amount of memory to the heap.
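CMS uses extra threads and processing power to perform most of its garbage collection work concurrently with the application, which keeps pauses short. It also works best with heaps smaller than 32 GB in size. Note that CMS was deprecated in JDK 9 and removed in JDK 14, so it is an option only on older runtimes.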
If a system does not have an extensive amount of memory dedicated to the heap or surplus processing power to allocate to CMS, a simple parallel GC is the correct choice.
Also, a parallel GC will often collect more garbage over a given period when compared to other algorithms. However, the tradeoff is longer stop-the-world pauses. If pause times are not a concern, parallel garbage collection can be the best choice.
9. Can you trigger garbage collection from code?
A call to the System.gc() method asks the JVM to run a garbage collection, but it is only a request. The non-deterministic nature of the garbage collection algorithms means there is no guarantee on when the JVM will respond to it.
Common wisdom is to avoid calling System.gc() in code and instead tune the JVM's garbage collection settings to achieve optimal memory management performance.
However, it's worth mentioning that while the specification says the JVM may ignore a call to System.gc(), the reality is that no current implementation does unless specifically configured to do so. "And even then, there is a tooling path that ignores that config," says Java performance expert Kirk Pepperdine.
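The following sketch shows what such a request looks like in practice. It measures heap usage through the standard `Runtime` API before and after the call; the `GcRequest` class and its method are illustrative names, and the numbers printed will differ from run to run because the JVM decides how to handle the request.

```java
class GcRequest {

    /** Approximate bytes of heap currently in use. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        // Create some short-lived garbage to give the collector work to do.
        for (int i = 0; i < 10_000; i++) {
            byte[] junk = new byte[1024];
        }
        long before = usedBytes();
        System.gc(); // a request, not a command
        long after = usedBytes();
        System.out.println("used before request: " + before + " bytes");
        System.out.println("used after request:  " + after + " bytes");
    }
}
```

This is useful for experiments and diagnostics only; production code should generally leave the timing of collections to the JVM.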
10. What strategy can you use to minimize the impact of stop-the-world garbage collection routines in enterprise systems?
One strategy is to cluster your server and assign more memory to the Java heap than any individual cluster member would ever consume in a day.
Then, during non-peak hours, take one cluster member offline at a time, allowing the other members to handle the workload. At that point, either restart the JVM or force a Java garbage collection with the Java Diagnostic Command (JCMD), passing it the process ID of the target JVM:
C:\>jdk11\bin\jcmd <pid> GC.run
When a tool like Java Mission Control or JConsole verifies that garbage collection successfully occurred, have the server rejoin the cluster and then perform the same steps on other cluster members.
Memory management is an important part of managing the runtime of Java-based applications and microservices. Those interested in landing a job that involves Java performance tuning need to be able to answer these Java garbage collection interview questions.