8. Debugging of Memory Allocation

Home: Bugloo 0.3.0 reference manual

In languages with explicit memory deallocation such as C, when the programmer loses the last pointer to a memory area, that memory can no longer be deallocated, because it is no longer referenced anywhere in the program. This is called a memory leak. A garbage collector eliminates such leaks by automatically deallocating memory when it becomes inaccessible.

Garbage-collected environments are vulnerable to another kind of memory leak, which occurs when memory still appears accessible to the collector although the program no longer needs it. Bugloo provides a heap inspector and an analysis of object references to detect these memory retentions and to understand their causes.
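The retention described above is not specific to the JVM. The following Python sketch is only an illustration and is not part of Bugloo; the names `Node`, `CACHE` and `parse` are hypothetical. It shows an object that the program no longer needs but that the collector must keep, because a global table still references it.

```python
import gc
import weakref

class Node:
    """Hypothetical stand-in for a data structure built by the program."""
    def __init__(self, value):
        self.value = value

# Hypothetical global cache: it acts as a GC root for everything stored in it.
CACHE = {}

def parse(key):
    node = Node(key)
    CACHE[key] = node          # the cache now holds a reference to the node
    return node

node = parse("a")
probe = weakref.ref(node)      # observes the node without keeping it alive
del node                       # the program no longer needs the node...
gc.collect()
retained = probe() is not None # ...but it is still reachable through CACHE

CACHE.clear()                  # dropping the last reference frees the node
gc.collect()
released = probe() is None
```

Here `retained` is true before the cache is cleared and `released` is true afterwards, which is the pattern the heap inspector helps to spot.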

8.1 The Heap Inspector

When the debuggee is suspended, the debugger can give statistics about memory consumption. The user can query the number of live objects (i.e. those still accessible by the program) for each type of class loaded in the virtual machine.

(info heap . <filter>)    Bugloo command

Return the live objects in the JVM heap, sorted by type name

(heap get <typename> <nth>)    Bugloo command

Select the <nth> object of type <typename> from the heap dump

argument              description
<filter>::string      display only classes that match this regular expression
<typename>::string    the type of the object to select
<nth>::string         the <nth> object of type <typename>, starting at index 0

The result of the query can be filtered by object type by means of regular expressions. For example, the user can restrict the dump to Bigloo classes only by typing the following (see note 1):
(bugloo)  (info heap "::") 
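
As an illustration of this kind of filtering, here is a minimal Python sketch. It is not Bugloo's implementation: the function `filter_heap` is hypothetical, the sample counts are illustrative, and matching the expression at the start of the type name is an assumption, since the manual does not say how the expression is anchored.

```python
import re

def filter_heap(counts, pattern):
    """Return report lines for the type names matching `pattern`, sorted by name."""
    regex = re.compile(pattern)
    lines = []
    for name in sorted(counts):
        # Assumption: the expression is matched from the start of the type name.
        if regex.match(name):
            n = counts[name]
            lines.append(f"{name} => {n} instance{'' if n == 1 else 's'}")
    return lines

# Illustrative instance counts, keyed by type name.
counts = {"::pair": 91050, "::cell": 1, "sun.misc.Signal": 3, "java.lang.String": 12}

bigloo_only = filter_heap(counts, "::")
sun_only = filter_heap(counts, "sun.*")
```

With these sample counts, `bigloo_only` holds the two `::` types and `sun_only` holds only `sun.misc.Signal`.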

Once a dump has been returned to the debugger, the user can select any object and manipulate it with the common display features described in section Arguments and Local variables. Instead of giving the name of the variable to display, the user gives the special value %obj% as a parameter. Suppose the user queries the live objects whose type names start with "sun".
(bugloo)  (info heap "sun.*") 
sun.misc.Launcher$AppClassLoader => 1 instance
sun.misc.Launcher$ExtClassLoader => 1 instance
sun.misc.Launcher$Factory => 1 instance
sun.misc.Launcher => 1 instance
sun.misc.NativeSignalHandler => 2 instances
sun.misc.Signal => 3 instances
sun.misc.SoftCache => 2 instances
sun.misc.URLClassPath$FileLoader => 2 instances
sun.misc.URLClassPath$JarLoader => 8 instances
sun.misc.URLClassPath => 2 instances
sun.misc.Unsafe => 1 instance
sun.net.www.protocol.file.Handler => 1 instance
sun.net.www.protocol.jar.Handler => 2 instances
sun.nio.cs.ISO_8859_15$Decoder => 1 instance
sun.nio.cs.ISO_8859_15$Encoder => 3 instances
sun.nio.cs.ISO_8859_15 => 1 instance
sun.nio.cs.ISO_8859_1 => 1 instance
sun.nio.cs.StandardCharsets => 1 instance
sun.nio.cs.StreamEncoder$CharsetSE => 2 instances
sun.nio.cs.Surrogate$Parser => 3 instances
sun.reflect.DelegatingConstructorAccessorImpl => 5 instances
sun.reflect.NativeConstructorAccessorImpl => 5 instances
sun.reflect.ReflectionFactory => 1 instance

50 instances in 23 classes (out of 248 loaded in the debuggee JVM)
2031616 bytes in heap, 1068688 bytes used. (took 0.967s)

The user can see the value of the second object of type sun.misc.Signal by typing the following:
(bugloo)  (heap get "sun.misc.Signal" 1) 
(bugloo)  (show %obj%) 
(sun.misc.Signal) = SIGTERM

The heap inspector makes it possible to roughly verify that the GC freed all the memory allocated by the program during execution. In conjunction with the back-references and incoming references commands described in sections Querying a Back-References Path and Querying Incoming References, it helps to understand why an object is still alive.

8.2 Querying a Back-References Path

An object is alive for a garbage collector if it is a root (i.e. if it is contained in a class variable, in a local variable inside a stack frame, or in the JVM operand stack), or if it is referenced by other live objects. If an object remains alive after a garbage collection when it was supposed to be collected, it is still accessible from at least one GC root.

Bugloo can reveal one of the GC roots responsible for the memory leak by showing a complete chain of back-references from the target object up to this GC root.

(backref <varname> . <framenum>)    Bugloo command

Return a back-reference path starting from <varname>

(backref %obj%)    Bugloo command

Return a back-reference path starting from the currently selected object

(backref get <nth>)    Bugloo command

Get the <nth> element in the last computed back-reference path

argument              description
<varname>::symbol     the name of a variable in the debuggee
<framenum>::int       the frame number in the call stack
<nth>::int            the <nth> element in the last computed back-reference path

To compute a back-references path, Bugloo starts from every GC root R, and follows every object it can reach until it finds the target object T. The algorithm is a simple depth-first search that marks an object as seen (to avoid cycles) and recursively searches all its references. When T is found, the computations that remain on the search stack represent the nodes of the back-references path. This feature is deterministic: the same root is discovered each time the command is called. Even if T is reachable from many GC roots, only one root is exhibited. This limitation simplifies the implementation of this debugging feature and improves its speed. In practice, it is rare that many roots are responsible for the same leak. If needed, it is possible to discover more roots by querying a back-references path for each incoming reference of T (see section Querying Incoming References).
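
The search described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Bugloo's code: the heap is a hypothetical dictionary mapping each object to its labelled outgoing references, the function name `backref_path` is invented for the example, and the recursion would need an explicit stack on a real heap.

```python
def backref_path(roots, references, target):
    """Depth-first search from each root; return (root_name, steps) or None.

    roots:      ordered list of (root_name, object)
    references: dict mapping an object to a list of (label, referenced_object)
    steps:      list of (object, label) from the target up to the root, where
                label tells how the next object in the list holds this one
    """
    seen = set()

    def visit(obj, stack):
        if obj == target:
            return stack                  # what remains on the stack is the path
        if obj in seen:
            return None                   # already explored: avoids cycles
        seen.add(obj)
        for label, child in references.get(obj, []):
            found = visit(child, stack + [(obj, label)])
            if found is not None:
                return found
        return None

    for root_name, obj in roots:          # fixed order, so the result is deterministic
        stack = visit(obj, [])
        if stack is not None:
            holders = list(reversed(stack))
            objects = [target] + [o for o, _ in holders]
            labels = [label for _, label in holders] + [None]
            return root_name, list(zip(objects, labels))
    return None

# Hypothetical heap shaped like the path shown later in this section.
references = {
    "struct-5": [("field values", "vector-4")],
    "vector-4": [("at index 2", "vector-3")],
    "vector-3": [("at index 4082", "pair-2")],
    "pair-2": [("field cdr", "vector-3"), ("field car", "pair-1")],  # cdr forms a cycle
    "pair-1": [("field car", "ast-node-0")],
}
roots = [("module leak2 : *nodes-cache*", "struct-5")]
root, steps = backref_path(roots, references, "ast-node-0")
```

Only the first root that reaches the target is reported, which mirrors the single-root behaviour described above.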

To understand how back-references paths help to find a memory leak, let us consider the following program, which models a compiler for a mini-language.

  1:(module leak2
  2:   (export (class ast-node
  3:              type::symbol
  4:              value::obj))
  5:   (main compile))
  6:
  7:(define *nodes-cache* (make-hashtable))
  8:
  9:(define (compile args)
 10:   (let ((obj (file->ast (car args))))
 11:      (set! obj (ast->il obj))
 12:      (set! obj (il->bytecode obj))
 13:      (bytecode->file obj (cadr args))))

When the program runs, the mini compiler loads a file and stores it into an AST at line 10. It compiles the AST into an intermediate language at line 11. It then runs out of memory at line 12, failing to produce bytecode. It is likely that some computation done in file->ast or ast->il is responsible for the memory leak. To verify this assumption, we run the program again and set a breakpoint at line 12. At this point, we can trigger a GC and then query a heap dump to see which live objects remain in the heap.

(bugloo)  (gc) 
(bugloo)  (info heap "::") 
Debuggee VM heap dump. Please wait...
Sorting result of heap dump. Please wait...
::ast-node => 29989 instances
::bbool => 2 instances
::bigloo.runtime.Ieee.equiv => 3 instances
::bigloo.runtime.Ieee.fixnum => 37 instances
::bigloo.runtime.Ieee.input => 17 instances
::bigloo.runtime.Ieee.number => 53 instances
::bigloo.runtime.Ieee.output => 22 instances
::bigloo.runtime.Ieee.pairlist => 70 instances
::bigloo.runtime.Ieee.port => 48 instances
::bigloo.runtime.Ieee.symbol => 13 instances
::bigloo.runtime.Ieee.vector => 16 instances
::bigloo.runtime.Llib.bexit => 4 instances
::bigloo.runtime.Llib.bigloo => 14 instances
::bigloo.runtime.Llib.bit => 7 instances
::bigloo.runtime.Llib.error => 30 instances
::bigloo.runtime.Llib.hash => 27 instances
::bigloo.runtime.Llib.object => 68 instances
::bigloo.runtime.Llib.os => 31 instances
::bigloo.runtime.Llib.struct => 9 instances
::bigloo.runtime.Pp.circle => 4 instances
::bigloo.runtime.Read.reader => 10 instances
::bigloo.runtime.Rgc.rgc => 21 instances
::bint => 26088 instances
::cell => 1 instance
::eof => 1 instance
::key => 1 instance
::leak2 => 11 instances
::nil => 1 instance
::optional => 1 instance
::pair => 91050 instances
::procedure => 1 instance
::real => 2 instances
::rest => 1 instance
::struct => 1 instance
::symbol => 863 instances
::unspecified => 1 instance

148518 instances in 36 classes (out of 170 loaded in the debuggee JVM)
6836224 bytes in heap, 5065424 bytes used. (took 0.929s)

The output of the dump clearly shows that instances of ::ast-node still reside in the heap although they are no longer used. To find the object responsible for the retention, we select the first ::ast-node object returned by the dump, and then query a back-references path for this object:

(bugloo)  (heap get "::ast-node" 0) 
(bugloo)  (backref %obj%) 
#0 ::ast-node
      | field car
#1 ::pair
      | field car
#2 ::pair
      | at index 4082
#3 ::vector
      | at index 2
#4 ::vector
      | field values
#5 ::struct  ====>  module leak2 : *nodes-cache*
command took 0.743s.

The returned path starts from the instance and goes down to the GC root *nodes-cache*. It is now obvious that the problem comes from file->ast: it uses the hashtable defined at line 7 to cache the AST nodes, and does not clear it on exit.

Note that the value of intermediate objects can be obtained just like a normal program variable with the commands presented in section Arguments and Local variables. The user has to select one object among those returned in the back-references path and set it as the new current object by means of the command (backref get ...). It can then be referenced by the special value %obj% and passed as an argument to any other variable inspection command.

8.3 Querying Incoming References

In addition to back-references paths, Bugloo can show all the incoming references of a particular object O, i.e. all objects and roots that directly point to O. This is useful for debugging programs whose computations share many data structures.

(incoming <varname> . <framenum>)    Bugloo command

Compute the direct reachability of variable <varname>

(incoming %obj%)    Bugloo command

Compute the direct reachability of the currently selected object

(incoming get <nth>)    Bugloo command

Get the <nth> element in the last computed set of incoming references

(info incoming)    Bugloo command

Display the last computed set of incoming references

argument              description
<varname>::symbol     the name of a variable in the debuggee
<framenum>::int       the frame number in the call stack
<nth>::int            the <nth> element in the last computed set of incoming references

Let us consider the following example, where the debuggee has been suspended by a breakpoint set at line 8:
  1:(module incoming
  2:   (main go))
  3:
  4:(define *foo* #unspecified)
  5:
  6:(define (fun1 x)
  7:   (set! *foo* x)
  8:   (print x))
  9:
 10:(define (go args)
 11:   (let ((dummy (cons 1 args)))
 12:      (fun1 args)
 13:      (print dummy args)))

The user can list all the objects that directly reference variable x (or, more precisely, the value of variable x) by typing the following:
(bugloo)  (incoming x) 
#0  frame 0, (fun1 ::pair) in thread main => x
#1  frame 1, (go ::pair) in thread main => args
#2  frame 2, (bigloo_main ::obj) in thread main
#3  object ::pair => cdr
#4  module incoming => *foo*
#5  class bigloo.foreign => command_line

The output differs depending on the type of incoming reference:

  • lines #0 to #2 represent local variables that currently reference object X. The dumped information includes the name of the method in which the local variable is defined, the thread in which the method was called (and its position in the call stack), and the name of the local variable. Line #2 does not show a name because the method bigloo_main belongs to the Bigloo runtime, which was not compiled in debug mode and thus does not contain any name information.
  • line #3 represents an object in the JVM that has a field directly referencing object X. This field is always an instance field, and its name is shown right after the arrow whether or not the class was compiled in debug mode.
  • line #4 tells the user that a Scheme global variable references object X. The dumped information includes the name of the global variable and the module in which it is defined.
  • finally, line #5 represents a JVM class that has a field directly referencing object X. This field is always a static field, and its name is shown right after the arrow whether or not the class was compiled in debug mode.

In fact, line #3 probably represents the value of the local variable dummy, which is defined at line 11. To check this, the user can select the object shown at line #3 as the new current object and request all incoming references for it:
(bugloo)  (incoming get 3) 
(bugloo)  (incoming %obj%) 
#0  frame 1, (go ::pair) in thread main => dummy
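
The notion of incoming references can be modelled with a short Python sketch. It is a hypothetical illustration rather than Bugloo's implementation: the function `incoming`, the tuple layout of the roots and the object names are all assumptions chosen to mirror the example above.

```python
def incoming(roots, references, target):
    """List every root and every object that holds a direct reference to `target`.

    roots:      list of (kind, owner, name, object), e.g. a frame or a global
    references: dict mapping an object to a list of (field, referenced_object)
    """
    result = []
    for kind, owner, name, obj in roots:
        if obj == target:
            result.append((kind, owner, name))
    for obj, fields in references.items():
        for field, child in fields:
            if child == target:
                result.append(("object", obj, field))
    return result

# Hypothetical model of the debuggee suspended at line 8 of the example.
roots = [
    ("frame", "fun1", "x", "args-list"),
    ("frame", "go", "args", "args-list"),
    ("frame", "go", "dummy", "dummy-pair"),
    ("frame", "bigloo_main", None, "args-list"),   # no debug info, so no name
    ("module", "incoming", "*foo*", "args-list"),
    ("class", "bigloo.foreign", "command_line", "args-list"),
]
references = {"dummy-pair": [("car", "fixnum-1"), ("cdr", "args-list")]}

to_args = incoming(roots, references, "args-list")
to_pair = incoming(roots, references, "dummy-pair")
```

In this model `to_args` contains the three frames, the global, the static field and the `cdr` of the pair, while `to_pair` contains only the local variable `dummy`, as in the session above.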




1: All Bigloo types begin with characters ::, hence the regexp.

This page has been generated by Scribe.
Last update Thu Aug 28 17:12:18 2003