In languages with explicit memory deallocation such as C, when
the program loses its last pointer to a memory area, this memory
can no longer be deallocated, because it is no longer
referenced anywhere in the program. This is called a memory
leak. The principle of a garbage collector is to eliminate
such leaks by automatically deallocating memory when it becomes
inaccessible.
Garbage-collected environments are vulnerable to another kind of memory
leak, which occurs when memory still appears accessible to the
collector although the program no longer needs it. Bugloo provides
a heap inspector and an analysis of object references to detect
these memory retentions and to understand their causes.
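The retention described above can be reproduced in any garbage-collected language. The following is a minimal sketch in Python, not Bugloo code: the class `Node`, the container `cache` and the function `parse` are hypothetical names chosen for illustration. A weak reference is used as a probe, because it observes an object without keeping it alive.

```python
import gc
import weakref


class Node:
    """Stand-in for any heap-allocated object (hypothetical class)."""

    def __init__(self, value):
        self.value = value


# A long-lived container: everything stored here stays reachable.
cache = {}


def parse(name):
    """Build a node and, as a side effect, remember it in the cache."""
    node = Node(name)
    cache[name] = node
    return node


node = parse("example")
probe = weakref.ref(node)   # observes the node without keeping it alive

del node                    # the program no longer needs the node...
gc.collect()
alive_while_cached = probe() is not None   # ...but the cache retains it

cache.clear()               # dropping the last reference ends the retention
gc.collect()
alive_after_clear = probe() is not None
```

The collector behaves correctly in both cases: the object is retained only because a container that the program forgot about still points to it.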
When the debuggee is suspended, the debugger can give statistics
about memory consumption. The user can query the number of live objects (i.e. those still accessible by the program) for
each class loaded in the virtual machine.
(info heap . <filter>) | Bugloo command |
Return the live objects in the JVM heap, sorted by
type name
(heap get <typename> <nth>) | Bugloo command |
Select the <nth> object of type <typename> from the heap dump
argument | description |
<filter>::string | display only classes that match this
regular expression |
<typename>::string | the type of the object to select |
<nth>::string | the <nth> object of type <typename>, starting at index 0 |
The result of the query can be filtered by object type
by means of regular expressions. For example, the user can
restrict the dump to Bigloo classes only by typing the following (see footnote 1):
(bugloo) (info heap "::")
Once a dump has been returned to the debugger, the user can
select any object and manipulate it with the usual display commands
described in section Arguments and Local variables. Instead of giving the name of the variable to display,
the user gives the special value %obj% as a
parameter. Suppose the user queries the live objects whose
type name starts with "sun".
(bugloo) (info heap "sun.*")
sun.misc.Launcher$AppClassLoader => 1 instance
sun.misc.Launcher$ExtClassLoader => 1 instance
sun.misc.Launcher$Factory => 1 instance
sun.misc.Launcher => 1 instance
sun.misc.NativeSignalHandler => 2 instances
sun.misc.Signal => 3 instances
sun.misc.SoftCache => 2 instances
sun.misc.URLClassPath$FileLoader => 2 instances
sun.misc.URLClassPath$JarLoader => 8 instances
sun.misc.URLClassPath => 2 instances
sun.misc.Unsafe => 1 instance
sun.net.www.protocol.file.Handler => 1 instance
sun.net.www.protocol.jar.Handler => 2 instances
sun.nio.cs.ISO_8859_15$Decoder => 1 instance
sun.nio.cs.ISO_8859_15$Encoder => 3 instances
sun.nio.cs.ISO_8859_15 => 1 instance
sun.nio.cs.ISO_8859_1 => 1 instance
sun.nio.cs.StandardCharsets => 1 instance
sun.nio.cs.StreamEncoder$CharsetSE => 2 instances
sun.nio.cs.Surrogate$Parser => 3 instances
sun.reflect.DelegatingConstructorAccessorImpl => 5 instances
sun.reflect.NativeConstructorAccessorImpl => 5 instances
sun.reflect.ReflectionFactory => 1 instance
50 instances in 23 classes (out of 248 loaded in the debuggee JVM)
2031616 bytes in heap, 1068688 bytes used. (took 0.967s)
The user can then display the second object of type sun.misc.Signal by typing the following:
(bugloo) (heap get "sun.misc.Signal" 1)
(bugloo) (show %obj%)
(sun.misc.Signal) = SIGTERM
The heap inspector makes it possible to check roughly that the GC
has freed all the memory allocated by the program during execution. In
conjunction with the back-references and incoming-references
commands described in sections Querying a Back-References Path and Querying Incoming References, it helps to understand why an object is still alive.
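The idea behind a per-type instance count can be sketched outside the JVM. The following Python example is only an analogy, not Bugloo's implementation: it relies on the standard `gc.get_objects()` call, which lists the objects tracked by Python's own collector. The functions `live_instances` and `heap_get` and the class `Signal` are hypothetical names that loosely mirror the `info heap` and `heap get` commands.

```python
import gc
import re
from collections import Counter


def qualified_name(cls):
    """Fully qualified type name, e.g. 'collections.OrderedDict'."""
    return f"{cls.__module__}.{cls.__qualname__}"


def live_instances(pattern=""):
    """Count live, GC-tracked objects per type name, sorted by name."""
    gc.collect()
    regex = re.compile(pattern)
    counts = Counter()
    for obj in gc.get_objects():
        name = qualified_name(type(obj))
        if regex.search(name):          # keep only the types matching the filter
            counts[name] += 1
    return dict(sorted(counts.items()))


def heap_get(typename, nth):
    """Select the nth live object of the given type, starting at index 0."""
    matches = [o for o in gc.get_objects() if qualified_name(type(o)) == typename]
    return matches[nth]


class Signal:
    """Hypothetical class used to populate the heap."""

    def __init__(self, name):
        self.name = name


signals = [Signal(n) for n in ("SIGINT", "SIGTERM", "SIGHUP")]
report = live_instances(r"\.Signal$")
selected = heap_get(next(iter(report)), 1)
```

Note that the order in which the collector enumerates objects is not specified, so `selected` is one of the three signals but not necessarily the second one created.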
8.2 Querying a Back-References Path
An object is alive for a garbage collector if it is a
root (i.e. if it is held in a class variable, in a
local variable of a stack frame, or in the JVM operand stack), or
if it is referenced by other live objects. When an object
remains alive after a garbage collection although it was expected to be
collected, it is still accessible from at least one GC
root.
Bugloo can unveil one of the GC roots responsible for the
memory leak by showing a complete chain of back-references
from the target object up to this GC root.
(backref <varname> . <framenum>) | Bugloo command |
Return a back-reference path starting from <varname>
(backref %obj%) | Bugloo command |
Return a back-reference path starting from the currently
selected object
(backref get <nth>) | Bugloo command |
Get the <nth> element in the last computed
back-reference path
argument | description |
<varname>::symbol | the name of a variable in the debuggee |
<framenum>::int | the frame number in the call stack |
<nth>::int | the <nth> element in the last computed
back-reference path |
To compute a back-references path, Bugloo starts from every GC root
R and follows every object it can reach until it finds
the target object T. The algorithm is a simple depth-first
search that marks each object as seen (to avoid cycles) and
recursively searches all its references. When T is
found, the computations that remain on the search stack represent the
nodes of the back-references path. This feature is deterministic: the
same root is discovered each time the command is called. Even if T is reachable from many GC roots, only one root is
exhibited. This limitation simplifies the implementation of this
debugging feature and improves its speed. In practice, it is rare that
many roots are responsible for the same leak. If needed, it is possible
to discover more roots by querying a back-references path for each
incoming reference of T (see section Querying Incoming References).
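The search described above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not Bugloo's actual code: the heap is modelled as a dictionary mapping each object to the objects it references, and the names `backref_path`, `references` and the sample objects are hypothetical.

```python
def backref_path(roots, references, target):
    """Return [target, ..., root] for the first root reaching target, else None."""
    seen = set()    # objects already visited, shared by all roots (avoids cycles)

    def visit(node, trail):
        if node in seen:
            return None
        seen.add(node)
        trail.append(node)              # trail mirrors the search stack
        if node == target:
            return list(reversed(trail))
        for child in references.get(node, ()):
            found = visit(child, trail)
            if found is not None:
                return found
        trail.pop()                     # dead end: backtrack
        return None

    for root in roots:                  # fixed iteration order => deterministic
        found = visit(root, [])
        if found is not None:
            return found
    return None


# Hypothetical heap: "node" is reachable from two roots, and
# "bucket" points back to "table", which forms a cycle.
references = {
    "global cache": ["table"],
    "table": ["bucket"],
    "bucket": ["table", "node"],
    "stack frame": ["node"],
}
roots = ["global cache", "stack frame"]
path = backref_path(roots, references, "node")
```

Here `path` is `["node", "bucket", "table", "global cache"]`: the second root, `"stack frame"`, also reaches the target but is never reported, which matches the one-root limitation described above. A recursive search like this one is bounded by Python's recursion limit, so a real heap would need an explicit stack.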
To understand the process of finding a memory leak with
back-references paths, let us consider the following program, which
sketches a mini-language compiler.
1:(module leak2
2:   (export (class ast-node
3:              type::symbol
4:              value::obj))
5:   (main compile))
6:
7:(define *nodes-cache* (make-hashtable))
8:
9:(define (compile args)
10:   (let ((obj (file->ast (car args))))
11:      (set! obj (ast->il obj))
12:      (set! obj (il->bytecode obj))
13:      (bytecode->file obj (cadr args))))
When the program runs, the mini compiler loads a file
and stores it into an AST at line 10. It compiles the AST into
an intermediate language at line 11. It then runs out of memory at
line 12, failing to produce bytecode. It is likely that some
computation done in file->ast or ast->il is responsible
for the memory leak. To check this hypothesis, we run the program again
and set a breakpoint at line 12. At this point, we can trigger a GC
and then query a heap dump to see the live objects that remain in the
heap.
(bugloo) (gc)
(bugloo) (info heap "::")
Debuggee VM heap dump. Please wait...
Sorting result of heap dump. Please wait...
::ast-node => 29989 instances
::bbool => 2 instances
::bigloo.runtime.Ieee.equiv => 3 instances
::bigloo.runtime.Ieee.fixnum => 37 instances
::bigloo.runtime.Ieee.input => 17 instances
::bigloo.runtime.Ieee.number => 53 instances
::bigloo.runtime.Ieee.output => 22 instances
::bigloo.runtime.Ieee.pairlist => 70 instances
::bigloo.runtime.Ieee.port => 48 instances
::bigloo.runtime.Ieee.symbol => 13 instances
::bigloo.runtime.Ieee.vector => 16 instances
::bigloo.runtime.Llib.bexit => 4 instances
::bigloo.runtime.Llib.bigloo => 14 instances
::bigloo.runtime.Llib.bit => 7 instances
::bigloo.runtime.Llib.error => 30 instances
::bigloo.runtime.Llib.hash => 27 instances
::bigloo.runtime.Llib.object => 68 instances
::bigloo.runtime.Llib.os => 31 instances
::bigloo.runtime.Llib.struct => 9 instances
::bigloo.runtime.Pp.circle => 4 instances
::bigloo.runtime.Read.reader => 10 instances
::bigloo.runtime.Rgc.rgc => 21 instances
::bint => 26088 instances
::cell => 1 instance
::eof => 1 instance
::key => 1 instance
::leak2 => 11 instances
::nil => 1 instance
::optional => 1 instance
::pair => 91050 instances
::procedure => 1 instance
::real => 2 instances
::rest => 1 instance
::struct => 1 instance
::symbol => 863 instances
::unspecified => 1 instance
148518 instances in 36 classes (out of 170 loaded in the debuggee JVM)
6836224 bytes in heap, 5065424 bytes used. (took 0.929s)
The output of the dump clearly shows that instances of ::ast-node still reside in the heap although they are no longer
used. To find the object responsible for the retention,
we select the first ::ast-node object returned by the dump,
and then we query a back-references path for this object:
(bugloo) (heap get "::ast-node" 0)
(bugloo) (backref %obj%)
#0 ::ast-node
| field car
#1 ::pair
| field car
#2 ::pair
| at index 4082
#3 ::vector
| at index 2
#4 ::vector
| field values
#5 ::struct ====> module leak2 : *nodes-cache*
command took 0.743s.
The returned path starts from the instance and leads up to the GC root
*nodes-cache*. The problem clearly comes from file->ast: it uses the hashtable defined at line 7 to cache the AST
nodes, and does not clear it on exit.
Note that the value of intermediate objects can be
obtained just like that of a normal program variable, with the commands presented
in section Arguments and Local variables. The user
selects one object among those returned in the back-references path
and sets it as the new current object by means of the command (backref get ...). It can then be referenced by the special value
%obj% and passed as an argument to any other variable inspection
command.
8.3 Querying Incoming References
In addition to back-references paths, Bugloo can show all the
incoming references of a particular object O,
i.e. all objects and roots that directly point to O. This
is useful for debugging programs whose computations share
many data structures.
(incoming <varname> . <framenum>) | Bugloo command |
Compute the direct reachability of variable <varname>
(incoming %obj%) | Bugloo command |
Compute the direct reachability of the currently
selected object
(incoming get <nth>) | Bugloo command |
Get the <nth> element in the last computed
set of incoming references
(info incoming) | Bugloo command |
Display the last computed set of incoming references
argument | description |
<varname>::symbol | the name of a variable in the debuggee |
<framenum>::int | the frame number in the call stack |
<nth>::int | the <nth> element in the last computed
set of incoming references |
Let us consider the following example, where the debuggee
has been suspended by a breakpoint set at line 8:
1:(module incoming
2: (main go))
3:
4:(define *foo* #unspecified)
5:
6:(define (fun1 x)
7: (set! *foo* x)
8: (print x))
9:
10:(define (go args)
11: (let ((dummy (cons 1 args)))
12: (fun1 args)
13: (print dummy args)))
The user can find all the objects that directly reference variable
x (or more precisely the value of variable x) by
typing the following:
(bugloo) (incoming x)
#0 frame 0, (fun1 ::pair) in thread main => x
#1 frame 1, (go ::pair) in thread main => args
#2 frame 2, (bigloo_main ::obj) in thread main
#3 object ::pair => cdr
#4 module incoming => *foo*
#5 class bigloo.foreign => command_line
The output contains different kinds of entries,
depending on the type of the incoming reference:
- lines #0 to #2 represent local
variables that currently reference object X. The dumped information includes the name of the method in which the
local variable is defined, the thread in which the method was called
(and its position in the call stack), and the name of the local
variable. Line #2 does not show a name because the method
bigloo_main belongs to the Bigloo runtime, which was not compiled
in debug mode and thus does not contain any name information.
- line #3 represents an object in the JVM that has a
field directly referencing object X. This field is always
an instance field, and its name is shown right after the arrow
whether or not the class was compiled in debug mode.
- line #4 tells the user that a Scheme global variable
currently references object X. The dumped information includes the
name of the global variable and the module in which it is
defined.
- finally, line #5 represents a JVM class that has a
field directly referencing object X. This field is always
a static field, and its name is shown right after the arrow
whether or not the class was compiled in debug mode.
In fact, line #3 probably represents the pair held by the local
variable dummy, which is defined at line 11. To check
this, the user can select the object shown on line #3
as the new current object and request all its incoming
references:
(bugloo) (incoming get 3)
(bugloo) (incoming %obj%)
#0 frame 1, (go ::pair) in thread main => dummy
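The two-step query above can be mimicked with a small model. The following Python sketch is illustrative only and does not reflect how Bugloo inspects the JVM: each reference is recorded as a hypothetical `Reference` tuple, and the function `incoming` simply selects the references that point to a given object, comparing by identity.

```python
from collections import namedtuple

# kind: "frame", "object" or "global"; holder: what owns the slot.
Reference = namedtuple("Reference", "kind holder slot target")


def incoming(references, target):
    """Return every reference whose target is exactly `target` (identity test)."""
    return [ref for ref in references if ref.target is target]


args = ["prog"]        # the value shared by several holders
pair = [1, args]       # plays the role of (cons 1 args)

# Hand-written model of the references at the breakpoint (hypothetical data).
heap_model = [
    Reference("frame", "fun1", "x", args),
    Reference("frame", "go", "args", args),
    Reference("frame", "go", "dummy", pair),
    Reference("object", pair, "cdr", args),
    Reference("global", "module incoming", "*foo*", args),
]

# First query: who points directly to the value of x?
to_args = incoming(heap_model, args)

# Second query: select the referring object and ask who points to it.
holder = next(ref.holder for ref in to_args if ref.kind == "object")
to_pair = incoming(heap_model, holder)
```

The first query finds the slots `x`, `args`, `cdr` and `*foo*`. The second one finds only the local variable `dummy` of `go`, which confirms that the referring pair is the one built by `cons`.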
1: All Bigloo types begin with the characters ::, hence the
regexp.