Multithreaded Lisps that allow passing a Lisp array directly (without
copy in and out) into a C function need some means to prevent the GC
from moving the Lisp array around in memory during the C function
call. There are at least three ways to accomplish this. Some Lisps
(e.g. ACL) have extra MAKE-ARRAY parameters that specify that an array
should not be moved by the GC (is "static"). With a multithreaded
Lisp, this would (in theory at least) free other threads to execute
garbage collection during the execution of the foreign function.
Other Lisps (e.g. SCL)
can pin a specific array during the body of a macro; SCL's strategy in
particular allows other threads to execute garbage collection while
the current thread is executing the foreign function. Finally, some
Lisps (e.g. SBCL) have user functions to pause WITH-GC-OFF form:
(WITH-GC-OFF
(c-function lisp-array ...))
or GC-ON and GC-OFF functions (which can be used to implement a
WITH-GC-OFF macro).
The absence of any of these features limits the implementation of our matrix library.
MAKE-ARRAY
call) arrays are, then users are
limited in which arrays they can pass into our interface. In
particular, they must either use our own "matrix" / "vector" types
which handle static allocation internally, or declare their own
arrays static via the appropriate implementation-specific extensions
to MAKE-ARRAY. There generally isn't a way to tell if
a particular array was declared static, so we can't warn users if
they mess up.
Finalization is another important feature that determines how we store data in our matrix library. "Finalization" refers to a user-level hook into the GC that lets us call a function when an object is garbage-collected. This is very useful if the Lisp object has a pointer to some external resource (e.g. memory that was dynamically allocated in C) that should be freed.
Some Lisps, including SBCL, don't call finalizations on exit of the Lisp system. This isn't important for matrix storage because the OS will reclaim heap space on exit of the Lisp process. Where it does become important is if we use handles from external libraries that save state on disk. We need to finalize those handles on exit.
SBCL in particular has hooks so that you can call a function on exit of the Lisp system. This can be used to finalize named handles.
The CFFI source is a good way to learn how to do finalization. Actually you can use CFFI to do finalization, in Lisps that support it. We used a CFFI source tarball (the first with ECL support), version 061012, to help us figure out which Lisps support finalization. The GCL port of CFFI is not complete; read the info manual (internal.texi) to learn more about the GCL FFI. You won't find a GCL manual online; you'll have to download the GCL package and build the info manual, or examine the .texi files by hand (as of 15 Oct 2006).
The presence of finalization allows us to use C arrays (allocated on
the heap) to represent vector and matrix data, as long as the C arrays
are wrapped in a Lisp object. A finalization can be set on the Lisp
object so that when Lisp GC's it, free (or whatever C
or Lisp function you want) is called on the C array.
There are other techniques besides finalization for freeing a C
pointer automatically upon garbage collection of a wrapper Lisp
object. In particular, Bruno Haible has written an
essay that describes so-called "weak" data structures, such as
weak pointers and weak hash tables. For example, we could keep a hash
table that keys from weak pointer to a vector object, to the C pointer
of the associated array. Periodically, we could search the keys of
the hash table for weak pointers that have been broken; if they have,
then free the associated C array. For example, we
could keep track of the amount of C memory used every time we allocate
an array. If the C memory usage passes a certain threshold, we can
initiate a "collection." This introduces some overhead at object
allocation time, which may be significant if we have to collect a
large number of objects.
Note however that the classic technique of reference counting won't
work with CLOS, because CLOS doesn't have destructors (as in C++ or
Java) or destroy-instance methods. (The classic
koan about
Moon's warning to the student doesn't seem to apply here because
there are no circular references: any view has reference only to the
underlying array, and the array itself doesn't have reference to any
other arrays. However, it
| Lisp implementation | Static arrays or array pinning? | WITH-GC-OFF? | Finalizations? | Weak pointers / hash tables? |
|---|---|---|---|---|
| ACL | Static | No (6) | Yes | Yes |
| Clisp (4) | No | No | Yes | Yes |
| CMUCL | No | Yes (10) | Yes | Yes |
| Corman | No? (8) | No? | Yes | Yes |
| ECL | No? (3) | No | Yes | No |
| GCL | Static | Yes | ??? (5) | No |
| LispWorks | Static | Not really (7) | Yes | Yes |
| OpenMCL | No | Yes (1) | Yes | Yes |
| SBCL | No | Yes (2) | Yes | Yes |
| SCL | Pinned (9) | No? | Yes | ? |
(EGC T)
enables the GC and (ECG nil) disables it, though this
is a moot point because OpenMCL probably boxes Lisp arrays.
(SI:SGC-ON nil) and turn
it back on by calling (SI:SGC-ON T). GCL has a C
function interface but no CFFI port yet; you can learn more about
the interface in the Info documentation. The syntax is
significantly different from that of CFFI. For example, DEFCFUN
requires an entire C function header, rather than just the name
of the function. Furthermore, the example in the info manual
suggests that it's only meant to work with GCL-defined C objects.
DEFENTRY seems to support more general C functions and could
presumably help us. The internal.texi info file also gives an
example of calling MAKE-ARRAY with :STATIC
T. Note that the finalization interface isn't supported in
CFFI; I don't know if GCL supports finalization.
:STATIC-RECLAIMABLE
keyword is what you want. It's supported for all four datatypes
(real and complex double and single float) that LAPACK 3.0 supports.
If we move to arbitrary-precision floating-point, we'll want to use
:LISPSTATIC-RECLAIMABLE for those arrays, since they
can hold objects of any type (are they boxed?). The ACL 8.0
documentation specifically says that
there is no way to stop scavenges from occurring, even temporarily.
:ALLOCATION :STATIC.
AVOID-GC function (use NORMAL-GC
afterwards to restore things) but it isn't MAKE-WEAK-POINTER
function creates a weak pointer to an object, and
WEAK-POINTER-OBJ gets the object to which it points). There
is a way to suspend the GC but it's only meant to work for a few
instructions.
The recommended approach is to use the 'ext:with-pinned-object macro which pins the address of a lisp object within the context of the macro, and by default also inhibits write protection of pages occupied by the object. This allows other threads to execute garbage collection while the current thread is executing the foreign function. For example:
(defun example (array) (ext:with-pinned-object (array) (let ((sap (sys:vector-sap array))) ... foreign call that may read or write directly to the lisp array ... )))If the foreign function does not write to the lisp object then the following faster form may be used:(ext:with-pinned-object (array :inhibit-write-protection nil) ...)
EXT:GC-OFF and EXT:GC-ON work, but
SYS:WITHOUT-GCING is the preferred idiom. No static
arrays. Weak pointers are supported but weak hash table support
is currently (16 Oct 2006) experimental. Finalizers are implemented
with weak pointers to the object you want to finalize.