
The first reason that Lisp is an excellent language for writing high-performance code is, ironically enough, the dynamic nature of Lisp programming—the very thing that originally made it hard to bring Lisp's performance up to the levels achieved by FORTRAN compilers. The reason Common Lisp's dynamic features make it easier to write high-performance code is that the first step to writing efficient code is to find the right algorithms and data structures.

Common Lisp's dynamic features keep code flexible, which makes it easier to try different approaches. Given a finite amount of time to write a program, you're much more likely to end up with a high-performance version if you don't spend a lot of time getting into and out of dead ends. In Common Lisp, you can try an idea, see it's going nowhere, and move on without having spent a ton of time convincing the compiler your code is worthy of being run and then waiting for it to finish compiling. You can write a straightforward but inefficient version of a function—a code sketch—to determine whether your basic approach is sound and then replace that function with a more complex but more efficient implementation if you determine that it is. And if the overall approach turns out to be flawed, then you haven't wasted a bunch of time tuning a function that's no longer needed, which means you have more time to find a better approach.

The next reason Common Lisp is a good language for developing high-performance software is that most Common Lisp implementations come with mature compilers that generate quite efficient machine code. I'll talk in a moment about how to help these compilers generate code that will be competitive with code generated by C compilers, but these implementations already are quite a bit faster than those of languages whose implementations are less mature and use simpler compilers or interpreters. Also, since the Lisp compiler is available at runtime, the Lisp programmer has some possibilities that would be hard to emulate in other languages—your programs can generate Lisp code at runtime that's then compiled into machine code and run. If the generated code is going to run enough times, this can be a big win. Or, even without using the compiler at runtime, closures give you another way to meld machine code with runtime data. For instance, the CL-PPCRE regular expression library, running in CMUCL, is faster than Perl's regular expression engine on some benchmarks, even though Perl's engine is written in highly tuned C. This is presumably because in Perl a regular expression is translated into what are essentially bytecodes that are then interpreted by the regex engine, while CL-PPCRE translates a regular expression into a tree of compiled closures that invoke each other via the normal function-calling machinery.[326]
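To make the runtime-compilation point concrete, here's a minimal sketch: COMPILE, called with NIL as the name and a lambda expression built at runtime, returns a natively compiled function object. (make-adder is a hypothetical example invented for this illustration, not part of any library.)

```lisp
;; A minimal sketch of runtime code generation: build a lambda
;; expression from data, then hand it to COMPILE. make-adder is a
;; hypothetical example function, not from any library.
(defun make-adder (n)
  ;; COMPILE with NIL as the name compiles the lambda expression and
  ;; returns the compiled function object without installing a name.
  (compile nil `(lambda (x) (+ x ,n))))

(defparameter *add-5* (make-adder 5))

(funcall *add-5* 10) ; => 15
```

If make-adder is called many times with the same n, or the returned function is called in a hot loop, the one-time compilation cost is amortized over all those calls.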

However, even with the right algorithm and a high-quality compiler, you may not get the raw speed you need. Then it's time to think about profiling and tuning. The key, in Lisp as in any language, is to profile first to find the spots where your program is actually spending its time and then worry about speeding up those parts.[327]

You have a number of different ways to approach profiling. The language standard provides a few rudimentary tools for measuring how long certain forms take to execute. In particular, the TIME macro can be wrapped around any form and will return whatever values the form returns after printing a message to *TRACE-OUTPUT* about how long it took to run and how much memory it used. The exact form of the message is implementation defined.

You can use TIME for a bit of quick-and-dirty profiling to narrow your search for bottlenecks. For instance, suppose you have a function that's taking a long time to run and that calls two other functions—something like this:

(defun foo ()
  (bar)
  (baz))

If you want to see whether bar or baz is taking more time, you can change the definition of foo to this:

(defun foo ()
  (time (bar))
  (time (baz)))

Now you can call foo, and Lisp will print two reports, one for bar and one for baz. The exact output is implementation dependent; here's what it looks like in Allegro Common Lisp:

CL-USER> (foo)
; cpu time (non-gc) 60 msec user, 0 msec system
; cpu time (gc)     0 msec user, 0 msec system
; cpu time (total)  60 msec user, 0 msec system
; real time  105 msec
; space allocation:
;  24,172 cons cells, 1,696 other bytes, 0 static bytes
; cpu time (non-gc) 540 msec user, 10 msec system
; cpu time (gc)     170 msec user, 0 msec system
; cpu time (total)  710 msec user, 10 msec system
; real time  1,046 msec
; space allocation:
;  270,172 cons cells, 1,696 other bytes, 0 static bytes

Of course, that'd be a bit easier to read if the output included a label. If you use this technique a lot, it might be worth defining your own macro like this:

(defmacro labeled-time (form)
  `(progn
     (format *trace-output* "~2&~a" ',form)
     (time ,form)))

If you replace TIME with labeled-time in foo, you'll get this output:

CL-USER> (foo)
(BAR)
; cpu time (non-gc) 60 msec user, 0 msec system
; cpu time (gc)     0 msec user, 0 msec system
; cpu time (total)  60 msec user, 0 msec system
; real time  131 msec
; space allocation:
;  24,172 cons cells, 1,696 other bytes, 0 static bytes
(BAZ)
; cpu time (non-gc) 490 msec user, 0 msec system
; cpu time (gc)     190 msec user, 10 msec system
; cpu time (total)  680 msec user, 10 msec system
; real time  1,088 msec
; space allocation:
;  270,172 cons cells, 1,696 other bytes, 0 static bytes

From this output, it's clear that most of the time in foo is spent in baz.

Of course, the output from TIME gets a bit unwieldy if the form you want to profile is called repeatedly. You can build your own measurement tools using the functions GET-INTERNAL-REAL-TIME and GET-INTERNAL-RUN-TIME, which return a number that increases by the value of the constant INTERNAL-TIME-UNITS-PER-SECOND each second. GET-INTERNAL-REAL-TIME measures wall time, the actual amount of time elapsed, while GET-INTERNAL-RUN-TIME measures some implementation-defined value such as the amount of time Lisp was actually executing or the time Lisp was executing user code and not internal bookkeeping such as the garbage collector. Here's a trivial but useful profiling tool built with a few macros and GET-INTERNAL-RUN-TIME:

(defparameter *timing-data* ())

(defmacro with-timing (label &body body)
  (with-gensyms (start)
    `(let ((,start (get-internal-run-time)))
       (unwind-protect (progn ,@body)
         (push (list ',label ,start (get-internal-run-time)) *timing-data*)))))

(defun clear-timing-data ()
  (setf *timing-data* ()))

(defun show-timing-data ()
  (loop for (label time count time-per %-of-total) in (compile-timing-data) do
    (format t "~3d% ~a: ~d ticks over ~d calls for ~d per.~%"
            %-of-total label time count time-per)))
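The show-timing-data loop destructures entries produced by a compile-timing-data helper that aggregates the raw (label start end) entries pushed by with-timing. That helper isn't shown above; here's a sketch of one possible implementation, under the assumption that each raw entry has exactly that shape:

```lisp
;; Sketch of the aggregation step: one possible compile-timing-data,
;; assuming each entry in *timing-data* is a (label start end) list
;; as pushed by with-timing.
(defvar *timing-data* ()) ; defined in the text; repeated so this loads standalone

(defun compile-timing-data ()
  (loop with timing-table = (make-hash-table)
        with count-table = (make-hash-table)
        for (label start end) in *timing-data*
        for time = (- end start)
        summing time into total
        do (incf (gethash label timing-table 0) time)
           (incf (gethash label count-table 0))
        finally
           (let (result)
             (maphash
              (lambda (label time)
                (push (list label
                            time
                            (gethash label count-table)
                            (round time (gethash label count-table))
                            (round (* 100 time) total))
                      result))
              timing-table)
             (return (sort result #'> :key #'fifth)))))
```

Each result entry is a (label total-ticks call-count ticks-per-call percent-of-total) list, sorted so the biggest time sinks come first, which is exactly the order show-timing-data prints them in.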

[326] CL-PPCRE also takes advantage of another Common Lisp feature I haven't discussed, compiler macros. A compiler macro is a special kind of macro that's given a chance to optimize calls to a specific function by transforming calls to that function into more efficient code. CL-PPCRE defines compiler macros for its functions that take regular expression arguments. The compiler macros optimize calls to those functions in which the regular expression is a constant value by parsing the regular expression at compile time rather than leaving it to be done at runtime. Look up DEFINE-COMPILER-MACRO in your favorite Common Lisp reference for more information about compiler macros.
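To make the compiler-macro idea concrete, here's a minimal sketch of the technique. (parse-pattern and scan-string are hypothetical stand-ins invented for this example; they are not CL-PPCRE's actual API.)

```lisp
;; A minimal sketch of the compiler-macro technique. parse-pattern and
;; scan-string are hypothetical stand-ins, not CL-PPCRE's API.
(defun parse-pattern (pattern)
  ;; Stand-in for an expensive parse step; just explode the pattern
  ;; string into a list of characters.
  (coerce pattern 'list))

(defun scan-string (pattern string)
  ;; Naive matcher: does the pattern occur at the start of string?
  ;; Accepts either a raw string or an already-parsed character list.
  (let ((parsed (if (listp pattern) pattern (parse-pattern pattern)))
        (chars  (coerce string 'list)))
    (and (>= (length chars) (length parsed))
         (every #'eql parsed chars))))

(define-compiler-macro scan-string (&whole form pattern string)
  ;; When the pattern is a literal string, parse it at compile time so
  ;; the runtime call receives the pre-parsed representation. Returning
  ;; the original form declines the optimization for non-constant patterns.
  (if (stringp pattern)
      `(scan-string ',(parse-pattern pattern) ,string)
      form))
```

With this in place, a compiled call such as (scan-string "ab" user-input) pays the parsing cost once, at compile time, while (scan-string some-variable user-input) is left untouched.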

[327] The word premature in "premature optimization" can pretty much be defined as "before profiling." Remember that even if you can speed up a piece of code to the point where it takes literally no time to run, you'll still speed up your program only by whatever percentage of time it spent in that piece of code.