Algorithm S (Selection sampling technique). To select n records at random from a set of N, where 0 < n <= N.
S1. [Initialize.] Set t <— 0, m <— 0. (During this algorithm, m represents the number of records selected so far, and t is the total number of input records that we have dealt with.)
S2. [Generate U.] Generate a random number U, uniformly distributed between zero and one.
S3. [Test.] If (N - t)U >= n - m, go to step S5.
S4. [Select.] Select the next record for the sample, and increase m and t by 1. If m < n, go to step S2; otherwise the sample is complete and the algorithm terminates.
S5. [Skip.] Skip the next record (do not include it in the sample), increase t by 1, and go back to step S2.
This description can be easily translated into a Common Lisp function, after renaming a few variables, as follows:
(defun algorithm-s (n max) ; max is N in Knuth's algorithm
(let (seen ; t in Knuth's algorithm
selected ; m in Knuth's algorithm
u ; U in Knuth's algorithm
(records ())) ; the list where we save the records selected
(tagbody
s1
(setf seen 0)
(setf selected 0)
s2
(setf u (random 1.0))
s3
(when (>= (* (- max seen) u) (- n selected)) (go s5))
s4
(push seen records)
(incf selected)
(incf seen)
(if (< selected n)
(go s2)
(return-from algorithm-s (nreverse records)))
s5
(incf seen)
(go s2))))
It's not the prettiest code, but it's easy to verify that it's a faithful translation of Knuth's algorithm. But, this code, unlike Knuth's prose description, can be run and tested. Then you can start refactoring, checking after each change that the function still works.[211]
After pushing the pieces around a bit, you might end up with something like this:
(defun algorithm-s (n max)
(loop for seen from 0
when (< (* (- max seen) (random 1.0)) n)
collect seen and do (decf n)
until (zerop n)))
While it may not be immediately obvious that this code correctly implements Algorithm S, if you got here via a series of functions that all behave identically to the original literal translation of Knuth's recipe, you'd have good reason to believe it's correct.
Another aspect of the language that special operators give you control over is the behavior of the call stack. For instance, while you normally use BLOCK
and TAGBODY
to manage the flow of control within a single function, you can also use them, in conjunction with closures, to force an immediate nonlocal return from a function further down on the stack. That's because BLOCK
names and TAGBODY
tags can be closed over by any code within the lexical scope of the BLOCK
or TAGBODY
. For example, consider this function:
(defun foo ()
(format t 'Entering foo~%')
(block a
(format t ' Entering BLOCK~%')
(bar #'(lambda () (return-from a)))
(format t ' Leaving BLOCK~%'))
(format t 'Leaving foo~%'))
The anonymous function passed to bar
uses RETURN- FROM
to return from the BLOCK
. But that RETURN-FROM
doesn't get evaluated until the anonymous function is invoked with FUNCALL
or APPLY
. Now suppose bar
looks like this:
(defun bar (fn)
(format t ' Entering bar~%')
(baz fn)
(format t ' Leaving bar~%'))
Still, the anonymous function isn't invoked. Now look at baz
.
(defun baz (fn)
(format t ' Entering baz~%')
(funcall fn)
(format t ' Leaving baz~%'))
Finally the function is invoked. But what does it mean to RETURN-FROM
a block that's several layers up on the call stack? Turns out it works fine—the stack is unwound back to the frame where the BLOCK
was established and control returns from the BLOCK
. The FORMAT
expressions in foo
, bar
, and baz
show this:
CL-USER> (foo)
Entering foo
Entering BLOCK
Entering bar
Entering baz
Leaving foo
NIL
Note that the only 'Leaving . . .' message that prints is the one that appears after the BLOCK
in foo
.
Because the names of blocks are lexically scoped, a RETURN-FROM
always returns from the smallest enclosing BLOCK
in the lexical environment where the RETURN-FROM
form appears even if the