I wrote the following essay at work when a definition of "source code" was applied to scripts. This caused consternation among legal types who wanted to apply a wooden definition of source code where it shouldn't be applied. The essay refers to a teleconference, and a coworker's name has been redacted to protect the innocent. In retrospect, I used way too much emphasis via font, but you get it here in all its glory.

Sorry for the following dissertation, but I don't get to think like this often. Also, please note that I am generalizing and oversimplifying here: In the discussion that follows, there are real-world exceptions to just about everything.

The problem we have is that a simplistic, three-fold definition of "computer code," in which the three types are supposed to have orthogonal, i.e., non-intersecting, definitions, is not tenable in practice (not merely in theory). For review, the three divisions presented on the phone were source code, object code, and executable code.

The problem is that all three are designed to have a computer operate on them. Executable code is what is loaded into memory with all static addresses resolved. Object code is machine code that still has unresolved addresses; a specialized program called a linker pulls the parts together (links them, in fact) and resolves all the static addresses, turning the object code into executable code. In this continuum, source code is the human-readable input to a program called a compiler, which produces object code.
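To make the linker's job concrete, here is a heavily simplified sketch in Python. The module layout (a dictionary with `defines` and `code` keys) and the instruction names are invented for this illustration; real object formats and real linkers are far more involved.

```python
def link(object_modules):
    """Concatenate modules and resolve symbolic references to addresses."""
    symbols = {}
    code = []
    # Pass 1: lay the modules out end to end and record where each symbol lands.
    for module in object_modules:
        base = len(code)
        for name, offset in module["defines"].items():
            symbols[name] = base + offset
        code.extend(module["code"])
    # Pass 2: replace every unresolved (string) operand with its final address.
    executable = []
    for op, operand in code:
        if isinstance(operand, str):
            operand = symbols[operand]  # a KeyError here means an undefined symbol
        executable.append((op, operand))
    return executable


# Two hypothetical "object files": main refers to helper by name only.
main_obj = {"defines": {"main": 0}, "code": [("call", "helper"), ("halt", 0)]}
helper_obj = {"defines": {"helper": 0}, "code": [("ret", 0)]}

print(link([main_obj, helper_obj]))
# prints [('call', 2), ('halt', 0), ('ret', 0)]
```

Before linking, `main_obj` knows `helper` only by name; after linking, the call points at a concrete address, which is the difference between object code and executable code described above.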

The problem we are facing is this: Scripting languages (in general) combine all three elements at once. The "scripting engine" (or "interpreter") reads human-readable instructions, parses them (i.e., determines their meaning), and executes them.
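As a rough illustration of that read-parse-execute loop, here is a minimal sketch of a toy interpreter in Python. The command names (`set`, `add`, `print`) and the function `run_script` are made up for this example and do not correspond to any real scripting engine.

```python
def run_script(text):
    """Read, parse, and execute a toy script, one line at a time."""
    variables = {}
    output = []
    for line in text.splitlines():        # read the human-readable text
        words = line.split()              # parse: split the line into tokens
        if not words or words[0].startswith("#"):
            continue                      # skip blank lines and comments
        command, args = words[0], words[1:]
        if command == "set":              # execute: act on the parsed command
            name, value = args
            variables[name] = int(value)
        elif command == "add":
            name, value = args
            variables[name] += int(value)
        elif command == "print":
            output.append(f"{args[0]} = {variables[args[0]]}")
        else:
            raise ValueError(f"unknown command: {command}")
    return output


script = """
# a hypothetical three-line script
set x 40
add x 2
print x
"""
print("\n".join(run_script(script)))
# prints: x = 42
```

Note that no separate object or executable file ever exists here: the same text a human can read is what the program runs from.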

Waxing pedantic for a few moments, there are different strategies for doing this (bear with me, this really is relevant):

The highlighted comment in the last bullet is worth noting because it forms the motivation for scripting in general: As Robert Heinlein wrote in The Moon Is a Harsh Mistress, humans can't tell the difference between an answer that comes in a millisecond vs. a microsecond as long as the answer is correct. Applied here, this means the purpose of scripting languages is to save the programmer's time when that matters more than execution time. And in practice, if a script is fast enough (in clock time) and works correctly, a programmer will not rewrite it in something else.

In the above "pedantic" classification of scripting languages, there are two broad "meta" classifications, corresponding roughly to the first pair and last pair of bullets, respectively: Things a human would type at a command line, and things a human would not type at a command line.

In general, "shell scripts" and "SQL scripts" are lists of commands that are very likely to be typed by a human at an interpreter's prompt, i.e., they are simply collections of commands gathered together for convenient execution.

$Id: ScriptingLanguages.html 417 2006-10-03 04:33:26Z criglerj $