Some introductory words
Yesterday (2001-08-14) I talked with ariels about the node Learn to count from one to ten in different programming languages, that was recently re-created because the programming language writeups in learn to count from one to ten in different languages got nuked. While I agreed that comparison of languages in general is only a good thing, I found the approach shown there very, very
awkward - more of an invitation for us fellow noders to add a simple program in Our Favorite Language.
Also, this was pointless - counting linearily, algorithmically speaking, is a very simple thing, and there's thousands of different languages. GTKYN guaranteed.
I couldn't find either of those nodes today. Were they nuked? Well, that doesn't matter...
Here, I'm trying something more sensible. It actually took me some
days of research and coding to finish this. Most of the code here
were specifically written for this document. Weirdly enough, I had not
implemented this algorithm in Perl before, even when Perl is my
favorite programming language ever!
The following thing tries to explain differences between different
types of programming languages, and also, more specifically,
different programming styles. It would be possible to write
procedural version of this program in Python, or an object-oriented
version in Forth (scary thought, isn't it?)
The example programs shown here are implementations of my
Stutter/Feedback Algorithm, just about the only algorithm I've ever
invented and then described in this sort of detail. (I consider myself
sane... well, somewhat strange, but "sane" in clinical sense of the
word. =) While it's still not much more complicated than the counting
algorithm, it's still complicated enough to be actually interesting
enough to be implemented in different ways.
This writeup does recognize the "macrofamilies" of computer
languages in the scientifical sense, but doesn't try to follow them
strictly. I use pseudo-scientific labels like "procedural languages",
and such, but read some computer science books to get more accurate,
OK? For example, some of the categories listed here
would probably not pass muster... Is stack-based stuff really a group
of its own, or just a subgroup of procedural languages, as I expect
them to be? As sure as hell they look vastly different,
though...
Also, this doesn't cover every type of programming language, just
the most common ones.
Procedural languages
(...aka Imperative languages...)
Examples: C, Perl, Rexx,
Example program: (In Rexx)
parse arg infile
if infile\='' then
do while lines(infile)>0
call stutter(linein(infile))
end
else
say 'Usage: stutterize.rexx filename'
exit
exit
stutter: procedure
parse arg line
string = ''
do i = 1 to length(line)
string = string || substr(line,i,3);
end
say string
return
(I was about to do this in Fortran 77, just to show that We
Aren't Afraid, but damn me how big wuss I am. I ran like the coward I
am when I saw Fortran's string handling... even reading stuff from
stdin to variables was a challenge too mighty for me. Still, I
am going to write the hideous example in F77 some day... =)
Example program: (In Perl)
#!/usr/bin/perl -w
use strict;
my ($input);
while($input = <>) {
chomp $input;
for(my $n = 0; $n <= length($input)-3; $n++) {
print substr($input,$n,3);
}
print "\n";
}
Procedural languages approach the problem by going in, doing the
Magic, and getting out. Program starts, program ends, stuff happens in
the middle. That's the idea. That's how the things are done,
always. That's how it happens. The idea is, simply, to use
algorithm. While most functional languages utilize
recursion a lot, procedural languages typically solve the problems
using iteration - in the Perl program above, we do this:
"Take next line from input source (stdin or input file,
depending on invocation), remove trailing carriage return, and loop
over each character of the string, stopping at the 3rd to last. Print
out each 3-character substring of the string, starting from place
indicated by our current loop counter value."
Most procedural languages look surprisingly much like C today,
probably because C's syntax is fairly neat and it's a good language
that everyone should learn. I used Rexx, a fairly old but still
kickin' language, as my non-ordinary example - Rexx is good for
illustration of algorithms, and I guess I'm not making a big
explanation here on how it works, it's mostly self-evident. Perl is
a good example of a "real-world" language that has been influenced by
C - the for( ... ; ... ; ...) ... loop has been stlen
from there.
Specific-purpose scripting languages
Examples: Found in better applications everywhere!
Example program: (In TinyFugue scripting language)
/def stutter = \
/let mystring=%{*} %; /let out= %; /let ctr=0 %;\
/while ({ctr}<=(strlen({mystring})-3)) \
/let out=$[strcat({out},substr({mystring},{ctr},3))] %;\
/let ctr=$[{ctr}+1] %;\
/done %;\
"%{out}
Here's a good example of a specific-purpose procedural programming
language. (Another good example would be ECMAScript, but that one is
too much like C so repeat isn't good...) TinyFugue interpreter can
only be found in TinyFugue. However, as everyone can see, the thing
borrows a thing or two from procedural languages (control
structures, subroutine/macro definitions), something from shell
scripts (%{*} is equivalent of Bourne shell
$*, for example), something from C
(strlen()), and most of all, it inherits the client's
command syntax (all "normal commands", not functions, start with
slash). This sort of scripting can be done in many apps. For
example, some IRC clients (that are still backwards enough not to
support Perl) use rather similiar syntax - but in many, the slashes
can be left out.
Functional languages
Examples: Lisp, Scheme, Haskell
Example program: (In Lisp)
(defun stutter (string)
(cond
((<= (length string) 2) string)
(t (concatenate 'string
(subseq string 0 3)
(stutter (subseq string 1))))))
The idea of functional languages is to specify program execution
as a self-contained function, hence the name. The program takes
parameters and gives a return value. The "side effects" (modifying
some state somewhere along the way) are discouraged. Of course,
sometimes it makes sense to do something just for the side
effect: For example, functional program may use "print" command not
for its return value (usually the string argument itself
unmodified), but for its side effect (printing argument to screen).
Lisp, as depicted above, is less "mathematical" than Haskell (and I chose it just because I already knew some Lisp, but I couldn't learn Haskell in a matter of moments. Impatience rocks...). However, as you can see, as a pseudocode the program would look like this:
Stuttered version of strings of less than 3 letters is string itself (special case)
Stuttering == first three letters of string, followed by stuttering of string from the second letter onward
The idea is simple: we use recursion. The program is defined as a recursive function that only passes stuff as function parameters and only returns stuff - doesn't mess with variables.
Object-oriented languages
Examples: C++, Java, Objective C, Smalltalk, Eiffel
Example program: (In Python)
import sys, string
class Stutterizer:
__processing = 0
def stutter(self,x):
__processing = 1
stuttered = ""
n = 0
while n <= len(x)-3:
stuttered = stuttered + x[n:n+3]
n = n+1
__processing = 0
return stuttered
def isProcessing(self):
return __processing
stut = Stutterizer()
s = "\n"
while s != '':
s = string.rstrip(sys.stdin.readline())
print stut.stutter(s)
There's two things I learned while writing this program: First, the claim that Object-Oriented metaphor is overkill for small problems is completely true.
Secondly, Pythonites' claim that Perl's object orientation sucks
is in grade "pot, kettle, black." True, Python has "class" reserved
word while Perl uses mystical rituals of blessed references, but it
won't help if all class members are public and methods
virtual - that's just as bad as the situation is in
Perl! (Or has this changed in Python 2? I'm sure Perl 6 will address
these issues too...) (Update, 2005-04-01: Please just disregard this mindless, mindless offtopic babbling. Sorry if I struck the nerve of Python fans - there's one language I have had severe trouble "getting". If I get amused enough, I may rewrite this example in Ruby, I have no complaints about that language =)
Okay, enough rambling. Above, you can see that I have a class
called Stutterizer that has two methods: stutter (which
returns the stutterized string) and isProcessing (which
returns the "private" member's number). The
__processing private member just holds the information
whether or not we're processing - this in attempt to make the thing
thread-safe, but this is overkill because there's nothing to be
thread-scared of in this class.
The actual meat of the definition - the stutter()
function - is not remarkably different from Your Average Procedural
Solution. The difference here is that we use encapsulation to hide
state variables, and provide "object"-like concept. We have a thingy
that creates weird noises when you feed stuff in it.
"Create a new Stutterizer object. Read lines from standard
input until you hit EOF, using stutter() method of our Stutterizer
object to process strings and printing out results."
Low-level languages
(Here, "low-level" means roughly "not really that easy for trained
monkeys to grasp at first, but really schweet if you happen to be a
computer".)
Stack-based languages
Examples: Forth, MUF, PostScript
Example program: (In Forth)
: stutter { s -- }
s bounds swap 3 - swap ?do
i c@ emit i 1 + c@ emit i 2 + c@ emit
loop
;
Most users of procedural languages who have never seen languages
like this will probably now stand there uttering "Oh, what the hell is
that?" and "You was right, there are more unreadable
languages than Perl". =)
Stack-based languages are, like the name implies, based on
stack. As such, they
- Take surprisingly little to run (many really small embedded
systems use Forth, for example), and
- Are hard to understand for humans but really friendly for computers.
For example, math in Forth follows RPN pretty closely. "2 3 +"
will put "5" to top of the execution stack. One might argue that
designers of Forth were just too damn lazy to convert stuff
from postfix to infix, but... hey, it's just machine/compiler
efficiency!
Above explained:
"Define new word (procedure) 'stutter'. It
takes one argument which we store to local variable s. This word
will not return anything. Take string, and figure out its
boundaries in the memory, leaving start and end
addresses to the top of the stack. Now, swap the addresses, put
number 3 on the top of the stack, and subtract the top number of stack
from second to topmost, leaving that to the top of the stack.
(Translation: Subtract 3 from the stack top number, that is, the end
address of the string s.) Now, swap the top two numbers
again. (Translation of the beginning part of this definition:
We take memory range and subtract 3 from the end address, leaving them
on same order on the top of the stack.) Using the topmost two
numbers as numeric memory address range, do the following: take next
numeric address (represented by variable i), fetch the character
(ASCII values!) at this location, and put it to the top of the
stack. Interpret this ASCII value from top of the stack as a character
and send it to standard output (emit). Repeat this twice, but adding 1
and 2 to the address i before fetching the value. Now, Okay, repeat
this for the whole address range. This concludes our new word."
All that for a couple of symbols. Now, I hope you see why these
programming languages, without proper documentation of the code,
tend to become write-only.
You can use this program by getting gforth, load the interpreter
up by saying gforth stutter.fs and say s" Something
here" stutter cr to see the effect.
Assembly
Examples: Glorious RISC Code, Inferior CISC Shit
Example program: (In 6502 assembly, for Commodore 64,
compileable with xa)
#include <xa/c64.s>
.word $C000
*=$C000
MAIN: LDA #<text ; Store start address to pointer P1
STA P1 ; on zero page (address $00fb)
LDA #>text
STA P1+1
LDX #$00
LOOP: LDY #$00 ; Unrolled for clarity and speed
LDA (P1),Y
JSR CHROUT
INY
LDA (P1),Y
JSR CHROUT
INY
LDA (P1),Y
JSR CHROUT
INC P1
INX
CPX tlen
BNE LOOP
LDA #13 ; Print carriage return.
JSR CHROUT
RTS
text: .byte "tHIS IS A TEST OF THE eMERGENCY bROADCAST sYSTEM."
tlen: .byte 47 ; Length of the text.
Sorry, I was too tired to write an input/output system, and this
code isn't exactly elegant. But it works, and it's fast...
Here, we descend deeper, deeper, deeper in the realm of the
Processor. What you are looking at now is a good symbolic description
of the real code the processor can execute.
Here's a description of the program:
"First, the start address of the text is stored to memory,
addresses 00FB and 00FC (collectively known with symbolic names P1 and
P1+1, but the latter isn't important from our point of view). The
character counter, stored in processor register X, is reset. In the
loop, we first reset our character pointer Y to 0. Then, we repeat
following three times: We take the address of the start of the string
from P1, add the character pointer from Y, and fetch the character
from that address to accumulator (register A), and then, we send the
character to screen using kernal (sic) routine CHROUT, after which we increment our character
pointer at Y. After these three repeats, increment the pointer value
at P1, and increment the character counter at X. If we have not yet
got tlen letters, jump back to loop start. Else, print
out carriage return and return from the machine-language subroutine.
(The fancy name for the addressing mode used here is "Indirect,Y".
Interestingly, Here the code is 43 bytes and data is 50 bytes...
Stuff missing from this document at the moment
Feel free to /msg me more suggestions or corrections!