by Dennis Allison, Bernard Greening, happy Lady, & lots of Friends
(reprinted from People's Computer Company Vol. 4, No. 3)
Dear People,
After a quick Pique at TINY BASIC I have the following (possibly ill-considered) comments:
1. It looks useful for tiny computers, which is as intended.
2. Those accustomed to extended BASIC, or even the original Dartmouth BASIC, will be irked by its limitations. But then, that's how the bits byte!
3. How does the interpreter scan the word THEN in an IF statement?
4. Some of the comments for EXPR seem to be on the wrong line, or my reading is more biased than usual.
5. Users should note that arithmetic expressions are evaluated left-to-right unless subexpressions are parenthesized (i.e., there is no implicit operator precedence).
6. Real numbers would be nice, but would take up a lot more space. Probably too much. Ditto for arrays and string variables.
7. Please consider adding semicolon (i.e., unzoned) PRINT format with a trailing semicolon inhibiting the CRLF. This would be very useful and would be easy to add.
8. If INPUT prompts with a question mark, please print a blank character after the question mark (for readability).
9. I suggest allowing THEN as a separator in any multi-statement line, not just in IF statements. Since lines like
IF 5<X THEN IF X<10 THEN GOSUB 100
are already legal, why not allow lines like
LET A=B THEN PRINT A
or any other combination, including silly ones like
GOTO 200 THEN INPUT Z
the second statement of which would never be executed. If THEN works for IF, it should be possible to make it work for anything.
10. I also suggest allowing comments somehow. At present, comments are possible via subterfuges such as
IF X<>X THEN PRINT "THIS IS A COMMENT"
but that seems kind of gauche. Naturally, comments must be held to a minimum in TINY BASIC, but sometimes they may be vital.
11. Doing a
PRINT " "
seems to be the only way to print a blank line. Well, all right.
12. Exponentiation via would seem fairly easy to add, and might be worthwhile.
13. By the way, all of this will execute in 1 K, won't it?
Jim Day
17042 Gunther St.
Granada Hills, CA 91344
Answering your Questions by number where appropriate:
3&4. Woops! There should be a TST instruction to scan the THEN. The comments are displaced a line. See the corrected IL listing in this issue.
5. Expressions are evaluated left-to-right with operator precedence. That is, 3+2*5 gives 13 and not 25. To see this, note that the routine EXPR which handles addition gets the operands onto the stack by calling TERM, and TERM will evaluate any product or quotient before returning.
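The EXPR/TERM arrangement described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the published IL: the function names, the tokenizer, and the use of Python's floor division for "/" are all assumptions made for the sketch.

```python
import re

def evaluate(text):
    """Evaluate an integer expression the way EXPR/TERM nesting implies."""
    # Assumed token set: integers, + - * /, and parentheses.
    tokens = re.findall(r"\d+|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok == "(":
            value = expr()
            advance()  # consume the closing ")"
            return value
        return int(tok)

    def term():
        # TERM finishes every product or quotient before returning.
        value = factor()
        while peek() in ("*", "/"):
            op = advance()
            rhs = factor()
            value = value * rhs if op == "*" else value // rhs
        return value

    def expr():
        # EXPR handles addition and gets each operand by calling TERM.
        value = term()
        while peek() in ("+", "-"):
            op = advance()
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    return expr()

print(evaluate("3+2*5"))    # multiplication is done inside term()
print(evaluate("(3+2)*5"))  # parentheses force the addition first
```

Because expr() only ever sees values that term() has already reduced, precedence falls out of the call structure with no explicit precedence table.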
7. Agreed, but this is intended as a minimal system.
9. One man's syntactic sugar is another's poison. I don't like the idea. Incidentally, how would you interpret
LET A=B THEN GOSUB 200 THEN PRINT 'A'
The GOSUB then has to store a program address which botches up the line entry routine or one has to zap the GOSUB stack when an error is found. Both are solved only by kludges.
10-12. See 7.
13. Maybe. But 2K certainly. See below.
Dear PCC,
I am thrilled with your idea of an IL but I think that if you intend only to write a BASIC interpreter that a good symbolic assembler would be appropriate. With an assembler similar to DEC's PAL 3 or PAL 8 the necessary routines could be written and used in nearly the same way without having to write the associated run time material that would be necessary for its use as an interpreter. A command decoder, a text buffer, and a line editor would be necessary and all of this uses up a good amount of space in memory.
If you are aware of all these things and still plan to develop an IL interpreter, then I suggest you start as DEC did with a simple symbolic editor as the backbone of the interpreter. In this way you allow a 2800% increase in development and debugging speed (according to Datamation's comparison of interpreters and compilers whose fundamental difference is the on line editing capability). Once this has been implemented and IL is running on a particular system then the development of interpreters of all types is greatly simplified. By suggesting IL you have stumbled onto the most logical and easiest way to develop a complete library of interpreters. In addition to BASIC, it is very easy to write interpreters for: FOCAL, ALGOL, FORTRAN, PL/1, LISP, COBOL, SNOBOL, PL/M, APL, and develop custom interpreters with the ease with which one would write a long BASIC program!
As I pointed out earlier, all these features take up memory space and, as you have pointed out, run time is much slower. The way around this is to define the IL commands in assembly language subroutines then assemble the completed interpreter as calls to these subroutines. Thus the need for the IL interpreter as a run time space and time consumer is no longer necessary! (OK symbolic assembler haters, let's see you do this in machine language in less than ten man-years!)
In places where time and space are not so much of a problem, I suggest the addition of an interrupt handler and priority scheduler to allow IL to be used as a simplified and painless TIMESHARED system enabling many users to run in an interpreter and use more than one interpreter at once. Multi-lingual timeshare systems, previously available only to those who have a high-speed swapping disk, drum, or virtual memory, are now available to the user who has about 16K of memory and a method of equitably bringing interpreters in to main memory from the outside world (a paper tape reader or cassette system is the easiest to come by).
In short, IL as I suggested, in its minor stages would be a powerful software development aid; and in its final, most complex stages would provide a runtime system of unheard of inexpense.
I have heard from unofficial sources that ordinarily an interpreter or compiler requires ten man-years to write and debug to the point of use (if one man works the job would require 10 years, if 10 men work it would take one year). Since this is to be expected as the initial development of IL and since I have a general idea of the circulation of PCC, we should have IL up and running by the next issue of PCC!!
At this time I would like to request a few reprints of the article dealing with IL because I want to get some help from others in my school in getting a timeshared version working on our 16K PDP 8/m with DECTAPE. I seem to have lent my copy of that issue to one of the people I had been trying to get on this project and he has not returned it to me. Meanwhile, I need the article to begin initial work on the interpreter to insure compatibility with the version coming across through PCC. I will keep you posted with regard to the development.
William Cattey
39 Pequet Road
Wallingford, Ct. 06492
The IL approach to implementation is quite standard and dates back to Schorre's META II, Glennie's Syntax Machine, and numerous early compilers. It was widely used in the Digitek FORTRAN systems. We did not "stumble" on to the technique; we chose it with some deliberation.
You are right that a symbolic assembler can be used either to assemble the pseudocode into an appropriate form or to expand the pseudocode into actual machine instructions with the attendant cost in space (and decrease in execution time). Our goal is a small, easily transportable system. The interpretive approach seems consistent with this primary goal. We are using the Intel 8080 assembler's macro facility to assemble our pseudocode.
I certainly agree that it is relatively easy (but not simple!) to implement other languages using the IL approach. From the user's standpoint, provided he is not compute bound, there is little difference. Interpreters are often a bit more forgiving of errors and can give better diagnostics.
In my experience, your figure of 10 man-years is high for some languages and low for others. A figure of two to four man-years is probably more accurate, and that includes documentation at both the implementation and user level. Good luck on your implementation.
....I have found in my adaptation of it (TINY BASIC IL) for full use that certain commands need strengthening, while some might be dropped. I will hopefully be coming out with these possible modifications. Concerning my ideas on space trade-offs; I think an assembled version would take less space, since each command is treated as a subroutine call in a program made up of routines, while the interpreter needs a run time system in the background which, since it is interpretive in itself, takes up space.
P.S. You missed my allusion to assembler over strictly octal or hexadecimal op codes (my meaning was twofold). In DEC's PAL8 assembler the following syntax is needed to make the most efficient use of routine calling:
TSTN=JMSI (jump to subroutine indirectly
via this location)
100 XTSTN
The assembler shows the binary as if TSTN were like a JMSI 100/ JMP to subroutine indirectly via 100 (requiring very very little extra space per routine-one word, to be exact).
I would be happy to resolve any questions regarding compilers vs. interpreters. (Datamation did an article on the writing of a standard program in several languages then documented development and run time.)
William Cattey
There are several different varieties of interpreters. One is simply a sequence of subroutine calls. Another is, as you suggest, a list of indirect references to subroutine calls. We are considering a different organization where the call address and some additional information are packed into a single byte. This is a good strategy vis a vis memory conservation only if the size of the code memory to decode the packed instruction plus the size of the encoded instructions is smaller than the size of a more straightforward encoding. This remains to be seen.
I guess I did miss your point on assemblers. However, let me assure you that I would never advocate making software by programming directly in hex or binary. Even an assembler seems cumbersome and difficult to me; I prefer a good systems language like PL/M!
Dear Dennis and other PCCers,
In my last crazily jumbled letter I made some comments about TINY BASIC. Here is the result of 2-3 days work and thinking about it. Instead of having an interpretive IL, I chose to set it up as detailed as possible, then have people with different machines code up subroutines to perform each IL instruction. I'm not convinced that this way would take more space, and I'm sure it would be faster.
There are a couple of changes in the syntax from your published version: separate commands from statements, add terminal comma to PRINT, and restrict IF-THEN to a line number (implied GOTO).
The semantics are separated out from the syntax in IL as much as possible. This should make it easier to be clear about the results of any given syntactic structure. This is most apparent in the TST instructions, and the elimination of the NXT instruction. That one in particular was a confusion.
Please let me know how this fits with what you're doing. I don't have a micro yet--time, not money prevents it.
John Rible
51 Davenport St.
Cambridge, MA 02140
Because of space limitations, we have not been able to publish all of John Rible's version (dialect) of TINY BASIC. We'll probably include it in the first issue of the TINY BASIC NEWSLETTER. Limited space requires it to be in 2nd issue.
By separating the syntax from the semantics he has produced a larger and possibly simpler to understand IL. There are more IL instructions so, I believe, the resultant system will be larger; further, since execution time is roughly proportional to the number of IL instructions (decoding IL is costly), it will be slower.
INTERMEDIATE LANGUAGE PHILOSOPHY
Instead of IL being interpreted, my goal has been to describe IL well enough that almost anyone will be able to code the instructions as either single machine language instructions or small subroutines. Besides speeding up TINY BASIC, this should decrease its size. Most of the instructions are similar to those of Dennis' (PCC V4 no. 2), but the syntactic part has been separated from the active routines. This would be useful if you want the syntax errors to be printed while inputting the line, rather than when RUNning the program.
Most subroutines (STMT, EXPR, etc.) are recursively called, so in addition to the return address being stacked, all the related data must be stacked. This can use up space quickly.
SYNTAX for John Rible's version of TINY BASIC (omitted)
Dear Mr. Allison,
I was very interested in your Tiny BASIC article in PCC. Your ideas
seem quite good. I have a few suggestions regarding your IL system. I hope
I am not being presumptuous or premature with this. Unless I misunderstood
you, your IL encoding scheme seems inadequate. For instance, IL JMPs must
be capable of going up and down from the current PC. This means allotting
one of the 6 remaining bits of the IL byte as a sign bit resulting in
a maximum PC change of +/-31 which is not adequate in some cases, i.e.
the JMP from just above S17 back to START. May I suggest the following
scheme which is based on 2 bytes per IL instruction:
          IL                ML
     JMP     CALL      TST     CALL
     0XX₈    1XX₈      2XX₈    1XX₈     (1st byte)
     YYY₈    YYY₈      YYY₈    YYY₈     (2nd byte)
Where XX= lower 6 bits of high part of address (assume upper 2 bits
are 00)
YYY= all 8 bits of low part of address.
The complete address being 0XXYYY₈. These addresses represent the locations associated with the IL and ML instructions. Note that if a points to a table with a stored address, you have 3 bytes used-- my scheme uses only 2 bytes with the same basic information.
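The two-byte layout proposed above can be sketched in Python as a pack/unpack pair. The function names are hypothetical, and the sketch assumes only what the letter states: the top 2 bits of the first byte select the instruction type, its lower 6 bits are the high part of the address, and the second byte is the low part.

```python
def encode(op, address):
    """Pack a 2-bit instruction type and a 14-bit address into two bytes."""
    if not 0 <= op <= 3:
        raise ValueError("instruction type must fit in 2 bits")
    if not 0 <= address < (1 << 14):
        raise ValueError("address must fit in 14 bits")
    first = (op << 6) | (address >> 8)   # type bits, then XX (high 6 bits)
    second = address & 0xFF              # YYY (low 8 bits)
    return first, second

def decode(first, second):
    """Recover the instruction type and the 14-bit address."""
    op = first >> 6
    address = ((first & 0x3F) << 8) | second
    return op, address

# The byte pair 240, 006 (octal) used in the letter's TST example:
op, address = decode(0o240, 0o006)
print(op, oct(address >> 8), oct(address & 0xFF))
```

Note that the letter writes an address as its high part followed by its low byte, each in octal, so the printed high and low parts are what correspond to the digits in the text rather than a single octal number.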
I also wondered about the TST character string. In my implementation
I am using the following technique: the string follows the TST byte pair
immediately with a bit 7 set in the last character.
Example:    240       TST, fail address is 040006₈
            006
            0 {L}
            0 {E}
            1 {T}
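A minimal Python sketch of this string convention follows. The helper names are hypothetical; the only thing taken from the letter is that bit 7 is set on the last character of the string and clear on the others.

```python
def encode_keyword(word):
    """Store an ASCII keyword with bit 7 set on its final character."""
    data = bytearray(word.encode("ascii"))
    data[-1] |= 0x80
    return bytes(data)

def match_keyword(table, start, text, cursor):
    """Compare the keyword stored at table[start:] against text[cursor:].

    Returns the cursor position just past the keyword on success,
    or None on failure (the caller would then take the fail address).
    """
    i = start
    while True:
        byte = table[i]
        expected = chr(byte & 0x7F)      # strip the terminator flag
        if cursor >= len(text) or text[cursor] != expected:
            return None
        cursor += 1
        if byte & 0x80:                  # bit 7 marks the last character
            return cursor
        i += 1

stored = encode_keyword("LET")
print([hex(b) for b in stored])
print(match_keyword(stored, 0, "LET A=B", 0))
print(match_keyword(stored, 0, "LIST", 0))
```

No length byte or terminating zero is needed, which is where the saving comes from; the cost is that the characters must be 7-bit.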
On the TSTL, TSTV, and TSTN IL's, it appears you need a ML address for the particular subroutine and 2 additional bytes for the fail address. At least this is how I am handling it.
I am looking forward to future articles in the series. Thanks again-- keep up the good work!
P.S. I am co-owner of an Altair. We are writing our Tiny BASIC in Baudot to feed our Model 19's.
Richard Whipple
305 Clemson Dr.
Tyler, Tx. 75701
We found the same problem with the published IL interpreter. We solved it by doing a bit of rearranging and introducing a new operations code which does jumps relative to the start of the program, but has the same basic encoding. Your mechanization will, of course, work, but requires one more byte per IL instruction, may be harder to implement on some machines, and takes more code.
We are using the same scheme of string termination (i.e., using the parity bit) as you are. It's simple, easy to test, and difficult to get into the assembler.
There are a few errors and oversights in the IL language and in the interpreter you didn't mention. See the new listing in this issue.
Good luck. Keep us informed of your progress.
Dear People at PCC,
I have a couple of comments on Tiny BASIC:
S4 says TST S7, but S7 got left out. T1 says TST on my paper which I suppose should be TST T2.
What is LIT and all these "or 2000"? When are we going to start putting some of this into machine code?
Sincerely,
BOB BEARD
2530 Hillegass, No. 109
Berkeley CA 94704
Soon! Ed.
Dear Tiny BASIC Dragon,
Please scratch my name onto your list for Tiny BASIC Vol. 1. Enclosed is a coupon for 3 "chunks of fire".
I am really enjoying my subscription to PCC, especially the article on Tiny BASIC.
Someday I am going to build an extended Tiny BASIC that will take over the world.
Basically yours,
RON YOUNG
2505 Wilburn, No. 144
Bethany OK 73008
Since the last issue came out, the IL code, macro definitions for each IL instruction, a subroutine address table for the assembly language routines that execute the IL functions, the assembly language code that executes the IL functions (all except the 16-bit arithmetic ones), and the IL processor have been punched on paper tape in source form.
HOP, TST, TSTN, and TSTL now do branches ±32 relative to the current position counter. If the relative branch field has a zero in it, indicating a branch to "here", the IL processor prints out the syntax error message with the line number. The ERR instruction that was in the old IL code no longer exists.
IJMP and ICALL are used because the Intel 8080 assembler uses JMP and CALL as mnemonics for 8080 instructions. IJMP and ICALL are followed by one byte with an unsigned number from 0 to 255. This is added to START to do an indexed jump or call.
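The two kinds of branch target can be sketched in Python as follows. This is an illustration under assumptions, not the IL processor's code: the names are hypothetical, and because the note does not say how the relative field is packed into the instruction byte, the sketch takes the offset as an already-decoded signed integer.

```python
class ILSyntaxError(Exception):
    """Raised where the IL processor would print its syntax error message."""

def indexed_target(start, operand):
    """IJMP/ICALL: the operand byte is an unsigned offset added to START."""
    if not 0 <= operand <= 255:
        raise ValueError("operand must fit in one unsigned byte")
    return start + operand

def relative_target(pc, offset):
    """HOP/TST/TSTN/TSTL: branch relative to the current position counter.

    A zero offset would be a branch to "here", so it is treated as the
    signal for a syntax error instead of a real branch.
    """
    if offset == 0:
        raise ILSyntaxError("syntax error")
    return pc + offset

START = 0x0100  # hypothetical load address of the IL program
print(hex(indexed_target(START, 0x2A)))
print(relative_target(40, -12))
```

Reusing the zero offset as the error signal is what lets the separate ERR instruction disappear: a failing test simply branches to "here".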
Bernard