Design has become a major focus of this course in programming, and
is scattered about, mostly at the front of every programming project. No
software of any consequence can be written without a good design, and we
hope our students, at the completion of our course, will be able to produce
a good up-front design for any program they set out to write, or at least
be able to code to a design prepared by somebody else, and if so, they
will be able to make a good living in the software industry.
The people in the educational community who try to teach software Design typically give it some other name, like "Thinking Computationally" and identify four components, but not always the same four components. Here are the most common of them:
Code Abstraction, also called "Successive Decomposition" (we call it "Top-Down Design" or TDD), where the program is structured logically as a tree (with the root at the top), and each layer adds more detail, until the bottom layer is the actual lines of code.Pattern Recognition, also called "Code Re-Use", where the program is examined for sequences of similar code, which is then merged into iteration or (more often) subroutines. This somewhat overlaps the Abstraction item above.
Algorithm Design, choosing the right sequence of operations to do the job. Rarely do programmers design their own algorithms. Mostly we know about the most important algorithms for doing a variety of things, and go search for whatever we don't know about. The internet has made this search very much easer.
Data Types and Structures, choosing appropriate data types (and classes, if appropriate) to do the job. If you have one or more groups of very similar (or identical) data elements, you will use arrays; if you have composite data clumps of very dissimilar elements, you will use classes. If you have a vast amount of data, or if you need persistence across runs, you will use files. If you have numbers, it helps to think about what kinds of numbers you have, thereby to choose integers or floating-point.
When I design a new program, I get up and walk away from the computer, pacing up and down across the floor, telling myself a story in rough abstract terms (in English), what the computer must do to run this program. Michael Schuff considers it helpful to think about the known inputs: Are they single items, or is there a sequence of inputs? What needs to be set up before you can do anything with these inputs? What operations do you perform on them after they are in? Stuff like that.
I think it is helpful to write down (in English) first a one-line title describing what this program does, then in three lines under that, the major parts of what it does.
For example, if you are playing a single round of Rock-Paper-Scissors (that would be the title), the first obvious thing that happens is the two players wave their fist up and down to get into synchronization. The purpose when two people play is to make sure both players expose their plays at the same time (otherwise whoever shows their play first is at a disadvantage to the other player who can see what they are going against). Then on the third down-beat, both players show their play at the same time, then finally the two plays are compared to see who won that hand. Later on you can add an iteration around the round to repeat it.
If you are doing a calculator, the three parts might be to get the inputs (two numbers and an operator), then do that operation on those inputs, then display the result. Later on you can add an iteration around this basic form to combine several operations into a session.
Most programs, there is no one way to decompose it into three component lines. If you are doing a card game, you might combine shuffling and dealing into the first step, then iterate the individual plays (one line in the top level) then do your scoring. Or you could separate out the shuffling and dealing as separate steps, and then combine the plays and betting into a single iteration. The important thing is to ask yourself is this a part of the game or program that I can give a short name to, and explain in a single sentence?
Even more important, Do you understand (in your head) what this program must do? Can you write a short narrative (in English) telling that operation as a story? Can you break that story into chapters and paragraphs and sentences? That's what your program is: the chapters are classes, the paragraphs are subroutines, and the sentences are lines of code. Write your chapter titles first, then ignore all but the one you are looking at, and write its parts. You cannot keep the whole program in mind at once, so you must ignore (forget) all the parts you are not currently working on. Do them later, or earlier, but not now.
This is the same skill they call "outline" when they teach it in English composition. Learn it there, use it here. Or else learn it here, use it there. It's a terribly important life skill, it will also help you sell your product to the customer (if you become a salesman), or pitch a project to the Vice-President of Engineering in a corporate environment, or give a campaign speech if you run for office. If you are a soccer coach or a scout leader, it will help you tell the kids in your troop or team how to cooperate for this hike or game. If you decide to write a novel, it can make the difference between a best seller and the dust bin. If you become a painter, it will help you organize your image so it is pleasing to the eye. As a computer programmer, you cannot even imagine a complete program without breaking it into component parts, which you fill out one at a time.
The most important skill in programming is hiding what you are not working
on right now. I write big humongous subroutines, but they are impossible
to understand. Break your code and data into smaller chunks, you will be
a better programmer.
Some programs do a lot of multiply-and-add, multiply-and-add, multiply-and-add,
you can make an iteration to do that. If you are writing three card games,
they all shuffle the deck, so write a single subroutine to shuffle, then
use the same subroutine everywhere. Look for the similarities.
Every programming language has primitive data -- numbers, booleans (True or False), characters and sometimes whole words (strings of characters treated as a unit). All modern programming languages (usually not including Assembly, which is a human-readable version of machine language) also let you specify aggregate data, either collections of named data fields where you can refer to the whole collection by name -- older languages call these "records" but newer languages like Java they are called class data -- or else a linear array of the same type, accessed by a numerical index. A file is a (possibly permanent) array of records (or any kind of data) usually stored in a non-volatile mass storage like disk or tape. The file can be much larger than you have memory space for, so this is the classic way to process vast amounts of data, or data that must survive to another day.
Relational database files started the practice of structuring the data to minimize duplications, and it's a good idea when you write complex Java programs also. For example a business might have a file of all their customers, names and addresses and phone numbers, and another file of invoices with some of the same customer data printed on the invoices, as well as the particular transactions for the current billing cycle. Rather than duplicate the customer data in both files, they put just a reference to the customer (typically a unique customer number) into the invoice record in the file, then the database can be told to build the invoice from both files, where the customer number is called a relation between the files (or "tables").
You can do the same thing in your program. For example, if you are making a card game, there is one deck of cards with (for example) 52 unique cards, which you can refer to by a simple number (perhaps the order the new deck came in its box), and then have a string array, the display name for each card, and another numeric array, each card's rank when scoring. Most of your processing then works with simple integers, perhaps small arrays for each player's hands, and another for the undealt deck and another for the discard pile. Dealing and shuffling and discarding then deal with simple integers, the fastest of all computer operations.
Another example, the game Tetris has a rectangular grid for its game board, so the natural data structure is a 2-dimensional array, where each cell could be one of the tetronimo colors, or else blank, but a better cell content would be a reference number for the particular tetronimo, which in turn would be a record of a 4-element array (the cells it currently occupies), plus color and shape. You still need to update all four cells when you rotate or move a tetronimo, but all the information you need is in that one record.
More TBA
Tom Pittman
Rev. 2022 November 3