Now that you have learned about operating systems, let's go into another type of program, utilities. In addition to the utility commands (like diskcopy and rename), which are built into the operating system, you will probably have some independent utility programs. These are standard programs that run under control of the operating system just like your applications programs. They are called utilities because they perform general types of functions that have little relationship to the content of the data. Utility programs eliminate the need for programmers to write new programs when all they want to do is copy, print, or sort a data file. Although a new program is not needed, we do have to tell the program what we want it to do. We do this by providing information about files, data fields, and the process to be used. For example, a sort program arranges data records in a specified order. You will have to tell the sort program what fields to sort on and whether to sort in ascending or descending sequence.
Let's examine two types of utility programs to give you some idea of how a utility program works. The first will be sort-merge and the second the report program generator (RPG).
Sorting is the term given to arranging data records in a predefined sequence or order. Merging is the combining of two or more ordered files into one file. As an example of sorting, we normally put a list of people's names in alphabetical order by arranging them in sequence by last name and then arranging those with the same last name in order by first name.
If we do this ourselves, we know the alphabetic sequence (B comes after A, C after B, and so on), and it is easy to arrange the list, even if it is a time-consuming job. On a computer, the sequence of characters is also defined. It is called the collating sequence. Every coding system has a collating sequence. The capability of a computer to compare two values and determine which is greater (B is greater than A, C is greater than B, and so on) makes sorting possible. What about numbers and special characters? They are also part of the collating sequence. In EBCDIC (a computer code that is discussed in detail in chapter 4), special characters, such as #, $, &, and *, come before the alphabetic characters, and numbers come after the alphabetic characters. When you sort records in the defined sequence, they are in ascending sequence. Most sort programs also allow you to sort in reverse order. This is called descending sequence. In EBCDIC, descending sequence is 9-0, Z-A, then special characters.
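The effect of the collating sequence can be seen by sorting the same characters under two coding systems. The following is a minimal sketch in Python, not part of any particular sort utility. It assumes that Python's `cp037` codec (one EBCDIC code page) is an acceptable stand-in for EBCDIC in general, and the sample characters are chosen only for illustration.

```python
# Compare the order that two coding systems give the same characters.
# Assumption: cp037 (one EBCDIC code page) stands in for EBCDIC here.
SAMPLE = ["9", "A", "#", "Z", "0", "&"]


def collate(chars, encoding):
    """Sort characters by the byte value each has in the given coding system."""
    return sorted(chars, key=lambda ch: ch.encode(encoding))


ebcdic_ascending = collate(SAMPLE, "cp037")
ascii_ascending = collate(SAMPLE, "ascii")

# Descending sequence is simply the ascending sequence reversed.
ebcdic_descending = list(reversed(ebcdic_ascending))

print("EBCDIC ascending: ", ebcdic_ascending)
print("ASCII ascending:  ", ascii_ascending)
print("EBCDIC descending:", ebcdic_descending)
```

Under these assumptions, the EBCDIC result places the special characters first, the letters next, and the digits last, while the ASCII result places the digits ahead of the letters. The same file can therefore come out in a different order on systems that use different codes.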
To sort a data file, you must tell the sort program what data field or fields to sort on. These fields are called sort (or sorting) keys. In our example, the last name is the major sort key and the first name is the minor sort key.
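The idea of major and minor sort keys can be sketched in a few lines of Python. The records and the field names (`last`, `first`) are hypothetical and exist only for this illustration. Note also that Python compares strings by Unicode code point, not by the EBCDIC collating sequence.

```python
from operator import itemgetter

# Hypothetical records; the field names are assumptions for this sketch.
people = [
    {"last": "Smith", "first": "Mary"},
    {"last": "Jones", "first": "Tom"},
    {"last": "Smith", "first": "Alan"},
    {"last": "Adams", "first": "Ruth"},
]

# The major key (last name) is listed first, the minor key (first name) second.
ascending = sorted(people, key=itemgetter("last", "first"))

# The same keys in descending sequence.
descending = sorted(people, key=itemgetter("last", "first"), reverse=True)

for person in ascending:
    print(person["last"], person["first"])
```

The minor key matters only when two records have the same major key, as with the two Smith records here.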
Sorting is needed in many applications. For example, for mailing we need addresses in ZIP Code order; personnel records may be kept in service number order; and inventory records may be kept in stock number order. We could go on and on. Because many of our files are large, sorting is very time consuming, and it is one of the processes most used on computer systems. As a user, you will become very familiar with this process.
Sort-merge programs usually work in phases. First they initialize: read the parameters, produce the program code for the sort, allocate the memory space, and set up other functions. The sort-merge program then reads in as many input data records as the allocated memory space can hold, arranges (sorts) them in sequence, and writes them out to an intermediate sort-work file. It continues reading input, sorting, and writing intermediate sort-work files until all the input is processed. It then merges (combines) the ordered intermediate sort-work files to produce one output file in the sequence specified. The merging process can be accomplished with less memory than the sort process because the intermediate sort-work files are all in the same sequence. Records from each work file can be read, the sort keys compared based on the collating sequence and sort parameters, and records written to the output file maintaining the specified sequence.
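The sort and merge phases described above can be sketched as follows. This is a simplified illustration in Python, not the design of any particular sort-merge utility. In-memory lists stand in for the intermediate sort-work files, and the function name `sort_merge`, the `memory_limit` parameter, and the stock numbers are all assumptions made for the example.

```python
import heapq


def sort_merge(records, memory_limit, key=None):
    """Sort records in memory-sized batches, then merge the sorted batches.

    Returns the ordered output and the number of work files used.
    """
    work_files = []  # stand-ins for the intermediate sort-work files
    buffer = []

    # Sort phase: fill the buffer, sort it, and write it out as a work file.
    for record in records:
        buffer.append(record)
        if len(buffer) == memory_limit:
            work_files.append(sorted(buffer, key=key))
            buffer = []
    if buffer:
        work_files.append(sorted(buffer, key=key))

    # Merge phase: only the next record of each work file is compared.
    output = list(heapq.merge(*work_files, key=key))
    return output, len(work_files)


# Hypothetical stock numbers, with room for three records in memory at a time.
stock_numbers = [507, 112, 903, 248, 661, 35, 790, 424]
output, work_file_count = sort_merge(stock_numbers, memory_limit=3)
print(work_file_count, output)
```

Because each work file is already in sequence, the merge step needs to look at only one record from each file at a time, which is why it can run with less memory than the sort step.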
REPORT PROGRAM GENERATORS
Report program generators (RPG) are used to generate programs to print detail and summary reports of data files. Figure 3-1 is an example of a printed report. RPGs were designed to save programming time. Rather than writing procedural steps in a language like BASIC or COBOL, the RPG programmer writes the printed report requirements on specially designed forms.
Figure 3-1. - Printed report using a report program generator (RPG) program.
Included in the requirements are an input file description, the report heading information lines, the input data record fields, the calculations to be done, and the data fields to be printed and summarized. The RPG program takes this information and generates a program for the specific problem. You then run that program with the specified input data file to produce the printed report. The input data file must be in the sequence in which you want the report to summarize the data.
In our example (fig. 3-1), we summarized requisitions based on unit identification codes (UIC). We first sorted the input data file on the field that contains the UIC. We provided specifications to the RPG program to tell it to accumulate totals from the detail (individual) data records until the UIC changed. We then told it to print the total number of requisitions and total cost for that UIC. We did not have it print each detail record, although we could have. The UIC is called the control field. Each time the control field changes, there is a control break. Each time there is a control break, the program prints the summary information. After all records are read and processed, it prints a summary line (TOTALS) for all UICs. You can also use RPGs to generate a program to update data files.
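The control-break logic in this example can be sketched in Python. This shows the general technique only; it is not the program an RPG would generate. The UIC values and costs are invented, and the function name `summarize` is an assumption. The input must already be sorted on the control field.

```python
def summarize(records):
    """Summarize (uic, cost) records that are already sorted on uic.

    Returns one (uic, requisition_count, total_cost) line per UIC,
    followed by a final TOTALS line for all UICs.
    """
    lines = []
    current_uic = None
    count = cost = 0
    grand_count = grand_cost = 0

    for uic, amount in records:
        # Control break: the control field changed, so emit the summary.
        if current_uic is not None and uic != current_uic:
            lines.append((current_uic, count, cost))
            count = cost = 0
        current_uic = uic
        count += 1
        cost += amount
        grand_count += 1
        grand_cost += amount

    # The end of the input acts as a final control break.
    if current_uic is not None:
        lines.append((current_uic, count, cost))
    lines.append(("TOTALS", grand_count, grand_cost))
    return lines


# Hypothetical requisitions (UIC, cost in whole dollars), sorted on UIC.
requisitions = [
    ("N00123", 150), ("N00123", 75),
    ("N00456", 200),
    ("N00789", 40), ("N00789", 60), ("N00789", 100),
]
report = summarize(requisitions)
for line in report:
    print(line)
```

If the input were not sorted on the UIC, the same UIC could cause several control breaks and appear on several summary lines, which is why the input file must be in the sequence of the report.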
Q.6 What programs eliminate the need for programmers to write new programs when all they want to do is copy, print, or sort a data file?
Programmers must use a language that can be understood by the computer. Several methods can achieve human-computer communication. For example, let us assume the computer only understands French and the programmer speaks English. The question arises: How are we to communicate with the computer? One approach is for the programmer to code the instructions with the help of a translating dictionary before giving them to the processor. This would be fine so far as the computer is concerned; however, it would be very awkward for the programmer.
Another approach is a compromise between the programmer and computer. The programmer first writes instructions in a code that is easier to relate to English. This code is not the computer's language; therefore, the computer does not understand the orders. The programmer solves this problem by giving the computer another program, one that enables it to translate the instruction codes into its own language. This translation program, for example, would be equivalent to an English-to-French dictionary, leaving the translating job to be done by the computer.
The third and most desirable approach from an individual's standpoint is for the computer to accept and interpret instructions written in everyday English terms. Each of these approaches has its place in the evolution of programming languages and is used in computers today.
With early computers, the programmer had to translate instructions into the machine language form that the computers understood. This language was a string of numbers that represented the instruction code and operand address(es).
In addition to remembering dozens of code numbers for the instructions in the computer's instruction set, the programmer also had to keep track of the storage locations of data and instructions. This process was very time consuming, quite expensive, and often resulted in errors. Correcting errors or making modifications to these programs was a very tedious process.
In the early 1950s, mnemonic instruction codes and symbolic addresses were developed. This improved the program preparation process by substituting letter symbols (mnemonic codes) for basic machine language instruction codes. Each computer has mnemonic codes, although the symbols vary among the different makes and models of computers. The computer still uses machine language in actual processing, but it first translates the symbolic language into its machine language equivalent. Symbolic languages have many advantages over machine language coding. Less time is required to write a program. Detail is reduced. Fewer errors are made. Errors that are made are easier to find, and programs are easier to modify.
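The translation from symbolic language to machine language can be illustrated with a toy translator written in Python. Everything in it is invented for the example: the mnemonics, the numeric instruction codes, the five-digit instruction format, and the storage addresses do not belong to any real computer.

```python
# Hypothetical instruction set: mnemonics and numeric codes are invented.
OPCODES = {"LOAD": 10, "ADD": 20, "STORE": 30, "HALT": 99}


def assemble(source, symbols):
    """Translate symbolic instructions into strings of digits.

    Each result is a 2-digit instruction code followed by a 3-digit address.
    """
    machine_code = []
    for line in source:
        parts = line.split()
        mnemonic = parts[0]
        # Look up the symbolic address; instructions without one use 000.
        address = symbols[parts[1]] if len(parts) > 1 else 0
        machine_code.append(f"{OPCODES[mnemonic]:02d}{address:03d}")
    return machine_code


# Symbolic addresses mapped to assumed storage locations.
symbols = {"PRICE": 100, "TAX": 101, "TOTAL": 102}
program = ["LOAD PRICE", "ADD TAX", "STORE TOTAL", "HALT"]

machine_code = assemble(program, symbols)
print(machine_code)
```

The programmer writes `ADD TAX`; the translator, not the programmer, keeps track of the numeric code for ADD and of the storage location assigned to TAX.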
The development of mnemonic techniques and macroinstructions led to the development of procedure-oriented languages. Macroinstructions allow the programmer to write a single instruction that is equivalent to a specified sequence of machine instructions. These procedure-oriented languages are oriented toward a specific class of processing problems. A class of similar problems is isolated, and a language is developed to process these types of applications. Several languages have been designed to process problems of a scientific-mathematical nature and others that emphasize file processing.
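A macroinstruction can be pictured as a template that expands into several symbolic instructions. The Python sketch below uses an invented macro named `SUM` and invented mnemonics. It shows only the expansion idea, not the macro facility of any real assembler.

```python
# Hypothetical macro table: one macro name maps to a sequence of instructions.
# {0}, {1}, and {2} are placeholders for the macro's operands.
MACROS = {"SUM": ["LOAD {0}", "ADD {1}", "STORE {2}"]}


def expand(source):
    """Replace each macroinstruction with the instructions it stands for."""
    expanded = []
    for line in source:
        name, *operands = line.split()
        if name in MACROS:
            expanded.extend(t.format(*operands) for t in MACROS[name])
        else:
            expanded.append(line)  # ordinary instructions pass through
    return expanded


expanded = expand(["SUM PRICE TAX TOTAL", "HALT"])
print(expanded)
```

The single line `SUM PRICE TAX TOTAL` becomes three instructions, which is the saving in programmer effort that macroinstructions provide.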
Procedure-oriented languages were developed to allow a programmer to work in a language that is close to English or mathematical notation. This improves overall efficiency and simplifies the communications process between the programmer and the computer. These languages have allowed us to be more concerned with the problems to be solved rather than with the details of computer operation. For example:
COBOL (COmmon Business Oriented Language) was developed for business applications. It uses statements of everyday English and is good for handling large data files.
FORTRAN (FORmula TRANslator) was developed for mathematical work. It is used by engineers, scientists, statisticians, and others where mathematical operations are most important.
BASIC (Beginner's All-Purpose Symbolic Instruction Code) was designed as a teaching language to help beginning programmers write programs. Therefore, it is a general-purpose, introductory language that is fairly easy to learn and to use. With the increase in the use of microcomputers, BASIC has regained popularity and is available on most microcomputer systems.
Other languages gaining in popularity are PASCAL and Ada. PASCAL is being used by many colleges and universities to teach programming because it is fairly easy to learn, yet it is a more powerful language than BASIC. Although PASCAL is not yet a standardized language, it is used rather extensively on microcomputers. It offers greater programming capabilities on small computers than are possible with BASIC.
Ada's development was initiated by the U.S. Department of Defense (DOD). Ada is a modern general-purpose language designed with the professional programmer in mind, and it has many unique features to aid in the implementation of large-scale applications and real-time systems. Because Ada is so strongly supported by the DOD and other advocates, it will become an important language like those previously mentioned. Its primary disadvantage relates to its size and complexity, which will require considerable adjustment on the part of most programmers.
The most familiar of the procedure-oriented languages are BASIC and FORTRAN for scientific or mathematical problems, and COBOL for file processing.
Programs written in procedure-oriented languages, unlike those in symbolic languages, may be used with a number of different computer makes and models. This feature greatly reduces reprogramming expenses when changing from one computer system to another. Other advantages to procedure-oriented languages are (1) they are easier to learn than symbolic languages; (2) they require less time to write; (3) they provide better documentation; and (4) they are easier to maintain. However, there are some disadvantages of procedure-oriented languages. They require more space in memory, and they process data at a slower rate than symbolic languages.
Q.11 With early computers, the programmer had to translate instructions into what type of language form?