
Computer Programming Languages Draft (incomplete) C. Herbert 9/15/2005

Computer programming languages

Low-level vs. high-level languages

Machine code

Assembly language

Interpreters and compilers

Internal vs. external storage

Source code, object code, executable files

Development of high-level languages

Teaching languages vs. production languages

Scripting languages


Computer Programming Languages

Almost all modern high-speed digital electronic computers are based on the binary numbering system. In the heart of the computer, inside the central processing unit (CPU), there are one or more arithmetic logic units (ALUs) that process information by performing binary arithmetic. All of the audio, video, word processing, Internet access and so on, which we associate with modern computers is processed with binary arithmetic inside the CPU by these relatively simple electronic circuits. All data to be handled by the computer, and all of its instructions, must be processed in the CPU as a stream of binary numbers.
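To make the idea of binary arithmetic concrete, here is a minimal sketch in Python. The function names `to_bits` and `add_bits` are made up for this illustration, and the 8-bit width is an assumption chosen for readability. A real ALU does this work in electronic circuits rather than in software, so this only mimics the column-by-column process.

```python
def to_bits(value, width=8):
    """Return the base-two digits of a non-negative integer, padded to `width` bits."""
    return format(value, "0{}b".format(width))


def add_bits(a, b):
    """Add two equal-length bit strings column by column, right to left, with a carry.

    This imitates what a simple adder circuit does. Any carry out of the
    leftmost column is dropped, as it would be in fixed-width hardware.
    """
    carry = 0
    columns = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        columns.append(str(total % 2))   # the digit that stays in this column
        carry = total // 2               # the digit carried to the next column
    return "".join(reversed(columns))


five = to_bits(5)              # "00000101"
three = to_bits(3)             # "00000011"
eight = add_bits(five, three)  # "00001000"
letter_a = to_bits(ord("A"))   # text is stored as numbers too: "01000001"
```

The last line is a reminder that text, like everything else in the computer, ends up as a binary number.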

The set of binary digits, or bits, that the CPU understands as its instructions to perform this binary arithmetic is called the computer’s “machine code.” Each CPU, or each family of CPUs, such as the Intel 8086 family, has its own machine code. So, there are as many machine codes as there are families of processing units. Eventually, everything that a computer does must be translated into its machine code.

When a new processing unit is first invented and manufactured, it only understands its machine code. Systems programmers work with these binary codes to create a new language called assembly language. They do this by using machine code to build an assembler, which is a program that translates assembly language into machine code. Assembly languages are made up of very primitive instructions, just like machine code, but they can be written using numbers in bases other than base two; mnemonics, or short words that sound like the instructions they represent, such as ADD for addition or SUB for subtraction; and symbolic names instead of numbers to refer to memory locations.
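The job of an assembler can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the four opcodes, the 8-bit instruction layout (4 bits of opcode followed by 4 bits of address), and the names `OPCODES` and `assemble` do not describe any real processor or any real assembler.

```python
# Hypothetical 4-bit opcodes for an imaginary CPU (not a real instruction set).
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "SUB": 0b0011, "STORE": 0b0100}


def assemble(source, symbols):
    """Translate toy assembly language into a list of 8-bit machine-code strings.

    `symbols` maps symbolic names to memory addresses. An operand that is not
    a symbolic name is read as a number, which may be written in decimal (12),
    hexadecimal (0xC), or binary (0b1100).
    """
    machine_code = []
    for line in source.strip().splitlines():
        mnemonic, operand = line.split()
        address = symbols[operand] if operand in symbols else int(operand, 0)
        if not 0 <= address < 16:
            raise ValueError("address does not fit in 4 bits: " + operand)
        word = (OPCODES[mnemonic] << 4) | address   # opcode bits, then address bits
        machine_code.append(format(word, "08b"))
    return machine_code


program = """
LOAD PRICE
ADD TAX
STORE TOTAL
"""
symbols = {"PRICE": 0, "TAX": 1, "TOTAL": 2}
binary = assemble(program, symbols)   # ["00010000", "00100001", "01000010"]
```

The mnemonics, the choice of number base, and the symbolic names are all conveniences for the programmer. What the CPU receives is only the list of binary strings.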

In the sample code on the left, …

Figure 2.1 – All information in modern computers is processed as base two binary numbers.

Insert image here

The same code in Java, assembler, and machine code.


Writing sophisticated software such as word processors and video games is still rather difficult and very time-consuming in assembly language. Eventually, computer scientists and software engineers build translators that can handle high-level languages, which are closer to human languages. Java, JavaScript, Visual Basic, C, C++, C#, and Python are all examples of modern high-level computer programming languages.

The translators that convert high-level languages into machine code fall into two categories: compilers and interpreters. Using a compiler, a programmer ends up with two stored copies of the program. The first, in the original high-level programming language, is called the “source code.” The second stored copy of the program, which is the same program after translation into a particular machine code, is called the “object code.” Even after translation into machine code, a program may still need to be processed so that it will run on a particular computer with a particular operating system. There is often another step necessary after compiling to combine the object code with subroutines from the operating system. This step is sometimes called “linking and loading” or “making” an executable program. Sometimes linking and loading happens when we try to run object code, and sometimes the compiler makes and stores an executable program as another step in the process of compiling. So, with a compiler, there are two stored copies of the program, the original source code and the object code, and sometimes a third copy called an executable program.

An interpreter is much simpler than a compiler. Rather than translating an entire source code program into object code at once, the interpreter translates each instruction one at a time and then feeds it to the CPU to be processed before translating the next instruction. The only stored copy of the program is the original source code program. Often scripting languages, such as JavaScript or Visual Basic for Applications (VBA), work this way. Scripting languages are simplified high-level programming languages that allow someone to program in a particular environment. JavaScript can be added to the HTML codes for Web pages to provide them with some primitive data processing capability. VBA allows someone to program features in Microsoft Office products such as Microsoft Word or PowerPoint.
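The difference between the two kinds of translator can be sketched with a toy language in Python. The three instructions (SET, ADD, PRINT) and the function names `translate`, `interpret`, `compile_program`, and `run` are invented for this illustration. A real compiler produces machine code; here a Python function stands in for each translated instruction, so this is only a model of the order in which translation and execution happen.

```python
def translate(line):
    """Translate one source line into a Python function that carries it out."""
    op, *args = line.split()
    if op == "SET":
        name, value = args
        return lambda env, out: env.__setitem__(name, int(value))
    if op == "ADD":
        name, value = args
        return lambda env, out: env.__setitem__(name, env[name] + int(value))
    if op == "PRINT":
        (name,) = args
        return lambda env, out: out.append(env[name])
    raise ValueError("unknown instruction: " + line)


def interpret(source):
    """Interpreter style: translate one line, run it, then go on to the next line."""
    env, out = {}, []
    for line in source.strip().splitlines():
        translate(line)(env, out)
    return out


def compile_program(source):
    """Compiler style: translate the whole program before running any of it.

    The returned list plays the role of object code and could be kept
    and run again without the source.
    """
    return [translate(line) for line in source.strip().splitlines()]


def run(object_code):
    """Execute previously translated instructions in order."""
    env, out = {}, []
    for step in object_code:
        step(env, out)
    return out


source = """
SET x 5
ADD x 3
PRINT x
"""
interpreted_output = interpret(source)           # [8]
object_code = compile_program(source)            # translated once, stored
compiled_output = run(object_code)               # [8]
```

Both routes give the same result. One practical difference in this sketch is that `compile_program` rejects a program with an unknown instruction before anything runs, while `interpret` would carry out every line up to the bad one first.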

Figure 2.3 – All programs must be translated into machine code. Compilers, interpreters and assemblers perform this translation.


Interpreters are also used for teaching languages. Serious computer programming languages such as Java and C# that are used by professional programmers are sometimes referred to as production languages. Teaching languages are languages that are not generally used in production environments, but are instead used to teach someone the logic of computer programming or the processes used in creating computer software before attempting to teach them to use production languages. The Alice programming language, which is included with this book, is an example of a teaching language. Its primary purpose is to be used as a tool to teach people to be better programmers.

The first widely used high-level language was FORTRAN, released in 1957 by a team at IBM led by John Backus. IBM was, at the time, by far the world’s largest computer company. FORTRAN was intended to be used by scientists and engineers working on the large, primitive mainframe computers of the day, about 20 years before the first personal computers appeared, to program mathematical formulas and processes. For example, it is rather easy, assuming one knows the math, to write FORTRAN programs to perform matrix algebra on large sets of data or to perform the fast Fourier transforms that electrical engineers use in calculus-based applications. In fact, the name FORTRAN comes from the two words “formula translation.”

Before FORTRAN, all software had to be created using assembly language and machine code. Once FORTRAN appeared, people began to use it for much more than science and engineering. The increasing use of FORTRAN to process commercial business data led to problems for financial accountants and auditors. A bank auditor, for example, needs to be able to read a computer’s instructions to see what the computer is doing with figures that represent bank deposits, account interest, and transaction fees. This was nearly impossible with FORTRAN, unless the auditor was also a trained computer programmer.

The problems of using FORTRAN in the business world were addressed with the appearance in 1960 of the COBOL language. Like the name FORTRAN, COBOL is an acronym, which comes from the words “common business-oriented language.” COBOL was designed by a committee of industry and government representatives sponsored by the U.S. Department of Defense, and it drew heavily on FLOW-MATIC, an earlier language developed under the direction of Grace Hopper, a United States Navy officer who retired in 1986 as a rear admiral. It is estimated that as of the year 2000 there were more lines of code written in COBOL than in any other computer programming language. COBOL has functions and instructions that are more suited to commercial data processing than FORTRAN, and is a wordier language, which makes it easier for financial auditors to understand without extensive training.

Yet COBOL, like FORTRAN, takes a while to master. For college students, this often meant that several semesters had to be spent learning programming before anything useful could be done with a computer. At the same time, computers were becoming smaller, less expensive, and more accessible to the public. Personal computers were still some years away, but by the mid-1960s many college campuses had computers that students could use. In 1964, in response to the promise of the computer on campus, and in order to make programming as accessible to students as the new “mini-computers” that had begun to appear, two professors at Dartmouth College in Hanover, New Hampshire, John Kemeny and Thomas Kurtz, invented the BASIC programming language. BASIC was designed to be easy to learn and easy to use, and on most later machines it ran under an interpreter rather than a compiler of the kind used for FORTRAN and COBOL. It caught on quickly, and when personal computers began to appear in the late 1970s every machine had to have a BASIC interpreter or people wouldn’t buy it, and more people learned BASIC than any other language.

Yet, as BASIC increased in popularity, one major problem with the language became evident. The BASIC language had a GOTO command, which is sometimes strangely referred to as an “unconditional branching” command. Each line in a BASIC program was numbered, and at any point in the program the GOTO instruction could suddenly redirect the flow of control to a line number in another part of the program. The command was intended to let users set up branching and looping commands linked to IF…THEN statements, but it was so flexible that, for anything more than a simple straight-line sequence of instructions, programmers often ended up with poorly designed logic that jumped repeatedly back and forth throughout the code. People other than the original programmer often had to spend hours trying to figure out how the program worked. People had to be trained to avoid creating what was referred to as “spaghetti code.”

The BASIC language was so easy to learn and so easy to use that people often developed very bad programming habits, such as creating spaghetti code, before they were properly trained in how to design the logic of computer programs.

In response to this problem, a Swiss computer scientist named Niklaus Wirth invented the Pascal programming language around 1970. He named the language after the 17th-century French mathematician and philosopher Blaise Pascal, who 300 years earlier had been one of the first people ever to build a working mechanical calculator. From the beginning, Wirth’s Pascal programming language was intended to be used as a teaching language, and contained commands with built-in structured logic, which we will see in chapter xx.

Pascal was one of the first widely taught languages to have built-in commands for looping and branching that steered the user toward writing programs according to good principles of structured design. In Pascal, it became natural for programmers to construct programs with a logical flow of instructions and much harder for them to end up with spaghetti code. Like BASIC, Pascal was a simple language that was easy to learn and easy to use.

[Insert loop example: Pascal vs. BASIC here]
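The contrast between jumping by line number and using a built-in loop can be sketched in Python. Python has no GOTO command, so the first function below imitates numbered BASIC-style lines with a variable that holds the "current line." This imitation, and the function names `sum_with_goto` and `sum_structured`, are assumptions made for illustration and are not actual BASIC or Pascal code.

```python
def sum_with_goto(limit):
    """Add the numbers 1 through `limit` using imitation line numbers and jumps."""
    total, i = 0, 1
    line = 10
    while True:
        if line == 10:
            # 10 IF I > LIMIT THEN GOTO 40
            line = 40 if i > limit else 20
        elif line == 20:
            # 20 TOTAL = TOTAL + I : I = I + 1
            total += i
            i += 1
            line = 30
        elif line == 30:
            # 30 GOTO 10
            line = 10
        elif line == 40:
            # 40 END
            return total


def sum_structured(limit):
    """Add the numbers 1 through `limit` with a built-in looping command."""
    total = 0
    for i in range(1, limit + 1):
        total += i
    return total


goto_result = sum_with_goto(5)          # 15
structured_result = sum_structured(5)   # 15
```

Both functions compute the same total. In the first, a reader has to trace each jump to discover that the code is a loop at all. In the second, the loop is visible in the shape of the code, which is the property that structured languages were designed to encourage.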

[Continue with C, Smalltalk, C++, and Java.]
