
Publication Pending. Author: Dr. Ami Gates. Date: 8/11/2016. All material is subject to copyright laws. Do not post, print, or reproduce without written permission.

Chapter 2: Introduction to Computer Science

Chapter 2 will focus on topics in computer science, including a brief history of computers and

computing, an introduction to binary and decimal numbers, the relationship between binary and

electronic computers, topics in computer organization, including the CPU, memory, and I/O,

networking and the web, programming languages, algorithm design, and complexity. All

following chapters will focus on Python development and examples. This chapter may be

skipped if the focus of the reader, in utilizing this book, is only an introduction to Python or

programming.

2.1 A Brief History of Computers

Today, it seems almost impossible that we could survive without computers. Computers are

integrated into the management and distribution of food and resources, are coupled with

transportation, are used for communication, are needed for education and employment, and

interestingly are involved in a significant portion of socialization.

Mobile computing and smart devices allow people to navigate through traffic, maintain home

security, follow the news and the stock market, check the weather, and perform a multitude of

other tasks. We download new applications each day to make our lives seem easier, more fun,

more interesting, and more secure. But when and how did this deep relationship with computers

and technologies originate?

The notion of computing is certainly not new. In fact, there is evidence that computing dates

back to early Cro-Magnon humans who recorded and manipulated values. Such evidence

suggests that people of that era may have kept track of values, or measures by etching them into

bone.

Over time, the discovery of several mechanical devices suggests the continuing and growing use

of computing tools. The abacus, for example, is a mechanical device that was employed for basic

math calculations and is believed to have been invented in Greece. The first Chinese abacus dates to around 500 BC.

Figure 2.1: An Example of an Abacus

The growth and innovation of mechanical computing machines continued through the centuries.

Leonardo da Vinci (1452 – 1519) conceived what may have been one of the first mechanical calculators. A replica of this invention was created in 1968 at IBM by Dr. Roberto Guatelli. It had 13 wheels, each representing an individual digit ranging from zero to nine. The wheels were physically turned to create numbers (Figure 2.2).

Figure 2.2: Sketch of Leonardo da Vinci’s mechanical calculator

John Napier (1550 – 1617) created a mechanical calculator called Napier’s Bones. This

mechanical calculator was a set of rotatable cylinders (thought to be made of bone). Each

cylinder contained all products of values 1 through 9 for that given cylinder (so that cylinder 9

contains 9, 18, …81). Napier’s Bones allowed for fast multiplication of any two numbers.

Figure 2.3: Sketch of Napier’s Bones

Figure 2.4: Mathematical Chart Representing Napier’s Bones

The Napier’s Bones chart in Figure 2.4 is created using the products of each pair of values. For example, consider the 6th row, starting with 6. Under the "1" column is the value 6 * 1, or 6. Continuing along the 6th row, the "2" column contains 12, where the lower triangle holds the 2 and the upper holds the 1 (much like a ones place and a tens place). Confirm this method for any square. Consider row 8, column 5. Because 8 * 5 is 40, there is a "0" in the lower triangle and a "4" in the upper triangle. In this way, the Napier’s Bones chart holds the products of all possible combinations of 1 through 9, with the lower triangle of each square being the ones value and the upper triangle of that same square being the tens value.

Example 2.1: Using Napier’s Bones for Multiplication

To understand the elegance of Napier’s Bones, recall that logarithms reduce multiplication to addition; the bones achieve a similar effect using simple digit sums.

Use Napier’s Bones to multiply 26347 * 7

Start from the right. (See Figure 2.5)

1) Write down the 9

9

2) Sum the two values in the parallelogram: 4 + 8 = 12. Because 12 exceeds 9,

write down the 2 and “carry” the 1 to the next sum.

29

3) Sum the carryover, if any, with the next two parallelogram values: 1 + 2 + 1 = 4. Write down the 4.

429

4) Because there is no carry over, sum the next two numbers in the parallelogram:

2 + 2 = 4. Write down the 4.

4429

5) Because there is no carry over, sum the next two numbers in the parallelogram:

4 + 4 = 8. Write down the 8.

84429

6) Last value is a 1. Write down the 1.

184429

Now check the answer. Is the product of 26,347 * 7 = 184,429? Yes.

Given this concept, any two numbers can be multiplied using sums. Think about how you might multiply 26347 by 787.

Figure 2.5: Aligning Napier’s Bone Columns to Multiply 26,347 times 7
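The procedure in Example 2.1 can be expressed as a short program. The following Python 3 sketch (the function name napier_multiply is ours, for illustration only) builds the "bone" cells as single-digit products and then sums the parallelogram diagonals from right to left, carrying as needed:

# A minimal sketch of the Napier's Bones procedure from Example 2.1:
# multiply a multi-digit number by a single digit using only
# single-digit products (the "bones") and right-to-left sums with carries.
def napier_multiply(number, digit):
    digits = [int(d) for d in str(number)]           # 26347 -> [2, 6, 3, 4, 7]
    # Each bone cell holds the (tens, ones) of a single-digit product.
    cells = [divmod(d * digit, 10) for d in digits]  # 7*7=49 -> (4, 9)
    result = []
    carry = 0
    # Work right to left: sum the ones of one cell with the tens of the
    # cell to its right (the "parallelogram"), plus any carry.
    for i in range(len(cells) - 1, -1, -1):
        ones = cells[i][1]
        tens_from_right = cells[i + 1][0] if i + 1 < len(cells) else 0
        total = ones + tens_from_right + carry
        result.append(total % 10)
        carry = total // 10
    # The leftmost cell's tens digit (plus any carry) leads the answer.
    lead = cells[0][0] + carry
    if lead:
        result.append(lead)
    return int("".join(str(d) for d in reversed(result)))

print(napier_multiply(26347, 7))   # prints 184429, matching Example 2.1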

1600 – 1800: Adding Machines:

It is interesting to observe that most computer history overviews begin with the abacus and with

other such seemingly simplistic tools, but the tools themselves are not as important as the need

for their invention. Humans wanted and needed to compute. The realization that a mechanical

device could be used to compute, and thereby exceed human ability, was a monumental step.

Creations, such as the abacus and Napier’s Bones illustrated not only the need for computational

tools but the capability and aptitude of humans to conceive and construct them.

From the mid-1600s through the late 1890s, further mechanical tools with greater abilities were

invented, including Pascal’s Pascaline (1642), Odhner’s calculating machine (Figure 2.6), and Burroughs’s adding machine (1887).

Figure 2.6: Historical Adding Machines. Left: Blaise Pascal’s Pascaline 1642, Right: Odhner’s

calculating machine (1874)

All such adding machines and calculators were mechanical in nature, with each advancing the

concept in its own way. Odhner’s machine was still in use in the 20th century, until the invention

of the electronic calculator. In fact, the first few hand-held calculators were not produced until

the 1960s, coming from the Bell Punch Co. (1963), Sony (1964), and Texas Instruments (1967).

1800 – 1950:

In the time between 1800 and 1950, several key inventions changed the once mechanical device into the now electronic computer. Charles Babbage (1791 – 1871) created the Difference Engine (run by steam), which not only calculated sets of tables but printed the results by stamping them onto plates. Babbage also envisioned and designed the plans for an Analytical Engine that used punched cards for input and for the storage of information. This step, built in part on ideas from J. M. Jacquard (1752 – 1834), led to the punch card, and to computers that remember and create saved output.

During this same time, George Boole (1815 – 1864) invented Boolean algebra, which would

later become the logical foundation for digital electronics and computers.

While Babbage was not able to complete his vision, Herman Hollerith (1860 – 1929) took the

punch card to the next step. Hollerith, while working for the US Census Bureau, invented a

punched-card based machine that could “crunch” through data faster than any other method or

machine of the time. Punch cards (Figure 2.9), were cards with holes in them, in very specific

locations, that could hold location-based information. Hollerith patented his invention (1889) and

later sold the patents to a company called Computer-Tabulating-Recording Company whose

name was later changed to International Business Machines (IBM).

Figure 2.9: Punch Cards

IBM was originally owned by Charles Flint (1850 – 1934) and run by CEO Thomas Watson (1874 – 1956). Watson is credited with changing the name to IBM. IBM also carries a disgraceful association with Germany during WWII for assisting the Nazis in their tabulation of genocide (IBM and the Holocaust: The Strategic Alliance between Nazi Germany and America's Most Powerful Corporation, Edwin Black).

In 1904, Ambrose Fleming (1849 – 1945) invented the vacuum tube (Figure 2.10), which was a

diode that could conduct electricity in one direction and block it in the other. Fleming’s initial

vacuum tube was subsequently improved by Lee de Forest (1873 – 1961). De Forest’s new "Audion" is said to be the beginning of the electronic computer age. Between 1907 and 1960,

vacuum tubes that controlled the flow of electrons were developed for various applications,

including radio, radar, television, and computers.

Figure 2.10 Fleming’s Vacuum Tube

The 1940s and 1950s

The invention of the vacuum tube led to the development of the first electronic computers, known as the first generation of computers. The first programmable, electronic, digital computer was the ENIAC (Figure 2.11), which was built at the University of Pennsylvania in 1946.

Figure 2.11: ENIAC

Once the ENIAC was born, almost 100 other models followed it, joining earlier machines such as the Z3 (Zuse, 1941) and the Mark I (Harvard and IBM, 1944), until the next leap in technology occurred and created the second generation of computers.

The 1950s

The invention of the transistor again revolutionized the computer. The transistor is a fraction of

the size of a vacuum tube, does not generate as much heat, and allows for the control of

electrons. The transistor (transfer resistor) was invented at Bell Labs by Shockley, Bardeen, and

Brattain in 1947. These three scientists were later awarded the Nobel Prize (1956). While the

transistor was smaller and better than the vacuum tube, it was slow to be adopted. In the 1950s, a

small Tokyo-based company invested in incorporating transistors into its products. That company is now known as Sony. Figure 2.12 shows an image of a transistor computer.

Figure 2.12: Transistor Computer (Second Generation)

The 1960s

The next large jump in technology, which took the advancement of computers along with it, was

the invention of the integrated circuit. These new computers were known as third generation

computers. The invention and implementation of integrated circuits was the work of several

scientists over a couple of decades. While there is some debate over the original idea and invention, two people are credited with its realization: Jack Kilby at Texas Instruments and Robert Noyce at Fairchild Semiconductor. In 1961, Noyce was awarded the first patent. Figure 2.13 shows a third generation computer using integrated circuits.

Figure 2.13: Third Generation Computer: Integrated Circuits

The 1970s

Once the idea of digital and electronic computers took hold, and the vacuum tube progressed to

integrated circuits, the next steps were focused on smaller and faster options. The advancement

of the computer was tied to the advancement of electronic technology. The 1970s brought about

the VLSI (very large scale integration) circuit. A VLSI circuit is a single chip that contains thousands of transistors and circuit components. VLSI-based computers did not require large external cooling systems and could rely on a small internal fan, which allowed computers to be much smaller and much faster. The personal computer soon followed.

Figure 2.13: Left: VLSI circuit computer: Fourth Generation, Right: VLSI Chip

The 1980s

From the first IBM PC in 1981 and the first Macintosh computer in 1984, the race was on, and the goal was smaller, faster, and mobile. In 1981, Xerox became the first company to offer a computer with a mouse, the 8010 Star Information System, although Douglas Engelbart had filed the first patent for a computer mouse in 1967. Fifth generation computers include advancements in graphics, sound, speed, voice recognition, and even nanotechnologies. Computers have also migrated into a more diverse collection of devices, with advances in mobility and wireless capabilities.

The 1990s

Sixth generation computers are still evolving, especially through nanotechnologies, parallel systems, multiple processing units, wide area networking, and greater mobility. Current computers include smart devices, tablets, digital paper, and laptops. Progression is continuing toward voice commands and verbal recognition, as well as integrated AI. Figure 2.14 shows the evolution of computers from 1981 to 1998.

Figure 2.14: Computer Progression from 1981 to 1998

2007 - 2016

In 2007, Apple released its first iPhone smartphone. In 2010, the mobile computer game Angry Birds became a top seller and the Apple iPad was released. In 2011, Adobe's Creative Cloud was announced, as was Apple's Siri for the iPhone 4S. Cloud storage, sharing, and computing also began to grow. In 2013, Microsoft introduced the Xbox One. In 2014, HTML5 was finalized. In 2015, the Apple Watch was released.

On June 17, 2016, the University of California, Davis announced the world's first 1,000-processor microchip:

"The energy-efficient 'KiloCore' chip has a maximum computation rate of 1.78 trillion instructions per second and contains 621 million transistors. The KiloCore was presented at the 2016 Symposium on VLSI Technology and Circuits in Honolulu on June 16" (https://www.ucdavis.edu/news/worlds-first-1000-processor-chip/). Applications for the chip include coding and decoding, video processing, encryption, and data processing.

Figure 2.14: UC Davis KiloCore Chip

Today, all areas of computer science and technology are continuing to progress, with the goals of

smaller and faster joining the newer goals of smarter and mobile.

2.2: Digital Representation

Binary and Decimal Numbers: Base 2 and Base 10

The invention and utilization of the binary number system was a core conceptual component in

the creation of today’s computers. Binary numbers are numbers made up only of "0"s and "1"s. From an electronic standpoint, this can be conceptualized as ON (flow of electrons) and OFF (no flow of electrons). In binary (base 2), the number 101 is equivalent to the number "5" in the decimal (base 10) number system.

The binary (base 2) number system works in the same way that the decimal (base 10) number

system works. In base 10, the digits 0 – 9 are utilized, and the location of each digit within a

given number signifies its weight or value (based on exponents of 10). Similarly, in base 2, the

digits 0 and 1 are utilized, and the location of each digit within a given number signifies its

weight or value (based on exponents of 2).

Example 2.2: Base 10 Number System

The base 10 number 3659.67 can be described as (3 * 10^3) + (6 * 10^2) + (5 * 10^1) + (9 * 10^0) + (6 * 10^-1) + (7 * 10^-2).

(3 * 10^3) = 3 * 1000 = 3000
(6 * 10^2) = 6 * 100 = 600
(5 * 10^1) = 5 * 10 = 50
(9 * 10^0) = 9 * 1 = 9
(6 * 10^-1) = 6 * .1 = .6
(7 * 10^-2) = 7 * .01 = .07

If these values are summed back together, the original value of 3659.67 is regained.

The key idea is that the location of each digit, with respect to the decimal place, signifies the

magnitude or order of its value. The digit 3 is in the third position to the left of the decimal point

(starting with “0” as the first position to the left of the decimal). Similarly, the “6” is in position

two, the “5” is in position one, and the “9” is in position zero. To the right of the decimal, the

positions start at minus one, then minus two, and so on. The position of each digit determines the exponent of the base (10 in this case).

This same concept extends to any number-base system, including binary, hexadecimal, and base

8 (a few more common systems in use).

Example 2.3: Base 2 Number System

The base 2 or binary number system uses two digits, 0 and 1. Base 2 numbers are created using

exponents of the number 2 combined with position.

The base 2 number 1101.01 can be described in base 10 as (1 * 2^3) + (1 * 2^2) + (0 * 2^1) + (1 * 2^0) + (0 * 2^-1) + (1 * 2^-2) = 8 + 4 + 0 + 1 + 0 + .25 = 13.25

(1 * 2^3) = 1 * 8 = 8
(1 * 2^2) = 1 * 4 = 4
(0 * 2^1) = 0 * 2 = 0
(1 * 2^0) = 1 * 1 = 1
(0 * 2^-1) = 0 * .5 = 0
(1 * 2^-2) = 1 * .25 = .25

In reverse, the value 13.25 in base 10 can be converted back to base 2 (or any other base).

Example 2.4: Conversion from Base 10 to Base 2

Goal: Convert 13.25 from base 10 into base 2.

1) Consider the largest exponent of 2 such that 2 raised to that exponent is smaller than 13.25. The answer is 3: the value of 2^3 is 8 and the value of 2^4 is 16. Because 2^3 is the largest power of 2 that is smaller than 13.25, it is the starting point. This creates the following:

1*2^3 +

2) Subtract 2^3, which is 8, from 13.25. The result is 5.25. Repeat the steps above. Determine the largest exponent of 2 such that the power is less than 5.25. The answer is 2, because 2^2 is 4 and 2^3 is 8. Now we have:

1*2^3 + 1*2^2 +

3) Subtract 2^2, which is 4, from 5.25. The result is 1.25. Repeat the steps above. The next largest exponent of 2 such that the power is less than 1.25 is 0. Now we have:

1*2^3 + 1*2^2 + 0*2^1 + 1*2^0 +

Notice that it was necessary to include the "0*2^1" because the exponent of 1 is not used, but all exponents of 2 must be represented, in order, and in the correct position in the number.

4) Subtract 2^0, which is 1, from 1.25. The result is .25. Repeat the steps above. The next largest exponent of 2 such that the power is not more than .25 is -2. Now we have:

1*2^3 + 1*2^2 + 0*2^1 + 1*2^0 + 0*2^-1 + 1*2^-2

This can be written as: 1101.01.
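As a rough Python 3 sketch of Examples 2.3 and 2.4 (the function names are our own illustrations, not standard routines): to_decimal expands each binary digit by its power of 2, while to_binary uses Python's built-in bin for the whole-number part and repeated doubling to pull out the fractional bits, a common alternative to the repeated subtraction shown above:

def to_binary(value, frac_bits=2):
    # Convert a non-negative base 10 number to a binary string.
    whole = int(value)
    bits = bin(whole)[2:]              # integer part via Python's built-in bin
    frac = value - whole
    if frac > 0:
        bits += "."
        for _ in range(frac_bits):     # pull out 2^-1, 2^-2, ... by doubling
            frac *= 2
            bits += str(int(frac))
            frac -= int(frac)
    return bits

def to_decimal(binary):
    # Expand a binary string (with an optional fraction) into base 10.
    whole, _, frac = binary.partition(".")
    total = sum(int(b) * 2**e for e, b in enumerate(reversed(whole)))
    total += sum(int(b) * 2**-(e + 1) for e, b in enumerate(frac))
    return total

print(to_binary(13.25))        # 1101.01
print(to_decimal("1101.01"))   # 13.25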

In all base systems, including our common base 10 system, the location of the digit in the number (with reference to the decimal place) determines its "weight". For example, given the base 10 number 934, the "9" is in the "hundreds place" (the 10^2 place). Similarly, the "4" is in the "ones place" (the 10^0 place). Figure 2.15 illustrates the weighting of each position in base 10 and in base 2. Figure 2.16 lists the decimal values 0 through 10 in binary.

Position:   4             3            2           1          0         .   -1           -2
Base 10:    10^4 = 10000  10^3 = 1000  10^2 = 100  10^1 = 10  10^0 = 1  .   10^-1 = .1   10^-2 = .01
Base 2:     2^4 = 16      2^3 = 8      2^2 = 4     2^1 = 2    2^0 = 1   .   2^-1 = .5    2^-2 = .25

Figure 2.15: The weights of digits given location for base 10 and base 2.

Practice Exercise 2.2:

1) Write the number 45.75 in binary
2) Write the number 10101.1 in base 10

Solutions:

1)

2^5 = 32
1*2^5 +                                                    Subtract 32 from 45.75 to get 13.75

2^3 = 8
1*2^5 + 0*2^4 + 1*2^3 +                                    Subtract 8 from 13.75 to get 5.75

2^2 = 4
1*2^5 + 0*2^4 + 1*2^3 + 1*2^2 +                            Subtract 4 from 5.75 to get 1.75

2^0 = 1
1*2^5 + 0*2^4 + 1*2^3 + 1*2^2 + 0*2^1 + 1*2^0 +            Subtract 1 from 1.75 to get .75

2^-1 = .5
1*2^5 + 0*2^4 + 1*2^3 + 1*2^2 + 0*2^1 + 1*2^0 + 1*2^-1 +   Subtract .5 from .75 to get .25

2^-2 = .25
1*2^5 + 0*2^4 + 1*2^3 + 1*2^2 + 0*2^1 + 1*2^0 + 1*2^-1 + 1*2^-2

The result is 101101.11

2)

To write 10101.1 in base 10, expand the value:

1*2^4 + 0*2^3 + 1*2^2 + 0*2^1 + 1*2^0 + 1*2^-1 = 16 + 0 + 4 + 0 + 1 + .5 = 21.5

It is suggested that you convert 21.5 back to binary to confirm that it is 10101.1.
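Python's built-in int function offers a quick check of the whole-number portion of Solution 2; the single fractional bit contributes 2^-1 = .5:

# int(s, 2) interprets the string s as a base 2 integer.
print(int("10101", 2) + 0.5)   # prints 21.5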

Decimal   Binary
0         0
1         1
2         10
3         11
4         100
5         101
6         110
7         111
8         1000
9         1001
10        1010

Figure 2.16: Decimal and Binary

Binary Words: ASCII and UNICODE

Binary digits, or bits, and collections of bits (8 bits = 1 byte; 64 bits = 1 word on 64-bit machines)

can be used to represent individual letters and therefore words, sentences, books, sonnets, or

angry messages to neighbors with car alarms that go off endlessly and for no reason. There are a

few different methods that have been used over time and by different companies to represent

letters. For example, ASCII (American Standard Code for Information Interchange) offers a set of binary numbers to represent each letter. Basic ASCII uses a 7-bit encoding, which allows for 2^7 = 128 different characters. Extended ASCII uses 8 bits for 2^8 = 256 different characters. For example, "N" is 01001110, "R" is 01010010, and lowercase "a" is 01100001.
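These codes are easy to inspect in Python 3: the built-in ord function returns a character's numeric code, and format can render that code as 8 binary digits:

# Print each character, its ASCII code, and its 8-bit binary form.
for ch in "NRa":
    print(ch, ord(ch), format(ord(ch), "08b"))
# N 78 01001110
# R 82 01010010
# a 97 01100001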

Unicode is another method for encoding characters (letters, symbols, and numbers). Unicode uses between 8 and 32 bits per character and so can represent all world languages.

can be represented as a binary code, this concept can be expanded to represent as much

information as memory will allow. Of course, memory itself has been evolving alongside

computers and technology.

Binary Music, Images and Video

Music and sound, as well as video and images can all be represented with bits and bytes. While

this textbook does not offer an in depth overview of the mathematical and binary representation

of sound and images, consider the following example.

To represent something digitally is to represent it with numbers only. Because all numbers can be converted to base 2, and base 2 can be interpreted using the flow of electrons, this leads to electronic computers. Color, for example, can be represented with 24 bits: 8 bits of red (R), 8 bits of green (G), and 8 bits of blue (B), or RGB. This permits 2^8 * 2^8 * 2^8 = 2^24, or approximately 16 million, shades of color.
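As a small sketch of this 24-bit idea (the function pack_rgb and the sample color values are illustrative, not a standard library routine), each 8-bit channel can be packed into one integer with bit shifts:

# Pack three 8-bit channels (0-255) into one 24-bit integer.
def pack_rgb(r, g, b):
    return (r << 16) | (g << 8) | b   # red in the high byte, blue in the low

color = pack_rgb(255, 128, 0)         # a hypothetical orange shade
print(format(color, "024b"))          # 111111111000000000000000
print(2**8 * 2**8 * 2**8)             # 16777216 -- the ~16 million shades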

An image file is made up of pixels (short for "picture elements"). Images are two-dimensional arrays of color pixels, where each color may be represented in the RGB format. Because images can be very large and can contain redundant color, they are often compressed. Common image formats are JPEG (Joint Photographic Experts Group) and GIF (Graphics Interchange Format).

Figure 2.18: Example of a pixel in a 2D image.

Sound is analog and must be digitized to store and share it via computer. The process of

converting analog sound to digital sound is called A/D conversion. This process is beyond the

scope of this book, but conceptually, sound is a collection of waves that can be represented

numerically using factors, such as frequency and wavelength. Sound can also be broken down

into small intervals of time. The sampling frequency when creating a CD for example is 44,100

samples per second (44.1kHz sample rate). In this case, the amplitude of the sound wave is

measured and represented as bits about 44 thousand times each second. MIDI (Musical Instrument Digital Interface) is a different approach that stores music as parameters, such as the note, the volume, and the tempo, rather than as sampled audio.

Figure 2.19: Digital Sound

Reference: http://code.compartmental.net/category/sound-bytes
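The sampling numbers above lend themselves to quick arithmetic; this Python 3 sketch (with a hypothetical three-minute song) counts the amplitude measurements a CD-rate recording would take:

# Back-of-the-envelope arithmetic from the sampling figures above.
sample_rate = 44_100            # samples per second (CD audio rate)
seconds = 3 * 60                # a hypothetical three-minute song
print(sample_rate * seconds)    # 7938000 amplitude measurements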

Video representation is similar to image representation. One can think of video as a collection of

image frames. For example, HDTV is on average 50 – 60 frames (images) shown per second.

There are many video formats, such as MPEG-4 and AVI. MPEG is an abbreviation for the Moving Picture Experts Group; the familiar MP3 audio format also comes from the MPEG family of standards.

Because numbers, characters, words, books, poems, music, sound, voice, images, and video can

all be “digitized” or represented as numbers, they can all be converted to binary and represented,

stored, and produced by an electronic computer. It is this realization that has culminated in the

Internet of Web Pages, the platforms for Social Media, the options for Music and Video Sharing

and Production, and everything from facial recognition to computational neuroscience to

blogging.

2.3 Computer Organization

First generation computers, such as the ENIAC, had to be externally programmed; switches had

to be manually turned and wires connected or disconnected. The stored program concept (Von

Neumann architecture, 1945) allowed computers to store both the program code (the set of instructions) and the data needed to run the program or produced as a result of the program. Figure

2.20 illustrates the basic Von Neumann architecture.

Figure 2.20: Von Neumann Architecture for Computer Organization

Digital computers are most often composed of at least three main components: memory, a central

processing unit (CPU), and devices for input and output. Per the Von Neumann architecture,

instructions are retrieved from memory and are executed in the order they are retrieved. This

section will introduce and explore computer memory, the CPU and its components, and input

and output options.

Computer Memory

The memory of a computer consists of all storage locations that the computer can access, whether directly via an internal bus, through an external plug, or over the Internet. Memory is where information, data, instructions, and programs can be stored and retrieved.

There are several different types of memory and each has its own benefits and challenges. An

overall hierarchy of memory (Figure 2.21), starting with memory closest to the CPU, is

commonly organized as the following: registers, levels of cache, main memory, and external

memory (including remote and cloud). Further memory may exist on graphics cards that are

integrated into the computer through PCI (peripheral component interconnect) slots on the

motherboard of the computer.

Figure 2.21: Memory hierarchy of a computer

Registers

Registers are a type of RAM (random access memory). They are closest in proximity to the CPU

(see Figure 2.26) and they interact directly with the CPU to manage the progress and execution

of instructions for a program. There are several registers (see Figure 2.22) and each register

performs a very specific task or operation.

Register   Bits   Name                   Function
DR         16     Data Register          Holds memory operand
AR         12     Address Register       Holds address for memory
AC         12     Accumulator            Processor register
IR         16     Instruction Register   Holds instruction code
PC         12     Program Counter        Holds address of instruction
TR         16     Temporary Register     Holds temporary data
INPR       8      Input Register         Holds input character
OUTR       8      Output Register        Holds output character

Figure 2.22: CPU Registers

Cache and Levels

Cache is fast and relatively small random access memory (RAM) between the CPU and main

memory. Cache can be located within the CPU or very near the CPU. When the CPU requires

data, it checks the cache first before accessing main memory. If the required data is in the cache,

the process will proceed more quickly. It is possible to have several levels of cache. The first

level of cache (L1) is closest to the CPU (often on the chip) and the fastest to access. Level 2

(L2) and level 3 (L3) caches can follow and each can be progressively larger, with the last level

of cache leading to main memory.

There are several methods for organizing and managing cache. While memory management is

beyond the scope of this book, Figure 2.23 illustrates an organizational example of a quad-core

(four CPU processors), each with two levels of cache and all sharing a larger L3 cache. Figure

2.24 compares two computers that both have three levels of cache. Note the size difference

between each cache level. As a side note, L1 and sometimes L2 cache will be on the CPU chip, while L3 cache often sits between the CPU and main memory.

Overall, cache tends to store more recently used data (temporal storage), but there are several

algorithms that can optimize the contents of the levels of cache for a higher hit rate. If a program requires data that is not located in any level of cache, this is called a miss, and main memory must be accessed. A miss takes extra time and therefore slows down overall processing speed. If the needed data is in cache, this is called a hit.

Figure 2.23: Quad-core of CPUs, each with L1 and L2 cache and a shared L3 cache.

Figure 2.24: Levels of Cache and cache sizes for Sony versus Dell
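One standard way to quantify the cost of hits and misses is the average memory access time (AMAT): the hit time plus the miss rate times the miss penalty. The numbers in this Python 3 sketch are assumptions for illustration, not measurements of any particular machine:

# Average memory access time (AMAT) with illustrative, assumed numbers.
hit_time = 1          # cycles to read data found in cache (a hit)
miss_penalty = 100    # extra cycles to fetch from main memory (a miss)
miss_rate = 0.05      # fraction of accesses not found in cache

amat = hit_time + miss_rate * miss_penalty
print(amat)           # 6.0 cycles on average, for these assumed numbers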

Main Memory and Secondary Memory

Like cache memory, the main memory can be directly accessed by the CPU and is random access memory (RAM). RAM requires electricity to maintain its contents and state. When a computer is shut down, the main memory contents disappear (this is called volatile memory).

Secondary memory, such as the hard drive, is any memory that is permanent. While main memory (RAM) is faster, permanent memory is slower. Most computers have integrated hard disk memory as part of the overall system. Hard drive capacity is (for now) usually measured in GB or TB (gigabytes or terabytes). Hard drives can also be external and can extend the memory capacity of a computer system.

The two most common types of hard drives (at this time) are the hard disk drive (HDD) and the solid state drive (SSD). HDD secondary memory is a spinning disk with non-volatile

(permanent) memory. It has moving parts, which is one of the reasons it is slower than volatile

RAM (main memory, cache, and registers). A hard disk drive can be thought of as a metallic

plate that data can be written to and read from. It has a mechanical “arm” that accesses the data

as the disk spins. This is very much the same as an old-fashioned record player. Figure 2.25 is an

illustration of a hard disk drive.

Figure 2.25: Hard Disk Drive

Unlike the HDD, the SSD does not have moving parts. The SSD is made up of interconnected

flash memory chips. While an HDD must rely on spinning platters, an SSD does not; therefore, SSDs can be made in very small sizes. SSDs are also very quiet.

Flash memory describes a type of memory that is non-volatile and solid state (no moving parts).

Flash memory is used in flash drives and SSD. Flash memory can retain data without electricity

and so it offers permanent storage.

Secondary memory can be internal, such as a HDD or SSD and can also be external, such as

portable hard drives, flash drives, or cloud storage.

ROM: Read Only Memory

It is common to hear about RAM versus ROM. ROM stands for read only memory. Unlike RAM, ROM memory is non-volatile. It does not lose its contents when the computer is shut down (when electricity is off). As the name suggests, ROM is a type of memory that is intended to

contain permanent and critical information that will be read and used by the computer, but not

written over. ROM is memory that is required for mandatory and internal computer processes,

such as boot up (the starting up and loading of elements of the computer system). As an

example, computer devices will have small programs that manage their abilities. These programs

will not change and are required for device activity. As such, these device-internal collections of

code (programs) are ROM. Similarly, ROM chips are often used in games for the initial game

loading and start-up.

The Cloud

In addition to options for external hard drive or solid state memory, data can also be stored on

the cloud. The cloud is not actually a single location, nor is it necessarily above us. Cloud

storage is distributed memory hosted by services such as Box, Google, or Microsoft. Because the

location of memory affects its speed of use, cloud storage will be several orders of magnitude

slower than cache or main memory. However, it is comparable to other external memory options.

Cloud storage is accomplished via the Internet. For example, Google offers the Google Drive.

The Google Drive is also associated with Google email (gmail). Data and any items can be

uploaded, created, edited, shared, and downloaded using the Google Drive. The Google Drive

allows for the creation of Google Documents, the utilization of Google Applications, as well as

online and group editing, sharing, and the free storage of up to 5 TB (terabytes).

Stored data can be retrieved at will via an internet connection. A clear benefit of cloud-based memory is that it is persistent and accessible from any location that offers internet access, unlike a physical external memory system, such as a flash drive, which must be physically present to be accessed. However, it is important to understand that there is no actual cloud. Data and information "on the cloud" are actually stored on, and accessed from, physical servers that are connected to the internet and distributed.

Computer Processor: The CPU

The CPU is the central processing unit of the computer, and as such is often referred to as the

“brains.” The purpose of the CPU is to execute programs that are stored in memory by fetching

each instruction, one after the next, and seeing to their execution. The CPU contains an

Arithmetic and Logic Unit (ALU), a Program Counter (PC), and a Control Unit (CU).

The ALU performs all of the arithmetic and logical operations. The CU fetches (gets) and

decodes each instruction from memory. The CU also determines the nature of each instruction.

The PC holds the location in memory of the next instruction that will be performed; it points to

the next instruction in memory. The components of the CPU are connected by a bus (often called

the front side bus or FSB). Figure 2.26 illustrates a common configuration for the CPU and

connected components. Figure 2.27 is an image of Intel’s Core 2 Extreme quad-core CPU chip.

Figure 2.28 illustrates the differences between one, two, and four processor (core) CPUs.

Figure 2.26: A common CPU configuration

Figure 2.27: The Intel Quad Core

Figure 2.28: Single, Dual, and Quad Core CPUs
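The fetch-and-execute cycle described above can be mimicked with a toy Python 3 sketch. The "memory," the instruction names, and the registers below are invented for illustration; a real CPU works on binary instruction codes, not strings:

# A toy sketch of the fetch-decode-execute cycle.
memory = [("LOAD", 5), ("ADD", 3), ("STORE", None), ("HALT", None)]
pc = 0          # program counter: address of the next instruction
ac = 0          # accumulator: the CPU's working register
stored = None

while True:
    op, arg = memory[pc]   # fetch the instruction the PC points to
    pc += 1                # advance the PC to the next instruction
    if op == "LOAD":       # decode and execute
        ac = arg
    elif op == "ADD":
        ac += arg
    elif op == "STORE":
        stored = ac
    elif op == "HALT":
        break

print(stored)   # 8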

Busses

Components of a computer must share information. This is done using busses (collections of parallel wires that can transfer data). The front side bus (FSB) connects the CPU to the memory

and to other components on the motherboard (the circuit board containing and connecting

components of the computer). Other buses transfer data between neighboring components.

Figure 2.29 illustrates a possible bus and CPU configuration.

Speed

The CPU or processor speed is measured in hertz, which is how many times per second the

internal clock in the processor can switch on and off (tick). This is also known as the clock

speed. Given that kilo is 1 thousand, mega is 1 million, giga is 1 billion, and tera is 1 trillion, a 3.1 GHz processor ticks about 3.1 billion times per second. More specifically, a 64-bit processor with a 3.1 GHz clock speed can process 64 bits, 3.1 billion times per second.
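The arithmetic from this paragraph, in Python 3 (a back-of-the-envelope figure that ignores real-world factors such as instruction complexity and pipelining):

# Rough throughput: bits per cycle times cycles per second.
bits_per_cycle = 64
clock_hz = 3.1e9
print(bits_per_cycle * clock_hz)   # 198400000000.0 bits per second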

Figure 2.29: Bus System Configuration

Computer Input and Output

The third major component set for any computer system (aside from memory and the CPU) is the

input and output (I/O). I/O can include any component that offers data or information to the

computer system and any component that collects or displays information from the computer

system. Input components might include the mouse, touch-screen, voice, scanners, and keyboard. Output components might include the monitor, speakers, and printers.

The configuration of the average computer system contains a motherboard. The motherboard

contains the CPU chip, extra slots for additional components (called PCI slots), a bus system,

and edge sockets (such as USB2 ports) for the connection of external components. It is common

for a computer system to have more than one bus, so that components can connect to each other

and to the CPU as needed. I/O devices have device controllers that manage the device and

connect it to the appropriate bus (for information transport).

Digital Logic and Gates: A Quick Look

Everything that is possible on a computer is made possible by the remarkable blend of logical

gates, electricity, and the use of binary encoding to link all data with ON and OFF (0 for OFF

and 1 for ON). While the intricacies of gates, transistors, and circuits are a bit beyond the scope

of this book, the following will offer a few examples of simple logical gates.

A logical gate can take inputs and will generate an output. Logically, a AND b will generate a

True if both a and b are true (or “1” or ON). There is a set of Boolean logic rules for logical

operators, such as AND, OR, and NOT. These can be further extended to operators, such as

NAND (not AND), NOR (not OR), and others. Together, logical gates can be built into physical

electrical circuits that control the flow of electricity. This is the method used to convert between

the use of electricity and encoding in binary. Figure 2.30 illustrates a few possible logical gates.

Figure 2.30: Illustration of logical gates for AND, NAND, NOT, and OR.
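The gate behavior in Figure 2.30 can be tabulated with Python 3's boolean operators; this small sketch prints the output of AND, NAND, and OR for every input pair, then NOT for each single input:

# Tabulate AND, NAND, and OR for all two-bit inputs, then NOT.
for a in (0, 1):
    for b in (0, 1):
        print(a, b,
              int(a and b),          # AND
              int(not (a and b)),    # NAND
              int(a or b))           # OR
for a in (0, 1):
    print(a, int(not a))             # NOT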

In a general sense, digital logic circuits are made up of many transistors that are connected to

form logical gates. An integrated circuit is composed of many transistors. More than 7 billion

transistors can be placed on an integrated circuit (chip). A CPU can have one or more chips

(processors).

2.4 The Internet, Networking and the World Wide Web

History of the Internet

In 1962, Licklider of MIT first noted his idea of a “Galactic Network” that might globally

connect a set of computers so that users might quickly share and access data. In 1961, Kleinrock,

a colleague of Licklider at MIT published a paper on “packet switching”. A packet is a small

collection of data or information that can be “sent” from one computer to another over a physical

(or, today, wireless) network. In 1965, Merrill and Roberts created the first wide area network (WAN) by physically connecting two computers using a dial-up method over phone lines (where the phone lines were the physical connection between the computers). In 1966, Roberts joined

DARPA and started development on ARPANET.

The Network Measurement Center at UCLA was the first node or connected computer system in

ARPANET (1969). Later, other nodes (connected computers) were added to the small network,

including UC Santa Barbara and the University of Utah. By 1971, ALOHAnet (University of

Hawaii’s network) was added, as was London’s University College and the Royal Radar

Establishment in Norway.

In 1972, Kahn demonstrated ARPANET at the International Computer Communication

Conference; this was the first public demonstration of this new technology. Also in 1972, the concept of email was introduced, and Ray Tomlinson at BBN wrote the first "send and receive" software for electronic mail over a network that connects computers.

Figure 2.31: The Beginning of the Internet

The Growth of Networks in the Internet

Networking computers together locally creates a LAN (local area network). It is common for many nearby computers to be networked to each other, creating a local network. From there,

two or more local networks can connect to each other, creating a wider network or WAN (wide

area network). Once most computers in the world were in some way connected (either directly or

indirectly), the Internet was born.

Figure 2.32: LANs and WAN

WWW, TCP/IP, and Protocols

As the networking and connection between computers and LANs and WANs increased, the need

for protocols became evident. In the 1970s, Vinton Cerf (working with Robert Kahn) invented the "Transmission Control Protocol" (TCP). Later, an "Internet Protocol" (IP) was added, and together they became the well-known and still utilized TCP/IP.

The TCP/IP protocol allowed the Internet to support the World Wide Web (WWW). During the

1980s, files and data were shared across computers using the Internet. In 1991, Tim Berners-Lee

(working for CERN, the Particle Physics Lab in Geneva Switzerland) introduced the idea of the

“web” as an area where anyone could share information directly on a page (rather than just send

information). In other words, the WWW (and all the Web Pages) rests on top of the Internet. The

Internet is the platform for distributed sharing.

Berners-Lee also developed an internet protocol known as the hypertext transfer protocol (HTTP), as well as a language for sending more than just text (hypertext) over the Internet. This language is known as HTML (Hypertext Markup Language). Specifically, HTML is the format that HTTP delivers.

Figure 2.33: The TCP/IP Protocol of 1973

In 1993, Mosaic, the browser that popularized the web, was created by a group of students and researchers at the University of Illinois (its developers later founded Netscape). By 1998, the Microsoft Windows operating system (Windows 98) came equipped with a browser (Internet Explorer: IE) and options for internet service. During the 1990s, Internet services grew worldwide, with the coffee shop "hot spot" becoming popular. In 1998, Google was founded; YouTube launched in 2005 and was acquired by Google the following year.

In 2016, 88.5% of the US population actively used the Internet and, through it, the World Wide Web (www.internetlivestats.com/internet-users/us/). The Internet is now wireless in many areas and

still maintains a foundation for the WWW to reside on. Figure 2.33 illustrates Internet usage

evolution.

Figure 2.33: Internet Use Over Time: 1995 - 2014

Details of the Internet and the Web

The World Wide Web (WWW) is a collection of Web Pages that reside on the interlinked

platform created by the Internet. While the Internet is the physical and often wireless network

that connects computers together, the web is the collection of trillions of pages that can be

accessed via the Internet connection. The only way to access a web page is to first connect to the

internet. The Web is not the Internet. Rather, the Web is a layer above the Internet.

A Browser is software that is managed by the operating system on each computer. A browser

can use a web page URL to locate a web page on the Internet and render it on a local computer

in a way that allows humans to understand and interact with it.

A web page URL (Uniform Resource Locator) is the address (physical location) of that page on

the Internet. All web pages are stored “somewhere.” Their residence is usually a server

computer that is on and connected to the Internet at all times. If a computer, acting as a server, is

shut off, all web pages residing on that server would no longer be accessible from an Internet

connection. A URL (from a computer standpoint) is an address of numbers, such as xxx.xxx.xxx.xxx, where each group of digits is an integer from 0 to 255 (so a group may have fewer than three digits).

For example, the URL 173.194.204.106 is one of Google's. Commands such as tracert and nslookup can resolve a domain name, such as www.google.com, into its numerical IP (Internet Protocol) address (see Figure 2.34).

Figure 2.34: The IP web address for the domain name: www.google.com using nslookup.
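The same lookup can be performed from Python 3 using the standard socket module; the address returned will vary by time and location:

# Resolve a domain name into its numerical IP address, like nslookup.
import socket
print(socket.gethostbyname("www.google.com"))   # e.g., 173.194.204.106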

Within a URL, the HTTP stands for Hypertext Transfer Protocol. This is recognized by all

browsers. The “www” portion of the URL notes that the web page resides within the World

Wide Web (rather than ftp or another location). From the human view, URLs are names, such as

http://www.georgetown.edu/campus-life. The "georgetown.edu" portion is resolved by the DNS (Domain Name System, which resolves names into IP addresses). The server hosting this page is

then located so that the page can be accessed. The “/campus-life” portion of this URL is a

subfolder on Georgetown University’s domain where this page resides. Figure 2.35 illustrates the

steps in locating a web page on the Internet.

Figure 2.35: (From HowStuffWorks.com) Illustration of Web Page Location
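Python 3's standard urllib.parse module can split a URL into the parts just described; a small sketch:

# Split a URL into its scheme (protocol), domain, and path portions.
from urllib.parse import urlparse

url = urlparse("http://www.georgetown.edu/campus-life")
print(url.scheme)   # http
print(url.netloc)   # www.georgetown.edu
print(url.path)     # /campus-life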

HTML

Web pages are written in HTML (Hypertext Markup Language), which is resolved and rendered by a browser. "Hypertext" conveys "more than text": web pages can carry links, images, video, music, and other structures that are beyond the limitations of plain text. HTML tags the contents of a page so that the browser "understands" how to render it (make it appear) to the user.

Figure 2.36 illustrates HTML code (left) and how it will appear after the browser interprets the

HTML (right).

Figure 2.36: HTML code (left) and its appearance after the browser interprets the HTML (right)

Within HTML, other languages such as JavaScript can be used to create more interactive and robust web pages. When users view a web page, they are seeing the result of HTML code that was sent to the browser (over the Internet) from the server where the page resides, and then rendered.
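Because a web page's HTML is simply text delivered over the Internet, it can be retrieved directly. The following is a minimal sketch using Python's standard urllib module (the URL is just an example):

#Retrieve the raw HTML of a page, as a browser would,
#using Python's standard urllib module.
from urllib.request import urlopen

with urlopen("http://www.example.com") as page:
    html = page.read().decode("utf-8")

print(html[:300])   #show the first 300 characters of the HTML source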


2.5 Programming Languages

Language Types

High Level Versus Low Level Languages

Programming (coding) is a method that humans use to communicate instructions to a computer. To that end, there is a plethora of programming languages, from very low level, such as machine code and assembly language, to very high level, such as Java or Python.

Lower level languages are "closer" to the language that the computer uses, which is binary; at some point, all computer processing is done in binary code. Machine language and assembly offer very few commands and can directly manipulate the contents of the registers in the CPU. Programming in low level languages can be cumbersome. However, low level programming allows direct control over what the computer does. Figure 2.37 illustrates machine language (binary) and its equivalent assembly language.

Figure 2.37: Machine language (binary) code and its equivalent assembly code

On the other side of the programming spectrum are high level languages. The goal of a high level language is to be clearer and easier for the programmer to use. High level languages are, in many ways, similar to English. Python 3, the topic of this book, is an example of a high level language. Figure 2.38 illustrates a very small Python program.


#Example Python 3 Program Code
def main():
    total = 0
    count = 0
    print("This program calculates the average.")
    print("Enter as many numbers as you like.")
    print("Enter STOP when finished.")
    userinput = input("Enter next number or STOP to end: ")
    while userinput[0] != "S" and userinput[0] != "s":
        count = count + 1
        total = total + float(userinput)
        userinput = input("Enter the next number: ")
    if count > 0:   #at least one number was entered
        average = total / count
        print("The avg is ", average)
    else:
        print("No numbers were entered.")

main()

Because high level languages are more similar to English than to binary, they must be translated so that the computer can execute them. High level languages are either interpreted or compiled.

Interpreters and Compilers

Strictly speaking, if a program is fully compiled and then run, all of the high level English-like code is pre-converted to the appropriate machine code. It is common to state that a program is "compiled into an executable file," for example. In a fully compiled program, the code is completely converted to machine code before it is run (executed). In an interpreted language, a "middle program" (the interpreter) reads each line of code and converts it to machine code during the execution process. In other words, an interpreted program is converted to machine code as it executes, rather than before it executes. There are variations and combinations of these methods as well.
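Python itself is an example of a combination: source code is first compiled to an intermediate bytecode, which the interpreter then executes. A minimal sketch using Python's standard dis module makes this visible (the function is just an example; the instruction names vary by Python version):

#Display the bytecode that Python compiles a function into,
#using the standard dis module.
import dis

def add(a, b):
    return a + b

dis.dis(add)   #prints the bytecode instructions for add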


Programming Paradigms and Related Languages

While all programming languages are used to communicate instructions to computers, the application and goal of the program can significantly affect the choice of programming language. For example, if coding for the Web, languages such as HTML, JavaScript/CSS, PHP, or Python might be selected. If coding complex mathematical applications, languages such as C, C++, Python, or Matlab might be selected. Some languages, like Python, are highly versatile and offer options for several platforms and applications, while other languages are very specific to certain goals, such as LISP for AI programming.

Packages and Libraries

Many high level languages offer the option of adding functionality through the importation of packages and libraries. For example, Python offers Matplotlib for graphics, NumPy and SciPy for math and computation, Scrapy for web scraping, Pygame for game development, pandas for data analysis, and hundreds of other options. Python also comes in different "flavors," such as CPython, Jython, IronPython, and Anaconda Python (used in this book). Other high level languages have similar options.
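As a brief illustration, the sketch below imports two of the packages named above and plots a sine curve; it assumes NumPy and Matplotlib are installed (both ship with Anaconda Python):

#Import packages to extend Python: NumPy for computation,
#Matplotlib for graphics.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)   #100 evenly spaced values from 0 to 10
plt.plot(x, np.sin(x))        #plot the sine of each value
plt.show()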

Language Popularity

Programming languages evolve through time, gaining and losing use and popularity. Some core languages, such as C and Java, have been very persistent. Others, such as Python and R, have grown significantly in popularity, use, and interoperability. Figure 2.39, taken from IEEE Spectrum (2016), shows the IEEE rankings of each language for overall development value. Figure 2.40 shows both the average salary and the general application focus of each of the more popular languages of 2016. In both cases, Python, the focus of this book, is near the top.


Figure 2.39: IEEE Ranked Top Ten Programming Languages of 2016

Programming Paradigms

While most high level languages can be written in more than one paradigm, there are a few basic distinctions between programming styles and the languages that follow these models.

Functional

Functional programming languages use functions as their primary structure. Functional languages are often used for artificial intelligence programming as well as mathematical programming. In a functional language, functions can be passed to and returned from other functions, and even created by other functions. Nested functions and recursion are also common in this paradigm. LISP is a functional language.
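Although Python is not a purely functional language, it supports this style; the following is a minimal sketch (the function names are just examples):

#Functional style in Python: functions created by, passed to,
#and returned from other functions, plus recursion.
def make_multiplier(k):
    def multiply(x):          #a nested function created by make_multiplier
        return k * x
    return multiply           #return the newly created function

def apply_twice(f, x):
    return f(f(x))            #f is a function passed in as an argument

def factorial(n):
    return 1 if n <= 1 else n * factorial(n - 1)   #recursion

double = make_multiplier(2)
print(apply_twice(double, 5))   #prints 20
print(factorial(5))             #prints 120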

Modular

Modular languages also use functions but are more focused on structuring the overall program into individual modules, or collections of code. A module of code is a set of


statements that together perform a set of related tasks. A module can be a user-defined function, a decision structure, a loop, or a combination of these constructs. Pascal and C are modular languages by nature, and Python can be written modularly as well.

Tagged Languages

Some languages, such as HTML and XML, use predefined tags to structure data and information and to incorporate other languages. HTML, for example, contains a tag called "<SCRIPT>" that allows for the inclusion of JavaScript within the HTML code. HTML is not modular and is called a "mark-up" language because it marks up a page for interpretation by a browser for Web viewing.

Scripting Languages

Many languages can be used for scripting. The word "scripting" is computer slang for writing a smaller program that performs a very specific task. Perl is a great scripting language, for example, but Python can also be used for this purpose. JavaScript is also a very common scripting language for web development.

Object-Oriented Languages

In the same way that modular languages separate collections of code into smaller modules that perform a few related tasks, object-oriented (OO) languages extend the idea of structured modules into classes and objects. OO languages allow the user to create new classes (and therefore objects of those classes). A class is a collection of code and data that together encapsulate the operations and requirements of its objects. OO languages, via classes, can offer options such as inheritance, encapsulation, and polymorphism. C++ and Java are OO languages. Python is also OO, but can be used as a modular language or even a scripting language.
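A minimal Python sketch of the OO style (the class names are just examples), showing a class, inheritance, and an overridden method:

#Object-oriented style in Python: a base class, a derived class,
#and polymorphism via an overridden method.
class Shape:
    def __init__(self, name):
        self.name = name      #data encapsulated within the object
    def area(self):
        return 0.0            #default behavior, overridden by subclasses

class Circle(Shape):          #Circle inherits from Shape
    def __init__(self, radius):
        super().__init__("circle")
        self.radius = radius
    def area(self):           #polymorphism: overrides Shape.area
        return 3.14159 * self.radius ** 2

c = Circle(2.0)
print(c.name, c.area())       #prints: circle 12.56636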

Again, many languages can be used to create large-scale software, small-scale programs, very small scripts, functions, classes, data structures, computation, web development, and so on, but some languages are more appropriate and more applicable to certain tasks. For example, it would not be possible to write a large-scale application such as Microsoft Office in HTML (a tagging language), or even in JavaScript. Microsoft Office is written in C and C++, which are often used for such large-scale, complex applications. Similarly, a small script, perhaps one that parses through files and folders, can be written more easily in Perl than in C, since C has a lot more overhead and complexity than is needed for a smaller goal.
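As a minimal illustration of such a script in this book's language, Python (rather than Perl), the following walks a folder tree and lists every file ending in ".txt"; the starting folder and the extension are just examples:

#A small script: walk a folder tree and print every .txt file found.
import os

for folder, subfolders, files in os.walk("."):   #start in the current folder
    for name in files:
        if name.endswith(".txt"):
            print(os.path.join(folder, name))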

Further topics in this area can be investigated in textbooks that focus on programming

language principles (PLP).


Figure 2.40: Applications and Salary Range for Programming Languages

2.6 Development, Analysis, and Complexity

Creating programs is a combination of art and science. It is necessary to know the language, but knowing the language alone is not enough to be a good coder. In other words, knowing English does not make a person a poet. There are stages in the development process, and they require practice, patience, and experience.

Development Stages

The first stage in the development of any project, large or small, is to fully understand the requirements and specifications of the problem to be solved. This is the


planning stage. Part of this first stage is also considering all possible inputs (which is often not simple), as well as the desired outputs.

The second stage is the design and development stage. This stage often involves the use of

diagrams, such as flowcharts, UML diagrams, ERD diagrams, or other illustrations that assist in

the planning and clarification of the overall flow of the program. During this stage, new

functions or classes can be conceived of, and an overall outline or blueprint of the project can be

created with possible goals and examples of I/O.

The third stage is to begin programming, but one should do so carefully and slowly, and with continuous testing. Writing code is not like writing a book, or even a chapter in a book. Normally, when people write, they create several pages and then proofread for errors and changes. This method cannot be employed in the development of programs. Writing code in small sections, and testing often and continuously, helps in the avoidance of bugs.

Bugs

Bugs are errors within a program, either syntactical or logical. Locating bugs in a program can be very time consuming, and in some cases highly improbable if too much complexity has been added before the bugs have been eliminated. The word "bug" gained popularity in 1947 as the result of a moth flying into the Harvard Mark II. The moth (bug) was located and removed by Grace Murray Hopper, who was programming the machine at that time. Once the bug (moth) was removed, the program ran successfully, and the word "bug" stuck.

One of the most important parts of programming is code testing and error elimination. The

responsibility of error location and correction (both syntactical and logical) is that of the

programmer.

The fourth stage is testing and verification. Once the program is written and it runs successfully,

it must be tested thoroughly and with many different, expected, and unexpected inputs. During

the test stage, outputs will be confirmed and updates made as needed. For large software

projects, many companies employ test engineers who test and run code, document the results,

and suggest fixes and improvements. This area might also include improving the user interface

or experience, considering usability for users with disabilities, and further improving issues of

human-computer interaction.

The fifth stage, which is best initiated at the beginning and continued through all stages, is documentation. The program, the expected input and output, limitations, and possible future updates are all part of the documentation. Inside the program, comments are placed to explain sections of code, functions and classes, and expected I/O.

The sixth stage (when applicable) is maintenance. This might include adding or changing code

within the program. Commenting programs as they are written will help considerably if future


updates to the code are required. Keep in mind that someone other than yourself might have to

update or change the code.

Algorithms and Complexity

An algorithm is a set of steps or a method that can be used to accomplish a task. By definition,

an algorithm always successfully and correctly completes a given task, or solves a given

problem. However, not all tasks and problems have viable algorithms (yet).

As a simple example, consider the goal of sorting. Imagine that the input is a collection (perhaps an array or vector) of integers, in no particular order and not necessarily consecutive. The goal of a sorting algorithm is to take the dataset of integers as input and then output the set of integers in sorted order. Conceptually, this is easy to envision:

Input

[3, 10, 56, 7, 2, 87, 45, 67, 1, 9]

Output

[1, 2, 3, 7, 9, 10, 45, 56, 67, 87]

There are many algorithms that can sort values. Some are common and have well-known names, such as bubble sort, heap sort, insertion sort, and merge sort. It is also possible to

create a new sorting algorithm, especially if there is extra knowledge about the format of the

input that allows for more creative methods.
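As a concrete sketch of one of these methods, the following is a minimal insertion sort in Python, applied to the example input above:

#Insertion sort: repeatedly insert each item into its correct
#position among the already-sorted items to its left.
def insertion_sort(values):
    for i in range(1, len(values)):
        current = values[i]
        j = i - 1
        while j >= 0 and values[j] > current:   #shift larger items right
            values[j + 1] = values[j]
            j = j - 1
        values[j + 1] = current
    return values

print(insertion_sort([3, 10, 56, 7, 2, 87, 45, 67, 1, 9]))
#prints [1, 2, 3, 7, 9, 10, 45, 56, 67, 87]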

Because there are often many algorithms that can solve a problem, other factors are considered in algorithm analysis and selection, such as space (how much memory the method requires) and time (how long it will take to complete for a dataset of size n).

The area of the analysis of algorithms studies different options for solving problems and the

space and time requirements of those options. For smaller datasets, such analysis is often ignored

because 1 nanosecond versus 3 nanoseconds may not be relevant. However, some datasets are

huge, such as searching the Web for an image, the study of Twitter or Facebook data, the study

of bio-physics or astrophysics data, or finding a specific phrase from all texts in the world.

In cases where the size of the data is "big," the algorithms used to analyze the data become critical. This concern involves the area of computer science known as algorithm analysis, as well as complexity theory and big data analytics.


Complexity, Computability, Intractability

Some problems are so complex that they may not be computable (at least in our lifetime). Such

problems may be called intractable.

Common questions asked in theoretical computer science are the following:

1) What are the limits of what a computer can compute?

2) Are there questions or problems that no computer(s) can solve?

While this is an enormous and exciting area of research and interest, it is well beyond the scope

of this textbook; however, it seems worthwhile to define a few important terms and to invite the

reader to explore this topic further.

The Turing Machine

In 1936, Alan Turing introduced a theoretical "machine," grounded in mathematics, that could simulate any computer algorithm, no matter the complexity. Because of this remarkable invention of mathematics and theory, many regard Turing as one of the fathers of computer science. The reference for Turing's paper is: "On Computable Numbers, with an Application to the Entscheidungsproblem," Proceedings of the London Mathematical Society, Series 2, Volume 42 (1936-37), pp. 230-265.


Figure 2.41: Turing Machine Visual Illustration

The Halting Problem

The famous Halting Problem asks the following question: given the description of a program and an input for that program, will the program ever halt (stop) when run on that input?

Alan Turing proved that an algorithm to solve the halting problem for all possible programs and

related inputs does not exist. In other words, the Halting Problem is undecidable. This leads to

further philosophical issues involving computers and decisions.
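The heart of Turing's proof can be sketched in Python. Assume, for contradiction, a hypothetical function halts(program, data) that always answers correctly; the names here are illustrative only, and by the theorem no real implementation of halts can exist:

#Sketch of Turing's contradiction. halts() is a hypothetical oracle;
#no correct implementation is possible -- that is the theorem.
def halts(program, data):
    raise NotImplementedError("no such decision procedure exists")

def paradox(program):
    if halts(program, program):   #if paradox(paradox) would halt...
        while True:               #...then loop forever
            pass
    else:
        return                    #...otherwise halt immediately

#Asking whether paradox(paradox) halts leads to a contradiction
#either way, so the assumed halts() function cannot exist.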

An undecidable problem, by definition, is a problem for which no single algorithm can be created that always leads to the correct answer. The Halting Problem is an example of an undecidable problem, and there are several others. To review a few, the following paper is recommended: http://www-math.mit.edu/~poonen/papers/sampler.pdf

Complexity

As algorithms and their analysis progressed, it became evident that certain problems fall into

certain “classes” of difficulty.


Within the study of complexity theory, there are labels such as P, NP, NP-hard, and NP-complete. Here is where one finds the famous question of whether P = NP. The P stands for polynomial time. If an algorithm runs in P-time, its running time is on the order of some polynomial in the input size, such as linear time O(n), logarithmic time O(log n), or quadratic time O(n²).
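As a small illustration of the difference between two polynomial running times, the sketch below answers the same question (does a list contain a duplicate?) in quadratic time and in linear time; the function names are just examples:

#Two polynomial-time approaches to the same question.
def has_duplicates_quadratic(values):
    #O(n^2): compare every pair of items.
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            if values[i] == values[j]:
                return True
    return False

def has_duplicates_linear(values):
    #O(n): one pass, remembering items already seen.
    seen = set()
    for v in values:
        if v in seen:
            return True
        seen.add(v)
    return False

print(has_duplicates_quadratic([3, 1, 4, 1, 5]))   #prints True
print(has_duplicates_linear([2, 7, 1, 8]))         #prints False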

The NP stands for nondeterministic polynomial time. For problems in NP, there may be no known polynomial time algorithm with which to solve the problem, but, if a solution is found, it can be verified as correct in polynomial time. In the case of NP problems, a heuristic is often used to make a good guess at the solution. The solution is then verified (or not) in polynomial time. Keep in mind that polynomial time is often a manageable amount of time. A heuristic is similar to an algorithm in that it seeks to solve a problem or perform a computation. The key difference is that an algorithm always reaches the correct and best solution, while a heuristic may or may not reach a solution, and the solution it reaches may or may not be optimal.

NP-hard is a collection of problems that are at least as hard to solve as the hardest problems in NP.

NP-complete problems are a collection of problems considered to be the "hardest" problems in NP. More formally, NP-completeness can be defined using an area of computer science called "reduction" theory, which is well beyond the scope of this book. NP-complete problems have no known P-time solution and are considered intractable. However, there is no proof that a P-time solution does not exist for NP problems (including NP-complete problems); one has just not been found yet, if it exists. This is the core of the famous question: does P = NP?

An intractable problem is a problem that takes (for our current methods) an impossible amount of time to solve. Such problems can only be considered and resolved as very small subsets of the actual problem, or with the use of heuristics. For example, protein folding is considered to be intractable and a member of the NP-hard class of problems. If only two or three amino acids and a finite number of water molecules are considered, a solution can be discovered. However, at protein scale, with hundreds of amino acids and several hundreds of other neighboring molecules, the question of how the linear sequence of amino acids folds (quickly) into the three-dimensional, biologically active protein that is formed in the cell cytoplasm is algorithmically intractable.


Figure 2.42: Complexity sets. NP is nondeterministic polynomial time; P is polynomial time; NPC is NP-complete. (The diagram assumes P ≠ NP, which has been neither proved nor disproved.)

Summary

Chapter 2 offers a broad and brief overview of several core areas of computer science, including

history, organization, networking, and complexity. The goal of Chapter 2 is to offer a conceptual

foundation and a starting point into the area of computer science and programming. The

remainder of this book will focus on Python 3 and programming constructs and concepts.