McGill University:
Department of Computer Science
Winter 1997 Class Notes for 308-251B

DATA STRUCTURES AND ALGORITHMS

Topic #2: MODELS OF COMPUTATION. COMPLEXITY


Complexity: Models of Computation

The following three models are examples of virtual, ideal machines:

The Turing Machine

Alan Turing (1912-1954) proposed the Turing machine in 1936. A brilliant mathematician and an individualistic thinker, he contributed to breakthroughs in cryptanalysis (which made possible the breaking of German codes in WWII) as well as to computer science.

The basis for the Turing machine is a tape that extends infinitely in both directions. The tape is divided into cells of a fixed size, where each cell can contain either a 1 or a 0. Using this format, the tape contains both data and instructions.

The Turing machine also consists of a unit capable of reading and writing individual bits, called the head. The head stays in a fixed position; it is the tape that moves past it.

A finite automaton controls the head. An automaton is a control mechanism designed to follow precoded instructions. In the case of the Turing machine, the tape contains these instructions. The finite automaton has a finite number of states along with transitions between these states. The course of action of the automaton is determined by its current state and the input read from the tape. For example, an automaton may be designed to go to state B if it reads a 1 from the tape while in state A. An everyday example of a finite automaton with two states is a light switch. In this example, the two states are on and off. If the switch is on, pushing it is the input that causes a transition to the off state.
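As a quick illustration, here is a minimal Java sketch of the light-switch automaton (Java is also the language of the applet below); the class and method names are ours, chosen for the example:

// A two-state finite automaton: the light switch. The only input is a
// push, which triggers a transition from each state to the other.
public class LightSwitch {
    enum State { ON, OFF }

    private State state = State.OFF;  // initial state

    // Transition function: current state + input (a push) -> next state.
    public void push() {
        state = (state == State.ON) ? State.OFF : State.ON;
    }

    public State getState() { return state; }

    public static void main(String[] args) {
        LightSwitch s = new LightSwitch();
        s.push();                          // OFF -> ON
        System.out.println(s.getState());  // prints ON
    }
}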

Using four instructions, the Turing machine can perform a variety of tasks, such as evaluating mathematical expressions and even sorting (however, such programs would be very difficult to implement). These four instructions are:

  1. Read the bit to which the head is pointing.
  2. Write to the bit to which the head is pointing.
  3. Move tape to the right.
  4. Move tape to the left.
Each instruction takes one time unit. Also, implicit in the above four instructions is the option to do nothing.
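
To make the four instructions concrete, here is a minimal Java sketch of the tape and head; the representation (a map from cell positions to bits, with unwritten cells reading as 0) is our own choice for the example:

import java.util.HashMap;
import java.util.Map;

// A sketch of the Turing machine's tape and its four primitive
// instructions. A map from position to bit models a tape that is
// infinite in both directions.
public class Tape {
    private final Map<Integer, Integer> cells = new HashMap<>();
    private int head = 0;  // index of the cell currently under the head

    public int read()          { return cells.getOrDefault(head, 0); }  // 1. read
    public void write(int bit) { cells.put(head, bit); }                // 2. write

    // 3. and 4.: moving the tape right brings the cell to the left
    // under the fixed head, and vice versa.
    public void moveTapeRight() { head--; }
    public void moveTapeLeft()  { head++; }
}

Each of these methods corresponds to one instruction and takes one time unit.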

The Random Access Machine (RAM)

The RAM consists of a finite set of registers and a finite memory. Memory is divided into words, where each word has a unique integer address between 1 and n. The contents of a word can be an integer or a real number, including numbers with infinite expansions such as pi. These values can represent either data or instructions.

Address computations are possible on the RAM. In other words, it is possible to take an address and add a value to it. This is especially useful for arrays. Referencing elements in an array becomes a simple case of knowing the address of the first element and then adding an offset to that address to get to the desired element.
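
For example, here is a small Java sketch that models the RAM's memory as an array of words and reaches an array element by adding an offset to a base address (the layout and addresses are illustrative):

// Address computation on the RAM model: memory is an array of words
// with integer addresses, and element i of an array stored at address
// 'base' lives at address base + i.
public class RamAddressing {
    public static void main(String[] args) {
        int[] memory = new int[100];   // words with addresses 0..99

        int base = 10;                 // address of A[0]
        memory[base]     = 5;          // A[0]
        memory[base + 1] = 7;          // A[1]
        memory[base + 2] = 9;          // A[2]

        int i = 2;
        System.out.println(memory[base + i]);  // prints 9, an O(1) fetch
    }
}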

Fetching and storing items from memory involve going to the appropriate address stored in a register. Fetching or storing a value takes the same amount of time, regardless of where the value is stored in memory.

On the RAM, all operations are performed in the registers. Standard operations such as fetch and store are all O(1), meaning they take constant time.


The Pointer-Based Machine

The pointer-based machine is based on a collection of cells, or nodes, whose number can grow without limit. Each node contains a record, and pointers link the nodes to one another. As in the RAM, this machine contains a group of registers.

Also like the RAM, pointer-based machines execute standard operations such as store and fetch in O(1), or constant, time.

Unlike the RAM, each cell has no positional relationship to the others, meaning pointer-based machines don't support address computations. As a result, arrays and hashing don't exist on these machines. However, pointer-based machines have the advantage of containing an unlimited number of cells.
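
A short Java sketch of a pointer-machine cell, assuming each record holds a single value and a single outgoing pointer (as in a linked list):

// A pointer-machine cell: a record plus a pointer to another cell.
// Cells are reached only by following pointers; there is no address
// arithmetic, so arrays and hashing are unavailable.
public class Cell {
    int value;   // the record stored in this cell
    Cell next;   // pointer to another cell (or null)

    Cell(int value, Cell next) {
        this.value = value;
        this.next = next;
    }

    public static void main(String[] args) {
        // Build a chain of three cells: 1 -> 2 -> 3.
        Cell head = new Cell(1, new Cell(2, new Cell(3, null)));

        // Each dereference is an O(1) fetch, but reaching the k-th
        // cell requires following k pointers.
        for (Cell c = head; c != null; c = c.next)
            System.out.println(c.value);
    }
}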


Counting Time

There are three approaches to counting the time taken by an algorithm. The one we will be concerned with in this class is the first, the uniform cost measure.
  1. Uniform Cost Measure: The idea behind this method is that each operation takes a constant amount of time. The advantage of this method is simplicity. In most cases it gives a good estimate of the running time, but there are instances where it is overly simple. For example, adding two 100,000-digit numbers will take more time than adding two single-digit numbers; saying they both take the same constant time is misleading.

  2. Logarithmic Cost Measure: This method offers a better estimate in the example mentioned above. Its main premise is that the time of an operation is proportional to lg n, where n is the size of the values involved. Using the above example, adding two numbers will be O(lg n) instead of O(1). The reasoning stems from the fact that adding two integers takes time proportional to the number of bits in the integers, and a value of size n requires only about lg n bits to store in memory (a small Java check of this bound appears after this list). For example, suppose we have a value n. In binary,

    n = 2^k + c_(k-1)*2^(k-1) + ... + c_1*2 + c_0,   where each c_i is 0 or 1

    and k is the largest power of two in the representation of n. It follows that

    2^k <= n < 2^(k+1).

    Taking the log of both sides, we get

    k <= lg n < k + 1.

    In other words, the representation of n in memory takes k + 1, or roughly lg n, bits. If the time taken to add two numbers depends on their size (i.e. the space the values take up in memory), the time will be proportional to lg n. This method is more realistic, but it tends to be impractical since it requires more sophisticated computations.

  3. A Compromise: This method is a combination of the two above. If an operation involves values less than or equal to n^c, where c is some large finite constant, that operation is considered to take constant time. Operations where the values are greater than n^c take time proportional to lg n. The reason for this is similar to before: a value of size at most n^c takes at most lg(n^c) = c lg n bits, a constant multiple of lg n, so treating operations on such values as constant-time distorts the count by at most a constant factor.

Remark: "lg n" is equivalent to "log base 2 of n".
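
As a quick check of the lg n bound on representation size, here is a small Java example comparing the bit length of an integer (floor(lg n) + 1) with lg n itself:

// An integer n is written in binary with floor(lg n) + 1 bits, i.e.
// roughly lg n bits.
public class BitLength {
    public static void main(String[] args) {
        for (int n : new int[] {1, 7, 8, 100000, 1 << 20}) {
            int bits = Integer.toBinaryString(n).length();
            double lg = Math.log(n) / Math.log(2);  // lg n = log base 2 of n
            System.out.printf("n = %d: %d bits, lg n = %.2f%n", n, bits, lg);
        }
    }
}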


Introduction To Running Time

The exact running time of an algorithm depends on many factors, including the input size, the input itself, the compiler, and the machine. The true time of an algorithm would have to be a function of all these factors:

    T = f(input, input size, compiler, machine, ...)

In general, it is impractical to try to include factors such as the compiler in calculating values for the time T. Instead, we find an integer value for T by counting the number of basic instructions executed before halting. Two approaches to finding T are the worst case running time and the average case running time.

The worst case running time is the slowest possible time an algorithm may take; the algorithm is absolutely guaranteed never to exceed it. Over all inputs x of size n:

Worst case:

    T_worst(n) = max { T(x) : x is an input of size n }

The average running time is the time the algorithm will usually take, weighting each input by its probability of occurring. This type of analysis is best for evaluating applications such as editors and compilers, where the programs are called often. In these cases, worst case inputs are rare, and the average time is more indicative of the algorithm's performance. Writing P(x) for the probability of input x:

Average case:

    T_avg(n) = sum over inputs x of size n of P(x) * T(x)
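
To make the distinction concrete, here is a small Java example: sequential search in an unsorted array. The worst case examines all n elements, while the average over successful searches (each position equally likely) examines about (n + 1) / 2:

// Sequential search: worst case n comparisons (target last or absent);
// average case about (n + 1) / 2 comparisons when the target is
// equally likely to be at any of the n positions.
public class SequentialSearch {
    static int comparisons;

    static int find(int[] a, int target) {
        for (int i = 0; i < a.length; i++) {
            comparisons++;
            if (a[i] == target) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {4, 8, 15, 16, 23, 42};   // n = 6

        comparisons = 0;
        find(a, 42);                        // worst successful case
        System.out.println("worst case: " + comparisons);         // 6

        long total = 0;
        for (int x : a) {                   // average over all targets
            comparisons = 0;
            find(a, x);
            total += comparisons;
        }
        System.out.println("average case: " + (double) total / a.length);  // 3.5
    }
}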


Java applet

The following applet is a simulation of a Turing machine that compares (the lengths of) two strings of 1's. The initial contents of the tape are of the form *string1?string2*. When the machine reaches its terminal state, the question mark is replaced by the appropriate relational operator, i.e. one of "=", "<" or ">". You can specify the arguments (the strings to be compared) by typing them in the two input boxes. When you press the "Load" button, the tape contents are modified to reflect your choices. You can then either step through the machine's states one by one (by pressing "Step") or make it run automatically until it reaches its terminal state (by pressing "Auto"). A description of the current state of the machine can be found in the bottom part of the applet. The states also appear in tabular form below.

No attempt has been made to make this machine optimal (i.e. to minimize either the number of states or the time it takes). However, the simulation illustrates one of the main downsides of Turing machines. From a theoretical point of view, Turing machines can perform any computable task; in practice, the implementation of even a very simple computational process, such as comparing two strings, takes a relatively large number of states. This makes the implementation of complex algorithms on Turing machines all but impossible.


Table of states
State name | Read | Write | Move | Next state
-----------+------+-------+------+-----------
START      |      |       |  R   | LR1
LR1        |  0   |       |  R   | LR1
LR1        |  1   |  0    |  L   | LL1
LR1        |  *   |       |  L   | LL3
LL1        |  0   |       |  L   | LL1
LL1        |  ?   |       |  L   | LL2
LL2        |  0   |       |  L   | LL2
LL2        |  1   |  0    |  R   | LR2
LL2        |  *   |       |  R   | LR3
LR2        |  0   |       |  R   | LR2
LR2        |  ?   |       |  R   | LR1
LR3        |  0   |  1    |  R   | LR3
LR3        |  ?   |  <    |  R   | LR4
LR4        |  0   |  1    |  R   | LR4
LR4        |  1   |       |      | STOP
LR4        |  *   |       |      | STOP
LL3        |  0   |  1    |  L   | LL3
LL3        |  ?   |       |  L   | LL4
LL4        |  0   |       |  L   | LL4
LL4        |  1   |       |  R   | LR5
LL4        |  *   |       |  R   | LR6
LR5        |  0   |  1    |  R   | LR5
LR5        |  ?   |  >    |      | STOP
LR6        |  0   |  1    |  R   | LR6
LR6        |  ?   |  =    |      | STOP
STOP       |      |       |      |
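
The transition table above is enough to drive a complete simulation. Below is a minimal table-driven Java sketch (not the applet's actual source), assuming that R and L in the Move column move the head right and left along the tape contents, that the head starts on the question mark, and that a blank Read field matches any symbol:

// A table-driven simulation of the string-comparison machine. Each
// transition is {state, read, write, move, next}; an empty field means
// "any symbol" (read), "write nothing", or "do not move".
public class CompareMachine {
    static final String[][] TABLE = {
        {"START", "",  "",  "R", "LR1"},
        {"LR1", "0", "",  "R", "LR1"},  {"LR1", "1", "0", "L", "LL1"},
        {"LR1", "*", "",  "L", "LL3"},
        {"LL1", "0", "",  "L", "LL1"},  {"LL1", "?", "",  "L", "LL2"},
        {"LL2", "0", "",  "L", "LL2"},  {"LL2", "1", "0", "R", "LR2"},
        {"LL2", "*", "",  "R", "LR3"},
        {"LR2", "0", "",  "R", "LR2"},  {"LR2", "?", "",  "R", "LR1"},
        {"LR3", "0", "1", "R", "LR3"},  {"LR3", "?", "<", "R", "LR4"},
        {"LR4", "0", "1", "R", "LR4"},  {"LR4", "1", "",  "",  "STOP"},
        {"LR4", "*", "",  "",  "STOP"},
        {"LL3", "0", "1", "L", "LL3"},  {"LL3", "?", "",  "L", "LL4"},
        {"LL4", "0", "",  "L", "LL4"},  {"LL4", "1", "",  "R", "LR5"},
        {"LL4", "*", "",  "R", "LR6"},
        {"LR5", "0", "1", "R", "LR5"},  {"LR5", "?", ">", "",  "STOP"},
        {"LR6", "0", "1", "R", "LR6"},  {"LR6", "?", "=", "",  "STOP"},
    };

    static String[] lookup(String state, char read) {
        for (String[] t : TABLE)
            if (t[0].equals(state) && (t[1].isEmpty() || t[1].charAt(0) == read))
                return t;
        return null;
    }

    public static String run(String s1, String s2) {
        char[] tape = ("*" + s1 + "?" + s2 + "*").toCharArray();
        int head = s1.length() + 1;     // start on the question mark
        String state = "START";
        while (!state.equals("STOP")) {
            String[] t = lookup(state, tape[head]);
            if (t == null) throw new IllegalStateException("no transition");
            if (!t[2].isEmpty()) tape[head] = t[2].charAt(0);  // write
            if (t[3].equals("R")) head++;                      // move
            if (t[3].equals("L")) head--;
            state = t[4];
        }
        return new String(tape);
    }

    public static void main(String[] args) {
        System.out.println(run("111", "11"));  // prints *111>11*
        System.out.println(run("11",  "11"));  // prints *11=11*
        System.out.println(run("1",  "111"));  // prints *1<111*
    }
}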







Links to related material

Turing Machines

Turing Machines

Background, definitions, and examples. - Barney J. Cabrera - University of California at San Diego.

Turing's World

Turing Machine. - Barwise & Etchemendy from CSLI.

Alan Turing Homepage

Website dedicated to Alan Turing. - By Andrew Hodges.

Turing Software Demo

Click to download for MS-DOS systems. - From Rutgers University.

Turing Machine Simulator (TMS 1.0)

User programmable, examples included are a binary shifter and binary adder. - James Arvo and Boris Dimitrov of California Institute of Technology.

Alan Turing

Part of a series of lectures describing the development of computers and computer science. Computers from the Past to the Present - Michelle A. Hoyle.

Turing Machine

Programmable applet demonstrating addition and multiplication. - Buena Vista University Java Team

Turing Machine Simulator

Programmable. Demo of palindrome detection, subtraction and some "busy beaver" examples. - Suzanne Skinner - Ohio University.

RAM

A Random Access Machine

Definition of RAM - Dave Matuszek

Description of RAM and PRAM.

Roberto Tamassia - Brown University.

RAM and PRAM

Better Description. - Ulrich Ruede

Time Stuff

Analysis

Analysis of Sequential Algorithms. - Roberto Tamassia - Brown University.

Everything

Complexity Theory

Computability & Complexity. - Steve Linton - University of St. Andrews.



Web page creators

Alexandru Ghitza, Alex Hamilton, Aline Normoyle, Alvin Loh.

Puleeze don't hurt us.

Last updated January 23, 1997 by Alvin Loh.

Copyright © 1997, Alexandru Ghitza, Alex Hamilton, Aline Normoyle, Alvin Loh. All rights reserved. Reproduction of all or part of this work is permitted for educational or research use provided that this copyright notice is included in any copy. Disclaimer: this collection of notes is experimental, and does not serve as-is as a substitute for attendance in the actual class.