What The Heck Is Graph Theory
Before we start -do me a favour. Forget that the word 'graph' usually means a chart
with an x and a y axis. Assume that we have an entirely new meaning of the word. Don't
sit there thinking "I know what a graph is, I like graphs, any moment we are going to
start ploting points on a graph". Wrong, Wrong, Wrong! It may help if you make up a
word such as "bazonka" and replace every occurance of the word "graph" with the word "bazonka".
Introduction to Bazonkas (easy)
The mathematical use of the word "set" is fine. It fits with our everyday use of the word.
In the Lerch Grossman case we are particularly interested in a set of "blocks" -the chunks of rock that
make up the geology in our area of interest. Graph theory parlance has us call the elements of this
set vertices. Imagine a second set, this time a set of connections between the vertices, we call these connections
edges. It is common to draw vertices as points and edges as line between them. Thats all there
is -a graph is a set of things (vertices) and a set (or a collection) of connections between them (edges).
Sometimes these edges are defined as an pair of vertices.
Graph with set of vertices {A,B,C,D}
and set of edges {(A,B),(A,C),(C,B)(C,D)(B,D)}
Notice the use of braces {} to indicate a set.
In the LG algorithm we encounter two graphs. Though the set of vertices is the same (the
set of blocks), the set of the edges is different.
One represents information about the overburden of each graph (I call it the "slope graph").
This information does not change in the course of the solution. The other graph is a tree (explained
below) and changes as the solution proceeds. I call this the "tree graph".
The Graph Zoo
There are many types of graphs rather too many terms to describe them. Moreover not all authors
use the same terms in the same way. Mining engineers should not be troubled by this as they are
used to dealing with geologists. Important distinctions include:
- The null graph -no vertices, no edges.
- Multigraphs permit more than one edge between a given pair of vertices. So a graph might
Contain vertices {A,B,C} and have edges {(A,B),(A,B),(B,C)}. We don't meet multigraphs in the
Lerch Grossman Algorithm. However it illustrates a point about sets. Since in this diagram there
are two edges between A and B the pair (A,B) occurs twice. However, the nature of sets is that
something is either a member or it isn't. The way that we have written it we have (A,B) being a member twice.
We can expect a visit from the Math Police. The get out is to call it a "collection" of edges rather
than a "set".
- Directed graphs are connected by directed edges. In the Lerch Grossman paper directed
edges are called arcs. A directed edge has a sense of direction to it. We draw them with little
arrows on. We write the set of edges as an ordered pair. In a regular graph the edge (A,B) is the
same as the edge. For mathematicians directed graphs are usually viewed as a special case -they
are simple graphs with (if you like) arrows added. For the programmer directed graphs are easier to
deal with and you expend effort ensuring that your program knows that an edge (A,B) is the same
as one (B,A). Directed graphs are often called 'digraphs' for those wishing to seam familiar.
- Connected graphs are graphs in which there is a chain (a sequence of edges (A,B)(B,C)(C,D))... )
joining every pair of vertices. LG use the term 'path' on a directed graph for the same concept
but using directed edges. (Watch for other terms from other authors e.g. edge sequences).
The issue of connectedness is a little more complicated with digraphs but we don't need it here.
- Cyclic vs acyclic graphs. If there are loops in a graph its called cyclic if not its acyclic.
Formally a loop is a chain with the same first and last vertex.
- Trees a connected acyclic graph is called a tree. If you pick a vertex (at random) in a tree
and call it the "root" then you have a "rooted tree". L and G go on to define branches and twigs
(yawn). They mean what you'd think, more or less. Rooted trees have two important properties.
Even though we are in theory talking about an non-directed graph, every edge now has a special
direction which is the direction of the root. In other words you can if you want think of rooted
trees as directed graphs. As programmers we say "yes please" to this. The second point, which turns
out to be important in the LG algorithm is this. Every vertex, except the root has exactly one
edge on the root side. It may have any number (including zero) on the 'leaf' side. So if we have
a rooted tree we may be able to represent it as a data structure in which each record has exactly
one pointer pointing in the root direction. This turns out to be the case in the LG algorithm
-and is a good thing. You read it here first folks!
- The list goes on. With only the above properties we can ask a large number of intriguing
questions of a graph. For example, going back to Euler in 1763, we can ask whether it is possible
to find a circuit that will use all the edges exactly once (an euler circuit). Often, however, we need to add to
the basic graph concept. Vertices represent values or edges represent distances. Numbers and
sometimes colours are added. The menagerie of graph types and properties is extensive.
Some Diagrams
Multigraphs
Here Euler's Konigsberg bridge problem is shown
as a multigraph drawn in red. It is a multigraph because there are more than one edges between
certain vertices.(here representing bridges).
The background is intended to depict the layout of the the
town with bridges where the edges cross the river. Euler showed that (for this particular
multigraph problem) residents could not take a walk that would cross each bridge exactly once.
Directed Graph
Directed Graph with set of vertices {A,B,C,D} and directed
edges {(a,b),(a,c)(b,c),(d,c),(c,a)}
Graph that is not connected
Graph that is not connected. Although there is (e.g.) a path from
G to C {(G,F)(F,E)(E,C)} there is not one from A to C. Note that {(C,D)(D,E)(E,C)} forms
a cycle so this particular graph is
not acyclic.
Rooted Tree
If you look at the graph drawn in yellow you can
see that it is acyclic and connected. Therefore it is a tree. Any vertex can be chosen as a root.
I chose E even though the graph is drawn suggesting that it should be A. Every edge has
rootwards direction (green arrow).
In the LG algorithm the "slope graph" is directed (i.e. its a digraph). As all the directed
edges point to blocks that are higher than themselves there are no cycles. The other graph
is a rooted tree. In fact it would not generally be connected -it would be a set of
disconnected trees. However, a "dummy root" is introducted. This is a vertex with no corresponding
block of rock associated with it -though it is good to imagine it below the rest of the model.
Adding this extra block is a shameless dodge that just keeps things tidy. Similarly the tree
graph is treated in the literature as an undirected graph -even though the direction of the root
is very important. Further as there is the all important value of each block we might call them
"weighted graphs" -this just refers to the presence of the additional information associated with
each block (vertex.)
Transitive closure
Transitive is an important concept in many aspects of discrete mathamatics, including the LG
algorithm. Supose that we have a graph (in this case a digraph) where the vertices represent
people and the directed edges imply the relation "is a parent of". If we were interested in the
transmission of a genetic disorder we can see that not only can the child receive the disorder
so can the grand child, great grand child etc.. We can define a new graph "is an ancestor of"
in terms of the original simple one . Sometimes this is refered to as the repeated application
of the function (function in the computing sense). Often a reference is made to a computer technique called "recursion".

In this graph you can think of the purple graph as the original. It might represent the
function "is a parent of" on the set of 8 people (vertices). The red graph is then created as the
"transitive closure " of the red graph -it might represent "is an ancestor of".
In the LG algorithm a digraph (directed graph) is used to express the constraints of rock strength and slope
design. I call this the slope graph. Each block is represented as a vertex that has directed edges (or arcs) to its close
neighbours on the levels above. The transitive closure of this graph represents the concept of
"is in the overburden of". From the above diagram you can see that there is often vastly more
edges in the transitive closure of a graph than in the graph itself. So the concept of transitive closure
is the defining of a new graph in terms of an old one, often with vastly more edges.
Partial Graphs and Subgraphs
Suppose we make a new graph "graph B" by picking out bits of an existing graph "graph A".
There are two special possibilities here:
- Suppose that graph B contains some of the vertices of graph a but all
of the edges from A that connect those vertices. Here B is called a subgraph.
If A consisted of vertices which
represented the towns of Canada and edges that represented connecting roads then an example
of a subgraph would cover e.g. the cities and roads of Manitoba -some of the towns
but all of the roads connecting them.
- On the other hand if we keep all of the vertices (often only nominally) but only
some of the edges, then the result is called a partial graph. Extending our previous
example, a partial graph might represent all the towns of Canada but only roads which
are four lanes or more.
Closures of a Graph and Maximum Closure of a Weighted Graph
The LG algorithm is formally defined as and algorithm to find "The maximum closure of
a graph". What does this mean? A closure means a "closed subgraph". This generally
applied to directed graphs. A closure is a subset of vertices that
has no directed edges (arcs) pointing to vertices outside the closure. Consider the following
digraph: {(A,B),(B,C),(D,B),(D,E),(F,G),(G,E),(G,H),(H,C)}
If, for example, vertex D is to be part of a closure, then so are B and E because of the
directed edges (D,B) and (D,E). But because B is in the closure so is C because of the
directed edge (B,C). In finding a closure containing D we are using a similar concept to that
introduced with "transitive closure"-repeated application. We are not adding edges, however.
It does not follow that G is a member of the closure because
the edge (G,E) "points" the wrong way. If we need to have D and F in the closure then
it must include the vertex set {D,F,B,E,G,H,C}. Notice that the null graph is always a closure.
If the directed graph in question is
weighted (there is a numerical value associated with each vertex) then the maximum closure
is the closure with the greatest value obtained by summing all the weights in the
closure. Look carefully at the following diagram:
Notice:
- A directed graph is shown in red.
- The four vertices in the yellow bubbles are a closure.
- They are the maximum closure and have the value 2+3-1-1=3. Convince yourself that this is so.
i.e. that there is no other closure with a greater value
- Working out the maximum closure of even a simple graph is not trivial
The Pit Optimisation Context
Remember that one of the graphs in the LG algorithm is a directed graph, which I call
the "slope graph". The simplest case of the slope graph gives
each block in the interior of the model 9 edges/arcs pointing at the nearest 9 blocks in the
layer above. These are nine blocks that must be mined first, but before they can be mined
there are blocks on the layer above the nine that must be mined. The graph whose directed
edges indicate "block Y is in the overburden of a block X"
is therefore the transitive closure of the slope graph. In this case it
defines an overburden resembling and inverted pyramid. Moreover any mine
design that proposes slopes which are feasible (as defined by the slope graph) is a
closure of the slope graph. That means that for any block X to be mined any
block Y pointed to by and directed edge or arc (X,Y) will also have to be mined. We don't
want just any closure -we want the one that make the most money.
In the pit optimisation process each block has a monetary value -which may be negative.
We can therefore regard any graph for which the blocks are regraded as vertices as
"weighted" -the weights of the vertices are the values of the blocks. The optimum pit
must be geotechnically feasible (therefore it must be a closure) and we want the most lucre
we can have -therefore we want the maximum closure. So the L-G algorithm is entitled in
the original paper the "Maximum Closure of a Graph".
Last Thoughts About graph theory
Background information only.
Graph theory is an approachable area of mathematics. Many problems
are regularly featured as puzzles or under the term "recreational
mathematics". Introductory texts deal with problems of organising parties,
dances and sporting events. The impression is that graph theorests
have lots of fun while calculus instructors spend their time in
leaky baths or on ladders forever sliding down walls. Applied graph theory,
however, is generally considered part of Operations Research (a nebulous
term). Moreover it is not coincidental that graph theory started in earnest
at the same time as computers started to appear.
What is it that Graph theory dudes do? When they are not attending parties, dances and sporting events
they do three things:
- As we see above it is important to have a system for describing graphs and their properties.
- As mathematicians are wont to do, they prove things about graphs. So for example there is
a simple way to determine whether a graph has an Euler circuit, and this was proven by Euler.
- Lastly (and this is where we get interested) they work on algorithms to solve problems
that involve graphs.
The remainder of this page is intended for those hoping to program the algorithm.
Graphs and Computer Programming
Strictly for programmers
This section is only relevent if you are considering programming the LG algorithm
yourself
Since graphs consists of things (vertices) and connecting between things (edges) it is
worth reviewing some relevent programming issues. This page is not about computer programming.
But it seems worthwhile to review some issues.
- Dynamic vs Static Storage. Are you going to use static arrays, dynamic data structures
(linked or doubly linked records) or a more modern type of container, such as those provided
by C++'s standard template library.
- How the edges are represented. Options include true pointers, iterators (pointers on drugs),
integers that represent the index of vertex in an array or array like structure, names (substrings)
that can then be used to lookup the storage position of the indicated vertex, and using 2d arrays
as adjacency matrices.
- Maintaining the structure.For example, if we are not dealing with a directed graph we if
A is adjacent to B than B is adjacent to A.
- Additional data: Depending on the method additional information may be stored abour individual
vertices or edges. The might be flagged e.g as belonging to a particular sub set, they may have
temporary or permanent values applied to them etc..
Graph Algorithms
This section is only relevent if you are considering programming the LG algorithm
yourself
Formally an algorithm is a method for solving a problem that requires a finite number of steps.
Just a pieces of string have finite length, proving that a method will work in a finite time does't
imply that the finite time is less than, say the age of the universe. Never the less, its better
than not knowing that it will work. As a rule the publishing of an algorithm does not involve giving
many of the practical details concerning how you should or could program it. The tone is generally
"of course we know how to program it ourselves but we don't want to spoil your fun".
Complexity
The LG algorithm is ....er well an algorithm. It is proven to find a solution in a finite number of
steps. We don't know how many. The study of how many steps an algorithm requires is called complexity.
Complexity is usually a function of "problem size" which in the case of graphs may be the number of vertices
and or edges. For example, an algorithm for finding the shortest route between two cities on a given road network
might take a number of steps that is proportional to the number of cities and proportional
to the square of the number of roads. As far as I am aware nothing has been published on the LG algorithm.
Certain problems likely take much longer than others even if they are the same size. In
practice it finds solutions reasonably well, but there doesn't seem to be any guarantee that the
next model that you try will finish in your lifetime. Note that speed is often obtained at the expense
of memory usage.
Representing Graphs in The computer
The first thing that they don't tell you is how to represent a graph in the computer. The
requirements of different methods of representation vary, however they include:
- Size: is it important how much memory is required? In particular some matrix methods tend to
require space that is proportional to the square of the number vertices. This is no problem for
say 100 vertices but for a million it becomes impractical.
- The ability to visit each vertex (or edge) once. Algorithms tend to have steps that say
things like "find a vertex with property P -if one exists". If the vertex data is embedded in
a linear structure such as a list or an array this is easy.
- Examining for structural features makes different demands on an implementation. For example
determining whether a graph is connected or cyclic, or counting the vertices on a branch each have their
own requirements.
- Creating and destroying the structures. Particularly in the case of the more complex data
structures, you must give some thought
Methods include:
- Adjacency matrices. If there are n vertices in a graph an adjacency matrix is an n x n
matrix containing 1's where vertices are connected by edges (i.e. are adjacent), and 0 otherwise. So
the first graph above (the yellow and black one) would be written as:
A B C D
A 0 1 1 0
B 1 0 1 1
C 1 1 0 1
D 0 1 1 0
A two dimensional array (e.g. of booleans) would clearly work. Notice that the space required increases
with the square of the number of vertices. This may not be a problem is the problem sizes are moderate or
if a large percentage of the vertices are connected by and edge
- Adjacency lists. Here for each vertex a list is made of each adjacent vertex. The yellow graph
could be written:
A:BC
B:ACD
C:ABD
D:BC
This is not quite so simple as we have a list or array of items which have variable length.
However it is a good fit with the storage classes available through C++'s standard template
library for example.
- Data Structures. The most general way to represent graphs is probably through data structures.
Records that represent vertices and have a variable number (linked list?) of pointers to other
such records.
This diagram is supposed to indicate a doubly linked structure. As vertex A is linked by an edge to B, so there
is a pointer in A pointing at B and a pointer in B pointing at A. The red line is supposed to
represent a pointer locating the structure from the programs point of view. If the graph is not
known to be connected you need to arrange multiple such pointers -perhaps a dummy vertex- pointing at
least once at each component. It is worth considering that disposing of the memory in a structure
such as this is not trivial.
- Problem specific methods. IN the LG implementation that I present here our vertices represent
blocks of rock which are arranged in a three dimensionally so that the arrangement suggest storing
all the data in three dimensional arrays. One of the data structures is a tree. Because it is a tree
and because it is possible to work always in a rootwards direction you can get away with a three
dimensional array of "pointers" to each blocks rootward neighbour. In fact true pointers are less
convenient than remembering the index of the adjacent block in the array.
- Rule based representations. On occassion it is possible not to have to represent the edges of the
graph at all because they have a regularity that can be described by rules. Suppose
Recursion and Transitive closure
In the following example it is assumed
that there is a function called ISAParentOF(A,B) which returns a true value only if A is a parent
of B. If there is a set of vertices: a set of people considered {A,B,C...} this function defines
a digraph. The following pseudocode defines a second graph on the same set of vertices. The function
IsAnAncestor(A,B) returns true if the directed edge (A,B) is in the transitive closure of IsAParentOf.
Note the recursive call in bold (the function calls itself). Note that, as written, bad things would happen if
IsAParentOF defined a cyclic graph.
Function Boolean IsAnAncestorOf(A,B)
{
if(IsAParentOf(A,B))return true;
for each C such that IsAParent(A,C) if IsAnAncestor(C,B) return true;
return false;
}