Dependency graphs are everywhere in computing: Database tables have foreign key relationships with each other, programming languages have functions that call each other, and filesystems have folders containing folders and files.
This article will show you how to:
Construct a graph by reading folders and files from disk
Render the graph into a .dot file
Render the .dot file into an image
The graphviz package is available here, and the final code is available on GitHub.
Constructing the Graph
A graph is usually described as a collection of vertices and a collection of edges, each of which connects two vertices.
graphviz extends this with labels that provide additional information when rendering, and clusters that describe how to group vertices together.
We’ll want both vertex labels and edge labels when we construct our filesystem graph, because we’ll use them to color vertices and edges differently:
Vertex label: Is the file a directory, symlink, or file?
Edge label: Is the link a hard link between e.g. a folder and its contained files, or a symbolic link?
Here are our types:
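The original listing is not reproduced here, so the following is a minimal sketch of what such types could look like. The names V, VLabel, ELabel, and FileGraph are assumptions chosen for illustration; none of them come from the graphviz package.

```haskell
-- Hypothetical application types; nothing here depends on graphviz.

-- | A vertex is identified by its path on disk.
type V = FilePath

-- | Vertex label: what kind of filesystem entry is this?
data VLabel = VLFile | VLDir | VLLink
  deriving (Eq, Ord, Show)

-- | Edge label: how is the source connected to the target?
data ELabel = ELHardlink | ELSymlink
  deriving (Eq, Ord, Show)

-- | A filesystem graph: labelled vertices plus labelled, directed edges.
type FileGraph = ([(V, VLabel)], [(V, V, ELabel)])

-- | A tiny hand-written example: one directory containing one file.
exampleGraph :: FileGraph
exampleGraph =
  ( [("src", VLDir), ("src/Main.hs", VLFile)]
  , [("src", "src/Main.hs", ELHardlink)]
  )
```

Keeping vertices as (vertex, label) pairs and edges as (from, to, label) triples is a deliberate choice in this sketch: it is the same shape that graphviz's graphElemsToDot accepts, so no conversion step is needed later.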
Here’s how you can traverse the directory structure:
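As a rough sketch of such a traversal (not necessarily the article's original code), the function below walks a directory tree using the directory and filepath packages. The label types are the hypothetical ones assumed for illustration, and the sketch assumes a directory version recent enough to provide pathIsSymbolicLink and getSymbolicLinkTarget. Permission errors are not handled.

```haskell
import System.Directory
  ( doesDirectoryExist, getSymbolicLinkTarget, listDirectory, pathIsSymbolicLink )
import System.FilePath ((</>), takeDirectory)

-- Hypothetical application types (assumed for this sketch).
type V = FilePath
data VLabel = VLFile | VLDir | VLLink deriving (Eq, Ord, Show)
data ELabel = ELHardlink | ELSymlink deriving (Eq, Ord, Show)

-- | Walk the tree rooted at the given path, returning labelled vertices
-- and labelled edges.
readDirectoryGraph :: FilePath -> IO ([(V, VLabel)], [(V, V, ELabel)])
readDirectoryGraph path = do
  -- Check for a symlink first: doesDirectoryExist follows links, and
  -- following them could send the traversal into a cycle.
  isLink <- pathIsSymbolicLink path
  if isLink
    then linkGraph
    else do
      isDir <- doesDirectoryExist path
      if isDir then dirGraph else pure ([(path, VLFile)], [])
  where
    -- Record where the link points, but do not descend into the target.
    -- The target is not normalized, so it may not match an existing vertex.
    linkGraph = do
      target <- getSymbolicLinkTarget path
      let resolved = takeDirectory path </> target
      pure ([(path, VLLink)], [(path, resolved, ELSymlink)])

    -- A directory gets one containment edge per child, then we recurse.
    dirGraph = do
      names <- listDirectory path
      let children = map (path </>) names
      subgraphs <- mapM readDirectoryGraph children
      let containment = [(path, child, ELHardlink) | child <- children]
      pure ( (path, VLDir) : concatMap fst subgraphs
           , containment ++ concatMap snd subgraphs )
```

Calling readDirectoryGraph "." returns every entry below the current directory. Because each entry other than the root is listed by exactly one parent, the number of ELHardlink edges is always one less than the number of vertices.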
The output of readDirectoryGraph is a set of application-specific Haskell values. We haven’t yet used anything from graphviz.
Rendering the Graph
When we render the graph, we’ll want to set different colors for files, symlinks, and directories. Here’s how we can do that with the labels we saved during the traversal:
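One way to express this with graphviz is to override fmtNode and fmtEdge in a GraphvizParams record. The sketch below is an illustration rather than the article's original listing: the label types, the name fileGraphParams, and the particular colors are all assumptions.

```haskell
import Data.GraphViz (GraphvizParams (..), nonClusteredParams)
import Data.GraphViz.Attributes
  ( Attribute, X11Color (..), color, dashed, fillColor, filled, style, toLabel )
import System.FilePath (takeFileName)

-- Hypothetical application types (assumed for this sketch).
type V = FilePath
data VLabel = VLFile | VLDir | VLLink deriving (Eq, Ord, Show)
data ELabel = ELHardlink | ELSymlink deriving (Eq, Ord, Show)

-- | Rendering parameters: no clusters, custom node and edge attributes.
fileGraphParams :: GraphvizParams V VLabel ELabel () VLabel
fileGraphParams = nonClusteredParams
  { fmtNode = \(path, vl) ->
      -- Show only the last path component, filled with a per-type color.
      [toLabel (takeFileName path), style filled, fillColor (nodeColor vl)]
  , fmtEdge = \(_, _, el) -> edgeAttrs el
  }

-- | Fill color for each kind of filesystem entry (arbitrary choices).
nodeColor :: VLabel -> X11Color
nodeColor VLDir  = Green
nodeColor VLFile = White
nodeColor VLLink = Yellow

-- | Containment edges are plain; symbolic links are blue and dashed.
edgeAttrs :: ELabel -> [Attribute]
edgeAttrs ELHardlink = [color Black]
edgeAttrs ELSymlink  = [color Blue, style dashed]
```

Starting from nonClusteredParams keeps the remaining fields at their defaults, including the cluster type (), which is why the cluster parameter in the signature is the unit type.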
Now we’ll tie everything together:
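A minimal sketch of this last step, assuming the vertices and edges are already in hand: graphElemsToDot builds a DotGraph from the two lists, and printDotGraph turns it into lazy Text that can be written to a .dot file. The helper names toDotText, writeDotFile, and exampleDot are hypothetical.

```haskell
import Data.GraphViz (GraphvizParams, graphElemsToDot, nonClusteredParams)
import Data.GraphViz.Types (printDotGraph)
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.IO as TLIO

-- | Convert labelled vertices and edges into the text of a .dot file.
toDotText
  :: GraphvizParams FilePath nl el () nl
  -> [(FilePath, nl)]
  -> [(FilePath, FilePath, el)]
  -> TL.Text
toDotText params vs es = printDotGraph (graphElemsToDot params vs es)

-- | Write the rendered graph to the given output path.
writeDotFile
  :: FilePath
  -> GraphvizParams FilePath nl el () nl
  -> [(FilePath, nl)]
  -> [(FilePath, FilePath, el)]
  -> IO ()
writeDotFile out params vs es = TLIO.writeFile out (toDotText params vs es)

-- | A tiny hand-written graph rendered with default, unstyled parameters.
exampleDot :: TL.Text
exampleDot =
  toDotText nonClusteredParams [("a", ()), ("a/b", ())] [("a", "a/b", ())]
```

In a full program, the lists would come from the directory traversal and the parameters would be the colored ones, for example writeDotFile "graph.dot" params vertices edges, where "graph.dot" is just a placeholder file name.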
This can be rendered into an image using the dot command from the Graphviz system package (distinct from the Haskell graphviz library), which your OS’s package manager likely carries:
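The exact command from the original is not reproduced here. A typical invocation is dot -Tpng graph.dot -o graph.png, and the sketch below runs the same thing from Haskell via the process package. The file names and the helpers dotArgs and renderPng are assumptions for illustration, and dot must be on the PATH.

```haskell
import System.Process (callProcess)

-- | Arguments equivalent to: dot -Tpng <input> -o <output>
dotArgs :: FilePath -> FilePath -> [String]
dotArgs input output = ["-Tpng", input, "-o", output]

-- | Run the external dot executable; throws if it exits with an error.
renderPng :: FilePath -> FilePath -> IO ()
renderPng input output = callProcess "dot" (dotArgs input output)
```

For example, renderPng "graph.dot" "graph.png" would produce a PNG next to the .dot file. Swapping -Tpng for -Tsvg yields an SVG instead.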
And here’s the final result:
This is the corresponding directory structure:
The graphviz package isn’t too difficult to use, but I feel like it could use a few examples. Hopefully this helps someone trying to render a graph.
Michael Burge has years of experience developing complex software, including custom compilers, data stores for "big data", and machine learning models.