Christopher Gandrud

d3Network

Tools for creating D3 JavaScript network, tree, dendrogram, and Sankey graphs from R.

v0.5.1

Mike Bostock's D3.js is great for creating interactive network graphs with JavaScript. The d3Network package makes it easy to create these network graphs from R. The main idea is that you should be able to take an R data frame with information about the relationships between members of a network and create full network graphs with one command.

Commands

d3Network currently has five commands for creating network graphs:

d3SimpleNetwork

d3SimpleNetwork is designed to take a simple data frame that has two columns specifying the sources and targets of the nodes in a network and turn it into a graph. You can easily customise the look and feel of the graph. Let's do an example.

First make some fake data.

Source <- c("A", "A", "A", "A", "B", "B", "C", "C", "D")
Target <- c("B", "C", "D", "J", "E", "F", "G", "H", "I")
NetworkData <- data.frame(Source, Target)

It's important to note that the Source variable is the first variable and the Target is the second. We can use d3SimpleNetwork's Source and Target arguments to specify which variables are which, if the data is in another order.
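As a rough sketch of the second point, assuming Source and Target take column names as character strings. The EdgeData data frame and its Weight column are made up for illustration, and the d3Network call is guarded so the script still runs where the package is not installed:

```r
# Hypothetical edge list whose columns are NOT in Source, Target order
EdgeData <- data.frame(
  Weight = c(1, 2, 3),
  Target = c("B", "C", "C"),
  Source = c("A", "A", "B"),
  stringsAsFactors = FALSE
)

# Option 1: reorder the columns in base R so that Source comes first
Reordered <- EdgeData[, c("Source", "Target")]
print(Reordered)

# Option 2: leave the data alone and name the columns explicitly
# (assumes the arguments accept column names as strings)
if (requireNamespace("d3Network", quietly = TRUE)) {
  d3Network::d3SimpleNetwork(EdgeData, Source = "Source", Target = "Target",
                             width = 400, height = 250)
}
```

Either route should give the same graph; naming the columns keeps the original data frame untouched.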

Now we can simply stick the NetworkData data frame into d3SimpleNetwork:

d3SimpleNetwork(NetworkData, width = 400, height = 250)

You'll notice that I added the width and height arguments. These change the size of the graph area. They are in pixels. Here is the result:

Play around with this graph. Notice that when you click on the nodes the text expands and changes colour.

There are many options for customising the look and feel of the graph. For example we can change the colour of the links, nodes, and text. We can also change the opacity of the graph elements:

d3SimpleNetwork(NetworkData, width = 400, height = 250,
                textColour = "orange", linkColour = "red",
                nodeColour = "orange", opacity = 0.9)

There are many different ways you can specify the colours other than just their names (as in this example). One way to select more specific colours is with hexadecimal colour values. A nice resource for choosing colour palettes is the Color Brewer website. The d3SimpleNetwork example further below uses hexadecimal colour values.
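If you already know a colour by name, base R can give you its hexadecimal value. This is only a small helper sketch; name_to_hex is a made-up function name and is not part of d3Network:

```r
# Convert R colour names to hexadecimal strings using grDevices (ships with R)
name_to_hex <- function(colour_name) {
  # col2rgb() returns a 3-row matrix (red, green, blue) on a 0-255 scale;
  # transposing gives one row per colour, which rgb() accepts as a matrix
  grDevices::rgb(t(grDevices::col2rgb(colour_name)), maxColorValue = 255)
}

name_to_hex("orange")  # "#FFA500"
name_to_hex("red")     # "#FF0000"
```

The resulting strings can be passed to arguments such as textColour or linkColour in place of the names.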

Other important ways to customise a force directed graph are to change the link distance and charge. Link distance is simply the distance between the nodes. Charge specifies how strong the force either repelling or pulling together the nodes is. Here is an example with a charge of -50:

d3SimpleNetwork(NetworkData, width = 400, height = 250,
                textColour = "#D95F0E", linkColour = "#FEC44F",
                nodeColour = "#D95F0E", opacity = 0.9,
                charge = -50, fontsize = 12)

This is a weaker charge than what we have seen so far (the default is -200). A weak negative charge means that the nodes do not repel each other as strongly, so they sit closer together than they would with a larger negative charge. Positive charges make the nodes attract one another. Basically, you will get a clump of nodes. Also, in the above example the text was a little small so I increased the font size to 12.
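To get a feel for the charge argument it can help to draw the same network several times. The sketch below is one way to do that: the three charge values and the file names are arbitrary choices for illustration, and it uses the file argument (described further down this page) so each graph lands in its own HTML file in a temporary directory:

```r
# The fake data from earlier
Source <- c("A", "A", "A", "A", "B", "B", "C", "C", "D")
Target <- c("B", "C", "D", "J", "E", "F", "G", "H", "I")
NetworkData <- data.frame(Source, Target)

# Illustrative settings: default repulsion, weaker repulsion, and attraction
Settings <- data.frame(charge = c(-200, -50, 30))
Settings$file <- file.path(tempdir(),
                           paste0("charge_", Settings$charge, ".html"))

# Guarded so the script still runs where d3Network is not installed
if (requireNamespace("d3Network", quietly = TRUE)) {
  for (i in seq_len(nrow(Settings))) {
    d3Network::d3SimpleNetwork(NetworkData, width = 400, height = 250,
                               charge = Settings$charge[i], fontsize = 12,
                               file = Settings$file[i])
  }
}

# Where to find each graph
print(Settings)
```

Opening the three files side by side should show the nodes spreading out, drawing closer, and finally clumping together.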

Have a look at the d3SimpleNetwork documentation for more customisation options.

d3ForceNetwork

If you want to make more complex force directed graph structures use d3ForceNetwork. It allows you to use individual link and node properties to change the distance between individual nodes and the colour of the nodes depending on their membership in specific groups.

Maybe it's better to understand this with an example. We'll use d3ForceNetwork to recreate an example by Mike Bostock. The network graph will show the co-occurrence of Les Misérables characters (the original data was gathered by Donald Knuth). The link distances are based on how close the characters are to one another and the colours symbolise different character groups.

To start out let's gather the data and create two data frames with it. One of the data frames will have information on the links, similar to what we have worked with so far. The other will have information on individual nodes; in this case Les Misérables characters.

# Load RCurl package for downloading the data
library(RCurl)

# Gather raw JSON formatted data
URL <- "https://raw.githubusercontent.com/christophergandrud/d3Network/master/JSONdata/miserables.json"
MisJson <- getURL(URL, ssl.verifypeer = FALSE)

# Convert JSON arrays into data frames
MisLinks <- JSONtoDF(jsonStr = MisJson, array = "links")
MisNodes <- JSONtoDF(jsonStr = MisJson, array = "nodes")

In this example we converted a JSON-formatted file into two data frames. You can of course just work with R data frames. Now let's look inside these data frames:

head(MisLinks)
##   source target value
## 1      1      0     1
## 2      2      0     8
## 3      3      0    10
## 4      3      2     6
## 5      4      0     1
## 6      5      0     1
head(MisNodes)
##              name group
## 1          Myriel     1
## 2        Napoleon     1
## 3 Mlle.Baptistine     1
## 4    Mme.Magloire     1
## 5    CountessdeLo     1
## 6        Geborand     1

You can see in the MisLinks data frame that we again have source and target columns. Notice that the data frame is sorted by source. We also have a new column: value. This can be used to determine the link widths and distances.

In the MisNodes data frame we have two columns: name and group. There is one record for each character (node) in the network. The source and target numbers in MisLinks refer to these records by position, counting from zero, so 0 is Myriel, the first record. The group column simply specifies what group each character is in. This will be used to set the nodes' colours.
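To see how the two data frames fit together, here is a small base R sketch that re-types the head() output above and looks up the character names behind each link. It assumes the source and target numbers are zero-based row positions in the nodes data frame, which is what the 0 values next to a first row numbered 1 suggest. The names LinksHead, NodesHead and Labelled are made up for this illustration:

```r
# The first six rows of each data frame, re-typed from the head() output
LinksHead <- data.frame(
  source = c(1, 2, 3, 3, 4, 5),
  target = c(0, 0, 0, 2, 0, 0),
  value  = c(1, 8, 10, 6, 1, 1)
)
NodesHead <- data.frame(
  name  = c("Myriel", "Napoleon", "Mlle.Baptistine",
            "Mme.Magloire", "CountessdeLo", "Geborand"),
  group = 1,
  stringsAsFactors = FALSE
)

# R counts rows from 1, so add 1 to the zero-based link indices
# (assumption: indices are zero-based, as the data above suggests)
Labelled <- data.frame(
  source_name = NodesHead$name[LinksHead$source + 1],
  target_name = NodesHead$name[LinksHead$target + 1],
  value       = LinksHead$value,
  stringsAsFactors = FALSE
)
print(Labelled)
```

Under that assumption the first link joins Napoleon to Myriel, and the fourth joins Mme.Magloire to Mlle.Baptistine.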

To make the network graph we just need to tell d3ForceNetwork where the data frames and columns are:

d3ForceNetwork(Links = MisLinks, Nodes = MisNodes,
               Source = "source", Target = "target",
               Value = "value", NodeID = "name",
               Group = "group", width = 550, height = 400,
               opacity = 0.9)

Mouse over the nodes to see the characters' names.
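You do not need JSON to get data into this shape. The following is a minimal sketch of building the two data frames directly in R from an edge list of names. The Edges data, its column names and the group values are invented for illustration, and it assumes the links should point at node rows using zero-based positions, as in the Les Misérables data:

```r
# Hypothetical edge list that uses node names rather than numbers
Edges <- data.frame(
  from  = c("A", "A", "B"),
  to    = c("B", "C", "C"),
  value = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# One row per node, in order of first appearance
Nodes <- data.frame(name = unique(c(Edges$from, Edges$to)),
                    stringsAsFactors = FALSE)
Nodes$group <- c(1, 1, 2)  # made-up group membership

# Replace names with zero-based row positions in Nodes
# (match() is one-based, hence the - 1)
Links <- data.frame(
  source = match(Edges$from, Nodes$name) - 1,
  target = match(Edges$to, Nodes$name) - 1,
  value  = Edges$value
)
print(Links)

# Guarded so the script still runs where d3Network is not installed
if (requireNamespace("d3Network", quietly = TRUE)) {
  d3Network::d3ForceNetwork(Links = Links, Nodes = Nodes,
                            Source = "source", Target = "target",
                            Value = "value", NodeID = "name",
                            Group = "group", width = 400, height = 300)
}
```

Keeping Nodes in a fixed order matters here, because the numbers in Links only make sense relative to that order.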

Link Weighting

The arguments linkDistance and linkWidth allow you to set how far apart the nodes are and how wide the links are, respectively. You can set these to fixed numeric values, or enter JavaScript functions in order to weight the distances/widths by the value. For example, the default linkWidth is set as the function linkWidth = "function(d) { return Math.sqrt(d.value)}". This finds the square root of each value and uses it to set the width.
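A short sketch of both styles follows. The first part simply works out in R what the default square-root function would give for the values shown earlier. The second part passes a fixed width and a weighted distance to d3ForceNetwork; the toy data and the particular JavaScript function (100 divided by the value) are illustrative choices, not package defaults:

```r
# Values from head(MisLinks) above
LinkValues <- c(1, 8, 10, 6, 1, 1)

# What "function(d) { return Math.sqrt(d.value)}" computes for each link
DefaultWidths <- sqrt(LinkValues)
print(round(DefaultWidths, 2))

# Hypothetical alternatives: a fixed width and a distance that shrinks
# as the link value grows
FixedWidth <- 2
WeightedDistance <- "function(d) { return 100 / d.value; }"

# Toy data, made up for illustration (zero-based source/target positions)
Links <- data.frame(source = c(0, 0, 1), target = c(1, 2, 2),
                    value = c(1, 4, 10))
Nodes <- data.frame(name = c("A", "B", "C"), group = c(1, 1, 2),
                    stringsAsFactors = FALSE)

# Guarded so the script still runs where d3Network is not installed
if (requireNamespace("d3Network", quietly = TRUE)) {
  d3Network::d3ForceNetwork(Links = Links, Nodes = Nodes,
                            Source = "source", Target = "target",
                            Value = "value", NodeID = "name",
                            Group = "group", width = 400, height = 300,
                            linkWidth = FixedWidth,
                            linkDistance = WeightedDistance)
}
```

Note that the function is passed as an R character string; it is only evaluated as JavaScript once the graph is drawn in the browser.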

Zooming

You can also use the zoom option to create a graph that you can zoom in and out of with your mouse scroll-wheel:

d3ForceNetwork(Links = MisLinks, Nodes = MisNodes,
               Source = "source", Target = "target",
               Value = "value", NodeID = "name",
               Group = "group", width = 550, height = 400,
               opacity = 0.9, zoom = TRUE)

d3Tree

A clean way to present hierarchical data is with modified Reingold-Tilford Trees. Use these types of trees when you have a single root connected to hierarchically organized child nodes.

Use the d3Tree command to create the trees. Many of the aesthetic arguments are the same as with the force directed commands above. Zooming is allowed as above. The major difference is that there is only one data argument: List. This is a list-type object with a particular structure that we'll look at later. Also, instead of charge and linkDistance, the spacing of d3Tree nodes can be set by changing the diameter of the whole graph with the diameter argument.

Let's recreate Mike Bostock's Reingold-Tilford tree example, using JSON formatted data. The data shows the Flare class hierarchy.

# Download JSON data
library(RCurl)
URL <- "https://raw.githubusercontent.com/christophergandrud/d3Network/master/JSONdata/flare.json"
Flare <- getURL(URL)

# Convert to list format
Flare <- rjson::fromJSON(Flare)

# Create Graph
d3Tree(List = Flare, fontsize = 8, diameter = 800)

Mouse over the nodes to enlarge the labels.

Data structure

Data for d3Tree needs to be in a hierarchical list with one root node and a number of children. All nodes need to be labeled as name and all children need to be further lists of named nodes. Maybe the best way to understand this is to look at a simple example. The following CanadaPC object is a list where the root is Canada. The first level of child nodes are the provinces and territories. These have further children which are capitals/principal cities of each province/territory:

CanadaPC <- list(name = "Canada",
             children = list(list(name = "Newfoundland",
                                  children = list(list(name = "St. John's"))),
                             list(name = "PEI",
                                  children = list(list(name = "Charlottetown"))),
                             list(name = "Nova Scotia",
                                  children = list(list(name = "Halifax"))),
                             list(name = "New Brunswick",
                                  children = list(list(name = "Fredericton"))),
                             list(name = "Quebec",
                                  children = list(list(name = "Montreal"),
                                                  list(name = "Quebec City"))),
                             list(name = "Ontario",
                                  children = list(list(name = "Toronto"),
                                                  list(name = "Ottawa"))),
                             list(name = "Manitoba",
                                  children = list(list(name = "Winnipeg"))),
                             list(name = "Saskatchewan",
                                  children = list(list(name = "Regina"))),
                             list(name = "Nunavut",
                                  children = list(list(name = "Iqaluit"))),
                             list(name = "NWT",
                                  children = list(list(name = "Yellowknife"))),
                             list(name = "Alberta",
                                  children = list(list(name = "Edmonton"))),
                             list(name = "British Columbia",
                                  children = list(list(name = "Victoria"),
                                                  list(name = "Vancouver"))),
                             list(name = "Yukon",
                                  children = list(list(name = "Whitehorse")))
             ))
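Because the nesting is easy to get wrong, a quick structural check can save some head scratching. The helper below is only a sketch of the rules described above (every node is a list with a single name, and children, if present, is a list of further nodes). The function is_valid_tree and the two small test trees are made up here and are not part of d3Network:

```r
# Recursively check that a list follows the name/children structure
is_valid_tree <- function(node) {
  has_name <- is.list(node) &&
    is.character(node[["name"]]) &&
    length(node[["name"]]) == 1
  if (!has_name) return(FALSE)

  children <- node[["children"]]
  if (is.null(children)) return(TRUE)  # a leaf node

  # children must be a list whose elements are all valid nodes themselves
  is.list(children) && all(vapply(children, is_valid_tree, logical(1)))
}

GoodTree <- list(name = "Canada",
                 children = list(list(name = "Ontario",
                                      children = list(list(name = "Toronto"),
                                                      list(name = "Ottawa")))))

# Uses "label" instead of "name" for the child, so it should fail
BadTree <- list(name = "Canada",
                children = list(list(label = "Ontario")))

is_valid_tree(GoodTree)  # TRUE
is_valid_tree(BadTree)   # FALSE
```

Passing this check does not guarantee a good-looking tree, but it catches the most common slips such as a misspelled name element or a child that is not wrapped in its own list.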

Clearly, R doesn't make it super easy to create these types of lists. As we saw in the previous example, you can always just import correctly formatted JSON files into R and use those instead.
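If your data starts out as a flat data frame, a few lines of base R can do the nesting for you. This is a minimal sketch for a two-level hierarchy only; the Cities data frame, its column names and the CanadaSubset object are invented for the example:

```r
# Hypothetical flat table: one row per city with its province
Cities <- data.frame(
  province = c("Quebec", "Quebec", "Ontario", "Ontario", "Manitoba"),
  city     = c("Montreal", "Quebec City", "Toronto", "Ottawa", "Winnipeg"),
  stringsAsFactors = FALSE
)

# Group the cities by province; split() orders the groups alphabetically
ByProvince <- split(Cities$city, Cities$province)

# One node per province, each with a list of city leaf nodes
ProvinceNodes <- lapply(names(ByProvince), function(p) {
  list(name = p,
       children = lapply(ByProvince[[p]], function(city) list(name = city)))
})

# Wrap everything in a single root node
CanadaSubset <- list(name = "Canada", children = ProvinceNodes)
str(CanadaSubset, max.level = 2)
```

The result has the same name/children layout as CanadaPC, so it could be passed to the List argument in the same way. Deeper hierarchies would need a recursive version of the same idea.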

Anyways, let's create a tree graph for the CanadaPC data:

d3Tree(List = CanadaPC, fontsize = 10, diameter = 500,
       textColour = "#D95F0E", linkColour = "#FEC44F",
       nodeColour = "#D95F0E")

d3ClusterDendro

We can use the same type of data to create cluster dendrograms using the d3ClusterDendro command. Again its aesthetic arguments are similar to the other commands. You can change the width and height of the graph (rather than the graph area) with the widthCollapse and heightCollapse arguments. These are the proportion of the total graph area width and height that you would like it to be reduced by. For example, widthCollapse = 0.5 would reduce the graph by 50% of the overall width.
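As a quick piece of arithmetic, reading that description literally: the drawn size is the area size minus the collapsed share of it. The helper collapsed_size and the 900 pixel width are illustrative assumptions, and the package's internal calculation may differ in detail:

```r
# Size left for the graph after collapsing a proportion of the area
collapsed_size <- function(area_size, collapse) {
  stopifnot(collapse >= 0, collapse <= 1)
  area_size * (1 - collapse)
}

# A hypothetical 900 pixel wide graph area collapsed by 50%
collapsed_size(900, 0.5)  # 450
```

Larger collapse values leave more empty room in the graph area, which is handy when long leaf labels would otherwise be cut off.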

Here is an example using the CanadaPC data from above:

d3ClusterDendro(List = CanadaPC, fontsize = 12,
                zoom = TRUE, widthCollapse = 0.8)

The graph is zoom-able with the scroll-wheel and can be dragged about.

d3Sankey

You can use d3Sankey to create basic Sankey diagrams. Here is an example:

# Load energy projection data
library(RCurl)
URL <- "https://raw.githubusercontent.com/christophergandrud/d3Network/sankey/JSONdata/energy.json"
Energy <- getURL(URL, ssl.verifypeer = FALSE)
# Convert to data frame
EngLinks <- JSONtoDF(jsonStr = Energy, array = "links")
EngNodes <- JSONtoDF(jsonStr = Energy, array = "nodes")

# Plot
d3Sankey(Links = EngLinks, Nodes = EngNodes, Source = "source",
         Target = "target", Value = "value", NodeID = "name",
         fontsize = 12, nodeWidth = 30, width = 700)

d3Network in stand alone documents

So far we have only seen the basic syntax for creating the network graphs. If you've been following along you'll have noticed that running a d3Network command prints the HTML and JavaScript code needed to create the graph to your R console. To save this code as a stand-alone HTML file (i.e. one you can just double-click to open in your browser) use the file option. For example:

d3SimpleNetwork(NetworkData, file = "ExampleGraph.html")

This will create a new file called ExampleGraph.html in your working directory.
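If you would rather not clutter your working directory while experimenting, you can give file a full path. The sketch below writes to R's temporary directory, peeks at the first few lines of the result and, in an interactive session, opens it in the browser. The small data frame and the OutFile name are arbitrary choices for illustration:

```r
# Small made-up network
NetworkData <- data.frame(Source = c("A", "A", "B"),
                          Target = c("B", "C", "C"))

# Full path inside the session's temporary directory
OutFile <- file.path(tempdir(), "ExampleGraph.html")

# Guarded so the script still runs where d3Network is not installed
if (requireNamespace("d3Network", quietly = TRUE)) {
  d3Network::d3SimpleNetwork(NetworkData, file = OutFile)

  if (file.exists(OutFile)) {
    # Look at the start of the generated HTML
    print(head(readLines(OutFile, warn = FALSE), 5))
    # Open in the default browser when working interactively
    if (interactive()) utils::browseURL(OutFile)
  }
}
```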

You can open this file in any text editor and modify the code however you like. For example, check out the D3 Force Layout Wiki for more force network graph customisation options.

d3Network in dynamic knitr reproducible documents

If you would like to include network graphs in a knitr Markdown dynamically reproducible document just place your d3Network code in a code chunk with the option results='asis'. Also set the argument iframe = TRUE and specify a file name with file as before.
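One way to try this out without writing the Markdown by hand is a plain R script in knitr's spin() format, where a line starting with #+ sets the chunk options. This is a sketch under that assumption rather than the only way to do it; the chunk label simple-network, the data and the GraphFile name are made up:

```r
#' Hypothetical script for knitr::spin(). Lines starting with #' become
#' Markdown text; the #+ line below sets the options for the code chunk.

#+ simple-network, results='asis'
NetworkData <- data.frame(Source = c("A", "A", "B"),
                          Target = c("B", "C", "C"))

# File name is relative to the working directory used when knitting
GraphFile <- "ExampleGraph.html"

# Guarded so the script still runs where d3Network is not installed
if (requireNamespace("d3Network", quietly = TRUE)) {
  d3Network::d3SimpleNetwork(NetworkData, iframe = TRUE, file = GraphFile)
}
```

In a regular knitr Markdown file the same three settings apply: results='asis' in the chunk header, plus iframe = TRUE and file in the d3Network call.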

d3Network in IPython notebooks

Check out Spencer Boucher's helpful example of how to use d3Network inside of IPython Notebooks.

d3Network in Shiny web apps

From version 0.5, d3Network graphs can also be used in Shiny web apps. A full working example can be found at christophergandrud/d3ShinyExample. This example creates a very simple app that allows a user to change the node opacity for the d3ForceNetwork graph we saw earlier.

Here is the ui.R code from the example:

library(shiny)

shinyUI(fluidPage(

    # Load D3.js
    tags$head(
        tags$script(src = 'http://d3js.org/d3.v3.min.js')
    ),

    # Application title
    titlePanel('d3Network Shiny Example'),
    p('This is an example of using',
        a(href = 'http://christophergandrud.github.io/d3Network/', 'd3Network'),
        'with',
        a(href = 'http://shiny.rstudio.com/', 'Shiny', 'web apps.')
    ),

    # Sidebar with a slider input for node opacity
    sidebarLayout(
        sidebarPanel(
            sliderInput('slider', label = 'Choose node opacity',
                min = 0, max = 1, step = 0.01, value = 0.5
            )
        ),

        # Show network graph
        mainPanel(
            htmlOutput('networkPlot')
        )
    )
))

And the server.R:

# Load packages
library(RCurl)
library(d3Network)

# Load data once
URL <- "https://raw.githubusercontent.com/christophergandrud/d3Network/master/JSONdata/miserables.json"
MisJson <- getURL(URL, ssl.verifypeer = FALSE)

# Convert JSON arrays into data frames
MisLinks <- JSONtoDF(jsonStr = MisJson, array = "links")
MisNodes <- JSONtoDF(jsonStr = MisJson, array = "nodes")

# Create individual ID
MisNodes$ID <- 1:nrow(MisNodes)

#### Shiny ####
shinyServer(function(input, output) {

    output$networkPlot <- renderPrint({
        d3ForceNetwork(Nodes = MisNodes,
                        Links = MisLinks,
                        Source = "source", Target = "target",
                        Value = "value", NodeID = "name",
                        Group = "group", width = 400, height = 500,
                        opacity = input$slider, standAlone = FALSE,
                        parentElement = '#networkPlot')
    })
})

There are a few quick points to note. First, we told the app how to access D3.js by starting the shinyUI with:

tags$head(
        tags$script(src = 'http://d3js.org/d3.v3.min.js')
    ),

Also in the shinyUI we output the networkPlot using the Shiny htmlOutput function.

In our d3ForceNetwork call in the shinyServer we added the argument parentElement = '#networkPlot'. This attaches the graph to the div tag created by htmlOutput. Note that networkPlot is also used in htmlOutput and output$networkPlot. We can use whatever name we like, as long as we use it consistently in these three places. If we do not specify parentElement, d3Network attaches the plot to the page's body tag by default. In effect this will place it at the end of the page.
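One way to keep the three places in sync is to store the name once and reuse it. The following single-file sketch does that with a made-up output name, misGraph, and a tiny invented data set. It only builds the app object; nothing is launched until you call shiny::runApp(app) yourself:

```r
# Hypothetical output name, defined once and reused below
OutputId <- "misGraph"
ParentSelector <- paste0("#", OutputId)  # CSS selector for the same element

# Toy data, made up for illustration (zero-based source/target positions)
Links <- data.frame(source = c(0, 0, 1), target = c(1, 2, 2),
                    value = c(1, 2, 3))
Nodes <- data.frame(name = c("A", "B", "C"), group = c(1, 1, 2),
                    stringsAsFactors = FALSE)

# Guarded so the script still runs where the packages are not installed
if (requireNamespace("shiny", quietly = TRUE) &&
    requireNamespace("d3Network", quietly = TRUE)) {

  ui <- shiny::fluidPage(
    shiny::tags$head(
      shiny::tags$script(src = "http://d3js.org/d3.v3.min.js")
    ),
    shiny::htmlOutput(OutputId)                          # place 1
  )

  server <- function(input, output) {
    output[[OutputId]] <- shiny::renderPrint({           # place 2
      d3Network::d3ForceNetwork(Links = Links, Nodes = Nodes,
                                Source = "source", Target = "target",
                                Value = "value", NodeID = "name",
                                Group = "group", width = 400, height = 300,
                                standAlone = FALSE,
                                parentElement = ParentSelector)  # place 3
    })
  }

  # Build the app object without starting it
  app <- shiny::shinyApp(ui, server)
}
```

Changing OutputId in one spot then updates the htmlOutput, the output slot and the parentElement selector together.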

Warning: you will likely run into trouble if you include more than one network graph in a single Shiny app page, especially if they have conflicting styles.

Install

d3Network is available for download on CRAN. You can also install the latest development build using the devtools package and the following code:

devtools::install_github('christophergandrud/d3Network')



