Creating your own online data visualizations

Published on Dec 09, 2016

Many researchers are familiar with the desktop programs used for everyday tasks: word processors (like Microsoft Word or LibreOffice Writer) and spreadsheet programs (like Microsoft Excel or LibreOffice Calc) are frequently used to produce documents and reports. Online software, however, can be very helpful when it comes to preparing data visualizations for urban metabolism research. Unlike office tools, specialist online software is rarely taught in schools, and it takes some effort to learn how to use it. The good news is that with a little practice and the right tools, any urban metabolism researcher can start producing their own online data visualizations. In this blog post, we introduce three tools and demonstrate how they work and how they can be used.

We selected three different tools: SankeyMATIC, the Online Material Flow Analysis Tool (OMAT) and CARTO (formerly CartoDB). These were chosen because they provide different kinds of data visualization options while all being highly relevant to urban metabolism research. Furthermore, they can all be used free of charge and are easy to learn. For each of them we describe what it should be used for and what kind of final output it can generate, and we include a step-by-step tutorial to show you how to get started.

SankeyMATIC

What is it?


SankeyMATIC is an online Sankey diagram builder. A Sankey diagram shows the origin and destination(s) of flows by drawing a set of connected arrows, representing the quantities by varying the width of each line. This kind of diagram is very useful to visually represent the flow of energy through a system or to display the results of a material or substance flow study. The benefit of using such a diagram is that it can convey lots of information in an easy-to-understand format which requires little explanation. If the same data is presented in a table, it is often much more difficult to grasp the proportions of the flows and the relationships between them. These kinds of diagrams may be easy to understand, but they are difficult to design without the right software. With SankeyMATIC it is possible to develop Sankey diagrams in a matter of minutes and with very little training.

Even though the software is still labeled as 'Beta' software, the site has been online for several years and the software works well. The website itself uses a very well-established library for data visualizations called D3.js. This is an open source JavaScript library that produces the actual data visualizations. D3.js is a very powerful tool, but it is difficult to use for non-programmers. SankeyMATIC has built a user-friendly interface that allows people to use the D3.js visualizations without any programming experience. SankeyMATIC is also licensed as open source software.

Final output


SankeyMATIC creates images for you. These images are regular image files that you can save on your computer and you can use them just like other files. You can include them in a Word document or you can upload them to a blog or website. The images are generated as PNG files and you can define the image size to make sure it fits your purpose. The image below shows an example diagram generated by SankeyMATIC. This image is part of the site's very useful Gallery which has various examples and where you can also see how the author generated the graph.



Tutorial

In this example, we will create a simple sankey diagram for food flows in a city. Let's say food enters the city either through imports or by being locally grown. Then, this food is processed and traded in the local, urban food industry before it is either consumed locally (within the city) or exported (either to national markets or to international markets). The first step is to enter the flows into a spreadsheet, as can be seen below:



Now, let's go to the SankeyMATIC website and click the link to Build a Sankey Diagram. Here, the only thing we have to do is list all the flows in the 'Input' field. These figures could be entered one by one, but it is also possible to simply copy and paste them from the spreadsheet, as long as we convert them to the right format. To do this, we must ensure that the 'From' values are in column A, the 'To' values are in column B and the quantities are in column C. Then it is just a matter of pasting the following formula in column D:

=concatenate(A2," [",C2,"] ",B2)

This will create a single phrase for each entry that reads something like this:

Processing & Trade [2300] Export: International

And that is exactly the format we need. Now it's just a matter of copying and pasting this into the box at the SankeyMATIC website, and we're all set:



You can click the Preview button to see how the graph looks, and if you want to customize some details you can modify the colors, the size, and the labels very easily. The final graph can be found below. It took less than 10 minutes to create and can easily be included in a publication, website or report.

On a final note, the manual at SankeyMATIC is very helpful and explains all the features very well. By reading this you can see how easy it is to generate diagrams, and the spreadsheet formula to easily create the right syntax can also be found there.
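For readers who prefer a script to a spreadsheet formula, the same conversion can be sketched in a few lines of Python. This is only an illustration of the 'From [quantity] To' format described above: the function name `to_sankeymatic` is our own, and apart from the 'Export: International' row, the flow values are made up for the example.

```python
# Illustrative flows only: apart from the "Export: International" row,
# these (source, target, amount) values are made up for this sketch.
flows = [
    ("Imports", "Processing & Trade", 6000),
    ("Local production", "Processing & Trade", 1500),
    ("Processing & Trade", "Local consumption", 4000),
    ("Processing & Trade", "Export: National", 1200),
    ("Processing & Trade", "Export: International", 2300),
]


def to_sankeymatic(rows):
    """Format each (source, target, amount) row as 'Source [amount] Target'."""
    return [f"{source} [{amount}] {target}" for source, target, amount in rows]


sankey_lines = to_sankeymatic(flows)

# Print one flow per line, ready to paste into the 'Input' field.
print("\n".join(sankey_lines))
```

The printed lines can be pasted into the 'Input' field in the same way as the output of the spreadsheet formula.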



Online Material Flow Analysis Tool (OMAT)

What is it?

The Online Material Flow Analysis Tool (OMAT) is open source software that is part of our very own Metabolism of Cities website. It is an online tool that can be used to administer a Material Flow Analysis (MFA). There are a variety of features and tools that are part of this software, all of them related either to the process of undertaking an MFA (like keeping track of your contacts and data sources) or to the results of the MFA (data entry and processing). Because this blog post focuses on data visualization, we will discuss the features of OMAT that relate to data input and output.

The benefit of using OMAT for your data entry and data processing is that the software is tailored towards MFA work. Unlike Excel or other spreadsheet software, the core functionality of OMAT is geared towards socio-economic metabolism researchers, which means that very little additional work is required to manage your data entry or format your output. Another advantage of using OMAT is that you can easily share your work with others. This is helpful if you are part of a team of researchers and several people are collecting and entering data at the same time. Because the work is done online, it can be done simultaneously and everybody has access to the same, single dataset. Finally, sharing your output with the public is also very easy and does not require any programming skills. With the click of a button, users can publish their dataset, graphs and tables online. This data can be shared with others by sending them a link, and this same link can be included in reports, publications or on websites.

Final output

The final output of OMAT is an online website (a minisite dedicated to your project) that contains four different sections: Indicators, Graphs, Data Tables and Data Sources (this last one is only available if you decide to upload/share your original data sources). This minisite can be browsed by visitors and they can view your list of indicators, view graphs of your data, download data tables and view your sources. The strength of OMAT is not so much in creating complex data visualizations, but rather in automating the creation of data visualizations and enabling visitors to explore your data online, by clicking through your figures and tables on a website (all numbers are linked so users can dive into your data and view data at any available depth, which is difficult to do in print). All of this will not require you to have any programming or web development skills.

Below are two screenshots from OMAT.

Screenshot 1: dataset overview

Screenshot 2: example of graph generated by OMAT

But to really see the final output, it is best to simply browse some of the minisites. Use the links below to see some of the datasets:

Barcelona MFA (2005-2011)
Cape Town MFA (2013)
Iceland MFA (1962-2008)

Tutorial

In this example, we will create a very simple MFA project. In this project we will show how to load the Eurostat MFA framework categories and then add some random data for our fictitious city, Ficticity. To start the process, click on the Create Project option and fill out the form. In this case we will add MFA data for 3 years (2013-2015) and we will not activate any of the special options to keep things simple. After creating the project, a dashboard will open as can be seen in the screenshot below.

Under YOUR DATA we should load the figures. There are two options. Either we create our own categories (for instance, we may want to define our own material or energy flows of interest), or we can simply load all the Eurostat data groups by clicking the first option. After clicking this option, over one hundred categories are loaded, spread across four main flow types (Domestic Extraction Used, Imports, Exports and Domestic Processed Output). For each of these flows we can enter data points for a large list of sub-categories. After navigating to the right category, it's just a matter of clicking a button to add a data point and filling out the information:

At any time you can review the data by going to REPORTS and checking out the data tables. Once all the data has been entered you can go to this same REPORTS section to see all the information that is generated by OMAT. By default you will see that OMAT calculated several standard indicators (if you used the Eurostat methodology) including the Direct Material Input, Physical Trade Balance and more. All these indicators are shown in data format and in visual graphs. Furthermore, graphs are generated for each data group and can be explored by yourself and any other person that you give access to the dataset. If you want to share the data with the public, you can return to your dashboard and under SETTINGS you can select the data access (either private, semi-private, or public).
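To give a feel for what such indicators represent, here is a minimal Python sketch using the standard Eurostat definitions: Direct Material Input is domestic extraction plus imports, Physical Trade Balance is imports minus exports, and Domestic Material Consumption is Direct Material Input minus exports. This is not OMAT's own code; the function name, the dictionary keys and all figures for Ficticity are invented for illustration.

```python
# Fictitious yearly totals (in tonnes) for the made-up city of Ficticity.
# The keys are hypothetical names chosen for this sketch.
flows_by_year = {
    2013: {"domestic_extraction": 1200.0, "imports": 800.0, "exports": 300.0},
    2014: {"domestic_extraction": 1150.0, "imports": 900.0, "exports": 350.0},
    2015: {"domestic_extraction": 1100.0, "imports": 950.0, "exports": 400.0},
}


def mfa_indicators(flows):
    """Compute three headline indicators from one year's totals."""
    dmi = flows["domestic_extraction"] + flows["imports"]  # Direct Material Input
    ptb = flows["imports"] - flows["exports"]              # Physical Trade Balance
    dmc = dmi - flows["exports"]                           # Domestic Material Consumption
    return {"DMI": dmi, "PTB": ptb, "DMC": dmc}


results = {year: mfa_indicators(flows) for year, flows in flows_by_year.items()}

for year, values in sorted(results.items()):
    print(year, values)
```

In OMAT these calculations happen automatically once the data is entered, so a script like this is mainly useful for checking your understanding of the indicators.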

To show the mini site that was generated by OMAT as part of this tutorial, click here.

CARTO (CartoDB)

What is it?

CARTO (formerly known as CartoDB) is an online mapping visualization service. It allows users to create maps to visualize all kinds of data: from differences at neighborhood level to flows between cities or global comparisons. As long as your data is somehow linked to a geographical location, CARTO can turn it into a map. You simply upload your data, and CARTO turns it into maps that are often beautiful.

CARTO is a so-called 'freemium' service. That means that the service is available for free, but additional features are available at a cost. For instance, the free plan allows user maps to be loaded 75,000 times per month. Maps that are opened more frequently require a paid plan. Similarly, online location data services (like routing or time and distance isolines) are only available in the paid plans. However, CARTO provides free upgrades for educational or research use of their maps. So if you use the maps to visualize your urban metabolism research data, you can likely take advantage of the advanced features without having to pay.

Final Output

The final output of CARTO is an online map. It is similar to Google Maps: you can open the map and zoom in, zoom out and click on items on the map. You can either direct people to your map on the CARTO website or, as is more frequently done, embed the CARTO map on your own website (on your blog, research site, etc.). CARTO generates a few lines of code that you can copy and paste. If you paste this code into another website, the map will load as part of that website, without requiring visitors to leave your site to view the map.

Below are some samples that come from CARTO's excellent sample gallery. These are real maps made by users.



The map on the left is a pollution map for Los Angeles, and the one on the right visualizes New York's tree cover.

Tutorial

For this tutorial we will show you how to create a simple map that we have actually already made for our own website! This is a map that displays where urban metabolism studies are done. We analyzed our publications database and listed the cities that were studied in each publication. The list that we made includes the name of the city, the number of publications, and a small blurb of text to list the titles of each publication (and a link to view them). The spreadsheet looked something like this:



After creating an account on CARTO you can start loading a new map. The first step is to load your data. You can load data from various spreadsheet formats (CSV, Excel, etc.) and the file can be uploaded from your computer, from Google Drive, or from other sources. The screenshot below shows how to upload the data:



Once the data is loaded, CARTO will do its best to create a map for you. However, if you don't have any georeferences in your spreadsheet (like latitudes and longitudes), then the map will come up empty. In our case, we were indeed presented with an empty map. But CARTO can help us with that! Simply go back to the DATASET tab, and click on the GEOREFERENCE LAYER link. You can manually select the column that has geographical data in it. A wide variety of data can be used: city names, administrative regions, postal codes, IP addresses or street names. In our case, we selected Cities and linked this to the column in our spreadsheet.
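Because the georeferencing in our case relies on a column of city names, it helps if the upload file has exactly one clean row per city. The spreadsheet described at the start of this tutorial can be assembled by hand, but it could also be generated with a short script. The Python sketch below is only an illustration: the input records, the placeholder titles and the column names (`city`, `publications`, `description`) are assumptions made for this example, not the actual structure of our database.

```python
import csv
import io
from collections import defaultdict

# Hypothetical input: one (city, publication title) pair per studied city.
# The titles are placeholders, not real publications.
records = [
    ("Cape Town", "Example study A"),
    ("Barcelona", "Example study B"),
    ("Cape Town ", "Example study C"),  # stray space, cleaned up below
]


def build_rows(pairs):
    """Group titles by city and return one row per city, sorted by name."""
    titles_by_city = defaultdict(list)
    for city, title in pairs:
        titles_by_city[city.strip()].append(title)
    return [
        {
            "city": city,
            "publications": len(titles),
            "description": "; ".join(titles),
        }
        for city, titles in sorted(titles_by_city.items())
    ]


rows = build_rows(records)

# Write the rows as CSV text; in practice this would go to a file for upload.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["city", "publications", "description"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The resulting file has one column with city names for georeferencing, one with the publication count, and one with the text blurb.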


CARTO now updated the map and successfully showed dots on the map!



In the cropped snippet above you can see that there are dots on the map, but their size is uniform. However, we want the size to be linked to the column that contains the number of publications for that city. Again, no problem in CARTO. By clicking the WIZARDS icon on the right hand side, we can select different types of visualizations. The 'bubble' visualization is the one we need, and we need to link the bubble size to the publication quantity:



After selecting these options, CARTO will show a map with differently sized dots, depending on the number of publications for each city!



We still need to take one more step: when people click a dot, they should see the text blurb that we typed, which contains the titles of the publications. Again, this is very easy. Click on any of the dots, and a window will pop up asking you to link the action of clicking a dot to one of the columns. Here you can select the right column, and that's all.



There are more settings you can change if you wish to customize your map a bit more. But the steps above are all that is needed to get your map up and running. Once you are set to go, click the PUBLISH button at the top right of the page, and you will be given a link as well as the code to embed your map on other pages. And how will it look? Check it out below, where you can see the final map that we created.

Conclusion

We hope that these data visualization ideas and explanations got you excited about creating your own online visualizations. As you can see, it is not so hard and the possibilities are plenty. We also invite you to contact us if you have any other tips for data visualization software that can be used, or if you would like to add your own blog post or tutorial about the data visualization website of your choice.