New Metabolism of Cities Data Hub Launched
Two and a half years after launching the MultipliCity project, Metabolism of Cities has now launched the next version of its online, open-source data hub on urban metabolism.
What started initially as an exercise to collect data from academic papers in 2017 has slowly grown into a platform to crowdsource and centralise data on material flows, stocks, and other relevant parameters in the urban metabolism. With the MultipliCity project, Metabolism of Cities rolled out a prototype system for gathering, visualising and sharing urban metabolism data. This system was trialled over the past years by uploading data for a number of cities, and by sharing the system and gathering feedback from relevant users including colleagues from the sibling website Metabolism of Islands.
Since early 2020, Metabolism of Cities has been building a new version of this system, now rebranded as the Metabolism of Cities Data Hub. It is launched as an independent sub-site within the network of websites managed by Metabolism of Cities, one of a series of such project sites introduced with the fourth edition of the platform.
A number of lessons were learned from the initial data platform, and have led to some key structural changes in the new system.
- Introduction of "basic records": within the previous system, various types of data and other records were all kept in individual data tables. This led to an intricate system of linkages to keep track of the connections between different parts (e.g. between data points, uploaded datasets, data uploaders, data owners, cities, countries, journal publications, reports...). In the new system, a "basic record" has been set up to facilitate creating complicated linkages without the need to define individual relationships for each pair of record types. Instead, a new "relationship manager" allows users to select two records and define their relationship, independently of which types of record are being linked.
- Update to STAFdbs: the underlying data structure, called the Stocks and Flows Database Schema (STAFdbs), was further refined based on the initial experiences. For the most part the existing system held up very well, but a primary difficulty was labelling individual reference spaces using multiple labels or geocode schemes. This came to the fore with places like Hong Kong, which is a city, an island, and a country all at the same time. The previous system required multiple entries for this single space, which led to duplication and inconsistencies. In the new system this space is registered only once and then tagged at three (or more) levels.
- Hierarchical reference spaces: similar to the earlier challenge, the previous system didn't allow for a programmatic understanding of the hierarchy between different spaces. The breakdown of a continent into countries; of countries into provinces/departments/states; of those into municipalities; of cities into suburbs... all of these hierarchies were difficult to properly record and manage. This has now been resolved by the introduction of a nested geocoding scheme.
- Geospatial database fields: in the previous system, geospatial data were recorded either as individual latitude and longitude pairs, or as GeoJSON objects within a text-based database field. While this worked for data recording and general visualisation, it lacked more advanced functions like spatial database queries (e.g. select all records that are located within a certain area, within a certain distance, or that overlap with another area). This limitation was tackled by using PostGIS, which is well supported in Django (our platform of choice) through GeoDjango. This now enables a whole new level of spatial data queries.
- Phased data management: previously, there was a single procedure to upload data and to prepare the data for publication. However, it became clear that not all users who upload data have the skills, time, or interest to process the information. In order to better facilitate a division of labour, and to allow the crowdsourcing process to scale better, a phased approach was integrated, which allows one user to upload data, another user to process the data, and yet another to analyse the information. This can all still be done by a single user as well, but where appropriate these steps can now be split up.
- Document library integration: there is often a clear link between a library item, such as a journal article, report, or published dataset, and the material stocks and flows data that is being published on Metabolism of Cities. However, in the previous system this linkage was not maintained clearly throughout the workflow, which meant that there was duplication in managing metadata and other information about the data points. This is now fully integrated.
- Holistic data approach: previously, the actual data that mattered most were the "stocks and flows data". There were options to upload pieces of text for other information sections (e.g. around climate or population), but these options were limited and mostly embedded as an afterthought. In the new system, there is a more holistic approach towards the information that is managed and presented within the system. This includes the creation of four data layers, based on previous academic work, which all together form the data and information foundation for the city dashboard.
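To make the first three changes more concrete, here is a minimal sketch in plain Python of how generic records, a relationship manager, multi-level tagging, and a nested hierarchy could fit together. It is an illustration under assumptions, not the Data Hub's actual schema: the names `Record`, `Hub`, `relate`, `ancestors`, and the relationship labels `published_in` and `located_in` are all hypothetical, and the real system is built on Django models rather than in-memory objects.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    """Hypothetical 'basic record': one generic shape for every record type."""
    id: int
    kind: str   # e.g. "dataset", "publication", "reference_space"
    name: str


@dataclass
class Hub:
    """Hypothetical in-memory stand-in for the record store."""
    records: dict = field(default_factory=dict)        # id -> Record
    relationships: list = field(default_factory=list)  # (source_id, label, target_id)
    levels: dict = field(default_factory=dict)         # id -> set of level tags

    def add(self, kind, name, levels=()):
        # Register a record once; a space may carry several level tags.
        record = Record(len(self.records) + 1, kind, name)
        self.records[record.id] = record
        self.levels[record.id] = set(levels)
        return record

    def relate(self, source, label, target):
        # "Relationship manager": link any two records, whatever their kinds.
        self.relationships.append((source.id, label, target.id))

    def related(self, source, label):
        return [self.records[t] for s, l, t in self.relationships
                if s == source.id and l == label]

    def ancestors(self, space):
        # Walk "located_in" links upward; assumes one parent and no cycles.
        chain = []
        parents = self.related(space, "located_in")
        while parents:
            chain.append(parents[0].name)
            parents = self.related(parents[0], "located_in")
        return chain


hub = Hub()

# One entry for Hong Kong, tagged at three levels instead of three entries.
hong_kong = hub.add("reference_space", "Hong Kong",
                    levels={"city", "island", "country"})

# A nested hierarchy expressed with the same generic relationship mechanism.
africa = hub.add("reference_space", "Africa", levels={"continent"})
south_africa = hub.add("reference_space", "South Africa", levels={"country"})
western_cape = hub.add("reference_space", "Western Cape", levels={"province"})
cape_town = hub.add("reference_space", "Cape Town", levels={"city"})
hub.relate(south_africa, "located_in", africa)
hub.relate(western_cape, "located_in", south_africa)
hub.relate(cape_town, "located_in", western_cape)

# A dataset linked to a library item without a dedicated link table.
dataset = hub.add("dataset", "Example material flow dataset")
article = hub.add("publication", "Example journal article")
hub.relate(dataset, "published_in", article)

print(hub.ancestors(cape_town))
```

Because every link goes through the same `relate` call, adding a new record type in this sketch needs no new linking code, which is the point of the "basic record" design described above.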
Roll-out
The new data portal will be rolled out in phases. In the first phase, officially launched on August 24, 2020, the Data Collection phase is prioritised. This means that most of the data portal is geared towards encouraging visitors to contribute to the data collection process, and progress will be shown from a data collection perspective. This goes hand in hand with two online courses rolled out by Metabolism of Cities, which are geared towards training people in the data collection process.
Once sufficient data has been collected to provide a solid information baseline, the next phase will be rolled out. In this phase, Data Processing is the focal point. This entails reviewing, tagging, and properly storing all data and information loaded into the system in the previous phase. Metabolism of Cities plans to roll out another course to collaborate with participants who are enthusiastic about getting their own city online. This process is scheduled to start in September 2020.
In the last quarter of 2020, a number of tools and features will be rolled out to allow for Data Analysis. The exact make-up of this phase is still to be determined, but by using the information previously uploaded, and by combining the various types of information available (stocks, flows, demographics, economics, etc.), there are many opportunities to obtain new insights and to find ways to translate these data into sources of information for policy making and monitoring.