A place for all your questions and thoughts on the Data Hub and how we can make things better, things you might want to see in the future, new ideas etc
Questions and ideas for the Data Hub
Hi, where is the raw data stored? Inside this site, on GitHub, or somewhere else? I ask both so that people can contribute and copy and adapt files, and for peer review. For example, for data-driven visualisations, there are .html files and .js files in addition to the researched data of the metabolic flows.
What I am currently missing is
- a way to access all my comments, questions and similar communication from my profile (or similar). I sometimes remember having asked a certain question at some point, but I cannot find the thread. It's great that you can add a comment or a question almost anywhere, but the thread is hard to find again afterwards. Also, searching in the forum is very limited and it is difficult to find information there, so you tend to ask a question again that might already have been answered.
- a permanent filter that allows me to only browse within datasets that belong to a given city or that are assigned to my person
- similarly, a two-level view for the datasets: first level, all verified and valid datasets that readily allow analyses; second level, all the rest. Currently (at least for Madrid), there are a lot of datasets that are not really useful yet (not processed, or invalid, or anything else) and that create a lot of mess. Actually, it's also a filter issue I guess.
and more to come..
- a possibility to change the column width when browsing through the datasets. Currently, the title of the dataset is usually truncated and there is no way of seeing the whole name of the dataset.
Thanks a lot for these comments! We will be sitting down in the second half of January to look at system upgrades so please keep any feedback coming. What you mentioned is already very helpful and we will be looking into improving this.
Bernelle: I also see you asked about raw data storage. To finally answer that question: the datasets are uploaded by users to our site, and can be re-downloaded by anyone. Our website source code itself is available through GitHub.
As I had discussed with you, I was wondering whether it would make sense to enable other reference spaces in the Data Hub, such as countries or regions.
Originally the same layers would be applied as for the cities. In the future, of course, it would be great to link these scales (this reference space is within another one) and then eventually be able to downscale some info from one to another.
I know you were going to give it some thought; I'm just putting it out there.
Hi Aris, we can look into it but I want to make sure we don't make the MOC Data Hub even more confusing ;-) If it starts featuring especially countries it will look odd. At a minimum I'd need to make some interfaces to split things up.
What do you have in mind specifically? Who would work on this and what would their workflow be? I'm thinking an alternative is to set up those sites but hidden from view -- all very basic and just focused on being able to insert data, not from the MOC Data Hub but from some sort of a temporary link that ultimately may be converted into MOR / MON. What do you think?
Thanks for your reply. Yes, I don't know what the best way forward is. I thought that having one place for everything would make sense, at least as the homepage of the data hub, and then you can click on the cities/countries tab up above or down below and you only see info on cities/countries/regions, etc.
Who? I know I have some students that work on other scales, and sometimes I want to do an MFA at another scale, so I would find it handy. The workflow would be the same as now (layers, data collection, data processing and, in the future, data analysis I can imagine). Also, the thing is that at these other scales there is much more data. That means that it could illustrate how the Data Hub works and its potential, as well as highlight that we need to focus on cities because data are missing.
An alternative can also be a very good idea, but what would the benefit be? Not to maintain it? Not to do polishing? As always, the question of having MoR and MoN is of course very, very appealing, and I don't know if we should divide them or group them (I guess we need to test that).
I clearly don't have a good final solution for this. I would say the best solution would be the one requiring the least amount of work. The idea is to test adding data at other scales and see what the challenges are at these scales. Perhaps it would bring good insights for cities.
Cool Aris, noted.
And yes, I see the benefit but am also doubtful about spreading our efforts. Any work I do on this is directly at the expense of our MOC data hub. I'd really rather not add menu items and options to our existing data hub. I really struggle to keep the menu clear and organized as-is, and adding more confusing sections there will not help. As I said, I could be convinced to have a temporary page/site (think a data hub clone a la Cityloops) where some insiders (i.e. your students) can work on different scales. But semi-embedding in our city hub is not something I'd be very keen to do. Also, it would be very good to think about whether your own efforts around this really strengthen our data hub, or dilute the efforts around it. I don't really think that lack of data on a city level is what is holding us back. After all, we have various cities with full MFAs done, and if we really wanted to we could even take a city state like Iceland and load all of that data in MOI. What we simply haven't technically made yet are the tools to upload, process and then do analysis around MFA-specific data. So if we want to take OUR data hub to the next level, I think we should focus on that and not on adding countries and regions.
I can be convinced to set up something, but do need strong arguments in favor...
Haha fair enough. Touché! You are right the analysis part is way more urgent.
Sorry, I get easily excited about this, because of some of the datasets (especially the IRP but also the different flow datasets we have from the Eurostat grid) that I would like us to ingest, and because I had some students working in Swiss cantons.
But you are right. One step at a time.
However, I still think that in the (far) future our data hub should be one single beast, eventually with different outlets for more discussion and collaboration (meaning that at a certain stage we should be able to browse and compare data from different scales), but perhaps each scale would have different scientists, citizens and policy makers interested in contributing.
What do you think?
Haha the roles have reversed!! OK but great, we're on the same page then. We'll go step by step.
In terms of the further future, having one single place and being able to combine scales... YES. I agree with that idea, but we'll simply have to see indeed what the best way is of still retaining a simple focus (this is where the whole idea of crowdsourcing hub vs research hub vs policy hub may come into play). I would say let's think this out fully, in due course, after we have taken our existing cities-level data to the next level.
I have another issue, related to updating all the data in the platform. The majority of data one finds is in some kind of proprietary format that requires substantial processing to bring it into the file format required for upload. It's perfectly fine to do this once, but I fear that the motivation (and time) to do this again every 6 or 12 months will be difficult to find. So, it would be necessary to find a way of offering a more flexible upload format, allowing for example different flows to be organized in columns ('wide' format) instead of in rows. To give you an example, I attached a data file with statistics on petroleum products. I want to have a file where I can just paste the updated data table whenever I like into the tab "raw data", and the tab with the processed data then updates automatically. In the current format, this would be quite an easy task, but when I have to repeat date rows to fit the flow data for different categories (segments) into one column, it becomes difficult.
Oops, there was something wrong with that file. Here it is again.
Thanks for the info! You raise a valid point. I have created a new task on our to-do list, which you can find here. As you can see, I have added your file there as an example. However, if you have additional examples, can you add them there? I'd like to study a few of them so that we can try to observe some patterns and then make a decision.
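To make the 'wide' versus row-based distinction above concrete, here is a minimal sketch in Python of how a wide table (one column per flow segment) could be unpivoted into one row per date and segment. It is not the Data Hub's actual upload code: the function name `wide_to_long`, the column names (`date`, `segment`, `quantity`) and the numbers in the sample table are all assumptions made up for illustration.

```python
import csv
import io


def wide_to_long(wide_csv_text, id_column="date"):
    """Unpivot a wide CSV (one column per segment) into long rows.

    Hypothetical helper: the column names are illustrative assumptions,
    not the Data Hub's real upload template.
    """
    reader = csv.DictReader(io.StringIO(wide_csv_text))
    long_rows = []
    for record in reader:
        period = record[id_column]
        for segment, quantity in record.items():
            # Skip the identifier column itself and any empty cells.
            if segment == id_column or quantity in (None, ""):
                continue
            long_rows.append(
                {"date": period, "segment": segment, "quantity": float(quantity)}
            )
    return long_rows


# Made-up sample data in wide format: one row per year, one column per product.
raw = """date,gasoline,diesel,kerosene
2019,120.5,300.0,45.2
2020,98.1,280.4,
"""

long_rows = wide_to_long(raw)
for row in long_rows:
    print(row["date"], row["segment"], row["quantity"])
```

With something along these lines, the updated wide table could be pasted in as-is and the repeated date rows would be generated automatically rather than by hand.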
Hi Paul / other programming-capable people :)
Before I make a "feature request" for this, I wanted to ask whether it is already possible: for Markdown in data articles, but also other parts of the site, e.g. video descriptions, is it already possible to make a nested list (see attached example image)? (Image source: https://stackoverflow.com/questions/37575916/how-to-markdown-nested-list-items-in-bitbucket) Especially for video descriptions, I'd find this very useful, but also for course exercises, for example.
Hi Carolin, that is already possible! Put four spaces down:
- Main item
    - Four spaces before I put the dash in
    - Four spaces before I put the dash in
- More stuff
    - you will see
        - lots of nested lists from now on!
        - Even if I will go crazy with adding four and eight spaces ;)
    - you will see
Hahaha OK great! (if not somewhat worrying... ;-) )
Hello everyone! 😊
I would like to give a few ideas for the future of the DataHub.
On the first page of the Hub, where all the dashboards of the cities are displayed, I think you don't get an overall view of which cities/regions people are working on, so what already exists and what is missing. I've already suggested putting the name of the country next to the city, though maybe this is too much. But what I think would be even better is to create a map (as I think you already had in your previous version) that directly shows the location of the cities people are working on. I think this is really important for getting an overall view, and the map is also very cool!
Secondly, I've been working on this platform in the last few months. Before starting I knew a bit about Urban Metabolism, but I'm still taking my first steps in this domain, so I'm not an expert at all. However, I was missing a video explaining your overall vision of the platform. To be more specific, I think it would be nice to have in the Education Hub, or as a small presentation of the DataHub, a video that explains the purpose of the platform and what you can build with it.
Only once I started to process data did I realise what the real potential of the platform was and what you were imagining doing with it. I didn't grasp from the beginning the important link between reference spaces (so the shapefiles you are collecting during data collection) and datasets on Stock & Flow (which ask for a reference space while processing). Having known this link from the beginning would have helped me to better orient my research and collect the most adequate shapefiles and datasets.
Another example: I created some geospatial spreadsheets because I thought it was good to list different infrastructure, but I didn't understand that it was maybe more important to list just a few of them in order to create the reference space I really needed.
Maybe I had this struggle because, as I said, I don’t have a lot of experience in this field. But I wanted to give my advice 😊
Anyway, now that I get it better, this platform is really a genius idea! Thank you!