Public data and metadata, Google style
By DAM News Staff | February 23, 2011
I’m finding Google Lab’s (DSPL) pretty fascinating.
Upload a set of data, and it will do some semi-spiffy visualizations of it. (As points out, Martin Wattenberg and Fernanda Viegas now work for Google, so if they’re working on this project, the visualizations are going to get.) More important, the data you upload is now publicly available. And, more important than that, the site wants you to upload your data in Google’s DSPL format. DSPL aims at getting more metadata into datasets, making them more understandable, integrate-able, and re-usable.
So, let’s say you have spreadsheets of “statistical time series for unemployment and population by country, and population by gender for US states.” (This is Google’s example in its helpful.)
- You would supply a set of concepts (“population”), each with a unique ID (“pop”), a data type (“integer”), and explanatory information (“name=population”, “definition=the number of human beings in a geographic area”). Other concepts in this example include country, gender, unemployment rate, etc. [Note that I'm not using the DSPL syntax in these examples, for purposes of readability.]
- For concepts that have some known set of members (e.g., countries, but not unemployment rates), you would create a table — a spreadsheet in CSV format — of entries associated with that concept.
Get articles like this in your inbox: Sign Up