Datos.gob begins publishing practical guides on the proper use of open data formats and means of access

31 March 2020

The team of the Datos.gob.es portal has published the first in a series of guides intended to inform publishers on the proper use of the most widely used open data formats and means of access. The series begins with a focus on the CSV format.

Although we have more and more data sources at our disposal, their economic impact is expected to reach record figures in the coming years, and data are more accessible than ever, the potential for reuse is still rather limited. One explanation for this is that potential users of the data often face a host of barriers that impede access and use.

Quality problems that hinder the reuse of data can arise in many areas: metadata that are insufficiently descriptive or standardized, the choice of licence, the choice of format, the improper use of formats, or deficiencies in the data themselves. For this reason, many initiatives aim to measure the quality of datasets on the basis of their metadata (update date and frequency, licence, formats used, and so on), for example the metadata quality assessment of the European Data Portal or the quality dimension of the Open Data Maturity Index.

But such analyses are insufficient, given that most quality shortcomings can only be identified once the reuse process has begun. The work involved in cleaning and preparing the data then becomes a major burden that, in many cases, the open data user cannot afford. This creates frustration and a loss of interest among reusers in the data provided by public bodies, damaging the credibility of the publishing institutions and significantly lowering expectations of return and value generation from the reuse of open data.

These problems can be tackled because, to a large extent, they have been observed to arise when the publisher does not know how to express the data correctly in the chosen format.

The Datos.gob initiative

Therefore, with the aim of helping to improve the quality of open data, the team of the datos.gob.es portal has decided to create a collection of guides to inform publishers on the proper use of the most widely used open data formats and means of access.

The collection begins with a focus on the CSV format. This format was chosen for its popularity in the open data field, its simplicity, and how lightweight it is for representing tabular data. It is the most common format in open data catalogues, coexisting with other formats such as XLS or XLSX that could be expressed as CSV. Moreover, it can be called a hybrid format because it combines ease of automated processing with the possibility of being read directly by people using a simple text editor.
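To illustrate this hybrid nature, here is a minimal Python sketch. The sample table, its column names, and the `read_rows` helper are hypothetical and chosen only for illustration; they are not taken from the guide. The same text that a person can read in a plain text editor is parsed by the standard library `csv` module without any further tooling.

```python
import csv
import io

# Hypothetical sample data (not from the guide): a small table as CSV text.
# A person can read this directly in any plain text editor.
CSV_TEXT = (
    "municipality,year,population\n"
    "Exampletown,2019,1200\n"
    '"Sample, Village",2019,340\n'
)


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_rows(CSV_TEXT)

# Quoted fields may contain the delimiter; the parser handles this for us.
print(rows[1]["municipality"])

# All values arrive as strings, so numeric columns must be converted explicitly.
total_population = sum(int(row["population"]) for row in rows)
print(total_population)
```

Note that the value containing a comma has to be quoted to remain a single field, which is the kind of detail a publisher needs to get right for the file to be processed automatically.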

The guide covers the basic characteristics of this type of format and a compendium of guidelines for correctly publishing tabular data, especially in CSV. The guidelines are accompanied by suggestions of free tools that make it easier to work with CSV files, along with additional features. A summary of the guidelines in the guide is also available in the form of a cheat sheet for ease of use and quick reference.

The “Practical guide to the publication of tabular data in CSV files” is available in PDF and PPTX.

Original source of the news

  • Open government