
Contribution checklist

What form should my data take?

Your data should be in tabular format linked to SHRUG identifiers. Most often, data will be contributed at the shrid level. Occasionally other dimensions (such as time) will be required to uniquely identify the rows, or data will be contributed at other levels, such as the subdistrict or district.

One of the most important parts of contributing to the SHRUG is creating documentation that describes geographic and temporal coverage, dataset source, variable construction and usage notes, and any other information relevant to final users. We have a metadata template that standardizes this process.

You can link your data from the various census identifiers included in the SHRUG keys, e.g. the PC11 village-town or shrid key. For example, if you have village- or town-level data that you would like to contribute, you will first need to convert from PC11 identifiers to the shrid.
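As a rough illustration of the conversion described above, the sketch below attaches a shrid to village-level rows through a key and then aggregates to the shrid level. The column names (`pc11_state_id`, `pc11_village_id`, `num_schools`), the ID values, and the helper `villages_to_shrid` are hypothetical and are not taken from the actual SHRUG keys. It also assumes that several villages can map to one shrid and that the variable is a count, so summing is a sensible aggregation; other variables may need a different rule.

```python
from collections import defaultdict

# Hypothetical key rows: each PC11 village maps to one shrid.
# Column names and ID values are illustrative, not real SHRUG identifiers.
key_rows = [
    {"pc11_state_id": "01", "pc11_village_id": "000001", "shrid": "11-01-000001"},
    {"pc11_state_id": "01", "pc11_village_id": "000002", "shrid": "11-01-000001"},
    {"pc11_state_id": "01", "pc11_village_id": "000003", "shrid": "11-01-000003"},
]

# Hypothetical village-level data to contribute.
village_rows = [
    {"pc11_state_id": "01", "pc11_village_id": "000001", "num_schools": 2},
    {"pc11_state_id": "01", "pc11_village_id": "000002", "num_schools": 1},
    {"pc11_state_id": "01", "pc11_village_id": "000003", "num_schools": 4},
    {"pc11_state_id": "01", "pc11_village_id": "999999", "num_schools": 5},
]


def villages_to_shrid(data_rows, key_rows, value_col):
    """Attach a shrid to each village row via the key, then sum value_col within shrid.

    Returns the shrid-level totals and the list of village IDs missing from the key.
    """
    lookup = {
        (k["pc11_state_id"], k["pc11_village_id"]): k["shrid"] for k in key_rows
    }
    totals = defaultdict(int)
    unmatched = []
    for row in data_rows:
        village = (row["pc11_state_id"], row["pc11_village_id"])
        shrid = lookup.get(village)
        if shrid is None:
            # Keep track of rows that do not link, rather than dropping them silently.
            unmatched.append(village)
            continue
        totals[shrid] += row[value_col]
    return dict(totals), unmatched


shrid_totals, unmatched = villages_to_shrid(village_rows, key_rows, "num_schools")
```

Reporting the unmatched villages separately makes it easier to see how much of the dataset fails to link before anything is submitted.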

If you have spatial data or other data that does not link easily to shrid, or data that exists at a higher level of aggregation but would still make a meaningful open-source contribution, please contact us and we may work with you to develop a plan on how to incorporate it.

Checklist

  • Convert dataset to shrid or other SHRUG-native ID
  • Assert that each row is uniquely identified (usually by geographic variables alone)
  • Complete the dataset-level metadata according to the metadata template
  • Complete the variable-level metadata according to the metadata template
  • Check that all variable names are lowercase with underscores, with no camelCase (see note 1)
  • Include descriptive variable labels in the dataset if you are submitting your data in .dta format
  • Provide a key for any data that is not shrid-level (see note 2)
  • Email zipped data files and metadata sheet to info@devdatalab.org
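Two of the checklist items, row uniqueness and variable naming, lend themselves to a quick automated check before submission. The following is a minimal sketch, not an official SHRUG validation tool: the helpers `duplicate_ids` and `bad_variable_names`, the sample rows, and the exact naming pattern are assumptions made for illustration.

```python
import re

# Assumed naming rule: lowercase letters and digits, words separated by single underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def duplicate_ids(rows, id_cols):
    """Return the ID tuples that appear more than once (empty list means rows are unique)."""
    seen = set()
    dupes = []
    for row in rows:
        ident = tuple(row[col] for col in id_cols)
        if ident in seen:
            dupes.append(ident)
        seen.add(ident)
    return dupes


def bad_variable_names(names):
    """Return the variable names that are not lowercase with underscores."""
    return [name for name in names if not SNAKE_CASE.match(name)]


# Hypothetical shrid-year panel: shrid alone does not identify rows, shrid plus year does.
rows = [
    {"shrid": "11-01-000001", "year": 2011, "num_schools": 3},
    {"shrid": "11-01-000001", "year": 2012, "num_schools": 3},
    {"shrid": "11-01-000003", "year": 2011, "num_schools": 4},
]

dupes_by_shrid = duplicate_ids(rows, ["shrid"])
dupes_by_shrid_year = duplicate_ids(rows, ["shrid", "year"])
flagged_names = bad_variable_names(["pc11_state_id", "pc11StateID", "num_schools"])
```

In this sample, the check on `shrid` alone reports a duplicate, which signals that another dimension (here `year`) is needed to identify the rows.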

How do I fill out the metadata?

The metadata template provided contains examples of how we expect fields to be filled. Yes, documentation and metadata aren't thrilling. But thorough completion of these fields is a prerequisite to contributing to the SHRUG because it's a prerequisite to effective use.

If fields aren't relevant to your data (e.g. sampling or weighting), enter "N/A". If you have any questions, please reach out to us at info@devdatalab.org.

How do I submit my data?

To apply to integrate your data into the core SHRUG, please first reach out to us at info@devdatalab.org to see if your data is a good fit. If we think it is, we will then ask you to go through the checklist steps above. Once all items are complete, we will publish your data in the next minor version of the SHRUG.


  1. Good: pc11_state_id. Bad: pc11StateID

  2. A key is a table that matches one set of ID variables to another. For example, an Economic Census 2013 to Population Census 2011 district-level key would match EC13 district IDs to PC11 district IDs. Similarly, if you are contributing RBI bank branch data, you should provide a key that matches bank branch IDs to shrid.
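To make note 2 concrete, a key for bank branch data might look like the rows below, with one sanity check applied to it. This is only a sketch: the column name `branch_id`, the ID values, and the helper `ambiguous_ids` are hypothetical, and the check assumes that each branch should link to exactly one shrid (several branches sharing a shrid is fine).

```python
from collections import defaultdict

# Hypothetical key: one row per bank branch, giving the shrid it falls in.
key_rows = [
    {"branch_id": "B001", "shrid": "11-01-000001"},
    {"branch_id": "B002", "shrid": "11-01-000001"},
    {"branch_id": "B003", "shrid": "11-01-000003"},
]


def ambiguous_ids(key_rows, from_col, to_col):
    """Return source IDs that the key maps to more than one target ID."""
    targets = defaultdict(set)
    for row in key_rows:
        targets[row[from_col]].add(row[to_col])
    return {src: sorted(dst) for src, dst in targets.items() if len(dst) > 1}


clean_result = ambiguous_ids(key_rows, "branch_id", "shrid")

# Adding a conflicting row shows what the check reports.
conflicting_rows = key_rows + [{"branch_id": "B003", "shrid": "11-01-000001"}]
conflict_result = ambiguous_ids(conflicting_rows, "branch_id", "shrid")
```

An empty result means every source ID resolves to a single target, so data linked through the key will not be duplicated across shrids.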