
Linking shrid data between SHRUG1 and SHRUG2

Shrids in SHRUG versions 1 and 2 differ in two ways.

  1. Shrids in SHRUG 2.0 (called shrid2 in the data) have longer IDs, because they include census district and subdistrict identifiers.

  2. We have improved the quality of the location keys in SHRUG 2.0; this means that some shrids from version 1 of the SHRUG do not have 1:1 relationships with shrid2s. The biggest change is that we have disaggregated shrids in some urban regions, which were combined as very large geographic blocks in SHRUG 1. See this page for more details on what changed.

If you are currently using SHRUG 1, we recommend re-downloading all the modules you need from SHRUG 2, as there are many data quality improvements. However, if you are already in too deep and need to merge shrid1 to shrid2, it can be done for >95% of shrids using the shrid1-shrid2 key, included in the keys module. In the key, the SHRUG 1 identifier is stored in the variable shrid1 and the SHRUG 2 identifier in the variable shrid2. The merge can be executed as follows:

/* use a shrid-level dataset from shrug v1 (in this example, PC01) */
. use shrug_pc01_pca.dta, clear

/* rename the shrid variable to indicate it's from version 1 */
. rename shrid shrid1

/* merge in shrid2 IDs using the shrid1-shrid2 key */
. merge 1:m shrid1 using shrid1_shrid2_key.dta

  Result                           # of obs.
  -----------------------------------------
  not matched                        63,220
      from master                    17,265  (_merge==1)
      from using                     45,955  (_merge==2)

  matched                           531,755  (_merge==3)
  -----------------------------------------

Please note carefully:

  1. This is a one-to-many merge, because we have split some shrids since the last version. The command above therefore duplicates some of your data: every shrid1 that was split appears once for each shrid2 it maps to. You may want to recollapse to the shrid1 level.

  2. We lose about 5% of shrid1 locations, which have many-to-many (m:m) correspondences to shrid2. Note that these correspond to about three times as many shrid2 locations, because SHRUG 2 is more disaggregated than SHRUG 1, particularly in cities.
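The recollapse mentioned in the first note could look roughly like the sketch below. It is only an illustration: it runs on two tiny made-up datasets built with `input` rather than the real SHRUG files, and the IDs, the variable `pop`, and the helper variable `n_shrid2` are all hypothetical. It also assumes your variables are measured at the shrid1 level, so each value is simply repeated across the duplicated rows.

```stata
/* toy stand-in for the shrid1-shrid2 key (made-up IDs):
   shrid1 "A" maps to one shrid2, shrid1 "B" was split into two */
clear
input str2 shrid1 str4 shrid2
"A" "A-01"
"B" "B-01"
"B" "B-02"
end
tempfile key
save `key'

/* toy stand-in for a shrid1-level dataset; pop is a hypothetical variable */
clear
input str2 shrid1 pop
"A" 100
"B" 250
"C" 40
end

/* one-to-many merge, as in the example above */
merge 1:m shrid1 using `key'

/* keep only the shrid1s that were matched to at least one shrid2 */
keep if _merge == 3
drop _merge

/* record how many shrid2s each shrid1 maps to */
bysort shrid1: gen n_shrid2 = _N

/* pop is repeated on every duplicated row, so take the first nonmissing
   value (not the sum) when collapsing back to the shrid1 level */
collapse (firstnm) pop n_shrid2, by(shrid1)
list
```

Taking the first nonmissing value is appropriate here only because the merge copied the same shrid1-level value onto every row. If you have since merged in variables measured at the shrid2 level, a sum or a weighted mean would likely be the better choice for those variables.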