Linking shrid data between SHRUG1 and SHRUG2
Shrids in SHRUG versions 1 and 2 differ in two ways.
- Shrids in SHRUG 2.0 (called shrid2 in the data) have longer IDs, because they include census district and subdistrict identifiers.
- We have improved the quality of the location keys in SHRUG 2.0; this means that some shrids from version 1 of the SHRUG do not have 1:1 relationships with shrid2s. The biggest change is that we have disaggregated shrids in some urban regions, which were combined as very large geographic blocks in SHRUG 1. See this page for more details on what changed.
If you are currently using SHRUG 1, we recommend re-downloading all the modules you need from SHRUG 2, as there are many data quality improvements. However, if you are already in too deep and need to merge shrid1 to shrid2, it can be done for >95% of shrid1s using the shrid1-shrid2 key, included in the keys module. In the key, the version 1 ID variable is named shrid1 and the version 2 ID variable is named shrid2. The merge can be executed as follows:
/* use a shrid-level dataset from shrug v1 (in this example, PC01) */
. use shrug_pc01_pca.dta, clear
/* rename the shrid variable to indicate it's from version 1 */
. rename shrid shrid1
/* merge in shrid2 IDs using the shrid1-shrid2 key */
. merge 1:m shrid1 using shrid1_shrid2_key.dta
    Result                           # of obs.
    -----------------------------------------
    not matched                        63,220
        from master                    17,265  (_merge==1)
        from using                     45,955  (_merge==2)
    matched                           531,755  (_merge==3)
    -----------------------------------------
Please note carefully:
- This is a one-to-many merge, because we split some shrids since the last version. The command above therefore duplicates the rows of any shrid1 that maps to more than one shrid2, and you may want to recollapse to the shrid1 level.
- We lose about 5% of shrid1 locations, which have m:m correspondences to shrid2. Note that these correspond to about 3x as many shrid2 locations, because SHRUG 2 is more disaggregated than SHRUG 1, particularly in cities.
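The duplication and recollapse described in the first note can be sketched on toy data. This is a minimal illustration, not part of the SHRUG codebase: the IDs (A1, A2, A3), the variable pop, the counter n_shrid2, and the temporary file key are all hypothetical stand-ins for your own SHRUG 1 dataset and the shrid1-shrid2 key.

```stata
/* toy illustration: all IDs and values below are made up, not real SHRUG data */
clear

/* hypothetical shrid1-shrid2 key: A2 was split into two shrid2s, A3 is absent */
input str2 shrid1 str5 shrid2
"A1" "A1-01"
"A2" "A2-01"
"A2" "A2-02"
end
tempfile key
save `key'

/* hypothetical shrid1-level dataset with one variable, pop */
clear
input str2 shrid1 pop
"A1" 100
"A2" 250
"A3" 75
end

/* one-to-many merge, as in the command above */
merge 1:m shrid1 using `key'

/* keep matched rows only; A2 now appears twice, once per shrid2 */
keep if _merge == 3
drop _merge

/* record how many shrid2s each shrid1 maps to */
bysort shrid1: gen n_shrid2 = _N

/* recollapse to one row per shrid1; take the first value rather than
   the sum, because the merge copied pop onto every duplicate row */
collapse (first) pop n_shrid2, by(shrid1)

list
```

Using (first) rather than (sum) matters here: the merge copies each shrid1-level value onto every matching shrid2 row, so summing would count it once per duplicate. If you instead bring in shrid2-level variables before collapsing, a sum or a weighted mean may be the appropriate statistic for those variables.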