Wednesday, November 28, 2018

We, the Nation

Recently, the Indian Railways reported that about 21 lakh bedroll items worth over Rs 14 crore went missing from AC coaches during 2017-18. This is baffling when we consider that individuals who can pay for an AC ticket can certainly afford to buy these items. It seems they took the writing ‘Railways is your property’ a bit too seriously and conveniently forgot to read the ‘protect it’ part of the sentence!
The greatness of a nation is a public good; everyone likes to enjoy it, but no one wants to contribute towards it. Who would not want the honour of being a citizen of a strong and great nation? Yet while we want to take pride in our country being better than others, we leave the responsibility of making it great to others.

Read more here:

We are the Nation: epaper link

or a more readable one:

We are the Nation: Webpage link

Wednesday, September 5, 2018

Demonetisation: Some tall claims and not so tall answers

Sufficient time has passed since demonetisation for us to analyse it objectively. After the RBI report revealed that almost 99% of the demonetised currency had returned to the banks, strange claims and counter-claims started pouring in, both for and against demonetisation. This write-up examines some such claims from an economist's perspective. You can read this opinion write-up here:

Demonetisation demystified

or here:

Demonetisation demystified 

Sunday, August 5, 2018

Sending wrong education signals


This write-up discusses how the education system is sending the wrong signals, and argues that teaching jobs, instead of merely being an avenue for employment, should focus on attracting the right talent.


You can read the full article here:
https://telanganatoday.com/sending-wrong-education-signals

Tuesday, February 6, 2018

A Note on NSS Data Extraction 1

Extracting Data from the NSS 62nd Round Using STATA

First, we shall extract the NSS data for Consumer Expenditure (62_1.0), which is in ASCII fixed-width format, using STATA (I used STATA 9).

Step 1: Locate the folder/CD containing the data for NSS Round 62. It should contain three sub-folders: Nss62_1.0 for Consumer Expenditure data, Nss62_2.2 for Manufacturing Enterprises data, and Nss62_10 for Employment and Unemployment data. We are going to use the folder containing the Consumer Expenditure data, i.e., Nss62_1.0.

Step 2: The folder Nss62_1.0 should have four items within it: three folders and one text document (README62_1.0). You can find the State code for the State of interest by following the path:
Nss62_1.0 → Supporting Documents → Instrn. To Field Staff_62 → Appendix-2.

For Uttar Pradesh, the State code is 09 (it has been changed from 25, the code used in the NSS 61st Round).
It is also evident from Appendix-2 that Uttar Pradesh has been divided into four NSS regions. Thus, we shall be concerned with the codes 091 (signifying the 1st NSS region of Uttar Pradesh) to 094 (the 4th NSS region of Uttar Pradesh). This will be helpful during data extraction.

Step 3: Locate the “Layout_62_1.0” file, an Excel file, by following the path:
Nss62_1.0 → Supporting Documents → Layout_62_1.0.
We shall call it the Layout file for convenience. The Layout file shows the various “Levels” containing the “Blocks” of the Schedule that was administered to gather information from respondents. You will notice that there are seven Levels in all. Now the real job begins.
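To see how the Layout file maps onto a dictionary file: each row of the Layout file gives a variable's name and the byte positions it occupies in each record. For instance, the State-Region variable sits in columns 17 to 19, so its dictionary entry reads

_column(17) str3 StateReg %3s

exactly as it appears in the Level 1 dictionary under Step 4 below.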

Step 4: Now we need to extract the data for each Level separately, as their contents differ. For example, the variables of Level 1 will not appear in the other Levels, except for the initial 17 common items (up to HHS No.). Thus, we need to write as many dictionary files as there are Levels, which means writing 7 dictionary files to extract the Consumer Expenditure data of Uttar Pradesh.

The following is an example of a dictionary file (for Level 1) for STATA:

dictionary using MH1C01.TXT
{
_column(1) str3 CntrcdRndshft %3s
_column(4) str5 LOTFSU %5s
_column(9) str1 Filler %1s
_column(10) str2 Round %2s
_column(12) str3 Schdlno %3s
_column(15) str1 Sample %1s
_column(16) str1 Sector %1s
_column(17) str3 StateReg %3s
_column(20) str2 District %2s
_column(22) str2 Strtmno %2s
_column(24) str2 Substrtum %2s
_column(26) str1 SubRund %1s
_column(27) str1 Subsmple %1s
_column(28) str4 FODSubReg %4s
_column(32) str1 Segmntno %1s
_column(33) str1 SecndStg %1s
_column(34) str2 HHSno %2s
_column(36) str2 Level %2s
_column(43) str2 SlnoInfor %2s "Serial Number of Informant"
_column(45) str1 Respnscd %1s "Response Code"
_column(46) str1 Survycd %1s
_column(47) str1 Substncd %1s "Substitution Code"
_column(48) str6 DoSurvy %6s
_column(54) str6 DoDespch %6s
_column(60) str3 TCanvs %3s
_column(127) str3 Nss %3s
_column(130) str3 Nsc %3s
_column(133) str10 Mlt %10s
}
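
Before writing all seven dictionary files, it is worth test-running one of them interactively to confirm that the column positions are correct (a quick check, assuming the same paths as in the do-file below):

infile using "E:\NSS extract\Level1.dct", using("E:\Nss62_1.0\Data\MH1C01.TXT")
describe
list in 1/5
clear

If the columns are misaligned, the listed values will look visibly garbled, and the dictionary can be corrected before repeating the exercise for the remaining Levels.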

Step 5: Once all seven dictionary files have been written, store them in a folder, say “NSS extract”.

Step 6: Now we need to write a “do-file” for STATA. Below is the do-file that I wrote and used to extract the NSS data for all seven Levels in one go.

infile using "E:\NSS extract\Level1.dct", using("E:\Nss62_1.0\Data\MH1C01.TXT")
keep if StateReg >= "091" & StateReg <= "094"
keep if Level == "01"
destring, replace
save "E:\NSS extract\Level1.dta", replace
clear

infile using "E:\NSS extract\Level2.dct", using("E:\Nss62_1.0\Data\MH1C01.TXT")
keep if StateReg >= "091" & StateReg <= "094"
keep if Level == "02"
destring, replace
save "E:\NSS extract\Level2.dta", replace
clear

infile using "E:\NSS extract\Level3.dct", using("E:\Nss62_1.0\Data\MH1C01.TXT")
keep if StateReg >= "091" & StateReg <= "094"
keep if Level == "03"
destring, replace
save "E:\NSS extract\Level3.dta", replace
clear

infile using "E:\NSS extract\Level4.dct", using("E:\Nss62_1.0\Data\MH1C01.TXT")
keep if StateReg >= "091" & StateReg <= "094"
keep if Level == "04"
destring, replace
save "E:\NSS extract\Level4.dta", replace
clear

infile using "E:\NSS extract\Level5.dct", using("E:\Nss62_1.0\Data\MH1C01.TXT")
keep if StateReg >= "091" & StateReg <= "094"
keep if Level == "05"
destring, replace
save "E:\NSS extract\Level5.dta", replace
clear

infile using "E:\NSS extract\Level6.dct", using("E:\Nss62_1.0\Data\MH1C01.TXT")
keep if StateReg >= "091" & StateReg <= "094"
keep if Level == "06"
destring, replace
save "E:\NSS extract\Level6.dta", replace
clear

infile using "E:\NSS extract\Level7.dct", using("E:\Nss62_1.0\Data\MH1C01.TXT")
keep if StateReg >= "091" & StateReg <= "094"
keep if Level == "07"
destring, replace
save "E:\NSS extract\Level7.dta", replace

Further explanation regarding the do-file:
“E” is the drive on your PC where you created the folder for storing the extracted data; it could just as well be “C” or any other drive.

“NSS extract” is the folder where the dictionary files have been stored and where the extracted data will be stored.

“E:\Nss62_1.0\Data\MH1C01.TXT” is the path to the ASCII fixed-width data file. It may differ depending on where you have stored the NSS data to be extracted.

“keep if StateReg >= "091" & StateReg <= "094"” drops all the irrelevant data. When we import the data, it also contains records for several other States, since they are all stored together in MH1C01.TXT, whereas we need only the data for Uttar Pradesh. Remember that Uttar Pradesh has the NSS code 09 and has been divided into four NSS regions, so the records of interest are those of Regions 1 to 4 of Uttar Pradesh, coded 091 to 094. The Layout file has a variable named “State-Region”, which we named “StateReg” in our dictionary files; this variable identifies the relevant records. The codes are enclosed in quotation marks because we have extracted all variables in string format.
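
A quick way to confirm that only the Uttar Pradesh records have survived the filter is to tabulate the variable immediately after the keep command:

tab StateReg

Only the codes 091 to 094 should appear in the output.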

“keep if Level == "0X"” makes STATA further drop the data for all Levels except Level X, where X is a whole number between 1 and 7, depending on the Level of our choice.

“destring, replace” converts the imported string variables to numeric form wherever possible. Remember that we imported all the data in string format.
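
For those comfortable with loops, the seven near-identical blocks of the do-file can be compressed into a single forvalues loop (a sketch assuming the same folder layout as above; the local macro `i' supplies the Level number):

forvalues i = 1/7 {
    infile using "E:\NSS extract\Level`i'.dct", using("E:\Nss62_1.0\Data\MH1C01.TXT")
    keep if StateReg >= "091" & StateReg <= "094"
    keep if Level == "0`i'"
    destring, replace
    save "E:\NSS extract\Level`i'.dta", replace
    clear
}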
So, extract data and have fun...!

N.B. I am obliged to my mentor, who bore the brunt of my experiments with NSS data, for I went to him and discussed my hopelessly frustrating strategies for extracting the data in one go...!
This is reposted at the request of many users.