How To / STATA: Draw a Random Sample from Panel Data

Assume we have a data set containing firm data across years. The variable id uniquely identifies a firm, the variable performance is some measure of the firm's financial performance, and the variable year indicates when that performance was recorded. Thus, we have a small panel where the firm-year is the unit of analysis.

If you want to draw a random sample from a data set like this, you shouldn’t apply the command -sample- directly. -sample- draws individual observations (firm-years), so you are very likely to lose the panel structure of the data: some firms will keep only a subset of their years. What you should do instead is randomly select firm ids and then keep all the observations (all years) for each selected firm. Below is an example of STATA code that performs this operation. Remember we have three variables: id, year, performance.

use "yourdataset.dta", clear

tempfile paneldata
save `paneldata'

collapse (mean) performance, by(id)
keep id
sample 50

tempfile randomsampleid
save `randomsampleid'

use `paneldata', clear

merge m:1 id using `randomsampleid'

drop if _merge == 1
drop _merge

After opening the data set, we save a copy of it in a temporary file called paneldata (lines 3-4). Then we use -collapse- to reduce the data to one observation per id, and we keep only the id variable (lines 6-7). In line 8 we use the command -sample- so that STATA randomly selects, in this case, 50% of the unique ids (see -help sample- for other options, such as specifying the number of observations to draw instead of a percentage). In lines 10-11 we save this subset of ids in a temporary file called randomsampleid.

Finally, we return to the panel data (line 13) and merge it with randomsampleid (line 15). It is an m:1 merge because the id variable does not uniquely identify observations in the panel data, but it does in the using data. The observations that merge successfully (_merge == 3) belong to the firms that STATA randomly chose for you, so we drop the rest (_merge == 1) in line 17.
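
A slightly more compact variant of the same idea can be sketched as follows. This is only a sketch under a few assumptions: the seed value is arbitrary, -preserve-/-restore- stands in for the paneldata temporary file, -duplicates drop- stands in for -collapse-, and the -keep(match)- and -nogenerate- options of -merge- stand in for the two -drop- lines. The final -isid- check is an addition; it assumes the original data had one row per firm-year and simply confirms that this is still the case.

```stata
* Sketch only: same dataset and variables (id, year, performance) as above
use "yourdataset.dta", clear
set seed 12345                 // arbitrary seed, so the draw is reproducible

preserve                       // keep a copy of the full panel
keep id
duplicates drop                // one observation per firm id
sample 50                      // 50% of the ids; or e.g. -sample 100, count-
tempfile randomsampleid
save `randomsampleid'
restore                        // back to the full panel

* keep only firm-years whose id was drawn, and do not create _merge
merge m:1 id using `randomsampleid', keep(match) nogenerate

isid id year                   // errors out if id-year is not unique
```

Setting the seed only matters if you want to get the same firms every time you run the script.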

Biking around Easter Island

Biking around Easter Island (Rapa Nui) is a great idea. If you like to ride a bike, it is maybe the second-best idea after the one you had when you decided to visit Easter Island in the first place.

I biked around the island during winter, which seems to be a great season for biking if you are lucky enough to avoid rain (I did not see any rain at all in 5 days). I have not been to the island in summer, but I suspect that biking around it then is not such a good idea (around 30°C and a lot of humidity).

In total, I biked around 120 km (75 mi) in 4 days. I was on the road (biking/walking/taking pictures) around 3-5 hours per day depending on the route I chose and the places I visited.

LAN Airlines’ Boeing 767 at Easter Island’s Airport

How To / STATA: Calculate Variables for Groups of Observations

In management research, we often need to create a variable that measures the experience of firms. Firms accumulate experience as they make acquisitions or invest in companies in certain countries. Sometimes this experience has an effect on future decisions, so we calculate variables that measure the number of times a firm has made an acquisition or has invested in a certain industry or country. In STATA, this can be done using the commands -bysort- and -gen- (i.e. -generate-) or -egen-. In this post I will calculate an experience variable using a fictitious dataset.
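
As a rough illustration of the idea, here is a minimal sketch. The data and the variable names (firm, year, country, experience, experience_country, total_deals) are all made up for this example, and each row is assumed to be one acquisition. Experience is counted as the number of earlier deals by the same firm; if a firm had two deals in the same year, their order within that year would be arbitrary.

```stata
* Hypothetical data: one row per acquisition
clear
input firm year str2 country
1 2001 "US"
1 2003 "US"
1 2004 "CL"
2 2002 "CL"
2 2005 "CL"
end

* Number of deals the firm had made before the current one
bysort firm (year): gen experience = _n - 1

* Number of earlier deals by the same firm in the same country
bysort firm country (year): gen experience_country = _n - 1

* Total number of deals per firm (constant within firm)
egen total_deals = count(year), by(firm)

sort firm year
list, sepby(firm)
```

-bysort firm (year)- groups the observations by firm and sorts them by year within each firm, so _n is the position of each deal in the firm's history.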

How To / STATA: Check if a File Exists Before Opening It

STATA’s command -capture- lets you check whether the file you are trying to open exists. -capture- runs a command, suppresses any error it produces, and stores the return code in _rc: zero if the command succeeded (the file was found) and nonzero otherwise. We can therefore use the value of _rc to decide how our STATA script should continue.
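
A minimal sketch of this pattern is shown below. It assumes the check is done with -confirm file- (one common way to test for a file; the full post may do it differently) and uses the placeholder file name yourdataset.dta.

```stata
* Sketch only: "yourdataset.dta" is a placeholder file name
capture confirm file "yourdataset.dta"

if _rc == 0 {
    * the file exists, so it is safe to open it
    use "yourdataset.dta", clear
}
else {
    * the file was not found; report it instead of stopping with an error
    display as error "yourdataset.dta not found (return code " _rc ")"
}
```

Because -capture- suppresses the error, the script keeps running either way, and the branch taken depends only on _rc.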

How To: Extract Highlighted Text from a PDF File

Some people enjoy reading on paper not only because they can make annotations and highlight text easily, but also because they actually like their handwriting. If you are not one of them, then the following guide may help you. I will show how to extract all the highlighted text and the annotations from a PDF using Acrobat Professional. I did extensive research (i.e. I tried many different keywords in Google!) before understanding how to extract the annotations from a PDF file, and I did not find many useful articles on the internet. It turned out to be an easier process than what most of the sites I visited described, so I hope that Google ranks this page well!
