https://www.ssc.wisc.edu/sscc/pubs/stata_tables.htm
http://repec.sowi.unibe.ch/stata/estout/esttab.html
https://stats.idre.ucla.edu/stata/faq/can-i-make-regression-tables-that-look-like-those-in-journal-articles/
sysuse auto, clear
search estout
eststo: regress price weight mpg
esttab
eststo: regress price weight mpg foreign
esttab
eststo clear
sysuse auto, clear
regress price weight mpg
estimates store model1
regress price weight mpg i.foreign
estimates store model2
esttab model1 model2
**Standard errors, p-values, and summary statistics
eststo: regress price weight mpg
eststo: regress price weight mpg foreign
esttab, se ar2
esttab, p scalars(F df_m df_r)
****Standardized coefficients****
eststo: regress price weight mpg
eststo: regress price weight mpg foreign
esttab, beta not
****Compressed table**************
sysuse auto, clear
eststo: regress price weight
eststo: regress price weight mpg
eststo: regress price weight mpg foreign
eststo: regress price weight mpg foreign displacement
esttab, compress // compare the models
**Use with Excel***
sysuse auto, clear
eststo: regress price weight mpg
eststo: regress price weight mpg foreign
esttab using example.csv
esttab using example.csv, replace wide plain
eststo clear
esttab using example.rtf
sysuse auto, clear
regress price weight mpg
estimates store model1
regress price weight mpg i.foreign
estimates store model2
esttab model1 model2
esttab using example.rtf
esttab using example1.rtf
****Non-standard contents********
sysuse auto, clear
regress price weight mpg foreign
estadd vif
esttab, aux(vif 2) wide nopar
**https://stats.idre.ucla.edu/stata/faq/how-can-i-use-estout-to-make-regression-tables-that-look-like-those-in-journal-articles/***
**********HOW CAN I USE -ESTOUT- TO MAKE REGRESSION TABLES THAT LOOK LIKE THOSE IN JOURNAL ARTICLES? | STATA FAQ*********
use https://stats.idre.ucla.edu/stat/stata/notes/hsb2, clear
regress read female write
estimates store m1, title(Model 1)
regress read female write math
estimates store m2, title(Model 2)
regress read female write math science socst
estimates store m3, title(Model 3)
estout m1 m2 m3
estout m1 m2 m3, cells(b(star fmt(3)) se(par fmt(2)))
estout m1 m2 m3, cells(b(star fmt(3)) se(par fmt(2))) ///
legend label varlabels(_cons Constant)
estout m1 m2 m3, cells(b(star fmt(3)) se(par fmt(2))) ///
legend label varlabels(_cons constant) ///
stats(r2 df_r bic)
esttab using example2.rtf
esttab m1 m2 m3
esttab using example2.rtf
esttab using example3.rtf
esttab model1 model2
esttab m1 m2 m3
esttab using example3.rtf
Basic syntax and usage
esttab is a wrapper for estout. Its syntax is much simpler than that of estout and, by default, it produces publication-style tables that display nicely in Stata's results window. The basic syntax of esttab is:
esttab [ namelist ] [ using filename ] [, options estout_options ]
The procedure is to first store a number of models and then apply esttab to these stored estimation sets to compose a regression table. The main difference between esttab and estout is that esttab produces a fully formatted table right away. Example:
. sysuse auto, clear
(1978 Automobile Data)

. eststo: regress price weight mpg

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     14.74
       Model |   186321280         2  93160639.9   Prob > F        =    0.0000
    Residual |   448744116        71  6320339.67   R-squared       =    0.2934
-------------+----------------------------------   Adj R-squared   =    0.2735
       Total |   635065396        73  8699525.97   Root MSE        =      2514

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   1.746559   .6413538     2.72   0.008      .467736    3.025382
         mpg |  -49.51222   86.15604    -0.57   0.567    -221.3025     122.278
       _cons |   1946.069    3597.05     0.54   0.590    -5226.245    9118.382
------------------------------------------------------------------------------
(est1 stored)

. eststo: regress price weight mpg foreign

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(3, 70)        =     23.29
       Model |   317252881         3   105750960   Prob > F        =    0.0000
    Residual |   317812515        70  4540178.78   R-squared       =    0.4996
-------------+----------------------------------   Adj R-squared   =    0.4781
       Total |   635065396        73  8699525.97   Root MSE        =    2130.8

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   3.464706    .630749     5.49   0.000     2.206717    4.722695
         mpg |    21.8536   74.22114     0.29   0.769    -126.1758     169.883
     foreign |    3673.06   683.9783     5.37   0.000     2308.909    5037.212
       _cons |  -5853.696   3376.987    -1.73   0.087    -12588.88    881.4934
------------------------------------------------------------------------------
(est2 stored)

. esttab

--------------------------------------------
                      (1)             (2)
                    price           price
--------------------------------------------
weight              1.747**         3.465***
                   (2.72)          (5.49)

mpg                -49.51           21.85
                  (-0.57)          (0.29)

foreign                            3673.1***
                                   (5.37)

_cons              1946.1         -5853.7
                   (0.54)         (-1.73)
--------------------------------------------
N                      74              74
--------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

. eststo clear
The eststo command is used in this example to store the regression models. Models stored by eststo are automatically picked up by esttab (the command eststo clear on the last line removes the models from memory). An alternative would be to use Stata's official estimates store as in the following example:
. sysuse auto, clear
(1978 Automobile Data)

. regress price weight mpg
(output omitted)

. estimates store model1

. regress price weight mpg foreign
(output omitted)

. estimates store model2

. esttab model1 model2

--------------------------------------------
                      (1)             (2)
                    price           price
--------------------------------------------
weight              1.747**         3.465***
                   (2.72)          (5.49)

mpg                -49.51           21.85
                  (-0.57)          (0.29)

foreign                            3673.1***
                                   (5.37)

_cons              1946.1         -5853.7
                   (0.54)         (-1.73)
--------------------------------------------
N                      74              74
--------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

. estimates clear
Standard errors, p-values, and summary statistics
The default of esttab is to display raw point estimates along with t-statistics and to print the number of observations in the table footer. To replace the t-statistics by, e.g., standard errors and add the adjusted R-squared type:
. sysuse auto, clear
(1978 Automobile Data)

. eststo: regress price weight mpg
(output omitted)

. eststo: regress price weight mpg foreign
(output omitted)

. esttab, se ar2

--------------------------------------------
                      (1)             (2)
                    price           price
--------------------------------------------
weight              1.747**         3.465***
                  (0.641)         (0.631)

mpg                -49.51           21.85
                  (86.16)         (74.22)

foreign                            3673.1***
                                  (684.0)

_cons              1946.1         -5853.7
                 (3597.0)        (3377.0)
--------------------------------------------
N                      74              74
adj. R-sq           0.273           0.478
--------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
The t-statistics can also be replaced by p-values (option p), confidence intervals (option ci), or any parameter statistics contained in the estimates (see the aux() option). Further summary statistics options are, for example, pr2 for the pseudo R-squared and bic for Schwarz's information criterion. Moreover, there is a generic scalars() option to include any other scalar statistics contained in the stored estimates. For instance, to print p-values and add the overall F-statistic and information on the degrees of freedom, type:
. esttab, p scalars(F df_m df_r)

--------------------------------------------
                      (1)             (2)
                    price           price
--------------------------------------------
weight              1.747**         3.465***
                  (0.008)         (0.000)

mpg                -49.51           21.85
                  (0.567)         (0.769)

foreign                            3673.1***
                                  (0.000)

_cons              1946.1         -5853.7
                  (0.590)         (0.087)
--------------------------------------------
N                      74              74
F                   14.74           23.29
df_m                    2               3
df_r                   71              70
--------------------------------------------
p-values in parentheses
* p<0.05, ** p<0.01, *** p<0.001

. eststo clear
Standardized coefficients
To display standardized coefficients and suppress the t-statistics type:
. sysuse auto, clear
(1978 Automobile Data)

. eststo: regress price weight mpg
(output omitted)

. eststo: regress price weight mpg foreign
(output omitted)

. esttab, beta not

--------------------------------------------
                      (1)             (2)
                    price           price
--------------------------------------------
weight              0.460**         0.913***
mpg                -0.097           0.043
foreign                             0.573***
--------------------------------------------
N                      74              74
--------------------------------------------
Standardized beta coefficients
* p<0.05, ** p<0.01, *** p<0.001

. eststo clear
Coeffs and t-stats side-by-side
The wide option arranges point estimates and t-statistics beside one another instead of beneath one another:
. sysuse auto, clear
(1978 Automobile Data)

. eststo: regress price weight mpg
(output omitted)

. eststo: regress price weight mpg foreign
(output omitted)

. esttab, wide

----------------------------------------------------------------------
                      (1)                          (2)
                    price                        price
----------------------------------------------------------------------
weight              1.747**        (2.72)        3.465***       (5.49)
mpg                -49.51         (-0.57)        21.85          (0.29)
foreign                                         3673.1***       (5.37)
_cons              1946.1          (0.54)      -5853.7         (-1.73)
----------------------------------------------------------------------
N                      74                           74
----------------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

. eststo clear
Numerical formats
esttab has sensible default settings for numerical display formats. For example, t-statistics are printed using two decimal places and R-squared measures are printed using three decimal places. For point estimates and, for example, standard errors an adaptive display format is used where the number of displayed decimal places depends on the scale of the statistic to be printed (the default format is a3; see below).
The format applied to a certain statistic can be changed by adding the appropriate display format specification in parentheses. For example, to increase precision for the point estimates and display p-values and the R-squared using four decimal places, type:
. sysuse auto, clear
(1978 Automobile Data)

. eststo: regress price weight mpg
(output omitted)

. eststo: regress price weight mpg foreign
(output omitted)

. esttab, b(a6) p(4) r2(4) nostar wide

----------------------------------------------------------------
                      (1)                       (2)
                    price                     price
----------------------------------------------------------------
weight           1.746559     (0.0081)     3.464706     (0.0000)
mpg             -49.51222     (0.5673)     21.85360     (0.7693)
foreign                                    3673.060     (0.0000)
_cons            1946.069     (0.5902)    -5853.696     (0.0874)
----------------------------------------------------------------
N                      74                        74
R-sq               0.2934                    0.4996
----------------------------------------------------------------
p-values in parentheses

. eststo clear
Available formats are official Stata's display formats, such as %9.0g or %8.2f (see help format). Alternatively, as is illustrated in the example above, a fixed format can be requested by specifying a single integer indicating the desired number of decimal places. Furthermore, an adaptive format a# may be specified, where # determines the minimum number of "significant digits" to be printed (# should be an integer between 1 and 9) (see the Numerical formats section in the help file).
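As a minimal sketch of the format choices just described, the following combines an official Stata display format for the point estimates with fixed decimal places for the other statistics. The specific formats are picked only for illustration.

```stata
* Sketch: one official display format, two fixed-decimal formats
sysuse auto, clear
eststo clear
eststo: regress price weight mpg
eststo: regress price weight mpg foreign

* b(%9.3f): official display format for the coefficients
* se(2):    standard errors with two decimal places
* ar2(4):   adjusted R-squared with four decimal places
esttab, b(%9.3f) se(2) ar2(4)

eststo clear
```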
Labels, titles, and notes
To use variable labels and add some titles and notes, e.g., type:
. sysuse auto, clear
(1978 Automobile Data)

. eststo: regress price weight mpg
(output omitted)

. eststo: regress price weight mpg foreign
(output omitted)

. esttab, label                               ///
>     title(This is a regression table)       ///
>     nonumbers mtitles("Model A" "Model B")  ///
>     addnote("Source: auto.dta")

This is a regression table
----------------------------------------------------
                          Model A         Model B
----------------------------------------------------
Weight (lbs.)               1.747**         3.465***
                           (2.72)          (5.49)

Mileage (mpg)              -49.51           21.85
                          (-0.57)          (0.29)

Car type                                   3673.1***
                                           (5.37)

Constant                   1946.1         -5853.7
                           (0.54)         (-1.73)
----------------------------------------------------
Observations                   74              74
----------------------------------------------------
t statistics in parentheses
Source: auto.dta
* p<0.05, ** p<0.01, *** p<0.001

. eststo clear
The label option supports factor variables and interactions in Stata 11 or newer:
. sysuse auto, clear
(1978 Automobile Data)

. eststo: regress price mpg i.foreign
(output omitted)

. eststo: regress price c.mpg##i.foreign
(output omitted)

. esttab, varwidth(25)

---------------------------------------------------------
                                   (1)             (2)
                                 price           price
---------------------------------------------------------
mpg                             -294.2***       -329.3***
                               (-5.28)         (-4.39)

0.foreign                            0               0
                                   (.)             (.)

1.foreign                       1767.3*         -13.59
                                (2.52)         (-0.01)

0.foreign#c.mpg                                      0
                                                   (.)

1.foreign#c.mpg                                  78.89
                                                (0.70)

_cons                          11905.4***      12600.5***
                               (10.28)          (8.25)
---------------------------------------------------------
N                                   74              74
---------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

. esttab, varwidth(25) label

---------------------------------------------------------
                                   (1)             (2)
                                 Price           Price
---------------------------------------------------------
Mileage (mpg)                   -294.2***       -329.3***
                               (-5.28)         (-4.39)

Domestic                             0               0
                                   (.)             (.)

Foreign                         1767.3*         -13.59
                                (2.52)         (-0.01)

Domestic # Mileage (mpg)                             0
                                                   (.)

Foreign # Mileage (mpg)                          78.89
                                                (0.70)

Constant                       11905.4***      12600.5***
                               (10.28)          (8.25)
---------------------------------------------------------
Observations                        74              74
---------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

. esttab, varwidth(25) label nobaselevels interaction(" X ")

---------------------------------------------------------
                                   (1)             (2)
                                 Price           Price
---------------------------------------------------------
Mileage (mpg)                   -294.2***       -329.3***
                               (-5.28)         (-4.39)

Foreign                         1767.3*         -13.59
                                (2.52)         (-0.01)

Foreign X Mileage (mpg)                          78.89
                                                (0.70)

Constant                       11905.4***      12600.5***
                               (10.28)          (8.25)
---------------------------------------------------------
Observations                        74              74
---------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

. eststo clear
Plain table
The plain option produces a minimally formatted table with all display formats set to %9.0g:
. sysuse auto, clear
(1978 Automobile Data)

. eststo: regress price weight mpg
(output omitted)

. eststo: regress price weight mpg foreign
(output omitted)

. esttab, plain

                     est1         est2
                      b/t          b/t
weight           1.746559     3.464706
                 2.723238     5.493003
mpg             -49.51222      21.8536
                -.5746808     .2944391
foreign                        3673.06
                              5.370142
_cons            1946.069    -5853.696
                  .541018    -1.733408
N                      74           74

. eststo clear
Compressed table
The compress option reduces horizontal spacing to fit more models on screen without line breaking:
. sysuse auto, clear
(1978 Automobile Data)

. eststo: regress price weight
(output omitted)

. eststo: regress price weight mpg
(output omitted)

. eststo: regress price weight mpg foreign
(output omitted)

. eststo: regress price weight mpg foreign displacement
(output omitted)

. esttab, compress

--------------------------------------------------------------
                 (1)          (2)          (3)          (4)
               price        price        price        price
--------------------------------------------------------------
weight         2.044***     1.747**      3.465***     2.458**
              (5.42)       (2.72)       (5.49)       (2.82)

mpg                        -49.51        21.85        19.08
                          (-0.57)       (0.29)       (0.26)

foreign                                 3673.1***    3930.2***
                                        (5.37)       (5.67)

displace~t                                            10.22
                                                     (1.65)

_cons         -6.707       1946.1      -5853.7      -4846.8
             (-0.01)       (0.54)      (-1.73)      (-1.43)
--------------------------------------------------------------
N                 74           74           74           74
--------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

. eststo clear
Significance stars
The default symbols and thresholds for the "significance stars" are: * for p<.05, ** for p<.01, and *** for p<.001. To use + for p<.10 and * for p<.05, for example, type:
. sysuse auto, clear
(1978 Automobile Data)

. eststo: regress price weight mpg
(output omitted)

. eststo: regress price weight mpg foreign
(output omitted)

. esttab, star(+ 0.10 * 0.05)

----------------------------------------
                      (1)           (2)
                    price         price
----------------------------------------
weight              1.747*        3.465*
                   (2.72)        (5.49)

mpg                -49.51         21.85
                  (-0.57)        (0.29)

foreign                          3673.1*
                                 (5.37)

_cons              1946.1       -5853.7+
                   (0.54)       (-1.73)
----------------------------------------
N                      74            74
----------------------------------------
t statistics in parentheses
+ p<0.10, * p<0.05

. eststo clear
The nostar option suppresses the significance stars.
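The star() syntax takes pairs of a symbol and a threshold, so other conventions follow the same pattern. A minimal sketch, with thresholds chosen only for illustration, followed by the same table without any symbols:

```stata
* Sketch: three custom significance levels, then no stars at all
sysuse auto, clear
eststo clear
eststo: regress price weight mpg
eststo: regress price weight mpg foreign

* * for p<.10, ** for p<.05, *** for p<.01
esttab, star(* 0.10 ** 0.05 *** 0.01)

* Same models, significance symbols suppressed
esttab, nostar

eststo clear
```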
Use with Excel
To produce a table for use with Excel, specify an output filename and apply the csv format (or the scsv format depending on the language version of Excel). For example:
. sysuse auto, clear
(1978 Automobile Data)

. eststo: regress price weight mpg
(output omitted)

. eststo: regress price weight mpg foreign
(output omitted)

. esttab using example.csv
(output written to example.csv)

A click on "example.csv" in Stata's results window will launch Excel and display the file.
Depending on whether the plain option is specified or not, esttab uses two different variants of the CSV format. By default, that is, if plain is omitted, the contents of the table cells are enclosed in double quotes preceded by an equal sign (i.e. ="..."). This prevents Excel from trying to interpret the contents of the cells and, therefore, preserves formatting elements such as parentheses around t-statistics. One drawback of this approach is, however, that the displayed numbers cannot directly be used for further calculations in Excel. Hence, if the purpose of exporting the estimates is to do additional computations in Excel, specify the plain option. In this case, the table cells are enclosed in double quotes without the equal sign, and Excel will interpret the contents as numbers. Example:
. esttab using example.csv, replace wide plain
(output written to example.csv)

. eststo clear
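For Excel versions that expect a semicolon as the field separator, the scsv format mentioned above can be requested explicitly. A minimal sketch; the filename example_scsv.csv is a hypothetical name used only for illustration:

```stata
* Sketch: semicolon-separated output for Excel
sysuse auto, clear
eststo clear
eststo: regress price weight mpg
eststo: regress price weight mpg foreign

* example_scsv.csv is a hypothetical filename
esttab using example_scsv.csv, replace scsv

eststo clear
```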
Use with Word
To produce a table for use with Word, specify an output filename with an .rtf suffix or apply the rtf option:
. sysuse auto, clear
(1978 Automobile Data)

. eststo: regress price weight mpg
(output omitted)

. eststo: regress price weight mpg foreign
(output omitted)

. esttab using example.rtf
(output written to example.rtf)
Appending is possible. Furthermore, varwidth() and modelwidth() may be used to change the column widths (the scale is about 1/12 inch). Example:
. esttab using example.rtf, append wide label modelwidth(8)
(output written to example.rtf)
Another very useful feature is the onecell option that causes the point estimates and t-statistics (or standard errors, etc.) to be placed beneath one another in the same table cell:
. lab var mpg "The mgp variable has a really long label and that would disturb the table"

. esttab using example.rtf, replace label nogap onecell
(output written to example.rtf)
If you know a bit of RTF you can also include RTF commands to achieve specific effects, although you have to be careful not to break the document (most importantly, do not introduce unmatched curly braces). Useful are, for example, {\b ...} for boldface and {\i ...} for italics. A very helpful reference is the RTF Pocket Guide by Burke (2003). Example:
. esttab using example.rtf, replace nogaps  ///
>     title({\b Table 1.} {\i This is the 1{\super st} table})
(output written to example.rtf)

. eststo clear
Use with LaTeX
To create a table to be included in a LaTeX document, type:
. sysuse auto, clear
(1978 Automobile Data)

. eststo: regress price weight mpg
(output omitted)

. eststo: regress price weight mpg foreign
(output omitted)

. esttab using example.tex, label nostar  ///
>     title(Regression table\label{tab1})
(output written to example.tex)
Compiling a document containing
\documentclass{article}
\begin{document}
\input{example.tex}
\end{document}
then produces the typeset regression table.
Note that esttab automatically initializes the tabular environment and, if title() is specified, sets the table as a float object. Use the fragment option if you prefer to hard-code the table's environment and have esttab just produce the table rows.
The default table looks alright, but a better result is achieved by specifying the booktabs option and loading LaTeX's booktabs package in the document preamble:
. esttab using example.tex, label nostar replace booktabs ///
> title(Regression table\label{tab1})
(output written to example.tex)
A further improvement is to load LaTeX's dcolumn package and format the columns using the D column specifier:
. esttab using example.tex, label replace booktabs ///
> alignment(D{.}{.}{-1}) ///
> title(Regression table\label{tab1})
(output written to example.tex)
Last but not least, you can space the table out to a certain width:
. esttab using example.tex, label replace booktabs  ///
>     alignment(D{.}{.}{-1}) width(0.8\hsize)        ///
>     title(Regression table\label{tab1})
(output written to example.tex)

. eststo clear
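The fragment option mentioned earlier in this section can be combined with the same models. A minimal sketch, assuming the surrounding tabular environment is then written by hand in the LaTeX document; the filename example_rows.tex is a hypothetical name used only for illustration:

```stata
* Sketch: write only the table rows, without esttab's table environment
sysuse auto, clear
eststo clear
eststo: regress price weight mpg
eststo: regress price weight mpg foreign

* example_rows.tex is a hypothetical filename; fragment suppresses the
* table's opening and closing specifications
esttab using example_rows.tex, replace fragment label nostar

eststo clear
```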
Non-standard contents
Sometimes it is necessary to include parameter statistics in a table for which no predefined option exists in esttab. Once the statistics are stored in an e()-matrix, they can be displayed using the main() option (replacing the point estimates) or the aux() option (replacing the t-statistics). For example, to include variance inflation factors instead of t-statistics after regress, you could type:
. sysuse auto, clear
(1978 Automobile Data)

. regress price weight mpg foreign
(output omitted)

. estadd vif

    Variable |       VIF       1/VIF
-------------+----------------------
      weight |      3.86    0.258809
         mpg |      2.96    0.337297
     foreign |      1.59    0.627761
-------------+----------------------
    Mean VIF |      2.81

added matrix:
                e(vif) :  1 x 4

. esttab, aux(vif 2) wide nopar

-----------------------------------------
                      (1)
                    price
-----------------------------------------
weight              3.465***         3.86
mpg                 21.85            2.96
foreign            3673.1***         1.59
_cons             -5853.7
-----------------------------------------
N                      74
-----------------------------------------
vif in second column
* p<0.05, ** p<0.01, *** p<0.001
The second argument in aux() specifies the display format.
However, if you want to include more than two kinds of parameter statistics, you have to switch to estout syntax and make use of the cells() option. All estout options are allowed in esttab, but you have to be aware that the specified estout options will take precedence over esttab's own options. For example, specifying cells() disables b(), beta(), main(), t(), abs, not, se(), p(), ci(), aux(), star, staraux, wide, onecell, parentheses, and brackets. In the following example the cells() option is used to print point estimates, t-statistics, and variance inflation factors in one table:
. sysuse auto, clear
(1978 Automobile Data)

. regress price weight mpg foreign
(output omitted)

. estadd vif

    Variable |       VIF       1/VIF
-------------+----------------------
      weight |      3.86    0.258809
         mpg |      2.96    0.337297
     foreign |      1.59    0.627761
-------------+----------------------
    Mean VIF |      2.81

added matrix:
                e(vif) :  1 x 4

. esttab, cells("b(fmt(a3) star) vif(fmt(2))" t(par fmt(2)))

-----------------------------------------
                      (1)
                    price
                      b/t             vif
-----------------------------------------
weight              3.465***         3.86
                   (5.49)
mpg                 21.85            2.96
                   (0.29)
foreign            3673.1***         1.59
                   (5.37)
_cons             -5853.7
                  (-1.73)
-----------------------------------------
N                      74
-----------------------------------------
Similarly, for a complicated summary statistics section in the table footer you might have to use estout's stats() option (which overwrites esttab options such as r2(), ar2(), pr2(), aic(), bic(), scalars(), sfmt(), noobs, and obslast).
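A minimal sketch of such a footer, passing estout's stats() option through esttab. The selection of scalars, the formats, and the labels are chosen only for illustration; e(r2_a) and e(F) are scalars that regress stores.

```stata
* Sketch: custom footer via estout's stats() option
sysuse auto, clear
eststo clear
eststo: regress price weight mpg
eststo: regress price weight mpg foreign

* stats() replaces esttab's own footer options (r2, ar2, scalars(), ...)
esttab, se stats(N r2 r2_a F, fmt(0 3 3 2) ///
    labels("Observations" "R-squared" "Adj. R-squared" "F statistic"))

eststo clear
```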
Viewing the internal estout call
Sometimes a useful approach is to use esttab to assemble a basic table and then hand-edit and re-run the estout call. The call can be made visible with the noisily option and is also returned in r(cmdline). Example:
. sysuse auto, clear
(1978 Automobile Data)

. eststo: regress price weight mpg
(output omitted)

. eststo: regress price weight mpg foreign
(output omitted)

. esttab, noisily notype
estout , cells(b(fmt(a3) star) t(fmt(2) par("{ralign @modelwidth:{txt:(}" "{txt:)}}"))) stats(N, fmt(%18.0g) labels(`"N"')) starlevels(* 0.05 ** 0.01 *** 0.001) varwidth(12) modelwidth(12) abbrev delimiter(" ") smcltags prehead(`"{hline @width}"') posthead("{hline @width}") prefoot("{hline @width}") postfoot(`"{hline @width}"' `"t statistics in parentheses"' `"@starlegend"') varlabels(, end("" "") nolast) mlabels(, depvar) numbers collabels(none) eqlabels(, begin("{hline @width}" "") nofirst) interaction(" # ") notype level(95) style(esttab)

. return list

scalars:
            r(nmodels) =  2
              r(ccols) =  3

macros:
              r(names) : "est1 est2"
         r(m2_depname) : "price"
         r(m1_depname) : "price"
            r(cmdline) : "estout , cells(b(fmt(a3) star) t(fmt(2) par("{ralign @mod.."

matrices:
              r(coefs) :  4 x 6
              r(stats) :  1 x 2

. eststo clear
notype is specified in this example to suppress the display of the table.
-----------------------
Creating Publication-Quality Tables in Stata
Stata's tables are, in general, clear and informative. However, they are not in the format or of the aesthetic quality normally used in publications. Several Stata users have written programs that create publication-quality tables. This article will discuss esttab (think "estimates table") by Ben Jann. The esttab command takes the results of previous estimation or other commands, puts them in a publication-quality table, and then saves that table in a format you can use directly in your paper such as RTF or LaTeX. Major topics for this article include creating tables of regression results, tables of summary statistics, and frequency tables.
The estout Package
The esttab command is just one member of a family of commands, or package, called estout. In fact, esttab is just a "wrapper" for a command called estout. The estout command gives you full control over the table to be created, but flexibility requires complexity and estout is fairly difficult to use. The esttab command runs estout for you and handles many of the details estout requires, allowing you to create the most common tables relatively easily. We will also discuss estpost, which puts results like summary statistics in a form esttab can work with. The ability to handle summary statistics and frequencies in addition to regression results is one of the reasons we elected to focus this article on esttab.
On the Workflow of Creating Tables
Keep in mind that you always have an alternative to using esttab: simply create the tables you want in Word or your favorite word processing program, copying and pasting the needed numbers from your Stata output. This is time-consuming and tedious. On the other hand, trying to figure out how to get esttab to give you the table you want can be time-consuming as well, and there's no guarantee it can make exactly the table you want. Be sure to consider the possibility that creating a particular table by hand may be quicker than using esttab. Much depends on how many tables you need to create, and how many numbers they contain. If you can get esttab to give you something close to what you want but are spending a lot of time trying to figure out how to get exactly what you want, consider just editing what you have.
Most people will find it's easier to first obtain a set of (hopefully) final results and then work on how to present them. We would not recommend running esttab until you are reasonably confident you've arrived at the results you want to publish.
Installing esttab
Since the estout package is not part of official Stata, you must install it before using it. It is available from the Statistical Software Components (SSC) archive and can be installed using the ssc install command in Stata:
ssc install estout
You only need to do this once—do not put this command in your research do files.
Check for updates periodically using adoupdate.
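A minimal sketch of how checking the installation and looking for updates might go. These commands are meant to be typed interactively, in keeping with the advice above to leave installation out of research do files.

```stata
* Report where esttab is installed and which version it is
* (this gives an error if the package is not installed)
which esttab

* Check whether a newer version of the estout package is available
adoupdate estout

* Install the newer version, if one was reported
adoupdate estout, update
```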
Basics
The esttab command needs some results to act on, so load the auto data set that comes with Stata and run a basic regression:
sysuse auto
reg mpg weight foreign
You can see the basic function of esttab simply by running it without any options at all:
esttab
----------------------------
                      (1)
                      mpg
----------------------------
weight           -0.00659***
                 (-10.34)

foreign            -1.650
                  (-1.53)

_cons               41.68***
                  (19.25)
----------------------------
N                      74
----------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
This puts the model results in a table within Stata's Results window. Viewing it in the Results window is useful for testing a table specification, but when you've got what you want you'll have esttab save it in the file format you're using for your paper. The default table contains many of the features you expect from a table of regression results in a journal article, including rounded coefficients and stars for significance. Note, however, that the numbers in parentheses are the t-statistics. Use the se option if you want to replace them with standard errors:
esttab, se
----------------------------
                      (1)
                      mpg
----------------------------
weight           -0.00659***
               (0.000637)

foreign            -1.650
                  (1.076)

_cons               41.68***
                  (2.166)
----------------------------
N                      74
----------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
The esttab command uses the current contents of the e() vector (information about the last estimation command), not the results the last regression displayed. If you run a logit command with the or option Stata will display odds ratios:
logit foreign mpg, or
Logistic regression                               Number of obs   =         74
                                                  LR chi2(1)      =      11.49
                                                  Prob > chi2     =     0.0007
Log likelihood =  -39.28864                       Pseudo R2       =     0.1276

------------------------------------------------------------------------------
     foreign | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |   1.173232   .0616975     3.04   0.002      1.05833    1.300608
       _cons |   .0125396   .0151891    -3.62   0.000     .0011674    .1346911
------------------------------------------------------------------------------
However, e(b) still contains the coefficients, and by default that is what esttab will display. It also labels the test statistics as t statistics rather than z statistics like the logit output does:
esttab
----------------------------
                      (1)
                  foreign
----------------------------
foreign
mpg                 0.160**
                   (3.04)

_cons              -4.379***
                  (-3.62)
----------------------------
N                      74
----------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
If you want odds ratios in your table, give esttab the eform (exponentiated form) option. If you want the table to say "z statistics in parentheses" rather than t use the z option (note that the z option does not change the numbers in any way):
esttab, eform z
----------------------------
                      (1)
                  foreign
----------------------------
foreign
mpg                 1.173**
                   (3.04)
----------------------------
N                      74
----------------------------
Exponentiated coefficients; z statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
Specifying the eform option prompts esttab to drop the constant term from the table, because it doesn't make much sense to talk about the odds ratio of the constant. However, you can override this behavior by specifying the constant option.
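A minimal sketch of that override, using the same logit model as above:

```stata
* Sketch: odds ratios with the constant kept in the table
sysuse auto, clear
logit foreign mpg, or

* eform exponentiates the coefficients, z relabels the test statistics,
* constant keeps the (exponentiated) constant term in the table
esttab, eform z constant
```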
Saving the Table in the Format of Your Paper
To save a table as an RTF (Rich Text Format) file, add using filename.rtf to the command, right before the comma for options. Also add the replace option so it can overwrite previous versions of the file.
esttab using logit.rtf, replace eform z
Rich Text Format includes formatting information as well as the text itself, and can be opened directly by Word and other word processors.
The process of saving the table as a LaTeX file is identical: just replace .rtf with .tex. There are some special options that apply to LaTeX, such as fragment to create a table fragment that can be added to an existing table. HTML (.html) is another useful format option, and there are many others.
You can save the table as a comma-separated values (CSV) file that can easily be read into Excel by setting the file extension to .csv. However, consider carefully whether what you contemplate doing in Excel can't be done better (and especially more reproducibly) within Stata.
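The following sketch writes the same logit table in three of the formats mentioned above. The file names are placeholders chosen for illustration, and the example assumes estout is installed.

```stata
* Assumes estout is installed: ssc install estout
sysuse auto, clear
logit foreign mpg

* The file extension selects the output format; file names are illustrative
esttab using logit.tex, replace eform z
esttab using logit.html, replace eform z
esttab using logit.csv, replace eform z
```

Each command overwrites any existing file of the same name because of the replace option.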
Tables with Multiple Models
To create a table containing the estimates from multiple models, the first step is to run each model and store their estimates for future use. You can store the estimates either with the official Stata command estimates store, usually abbreviated est sto, or with the variant eststo included in the estout package. The eststo variant adds a few features, but we won't use any of them in this article so it doesn't matter which command you use. The basic syntax is identical: the command, then the name you want to assign to that set of estimates. Use this to build a set of nested models:
reg mpg foreign
est sto m1
reg mpg foreign weight
est sto m2
reg mpg foreign weight displacement gear_ratio
est sto m3
To have esttab create a table based on a single set of stored estimates, simply specify the name of the estimates you want it to use:
esttab m1
But you are not limited to one set:
esttab m1 m2 m3
------------------------------------------------------------
                      (1)             (2)             (3)
                      mpg             mpg             mpg
------------------------------------------------------------
foreign             4.946***       -1.650          -2.246
                   (3.63)         (-1.53)         (-1.81)
weight                           -0.00659***     -0.00675***
                                 (-10.34)         (-5.80)
displacement                                      0.00825
                                                   (0.72)
gear_ratio                                          2.058
                                                   (1.17)
_cons               19.83***        41.68***        34.52***
                  (26.70)         (19.25)          (5.17)
------------------------------------------------------------
N                      74              74              74
------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
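For comparison, here is a sketch of the same nested models stored with the eststo variant, which can be used as a prefix so that estimation and storage happen on one line. It assumes estout is installed and reuses the names m1 to m3.

```stata
* Assumes estout is installed: ssc install estout
sysuse auto, clear
eststo clear                      // drop any sets previously stored by eststo

* "eststo name: command" runs the model and stores its results under name
eststo m1: regress mpg foreign
eststo m2: regress mpg foreign weight
eststo m3: regress mpg foreign weight displacement gear_ratio

esttab m1 m2 m3
```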
Summary (Model-Level) Statistics
The N (number of observations) for each model is shown by default, but you can add other model-level statistics. Options include R-squared (r2), AIC (aic), and BIC (bic). Any other scalar in the e() vector can also be added using the scalars() option, abbreviated scalar() in the command below. For example, you could add the model's F statistic, stored as e(F), with the option scalar(F). You cannot control the order in which they are listed, but you can move N to the end with obslast. You can remove N entirely with noobs.
esttab m1 m2 m3, se aic obslast scalar(F) bic r2
------------------------------------------------------------
                      (1)             (2)             (3)
                      mpg             mpg             mpg
------------------------------------------------------------
foreign             4.946***       -1.650          -2.246
                  (1.362)         (1.076)         (1.240)
weight                           -0.00659***     -0.00675***
                               (0.000637)       (0.00116)
displacement                                      0.00825
                                                 (0.0114)
gear_ratio                                          2.058
                                                  (1.755)
_cons               19.83***        41.68***        34.52***
                  (0.743)         (2.166)         (6.675)
------------------------------------------------------------
R-sq                0.155           0.663           0.669
AIC                 460.3           394.4           396.9
BIC                 465.0           401.3           408.4
F                   13.18           69.75           34.94
N                      74              74              74
------------------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
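A short sketch of giving added scalars readable row labels: each quoted entry in scalars() is the e() name followed by the label to print. The choice of e(rmse) and the label text are illustrative assumptions, and estout is assumed to be installed.

```stata
* Assumes estout is installed: ssc install estout
sysuse auto, clear
regress mpg foreign weight

* ar2 adds adjusted R-squared; each quoted scalars() entry is "name label"
esttab, se ar2 scalars("F F statistic" "rmse Root MSE") obslast
```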
Cell (Variable-Level) Statistics
In addition to t statistics, z statistics, and standard errors, esttab can put p-values and confidence intervals in the parentheses with the p and ci options. You can have no secondary quantity in parentheses at all with the not (no t) option.
You can replace the main numbers as well. The beta option replaces them with standardized beta coefficients. The main() option lets you replace them with any other quantity from the e() vector.
If you prefer to have the statistic in parentheses on the same row as the coefficient, use the wide option.
esttab m1 m2 m3, wide ci noobs
---------------------------------------------------------------------------------------------------------------------------------
                      (1)                                    (2)                                    (3)
                      mpg                                    mpg                                    mpg
---------------------------------------------------------------------------------------------------------------------------------
foreign             4.946***          [2.230,7.661]       -1.650            [-3.796,0.495]       -2.246            [-4.719,0.227]
weight                                                  -0.00659***    [-0.00786,-0.00532]     -0.00675***    [-0.00907,-0.00443]
displacement                                                                                    0.00825          [-0.0145,0.0310]
gear_ratio                                                                                        2.058            [-1.444,5.559]
_cons               19.83***          [18.35,21.31]        41.68***          [37.36,46.00]        34.52***          [21.21,47.84]
---------------------------------------------------------------------------------------------------------------------------------
95% confidence intervals in brackets
* p<0.05, ** p<0.01, *** p<0.001
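The other cell-level options described in this section can be tried one at a time on a single model. This is a minimal sketch assuming estout is installed.

```stata
* Assumes estout is installed: ssc install estout
sysuse auto, clear
regress mpg foreign weight

esttab, p            // p-values in parentheses
esttab, not          // coefficients only, nothing in parentheses
esttab, beta not     // standardized beta coefficients, nothing in parentheses
```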
Titles, Notes, and Labels
You can give the table an overall title with the title() option. Type the desired title in the parentheses.
If you want to remove the note at the bottom that explains the numbers in parentheses and the meaning of the stars, use the nonotes option. If you want to add notes, use the addnotes() option with the desired notes in the parentheses. If you want multiple lines of notes, put each line in quotes.
By default each model in a table is labeled with a number and a title. If you don't want the number to appear, use the nonumber option. The model title defaults to the name of the model's dependent variable, but you can change model titles with mtitle(). Each title goes in quotes inside the parentheses, and the order must match the order in which the stored estimates are listed in the main command.
The label option tells esttab to use the variable labels rather than the variable names. That means you can control exactly how a variable is listed by changing its label—just make sure the label provides an adequate description of the variable but is not too long. The labels below illustrate some of the potential problems.
esttab m1 m2 m3, se label nonumber title("Models of MPG") mtitle("Model 1" "Model 2" "Model 3")
Models of MPG
--------------------------------------------------------------------
                          Model 1         Model 2         Model 3
--------------------------------------------------------------------
Car type                    4.946***       -1.650          -2.246
                          (1.362)         (1.076)         (1.240)
Weight (lbs.)                            -0.00659***     -0.00675***
                                       (0.000637)       (0.00116)
Displacement .. in.)                                      0.00825
                                                         (0.0114)
Gear Ratio                                                  2.058
                                                          (1.755)
Constant                    19.83***        41.68***        34.52***
                          (0.743)         (2.166)         (6.675)
--------------------------------------------------------------------
Observations                   74              74              74
--------------------------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
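The nonotes and addnotes() options described above can be combined to replace the default footer with your own. In this sketch the note text is purely illustrative, and estout is assumed to be installed.

```stata
* Assumes estout is installed: ssc install estout
sysuse auto, clear
regress mpg foreign weight

* nonotes removes the default footer; each quoted string in addnotes() is one line
esttab, se nonotes addnotes("Standard errors in parentheses." "Data: auto.dta, shipped with Stata.")
```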
If you don't want to change the actual variable labels, you can override them with the coeflabel() option. Put the variable name/label pairs you want to use inside the parentheses. Any variable for which you do not specify a label will be listed with its actual name.
esttab m1 m2 m3, coeflabel(foreign "Foreign Car" displacement "Displacement" gear_ratio "Gear Ratio" _cons "Constant")
------------------------------------------------------------
                      (1)             (2)             (3)
                      mpg             mpg             mpg
------------------------------------------------------------
Foreign Car         4.946***       -1.650          -2.246
                   (3.63)         (-1.53)         (-1.81)
weight                           -0.00659***     -0.00675***
                                 (-10.34)         (-5.80)
Displacement                                      0.00825
                                                   (0.72)
Gear Ratio                                          2.058
                                                   (1.17)
Constant            19.83***        41.68***        34.52***
                  (26.70)         (19.25)          (5.17)
------------------------------------------------------------
N                      74              74              74
------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
Formats
In general you can change the format of a number by placing the desired format in parentheses following the option that prompts that number to be displayed. Use b() to format the betas and t() to format t statistics.
esttab m1 m2 m3, b(%9.2g) t(%9.2g) r2(%9.6g)
------------------------------------------------------------
                      (1)             (2)             (3)
                      mpg             mpg             mpg
------------------------------------------------------------
foreign               4.9***         -1.7            -2.2
                    (3.6)          (-1.5)          (-1.8)
weight                             -.0066***       -.0068***
                                    (-10)          (-5.8)
displacement                                        .0082
                                                    (.72)
gear_ratio                                            2.1
                                                    (1.2)
_cons                  20***           42***           35***
                     (27)            (19)           (5.2)
------------------------------------------------------------
N                      74              74              74
R-sq              .154762         .662703         .669463
------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
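Besides full Stata display formats, esttab also accepts a bare integer as shorthand for a fixed number of decimal places. A minimal sketch, assuming estout is installed:

```stata
* Assumes estout is installed: ssc install estout
sysuse auto, clear
regress mpg foreign weight

* A plain integer means "this many decimal places"
esttab, b(3) se(3) r2(4)
```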
Stars and Significance
The star() option lets you control when stars are used. Inside the parentheses you'll put a list of characters paired with the numeric threshold beneath which they will be applied to a coefficient. The default is equivalent to:
star(* 0.05 ** 0.01 *** 0.001)
Note that star() pays attention to both the numbers and how you format them: if you don't include the leading zeros they will not appear in the table.
esttab m1 m2 m3, p star(+ 0.1 * 0.05 ** 0.01)
---------------------------------------------------------
                      (1)            (2)            (3)
                      mpg            mpg            mpg
---------------------------------------------------------
foreign             4.946**       -1.650         -2.246+
                  (0.001)        (0.130)        (0.074)
weight                          -0.00659**     -0.00675**
                                 (0.000)        (0.000)
displacement                                    0.00825
                                                (0.472)
gear_ratio                                        2.058
                                                (0.245)
_cons               19.83**        41.68**        34.52**
                  (0.000)        (0.000)        (0.000)
---------------------------------------------------------
N                      74             74             74
---------------------------------------------------------
p-values in parentheses
+ p<0.1, * p<0.05, ** p<0.01
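Two further variations, sketched under the assumption that estout is installed: a different set of thresholds (the 0.10/0.05/0.01 convention is only an illustration, not a recommendation from this article), and suppressing the stars altogether.

```stata
* Assumes estout is installed: ssc install estout
sysuse auto, clear
regress mpg foreign weight

* Thresholds are written with leading zeros so the legend shows them that way
esttab, p star(* 0.10 ** 0.05 *** 0.01)

* No significance stars at all
esttab, p nostar
```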
Tables of Summary Statistics
The esttab command is designed to draw information from the e() vector, which is only used by estimation commands. However, estpost will take the results from the r() vector used by other commands and post them in the e() vector. This allows esttab to create tables based on those results, but you'll generally have to give more guidance about what that table should contain.
To store the results of a command in e(), put the estpost command before it:
estpost sum price foreign mpg
The resulting table is designed to tell you the official name of each quantity. You will use those names in subsequent esttab commands.
             |   e(count)   e(sum_w)    e(mean)     e(Var)      e(sd)     e(min)     e(max)     e(sum)
-------------+----------------------------------------------------------------------------------------
       price |         74         74   6165.257    8699526   2949.496       3291      15906     456229
     foreign |         74         74   .2972973   .2117734   .4601885          0          1         22
         mpg |         74         74    21.2973   33.47205   5.785503         12         41       1576
When working with regression results, esttab knows that e(b) is the primary quantity of interest and builds the table accordingly. With summary statistics, you need to tell esttab what the table should contain using the cell() option. This is technically an option for estout rather than esttab, but esttab will pass it along to estout while still doing some of the other work for you. However, if you want to read the full documentation for the cell() option you need to type help estout rather than help esttab.
If you want a table of just means, use cell(mean):
esttab, cell(mean)
-------------------------
                      (1)
                     mean
-------------------------
price            6165.257
foreign          .2972973
mpg               21.2973
-------------------------
N                      74
-------------------------
You can list multiple quantities:
esttab, cell(mean sd)
-------------------------
                      (1)
                  mean/sd
-------------------------
price            6165.257
                 2949.496
foreign          .2972973
                 .4601885
mpg               21.2973
                 5.785503
-------------------------
N                      74
-------------------------
If you want quantities to appear on a single row, you can group them with either quotes or parentheses. The following commands are equivalent:
esttab, cell("mean sd")
esttab, cell((mean sd))
--------------------------------------
                      (1)
                     mean           sd
--------------------------------------
price            6165.257     2949.496
foreign          .2972973     .4601885
mpg               21.2973     5.785503
--------------------------------------
N                      74
--------------------------------------
Note how in this case quotes do not indicate strings!
Model numbers and model titles make little sense for this table (especially since the title is empty at this point), so consider removing them with nonumber and nomtitle:
esttab, cell((mean sd)) nonumber nomtitle
--------------------------------------
                     mean           sd
--------------------------------------
price            6165.257     2949.496
foreign          .2972973     .4601885
mpg               21.2973     5.785503
--------------------------------------
N                      74
--------------------------------------
To control the numeric format of results listed in cell() use the fmt() option:
esttab, cell((mean(fmt(%9.1f)) sd(fmt(%9.2f)))) nonumber nomtitle
--------------------------------------
                     mean           sd
--------------------------------------
price              6165.3      2949.50
foreign               0.3         0.46
mpg                  21.3         5.79
--------------------------------------
N                      74
--------------------------------------
There are many other options. A useful addition to this table is par for parentheses:
esttab, cell((mean sd(par))) nonumber nomtitle
--------------------------------------
                     mean           sd
--------------------------------------
price            6165.257   (2949.496)
foreign          .2972973   (.4601885)
mpg               21.2973   (5.785503)
--------------------------------------
N                      74
--------------------------------------
The column heading labels also leave something to be desired. You can override them with a label() option associated with each quantity in cell(). This is different from the general label option, which tells esttab to replace the variable names at the beginning of each row with the variable labels. You are welcome to use both (or use coeflabel() to set the row labels yourself):
esttab, cell((mean(label(Mean)) sd(par label(Standard Deviation)))) label nonumber nomtitle
----------------------------------------------
                             Mean Standard D~n
----------------------------------------------
Price                    6165.257   (2949.496)
Car type                 .2972973   (.4601885)
Mileage (mpg)             21.2973   (5.785503)
----------------------------------------------
Observations                   74
----------------------------------------------
The problem now is that "Standard Deviation" had to be truncated because its column is not wide enough. You can set the width of the columns with the modelwidth() option (recall that when dealing with regression results each column is a model). If you put a single number in the parentheses the width in characters of all the columns will be set to that number. If you give a list of numbers, they will be applied to the columns in order:
esttab, modelwidth(10 20) cell((mean(label(Mean)) sd(par label(Standard Deviation)))) label nomtitle nonumber
----------------------------------------------------
                           Mean   Standard Deviation
----------------------------------------------------
Price                  6165.257           (2949.496)
Car type               .2972973           (.4601885)
Mileage (mpg)           21.2973           (5.785503)
----------------------------------------------------
Observations                 74
----------------------------------------------------
Admittedly this will never be publication-quality when rendered as plain text. But consider this RTF version, created by:
esttab using means.rtf, modelwidth(10 20) cell((mean(label(Mean)) sd(par label(Standard Deviation)))) label nomtitle nonumber replace
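Any of the quantities listed by estpost can be placed side by side in the same way. The sketch below builds a wider summary table from the names shown earlier (count, mean, sd, min, max); the particular selection and formats are illustrative, and estout is assumed to be installed.

```stata
* Assumes estout is installed: ssc install estout
sysuse auto, clear
estpost summarize price foreign mpg

* Quotes group the quantities onto one row; noobs drops the N line,
* which would duplicate the count column
esttab, cell("count mean(fmt(%9.2f)) sd(fmt(%9.2f)) min max") nonumber nomtitle noobs
```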
Frequency Tables
Creating frequency tables also relies on using estpost to put the results in the e() vector:
estpost tab rep78 foreign
     foreign |
       rep78 |       e(b)     e(pct)  e(colpct)  e(rowpct)
-------------+--------------------------------------------
Domestic     |
           1 |          2   2.898551   4.166667        100
           2 |          8    11.5942   16.66667        100
           3 |         27   39.13043      56.25         90
           4 |          9   13.04348      18.75         50
           5 |          2   2.898551   4.166667   18.18182
       Total |         48   69.56522        100   69.56522
-------------+--------------------------------------------
Foreign      |
           1 |          0          0          0          0
           2 |          0          0          0          0
           3 |          3   4.347826   14.28571         10
           4 |          9   13.04348   42.85714         50
           5 |          9   13.04348   42.85714   81.81818
       Total |         21   30.43478        100   30.43478
-------------+--------------------------------------------
Total        |
           1 |          2   2.898551   2.898551        100
           2 |          8    11.5942    11.5942        100
           3 |         30   43.47826   43.47826        100
           4 |         18   26.08696   26.08696        100
           5 |         11   15.94203   15.94203        100
       Total |         69        100        100        100
These are the same numbers you'd get from tab alone, just organized differently. Note that the frequencies themselves are called e(b), but we'll still use cell() because otherwise esttab will treat them like regression coefficients:
esttab, cell(b)
-------------------------
                      (1)
                        b
-------------------------
Domestic
1                       2
2                       8
3                      27
4                       9
5                       2
Total                  48
-------------------------
Foreign
1                       0
2                       0
3                       3
4                       9
5                       9
Total                  21
-------------------------
Total
1                       2
2                       8
3                      30
4                      18
5                      11
Total                  69
-------------------------
N                      69
-------------------------
The model number, empty model title, and column label (b) are all useless here, so remove the number and title and change the label with collabels(). You could also remove the column label entirely with collabels(none).
esttab, cell(b) nonumber nomtitle collabels(Frequency)
-------------------------
                Frequency
-------------------------
Domestic
1                       2
2                       8
3                      27
4                       9
5                       2
Total                  48
-------------------------
Foreign
1                       0
2                       0
3                       3
4                       9
5                       9
Total                  21
-------------------------
Total
1                       2
2                       8
3                      30
4                      18
5                      11
Total                  69
-------------------------
N                      69
-------------------------
The unstack option converts the three sections into columns:
esttab, cell(b) unstack nonumber nomtitle collabels(none)
---------------------------------------------------
                 Domestic      Foreign        Total
---------------------------------------------------
1                       2            0            2
2                       8            0            8
3                      27            3           30
4                       9            9           18
5                       2            9           11
Total                  48           21           69
---------------------------------------------------
N                      69
---------------------------------------------------
To control the label for the row variable, use eqlabels(). Because esttab was built for models, it treats the row variable as the left-hand side of an equation, so the label has to go in the lhs() suboption within eqlabels(). You can adjust the amount of space available to the label with varwidth():
esttab, cell(b) eqlabels(, lhs("Repair Record")) varwidth(15) unstack nonumber nomtitle collabels(none)
------------------------------------------------------
Repair Record       Domestic      Foreign        Total
------------------------------------------------------
1                          2            0            2
2                          8            0            8
3                         27            3           30
4                          9            9           18
5                          2            9           11
Total                     48           21           69
------------------------------------------------------
N                         69
------------------------------------------------------
You can add additional quantities to cell() and control their appearance and structure using all the tools we discussed in the section on summary statistics. Consider adding a note to explain what each number represents with the note() option:
esttab, cell(b rowpct(fmt(%5.1f) par)) note(Row Percentages in Parentheses) unstack nonumber nomtitle collabels(none) eqlabels(, lhs("Repair Record")) varwidth(15)
------------------------------------------------------
Repair Record       Domestic      Foreign        Total
------------------------------------------------------
1                          2            0            2
                     (100.0)        (0.0)      (100.0)
2                          8            0            8
                     (100.0)        (0.0)      (100.0)
3                         27            3           30
                      (90.0)       (10.0)      (100.0)
4                          9            9           18
                      (50.0)       (50.0)      (100.0)
5                          2            9           11
                      (18.2)       (81.8)      (100.0)
Total                     48           21           69
                      (69.6)       (30.4)      (100.0)
------------------------------------------------------
N                         69
------------------------------------------------------
Row Percentages in Parentheses
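The same pattern works for the other quantities that estpost stores. As a sketch, this variant shows column percentages, e(colpct), instead of row percentages; it assumes estout is installed.

```stata
* Assumes estout is installed: ssc install estout
sysuse auto, clear
estpost tabulate rep78 foreign

* Same layout as the row-percentage table, with colpct in parentheses
esttab, cell(b colpct(fmt(%5.1f) par)) note(Column Percentages in Parentheses) unstack nonumber nomtitle collabels(none) eqlabels(, lhs("Repair Record")) varwidth(15)
```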
This is just a fraction of what esttab (let alone estout) can do. To learn more, we suggest reading the Stata Journal article that introduced it. For syntax details, type help esttab and/or help estout.
Last Revised: 3/26/2015