This guide is supposed to work as a brief "online help" for Stata for Windows that makes specific use of the possibilities of the internet. It was modeled after my SPSS guide. Its aim is to provide an intermediate road to learning Stata that hopefully is especially convenient for newbies (even though not for absolute beginners). Throughout, it is assumed that users have already mastered the statistical procedures I am dealing with, as no explanations of these are given. All you can learn here is how to put things into practice with Stata.
Please note also that this guide does not introduce you in any thorough way to the fundamentals of working with Stata for Windows, e.g., how to install the program, what the different "windows" are, how to set up a data base, how exactly to execute commands from a do file, etc. Of course, much of it will be mentioned, but it won't be explained in any depth, as these are things that are quite tiresome to explain in writing and very easy to explain simply by demonstrating and rehearsing (and some trial and error). But when you have just developed a basic idea of how the program works, this guide hopefully may be of some help.
There are great books about working with Stata. Also, the handbooks are very fine. One advantage of this guide is that you don't have to carry it around. In the age of the internet, almost everyone who is working with a computer has immediate access to the world wide web and can easily retrieve this guide. Of course, it does not go much beyond Stata's help system; and in many, many respects the help system is much more comprehensive. Yet this guide has a further advantage: It is structured according to procedures and therefore may help you to find more easily what you need. Another advantage (which in other respects is a disadvantage, of course) is its relative brevity. For instance, as of version 11, Stata handbooks come as PDF files. This is very nice inasmuch as you can carry them easily with you, in contrast to the tons of paper that were distributed earlier. But there is a drawback: It's not easy to browse through long texts on a computer. So, perhaps there is still a place for this guide; if there isn't, I don't care much, as I write it basically for myself.
This guide gives only a few examples for the most common Stata procedures. And indeed, I work via examples. For instance, the Stata help system will often present procedures as follows:
STATA HELP SYSTEM: alpha varlist [, options]
which means that "varlist" is to be replaced by a list of variables and "options" by the names of the specific options chosen (the brackets mean that options may be omitted). This guide will typically give simply a list of variables and will also display immediately one or several options that seem helpful to me, as in
THIS GUIDE: alpha trust1 trust2 trust3, i g(trust)
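For readers who want to try this pattern on real data, here is a minimal sketch. It assumes the auto data set that is shipped with Stata; the variables price, weight and length and the new variable name size are chosen purely for illustration (they do not form a meaningful scale) and merely stand in for trust1, trust2, trust3 and trust. The options i and g() are abbreviations of item and generate().

```stata
* Illustration only: the auto example data stand in for the trust items
sysuse auto, clear

* Options written out: item statistics plus a new scale variable
alpha price weight length, item generate(size)

* generate() does not overwrite an existing variable, so drop it first
drop size

* The same command with abbreviated option names, as used in this guide
alpha price weight length, i g(size)
```

Both calls should produce the same output; the abbreviated form simply saves typing.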
A note on different versions of Stata. As far as I could check, all of the examples I provide should work with Stata for Windows, version 10. They should also work with higher versions (currently, the newest version is 11), but new stuff from higher versions has not yet entered this guide. Most of the stuff will also work with earlier versions. However, I only started to work with Stata at version 8, so I do not have any knowledge about what happened before that. Note that I don't have the time to adjust this guide immediately as soon as changes are made to Stata.
How reliable is this guide? Well, apart from the odd typo, everything you find here should work, as I said before. The reason is quite simple: This guide arises from my own work; that is, it is primarily motivated by my own desire to put down what I have found out about Stata in order to retrieve it whenever needed. Publishing this stuff on the internet is just a way of sharing what I have learned. However, occasionally I speculate about things Stata might or might not do (for instance, when comparing it to the capabilities of other software). Typically, this means that I have looked for ways to do things in a certain way and was unable to find them. Of course, this does not necessarily mean that these possibilities are absent; rather, it may well be the case that I was only too stupid.
All in all, please consider the following: What you find here is the work of a single person who has many other duties to comply with. This guide is just a spin-off of the data analysis work I am doing and which is my main business – apart from dealing with the university administration, filling in forms, applying for this and that, commenting on the latest ideas of my department, my colleagues, my dean, other deans, the rector, the vice-rector, the vice-vice-rectors, and many others about how to render our university more up-to-date, designing new courses of study, preparing for meetings, going to meetings, thinking about the consequences of meetings, trying to figure out whether and when my university has already given me the money it has promised me, trying to figure out how much I will have to pay for my staff this year (last year, the administration started accounting in autumn, that is, after about three quarters of the money had already gone with me just guessing how much it might be), and so on. Therefore, there may be a lot that you will find wanting in this guide; this refers not only to content, but also to language (including typos) and design. Please accept my apologies.
Perhaps I will collect links at a later stage. For the time being, just use this page from StataCorp, which provides links to helpful resources for learning Stata.
Note: Only 'major' changes (new keywords, sizable additions to entries that already exist) are reported here. I try to indicate the date of the most recent change at the bottom of each entry, but I am not very good at this. Minor corrections of typos, in particular, may thus go unnoticed.
January 2012
Added entry on help, search and the like to section "Basics". Also, I am now working with Stata 12, and I will try to make you aware of the most important changes.
December 2011
New entries on factor variables and on multilevel models (currently for metric dependent variables only). Some additions to entries on generating variables and on crosstabulation.
September 2011
Added a small piece on packages for formatting Stata output (e.g., for LaTeX, HTML or Word) in the entry on output (section "Basics"). Created a new entry on estimation of confidence intervals (section "Data Analysis").
December 2010
Added a small entry on nonparametric tests.
November 2010
Slightly enlarged the section on crosstabulations to give a little bit more prominence to the tab2 command, and also to explain the firstonly option.
October 2010
Added a section about cumulative density plots (for empirical variables) to the entry about basic charts.
September 2010
Slightly expanded the section on EDA to explain how to influence the display of stem-and-leaf displays. Added a few words about the fre command to the entry about frequency tables.
August 2010
A small section with some basic commands for analyzing multiply imputed data sets with Stata 11 has been added. The entry on correlations has been slightly expanded.
June 2010
I have finally acquired Stata 11. I will try to adapt this guide to any changes I encounter, but please have some patience. Presently, I notify readers of one major change, i.e. the fact that you may leave the data window open while proceeding with your work.
May 2010
This entry serves mainly to assure people that this is not a dead end. I have not done much to improve or enlarge this guide, but every now and then, small changes occur and mistakes of language are corrected (or so I hope). – Near the end of May, I added a very small entry about life table analysis.
April 2009
Every now and then, minor amendments and extensions are being made.
March 2009
What you find here has grown since late 2005. Now, I think enough stuff has accumulated to put it on the world wide web. Please note that this version is far from satisfactory. This does not mean that it contains wrong or useless stuff; rather, it is not very comprehensive. It's really mostly for beginners.
This page was initiated and is maintained by
Prof. Dr. Wolfgang Ludwig-Mayerhofer
Universität Siegen
FB 1 - Soziologie
57068 Siegen
Homepage at the University of Siegen
Last update: 27 Jan 2012