Another great resource for the analyst is What Statistical Analysis Should I Use?, courtesy of the Institute for Digital Research and Education (IDRE) at UCLA. The linked document outlines different tests very clearly, with working examples in Stata. They also have examples for SAS and SPSS, which you can look up in this handy table if you are so inclined. Each statistical test gets a short paragraph describing what it tests, which types of variables it's appropriate for, and how the test relates to other methods. I especially like that important assumptions, which can easily be overlooked, are pointed out explicitly.
Take, for example, the two-sample t-test above. For any range-based variable you can calculate the t-statistic and then use the limiting distribution to estimate the confidence interval/p-value. However, the limiting distribution only applies to range-based variables that are normally distributed. So if your variable of interest cannot be assumed to be normal, a t-test is inappropriate. As with most things Stata, the document is geared towards causal analysis. This means that terms like "dependent" and "independent" variable are thrown around. In a two-sample example, the "dependent" variable is the variable of interest and the "independent" variable is an indicator of which sample a given observation belongs to. I came across this document when trying to answer the question: how do I test whether two non-normal samples arise from the same distribution? In my case, the Wilcoxon-Mann-Whitney test (or the Kruskal-Wallis test on two samples) would be more appropriate than a two-sample t-test.
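The contrast between the two tests can be sketched outside Stata as well. As a hypothetical illustration in Python (standard library only; the function names and sample data are my own, not from the UCLA page), here is Welch's t-statistic next to the Mann-Whitney U statistic computed from rank sums:

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch's two-sample t statistic (no equal-variance assumption)."""
    vx, vy = variance(x), variance(y)  # sample variances
    return (mean(x) - mean(y)) / math.sqrt(vx / len(x) + vy / len(y))

def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for the first sample, via rank sums.

    Rank-based, so it needs no normality assumption; ties get midranks.
    """
    combined = sorted((v, i) for i, v in enumerate(x + y))
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average rank across a run of ties
        for k in range(i, j + 1):
            ranks[combined[k][1]] = midrank
        i = j + 1
    r_x = sum(ranks[:len(x)])  # rank sum of the first sample
    return r_x - len(x) * (len(x) + 1) / 2

x = [1.1, 2.3, 1.9, 3.0, 2.2]
y = [2.0, 2.8, 3.5, 3.1, 2.9]
print(welch_t(x, y))       # compare against a t reference distribution
print(mann_whitney_u(x, y))  # compare against the U null distribution
```

The point of the comparison: the t-statistic leans on means and variances (and hence on normality for its small-sample distribution), while U uses only ranks.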
The list of methods provided by "What Stat...?" is far from exhaustive. Other tests that I have found and believe would also be appropriate are Kolmogorov-Smirnov and Anderson-Darling. This raises a new question: how can the analyst reconcile contradictory results from different nonparametric tests? A subject for a later post, I believe.
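For reference, the two-sample Kolmogorov-Smirnov statistic is simply the largest vertical gap between the two empirical CDFs. A minimal Python sketch, with a hypothetical function name and toy data of my own choosing:

```python
def ks_2samp_stat(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: max |ECDF_x(t) - ECDF_y(t)|."""
    def ecdf(sample, t):
        # fraction of sample values <= t
        return sum(1 for v in sample if v <= t) / len(sample)
    # the gap can only change at observed values, so check those
    points = sorted(set(x) | set(y))
    return max(abs(ecdf(x, t) - ecdf(y, t)) for t in points)

print(ks_2samp_stat([1, 2, 3, 4], [3, 4, 5, 6]))  # prints 0.5
```

Turning the statistic into a p-value requires its null distribution, which is where a library (or Stata's ksmirnov) comes in; the statistic itself is this simple.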
