Epidemiology & Biostatistics
- eBooks
- CDC: Public Health 101 Series
- CDC: Crisis & Emergency Risk Communication (CERC)
- UCLA Statistical Computing is a valuable resource for learning statistics and statistical software such as SAS, Stata, and R!.
- Statistics Solutions has succinct explanations of statistical approaches, including Factor Analysis & SEM
- Zuur (2009). A Protocol for data exploration to avoid common statistical problems
- Age-Standardization and Age-Adjustment
- CDC. 2000 projected US population weight and distribution pattern
- SEER. Standard population for age-adjustment
- Institute of Medicine. The future of public health (1988), The future of the public's health in the 21st century (2003)
- Smelser (2001): International Encyclopedia of the Social & Behavioral Sciences [1st (2001)] [2nd (2015)]
- Lohr (2001): Sample Surveys: Model-based Approaches
- Dr. David Kleinbaum published his ActivEpi in 2001, which is now online for free (ActivEpi website)
- Majid Ezzati (2006): Global Burden of Disease and Risk Factors
- What is epidemiology? and Jokes.
- Interpretation of relative risk
- Poisson regression and count outcome
- Poisson regression and related
- How to calculate confidence interval of incidence rate under the Poisson distribution
- How to get predicted incidence rate using -poisson- of Stata
- How large an SE is too large?
- Walker (2016): A Guide to Section 508 Compliance Using SAS® 9.4 ODS
- Gordon (2014): An exercise in non-linear modeling
- Complex Sampling Survey
- Allen Downey (2014): Think Stats using Python
- Grinstead: Introduction to Probability
- Michael Lavine (2013): Introduction to Statistical Thought
- GRADE website: GRADE guidelines (Grades of Recommendation, Assessment, Development, and Evaluation) (2011)
- CONSORT website- Transparent Reporting of Trials: Guidelines for Reporting Observational & RCT Studies and Flow Diagram (2010)
- STROBE website: STROBE: The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement (2007)
- Clinical Analyte Unit Conversion - Jay Clinic Service.
- OpenEpi provides statistics for counts and measurements in descriptive and analytic studies, stratified analysis with exact confidence limits, matched pair and person-time analysis, sample size and power calculations, random numbers, sensitivity, specificity and other evaluation statistics, R x C tables, chi-square for dose-response, and links to other useful sites.
- Statistical literacy
- Chart Chooser: the favorite tool for improved Excel and PowerPoint charts. There is an R! version of Chart Chooser (not many charts on the site, but the idea is great)
- Jon's Excel Charts and Tutorials - Peltier Tech
- Stats + Stories: The statistics behind the stories and the stories behind the statistics.
- Ann Emery: Data visualization blogs
- Broman (2017): Data organization in spreadsheets
- Vincent Granville (2014): 10 types of regressions. Which one to use?
- Moderator vs mediator
- Wikipedia: Mediation, Moderation
- David Kenny: Moderator, Mediator
- Baron & Kenny (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182.
- inference (2018). ML beyond Curve Fitting: An Intro to Causal Inference and do-Calculus. Interview with Judea Pearl: To Build Truly Intelligent Machines, Teach Them Cause and Effect
- Distribution (Probability, CDF, and Quantile)
- Wicklin (2018). Fit a distribution from quantiles (SAS)
- Tony Hey (2009): The Fourth Paradigm Data-Intensive Scientific Discovery (Microsoft/Publications)
- Yanir Seroussi: Causal Inference reading List
- Missing Imputation
- Blog: Multiple Imputation
- Allison (2014). Sensitivity analysis for not missing at random.
- Stata: Yulia Marchenko (2011). Chained equations and more in multiple imputation in Stata 12
- Survey data imputation
- Wells (2018): Approaches to imputing missing data in complex survey data
- Mukhopadhyay (2016): Survey Data Imputation with PROC SURVEYIMPUTE (Video)
- Resampling and Monte Carlo Simulation
- Latent Class Analysis
- Christopher Baum (2016): Introduction to SEM in Stata
- Jones (2012): A Stata plugin for estimating group-based trajectory models (traj)
- Curran-Bauer (2016): Introduction to Growth Curve Modeling: An Overview and Recommendations for Practice
- Nagin (1999): Analyzing developmental trajectories: a semiparametric, group-based approach
- Training Course
- Linear and nonlinear function/relationships/regression
- Khan Academy: Linear and nonlinear functions (1, 2), Exploring nonlinear relationships
- Richard Williams: Nonlinear relationships, Stata highlights
- Minitab: What Is the Difference between Linear and Nonlinear Equations in Regression Analysis?; Linear or Nonlinear Regression? That Is the Question; Curve Fitting with Linear and Nonlinear Regression
- UCLA: Nonlinear Regression in SAS; Nonlinear or Linear Model
- PennState: Logistic, Poisson, and Nonlinear Regression
- datascience+: First steps with Non-Linear Regression in R!
- StackExchange: How to tell the difference between linear and non-linear regression models?
- StatisticsSolutions: Nonlinear regression
- Wikipedia: Linear function; Nonlinear system; Linear regression; nonlinear regression
- Ruckstuhl: Introduction to Nonlinear Regression
- Motulsky (2016): Fitting curves to data using nonlinear regression
- Haan: What are nonlinear regression functions?
- Brannick: Curvilinear Regression
- Wicklin (2018). Solve a system of nonlinear equations with SAS
- Wicklin (2018). Fit a growth curve in SAS
- Trend analysis
- Blog: Trend Analysis
- NIH. Joinpoint (software)
- Bayesian
- Kruschke (2013). Bayesian estimation supersedes the t test
- bayestestR: Become a Bayesian master you will
- McElreath: Statistical Rethinking: Bayesian statistics using R & Stan open access online.
- Scott Cunningham: Causal Inference: the Mixtape (using Stata)
- Stata
- Blog: Stata - my first Stata program
- Stata Online Help and Document
- The Stata Journal
- Tips of Stata
- Dickman. Estimating and modelling relative survival using Stata (strs, stnet)
- Drukker. Programming an estimation command in Stata: a map to posted entries
- Sribney. How can I estimate correlations and their level of significance with survey data
- margins - undocumented and under documented
- margins, gen() creates variables with predictions for each observation
- margins, at(varnm=gen(exp)) generates values for making predictions
- margins dis, at(age=gen(age)) gives average prediction by dis at observed age, which is equal to margins dis
- margins, at(age=gen(age+1)) gives average prediction by dis at observed age plus 1, which is equal to: .replace age=age+1 .margins dis
- margins dis, at(age=gen(age) age=gen(age+1))
- Average prediction at observed age plus one standard deviation: .sum age .local sd=r(sd) .margins dis, at(age=gen(age+`sd'))
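The logic behind the margins, at(var=gen(exp)) items above can be illustrated outside Stata. The following is a minimal Python sketch, not Stata's implementation: the logit coefficients in COEF, the five observations, and the helper names predict and predictive_margins are all hypothetical and exist only for illustration. It sets dis to each level for every observation, replaces age by a function of the observed age, and averages the predictions.

```python
import math

# Hypothetical logit coefficients (illustrative values, not from a fitted model).
COEF = {"cons": -4.0, "age": 0.05, "dis": 0.8}

def predict(age, dis):
    """Predicted probability from the assumed logistic model."""
    xb = COEF["cons"] + COEF["age"] * age + COEF["dis"] * dis
    return 1.0 / (1.0 + math.exp(-xb))

def predictive_margins(rows, age_fn=lambda age: age, levels=(0, 1)):
    """Average prediction with dis set to each level for every row and
    age replaced by age_fn(age), mimicking at(age=gen(...))."""
    return {
        level: sum(predict(age_fn(r["age"]), level) for r in rows) / len(rows)
        for level in levels
    }

# Made-up observations, for illustration only.
rows = [{"age": a, "dis": d} for a, d in [(30, 0), (45, 1), (52, 0), (61, 1), (70, 1)]]

at_observed = predictive_margins(rows)                       # like: margins dis
at_plus_one = predictive_margins(rows, lambda age: age + 1)  # like: at(age=gen(age+1))

# Manual equivalent of ".replace age=age+1" followed by ".margins dis".
shifted = [{"age": r["age"] + 1, "dis": r["dis"]} for r in rows]
manual = predictive_margins(shifted)

# Observed age plus one sample standard deviation (n - 1 denominator).
ages = [r["age"] for r in rows]
mean_age = sum(ages) / len(ages)
sd = math.sqrt(sum((a - mean_age) ** 2 for a in ages) / (len(ages) - 1))
at_plus_sd = predictive_margins(rows, lambda age: age + sd)
```

Here at_plus_one and manual agree, which mirrors the equivalence stated in the list. The sketch produces point estimates only; standard errors, which margins also reports, are not computed.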
- Princeton University. Online Stata Tutorial at DSS
- Timberlake
- Bayesian analysis in Stata 15
- Stata Tips #7 - dyntext, dyndoc and user-written commands (version 15)
- Stata Tips #8 spatial analysis in Stata 15 (version 15)
- Herrera (2017). Spatial econometrics methods using Stata
- Kondo (2015) Hot and cold spot analysis using Stata (The Stata Journal)
- Pisati (2010). Exploratory spatial data analysis using Stata
- How to estimate intraclass correlation with survey data (VIF)? (Link1, Link2)
- Use "correlate" with aweight (equivalent to pweight for point estimates) to get point estimates of the correlation coefficient.
- Use "svy: regress" for p-values. Run "svy: regress y x" and "svy: regress x y" and take the larger p-value, which is the conservative choice.
- You might try "corr_svy", a module that computes correlation tables for survey data. It is based on Sribney's procedures mentioned above.
- Or, you can get the correlation coefficient using "svy: regress y x", then "disp sqrt(e(r2))" to show the coefficient (here e(r2) holds the R-squared value; its square root is the absolute value of the correlation). You can also calculate the tolerance using "disp 1-e(r2)" and the VIF (variance inflation factor) using "disp 1/(1-e(r2))". A rule of thumb is that if VIF>10, you need to examine multicollinearity further.
- Alternative for VIF calculation: "regress y x z [pw=srvyweight]", then "estat vif"
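The arithmetic in the steps above (weighted correlation, then tolerance and VIF from R-squared) can be sketched in Python. This is a minimal illustration under stated assumptions, not a replacement for the Stata commands: the helper names weighted_corr and collinearity_stats are hypothetical, the data and weights are made up, and only point estimates are computed, with no design-based standard errors or p-values.

```python
import math

def weighted_corr(x, y, w):
    """Weighted Pearson correlation (point estimate only)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / sw
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / sw
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / sw
    return cov / math.sqrt(vx * vy)

def collinearity_stats(r2):
    """Tolerance and VIF from an R-squared value; sqrt(r2) is |r|."""
    tolerance = 1.0 - r2
    return {"abs_r": math.sqrt(r2), "tolerance": tolerance, "vif": 1.0 / tolerance}

# Made-up data and survey weights, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
w = [1.0, 2.0, 1.0, 2.0, 1.0]

r = weighted_corr(x, y, w)
# With a single predictor, the weighted regression R-squared equals r squared.
stats = collinearity_stats(r ** 2)
```

Note that taking the square root of R-squared drops the sign of the correlation, so the sign has to be read from the regression coefficient.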
- Cross Validation
- Trevor Hastie's The Elements of Statistical Learning is a good and free book for more information. Thanks to Hastie.
- k-fold cross validation
- Stata: user written program crossfold (help file).
- R: Petr Keil (2013): AIC & BIC vs. Crossvalidation using R!.
- SAS: Using Validation and Cross Validation using PROC GLMSELECT.
- Deming: Cross Validation Using SAS
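For readers new to the idea, k-fold cross validation can be sketched in a few lines of Python. This is a generic illustration, not the implementation used by crossfold, PROC GLMSELECT, or any other tool listed above: the model is a simple least-squares line, and the function names fit_line and k_fold_mse and the data are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope

def k_fold_mse(xs, ys, k=5, seed=0):
    """Out-of-sample mean squared error from k-fold cross validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)        # random assignment to folds
    folds = [idx[i::k] for i in range(k)]   # k roughly equal folds
    sq_err = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Score the model only on observations it was not fitted to.
        sq_err += sum((ys[i] - (a + b * xs[i])) ** 2 for i in fold)
    return sq_err / len(xs)

# Made-up data: an exact line, and the same line with a deterministic wiggle.
xs = [float(x) for x in range(20)]
ys_exact = [2 * x + 1 for x in xs]
ys_noisy = [2 * x + 1 + (-1) ** int(x) * 0.5 for x in xs]

mse_exact = k_fold_mse(xs, ys_exact)
mse_noisy = k_fold_mse(xs, ys_noisy)
```

Each observation is held out exactly once, so the returned value averages prediction error over data the model never saw during fitting.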
- Net reclassification improvement (NRI) Pencina (2011): Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers (an example of application of cross-validation).
- Data Visualization
- Survival Analysis
- Stephen Jenkins: Survival Analysis with Stata (U of Essex)
- Princeton German Rodriguez: Survival Analysis Pop 509 course notes
- Roberto Gutierrez: On Frailty Models in Stata (uses the same dataset, bc.dta, as Jenkins' course)
- Austin (2017): A Tutorial on Multilevel Survival Analysis: Methods, Models and Applications
- How to split single observation into multiple observations by event time (Lexis Diagram): Stata - stsplit, R! - survival::survSplit and Epi::Lexis, SAS - Lexis.sas (pdf)
- Multilevel and Small Area Estimation (SAE) Analysis
- Princeton German Rodriguez: Multilevel Models Pop 510 course notes
- NIH (2000). Progress and promise in research on social and cultural dimensions of health - A research agenda (Video)
- University of Bristol. Centre for Multilevel Modelling
- The homepage of Joop Hox (author of Multilevel Analysis) has papers, programs, and lectures to download.
- Rabe-Hesketh (2006). Multilevel modelling of complex survey data (Slides 2007)
- Multilevel models for complex survey data - The slides/articles of tutorial at 2011 BRFSS conference
- Paul Allison (2017). Using "Between-Within" models to estimate contextual effects
- Suchindran. Sampling weights and Regression Analysis
- Zaccarin (2008). The effects of sampling weights in multilevel analysis of PISA data. (Slides - Kiel 2009)
- Carle (2009). Fitting multilevel models in complex survey data with design weights: Recommendations
- D’Agostino (SASGF 2013). Multilevel Reweighted Regression Models to Estimate County-Level Racial Health Disparities Using PROC GLIMMIX
- Chantala. Software to Compute Sampling Weights for Multilevel Analysis
- UCLA papers on multilevel modeling
- Effects of Multicollinearity on Completed Model
- Suzuki (2012). Clarifying the use of aggregated exposures in multilevel models. This article discusses the multicollinearity issue between an individual-level variable and its aggregated variable.
- Feaster (2011). Multilevel models to identify contextual effects on individual group member outcomes
- Andrés Gutiérrez. Small Area Estimation 101
- Sun (2015) Analysis of spatial and temporal data
- Spatiotemporal analysis
- Skrondal (2009). Prediction in multilevel generalized linear models
- SAS
- PROC GLIMMIX (pdf v13.1) (SAS Documentation)
- Example 43.18 Weighted Multilevel Model for Survey Data (v13.1)
- Smith (2012). SAS Proc GLIMMIX for spatial analysis
- Kiernan (2012). Tips and Strategies for Mixed Modeling with SAS/STAT® Procedures
- Predictive Modeling Tips | Free Best Practices Guide
- Stata
- Multilevel mixed-effects models
- me: multilevel models to fit random-intercept and random-slope models.
- xt: Random-effects panel-data estimators
- Multilevel linear models in Stata, part 1: Components of variance (xt, Video)
- Multilevel linear models in Stata, part 2: Longitudinal data (xt, Video)
- Multilevel generalized linear models (me, Video)
- User-written program: gllamm
- Huber (2014). How to simulate multilevel/longitudinal data
- Rabe-Hesketh (2008). Prediction in multilevel logistic regression
- Blogs etc.
- Statistics Notes in the BMJ
- Statistical Horizons Blog (Paul Allison)
- Rick Wicklin Blog (SAS)
- R Bloggers (R!)
- The Stata Blog (Stata)
- Bendix Carstensen (R! and Age-Period-Cohort analysis)
- medRxiv.org: the preprint server for health sciences
- arXiv.org: Open access to e-prints in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics
- Allison (2018). For causal analysis of competing risks, don't use Fine & Gray's subdistribution method
- Kevin Markham (2015). Should you teach Python or R for data science?
- Allison (2014). Prediction vs. Causation in Regression Analysis
- Norm Matloff (2014). Why are we still teaching T-tests?
- Rick Wicklin (2014). How to choose colors for maps and heat maps.
- Nathan Yau. How to Visualize and Compare Distribution
- Daniel Lakens: The 20% Statistician
- Estimating the covariance of the means from two samples?
- Jonas Kristoffer Lindelov: Common statistical tests are linear models (Chinese version)
- Rodriguez. Generalized linear models - 7.4 the piecewise exponential model
- Statistical Reflections of a Medical Doctor (2012). Survival Analysis via Hazard Based Modeling and Generalized Linear Models
- How to do bootstrapping/Jackknife using Stata?
- Journal: Significance communicates and demonstrates statistical practice in an entertaining, thought-provoking and non-technical way.
- Journal: Statistical Science has some good philosophical articles and interviews with experts.
- Journal: Survey Statistician
- Public-accessible Health-related Datasets
- Google Public Data Explorer: The Google Public Data Explorer makes large datasets easy to explore, visualize and communicate. As the charts and maps animate over time, the changes in the world become easier to understand. You don't have to be a data expert to navigate between different views, make your own comparisons, and share your findings.
- How to Estimate Percentiles and Confidence Intervals
- Timeline of Statistics (pdf)
- Distinguished Lectures of The Joint Program in Survey Methodology, University of Maryland
- Sorensen: The Use and Misuse of the Coefficient of Variation in Organizational Demography Research
- Predict a value and estimate the variance of a single response instead of average response, individual vs. marginal, standard error of prediction (Engineering Statistics Handbook, SAS JMP, StackExchange, R-bloggers, Martha Smith) marginal distribution (Statistics How To, Jason Brownlee, StackExchange, Khan Academy)
- UCLA: Delta Method in R!
- Video Clip/Webinar
- Hans Rosling shows the best stats you've ever seen
- David McCandless: The beauty of data visualization
- Demo: Stunning data visualization in the AlloSphere
- US Department of Veterans Affairs: Cyberseminar of HSR&D (Health Services Research & Development)
- Science Webinar (2019): Selling without selling out: How to communicate your science
- ESRI: Spatial Statistics Presentations
- SAS: A hands-on introduction to SAS data step hash programming techniques
- Math
- Why is the limit (1−1/n)^n equal to 1/e?
- Limit of (1+x/n)^n when n tends to infinity
- L’Hôpital’s Rule is a powerful technique for finding the limit of an indeterminate form 0/0 or ∞/∞: differentiate the numerator and the denominator separately, then take the limit of the resulting quotient.
- Alder (2001): An introduction to mathematical modelling
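The two limits listed under Math above follow from L’Hôpital’s Rule. A short derivation, written here as a sketch rather than taken from the linked pages, substitutes t = 1/n to obtain a 0/0 form:

```latex
\begin{aligned}
\ln L &= \lim_{n\to\infty} n\,\ln\!\left(1+\frac{x}{n}\right)
       = \lim_{t\to 0^{+}} \frac{\ln(1+xt)}{t}
       \qquad (t = 1/n,\ \text{form } 0/0)\\
      &= \lim_{t\to 0^{+}} \frac{x/(1+xt)}{1} = x,\\
L &= \lim_{n\to\infty}\left(1+\frac{x}{n}\right)^{n} = e^{x},
\qquad\text{so } x=-1 \text{ gives }
\lim_{n\to\infty}\left(1-\frac{1}{n}\right)^{n} = \frac{1}{e}.
\end{aligned}
```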