Monday, October 24, 2011

Abramowitz and Stegun: Handbook of Mathematical Functions



“The present volume is an outgrowth of a Conference on Mathematical Tables held at Cambridge, Mass., on September 15-16, 1954, under the auspices of the National Science Foundation and the Massachusetts Institute of Technology. The purpose of the meeting was to evaluate the need for mathematical tables in the light of the availability of large scale computing machines. It was the consensus of opinion that in spite of the increasing use of the new machines the basic need for tables would continue to exist.


Numerical tables of mathematical functions are in continual demand by scientists and engineers. A greater variety of functions and higher accuracy of tabulation are now required as a result of scientific advances and, especially, of the increasing use of automatic computers. In the latter connection, the tables serve mainly for preliminary surveys of problems before programming for machine operation. For those without easy access to machines, such tables are, of course, indispensable...”


You can view or download the tenth printing of this famous reference here: http://people.math.sfu.ca/~cbm/aands/

Wednesday, October 19, 2011

How to Estimate Percentiles and Confidence Intervals

By CDC

Including percentiles whose estimate falls on a value that is repeated multiple times in the dataset
A common practice to calculate confidence intervals from survey data is to use large-sample normal approximations. Ninety-five percent confidence intervals on point estimates of percentiles are often computed by adding and subtracting from the point estimate a quantity equal to twice its standard error. This normal approximation method may not be adequate, however, when estimating the proportion of subjects above or below a selected value, especially when the proportion is near 0.0 or 1.0 or when the effective sample size is small. In addition, confidence intervals on proportions deviating from 0.5 are not theoretically expected to be symmetric around the point estimate. Further, adding and subtracting a multiple of the standard error to an estimate near 0.0 or 1.0 can lead to impossible confidence limits (i.e., proportion estimates below 0.0 or above 1.0). The approach used for the Report data tables (and for previous Reports) produces asymmetric confidence intervals consistent with skewed (nonnormal) biologic data distributions. ...

You can read the whole article here: http://www.cdc.gov/exposurereport/data_tables/appendix_a.html
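
To make the point about impossible limits concrete, here is a minimal Python sketch comparing the normal-approximation (Wald) interval with the Wilson score interval for a proportion near 0. The Wilson interval is only a convenient stand-in to show an asymmetric interval that stays inside [0, 1]; it is not claimed to be the method used for the Report tables. It also ignores survey design effects. The function names and the example counts (2 out of 50) are invented for illustration.

```python
import math

Z95 = 1.96  # approximate two-sided 95% quantile of the standard normal


def wald_interval(successes, n, z=Z95):
    """Normal-approximation interval: estimate +/- z * standard error."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def wilson_interval(successes, n, z=Z95):
    """Wilson score interval (illustrative alternative, not the CDC method)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical data: 2 subjects out of 50 above the selected value
wald = wald_interval(2, 50)
wilson = wilson_interval(2, 50)
print("Wald:   %.4f to %.4f" % wald)
print("Wilson: %.4f to %.4f" % wilson)
```

With these made-up counts the Wald lower limit falls below zero, while the Wilson limits stay between 0 and 1 and sit unevenly around the point estimate of 0.04.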

Monday, October 17, 2011

Recipes and Meal Planning from the American Diabetes Association


If you are looking for healthy recipes to help you lose weight or to prevent or manage diabetes, you may like MyFoodAdvisor – Recipes for Healthy Living from the American Diabetes Association. You must register to access these recipes, which I don’t like, but registration is FREE.

Wednesday, October 12, 2011

Top 50 Statistics Blogs of 2011

The Best Colleges has published a long list of statistics-related blogs. I wish they had trimmed the list to 10.

Monday, October 03, 2011


Tips: SAS code matched to Stata code and others
I tend to think of SAS as the analog of a Microsoft product, Stata as the analog of an Apple product, and R as the analog of a Google product. I have used SAS for many years, but some of Stata's newer features have attracted me. I keep looking for Stata equivalents of the handy data management features I rely on in SAS, and I have noticed that a few websites provide these equivalents.

Resources for Disability Research
These materials are highly recommended by my colleague J.C., an expert in disability research.

Monday, September 19, 2011

How to get orthogonal polynomial coefficient/vector/codes

Tips - R & Stata & SAS: How to get orthogonal polynomial coefficient/vector/codes

When we run "contrast {lvl #1 #2 #3}" for trend analyses in Stata or other software with unequally spaced levels/categories, we need the orthogonal polynomial coefficients (#1 #2 #3), which are hard to find in books. We can get these coefficients using R, Stata, or SAS; my favorite for this purpose is R. The examples here use each of the three. Note: Stata also has an operator (p., for orthogonal polynomials in the level values) that handles unequally spaced levels directly, for example, "contrast p.lvl".

R
  I mostly use R to get these coefficients:
  >cntr<-poly(c(1,2,5,6),3)
  >cntr

Stata
Step 1: create a dataset with one variable [lvl]:
  .input lvl
  1. 1
  2. 2
  3. 5
  4. 6
  5. end
Step 2: use 'orthpoly'
  .orthpoly lvl, generate(cntr1 cntr2 cntr3) degree(3)

  Now the dataset contains three new variables: 'cntr1' holds the orthogonal polynomial coefficients of degree 1 (linear), 'cntr2' those of degree 2 (quadratic), and 'cntr3' those of degree 3 (cubic).

SAS
  PROC IML;
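   lvl = {1, 2, 5, 6};     /* column vector; same levels as the R and Stata examples */
   cntrl = ORPOL(lvl, 3);  /* columns hold degrees 0 (constant) through 3 */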
   PRINT cntrl;
  QUIT;
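
For readers curious about what these commands are computing, here is a rough Python sketch that builds the contrasts by Gram-Schmidt orthogonalization of the powers of the level values and then scales each vector to unit length. This is an illustration only: it is not necessarily the algorithm that R, Stata, or SAS use internally, and the sign and scaling of the output may differ from one package to another. The function names are made up for this example.

```python
import math


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def orthogonal_poly_contrasts(levels, degree):
    """Return one unit-length contrast vector for each degree 1..degree."""
    if degree >= len(set(levels)):
        raise ValueError("degree must be less than the number of distinct levels")
    basis = [[1.0] * len(levels)]  # degree 0: the constant vector
    for d in range(1, degree + 1):
        v = [float(x) ** d for x in levels]
        # Remove the components along every lower-degree vector
        for b in basis:
            coef = dot(v, b) / dot(b, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        basis.append([vi / norm for vi in v])
    return basis[1:]  # drop the constant vector


# Same unequally spaced levels as in the R and Stata examples
contrasts = orthogonal_poly_contrasts([1, 2, 5, 6], 3)
for d, c in enumerate(contrasts, start=1):
    print("degree", d, [round(ci, 4) for ci in c])
```

Each returned vector sums to zero and is orthogonal to the others, which is the property the trend contrasts rely on.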

Friday, September 16, 2011

Tips: Stata - my first Stata program

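capture program drop tabm
program tabm
    version 12
    // All options are optional and are passed through to -svy: tabulate-
    syntax varlist(min=2) [if] [in] [, cell count column row se ci ///
        cv percent proportion]
    local varnum : word count `varlist'
    local x : word 1 of `varlist'
    // Cross-tabulate the first variable against each remaining variable
    forvalues i = 2/`varnum' {
        local y : word `i' of `varlist'
        svy: tabulate `x' `y' `if' `in', `cell' `count' `column' `row' ///
            `se' `ci' `cv' `percent' `proportion' format(%5.1f)
    }
end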


.tabm sex race5grp diabetes,cell se percent

Why I get an error message when using 'margins' with complex survey data

Tips - Stata: why I get an error message when using 'margins' with complex survey data

When I use 'margins' on complex survey data after a logistic regression:
  . svy, subpop(if suball==1): logit arthritis i.diabetes c.age i.sex i.bmi4grp
  . margins diabetes, vce(unconditional) post

I sometimes get this error message:
  "missing predicted values encountered within the estimation sample r(322)"


The fix is to repeat the 'subpop' option in the 'margins' command, so that 'margins' uses the same subpopulation as the model:
  .margins diabetes, subpop(if suball==1) vce(unconditional) post

HbA1c: what do the numbers really mean?
The Lancet, Volume 378, Issue 9796, Pages 1068 - 1069, 17 September 2011
The Comment by Shivani Misra and colleagues (April 30, p 1476)1 addresses the topic of changing the way glycated haemoglobin (HbA1c) is reported from the traditional percentage units (used in the Diabetes Control and Complications Trial [DCCT] and UK Prospective Diabetes Study [UKPDS]) to the International Federation of Clinical Chemistry's (IFCC's) mmol/mol units. This is an important communication. Unfortunately, the Comment contains both misleading and erroneous information.
The remark about “variations of between 3% and 14% being reported” is misleading. The paper cited refers to between-laboratory coefficients of variation obtained from old (1996) data, before implementation of method standardisation by the National Glycohemoglobin Standardization Program (NGSP). Virtually all current methods have coefficients of variation of 5% or less, with some less than 2%.2
Moreover, Misra and colleagues advise clinicians not to convert the IFCC mmol/mol results to DCCT-aligned percentage units and claim that “the DCCT-aligned results are now untraceable and cannot be linked… to the original reference measurement, making them effectively meaningless”. This statement is completely incorrect. An established master equation with documented stability, which describes a linear relation between IFCC and NGSP/DCCT units, permits traceability of DCCT results to the IFCC reference system, and allows direct conversion of numbers between the two systems.3 This is vital to allow health-care providers to compare a patient's HbA1c value to the large body of published outcome data that use DCCT-aligned results.
A third miscommunication is “One untimed… blood sample for diagnosis”. The guidelines4 recommend that, in the absence of unequivocal hyperglycaemia (an uncommon finding), HbA1c be confirmed by repeat testing. It is essential for the medical community to understand these changes in HbA1c clearly to avoid negatively affecting care of diabetic patients.
We declare that we have no conflicts of interest.
References
1 Misra S, Hancock M, Meeran K, Dornhorst A, Oliver NS. HbA1c: an old friend in new clothes. Lancet 2011; 377: 1476-1477.
2 College of American Pathologists. GH2-A glycohemoglobin participant summary, 2011. Northfield, IL: CAP, 2011.
3 Geistanger A, Arends S, Berding C, et al. Statistical methods for monitoring the relationship between the IFCC reference measurement procedure for hemoglobin A1c and the designated comparison methods in the United States, Japan, and Sweden. Clin Chem 2008; 54: 1379-1385.
4 International Expert Committee. International Expert Committee report on the role of the A1C assay in the diagnosis of diabetes. Diabetes Care 2009; 32: 1327-1334.
The Lancet, Volume 378, Issue 9796, Pages 1069 - 1070, 17 September 2011
HbA1c: what do the numbers really mean? — Authors' reply
We do not believe that we have misled readers. The stated coefficients of variation refer to figures before the National Glycohemoglobin Standardization Program (NGSP) was implemented and were quoted to illustrate the different coefficients of variation in existence at the time of the Diabetes Control and Complications Trial (DCCT). Furthermore, the next paragraph clearly states that “harmonisation of results to DCCT-based calibrants in the 1990s partly alleviated this variation”. Although effective, the NGSP did not provide a reference measurement system, which has been the underlying driving force behind the International Federation of Clinical Chemistry (IFCC) standardisation.
In quoting “the DCCT-aligned results are now untraceable and cannot be linked… to the original reference measurement, making them effectively meaningless”, Randie Little and David Sacks chose to omit the phrase “through successive calibrations”. This statement referred to the use of DCCT-calibrated analysers, which are not in any way linked to the IFCC reference system. This practice would generate untraceable results. The consensus statement1 clearly indicates that the IFCC reference represents the only valid anchor to standardisation. We acknowledge that the use of the IFCC-NGSP master equation does permit traceability to the IFCC reference system. However, there are some crucial limitations, which underpin our reluctance to encourage physicians to undertake this conversion routinely.
First, although a linear relation exists between the IFCC-standardised and DCCT-aligned results, the latter cannot be considered a “pure” HbA1c measurement.2 Now that a pure HbA1c standard exists, one must question the validity of continuing to report DCCT-aligned results. To suggest that comparisons to outcome data necessitate interconversion is, in our opinion, ill-considered since the master equation can equally convert targets into new units.
Second, the use of the master equation generates further uncertainty in the derived DCCT-aligned values.3 Irrespective of whether this is significant, should the use of an equation to derive values from a reference be considered as robust as a system in which an unbroken chain of calibrations links the reference to the designated comparison method?4
Third, in the UK, DCCT percentage units will cease to be reported from October, 2011. We therefore actively encourage clinicians to familiarise themselves with the new units now. This is a fundamental course of action to avoid confusion later, which would undoubtedly be detrimental to patients' care.
We accept that a single measurement is not proposed; however, Little and Sacks have misunderstood the message being conveyed. Since guidelines5 advise repeat testing of an abnormal result by the same method, a second HbA1c measurement in a patient with an interfering factor will simply duplicate the error. It is important for clinicians to understand the limitations of a test, no matter how many times it is repeated.
References
1 Hanas R, John G. 2010 consensus statement on the worldwide standardization of the hemoglobin A1C measurement. Diabetes Care 2010; 33: 1903-1904.
2 European Association for the Study of Diabetes. Report of the ADA/EASD/IDF Working Group of the HbA1c Assay. London, UK, 20 January 2004. http://www.ifcchba1c.net/files/2004_Diabetologia2004_46_R53_54.pdf. (accessed Aug 3, 2011).
3 Geistanger A, Arends S, Berding C, et al. Statistical methods for monitoring the relationship between the IFCC reference measurement procedure for hemoglobin A1c and the designated comparison methods in the US, Japan and Sweden. Clin Chem 2008; 54: 1379-1385.
4 Joint Committee for Guides in Metrology. International vocabulary of metrology—basic and general concepts and associated terms. 3rd edn. http://www.bipm.org/utils/common/documents/jcgm/JCGM_200_2008.pdf. (accessed Aug 31, 2011).
5 WHO. Use of glycated haemoglobin (HbA1c) in the diagnosis of diabetes mellitus: abbreviated report of a WHO consultation. http://www.who.int/diabetes/publications/report-hba1c_2011.pdf. (accessed Aug 31, 2011).
a Imperial Healthcare NHS Trust, Charing Cross Hospital, London W6 8RF, UK
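
As a side note to the correspondence above, the linear "master equation" both letters refer to can be sketched in a few lines of Python. The coefficients below are the ones commonly published for the IFCC-NGSP relation; they are not given in the letters themselves, so please verify them against the NGSP or IFCC documentation before relying on them. The function names are made up for this example.

```python
# Commonly quoted master equation (assumed here, not stated in the letters):
#   NGSP/DCCT (%) = 0.09148 * IFCC (mmol/mol) + 2.152
SLOPE = 0.09148
INTERCEPT = 2.152


def ifcc_to_ngsp(mmol_per_mol):
    """Convert an IFCC HbA1c value (mmol/mol) to DCCT-aligned percent."""
    return SLOPE * mmol_per_mol + INTERCEPT


def ngsp_to_ifcc(percent):
    """Convert a DCCT-aligned HbA1c percent to IFCC mmol/mol."""
    return (percent - INTERCEPT) / SLOPE


for pct in (6.5, 7.0):
    print(pct, "% is about", round(ngsp_to_ifcc(pct)), "mmol/mol")
```

Because the relation is linear, the same equation converts published targets as well as individual results, which is the point the authors' reply makes about converting targets into the new units.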

Thursday, September 15, 2011


Bariatric Surgery and Obesity and Diabetes – International Journal of Obesity, 09/2011
OVERVIEW
Collaboration between basic science researchers and bariatric surgeons is a win/win proposition: view from the chair
F-S Hould
Int J Obes 2011 35: S3-S6; 10.1038/ijo.2011.140

REVIEWS
Bariatric surgery, adipose tissue and gut microbiota
K Clément
Int J Obes 2011 35: S7-S15; 10.1038/ijo.2011.141

Bariatric surgery for treatment of obesity
S Eldar, H M Heneghan, S A Brethauer and P R Schauer
Int J Obes 2011 35: S16-S21; 10.1038/ijo.2011.142

Diabetes remission after bariatric surgery: is it just the incretins?
B Laferrère
Int J Obes 2011 35: S22-S25; 10.1038/ijo.2011.143

The mechanism of weight loss with laparoscopic adjustable gastric banding: induction of satiety not restriction
P R Burton and W A Brown
Int J Obes 2011 35: S26-S30; 10.1038/ijo.2011.144

Sunday, September 11, 2011

Multiple Imputation (MI)


Wednesday, September 07, 2011

Writing, Speaking, and Reading


Thursday, September 01, 2011

Resampling and Monte Carlo Simulation



Introduction
Generate data from a multivariate normal distribution (MVN)
Stata
SAS
R!

Thursday, August 25, 2011


Tips - Stata: Conference Articles

Survey data analysis in Stata
Jeff Pitblado
StataCorp
In this presentation, I cover how to use Stata for survey data analysis assuming a fixed population. We will begin by reviewing the sampling methods used to collect survey data, and how they affect the estimation of totals, ratios, and regression coefficients. We will then cover the three variance estimators implemented in Stata’s survey estimation commands. Strata with a single sampling unit, certainty sampling units, subpopulation estimation, and poststratification will be also covered in some detail.

Additional information: ca09_pitblado_presentation.pdf
ca09_pitblado_handout.pdf
ca09_pitblado_stata.zip
Graphics tricks for models
Bill Rising
StataCorp
Visualizing interactions and response surfaces can be difficult. In this talk, I will show how to do the former by graphing adjusted means and the latter by showing how to roll together contour plots. I will demonstrate this for both linear and nonlinear models.

Additional information: chi11_rising.pdf
chi11_rising_files.zip
Multiple imputation in Stata
Bill Rising
StataCorp LP
Multiple imputation is a method for trying to retrieve power lost by missing values in a dataset. In this session, I will demonstrate how the suite of mi commands introduced in Stata 11 can be used to impute data, estimate models, and pool results, as well as manage various forms of multiply imputed datasets.

Additional information: rising_sug.pdf
Multiple imputation using Stata’s mi command
Yulia Marchenko
StataCorp
Stata’s mi command can be used to perform multiple-imputation analysis, including imputation, data management, and estimation. mi impute provides a number of univariate and multivariate imputation methods, including multivariate normal (MVN) data augmentation. mi estimate combines the estimation and pooling steps of the multiple-imputation procedure into one easy step. mi also provides an extensive ability to manage multiply imputed data. I give a brief overview of all of mi’s capabilities, with emphasis on mi impute and mi estimate, and I also demonstrate examples of some of mi’s unique data-management features.

Additional information: boston10_marchenko.pdf
Using the margins command to estimate and interpret adjusted predictions and marginal effects
Richard Williams
University of Notre Dame
As Long and Freese show, it can often be helpful to compute predicted and expected values for hypothetical or prototypical cases. Stata 11 introduced new tools (factor variables and the margins command) for making such calculations. These can do many of the things that were previously done by Stata’s own adjust and mfx commands, as well as Long and Freese’s spost9 commands like prvalue. Unfortunately, the complexity of the margins syntax, the daunting 50-page reference manual entry that describes it, and a lack of understanding about what margins offers over older commands may have dissuaded researchers from using it. This paper therefore shows how margins can easily replicate analyses done by older commands. It demonstrates how margins provides a superior means for dealing with interdependent variables (for example, X and X^2; X1, X2, and X1 × X2; multiple dummies created from a single categorical variable), and is also superior for data that are svyset. The paper explains how the new asobserved option works and the substantive reasons for preferring it over the atmeans approach used by older commands. The paper primarily focuses on the computation of adjusted predictions, but also shows how margins has the same advantages for computing marginal effects.

Additional information: chi11_williams.pptx
Estimating partial effects using margins in Stata 11
David Drukker
StataCorp LP
This session introduces the use of the margins command to estimate the partial effects at the mean and the mean of the partial effects. Both the Stata syntax and the underlying statistical methods will be discussed. The presentation will also include some discussion of factor variables.

Additional information: drukker_sug.pdf
Thirty graphical tips Stata users should know
Nicholas J. Cox
Department of Geography, Durham University
Stata’s graphics were completely rewritten for Stata 8, with further key additions in later versions. Its official commands have, as usual, been supplemented by a variety of user-written programs. The resulting variety presents even experienced users with a system that undeniably is large, often appears complicated, and sometimes seems confusing. In this talk, I provide a personal digest of graphics strategy and tactics for Stata users emphasizing details large and small that, in my view, deserve to be known by all.

Additional information: UKSUG10.Cox.zip
An overview of meta-analysis in Stata
A comprehensive range of user-written commands for meta-analysis is available in Stata and documented in detail in the recent book Meta-Analysis in Stata (Sterne, ed., 2009, [Stata Press]). The purpose of this session is to describe these commands, with a focus on recent developments and areas in which further work is needed. We will define systematic reviews and meta-analyses and will introduce the metan command, which is the main Stata meta-analysis command. We will distinguish between meta-analyses of randomized controlled trials and observational studies, and we will discuss the additional complexities inherent in systematic reviews of the latter.

Meta-analyses are often complicated by heterogeneity, variation between the results of different studies beyond that expected due to sampling variation alone. Meta-regression, implemented in the metareg command, can be used to explore reasons for heterogeneity, although its utility in medical research is limited by the modest numbers of studies typically included in meta-analyses and the many possible reasons for heterogeneity. Heterogeneity is a striking feature of meta-analyses of diagnostic-test accuracy studies. We will describe how to use the midas and metandi commands to display and meta-analyse the results of such studies.

Many meta-analysis problems involve combining estimates of more than one quantity: for example, treatment effects on different outcomes or contrasts among more than two groups. Such problems can be tackled using multivariate meta-analysis, implemented in the mvmeta command. We will describe how the model is fit, and when it may be superior to a set of univariate meta-analyses. We will also illustrate its application in a variety of settings.

Additional information: UKSUG10.Sterne.pdf
UKSUG10.White.ppt
UKSUG10.Harbord.pdf
Competing-risks regression in Stata 11
Roberto G. Gutierrez
StataCorp
Competing-risks survival regression provides a useful alternative to Cox regression in the presence of one or more competing risks. For example, say that you are studying the time from initial treatment for cancer to recurrence of cancer in relation to the type of treatment administered and demographic factors. Death is a competing event: The person under treatment may die, impeding the occurrence of the event of interest, recurrence of cancer. Unlike censoring, which merely obstructs you from viewing the event, a competing event prevents the event of interest from occurring altogether. Depending on the scope of your statistical inference, your analysis may need to be adjusted for competing risks.

Stata’s new stcrreg command implements competing-risks regression based on Fine and Gray’s proportional subhazards model. In this talk, I focus on that new command and compare the method of Fine and Gray to a method based on directly modeling cause-specific hazards. Regardless of method, the focus is on estimating the cumulative incidence function (CIF) for the event of interest in the presence of competing events.

Additional information: boston10_gutierrez.pdf
The fourth quarter Stata News came out today. Among other things, it contains an article by Bobby Gutierrez, StataCorp’s Director of Statistics, about competing risks survival analysis. If any of you are like me, conversant in survival analysis but not an expert, I think you will enjoy Bobby’s article. In a mere page and a half, I learned the primary differences between competing risks analysis and the Cox proportional hazards model and why I will sometimes prefer competing risks. Bobby’s article can be read at http://www.stata.com/news/statanews.25.4.pdf.