Friday, December 30, 2011

Thoughts: Hazards of dependence on surrogate endpoints


HK: Professor Yudkin et al. (The idolatry of the surrogate, 2011) offer a very important insight into the use of surrogate endpoints in medical science. They argue that this practice, so widespread especially in diabetes care, is tantamount to losing sight of the patient's interest. It is a very readable piece and a challenge to much of our work in public health and chronic-disease epidemiology. It adds to the growing chorus of criticism directed at the "guidelines" we (used to) cherish.

YC: I like reading this kind of article. We fool ourselves only when we think a surrogate is the real (or only) cause of the disease. I agree that hard events are the most important outcome for evaluating a treatment. However, the authors seem to ignore the whole arc of scientific development, which runs from the short range (a quick surrogate) to the long range (a hard event). If a drug cannot even control a meaningful surrogate such as glucose or blood pressure, I do not think that drug will be good for controlling the hard events beyond the surrogate.


BTW, the example of rosiglitazone demonstrates exactly that scientists and physicians (maybe I should say good scientists ^_^) did not forget the target beyond glucose or stop digging up more evidence. Anyhow, this is a heads-up for someone.



Thursday, December 29, 2011

The Fat Trap - NYTimes.com

The Fat Trap
Source: NYTimes.com
For 15 years, Joseph Proietto has been helping people lose weight. When these obese patients arrive at his weight-loss clinic in Australia, they are determined to slim down. And most of the time, he says, they do just that, sticking to the clinic's program and dropping excess pounds. But then, almost without exception, the weight begins to creep back. In a matter of months or years, the entire effort has come undone, and the patient is fat again. "It has always seemed strange to me," says Proietto, who is a physician at the University of Melbourne. "These are people who are very motivated to lose weight, who achieve weight loss most of the time without too much trouble and yet, inevitably, gradually, they regain the weight." ...
Full text: here

Wednesday, December 28, 2011

Obesity

Source: TheLancet.com Published August 26, 2011
"This four-part Series critically examines what we know about the global obesity pandemic: its drivers, its economic and health burden, the physiology behind weight control and maintenance, and what science tells us about the kind of actions that are needed to change our obesogenic environment and reverse the current tsunami of risk factors for chronic diseases in future generations."


"The first paper looks at the global drivers of the epidemic; the second paper analyses obesity trends in the USA and UK, and their impact on prevalence of diseases and healthcare spending. The third paper introduces a new web-based bodyweight simulation model, that incorporates metabolic adaptations that occur with decreasing bodyweight; and the final paper assesses the interventions needed to halt and reverse the epidemic. Its authors conclude that the changes needed are likely to require many sustained interventions at several levels, but that national governments should take the lead. "
Full Text: Here

Thursday, December 22, 2011

How to get the CPS for the NHANES?

Tips - NHANES: Where and how to get the CPS population for the NHANES?

The Current Population Survey (CPS) is a monthly survey of about 50,000 households conducted by the Bureau of the Census for the Bureau of Labor Statistics. The survey has been conducted for more than 50 years.

The NCHS of CDC used the civilian noninstitutionalized U.S. population counts from the CPS at a specific time point for post-stratification, to match the population control totals for each sampling subdomain; the post-stratification structure is usually defined by age, sex, and race/ethnicity. For NHANES III, the structure has 12 age groups, 2 sex groups, and 4 race/ethnicity groups (Non-Hispanic white, Non-Hispanic black, Mexican American, and Other), which means there are 96 cells. You can find the response rates and CPS population for different surveys here, or you can derive it from the original NHANES demographic data using the same age, sex, and race/ethnicity structure and the interview weights [for example, in Stata: .svy: tab agesexracegrp, count obs format(%12.0f)]. Theoretically, to a picky epidemiologist/statistician, the CPS population information is important for getting correct national total estimates, dealing with missing data (reweighting), or doing bootstrap analysis of complex survey data.
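The post-stratification adjustment itself is simple arithmetic: within each age-sex-race/ethnicity cell, the sampling weights are rescaled so that they sum to that cell's CPS control total. A minimal Python sketch (the cell labels and totals below are made up for illustration, not actual CPS figures):

```python
# Post-stratification: rescale weights within each cell so that the
# weighted total matches the external (CPS) control total for the cell.
# Cell labels and numbers are hypothetical.

def poststratify(weights, cells, control_totals):
    """Return adjusted weights: w_i * (control_total / weighted_cell_total)."""
    cell_sums = {}
    for w, c in zip(weights, cells):
        cell_sums[c] = cell_sums.get(c, 0.0) + w
    return [w * control_totals[c] / cell_sums[c] for w, c in zip(weights, cells)]

weights = [100.0, 150.0, 200.0, 250.0]           # base sampling weights
cells   = ["M_20-39", "M_20-39", "F_20-39", "F_20-39"]
control = {"M_20-39": 500.0, "F_20-39": 900.0}   # hypothetical CPS totals

adj = poststratify(weights, cells, control)
# After adjustment, the weighted total in each cell equals its control total.
```

With real NHANES data the same rescaling would run over all 96 cells, and the same cell structure would be reused when reweighting for nonresponse.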

Meta-Analysis for Linear and Nonlinear Dose-Response Relations: Examples, an Evaluation of Approximations, and Software

Two methods for point and interval estimation of relative risk for log-linear exposure-response relations in meta-analyses of published ordinal categorical exposure-response data have been proposed by Nicola Orsini (Stata and SAS code), the author of the Stata ado-file -glst-.

Tuesday, December 20, 2011

Circulation's Diabetes Mellitus Studies

Circulation's Diabetes Mellitus Studies 2009 - 2010
"The following articles are being highlighted as part of Circulation's Topic Review series. This series will summarize the most important manuscripts, as selected by the editors, published in Circulation and the Circulation subspecialty journals. The studies included in this article represent the articles related to diabetes mellitus that were published in Circulation in 2009 and 2010. ..."
Full text: here

Monday, December 19, 2011

Piece-wise Regression

Tips: Piecewise/Segmented Regression Related
  • Ryan SE, Porth LS (2007). A tutorial on the piecewise regression approach applied to bedload transport data (pdf). A very good tutorial article from the U.S. Forest Service.
  • Nonlinear relationships
    • additivity vs. non-additivity, linearity vs. non-linearity.
    • a few types of non-linearity modeling: polynomial models, exponential models, piecewise regression models 
  • Example (Stata vs. SAS): Suppose we are looking at the relation of AGE and BMI, and visually there is a change point (break point) at around age 65. We may fit two regressions: BMI = a1 + b1*AGE for persons with AGE < 65, and BMI = a2 + b2*AGE for persons with AGE >= 65. To make the regression continuous at the break point, a1 + b1*65 = a2 + b2*65, so a2 = a1 + 65*(b1 - b2).
    • Stata: .nl (BMI = cond(AGE < {k}, {a1} + {b1}*AGE, {a1} + {k}*({b1} - {b2}) + {b2}*AGE)), initial(a1 1 b1 1 b2 1 k 60) // here k = break point of age, {} = names of the parameters to be estimated.
    • SAS: PROC NLIN; PARMS a1=1 b1=1 b2=1 k=60; IF AGE < k THEN DO; MODEL BMI=a1 + b1*AGE; END; ELSE IF AGE >= k THEN DO; MODEL BMI=a1 + k*(b1 - b2) + b2*AGE; END; RUN;
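The blog's examples are in Stata and SAS; the same continuous piecewise model can be sketched in plain Python/numpy by profiling the break point k over a grid and fitting the rest by least squares. The reparameterization BMI = a1 + b1*AGE + (b2 - b1)*max(AGE - k, 0) is algebraically the same continuity constraint as above (simulated data, hypothetical parameter values):

```python
import numpy as np

def fit_piecewise(x, y, k_grid):
    """Continuous piecewise-linear fit y = a1 + b1*x + (b2-b1)*max(x-k, 0),
    profiling the break point k over k_grid by least squares."""
    best = None
    for k in k_grid:
        X = np.column_stack([np.ones_like(x), x, np.maximum(x - k, 0.0)])
        beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
        rss = np.sum((y - X @ beta) ** 2)
        if best is None or rss < best[0]:
            a1, b1, d = beta
            best = (rss, k, a1, b1, b1 + d)   # b2 = b1 + (b2 - b1)
    return best[1:]                            # (k, a1, b1, b2)

# Simulated BMI-by-age data with a true break at age 65
rng = np.random.default_rng(0)
age = rng.uniform(20, 90, 400)
bmi = 22 + 0.08 * age - 0.30 * np.maximum(age - 65, 0) + rng.normal(0, 0.5, 400)

k, a1, b1, b2 = fit_piecewise(age, bmi, np.arange(40, 81))
# k should land near 65, b1 near 0.08, b2 near 0.08 - 0.30 = -0.22
```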


      Friday, December 16, 2011

      Doing bootstrap/jackknife in Stata


      Doing Repeated Replication Methods (Bootstrap/Jackknife) for complex survey data
      • Bootstrap sampling and estimation, Survey data analysis in Stata
      • Starting from Stata 10, you can just use -svy jackknife- instead of creating jackknife weight at first.
        • By using information on PSUs and strata, -svy jackknife:- will automatically adjust the sampling weights to produce the replicates using the delete-1 jackknife method
        • If you want to do a delete-k jackknife, you need to provide the replicate-weight variables using -svyset-.
        • By default, the jackknife variance is computed using deviations of the replicates from their mean. If you want the variance computed from deviations from the observed value of the statistic based on the entire dataset, you need to use the -svy jackknife, mse- (mean squared error) option. The -mse- method provides a larger variance estimate because of the addition of the familiar squared-bias term in the mean squared error.
      • Stata 14 has "strata()" and "cluster()" options that sound like they are for complex survey data, but seemingly they cannot handle the sampling weights correctly.
      • Jackknife for simple random sampling data:
        • jackknife r(mean): summarize mpg
      • Jackknife for complex survey data
        • webuse nhanes2, replace
        • svyset psu [pw=finalwgt], strata(strata)
        • svy jackknife slope=_b[height] constant=_b[_cons]: regress weight height
      • Jackknife for complex survey data using a user-written program:
        • svy jackknife _b[, options]: intcens
      • User-written program using Jackknife for complex survey data.
        • Notes: user-written programs are allowed with -svy jackknife- as long as they follow standard Stata -syntax-, allow the -if- qualifier, and allow -pweights- and -iweights-.
        • Notes: "anything", "namelist", "name", "weight", "if", "in", "varlist", "using", "exp", etc. are special macros. "anything" is used to tell the -syntax- command what can appear immediately after the name of the command; "anything" could be anything, like SILLY in this example, which passes an argument into the program. Things between "[" and "]" are optional. In order to use -margins-, -set buildfvinfo- needs to be set -on-.
      • Here is a modified example in the manual: with replication-based variance estimators
        • program mymargins, eclass
            syntax anything [if] [iw pw]
            if "`weight'" != "" {
              local wgtexp "[`weight' `exp']"
            }
            set buildfvinfo on
            `anything' `if' `wgtexp'
            margins race, post
          end
          global myanything "logistic highbp height weight i.race c.age##c.age" //!!! using `anything' with caution !!!
          svy jackknife _b: mymargins $myanything
      • UCLA: How do I write my own bootstrap program?
      • SSCC: Bootstrapping in Stata
      • Stata Journal(2003): Bootstrapped standard errors
      • Schmidheiny(2016): The Bootstrap
      • How can I analyze multiple mediators in Stata? 
      • Stata Journal(2004): From the help desk: Some bootstrapping techniques
      • 'svr' is a module/package to compute estimates with survey-replication (SVR) based standard errors, written by Nick Winter. -survwgt-, one of the commands in 'svr', creates sets of replicate weights for complex sampling data, including balanced repeated replication (BRR) and several versions of the survey jackknife (JK*). In addition, -survwgt- performs poststratification, raking, and non-response adjustments to survey weights. Starting from Stata 10, you can just use -svy jackknife- instead of creating jackknife weights first.
      • The delete-1 jackknife estimate of the standard error is sqrt[ ((n-1)/n) * sum_i (theta_(i) - theta_bar)^2 ], where n is the total number of observations (or clusters), theta_(i) is the estimate with observation i deleted, and theta_bar is the mean of the theta_(i). The factor (n-1)/n in the jackknife standard error is about n times larger (an inflation factor) than the bootstrap's corresponding factor 1/(B-1).
      • The delete-d jackknife estimate of the standard error uses (n-d)/(d*C(n,d)) instead of (n-1)/n, with the sum taken over all C(n,d) delete-d subsets.
      • Efron B (1981). Nonparametric Estimates of Standard Error: The Jackknife, the Bootstrap and Other Methods
      • McIntosh."The Jackknife Estimation Method"
      • UCLA:How can I sample from a dataset with frequency weights?
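As a sanity check on the delete-1 jackknife, here is a small pure-Python sketch; for the sample mean, the delete-1 jackknife standard error reduces exactly to the familiar s/sqrt(n):

```python
import math

def jackknife_se(data, stat):
    """Delete-1 jackknife SE: sqrt( (n-1)/n * sum_i (theta_(i) - theta_bar)^2 )."""
    n = len(data)
    leave_one_out = [stat(data[:i] + data[i+1:]) for i in range(n)]
    theta_bar = sum(leave_one_out) / n
    ss = sum((t - theta_bar) ** 2 for t in leave_one_out)
    return math.sqrt((n - 1) / n * ss)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean = lambda xs: sum(xs) / len(xs)
se_jack = jackknife_se(data, mean)

# For the mean, this matches the textbook SE: s / sqrt(n)
n = len(data)
s2 = sum((x - mean(data)) ** 2 for x in data) / (n - 1)
se_classic = math.sqrt(s2 / n)
```

For statistics other than the mean the two no longer coincide, which is exactly when the jackknife earns its keep.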

      Journal Article - Heart Disease and Stroke Statistics—2012 Update from the AHA


      "Each year, the American Heart Association (AHA), in conjunction with the Centers for Disease Control and Prevention, the National Institutes of Health, and other government agencies, brings together the most up-to-date statistics on heart disease, stroke, other vascular diseases, and their risk factors and presents them in its Heart Disease and Stroke Statistical Update. The Statistical Update is a valuable resource for researchers, clinicians, healthcare policy makers, media professionals, the lay public, and many others who seek the best national data available on disease morbidity and mortality and the risks, quality of care, medical procedures and operations, and costs associated with the management of these diseases in a single document." … 

      Full text: here (pdf)

      Wednesday, December 14, 2011

      Is a Chow test the correct test to determine whether data can be pooled together?

      Source: Stata FAQs by William Gould

      A Chow test is simply a test of whether the coefficients estimated over one group of the data are equal to the coefficients estimated over another, and you would be better off to forget the word Chow and remember that definition.

      History: In the days when statistical packages were not as sophisticated as they are now, testing whether coefficients were equal was not so easy. You had to write your own program, typically in FORTRAN. Chow showed a way you could perform this test based on statistics that were commonly reported, and it would produce the same result as the Wald test.

      Full text: here
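In the spirit of the FAQ's definition, a Chow test can be computed directly as an F test comparing the pooled regression with separate per-group regressions. A small numpy sketch on simulated data (the two groups are given deliberately different slopes, so the test should reject):

```python
import numpy as np

def chow_test(x1, y1, x2, y2):
    """F statistic testing that intercept and slope are equal across
    two groups (simple linear regression, k = 2 parameters per group)."""
    def rss(x, y):
        X = np.column_stack([np.ones_like(x), x])
        beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
        return np.sum((y - X @ beta) ** 2)

    k = 2
    rss_pooled = rss(np.concatenate([x1, x2]), np.concatenate([y1, y2]))
    rss_sep = rss(x1, y1) + rss(x2, y2)   # fit each group on its own
    df2 = len(x1) + len(x2) - 2 * k
    return ((rss_pooled - rss_sep) / k) / (rss_sep / df2)

rng = np.random.default_rng(1)
x1, x2 = rng.uniform(0, 1, 60), rng.uniform(0, 1, 60)
y1 = 1.0 + 1.0 * x1 + rng.normal(0, 0.1, 60)   # slope 1
y2 = 1.0 + 3.0 * x2 + rng.normal(0, 0.1, 60)   # slope 3
F = chow_test(x1, y1, x2, y2)                   # large F: coefficients differ
```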

      Other related articles about the Chow test:
      How can I compute the Chow test statistic? http://www.stata.com/support/faqs/stat/chow.html

      Role of Environmental Chemicals in Diabetes 1 and Obesity

      Tips - Stata: How do I fit a linear regression with interval (inequality) constraints in Stata?

      Source: Stata FAQs by Isabel Canette

      If you need to fit a linear model with linear constraints, you can use the Stata command cnsreg. If you need to fit a nonlinear model with interval constraints, you can use the -ml- command, as explained at http://www.stata.com/support/faqs/stat/intconst.html. However, if you have a linear regression, the simplest way to include these kinds of constraints is by using the -nl- command.
       

       Full text: here
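Outside Stata, the same idea of interval (box) constraints on regression coefficients can be sketched with a simple projected-gradient least squares in Python; the data and bounds below are arbitrary illustrations:

```python
import numpy as np

def box_constrained_ols(X, y, lo, hi, iters=20000):
    """Minimize ||X b - y||^2 subject to lo <= b <= hi,
    by gradient descent with projection onto the box."""
    # A safe step size: 1 / (largest eigenvalue of 2 X'X)
    lr = 1.0 / (2 * np.linalg.eigvalsh(X.T @ X).max())
    b = np.clip(np.zeros(X.shape[1]), lo, hi)
    for _ in range(iters):
        grad = 2 * X.T @ (X @ b - y)
        b = np.clip(b - lr * grad, lo, hi)   # project back into the box
    return b

rng = np.random.default_rng(2)
x = rng.normal(0, 1, 200)
y = 0.5 + 2.0 * x + rng.normal(0, 0.2, 200)   # true slope 2.0
X = np.column_stack([np.ones_like(x), x])
lo = np.array([-10.0, -10.0])
hi = np.array([10.0, 1.5])                     # cap the slope at 1.5
b = box_constrained_ols(X, y, lo, hi)
# The unconstrained slope (~2.0) violates the bound, so the
# constrained estimate is pinned at the boundary, 1.5.
```

This mirrors what -nl- does with a bounded reparameterization, just in a more pedestrian way.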


      Monday, December 12, 2011

      Tips - Stata: a few useful ado-related commands

      Tips - Stata: the handy Stata command/function
      • 'statsby', 'tabstat', 'scalar themean=r(mean)'.
      • 'contrast' (how to get orthogonal polynomial coefficients), 'pwmean', 'pwcompare', and 'margins'.
        • The p. and q. operators of 'contrast' (orthogonal polynomials) allow you to partition the effects of a factor variable into linear, quadratic, cubic, and higher-order polynomial components (I like to use p.; q. assumes equal spacing between the groups). They are only meaningful with factor variables that have a natural ordering in the levels. For example: .contrast p(2 3 4).bmigrp, noeffects
        • User-defined contrast of race (3 levels) and age (2 levels) without comparing the middle race group: .contrast {race#age -1 -1 0 0 1 1}
      • 'destring', 'tostring', 'string()': converting between numeric variables and string/character variables.
      • 'duplicates': report, tag, or drop duplicate observations.
      • 'postfile' posts results in Stata dataset.
      • ... [Contents of Stata Help]

      Friday, December 02, 2011

      Tips - Stata: outputting/exporting Stata results


      Tuesday, November 15, 2011

      Tips - Stata: How to test significance of interactions of categorical variables


      We have the model
         . webuse fvex
         . regress y i.sex##i.group age

      We can test the overall significance of the sex#group interaction by typing
         . contrast sex#group
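The -contrast sex#group- command performs a joint test that all sex#group interaction coefficients are zero. The same idea can be sketched in Python with plain numpy as an F test comparing the model with and without the interaction dummies (simulated data with an interaction built in):

```python
import numpy as np

def f_test_nested(X_full, X_reduced, y):
    """F statistic for the extra columns of X_full versus X_reduced."""
    def rss(X):
        beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
        return np.sum((y - X @ beta) ** 2)
    q = X_full.shape[1] - X_reduced.shape[1]    # number of restrictions tested
    df2 = len(y) - X_full.shape[1]
    return ((rss(X_reduced) - rss(X_full)) / q) / (rss(X_full) / df2)

rng = np.random.default_rng(3)
n = 300
sex = rng.integers(0, 2, n).astype(float)       # 2 levels
group = rng.integers(0, 3, n)                   # 3 levels
g1, g2 = (group == 1).astype(float), (group == 2).astype(float)
age = rng.uniform(20, 60, n)

# Outcome with a sex#group interaction (extra effect only in group 2)
y = 1 + 0.5*sex + 0.3*g1 + 0.2*g2 + 0.01*age + 1.0*sex*g2 + rng.normal(0, 0.3, n)

X_red  = np.column_stack([np.ones(n), sex, g1, g2, age])
X_full = np.column_stack([X_red, sex*g1, sex*g2])
F = f_test_nested(X_full, X_red, y)   # large F: the interaction matters
```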

      Monday, November 14, 2011

      Thoughts: Education is The Key to Reduce Health Disparities



      This is great. Even more, I would like to see teachers team up to reduce disparities. My Mom's experience tells me that the key to reducing poverty is education, education, and still education. My Mom was an elementary-school teacher in China. Sixty years ago, she went to a remote, poor countryside to teach not just the kids but also their parents how to read. Now that area is one of the richest counties in China. My Mom is so proud of what she did. By the way, my 80-year-old Mom still goes to a senior college as a student to have fun and meet friends.

      Subject: Lawyers, Doctors Team Up To Reduce Health Disparities
      California Watch: Lawyers, Doctors Team Up To Reduce Health Disparities
      On Kate Marr's first day practicing law at The Children's Clinic in Long Beach last week, she met with the mother of an asthmatic 7-year-old. ... The Long Beach program is the latest effort by community clinics and hospitals across the country to add lawyers to their medical teams as a way to resolve issues associated with the "social determinants of health," such as housing, domestic violence and poverty (Yeung, 11/10).

      Thursday, November 10, 2011

      Tips - Stata: How to generate composite categorical variables and indicator/dummy variables

      Tips - Stata: How to generate composite categorical variables and indicator/dummy variables, and convert a continuous variable into a categorical variable

      1. to generate composite categorical variables:
         .egen compvar=group(var1 var2 var3), label
         Now, the dataset has a categorical variable with the different combinations of three variables: var1, var2, and var3.

      2. to generate indicator variables:
         .tabulate compvar, generate(compvar)
         Now the dataset has multiple indicator variables with the prefix 'compvar'.


      3. to convert a continuous variable into a categorical variable
         .gen agecat = recode(age, 24,29,34,39,44,49,54,59,64, ///
              69, 74, 79, 90) if !missing(age)
         or
         .gen age13grp = 1+irecode(age,24,29,34,39,44,49,54,59,64, ///
              69, 74, 79, 90) if !missing(age)
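For readers outside Stata: irecode(x, x1, ..., xn) returns the number of cutpoints below x, treating each cutpoint as the upper bound of its interval. Assuming that behavior, numpy.digitize with right=True gives the same binning:

```python
import numpy as np

# Cutpoints as in the blog's example: 24, 29, ..., 79, 90
cuts = np.array([24, 29, 34, 39, 44, 49, 54, 59, 64, 69, 74, 79, 90])

ages = np.array([18, 24, 25, 67, 95])
# right=True makes each cutpoint the upper bound of its interval,
# matching irecode(): 0 if age <= 24, 1 if 24 < age <= 29, and so on.
grp = np.digitize(ages, cuts, right=True)
# grp -> [0, 0, 1, 9, 13]; add 1 to get 1-based groups like age13grp
```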

      Tuesday, November 08, 2011

      Tips - Stata: how to do 'lincom' after a three-way 'margins'?

      The easiest way to determine how to refer to the margins is to type
         .margins, coeflegend
      after the -margins- command. This will list the estimated margins as well as the _b[] notation that can be used to refer to them. Here is an example:
         .webuse nhanes2, clear
         .svy, subpop(female): logit highbp i.race##i.diabetes##c.age weight
         .margins race, over(diabetes) at(age = (30(5)50)) subpop(female) ///
                  vce(svy) post
         .margins, coeflegend

      This tells us that if we want to compute the difference in the margins for age=30, diabetes=0, race=1 and age=35, diabetes=0, race=1, we could type
         .lincom _b[1bn._at#0bn.diabetes#1bn.race] - ///
                 _b[2._at#0bn.diabetes#1bn.race]
      -- by a senior statistician of Stata

      Thursday, November 03, 2011

      Nice gadgets (freeware) from Microsoft


      Image Composite Editor (ICE)
      ICE is a simple yet advanced panoramic image stitcher.
      Mathematics
      This is an amazingly powerful calculator with many functions and graphing options. It also has a unit-conversion tool.
      You may also download and install the Mathematics Add-In for Word and OneNote here. With it you can easily plot graphs in 2D and 3D, calculate numerical results, solve equations or inequalities, and simplify algebraic expressions in your Word documents and OneNote notebooks.
      You may find more information for teachers/education here.
      Sysinternals Suite
      Sysinternals Suite, a troubleshooting utility package, was acquired by Microsoft a few years ago.

      Final Data Collection Standards for Race, Ethnicity, Primary Language, Sex, and Disability Status Required by Section 4302 of the Affordable Care Act


      HHS on Oct. 31, 2011, published final standards for data collection on race, ethnicity, sex, primary language and disability status, as required by Section 4302 of the Affordable Care Act [PDF | 1.6 MB].
      The law requires that data collection standards for these measures be used, to the extent practicable, in all national population health surveys. They will apply to self-reported information only. The law also requires that any data standards published by HHS comply with standards created by the Office of Management and Budget (OMB).

      Proposed standards were published on June 29, 2011, and public comments were accepted until August 1, 2011.

      The standards, effective upon publication today, apply to population health surveys sponsored by HHS, where respondents either self-report information or a knowledgeable person responds for all members of a household. HHS will begin implementation of these new data standards in all new surveys and at the time of major revisions to current surveys.

      Tuesday, October 25, 2011

      Can 'margins' be used after 'mi estimate'?


      The -margins- command may not be used in the usual way after -mi estimate-. You'll need to write a short "wrapper command" that can be run with the -mi estimate- prefix. My colleagues outlined the method on Statalist: Average marginal effects for a multiply imputed complex survey. You'll want to change the 11 in the -version 11- statement to 12. One of our FAQs will also be helpful: How can I combine results other than coefficients in e(b) with multiply imputed data? - by A Statistician, Stata


      Here is my modified program based on codes of UCLA Stata Portal

      use http://www.ats.ucla.edu/stat/data/hsbmar, clear
      /* set MI dataset */
      mi set mlong
      mi register imputed female math read science socst
      mi svyset [pw=write], strata(ses)
      mi impute chained (logit) female (regress) math read science socst ///
             = ses write awards, add(10) rseed(123456)
      /* program */
      capture program drop mimargins
      program mimargins, eclass properties(mi)
        version 12
        svy: logit honors i.female##i.prog read math science socst
        margins female#prog, post
      end
      /* run the program */ 
      mi estimate, cmdok: mimargins 1
      matlist r(table)'*100, tw(20) format(%8.2f)
      mi estimate (_b[1.female#2.prog]/_b[1.female#1bn.prog]): mimargins 1

      Notes: The 'cmdok' option forces -mi estimate- to run a command it does not officially support. Here 'cmdok' is not strictly necessary, because the program already declares 'properties(mi)'.

      mi estimate: svy: logit honors i.female##i.prog read math science socst
      mi test 1.female#2.prog 1.female#3.prog

      mi estimate (diff:1.female#2.prog-1.female#3.prog), saving(miest, replace): svy: logit honors i.female##i.prog read math science socst
      mi testtransform diff

      mi estimate (rdiff:1.female#2.prog/1.female#3.prog - 1) using miest
      mi testtr rdiff


      Useful Resources: