Wednesday, March 28, 2007

A bug of SAS?

We ran into a problem creating A1c categories from NHANES III data: the multilevel categories and the two-level categories did not agree. I found that the 'ROUND()' function fixes this kind of discrepancy, but the issue kept haunting us. When I looked into it further, I found that SAS did not put 5.2 into the '1' group when I used 'GHP >= 5.2' (see code and output below).
There is no similar problem with NHANES 99-04. The differences between NHANES III and NHANES 99-04 are: 1) the NHANES III dataset is a SAS version 6 dataset; 2) GHP in NHANES III is formatted as F6.1.
     
LIBNAME NHANES3 V6 'Q:\epistat\datasets\NHANES\ORIGINAL\NHANES3\';
DATA N3;
  SET NHANES3.LABNEW (KEEP=GHP);       * This is a SAS v6 dataset;
  IF . LT GHP LT 7777;                 * GHP is in a F6.1 format;
  GHP2=ROUND(GHP,0.1);                 * What is 'ROUND()' doing here?;
  IF GHP >= 5.2 THEN GHPGRP1=1; ELSE GHPGRP1=2;
  IF GHP > 5.19 THEN GHPGRP2=1; ELSE GHPGRP2=2;
  IF GHP2 >= 5.2 THEN GHPGRP3=1; ELSE GHPGRP3=2;
  LABEL GHPGRP1='ORIGINAL GHP VALUE, CUTPOINT 5.2'
        GHPGRP2='ORIGINAL GHP VALUE, CUTPOINT 5.19'
        GHPGRP3='ROUNDED GHP VALUE, CUTPOINT 5.2';
RUN;
PROC FREQ DATA=N3;
  TABLES GHPGRP1 GHPGRP2 GHPGRP3 GHPGRP1*GHPGRP3;
RUN;
============ OUTPUT ============
The FREQ Procedure
              ORIGINAL GHP VALUE, CUTPOINT 5.2
                                    Cumulative    Cumulative
GHPGRP1    Frequency     Percent     Frequency      Percent
------------------------------------------------------------
      1       11535       49.14         11535        49.14
      2       11941       50.86         23476       100.00
 
              ORIGINAL GHP VALUE, CUTPOINT 5.19
                                    Cumulative    Cumulative
GHPGRP2    Frequency     Percent     Frequency      Percent
------------------------------------------------------------
      1       13463       57.35         13463        57.35
      2       10013       42.65         23476       100.00
 
               ROUNDED GHP VALUE, CUTPOINT 5.2
                                    Cumulative    Cumulative
GHPGRP3    Frequency     Percent     Frequency      Percent
------------------------------------------------------------
      1       13463       57.35         13463        57.35
      2       10013       42.65         23476       100.00
 
Table of GHPGRP1 by GHPGRP3
GHPGRP1(ORIGINAL GHP VALUE, CUTPOINT 5.2)
          GHPGRP3(ROUNDED GHP VALUE, CUTPOINT 5.2)
Frequency|
Percent  |
Row Pct  |
Col Pct  |       1|       2|  Total
---------+--------+--------+
       1 |  11535 |      0 |  11535
         |  49.14 |   0.00 |  49.14
         | 100.00 |   0.00 |
         |  85.68 |   0.00 |
---------+--------+--------+
       2 |   1928 |  10013 |  11941
         |   8.21 |  42.65 |  50.86
         |  16.15 |  83.85 |
         |  14.32 | 100.00 |
---------+--------+--------+
Total       13463    10013    23476
            57.35    42.65   100.00
=========================================
Thank you all. J and I discussed this underlying issue yesterday as well. I could not find any GHP value that is exactly 5.2. I don't think NCHS entered the GHP values this way.
=========================================
I tried changing the following statement, and GHPGRP2 was assigned a value of 2 in record 39. Removing one of the decimal places resulted in a value of 1.
IF GHP >= 5.1999999999999999 THEN GHPGRP2=1; ELSE GHPGRP2=2;
I tried changing the format and unformatting GHP, but I could only get it to display 5.2 in record 39. Apparently, the value is not exactly 5.2, but the difference is too small to be displayed.
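One way to see a difference that ordinary display formats hide is to print the stored bits with the HEX16. format, which shows the floating-point representation of a numeric value. The following is a minimal sketch, not the code used in this thread: X, Y, FLAG_RAW and FLAG_ROUND are made-up names, and Y = 5.2 - 1E-15 is only an assumed stand-in for a value that sits just below 5.2.

```sas
DATA _NULL_;
  X = 5.2;                               * the literal 5.2 as SAS stores it;
  Y = 5.2 - 1E-15;                       * assumed stand-in for a value just under 5.2;
  PUT X= 6.1 Y= 6.1;                     * a one-decimal format hides the difference;
  PUT X= HEX16. Y= HEX16.;               * the stored bit patterns can be compared here;
  FLAG_RAW   = (Y >= 5.2);               * comparison on the raw value;
  FLAG_ROUND = (ROUND(Y, 0.1) >= 5.2);   * comparison after rounding to one decimal;
  PUT FLAG_RAW= FLAG_ROUND=;
RUN;
```

If the two hex strings differ, the values are not equal, no matter how a decimal format displays them. Adding FORMAT GHP HEX16.; to a PROC PRINT of the NHANES III records should allow the same check on the real data.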
=========================================
Thank you B and D. We have pinpointed the issue. I re-ran my code and got the output below. Usually SAS stores a little bit more than what we see. This time (NHANES III), however, SAS stores a little bit less than what we see. Stay tuned and be aware. Use round() if you want to fix this issue now.
LIBNAME NHANES3 V6 'Q:\epistat\datasets\NHANES\ORIGINAL\NHANES3\';
DATA N3;
  SET NHANES3.LABNEW (KEEP=GHP);       * This is a SAS v6 dataset;
    IF ROUND(GHP,.1) EQ 5.2;
        DIFF_GHP_FROM_5POINT2=GHP-5.2;
        GHP_GE_5POINT2=(GHP GE 5.2);
        GHP_GE_5POINT19=(GHP GE 5.19);
RUN;
TITLE 'OUTPUT OF NHANES III';
PROC PRINT DATA=N3 (OBS=5); FORMAT GHP DIFF_GHP_FROM_5POINT2 F32.31; RUN;
data two;
   input a b @@;
   c=b*0.1;
   ca_diff=c-a;
   c_ge_point3=(c ge 0.3);
   c_le_point3=(c le 0.3);
cards;
0.1 1 0.2 2 0.3 3 0.4 4 0.5 5
;
run;
title 'output of testing dataset';
proc print data=two; format c ca_diff f32.31; run;
title;run;
 
OUTPUT OF NHANES III
                                                                              GHP_GE_    GHP_GE_
 Obs                                GHP              DIFF_GHP_FROM_5POINT2   5POINT2   5POINT19
   1   5.200000000000000000000000000000   -.000000000000000888178419700125      0          1
   2   5.200000000000000000000000000000   -.000000000000000888178419700125      0          1
   3   5.200000000000000000000000000000   -.000000000000000888178419700125      0          1
   4   5.200000000000000000000000000000   -.000000000000000888178419700125      0          1
   5   5.200000000000000000000000000000   -.000000000000000888178419700125      0          1
output of testing dataset
                                                                                  c_ge_   c_le_
Obs   a   b                                 c                           ca_diff  point3  point3
 1   0.1  1  .1000000000000000000000000000000  .0000000000000000000000000000000     0       1
 2   0.2  2  .2000000000000000000000000000000  .0000000000000000000000000000000     0       1
 3   0.3  3  .3000000000000000000000000000000  .0000000000000000277555756156289     1       0
 4   0.4  4  .4000000000000000000000000000000  .0000000000000000000000000000000     1       0
 5   0.5  5  .5000000000000000000000000000000  .0000000000000000277555756156289     1       0
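One way to apply the ROUND() advice consistently is to round GHP once, right after reading it, and to derive every category from the rounded copy, so that two-level and multilevel groupings cannot disagree at a cutpoint. The sketch below assumes the same library, dataset and valid-value filter as the code earlier in this post. N3_CLEAN, GHP_R, A1C_2LEVEL and A1C_3LEVEL are hypothetical names, and 7.0 is a made-up second cutpoint used only for illustration.

```sas
LIBNAME NHANES3 V6 'Q:\epistat\datasets\NHANES\ORIGINAL\NHANES3\';
DATA N3_CLEAN;
  SET NHANES3.LABNEW (KEEP=GHP);
  IF . LT GHP LT 7777;                   * same valid-value filter as above;
  GHP_R = ROUND(GHP, 0.1);               * round once to the reported precision;

  * Two-level category using the 5.2 cutpoint from this post;
  IF GHP_R >= 5.2 THEN A1C_2LEVEL = 1;
  ELSE A1C_2LEVEL = 2;

  * Multilevel category with 7.0 as a hypothetical second cutpoint;
  IF GHP_R >= 7.0 THEN A1C_3LEVEL = 1;
  ELSE IF GHP_R >= 5.2 THEN A1C_3LEVEL = 2;
  ELSE A1C_3LEVEL = 3;
RUN;

PROC FREQ DATA=N3_CLEAN;
  TABLES A1C_2LEVEL*A1C_3LEVEL;          * cross-check that the groupings nest;
RUN;
```

Because both groupings use GHP_R, every record with A1C_2LEVEL=2 should fall in A1C_3LEVEL=3, and any off-diagonal surprises in the cross-tabulation would point to a remaining comparison on the unrounded value.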
 
