We may have a problem creating A1c categories from NHANES III data: the counts
from multilevel categories and two-level categories disagree. I found earlier
that the 'ROUND()' function fixes this kind of discrepancy, but the issue is
still haunting us. After looking into it further, I found that SAS does not put
values displayed as 5.2 into the '1' group when I use 'GHP >= 5.2' (see code
and output below).
There is no similar problem with NHANES 99-04. The differences between NHANES
III and NHANES 99-04 are: 1) the NHANES III dataset is a SAS version 6 dataset;
2) GHP in NHANES III is formatted as F6.1.
LIBNAME NHANES3 V6 'Q:\epistat\datasets\NHANES\ORIGINAL\NHANES3\';
DATA N3;
  SET NHANES3.LABNEW (KEEP=GHP); * This is a SAS v6 dataset;
  IF . LT GHP LT 7777;
  * GHP is in a F6.1 format;
  GHP2=ROUND(GHP,0.1); * What is 'ROUND()' doing here?;
  IF GHP >= 5.2 THEN GHPGRP1=1; ELSE GHPGRP1=2;
  IF GHP > 5.19 THEN GHPGRP2=1; ELSE GHPGRP2=2;
  IF GHP2 >= 5.2 THEN GHPGRP3=1; ELSE GHPGRP3=2;
  LABEL GHPGRP1='ORIGINAL GHP VALUE, CUTPOINT 5.2'
        GHPGRP2='ORIGINAL GHP VALUE, CUTPOINT 5.19'
        GHPGRP3='ROUNDED GHP VALUE, CUTPOINT 5.2';
RUN;
PROC FREQ DATA=N3;
  TABLES GHPGRP1 GHPGRP2 GHPGRP3 GHPGRP1*GHPGRP3;
RUN;
============ OUTPUT ============
The FREQ Procedure

ORIGINAL GHP VALUE, CUTPOINT 5.2

                                    Cumulative    Cumulative
GHPGRP1    Frequency     Percent     Frequency      Percent
------------------------------------------------------------
      1       11535       49.14         11535        49.14
      2       11941       50.86         23476       100.00

ORIGINAL GHP VALUE, CUTPOINT 5.19

                                    Cumulative    Cumulative
GHPGRP2    Frequency     Percent     Frequency      Percent
------------------------------------------------------------
      1       13463       57.35         13463        57.35
      2       10013       42.65         23476       100.00

ROUNDED GHP VALUE, CUTPOINT 5.2

                                    Cumulative    Cumulative
GHPGRP3    Frequency     Percent     Frequency      Percent
------------------------------------------------------------
      1       13463       57.35         13463        57.35
      2       10013       42.65         23476       100.00

Table of GHPGRP1 by GHPGRP3

GHPGRP1(ORIGINAL GHP VALUE, CUTPOINT 5.2)
          GHPGRP3(ROUNDED GHP VALUE, CUTPOINT 5.2)

Frequency|
Percent  |
Row Pct  |
Col Pct  |       1|       2|  Total
---------+--------+--------+
       1 |  11535 |      0 |  11535
         |  49.14 |   0.00 |  49.14
         | 100.00 |   0.00 |
         |  85.68 |   0.00 |
---------+--------+--------+
       2 |   1928 |  10013 |  11941
         |   8.21 |  42.65 |  50.86
         |  16.15 |  83.85 |
         |  14.32 | 100.00 |
---------+--------+--------+
Total       13463    10013    23476
            57.35    42.65   100.00
=========================================
Thank you all. J and I discussed this underlying issue yesterday as well. I
could not find any GHP value that is exactly 5.2. I don't think NCHS entered
GHP values like this.
=========================================
I tried changing the statement as follows, and GHPGRP2 was assigned a value of
2 in record 39. Removing one of the decimal places resulted in a value of 1.
IF GHP >= 5.1999999999999999 THEN GHPGRP2=1; ELSE GHPGRP2=2;
I tried changing the format and unformatting GHP, but I could only get it to
display 5.2 in record 39. Apparently, the stored value is not exactly 5.2, but
the difference is too small to be displayed.
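One way to see what is actually stored is to print the value with the HEX16. format, which shows the raw 8-byte floating-point representation instead of a rounded decimal. The step below is only a minimal sketch, not code from this thread: X, Y and GE_FLAG are hypothetical names, and Y is built by hand on the assumption that record 39 holds a value one unit in the last place below 5.2 (2**(-50) for numbers between 4 and 8). On an IEEE 754 platform the double nearest to 5.2 is 4014CCCCCCCCCCCD, so a stored value ending in ...CCC rather than ...CCD would fall below the 5.2 cutpoint even though the formats tried above display it as 5.2.

```sas
DATA _NULL_;
  X = 5.2;               * the double nearest to 5.2;
  Y = 5.2 - 2**(-50);    * assumed stand-in for the stored GHP value;
  GE_FLAG = (Y GE 5.2);  * expected to be 0 if Y is stored below X;
  PUT X= HEX16.;         * raw floating-point bits of X;
  PUT Y= HEX16.;         * raw floating-point bits of Y;
  PUT GE_FLAG=;
RUN;
```

With the real data, the same PUT GHP= HEX16.; statement inside a DATA step that reads NHANES3.LABNEW should show the stored bits for record 39.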
=========================================
Thank you, B and D. We have pinpointed the issue. I re-ran my code and got the
output below. Usually the value SAS stores is a little bit more than what we
see. This time (NHANES III), however, the stored value is a little bit less
than what we see. Stay tuned and be aware. Use ROUND() if you want to fix this
issue now.
LIBNAME NHANES3 V6 'Q:\epistat\datasets\NHANES\ORIGINAL\NHANES3\';
DATA N3;
  SET NHANES3.LABNEW (KEEP=GHP); * This is a SAS v6 dataset;
  IF ROUND(GHP,.1) EQ 5.2;
  DIFF_GHP_FROM_5POINT2=GHP-5.2;
  GHP_GE_5POINT2=(GHP GE 5.2);
  GHP_GE_5POINT19=(GHP GE 5.19);
RUN;
TITLE 'OUTPUT OF NHANES III';
PROC PRINT DATA=N3 (OBS=5); FORMAT GHP DIFF_GHP_FROM_5POINT2 F32.31; RUN;
data two;
  input a b @@;
  c=b*0.1;                 * c should equal a, but is computed instead of read;
  ca_diff=c-a;
  c_ge_point3=(c ge 0.3);  * compare c, as the variable name says;
  c_le_point3=(c le 0.3);
cards;
0.1 1 0.2 2 0.3 3 0.4 4 0.5 5
;
run;
title 'output of testing dataset';
proc print data=two; format c ca_diff f32.31; run;
title; run;
OUTPUT OF NHANES III

                                                                         GHP_GE_   GHP_GE_
Obs                               GHP             DIFF_GHP_FROM_5POINT2  5POINT2  5POINT19
  1  5.200000000000000000000000000000  -.000000000000000888178419700125        0         1
  2  5.200000000000000000000000000000  -.000000000000000888178419700125        0         1
  3  5.200000000000000000000000000000  -.000000000000000888178419700125        0         1
  4  5.200000000000000000000000000000  -.000000000000000888178419700125        0         1
  5  5.200000000000000000000000000000  -.000000000000000888178419700125        0         1
output of testing dataset

                                                                                  c_ge_   c_le_
Obs    a  b                                 c                           ca_diff  point3  point3
  1  0.1  1  .1000000000000000000000000000000  .0000000000000000000000000000000       0       1
  2  0.2  2  .2000000000000000000000000000000  .0000000000000000000000000000000       0       1
  3  0.3  3  .3000000000000000000000000000000  .0000000000000000277555756156289       1       0
  4  0.4  4  .4000000000000000000000000000000  .0000000000000000000000000000000       1       0
  5  0.5  5  .5000000000000000000000000000000  .0000000000000000277555756156289       1       0
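
The ROUND() workaround recommended in this thread can be tried on a toy dataset before touching the real one. This is a sketch under stated assumptions: the dataset TOY, its four values, and the variables GHP_R, OFF_GRID, GRP_RAW and GRP_ROUND are made up for illustration, and 5.2 - 2**(-50) stands in for a GHP value stored just below 5.2. The idea is to round once to the measurement precision (0.1), flag rows where the raw and rounded values differ, and build the category from the rounded value only.

```sas
DATA TOY;
  DO GHP = 5.1, 5.2 - 2**(-50), 5.2, 5.3;
    GHP_R     = ROUND(GHP, 0.1);     * round once to one decimal place;
    OFF_GRID  = (GHP NE GHP_R);      * 1 when the stored value is not the rounded value;
    GRP_RAW   = 2 - (GHP GE 5.2);    * group 1 or 2 from the raw value;
    GRP_ROUND = 2 - (GHP_R GE 5.2);  * group 1 or 2 from the rounded value;
    OUTPUT;
  END;
RUN;

PROC PRINT DATA=TOY;
  FORMAT GHP GHP_R HEX16.;           * show the stored bits, not a rounded decimal;
RUN;
```

A row with OFF_GRID=1 whose GRP_RAW differs from GRP_ROUND is the kind of record behind the 1,928 observations that moved between groups in the GHPGRP1 by GHPGRP3 table.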