Vertical Datasets

Shane Rosanbalm edited this page Mar 20, 2017 · 9 revisions

Some datasets store information about multiple parameters in a vertical orientation: one variable identifies each parameter (e.g., PARAMCD or LBTESTCD) and another variable holds the result (e.g., AVAL or LBSTRESN). The codebook-generic is not the best tool for summarizing vertical datasets. Enter codebook-vertical, a spin-off of the original codebook-generic that summarizes vertical datasets one parameter at a time.
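To make "vertical" concrete, here is a minimal sketch with invented demo data (the dataset name work.vertical_demo and all values are hypothetical). PARAMCD identifies the parameter and AVAL holds the result. The PROC MEANS step only illustrates the idea of summarizing one parameter at a time; it is not what the macro does internally.

```sas
*--- hypothetical demo data with one row per subject per parameter ---;
data work.vertical_demo;
   input usubjid $ paramcd $ aval;
   datalines;
S001 ALT 23
S001 AST 31
S002 ALT 40
S002 AST 28
;
run;

*--- summarize the result variable separately for each parameter ---;
proc means data=work.vertical_demo n mean min max;
   class paramcd;
   var aval;
run;
```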

!!! SAS® Bug !!!

As of SAS version 9.4M3, there is a bug in ODS LAYOUT. When the list of variables specified in the var= parameter gets long enough to produce multiple pages of output within a single by= level, SAS crashes spectacularly. If you want more than 7 or 8 variables in your var= list, the hack-around is to make multiple macro calls, each with a subset of the variables that you're interested in. You have been forewarned!!!
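A sketch of the multiple-call workaround, assuming the macro is already compiled and the sdtm libref is assigned. The three-and-three split is arbitrary. The distinct pdfprefix= values (part1_, part2_) are hypothetical and rest on the assumption that two calls against the same dataset would otherwise write to the same PDF file name.

```sas
*--- first subset of the variable list ---;
%codebook_vertical
   (data=sdtm.lb
   ,by=lbcat lbscat lbtestcd lbtest
   ,var=lbstresn lbstresc lbstnrlo
   ,pdfprefix=part1_
   )

*--- second subset of the variable list ---;
%codebook_vertical
   (data=sdtm.lb
   ,by=lbcat lbscat lbtestcd lbtest
   ,var=lbstnrhi lbtox lbtoxn
   ,pdfprefix=part2_
   )
```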

Parameters

There are several files contained in the download, but the one that you will use to produce the codebook-vertical is codebook_vertical.sas.

The required parameters differ slightly from those in codebook-generic, but all optional parameters are identical.

Required

  • data= A two-level dataset name. Only one dataset is allowed.
    E.g., data=work.fred
  • by= A list of variables used to identify a parameter.
    E.g., by=parcat1 parcat2 paramn paramcd param
    E.g., by=lbcat lbscat lbtestcd lbtest
  • var= A list of variables to summarize within each parameter.
    E.g., var=aval avalc anrlo anrhi atoxn atox
    E.g., var=lbstresn lbstresc lbstnrlo lbstnrhi lbtox lbtoxn

Optional

  • pdfpath= The folder in which to save the PDF report.
    Default: the folder in which the dataset lives.
  • pdfprefix= A prefix to add to the PDF file name.
    Default: no prefix.
  • dotlength= Length after which long text is replaced with 3 dots (...).
    Default: 20.
  • maxfreqs= Maximum number of categories to show.
    Default: 5.
  • minfreqs= Minimum number of categories needed to avoid frequencies for numeric variables.
    Default: 2.
  • plotheight= Height of plots in inches.
    Default: 1.0.
  • uniquepct= Highest allowed percent of unique values for showing frequencies.
    Default: 90.
  • lowestpct= Lowest allowed percent for showing frequencies.
    Default: 0.5.
  • catplot= Type of categorical plot: dot | hbar.
    Default: dot.
  • debug= Set to 1 if you wish to retain work datasets.
    Default: 0.
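As an illustration of overriding the defaults, the call below sets a few optional parameters. The specific values (draft_, 10, 1.5) are arbitrary choices for the sketch, not recommendations. plotheight= is raised alongside maxfreqs= because showing more categories in a plot of the default height may thin the tick marks.

```sas
*--- override selected defaults (all values are illustrative) ---;
%codebook_vertical
   (data=sdtm.lb
   ,by=lbcat lbscat lbtestcd lbtest
   ,var=lbstresn lbstresc
   ,pdfprefix=draft_
   ,maxfreqs=10
   ,plotheight=1.5
   ,catplot=hbar
   ,debug=1
   )
```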

Notes

  • Formats must be loaded prior to calling the macro.
  • Both the unformatted and formatted values will be presented in categorical summaries. E.g.,
    1 = MALE (22, 44%), 2 = FEMALE (28, 56%).
  • If you increase maxfreqs=, you are likely to experience tick mark thinning on the y-axis. You can compensate by increasing plotheight=.
  • The uniquepct= and lowestpct= parameters both attempt to prevent meaningless categorical summaries for variables that contain mostly unique values (e.g., subject number).

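A minimal sketch of loading formats before the macro call. The format name sexf is hypothetical and simply mirrors the MALE/FEMALE example above. The fmtsearch line assumes a libref named myfmts has already been assigned to a library holding a format catalog; skip it if your formats live in WORK.

```sas
*--- define the format up front (WORK is searched by default) ---;
proc format;
   value sexf
      1 = 'MALE'
      2 = 'FEMALE';
run;

*--- or add an existing catalog to the search path (myfmts is hypothetical) ---;
options fmtsearch=(myfmts work library);
```

With the formats available, call %codebook_vertical as usual.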
Example Call #1

*--- a vertical ADaM dataset ---;

%codebook_vertical
   (data=adam.adlb
   ,by=parcat1 parcat2 paramn paramcd param
   ,var=aval avalc chg anrlo anrhi ady
   )

Example Call #2

*--- a vertical SDTM dataset ---;

%codebook_vertical
   (data=sdtm.lb
   ,by=lbcat lbscat lbtestcd lbtest
   ,var=lbstresn lbstresu lbstresc lbstnrlo lbstnrhi lbdy 
   )

A Complete Example

An example use of the vertical codebook.
