Skip to content

Files

Latest commit

26acfba · Jan 26, 2025

History

History
845 lines (615 loc) · 24.5 KB

bputil.md

File metadata and controls

845 lines (615 loc) · 24.5 KB

Documentation for the bpUTiL package.


Basic Popular UTiLity programs


Version information:

  • Package: bpUTiL
  • Version: 1.1.1
  • Generated: 2025-01-26T11:11:10
  • Author(s): Bartosz Jablonski (yabwon@gmail.com)
  • Maintainer(s): Bartosz Jablonski (yabwon@gmail.com)
  • License: MIT
  • File SHA256: F*4C8E477515813767772892FCADE9E90BF688FB22084D540C8F22E1A8AE1BC21C for this version
  • Content SHA256: C*9A7EADF853673EB42E4FF799F2C934A376C7C53A549876F2D77A136925556D66 for this version

The bpUTiL package, version: 1.1.1;


The bpUTiL package, supported by basePlus and macroArray packages, provides a set of basic popular utility programs to make daily programming work easier.



Required SAS Packages:

  • basePlus (2.1.0)
  • macroArray (1.2.6)


SAS package generated by SAS Package Framework, version 20241207


The bpUTiL package content

The bpUTiL package consists of the following content:

  1. bputil_int_oneliners exec

  2. bputil_int_oneliners clean

  3. $bputil_int_listnonmissvars. format/informat

  4. %bputil_int_1varattrsds1() macro

  5. %bputil_int_1varattrsds2() macro

  6. %bputil_int_listnonmissvars() macro

  7. %bputil_attributes() macro

  8. %bputil_convertn2c() macro

  9. %bputil_includesasnotebook() macro

  10. %bputil_listnonmissvars() macro

  11. %bputil_varstoonecasesize() macro

  12. License note


bputil_int_oneliners exec

A bunch of internal "one-liners" built with BasePlus %GenerateOneLiners() macro.

Those macros provided are convenience "wrappers" over the %sysfunc(<someFunction>(...)) code. Log from bpUTiL package loading shows detailed list of created macros.

Of course dedicated code cleaning those macros from SAS session is executed when bpUTiL package is unloaded.


bputil_int_oneliners clean

Cleaning internal "one-liners" built with BasePlus %GenerateOneLiners() macro.


$bputil_int_listnonmissvars. format/informat

Both formats are internal for use of %bpUTiL_listNonMissVars() macro.

Both display missing value as a blank space (" ") and any non-missing value as asterisk ("*").



%bputil_int_1varattrsds1() macro

An internal macro to extract attributes information about i-th variable from an open() data set with identifier dsid.

Dedicated for DATA STEP format.


%bputil_int_1varattrsds2() macro

An internal macro to extract attributes information about i-th variable from an open() data set with identifier dsid.

Dedicated for PROC DS2 format.


%bputil_int_listnonmissvars() macro

An internal macro for exclusive use inside %bpUTiL_listNonMissVars() macro.

Lists data set's variables name, their type, and checks if not missing.



%bputil_attributes() macro

DESCRIPTION:

The bpUTiL_attributes() macro allow for automatic variables attributes extraction from a data set for a given list of variables and use them in your program.

Generated code (depending on type parameter value) allows to provide syntax for data step and for proc ds2 attributes definitions.

Idea for the macro was inspired by Stu Sztukowski's post: https://www.linkedin.com/posts/statsguy_sastiptuesday-sas-programming-activity-7259609656396779521-YO1D?utm_source=share&utm_medium=member_desktop and https://github.com/stu-code/sas/blob/master/utility-macros/getattribs.sas but with implementation done from scratch with macro language.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%bpUTiL_attributes(
   ds
 <,vars=>
 <,type=>
 <,statement=>
 <,attrs=>
)

Arguments description:

  1. ds - Required. Valid data set name. No data set options can be used.
  • vars= - Optional. A list of variables for which attributes should be extracted. Default value is _ALL_ and means "use all variables". Can take any standard SAS data variables list format, e.g., A B C, A1-A5, _NUMERIC_, A-character-C, or mixture of those.

  • type= - Optional. Indicates what format should be used. Allowed values are 1 and 2. For any other value type is re-set to 1. When 1 is used Data Step format with ATTRIB ... ; statement is generated. When 2 is used Proc DS2 format with DCL ... ; statement is generated. Default value is 1

  • statement= - Optional. Indicates if generated list of attributes should be provided inside ATTRIB/DCL statement for direct use in Data step/Proc DS2 code, or without for example for information purpose in %put statement. Default value is 1 and means "use ATTRIB/DCL statement".

  • attrs= - Optional. A list of attributes to be extracted. The length is always extracted and provided. Default value is format informat label and means "extract, if they exist, attributes for formats, informats, and label".


EXAMPLES AND USECASES:

EXAMPLE 1. Basic use case - extract all attributes:

options mprint;
data _null_;
 %bpUTiL_attributes(sashelp.cars)
 call missing(of _all_);
run;
options nomprint;

EXAMPLE 2. Display attributes of _nuimeric_ variables:

%put %bpUTiL_attributes(sashelp.cars,vars=_numeric_);

EXAMPLE 3. Use explicit ATTRIB statement:

data _null_;
ATTRIB
%bpUTiL_attributes(sashelp.cars
,vars=_numeric_
,statement=0)
;
run;

EXAMPLE 4. Data set with big number of variables:

data AAA;
  ARRAY A A1 - A1024;
  format A1 - A1024 best17.;
  call missing(of A[*]);
run;

data BBB;
 %bpUTiL_attributes(AAA)

 ARRAY A A1 - A1024;
 call missing(of A[*]);
run;

proc compare base=AAA compare=BBB;
run;

data AAA;
  ARRAY A A1 - A1024;
  format A1 - A1024 best17.;
  call missing(of A[*]);
run;

EXAMPLE 5. Works with PROC DS2 too:

data cars;
  set sashelp.cars;
run;

%put %bpUTiL_attributes(sashelp.cars, type=2);

options Mprint;
proc ds2;
    data work.cars2 / overwrite=yes;

        %bpUTiL_attributes(sashelp.cars, type=2)

        method run();
            set cars;
        end;

    enddata;
    run;

quit;


%bputil_convertn2c() macro

DESCRIPTION:

The bpUTiL_convertN2C macro converts a variable in the Numeric <-> Character fashion and keeps variable original position in the data set.

Conversion is done "quiet" way, i.e., no conversion errors are displayed in the log. Hence if converted text is "not convertible" result will be missing data.

By default best32. is used for conversion, but other formats/informats can be used too, e.g., yymmdd10. to convert "date as text".

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%bpUTiL_attributes(
   ds
 <,vars=>
 <,type=>
 <,statement=>
 <,attrs=>
)

Arguments description:

  1. ds - Required. Valid data set name. No data set options can be used.

  2. vn - Required. Valid, space-separated list of variables to convert. No variables lists can be used.

  • fmt= - Optional. Name of a format or informat to use for conversion. Default is best32., but other, like e.g., yymmdd10., can be used.

EXAMPLES AND USECASES:

EXAMPLE 1. Convert single variable:

data class(label="Test Label 1" compress=yes index=(name wh=(height weight)));
  set sashelp.class;
  label age = 'Variable with age and has %percents and &amps in it!!'
        sex = "Variable with sex";
  ;
run;

proc contents data=class;
run;

options MPRINT;
%bpUTiL_convertN2C(class, age)

ods select variables;
proc contents data=class;
run;

options MPRINT;
%bpUTiL_convertN2C(class, age)

ods select variables;
proc contents data=class;
run;

ods select all;

EXAMPLE 2. Convert multiple variables:

data class2(label="Test Label 1" compress=yes index=(name wh=(height weight)));
  set sashelp.class;
  label age = 'Variable with age and has %percents and &amps in it!!'
        sex = "Variable with sex"
        weight = "Weight, but not in kilograms.";
  ;
run;

proc contents data=class2;
run;

options MPRINT;
%bpUTiL_convertN2C(class2, age weight)

ods select variables;
proc contents data=class2;
run;

options MPRINT;
%bpUTiL_convertN2C(class2, weight age XXX)

ods select variables;
proc contents data=class2;
run;

ods select all;

EXAMPLE 3. Convert date variable:

data class3(label="Test Label 1" compress=yes index=(name wh=(height weight)));
  set sashelp.class;
  label age = 'Variable with age and has %percents and &amps in it!!'
        sex = "Variable with sex"
        weight = "Weight, but not in kilograms.";
  ;
  age_as_date = put("2feb2020"d+123*age,yymmdd10.);
run;

proc contents data=class3;
run;
proc print data=class3;
run;

options MPRINT;
%bpUTiL_convertN2C(class3, age_as_date, fmt=yymmdd10.)

ods select variables;
proc contents data=class3;
run;

ods select all;
proc print data=class3;
  format age_as_date date11.;
run;


%bputil_includesasnotebook() macro

DESCRIPTION:

The bpUTiL_includeSASNotebook macro extracts SAS, SQL, and Python code snippets of arbitrary size (tested on a code block containing all SAS Packages Framework code) from a SAS Notebook *.sasnb file.

Macro works in two modes. By default, it extracts code and includes it in SAS session, just like: %include "../code_extractersd_by_hand.sas";. Alternatively, the macro just prints out all extracted code in the log.

For SQL nad Python code blocks prefixes and suffixes are added. For SQL they are:

PROC SQL;

...

QUIT:

For Python they are:

PROC PYTHON;
SUBMIT;

...

ENDSUBMIT;
QUIT;

Inspirations.

Idea for the macro was inspired by Chris Hemedinger's post: https://communities.sas.com/t5/SAS-Viya-Workbench-Discussion/Weekly-Workbench-Tip-Run-a-SAS-Notebook-INCLUDE-style/m-p/943690

Implementation for "small" notebooks (every single code cell in a notebook has length no longer than 32767 bytes) is based on Chris idea, implementation for "big" notebooks (direct code extraction is used in that case) has been written from scratch.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%bpUTiL_includeSASNotebook(
   path
 <,print=>
 <,big=0>
 <,outpath=>
 <,source2=>
)

Arguments description:

  1. path - Required. Valid path to SAS Notebook *.sasnb file or an existing fileref.
  • print= - Optional. Indicates weather notebook code should be printed (1) or included (0). The default is 0.

  • big= - Optional. Indicates which algorithm for source code extraction should be used. If every single code cell in the notebook has length no longer than 32767 bytes use big=0 (the JSON libname is used in that case). If at least one code cell is bigger than 32767 bytes use big=1 (direct code extraction is used in that case) The default is 0.

  • outpath= - Optional. Valid path to a file where the extracted code should be saved. The default is null and a temporary file in work directory is used in that case.

  • source2= - Optional. Indicates weather source2 option should be used for %include (1) or not (0). The default is 0.


EXAMPLES AND USECASES:

EXAMPLE 1. This prints out code from notebook in SAS log:

%bpUTiL_includeSASNotebook(C:\notebooks\SASnotebook1.sasnb, print=1);

EXAMPLE 2. This %include code from notebook in SAS log:

%bpUTiL_includeSASNotebook(~/SASnotebook2.sasnb);


%bputil_listnonmissvars() macro

DESCRIPTION:

The bpUTiL_listNonMissVars() macro allow for automatic listing of variables that have at least one non-missing value.

Macro behaves in two possible ways:

  1. result can be stored a SAS data set (parameter out=), or
  2. macro prints a list of values in a "function-like" style (parameter list=).

Behaviors are exclusive, i.e., you can select only 1) or 2) for a single macro call.

Idea for the macro was inspired by SAS communities post: https://communities.sas.com/t5/SAS-Programming/How-to-find-out-if-all-the-values-of-a-variable-numeric-or/m-p/931919#M366632

Macro executes as pure macro code.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%bpUTiL_listNonMissVars(
   ds
 <,out=>
 <,print=>
 <,list=>
 <,putlog=>
 <,condition=>
)

Arguments description:

  1. ds - Required. Valid data set name. No data set options can be used.
  • out= - Optional. Name of a data sets to store produced list of variables. Default value is: WORK.anyNonMissingData. When print= or list= are used the data set is not created.

  • print= - Optional. Indicates if the list of variables should be stored as a SAS data set (print=0) or the list should be return in "function-like" style (print=1). Print= is an alias to list=. Default value is 0

  • list= - Optional. Indicates the if list of variables should be stored as a SAS data set (list=0) or the list should be return in "function-like" style (list=1). List= is an alias to print=. Default value is 0

  • putlog= - Optional. Indicates if the list of variables should also displayed in the SAS LOG. Default value is 0 and means "do not print".

  • condition= - Optional. Allows for manual filtering the number of non missing values to select a variable. Default value is NE 0 and means "select variables with at least one non missing value". Example values: EQ 0 - "select 100% missing variables", GT 20 - "select variables with at least 21 non missing values"


EXAMPLES AND USECASES:

EXAMPLE 1. Basic use case:

%let size=20;
data have haveN(keep=_numeric_) haveC(keep=_character_);
  ffffffffffffffffffffffffffffffff = "F";
  gggggggggggggggggggggggggggggggg = 123;
  array   MissC[&size.] $1             ;
  array noMissC[&size.] $1 (&size.*"A");
  array   MissN[&size.]                ;
  array noMissN[&size.]    (&size.* 1 );
  do i = 1 to 5e5;
    output;
  end;
  drop i;
run;

%bpUTiL_listNonMissVars(have,putlog=1)
proc print data = anyNonMissingData;
run; 

%bpUTiL_listNonMissVars(haveN)
proc print data = anyNonMissingData;
run; 

%bpUTiL_listNonMissVars(haveC,out=allMissingData,condition=EQ 0)
proc print data = allMissingData;
run; 
%put %bpUTiL_listNonMissVars(haveC,print=1);

EXAMPLE 2. Use macro to keep non-empty variables:

data have1k;
infile cards dlm="|" dsd;
input a $ b $ c d e $ f;
cards;
z||.|26||.
a||.|1|a|1
b||.|2| |.
c||.|3|c|3
d||.|4| |.
run;

proc print data = have1k;
run; 

%bpUTiL_listNonMissVars(have1k,out=WORK.anyNonMissingData1)
proc print data = anyNonMissingData1;
run; 

%put %bpUTiL_listNonMissVars(have1k,list=1,putlog=1);

data want1k;
  set have1k(keep = %bpUTiL_listNonMissVars(have1k,list=1));
run;

proc print data = want1k;
run;

EXAMPLE 3. Use macro to drop empty variables:

data have1d;
infile cards dlm="|" dsd;
input a $ b $ c d e $ f;
cards;
z||.|26||.
a||.|1|a|1
b||.|2| |.
c||.|3|c|3
d||.|4| |.
run;

proc print data = have1d;
run; 

%bpUTiL_listNonMissVars(have1d,out=WORK.allMissingData1,condition=EQ 0)
proc print data = allMissingData1;
run; 

%put %bpUTiL_listNonMissVars(have1d,list=1,putlog=1,condition=EQ 0);

data want1d;
  set have1d(drop = %bpUTiL_listNonMissVars(have1d,list=1,putlog=1,condition=EQ 0));
run;

proc print data = want1d;
run;

EXAMPLE 4. Use macro to select only certain amount (> 2) of nonmissing:

data have1c;
infile cards dlm="|" dsd;
input a $ b $ c d e $ f;
cards;
z||.|26||.
a||.|1|a|1
b||.|2| |.
c||.|3|c|3
d||.|4| |.
run;

proc print data = have1c;
run; 

%bpUTiL_listNonMissVars(have1c,out=WORK.moreThan2nonMissingData,condition=GT 2)
proc print data = moreThan2nonMissingData;
run; 

%put %bpUTiL_listNonMissVars(have1c,list=1,putlog=1,condition=GT 2);

data want1c;
  set have1c(keep = %bpUTiL_listNonMissVars(have1c,list=1,putlog=1,condition=GT 2));
run;

proc print data = want1c;
run;


%bputil_varstoonecasesize() macro

DESCRIPTION:

The bpUTiL_VarsToOneCaseSize macro converts all variables names in given dataset to "low case only" or to "UPCASE CASE ONLY", e.g., variables Aa bb cC are transformed into aa bb cc or AA BB CC, respectively.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%VarsToOneCaseSize(
   libds
 <,case=>
)

Arguments description:

  1. libds - Required. Valid data set name. No data set options can be used.
  • case - Optional. Indicates weather the upper case (U) or the lower case (L) transformation should be used. L is the default.

EXAMPLES AND USECASES:

EXAMPLE 1. Convert variables names:

data class;
set sashelp.class;
run;
proc print data=class(obs=3);
run;

options mprint;
%bpUTiL_VarsToOneCaseSize(class) %* all to lowcase *;


proc print data=class(obs=3);
run;

options mprint;
%bpUTiL_VarsToOneCaseSize(work.class) %* no update done *;

proc print data=class(obs=3);
run;

options mprint;
%bpUTiL_VarsToOneCaseSize(work.class,case=U) %* all to upcase *;

proc print data=class(obs=3);
run;

NOTES:

NOTE 1. Macro overcome the following issue too:

data test;
  a=42;
run;

PROC DATASETS LIB=work NOLIST NOWARN;
MODIFY test;
  RENAME
    a=a      %* keep lower case - gives an error ;
  ;;;;
  RUN;
QUIT;

PROC DATASETS LIB=work NOLIST NOWARN;
MODIFY test;
  RENAME
    a=A     %* change to upper case - no error ;
  ;;;;
  RUN;
QUIT;

data test2;
  set test;
  rename a=a; %* no error in data step ;
run;



License

Copyright (c) Bartosz Jablonski, since 2024

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.