Thursday, December 22, 2022
HomeData ScienceProc SQL Debug Queries - 9TO5SAS

Proc SQL Debug Queries – 9TO5SAS


PROC SQL is a robust device, however it may be difficult to debug queries. This weblog put up will talk about some Proc SQL choices to debug queries.

1. INOBS= and OUTOBS= Choices

INOBS – The INOBS= choice limits the variety of rows that PROC SQL processes. This feature will be useful when growing a brand new question and testing on a restricted variety of data. 

proc sql inobs=5;
	choose * from sashelp.coronary heart;

The SAS log shows a warning if you use the INOBS choice.

WARNING: Solely 5 data had been learn from SASHELP.HEART as a consequence of INOBS= choice.

OUTOBS – The OUTOBS= choice restricts the variety of rows PROC SQL shows or writes to a desk. For instance, should you specify OUTOBS=10 and insert values right into a desk utilizing a question, then PROC SQL inserts 10 rows into the ensuing desk. OUTOBS= is much like the SAS knowledge set choice OBS=.

proc sql outobs=6;
	choose * from sashelp.coronary heart;

The SAS log shows a warning if you use the OUTOBS choice.

WARNING: Assertion terminated early as a consequence of OUTOBS=6 choice.

INOBS and OUTOBS may not seem to vary in a easy question. Nonetheless, generally it’s important to decide on the proper choice. For instance, the common of a column with INOBS=10 returns solely 10 values.

2. Limiting Iterations with the LOOPS= Choice

In PROC SQL, the LOOPS= choice specifies what number of iterations can be carried out contained in the inside loop. The restrict will be set to forestall extreme pc utilization. For instance, becoming a member of three giant tables with out assembly the join-matching circumstances might lead to a sizeable inner desk that will be inefficient to course of.

proc sql loops=5;
	choose p.prodnum, p.prodname, m.manunum, m.manuname, m.manucity from 
		sql2.merchandise p, sql2.producers m the place p.manunum=m.manunum;
stop;

The SAS log shows a warning if you use the Loops= choice.

WARNING: PROC SQL assertion interrupted by LOOPS=5 choice.

3. Checking Syntax with the NOEXEC Choice and the VALIDATE Assertion

NOEXEC – Through the use of the NOEXEC choice or the VALIDATE assertion, you possibly can confirm the syntax of a PROC SQL step with out executing it.

The NOEXEC choice can solely be used as soon as in a PROC SQL assertion. This feature will be certain that the syntax of all queries in that PROC SQL step is checked for accuracy with out executing them.

proc sql noexec;
	choose p.prodnum, p.prodname, m.manunum, m.manuname, m.manucity from 
		sql2.merchandise p, sql2.producers m the place p.manunum=m.manunum;
stop;

The SAS log shows the next message if the question is legitimate and all referenced columns and tables exist.

NOTE: Assertion not executed as a consequence of NOEXEC choice.

SAS shows the usual error messages within the log if there are any errors within the question.

Whenever you invoke the NOEXEC choice, SAS checks the syntax of all queries in that PROC SQL step for accuracy however doesn’t execute them.

VALIDATE – SELECT statements are checked for accuracy by the VALIDATE assertion earlier than being executed. A message is written to the SAS log if the syntax is appropriate. An error message is displayed if the syntax is inaccurate.

proc sql;
validate
choose prodnum, p.prodname, m.manunum, m.manuname, m.manucity from 
		sql2.merchandise p, sql2.producers m the place p.manunum=m.manunum;
stop;

The SAS log shows the next message if the question is legitimate.

NOTE: PROC SQL assertion has legitimate syntax.

SAS shows commonplace error messages within the log if there are errors within the question.

VALIDATE impacts solely the SELECT assertion instantly following it, whereas the NOEXEC choice impacts all the SELECT assertion. In case you are working with a PROC SQL question containing a number of SELECT statements, specify the VALIDATE key phrase earlier than every SELECT assertion you need to verify.

4. SELECT * with the FEEDBACK Choice

The FEEDBACK choice expands a SELECT * assertion into all the chosen variables, including any macro variables. This can be very useful to show the variable order when joining multiple tables. The SAS log displays the results of the expanded list of columns you have selected. 

proc sql feedback;
            select * from sashelp.class;

Expanded SELECT * Statement

proc sql feedback option

5. Reset option

The RESET statement adds, drops, or changes one or more PROC SQL options without restarting the procedure. This example shows the Print option is turned on and then off.

proc sql noprint outobs=5;
	select * from sashelp.class;
	reset print;
	select * from sashelp.class;
quit;
7 PROC SQL Options You Should Use to Debug Queries

6. Timing PROC SQL with the STIMER Option

There are multiple ways to accomplish certain operations. For example, a join often operates like a subquery. The query that runs fastest is generally more efficient based on factors such as readability and maintenance. A SAS system option, STIMER shows you an entire procedure’s cumulative time.

PROC SQL STIMER shows how fast individual statements run in a PROC SQL step. You can optimize your query this way.

The example below compares the execution times of two queries. Both queries list the product names and manufacturer numbers from the products table with a manufacturer number of 500. The first query does this with a join; the second with a subquery. Comparing the run times of two queries shows the STIMER results from the SAS log.

proc sql stimer;
	select prodnum, p.prodname, m.manunum, m.manuname, m.manucity from 
		sql2.products p, sql2.manufacturers m where p.manunum=m.manunum and m.manunum=500;
	Select * from sql2.products where manunum in
(Select manunum from sql2.manufacturers where manunum=500);
Quit;

Log results:

7 PROC SQL Options You Should Use to Debug Queries
Comparing Run Times of Two Queries

A join query takes 0.01 seconds, while a subquery takes 0.05 seconds. A join runs faster than an equivalent subquery, regardless of many factors.

7. PROC SQL _METHOD Option

PROC SQL includes a powerful method option called METHOD. Since its implementation, many SAS ® SQL users have expressed very favourable comments for the value-added information it provides on the SAS Log. Furthermore, the METHOD option is well worth exploring due to the benefits of understanding the processes involved during specific PROC SQL operations, query evaluation, algorithm selection and use by the optimizer, or testing and debugging operations.

MSGLEVEL=I

By specifying MSGLEVEL= in an Options statement, users can control how much information the SAS System logs. MSGLEVEL= supports two possible values: N (default) for standard notes, warnings, and error messages; and I for details about sort, merge, and index processing.

Join algorithms

  • Nested Loop – The optimizer selects this algorithm when Sort-Merge, Index, and Hash are eliminated.
  • Sort-Merge – Used when the Index and Hash algorithms have been eliminated from consideration by the optimizer.
  • Index – The optimiser selects a user-defined index when performance will be enhanced.
  • Hash – Used by the optimizer when the smaller of the two tables can fit into the available memory

_METHOD

The _METHOD option returns several codes to the log. You can use these options to understand the choices made by the SQL optimizer, the index used, which joins to use and any sorting performed.

The SQL optimiser performs query operations using one of four join algorithms. The four algorithms include:

Code Description
SQXCRTA Create table as Select
SQZSLCT Select statement or clause
SQXISL Step loop join (Cartesian).
SQXIM Merge join operation.
SQXINDX Index join operation
SQXIHSH Hash join operation.
SQXSORT Sort operation.
SQXSRC Source rows from table.
SQXFIL Rows filtration.
SQXSUMG Summary stats (aggregates) with GROUP BY clause.
SQXSUMN Summary stats with no GROUP BY clause.
The _METHOD Option and Code Descriptions

Note: You need to specify the MSGLEVEL=I options to activate this setting.

These options can be used to debug any issues, for example, detect if the correct join strategy is being used, are my indexes being used in the join.

As part of the two-way equijoin, the _METHOD option is specified to show the processing hierarchy.

options msglevel=i;

proc sql _method;
	select p.prodnum, p.prodname, m.manunum, m.manuname, m.manucity from 
		sql2.products p, sql2.manufacturers m where p.manunum=m.manunum and 
		p.manunum=500;
quit;

The SQL optimizer selected a hash join algorithm based on the join query.

7 PROC SQL Options You Should Use to Debug Queries

8. METHOD AND _TREE OPTIONS WITH MSGLEVEL=I

The _TREE option provides a graphical representation of how the query optimizer executed the query. This option shows the query plan based on several factors (estimated CPU and I/O). There may be more efficient plans. Nevertheless, TREE can help debug.

options msglevel=i;
proc sql _method _tree;
	select p.prodnum, p.prodname, m.manunum, m.manuname, m.manucity from 
		sql2.products p, sql2.manufacturers m where p.manunum=m.manunum and 
		p.manunum=500;
quit;
7 PROC SQL Options You Should Use to Debug Queries



RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

- Advertisment -
Google search engine

Most Popular

Recent Comments