Tuesday, June 7, 2022
HomeProgrammingHow you can Forestall Execution Plan Troubles when Querying Skewed Information, with...

How you can Forestall Execution Plan Troubles when Querying Skewed Information, with jOOQ – Java, SQL and jOOQ.


One of many greatest benefits of utilizing jOOQ is you could change your entire advanced software’s generated SQL with just some traces of code. On this article, we’ll look into the best way to resolve some widespread bind peeking points similar to that, with out touching your software code, with out the necessity to clarify this superior SQL efficiency challenge to each group member, and better of all: for good.

What are Bind Values Good For?

Initially, bind variables are a superb factor. They:

The latter bullet doesn’t have an effect on all dialects, similar to this text doesn’t. Business DBMS like Oracle, SQL Server, and many others. ship with a strong execution plan cache. They’re designed to run 1000’s of distinct queries with very advanced plans. Planning these queries takes time (I’ve seen Oracle SQL being deliberate for five seconds!) and also you don’t need the DBMS to re-execute this planning work each time the question is run, which can be 1000’s of instances per second.

Normally, this execution plan cache takes the SQL string (or a hash of it), and associates meta information like different doable execution plans to it. When the SQL string modifications, the cache lookup fails and the “new” question needs to be deliberate once more.

I say “new”, as a result of it may be the “identical” question to the consumer, even when the string is totally different. For instance:

SELECT * FROM e book WHERE id = 1;
SELECT * FROM e book WHERE id = 2;

Now now we have two instances the “identical” question, however every of them are “new”. Oracle will plan them each afresh. So, we use bind variables, as a substitute:

SELECT * FROM e book WHERE id = ?;
SELECT * FROM e book WHERE id = ?;

What’s the Bind Peeking Challenge?

In some instances, the cached plan isn’t optimum. That is the case when the precise bind worth is important to the planning, e.g. the worth 1 would produce a vastly totally different plan than the worth 2, or more likely the worth DELETED produces a special plan from PROCESSED or NEW.

This downside has been mentioned in our earlier weblog publish Why You Ought to Design Your Database to Optimise for Statistics.

“Bind Peeking” is a method by the Oracle database (it’s performed by others, too, however perhaps not known as “Bind Peeking”) to have a “peek” on the bind variable to get a extra correct plan than the common plan if we don’t know the bind worth. This could go each methods, good or dangerous, so there have been quite a few fixes / patches / workarounds in earlier Oracle variations. Some fascinating articles on the subject:

Databases slowly get to really adaptive question execution fashions the place execution plans might be mounted in-flight when the estimates had been clearly mistaken. Db2 is sort of robust at this, and Oracle is getting higher.

However even then, some instances planners get it mistaken, just because they will’t fairly estimate the cardinalities produced by a easy predicate like

… simply because the general question could be very advanced and a few SQL transformation didn’t apply.

Stopping the Drawback by Avoiding Bind Values

Once more. Please use bind values by default. They’re a superb factor by default. Not all information is as skewed because the one I introduced in my different weblog publish. However some information is sort of at all times skewed: Enumeration sorts.

When you may have an enum like:

enum ProcessingState {
  NEW,
  PROCESSING,
  EXECUTED,
  DELETED
}

Or in PostgreSQL:

CREATE TYPE processing_state AS ENUM (
  'new',
  'processing',
  'executed',
  'deleted'
);

And even simply encoded as a CHECK constraint:

CREATE TABLE transaction (
  -- ...
  processing_state VARCHAR(10) CHECK (processing_state IN (
    'new',
    'processing',
    'executed',
    'deleted'
  ))
  -- ...
);

In that case, you’ll very probably have extremely skewed information. For instance, a fast question would possibly yield:

SELECT processing_state, depend(*)
FROM transaction
GROUP BY processing_state

Leading to:

+------------------+----------+
| processing_state |    depend |
+------------------+----------+
| new              |    10234 |
| processing       |       15 |
| executed         | 17581684 |
| deleted          |    83193 |
+------------------+----------+

Now, do you suppose you’ll revenue from indexing PROCESSING_STATE equally, when in search of NEW or PROCESSING versus when in search of EXECUTED values? Would you like the identical plans? Would you like a median plan, which could not use the index, when actually you must use it (in search of PROCESSING)?

Not solely that, however it’s additionally fairly unlikely that your queries are so generic as that the person PROCESSING_STATE values can be utilized interchangeably. For instance, a question in search of DELETED states may be run by a housekeeping batch job that wishes to take away the logically deleted transactions for good. It should by no means question something aside from DELETED states. So, would possibly as nicely inline, proper?

Now, should you write a question like this:

SELECT *
FROM transaction
WHERE processing_state="processing";

With jOOQ, you may create an “inline” on a per-query foundation utilizing DSL.inline("processing") (versus DSL.val("processing"), which is utilized by default, implicitly:

// These generate a ? bind worth
ctx.selectFrom(TRANSACTION)
   .the place(TRANSACTION.PROCESSING_STATE.eq("processing"))
   .fetch();

ctx.selectFrom(TRANSACTION)
   .the place(TRANSACTION.PROCESSING_STATE.eq(val("processing")))
   .fetch();

// This creates an inline 'processing' literal
ctx.selectFrom(TRANSACTION)
   .the place(TRANSACTION.PROCESSING_STATE.eq(inline("processing")))
   .fetch();

As at all times, assuming this static import:

import static org.jooq.impl.DSL.*;

However now, you must consider doing that each single time you question PROCESSING_STATE, and all the opposite related columns.

Forestall it for Good

A lot better, stop it for good. You possibly can create a quite simple Binding like this:

class AlwaysInlineStringBinding implements Binding<String, String> {

    last Binding<?, String> delegate = VARCHAR.getBinding();

    @Override
    public Converter<String, String> converter() {
        return Converters.id(String.class);
    }

    @Override
    public void sql(BindingSQLContext<String> ctx) 
    throws SQLException {
        ctx.render().go to(inline(ctx.worth()));
    }

    @Override
    public void register(BindingRegisterContext<String> ctx) 
    throws SQLException {
        delegate.register(ctx);
    }

    // No have to set something
    @Override
    public void set(BindingSetStatementContext<String> ctx) 
    throws SQLException {}

    @Override
    public void set(BindingSetSQLOutputContext<String> ctx) 
    throws SQLException {
        delegate.set(ctx);
    }

    @Override
    public void get(BindingGetResultSetContext<String> ctx) 
    throws SQLException {
        delegate.get(ctx);
    }

    @Override
    public void get(BindingGetStatementContext<String> ctx) 
    throws SQLException {
        delegate.get(ctx);
    }

    @Override
    public void get(BindingGetSQLInputContext<String> ctx) 
    throws SQLException {
        delegate.get(ctx);
    }
}

Or, ranging from jOOQ 3.15, even easier, and generic:

class AlwaysInlineStringBinding 
extends DefaultBinding<String, String> {
    public AlwaysInlineStringBinding() {
        tremendous(DefaultBinding.binding(VARCHAR));
    }

    @Override
    public void sql(BindingSQLContext<String> ctx) 
    throws SQLException {
        ctx.render().go to(inline(ctx.worth()));
    }

    // No have to set something
    @Override
    public void set(BindingSetStatementContext<T> ctx) 
    throws SQLException {}
}

And even generic:

class AlwaysInlineBinding<T> extends DefaultBinding<T, T> {
    public AlwaysInlineBinding(DataType<T> sort) {
        tremendous(DefaultBinding.binding(sort));
    }

    @Override
    public void sql(BindingSQLContext<T> ctx) 
    throws SQLException {
        ctx.render().go to(inline(ctx.worth()));
    }

    // No have to set something
    @Override
    public void set(BindingSetStatementContext<T> ctx) 
    throws SQLException {}
}

All this does is generate inline values as a substitute of the ? bind parameter marker, and skip setting any worth to the JDBC PreparedStatement (or reactive R2DBC Assertion, ranging from jOOQ 3.15. This may work simply the identical!)

Attempt it very simply, your self (utilizing the jOOQ 3.15 model):

@Take a look at
public void testAlwaysInlineBinding() {
    DSLContext ctx = DSL.utilizing(DEFAULT);
    DataType<Integer> t = INTEGER.asConvertedDataType(
        new AlwaysInlineBinding<>(INTEGER));

    Discipline<Integer> i = subject("i", INTEGER);
    Discipline<Integer> j = subject("j", t);
    Param<Integer> a = val(1);
    Param<Integer> b = val(1, INTEGER.asConvertedDataType(
        new AlwaysInlineBinding<>(INTEGER)));

    // Bind worth by default
    assertEquals("?", ctx.render(a));
    assertEquals("1", ctx.renderInlined(a));
    assertEquals("1", ctx.render(b));
    assertEquals("1", ctx.renderInlined(b));

    // Bind worth by default in predicates
    assertEquals("i = ?", ctx.render(i.eq(a)));
    assertEquals("i = 1", ctx.renderInlined(i.eq(a)));
    assertEquals("i = 1", ctx.render(i.eq(b)));
    assertEquals("i = 1", ctx.renderInlined(i.eq(b)));
    assertEquals("i = ?", ctx.render(i.eq(1)));
    assertEquals("i = 1", ctx.renderInlined(i.eq(1)));

    // No extra bind values in predicates!
    assertEquals("j = 1", ctx.render(j.eq(a)));
    assertEquals("j = 1", ctx.renderInlined(j.eq(a)));
    assertEquals("j = 1", ctx.render(j.eq(b)));
    assertEquals("j = 1", ctx.renderInlined(j.eq(b)));
    assertEquals("j = 1", ctx.render(j.eq(1)));
    assertEquals("j = 1", ctx.renderInlined(j.eq(1)));
}

In fact, as a substitute of doing the above programmatically, you’ll connect this Binding to all related columns utilizing the code generator’s pressured sort configuration

Conclusion

Please use bind values by default. Each in jOOQ and elsewhere. It’s an excellent default.

However generally, your information is skewed, and also you as a developer, you most likely know that. In these instances, generally, “inline values” as we name them (or constants, literals, and many others.) often is the better option to assist the optimiser get higher at estimating. Even when the optimiser’s estimates are good the primary time, the plan might swap in manufacturing for some bizarre motive, together with some plans being purged as a result of the cache is full, or the DBA clicked a button, or no matter.

And that’s when your question may be unnecessarily gradual unexpectedly. No extra want. When you may have enum sorts, or related, simply use the above easy trick, apply to your entire schema the place it is sensible, and neglect this downside for good.

Aspect word:

In fact, the opposite means spherical is simply as straightforward. When you may have inline literals that you just need to swap over to bind values, you are able to do it simply the identical means, e.g. while you use jOOQ’s parsing connection to translate between dialects, or to patch your mistaken ORM-generated SQL!

RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

- Advertisment -
Google search engine

Most Popular

Recent Comments