Tuesday, September 13, 2022
HomeProgrammingUtilizing jOOQ’s Implicit Be part of From Inside the JOIN .. ON...

Utilizing jOOQ’s Implicit Be part of From Inside the JOIN .. ON Clause – Java, SQL and jOOQ.


Beginning with jOOQ 3.11, kind protected implicit JOIN have been made out there, and so they’ve been enhanced to be supported additionally in DML statements in jOOQ 3.17. At present, I’d wish to concentrate on a considerably bizarre however actually highly effective use-case for implicit JOIN, when becoming a member of further tables from inside an specific JOIN‘s ON clause.

The use case

The jOOQ code generator makes heavy use of jOOQ when querying the assorted dictionary views. In PostgreSQL, most queries go to the SQL customary information_schema, however now and again, the usual meta knowledge is inadequate, and we even have to question the pg_catalog, which is extra full but in addition way more technical.

For lots of information_schema views, there exists an nearly equal pg_catalog desk which comprises the identical info. For instance:

information_schema pg_catalog
schemata pg_namespace
tables or user_defined_types pg_class
columns or attributes pg_attribute

Apparently, PostgreSQL being an ORDBMS, tables and person outlined sorts are the identical factor and infrequently interchangeable within the kind system, however that’s a subject for a future weblog publish.

The purpose of this weblog publish is that always, when querying a view like information_schema.attributes, we even have to question pg_catalog.pg_attribute to get further knowledge. For instance, to be able to discover the declared array dimension of a UDT (Consumer Outlined Sort) attribute, we’ve got to entry pg_catalog.pg_attribute.attndims, as this info is nowhere to be discovered within the information_schema. See additionally jOOQ function request #252, the place we’ll add help for H2 / PostgreSQL multi dimensional arrays.

So, we’d have a UDT like this:

CREATE TYPE u_multidim_a AS (
  i integer[][],
  n numeric(10, 5)[][][],
  v varchar(10)[][][][]
);

The canonical SQL solution to entry the pg_attribute desk from the attributes view is:

SELECT 
  is_a.udt_schema,
  is_a.udt_name,
  is_a.attribute_name,
  pg_a.attndims 
FROM information_schema.attributes AS is_a
  JOIN pg_attribute AS pg_a 
    ON is_a.attribute_name = pg_a.attname 
  JOIN pg_class AS pg_c
    ON is_a.udt_name = pg_c.relname 
    AND pg_a.attrelid = pg_c.oid
  JOIN pg_namespace AS pg_n
    ON is_a.udt_schema = pg_n.nspname 
    AND pg_c.relnamespace = pg_n.oid
WHERE is_a.data_type="ARRAY"    
ORDER BY
  is_a.udt_schema,
  is_a.udt_name,
  is_a.attribute_name,
  is_a.ordinal_position 

To visualise:

                +----- udt_schema = nspname ------> pg_namespace
                |                                      ^
                |                                      |
                |                                     oid
                |                                      =
                |                                     relnamespace
                |                                      |
                |                                      v 
                +------- udt_name = relname ------> pg_class
                |                                      ^
                |                                      |
                |                                     oid
                |                                      =
                |                                     attrelid
                |                                      |
                |                                      v 
is.attributes <-+- attribute_name = attname ------> pg_attribute

And now, we will see just a few of our integration take a look at person outlined sorts, containing multi dimensional arrays:

|udt_schema|udt_name    |attribute_name|attndims|
|----------|------------|--------------|--------|
|public    |u_multidim_a|i             |2       |
|public    |u_multidim_a|n             |3       |
|public    |u_multidim_a|v             |4       |
|public    |u_multidim_b|a1            |1       |
|public    |u_multidim_b|a2            |2       |
|public    |u_multidim_b|a3            |3       |
|public    |u_multidim_c|b             |2       |

However have a look at all these JOIN expressions. They’re undoubtedly no enjoyable. We’ve got to spell out the complete path from pg_attribute to pg_namespace, solely to verify we’re not fetching any ambiguously named knowledge from different UDTs or different schemata.

Utilizing implicit joins as a substitute

And that’s the place the facility of implicit JOIN are available in play. What we actually need to write down in SQL is that this:

SELECT 
  is_a.udt_schema,
  is_a.udt_name,
  is_a.attribute_name,
  pg_a.attndims 

-- This desk we want
FROM information_schema.attributes AS is_a

-- And in addition this one
JOIN pg_attribute AS pg_a 
  ON is_a.attribute_name = pg_a.attname 

-- However the path joins from pg_attribute to pg_namespace ought to 
-- be implicit
  AND pg_a.pg_class.relname = is_a.udt_name
  AND pg_a.pg_class.pg_namespace.nspname = is_a.udt_schema
WHERE is_a.data_type="ARRAY"    
ORDER BY
  is_a.udt_schema,
  is_a.udt_name,
  is_a.attribute_name,
  is_a.ordinal_position 

It’s not that a lot shorter, however it’s undoubtedly very handy to not have to consider the best way to be a part of the completely different steps. Be aware that not like different instances, the place we used implicit joins through these paths in SELECT or WHERE, this time we’re utilizing them from inside a JOIN .. ON clause! In jOOQ, we will write:

Attributes isA = ATTRIBUTES.as("is_a");
PgAttribute pgA = PgAttribute.as("pg_a");

ctx.choose(
       isA.UDT_SCHEMA,
       isA.UDT_NAME,
       isA.ATTRIBUTE_NAME,
       pgA.ATTNDIMS)
   .from(isA)
   .be a part of(pgA)
     .on(isA.ATTRIBUTE_NAME.eq(pgA.ATTNAME))
     .and(isA.UDT_NAME.eq(pgA.pgClass().RELNAME))
     .and(isA.UDT_SCHEMA.eq(pgA.pgClass().pgNamespace().NSPNAME))
   .the place(isA.DATA_TYPE.eq("ARRAY"))
   .orderBy(
       isA.UDT_SCHEMA,
       isA.UDT_NAME,
       isA.ATTRIBUTE_NAME,
       isA.ORDINAL_POSITION)
   .fetch();

The generated SQL seems barely completely different from the unique one, as jOOQ’s implicit JOIN algorithm won’t ever flatten the JOIN tree to be able to protect any potential JOIN operator priority, which is vital within the occasion of there being LEFT JOIN, FULL JOIN or different operators current. The output seems extra like this:

FROM information_schema.attributes AS is_a
  JOIN (
    pg_catalog.pg_attribute AS pg_a
      JOIN (
        pg_catalog.pg_class AS alias_70236485
          JOIN pg_catalog.pg_namespace AS alias_96617829
            ON alias_70236485.relnamespace = alias_96617829.oid
      )
        ON pg_a.attrelid = alias_70236485.oid
    )
    ON (
      is_a.attribute_name = pg_a.attname
      AND is_a.udt_name = alias_70236485.relname
      AND is_a.udt_schema = alias_96617829.nspname
    )

As you may see, the “readable” desk aliases (is_a and pg_a) are the user-provided ones, whereas the “unreadable,” system generated ones (alias_70236485 and alias_96617829) are those originating from the implicit JOIN. And, once more, it’s vital that these implicit joins are embedded proper the place they belong, with the trail root pg_a, from which we began the trail expressions. That’s the one approach we will retain the right JOIN operator priority semantics, e.g. if we had used a LEFT JOIN between is_a and pg_a

Future enhancements

Sooner or later, there is likely to be even higher JOIN paths that permit for connecting such graphs immediately, as a result of each time you need to be a part of information_schema.attributes and pg_catalog.pg_attribute, you’ll must repeat the identical equalities on the (udt_schema, udt_name, attribute_name) tuple, and whereas implicit JOIN have been useful, it’s straightforward to see how this may be additional improved. The best question can be:

SELECT 
  is_a.udt_schema,
  is_a.udt_name,
  is_a.attribute_name,
  pg_a.attndims 
FROM information_schema.attributes AS is_a

-- Magic right here
MAGIC JOIN pg_attribute AS pg_a 
  ON jooq_do_your_thing
WHERE is_a.data_type="ARRAY"    
ORDER BY
  is_a.udt_schema,
  is_a.udt_name,
  is_a.attribute_name,
  is_a.ordinal_position 

However we’re not fairly there but.

Gaining access to these be a part of paths

Neither the information_schema views, nor the pg_catalog tables expose any international key meta knowledge, that are a prerequisite for implicit be a part of path expressions and different jOOQ code technology options. This isn’t an enormous drawback as you may specify artificial international keys to the code generator, for exactly this purpose. See additionally our earlier weblog publish about artificial international keys for info schema queries. On this case, all we want is at the least this specification:

<configuration>
  <generator>
    <database>
      <syntheticObjects>
        <foreignKeys>
          <foreignKey>
            <tables>pg_attribute</tables>
            <fields><area>attrelid</area></fields>
            <referencedTable>pg_class</referencedTable>
          </foreignKey>
          <foreignKey>
            <tables>pg_class</tables>
            <fields><area>relnamespace</area></fields>
            <referencedTable>pg_namespace</referencedTable>
          </foreignKey>
        </foreignKeys>
      </syntheticObjects>
    </database>
  </generator>
</configuration>

And ta-dah, we’ve got our JOIN paths as seen within the earlier examples.

RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

- Advertisment -
Google search engine

Most Popular

Recent Comments