Implementing a generic REDUCE mixture perform with SQL – Java, SQL and jOOQ.

June 9, 2022

1

So, @rotnroll666 nerd sniped me once more. Apparently, the Neo4j Cypher question language helps arbitrary reductions, identical to any useful assortment API, oh say, the JDK Stream API:

Stream.of(2, 4, 3, 1, 6, 5)
      .cut back((i, j) -> i * j)
      .ifPresent(System.out::println); // Prints 720

SQL doesn’t have this, but it could be very helpful to have the ability to often do this. An arbitrary discount might be carried out “simply” in SQL. Let’s have a look at the above multiplication discount. In PostgreSQL, you’d write it like this:

with t(i) as (values (2), (4), (3), (1), (6), (5))
choose 
  (
    with recursive
      u(i, o) as (
         choose i, o
         from unnest(array_agg(t.i)) with ordinality as u(i, o)
      ),
      r(i, o) as (
        choose u.i, u.o from u the place o = 1
        union all 
        choose r.i * u.i, u.o from u be a part of r on u.o = r.o + 1
        --     ^^^^^^^^^ discount
      )
    choose i from r
    order by o desc
    restrict 1
  )
from t;

Woah. That’s a little bit of a syntactic beast. Let’s decompose it.

The mixture perform

First off, if we had been summing the values, we’d use the built-in SUM perform, like this:

with t(i) as (values (2), (4), (3), (1), (6), (5))
choose sum(i)
from t;

That might produce 21. In case you’re keen to lose precision, you possibly can emulate PRODUCT() utilizing logarithms. However we wrote REDUCE(), a hypothetical one, like this:

with t(i) as (values (2), (4), (3), (1), (6), (5))
choose cut back(
  t1.i * t2.i referencing gathered as t1, accumulating as t2
)
from t;

That is SQL, so the lambda expression would clearly use a ton of key phrases, utterly novel and distinctive to this explicit perform, and also you’d want jOOQ to make it composable . Primarily, we’d have some kind of discount expression based mostly on two pseudo tables:

The gathered desk containing the end result
The accumulating desk (or fairly row)

A discount is a generic mixture perform that operates on teams. So, we should re-use some SQL mixture perform mechanism to realize the specified behaviour.

Utilizing ARRAY_AGG() to get the aggregation impact

First off, let’s do some aggregation. PostgreSQL’s ARRAY_AGG() is ideal for this job, as a result of it

Aggregates
But kinda leaves the information untouched, not like e.g. SUM()

In a means, it’s a set like Stream.acquire(), not a discount.

If we use ARRAY_AGG() in a correlated subquery, we’ll nonetheless get the aggregation impact, however we will unnest the array once more to a desk, with a view to function on it. You’ll be able to see this within the following instance:

with t(i) as (values (2), (4), (3), (1), (6), (5))
choose 
  (
    choose string_agg(i::textual content, ', ')
    from unnest(array_agg(t.i)) as u(i)
  )
from t;

This yields:

2, 4, 3, 1, 6, 5

Not a really helpful factor to do, mixture, unnest, and mixture once more, however it exhibits the facility of nesting an mixture perform in a correlated subquery’s FROM clause. In case your RDBMS doesn’t have arrays, perhaps you are able to do the identical factor utilizing JSON_ARRAYAGG and JSON_TABLE, or XMLAGG and XMLTABLE.

Disclaimer: PostgreSQL usually Does The Proper Factor™. I believe you’d be extra onerous pressed to juggle with SQL syntax as elegantly in most different RDBMS, so this strategy isn’t transportable. However as Lætitia Avrot so elegantly put it:

Subsequent step, generate row numbers

There are primarily 2 methods how we will generate row numbers in our instance:

Adapting our earlier instance for some visualisation:

with t(i) as (values (2), (4), (3), (1), (6), (5))
choose 
  (
    choose string_agg(row(i, o)::textual content, ', ')
    from unnest(array_agg(t.i)) with ordinality as u(i, o)
  )
from t;

(Superior, that row constructor!)

This produces:

(2,1), (4,2), (3,3), (1,4), (6,5), (5,6)

Doesn’t look fancy, however think about we group by even numbers:

with t(i) as (values (2), (4), (3), (1), (6), (5))
choose 
  i % 2,
  (
    choose string_agg(row(i, o)::textual content, ', ')
    from unnest(array_agg(t.i)) with ordinality as u(i, o)
  )
from t
group by i % 2;

The result’s now:

i % 2	string_agg
0	(2,1), (4,2), (6,3)
1	(3,1), (1,2), (5,3)

It’s a bit bizarre, proper? We GROUP BY within the outer question, and your complete correlated subquery is the combination perform based mostly on the truth that its FROM clause incorporates ARRAY_AGG(). This isn’t a lot totally different from this question:

with t(i) as (values (2), (4), (3), (1), (6), (5))
choose 1 + sum(i) + 2
from t;

We’re used to constructing scalar expressions from mixture features on a regular basis. That is nothing fancy. We are able to simply additionally simply wrap the perform in one other subquery:

with t(i) as (values (2), (4), (3), (1), (6), (5))
choose (choose 1 + sum(i) + 2)
from t;

From right here, it’s not far fetched to increase the aggregate-function-in-scalar-subquery strategy to the FROM clause, after which unnesting the aggregation once more. This may occasionally not “click on” instantly. The GROUP BY clause in SQL is a bit bizarre, syntactically.

Comment: Regrettably, PostgreSQL doesn’t enable utilizing mixture features within the FROM clause on the identical question stage like in a correlated subquery. I used to be going to indicate a elaborate LATERAL model, however this doesn’t work (but).

Now, recurse

The ultimate bit is the recursion with the r desk:

with t(i) as (values (2), (4), (3), (1), (6), (5))
choose 
  (
    with recursive
      u(i, o) as (
         choose i, o
         from unnest(array_agg(t.i)) with ordinality as u(i, o)
      ),
      r(i, o) as (
        choose u.i, u.o from u the place o = 1
        union all 
        choose r.i * u.i, u.o from u be a part of r on u.o = r.o + 1
        --     ^^^^^^^^^ discount
      )
    choose i from r
    order by o desc
    restrict 1
  )
from t;

We merely recurse on the ordinality. The primary subquery of UNION ALL produces the primary row of our information, particularly (1, 1). The subsequent iterations simply at all times multiply the results of r.i by the worth of u.i from the following row by ordinality. That is in all probability finest proven visually:

r.i	r.o	u.i
2 = u.i (first iteration)	1	2
8 = prev r.i * u.i	2	4
24 = prev r.i * u.i	3	3
24 = prev r.i * u.i	4	1
144 = prev r.i * u.i	5	6
720 = prev r.i * u.i	6	5

Lastly, we don’t care about SQL’s set-based means of working. I.e. we don’t care about the entire set of multiplications which can be proven within the desk above. We solely care in regards to the final row, ordered by the ordinality, which incorporates our end in r.i

Achieved!

Utilizing group by

Simply as proven earlier than, we will simply add a GROUP BY clause to the outer question. E.g. let’s multiply odd and even numbers individually:

with t(i) as (values (2), (4), (3), (1), (6), (5))
choose 
  i % 2,
  (
    with recursive
      u(i, o) as (
         choose i, o
         from unnest(array_agg(t.i)) with ordinality as u(i, o)
      ),
      r(i, o) as (
        choose u.i, u.o from u the place o = 1
        union all 
        choose r.i * u.i, u.o from u be a part of r on u.o = r.o + 1
      )
    choose i from r
    order by o desc
    restrict 1
  ),
  string_agg(i::textual content, ' * ')
from t
group by i % 2

I’ve added one other mixture perform STRING_AGG() for good measure to get:

i % 2	i	string_agg
0	48	2 * 4 * 6
1	15	3 * 1 * 5

Great, isn’t it? Now, I wasn’t in a position to simply add an OVER() clause proper there. That produced

SQL Error [42P20]: ERROR: window features aren’t allowed in features in FROM

Possibly that may work as properly, within the close to future? Or, I’d give you one other hack to make it work, in case of which I’ll replace this put up.

jOOQ help

Clearly, this shall be supported in jOOQ quickly: https://github.com/jOOQ/jOOQ/points/11385. The syntax shall be once more rather more bearable:

ctx.choose(T.I.mod(inline(2)), cut back(T.I, (i1, i2) -> i1.occasions(i2)))
   .from(T.I)
   .groupBy(T.I.mod(inline(2)))
   .fetch();

Different emulations utilizing precise CREATE AGGREGATE FUNCTION shall be investigated as properly, within the close to future.

Previous articleFixing miscommunication points that decelerate enterprise targets

Next articleCisco provides to community coaching lineup with Cisco U. launch

Implementing a generic REDUCE mixture perform with SQL – Java, SQL and jOOQ.

The mixture perform

Utilizing ARRAY_AGG() to get the aggregation impact

Subsequent step, generate row numbers

Now, recurse

Utilizing group by

jOOQ help

Like this:

The Nice Decentralization? Geographic shifts and the place tech expertise is transferring subsequent

Translating Saved Procedures Between Dialects – Java, SQL and jOOQ.

12 Totally different Kinds of Pc Community

LEAVE A REPLY Cancel reply

Most Popular

Cisco provides to community coaching lineup with Cisco U. launch

Fixing miscommunication points that decelerate enterprise targets

Automobile Stereo Will not Energy on, Nevertheless it Has Energy

unity – Sphere rolling shader like ice cream

Recent Comments

ABOUT US

POPULAR POSTS

Cisco provides to community coaching lineup with Cisco U. launch

Fixing miscommunication points that decelerate enterprise targets

Automobile Stereo Will not Energy on, Nevertheless it Has Energy

POPULAR CATEGORY