Questions that is likely to be a bit tougher to unravel utilizing abnormal SQL are questions of the sort:
What movies have the identical actors as a given movie X?
As at all times, we’re utilizing the sakila database for this instance. What could be a doable method to remedy this with SQL (for instance, PostgreSQL, to be particular)? The next question offers an outline of actors per movie:
SELECT
film_id,
array_agg(actor_id ORDER BY actor_id) actors
FROM film_actor
GROUP BY film_id
It produces one thing like this:
|film_id|actors | |-------|----------------------------------| |1 |{1,10,20,30,40,53,108,162,188,198}| |2 |{19,85,90,160} | |3 |{2,19,24,64,123} | |4 |{41,81,88,147,162} | |5 |{51,59,103,181,200} | |6 |{21,23,62,108,137,169,197} | |... |... |
Word that in SQL, arrays behave like lists, i.e. they keep their ordering, so ordering the array explicitly is essential to have the ability to evaluate the actors with one another. Now, we wish to discover all movies that share the identical actor set, from the above:
WITH t AS (
-- Earlier question
SELECT
film_id,
array_agg(actor_id ORDER BY actor_id) actors
FROM film_actor
GROUP BY film_id
)
SELECT
array_agg(film_id ORDER BY film_id) AS movies,
actors
FROM t
GROUP BY actors
ORDER BY depend(*) DESC, movies
The result’s now:
|movies |actors | |--------|----------------------------------| |{97,556}|{65} | |{1} |{1,10,20,30,40,53,108,162,188,198}| |{2} |{19,85,90,160} | |{3} |{2,19,24,64,123} | |{4} |{41,81,88,147,162} | |{5} |{51,59,103,181,200} | |{6} |{21,23,62,108,137,169,197} | |... |... |
So, as we are able to see, there are solely 2 movies which share the identical set of actors, and people movies are FILM_ID IN (97, 556)
. (The Sakila database is a bit boring as the info units are generated).
Utilizing MULTISET comparisons
Whereas the above is already fairly cool, on this article, I’d prefer to showcase a lesser identified characteristic of the jOOQ 3.15 MULTISET
help, particularly the truth that they are often in contrast with each other.
And as is the character of SQL customary MULTISET
, ordering is irrelevant, so we don’t have so as to add any specific ORDER BY
clause for such a comparability. In truth, it’s not 100% irrelevant. You can order a MULTISET
for projection functions, so the ordering shall be maintained by jOOQ. However whenever you use them in predicates, jOOQ will override your ORDER BY
clause.
Utilizing jOOQ, we are able to write:
ctx.choose(FILM.FILM_ID, FILM.TITLE)
.from(FILM)
.the place(
multiset(
choose(FILM_ACTOR.ACTOR_ID)
.from(FILM_ACTOR)
.the place(FILM_ACTOR.FILM_ID.eq(FILM.FILM_ID))
).eq(multiset(
choose(FILM_ACTOR.ACTOR_ID)
.from(FILM_ACTOR)
.the place(FILM_ACTOR.FILM_ID.eq(97L))
))
)
.orderBy(FILM_ID)
.fetch();
It’s a bit much less environment friendly than a question of the earlier kind because it accesses the FILM_ACTOR
desk from two subqueries, although solely one in every of them is correlated. Utilizing the default JSONB
emulation, the next question is generated:
SELECT movie.film_id, movie.title
FROM movie
WHERE (
SELECT coalesce(
jsonb_agg(jsonb_build_array(v0) ORDER BY t.v0),
jsonb_build_array()
)
FROM (
SELECT film_actor.actor_id AS v0
FROM film_actor
WHERE film_actor.film_id = movie.film_id
) AS t
) = (
SELECT coalesce(
jsonb_agg(jsonb_build_array(v0) ORDER BY t.v0),
jsonb_build_array()
)
FROM (
SELECT film_actor.actor_id AS v0
FROM film_actor
WHERE film_actor.film_id = 97
) AS t
)
ORDER BY movie.film_id
I promised that no ORDER BY
clause was wanted for MULTISET
, and that is nonetheless true for the jOOQ code. Nonetheless, behind the scenes, jOOQ has to order the JSON arrays by their contents to be sure that two MULTISET
values are the identical, no matter their order.
The outcome is identical two IDs because the earlier outcome confirmed:
+-------+--------------+ |film_id|title | +-------+--------------+ | 97|BRIDE INTRIGUE| | 556|MALTESE HOPE | +-------+--------------+
Evaluating MULTISET_AGG, as an alternative
In the event you want utilizing joins and GROUP BY
to generate the movie’s actor MULTISET
, you are able to do that as nicely, with jOOQ. This time, we’re utilizing:
- Implicit joins to simplify entry to the
FILM.TITLE
fromFILM_ACTOR
- A
MULTISET
predicate within theHAVING
clause, utilizingMULTISET_AGG
Right here’s the jOOQ model:
ctx.choose(FILM_ACTOR.FILM_ID, FILM_ACTOR.movie().TITLE)
.from(FILM_ACTOR)
.groupBy(FILM_ACTOR.FILM_ID, FILM_ACTOR.movie().TITLE)
.having(multisetAgg(FILM_ACTOR.ACTOR_ID).eq(multiset(
choose(FILM_ACTOR.ACTOR_ID)
.from(FILM_ACTOR)
.the place(FILM_ACTOR.FILM_ID.eq(97L))
)))
.orderBy(FILM_ACTOR.FILM_ID)
.fetch();
The backing, generated SQL seems to be like this:
SELECT film_actor.film_id, alias_75379701.title
FROM film_actor
JOIN movie AS alias_75379701
ON film_actor.film_id = alias_75379701.film_id
GROUP BY film_actor.film_id, alias_75379701.title
HAVING jsonb_agg(
jsonb_build_array(film_actor.actor_id) ORDER BY film_actor.actor_id
) = (
SELECT coalesce(
jsonb_agg(jsonb_build_array(v0) ORDER BY t.v0),
jsonb_build_array()
)
FROM (
SELECT film_actor.actor_id AS v0
FROM film_actor
WHERE film_actor.film_id = 97
) AS t
)
ORDER BY film_actor.film_id
Discover how the implicit be a part of is expanded mechanically, whereas the HAVING
predicate once more makes use of the standard JSONB
emulation for MULTISET
and MULTISET_AGG
.
Alternate options
Within the above examples, we’ve in contrast MULTISET
expressions that venture single columns, in different phrases, Outcome<Record1<Lengthy>>
nested assortment sorts. Nothing retains you from including extra columns to the equation. jOOQ will at all times make sure that your question kind checks and that the generated SQL is appropriate.
A substitute for utilizing MULTISET
could be utilizing ARRAY_AGG
and ARRAY
(now you must ORDER BY
explicitly, once more). With jOOQ:
ctx.choose(FILM_ACTOR.FILM_ID, FILM_ACTOR.movie().TITLE)
.from(FILM_ACTOR)
.groupBy(FILM_ACTOR.FILM_ID, FILM_ACTOR.movie().TITLE)
.having(arrayAgg(FILM_ACTOR.ACTOR_ID)
.orderBy(FILM_ACTOR.ACTOR_ID).eq(array(
choose(FILM_ACTOR.ACTOR_ID)
.from(FILM_ACTOR)
.the place(FILM_ACTOR.FILM_ID.eq(97L))
.orderBy(FILM_ACTOR.ACTOR_ID)
)))
.orderBy(FILM_ACTOR.FILM_ID)
.fetch();
With SQL:
SELECT film_actor.film_id, movie.title
FROM film_actor
JOIN movie
ON film_actor.film_id = movie.film_id
GROUP BY film_actor.film_id, movie.title
HAVING array_agg(film_actor.actor_id ORDER BY film_actor.actor_id) =
ARRAY (
SELECT film_actor.actor_id
FROM film_actor
WHERE film_actor.film_id = 97
ORDER BY film_actor.actor_id
)
ORDER BY film_actor.film_id