Wednesday, June 8, 2022
HomeProgrammingCalculating Pagination Metadata With out Further Roundtrips in SQL – Java, SQL...

Calculating Pagination Metadata With out Further Roundtrips in SQL – Java, SQL and jOOQ.


When paginating ends in SQL, we use customary SQL OFFSET .. FETCH or a vendor particular model of it, resembling LIMIT .. OFFSET. For instance:

SELECT first_name, last_name
FROM actor
ORDER BY actor_id
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY

As at all times, we’re utilizing the Sakila database for this instance.

That is slightly easy. It’ll give us web page 2 out of N pages, with a web page dimension of 10. However how will we calculate these values? How do we all know we’re on web page 2? How do we all know the variety of pages N? Can we calculate this with out an additional round-trip e.g. to calculate the full variety of actors:

-- Yuck, a second round-trip!
SELECT COUNT(*)
FROM actor

We are able to do it with a single SQL question and window features, however earlier than I clarify how to do that, please contemplate studying this text on why OFFSET pagination is a foul factor to your efficiency

In the event you’re nonetheless satisfied OFFSET pagination is what you want, versus keyset pagination, let’s take a look at the best way to calculate the above meta knowledge with SQL.

What Metadata Do We Want?

The metadata we usually have to paginate utilizing OFFSET are these:

  • TOTAL_ROWS: The full variety of data if we hadn’t paginated
  • CURRENT_PAGE: The present web page we’re on
  • MAX_PAGE_SIZE: The utmost web page dimension
  • ACTUAL_PAGE_SIZE: The precise web page dimension (when on the final web page)
  • ROW: The precise offsets of the returned rows
  • LAST_PAGE: Whether or not we’re on the final web page

The utmost web page dimension is one thing we set to the question, so it doesn’t should be calculated. Every part else must be calculated. And right here’s how to try this in a single question

SELECT 
  t.first_name, 
  t.last_name,

  -- Calculate some meta knowledge
  COUNT(*) OVER () AS actual_page_size,
  MAX(row) OVER () = total_rows AS last_page,

  -- Metadata from the subquery
  total_rows,
  row,
  ((row - 1) / :max_page_size) + 1 AS current_page
FROM (
  SELECT
    u.*,

    -- Calculate some meta knowledge, repeating the ORDER BY from
    -- the unique question
    COUNT(*) OVER () AS total_rows,
    ROW_NUMBER () OVER (ORDER BY u.actor_id) AS row

  -- Authentic question with all of the predicates, joins, as a derived desk
  FROM (
    SELECT *
    FROM actor
  ) AS u

  -- Ordering and pagination carried out right here, the place :offset is
  -- The utmost row worth of the earlier web page + 1
  ORDER BY u.actor_id
  OFFSET :offset ROWS
  FETCH NEXT :max_page_size ROWS ONLY
) AS t
ORDER BY t.actor_id

That’s it. Spectacular? Don’t be scared, I’ll stroll you thru this stuff step-by-step. And if you happen to ever get confused by SQL syntax, contemplate this text explaining the logical order of SQL operations, which is, for our instance:

  • FROM (recurse ordering for derived tables)
  • WHERE (which the instance omitted)
  • WINDOW calculations
  • SELECT (the projection)
  • ORDER BY
  • OFFSET .. FETCH

Annotating our question, ordering operations logically as 1.1, 1.2, 2.1, 2.2, 2.3, 2.4, 2.5, 3.1, 3.2, 3.3, 3.4:

-- 3.3
SELECT 
  t.first_name, 
  t.last_name,

  -- 3.2
  COUNT(*) OVER () AS actual_page_size,
  MAX(row) OVER () = total_rows AS last_page,

  -- 3.3
  total_rows,
  row,
  ((row - 1) / :max_page_size) + 1 AS current_page
-- 3.1
FROM (
  -- 2.3
  SELECT
    u.*,

    -- 2.2
    COUNT(*) OVER () AS total_rows,
    ROW_NUMBER () OVER (ORDER BY u.actor_id) AS row

  -- 2.1
  FROM (

    -- 1.2
    SELECT *
    -- 1.1
    FROM actor
  ) AS u

  -- 2.4
  ORDER BY u.actor_id

  -- 2.5
  OFFSET :offset ROWS
  FETCH NEXT :max_page_size ROWS ONLY
) AS t

-- 3.4
ORDER BY t.actor_id

Step-by-step clarification

First off, the unique question SELECT * FROM actor is wrapped as a derived desk known as u. You are able to do nearly something you need with this authentic question, making use of only some transformations:

  • 1.1, 1.2, 2.1: You should venture (SELECT clause) the columns that your authentic question projected, plus the columns that you simply want for ORDER BY. As a result of I’m projecting the fitting issues within the outermost question, and since there’s no DISTINCT clause within the authentic question, I conveniently projected *. Alternatively, I may have projected FIRST_NAME, LAST_NAME (as a result of that’s projected within the authentic question), and ACTOR_ID (as a result of that’s what we ORDER BY).
  • 2.2: On that derived desk u, we’re now capable of calculate some metadata, together with TOTAL_ROWS as COUNT(*) OVER () and ROW as ROW_NUMBER () OVER (ORDER BY t.actor_id). The COUNT(*) OVER () window operate has an empty window specification OVER (), that means it calculates all of the rows that outcome from the FROM, WHERE, GROUP BY, HAVING clauses, i.e. from u in our explicit instance. And not using a second round-trip! The ROW_NUMBER () OVER (ORDER BY u.actor_id) orders all of the rows in u by u.actor_id and assigns distinctive row numbers to them, in keeping with that ordering.
  • 2.3: The window features are calculated implicitly as a result of they’re positioned within the projection of this derived desk. We’re additionally once more going to conveniently venture every thing from u.*, as a result of the outer-most question is the one which tasks columns explicitly.
  • 2.4: The unique ordering has been moved right here as a result of there is no such thing as a assure that the ordering would have been maintained if we had ordered the contents of u. However we’d like the ordering to calculate OFFSET .. FETCH proper after
  • 2.5: That is the place we paginate. The OFFSET corresponds to the utmost ROW worth that we’ve encountered earlier than. We begin at 0, and with a web page dimension of 15, we use 15 on the subsequent web page. Do not forget that whereas indexes are 1 primarily based in SQL, OFFSET is 0 primarily based.
  • 3.1: All the above is wrapped once more in a derived desk, to be able to make additional calculations on it, particularly:
  • 3.2: We are able to once more calculate COUNT(*) OVER (), calculating the full variety of rows that outcome from the FROM, WHERE, GROUP BY, HAVING clauses, i.e. from t in our explicit instance. This time, the variety of rows might be not more than MAX_PAGE_SIZE, as a result of that’s what the FETCH (or LIMIT) clause within t says. However it may be much less, too, so that is what we use to calculate the ACTUAL_PAGE_SIZE. Lastly, we examine MAX(row) OVER () = total_rows to see if we’re on the final web page, that means the very best worth for row within the present web page ensuing from t is in comparison with the full row rely. One other technique to calculate the LAST_PAGE worth can be if ACTUAL_PAGE_SIZE < MAX_PAGE_SIZE, i.e. COUNT(*) OVER () < :MAX_PAGE_SIZE.
  • 3.3: Along with the same old projection of the unique columns FIRST_NAME, LAST_NAME (we’re now not projecting * now!), we’re doing a little remaining calculations together with dividing ROW / TOTAL_ROWS to get the web page quantity. You’ll be able to calculate extra issues, resembling TOTAL_ROWS / MAX_PAGE_SIZE to get the TOTAL_PAGES worth.
  • 3.4: Lastly, we should ORDER BY t.actor_id once more, don’t let anybody inform you in any other case. In SQL, if you don’t ORDER BY, then the ordering is undefined. Certain, it might be foolish for an optimiser to re-order issues with none good purpose. We’ve already ordered the contents of our subqueries in 2.4, however there may be no assure, that this ordering is steady. Simply add DISTINCT, UNION, or a JOIN leading to a hash be a part of or some random different operator to your question, and the ordering breaks. So, at all times ORDER BY if ordering is necessary to you.

And we’re carried out!

How one can do it in jOOQ?

That is the sort of use-case the place jOOQ actually actually shines, as a result of all of that is about dynamic SQL. Your precise enterprise logic is contained within the deeply nested u desk. Every part else is “presentation logic”, which is applied in SQL for very apparent causes: To enhance efficiency.

And since you wish to implement all of this solely as soon as in some library of yours, as an alternative of getting to play this recreation on each question, you make this sort of question dynamic. The utility will seem like this:

// Assuming as at all times the same old static imports, together with:
// import static org.jooq.impl.DSL.*;
// import com.generated.code.Tables.*;

static Choose<?> paginate(
    DSLContext ctx,
    Choose<?> authentic,
    Discipline<?>[] type,
    int restrict,
    int offset
) {
    Desk<?> u = authentic.asTable("u");
    Discipline<Integer> totalRows = rely().over().as("total_rows");
    Discipline<Integer> row = rowNumber().over().orderBy(u.fields(type))
        .as("row");

    Desk<?> t = ctx
        .choose(u.asterisk())
        .choose(totalRows, row)
        .from(u)
        .orderBy(u.fields(type))
        .restrict(restrict)
        .offset(offset)
        .asTable("t");

    Choose<?> outcome = ctx
        .choose(t.fields(authentic.getSelect().toArray(Discipline[]::new)))
        .choose(
            rely().over().as("actual_page_size"),
            area(max(t.area(row)).over().eq(t.area(totalRows)))
                .as("last_page"),
            t.area(totalRows),
            t.area(row),
            t.area(row).minus(inline(1)).div(restrict).plus(inline(1))
                .as("current_page"))
        .from(t)
        .orderBy(t.fields(type));

    // System.out.println(outcome);
    return outcome;
}

Discover the println for debugging? It’ll print once more one thing like our authentic question (however you’ll additionally see that in your debug log output, by default, with jOOQ):

choose
  t.ACTOR_ID,
  t.FIRST_NAME,
  t.LAST_NAME,
  rely(*) over () as actual_page_size,
  (max(t.row) over () = t.total_rows) as last_page,
  t.total_rows,
  t.row,
  ((t.row / 15) + 1) as current_page
from (
  choose
    u.*,
    rely(*) over () as total_rows,
    row_number() over (order by u.ACTOR_ID) as row
  from (
    choose
      ACTOR.ACTOR_ID,
      ACTOR.FIRST_NAME,
      ACTOR.LAST_NAME
    from ACTOR
  ) as u
  order by u.ACTOR_ID
  offset 30 rows
  fetch subsequent 15 rows solely
) as t
order by t.ACTOR_ID

And right here’s the way you name the utility:

System.out.println(
    paginate(
        ctx,
        ctx.choose(ACTOR.ACTOR_ID, ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
           .from(ACTOR),
        new Discipline[] { ACTOR.ACTOR_ID },
        15,
        30
    ).fetch()
);

Discover which you could plug in arbitrary SQL fragments into that utility and paginate them. No matter the complexity (together with joins, different window features, grouping, recursion, and what not), jOOQ could have you lined and can now paginate issues for you.

The results of the above is:

+--------+----------+---------+----------------+---------+----------+----+------------+
|ACTOR_ID|FIRST_NAME|LAST_NAME|actual_page_size|last_page|total_rows| row|current_page|
+--------+----------+---------+----------------+---------+----------+----+------------+
|      31|SISSY     |SOBIESKI |              15|false    |       200|  31|           3|
|      32|TIM       |HACKMAN  |              15|false    |       200|  32|           3|
|      33|MILLA     |PECK     |              15|false    |       200|  33|           3|
|      34|AUDREY    |OLIVIER  |              15|false    |       200|  34|           3|
|      35|JUDY      |DEAN     |              15|false    |       200|  35|           3|
|      36|BURT      |DUKAKIS  |              15|false    |       200|  36|           3|
|      37|VAL       |BOLGER   |              15|false    |       200|  37|           3|
|      38|TOM       |MCKELLEN |              15|false    |       200|  38|           3|
|      39|GOLDIE    |BRODY    |              15|false    |       200|  39|           3|
|      40|JOHNNY    |CAGE     |              15|false    |       200|  40|           3|
|      41|JODIE     |DEGENERES|              15|false    |       200|  41|           3|
|      42|TOM       |MIRANDA  |              15|false    |       200|  42|           3|
|      43|KIRK      |JOVOVICH |              15|false    |       200|  43|           3|
|      44|NICK      |STALLONE |              15|false    |       200|  44|           3|
|      45|REESE     |KILMER   |              15|false    |       200|  45|           3|
+--------+----------+---------+----------------+---------+----------+----+------------+

Or, on the final web page, with offset 195

+--------+----------+---------+----------------+---------+----------+----+------------+
|ACTOR_ID|FIRST_NAME|LAST_NAME|actual_page_size|last_page|total_rows| row|current_page|
+--------+----------+---------+----------------+---------+----------+----+------------+
|     196|BELA      |WALKEN   |               5|true     |       200| 196|          14|
|     197|REESE     |WEST     |               5|true     |       200| 197|          14|
|     198|MARY      |KEITEL   |               5|true     |       200| 198|          14|
|     199|JULIA     |FAWCETT  |               5|true     |       200| 199|          14|
|     200|THORA     |TEMPLE   |               5|true     |       200| 200|          14|
+--------+----------+---------+----------------+---------+----------+----+------------+

Conclusion

jOOQ is all about dynamic SQL. There’s hardly any SQL function left that jOOQ doesn’t help. This consists of window features, for instance, but in addition ensuring that your dynamic SQL works on numerous SQL dialects, no matter the little syntactic particulars.

You’ll be able to construct your personal libraries to assemble re-usable SQL components from different SQL constructing blocks as this text has proven, to dynamically create single-query OFFSET pagination meta knowledge calculation, with out performing further database spherical journeys.

RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

- Advertisment -
Google search engine

Most Popular

Recent Comments