
Plot an ASCII Bar Chart with SQL


No need for expensive Tableau subscriptions. Ditch Microsoft Excel. Just use native PostgreSQL to quickly visualise your data!

Here’s an idea I’ve had for a while. As you may know, jOOQ can produce fancy charts from your jOOQ results. But that requires you to use jOOQ, and you may not be using jOOQ, because you’re not coding in Java/Kotlin/Scala (otherwise, you’d be using jOOQ). That’s OK. I thought, why not do it directly with SQL (PostgreSQL, to be specific), then? After all, I’m me:

So, with the following fancy query, which I’ll maintain and develop further in this github repo, you will be able to easily plot just about anything directly from your favourite SQL editor.

Just look at it. Look:

-- The example uses https://www.jooq.org/sakila, but you can easily replace
-- the "source" table with anything else
with 

  -- This part is what you can modify to adapt to your own needs
  --------------------------------------------------------------

  -- Your data producing query here 
  source (key, value) as (
    select payment_date::date::timestamp, sum(amount)
    from payment
    where extract(year from payment_date) < 2006
    group by payment_date::date::timestamp
    order by payment_date::date::timestamp
  ),
  
  -- Some configuration items:
  constants as (
    select
    
      -- the height of the y axis
      15 as height, 

      -- the width of the x axis, if normalise_x, otherwise ignored
      25 as width, 

      -- the bar characters
      '##' as characters,

      -- the characters between bars
      ' ' as separator,
      
      -- the padding of the labels on the y axis
      10 as label_pad, 
      
      -- whether to normalise the data on the x axis by
      -- - filling gaps (if int, bigint, numeric, timestamp, 
      --   timestamptz)
      -- - scaling the x axis to "width"
      true as normalise_x
  ),
  
  -- The rest doesn't need to be touched
  --------------------------------------
  
  -- Pre-calculated dimensions of the source data
  source_dimensions (kmin, kmax, kstep, vmin, vmax) as (
    select 
      min(key), max(key), 
      (max(key) - min(key)) / max(width), 
      min(value), max(value)
    from source, constants
  ),
  
  -- Normalised data, which fills the gaps, in case the key data
  -- type can be generated with generate_series (int, bigint, 
  -- numeric, timestamp, timestamptz)
  source_normalised (key, value) as (
    select k, coalesce(sum(source.value), 0)
    from source_dimensions
      cross join constants
      cross join lateral 
        generate_series(kmin, kmax, kstep) as t (k)
      left join source 
        on source.key >= t.k and source.key < t.k + kstep
    group by k
  ),

  -- Replace source_normalised by source if you don't like the 
  -- normalised version
  actual_source (i, key, value) as (
    select row_number() over (order by key), key, value 
    from source_normalised, constants
    where normalise_x
    union all
    select row_number() over (order by key), key, value
    from source, constants
    where not normalise_x
  ),
    
  -- Pre-calculated dimensions of the actual data
  actual_dimensions (
    kmin, kmax, kstep, vmin, vmax, width_or_count
  ) as (
    select 
      min(key), max(key), 
      (max(key) - min(key)) / max(width), 
      min(value), max(value), 
      case
        when every(normalise_x) then least(max(width), count(*)::int) 
        else count(*)::int 
      end
    from actual_source, constants
  ),
  
  -- More convenience
  dims_and_consts as (
    with 
      temp as (
        select *, 
          (length(characters) + length(separator)) 
            * width_or_count as bar_width
        from actual_dimensions, constants
      )
    select *,
      (bar_width - length(kmin::text) - length(kmax::text)) 
        as x_label_pad
    from temp
  ),
  
  -- A cartesian product for all (x, y) data points
  x (x) as (
    select generate_series(1, width_or_count) from dims_and_consts
  ),
  y (y) as (
    select generate_series(1, height) from dims_and_consts
  ),

  -- Rendering the ASCII chart
  chart (rn, chart) as (
    select
      y,
      lpad(y * (vmax - vmin) / height || '', label_pad) 
        || ' | ' 
        || string_agg(
             case 
               when height * actual_source.value / (vmax - vmin) 
                 >= y then characters 
               else repeat(' ', length(characters)) 
             end, separator 
             order by x
           )
    from 
      x left join actual_source on actual_source.i = x, 
      y, dims_and_consts
    group by y, vmin, vmax, height, label_pad
    union all
    select 
      0, 
      repeat('-', label_pad) 
        || '-+-' 
        || repeat('-', bar_width)
    from dims_and_consts
    union all
    select 
      -1, 
      repeat(' ', label_pad) 
        || ' | ' 
        || case 
             when x_label_pad < 1 then '' 
             else kmin || repeat(' ', x_label_pad) || kmax 
           end
    from dims_and_consts
  )
select chart
from chart
order by rn desc
;

Running this against the sakila database, you’ll get this fancy chart:

chart                                                                                   |
----------------------------------------------------------------------------------------+
11251.7400 |                                                       ##                   |
10501.6240 |                                                       ##                   |
9751.50800 |                                                       ##                   |
9001.39200 |                                                       ##                   |
8251.27600 |                                                       ##                   |
7501.16000 |                                     ##                ##             ## ## |
6751.04400 |                                     ##                ##             ## ## |
6000.92800 |                                     ##                ##             ## ## |
5250.81200 |                   ##                ##             ## ##             ## ## |
4500.69600 |                   ##                ##             ## ##             ## ## |
3750.58000 |                   ## ##             ## ##          ## ##             ## ## |
3000.46400 |                   ## ##             ## ##          ## ##             ## ## |
2250.34800 |    ##             ## ##          ## ## ##          ## ## ##          ## ## |
1500.23200 | ## ##             ## ##          ## ## ##          ## ## ##          ## ## |
750.116000 | ## ##             ## ##          ## ## ##          ## ## ##          ## ## |
-----------+----------------------------------------------------------------------------|
           | 2005-05-24 00:00:00                                     2005-08-23 00:00:00|

Isn’t that something!

How does it work?

The query has 3 parts:

  • source: The actual query that produces the data. This is what you can substitute with your own query, instead
  • constants: The configuration section, where you can tweak dimensions, bar chart characters, etc. (an illustrative tweak follows this list)
  • the rest, which you don’t have to tamper with
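
For instance, a purely illustrative tweak of the constants section (these values are made up; everything else stays as in the query above) would render a taller chart with slimmer, denser bars:

  constants as (
    select
      20   as height,      -- taller y axis
      40   as width,       -- wider x axis (only used when normalise_x)
      '*'  as characters,  -- slimmer bars
      ''   as separator,   -- no gap between bars
      12   as label_pad,   -- wider y axis labels
      true as normalise_x
  )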

The source is just a query like this:

  source (key, value) as (
    select payment_date::date::timestamp, sum(amount)
    from payment
    where extract(year from payment_date) < 2006
    group by payment_date::date::timestamp
    order by payment_date::date::timestamp
  )

It produces all the revenue per payment date from the payment table. While payment_date is a timestamp, we cast it to date to be able to get daily revenue. But in order to fill the gaps using PostgreSQL’s generate_series, we have to cast the date value back to timestamp, because, surprisingly, there’s no native generate_series(date, date) function in PostgreSQL.
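
For reference, this is the kind of call the gap-filling step relies on: generate_series() has overloads for timestamps (with an interval step), but none that takes plain dates. A minimal, standalone example, using the date range from the chart above:

  select generate_series(
    timestamp '2005-05-24 00:00:00',
    timestamp '2005-08-23 00:00:00',
    interval '1 day'
  ) as day;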

All you have to do is produce a set of data in key/value form. You can replace this with anything else, e.g. to get cumulative revenue:

  source (key, value) as (
    select 
      payment_date::date::timestamp,
      sum(sum(amount)) over (order by payment_date::date::timestamp)
    from payment
    where extract(year from payment_date) < 2006
    group by payment_date::date::timestamp
    order by payment_date::date::timestamp
  )

… for which you (currently) have to patch the normalisation back to false (the padding of gaps isn’t correct yet). Also, to save space, I’ve made the bars a bit slimmer:

  '#' as characters,
  '' as separator,
  false as normalise_x

And now you’ll get this nice chart showing the exponential increase in revenue that we so dearly need to show our managers (it’s not actually exponential, because now the gaps aren’t respected, but duh, it’s just generated sample data):

chart                                                |
-----------------------------------------------------+
66872.4100 |                                        #|
62414.2493 |                                       ##|
57956.0886 |                                     ####|
53497.9280 |                                   ######|
49039.7673 |                                  #######|
44581.6066 |                               ##########|
40123.4460 |                              ###########|
35665.2853 |                            #############|
31207.1246 |                          ###############|
26748.9640 |                       ##################|
22290.8033 |                     ####################|
17832.6426 |                   ######################|
13374.4820 |                #########################|
8916.32133 |            #############################|
4458.16066 |        #################################|
-----------+-----------------------------------------|
           | 2005-05-24 00:00:00  2005-08-23 00:00:00|

Awesome, huh? Play around with it.
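
For instance, here’s a different, non-time-series source you could plug in (a sketch, assuming the standard sakila film table and its length column, i.e. the film duration in minutes; since the key is an integer, gap filling via generate_series still works with the default constants):

  -- Number of films per film length in minutes (hypothetical replacement
  -- for the "source" CTE; everything else stays unchanged)
  source (key, value) as (
    select length, count(*)
    from film
    group by length
    order by length
  )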

Send your pull requests with improvements. Challenges:

  • Stacked charts
  • Fill gaps also for cumulative data
  • Other features?

The sky is the limit.


