Displaying my Work as I Question a Forem Occasion
This builds on work from my Stroll By way of of Utilizing Postgresql and Blazer to Generate a Cohort Report.
The question I’ll be constructing helps reply what share of lively customers commented on not less than one welcome article. For this question, an lively person is somebody who’s been on the positioning 4 of the final 7 days.
Establishing the Entity Relationship Mannequin
Once more, when writing SQL, I like to begin with a relationship diagram. On this knowledge foray, we now have 4 related tables.
class Consumer
has_many :articles
has_many :feedback
finish
class Article
belongs_to :person
has_many :feedback, as: :commentable
finish
class Remark
belongs_to :commentable, polymorphic: true
belongs_to :person
finish
class PageView
belongs_to :article
belongs_to :person, non-obligatory: true
finish
Beneath is a diagram for these of you preferring an Entity Relationship Mannequin (ERM) of the 4 knowledge fashions.
An ERM of the tables utilized in these queries.
Querying All Lively Members
I need all customers who’ve page_views
on not less than 4 of the final seven days; we’ll take into account these “lively customers.”
First I wish to construct a really slender question; one which lets me ensure that I do know I’m on the precise path. I’ll restrict the web page views to my user_id
:
SELECT user_id, extract(isodow from created_at) AS day_of_week
FROM page_views
WHERE page_views.user_id = 702612
AND page_views.created_at::date > NOW()::date - INTERVAL '7 day'
GROUP BY page_views.user_id, day_of_week
The next question instances out; It’s attempting to question all customers.
SELECT dow.user_id,
depend(dow.day_of_week) AS number_of_days
FROM (
SELECT user_id,
extract(isodow from created_at) AS day_of_week
FROM page_views
WHERE page_views.created_at::date
> NOW()::date - INTERVAL '7 day'
AND user_id IS NOT NULL
GROUP BY page_views.user_id, day_of_week
) AS dow
GROUP BY dow.user_id
HAVING depend(dow.day_of_week) >= 4
Due to the enormity of the web page views we have to restrict to solely lately up to date customers. The next is the question to get latest customers.
SELECT id
FROM customers
WHERE updated_at::date
> NOW()::date - INTERVAL '7 day'
The next question is the foundational “Who’re the present lively customers of Forem.”
SELECT DISTINCT dow.user_id FROM (
SELECT customers.id AS user_id,
extract(isodow from page_views.created_at) AS day_of_week
FROM customers
INNER JOIN page_views
ON page_views.user_id = customers.id
AND page_views.created_at::date
> NOW()::date - INTERVAL '7 day'
AND user_id IS NOT NULL
-- Prolong the window for customers only a bit to account --
-- for timing variance --
WHERE customers.updated_at::date
> NOW()::date - INTERVAL '8 day'
GROUP BY customers.id, day_of_week) AS dow
GROUP BY dow.user_id
HAVING depend(dow.day_of_week) >= 4
I “saved” the above question to https://dev.to/admin/blazer/queries/717-regular-and-active-recent-users-of-dev. We now have our “who’s the presently lively customers of DEV.to” question.
Querying Lively Customers Who Have Commented on a Welcome Submit
The following half is to work out who all commented on a welcome put up. In Stroll By way of of Utilizing Postgresql and Blazer to Generate a Cohort Report, I wrote about discovering the customers who had commented on the welcome article.
Nevertheless, I want to regulate the cohort question; I solely need customers who commented on the welcome put up. The cohort question has customers who did and didn’t touch upon the welcome put up.
As a fast reminder, the results of the next question is all user_id
that commented on a welcome put up; however with a limitation on the person’s updated_at
SELECT DISTINCT feedback.user_id AS user_id
FROM feedback
INNER JOIN customers
ON feedback.user_id = customers.id
-- Prolong the window for customers only a bit to --
-- account for timing variance --
AND customers.updated_at::date
> NOW()::date - INTERVAL '8 day'
INNER JOIN articles
ON feedback.commentable_id = articles.id
AND feedback.commentable_type = 'Article'
AND articles.title LIKE 'Welcome Thread - v%'
AND articles.revealed = true
AND articles.user_id = 3
GROUP BY feedback.user_id
Now to meld the 2 queries. I’m utilizing the Postgresql WITH
assertion to create two queries that I can reference afterward. I discover the WITH
assertion to assist “encapsulate” queries and hopefully make them extra conceptually comprehensible.
WITH cow AS (
-- Consumer IDs of latest of us who've commented on the
-- welcome threads --
SELECT DISTINCT feedback.user_id AS user_id
FROM feedback
INNER JOIN customers ON feedback.user_id = customers.id
-- Prolong the window for customers only a bit to account --
-- for timing variance --
AND customers.updated_at::date > NOW()::date - INTERVAL '8 day'
INNER JOIN articles
ON feedback.commentable_id = articles.id
AND feedback.commentable_type = 'Article'
AND articles.title LIKE 'Welcome Thread - v%'
AND articles.revealed = true
AND articles.user_id = 3
GROUP BY feedback.user_id
), dow AS (
-- Consumer IDs of oldsters who've interacted not less than 4 completely different
-- days of this week --
SELECT user_id FROM (
SELECT customers.id AS user_id,
extract(isodow from page_views.created_at) AS day_of_week
FROM customers
INNER JOIN page_views
ON page_views.user_id = customers.id
AND page_views.created_at::date >
NOW()::date - INTERVAL '7 day'
AND user_id IS NOT NULL
-- Prolong the window for customers only a bit to account for
-- timing variance --
WHERE customers.updated_at::date >
NOW()::date - INTERVAL '8 day'
GROUP BY customers.id, day_of_week
) AS dows
GROUP BY user_id
HAVING COUNT(day_of_week) >= 4
)
SELECT COUNT(dow.user_id) AS count_of_users,
(
SELECT COUNT(*)
FROM dow
INNER JOIN cow
ON cow.user_id = dow.user_id
) AS count_of_users_that_said_hello
FROM dow
Conclusion
After I was writing the final question, I stored getting a consequence that stated each lively person on the positioning had commented on a welcome article. I didn’t belief that consequence; it appeared extremely inconceivable. I revisited my queries and logic, discovered my error, reworked the question and received a extra affordable reply.
What was improper? I had copied and pasted a question from Stroll By way of of Utilizing Postgresql and Blazer to Generate a Cohort Report. However that question wasn’t the precise factor to ask. It does spotlight one problem of SQL; it may be arduous to check the correctness of your question.