Commit df39adb

365 sql just exploring
1 parent d5462ca commit df39adb

5 files changed

+487
-0
lines changed

@@ -0,0 +1,106 @@
/*
Mock Interview 1)

-- table name: post_events

-- user_id int
-- created_at datetime
-- event_name varchar
-- (enter, post, cancel)

Question: What information would you like to start off by pulling to get an overall understanding of
the post feature?

Possible Answer:
We might want to get an idea of OVERALL HEALTH.
+ Total number of posts (and number of enters)
+ Posts Made by Date
+ Success Rate
+ Cancel Rate

*/

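/*
A minimal sketch of the overall-health pulls listed above. It assumes event_name takes the
values 'enter', 'post', and 'cancel' (the values the queries in this file filter on) and that
the rates are measured against the number of enters. NULLIF is an added guard, not part of the
original notes.
*/

```sql
-- Overall totals and rates across the whole table.
-- NULLIF turns a zero denominator into NULL instead of raising a division-by-zero error.
SELECT COUNT(CASE WHEN event_name = 'enter'  THEN 1 END) AS total_enters,
       COUNT(CASE WHEN event_name = 'post'   THEN 1 END) AS total_posts,
       COUNT(CASE WHEN event_name = 'cancel' THEN 1 END) AS total_cancels,
       COUNT(CASE WHEN event_name = 'post'   THEN 1 END) * 100.0 /
       NULLIF(COUNT(CASE WHEN event_name = 'enter' THEN 1 END), 0) AS percent_success,
       COUNT(CASE WHEN event_name = 'cancel' THEN 1 END) * 100.0 /
       NULLIF(COUNT(CASE WHEN event_name = 'enter' THEN 1 END), 0) AS percent_cancel
FROM post_events;

-- Posts made by date (created_at is a datetime, so truncate it to a date first).
SELECT CAST(created_at AS DATE) AS created_date,
       COUNT(*) AS posts_made
FROM post_events
WHERE event_name = 'post'
GROUP BY CAST(created_at AS DATE)
ORDER BY created_date;
```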
/***** Success Rate *****/
-- success rate by date
-- date | success rate = number of posts / number of enters

-- explanation in English
-- group by date
-- count of number of posts / count of number of enters

-- created_at is a datetime, so cast it to a date before grouping
SELECT CAST(created_at AS DATE) AS created_date,
       COUNT(CASE WHEN event_name = 'post' THEN 1 ELSE NULL END) * 1.00 /
       COUNT(CASE WHEN event_name = 'enter' THEN 1 ELSE NULL END) * 100 AS percent_success
FROM post_events
GROUP BY CAST(created_at AS DATE)
ORDER BY created_date;

/*
Question: When success rates are low, how can we diagnose the issue?

Possible Answer: There are several approaches to this problem. One is to plot success rate against
created_at and check whether there is a pattern during a certain period of time, such as a one-off dip.
Based on this information, we can look further into whether there is an underlying issue in the
application, or whether a particular group of users is causing the unsuccessful posts.
*/

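/*
One way to follow up on the "group of users" idea, as a rough sketch: compute the success rate
per user and look at the users with the lowest rates. The minimum of 10 enters and the LIMIT of 20
are hypothetical cutoffs chosen only to keep the output readable.
*/

```sql
-- Per-user success rate, lowest first.
SELECT user_id,
       COUNT(CASE WHEN event_name = 'enter' THEN 1 END) AS enters,
       COUNT(CASE WHEN event_name = 'post'  THEN 1 END) AS posts,
       COUNT(CASE WHEN event_name = 'post'  THEN 1 END) * 100.0 /
       NULLIF(COUNT(CASE WHEN event_name = 'enter' THEN 1 END), 0) AS percent_success
FROM post_events
GROUP BY user_id
-- hypothetical threshold: ignore users with too few enters to judge
HAVING COUNT(CASE WHEN event_name = 'enter' THEN 1 END) >= 10
ORDER BY percent_success ASC, enters DESC
LIMIT 20;
```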
/*
Questions:

What are the success rates by day?

Which day of the week has the lowest success rate?
*/

-- group by dow of created_at
-- average of percent_success
-- order by percent_success
-- day | percent_success

WITH created_events AS (
    -- daily success rate; created_at is a datetime, so group on its date part
    SELECT CAST(created_at AS DATE) AS created_date,
           COUNT(CASE WHEN event_name = 'post' THEN 1 ELSE NULL END) * 1.00 /
           COUNT(CASE WHEN event_name = 'enter' THEN 1 ELSE NULL END) * 100 AS percent_success
    FROM post_events
    GROUP BY CAST(created_at AS DATE)
)

SELECT EXTRACT(dow FROM created_date) AS dow,
       AVG(percent_success) AS avg_percent_success
FROM created_events
GROUP BY 1
ORDER BY 2 ASC;

/*
Question: What could be a problem if we're aggregating on percent success?

Possible Answer: Averaging the daily percentages ignores the underlying distribution behind them.
Every date gets equal weight, so a day with very few enters counts as much as a day with many.
*/

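/*
A small worked example of the issue, using made-up numbers for two Mondays (the dates and counts
are illustrative only, not taken from the real table). The average of the daily percentages comes
out at 50, while the pooled rate over all enters is about 89.2.
*/

```sql
WITH daily (created_date, enters, posts) AS (
    VALUES (DATE '2024-01-01', 10, 1),      -- 10% success on a low-traffic day
           (DATE '2024-01-08', 1000, 900)   -- 90% success on a high-traffic day
)
SELECT AVG(posts * 100.0 / enters)      AS avg_of_daily_percents,  -- each day weighted equally
       SUM(posts) * 100.0 / SUM(enters) AS pooled_percent          -- each enter weighted equally
FROM daily;
```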
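-- pool the raw events per day of week instead of averaging daily percentages;
-- this reads from post_events because the created_events CTE has no event_name column
SELECT EXTRACT(dow FROM created_at) AS dow,
       COUNT(CASE WHEN event_name = 'post' THEN 1 ELSE NULL END) * 1.00 /
       COUNT(CASE WHEN event_name = 'enter' THEN 1 ELSE NULL END) * 100 AS percent_success
FROM post_events
GROUP BY 1
ORDER BY 2 ASC;
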
/*************************** Using actual table *******************************/

SELECT *
FROM interviews.post_events
LIMIT 10;

SELECT EXTRACT(dow FROM created_at) AS dow,
       COUNT(CASE WHEN event_name = 'post' THEN 1 ELSE NULL END) * 1.00 /
       COUNT(CASE WHEN event_name = 'enter' THEN 1 ELSE NULL END) * 100 AS percent_success
FROM interviews.post_events
GROUP BY 1
ORDER BY 2 ASC;
@@ -0,0 +1,50 @@
/*
Mock Interview 2)

Question:

-- 1) Find the date with the highest total energy consumption from our datacenters.
-- 2) Output the date along with the total energy consumption across all datacenters.

-- Table: eu_energy
date datetime
consumption int

-- Table: asia_energy
date datetime
consumption int

-- Table: na_energy
date datetime
consumption int

*/

/******

Always make sure you understand the tables and columns correctly. Ask the interviewer if you need to make any
assumptions about the column data, etc.

1) So there are 3 tables representing energy consumption across different continents.
Can I assume that there is only one energy consumption value for each date, or can there be multiple values
for a specific date? Can there be missing values too?

We can then sum the consumption per date across the different tables.
*****/

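/*
A possible sketch of the "sum it up" step, assuming the three tables share the date and consumption
columns described in the question. UNION ALL is used rather than UNION so that identical rows from
different tables are not collapsed; the CTE name all_energy is made up for illustration.
*/

```sql
WITH all_energy AS (
    SELECT date, consumption FROM eu_energy
    UNION ALL
    SELECT date, consumption FROM asia_energy
    UNION ALL
    SELECT date, consumption FROM na_energy
)
-- Total consumption per calendar date across all datacenters.
-- SUM skips NULL consumption values, and repeated dates are summed together.
SELECT CAST(date AS DATE) AS consumption_date,
       SUM(consumption)   AS total_consumption
FROM all_energy
GROUP BY CAST(date AS DATE)
ORDER BY total_consumption DESC;
```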
/*
Question: What would you do if in the first table there are two of the same dates with
different energy consumptions?

Possible Answer:
Then we can group by date and sum the energy consumption.
*/

/****

Clarification question back to the interviewer:
Can multiple dates share the same highest total energy consumption?
****/
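/*
If ties are possible, one option (a sketch, not the only approach) is to compare each date's total
against the maximum total instead of using LIMIT 1, so every tied date is returned. The CTE names
all_energy and daily_totals are hypothetical.
*/

```sql
WITH all_energy AS (
    SELECT date, consumption FROM eu_energy
    UNION ALL
    SELECT date, consumption FROM asia_energy
    UNION ALL
    SELECT date, consumption FROM na_energy
),
daily_totals AS (
    SELECT CAST(date AS DATE) AS consumption_date,
           SUM(consumption)   AS total_consumption
    FROM all_energy
    GROUP BY CAST(date AS DATE)
)
-- Return every date whose total equals the highest total, so ties are kept.
SELECT consumption_date,
       total_consumption
FROM daily_totals
WHERE total_consumption = (SELECT MAX(total_consumption) FROM daily_totals)
ORDER BY consumption_date;
```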