This week we will continue our exploration knowledge of SQL by looking at some slightly more complex SQL Queries, how to analyze data and explore analytic SQL Functions.
This week we will look at:
- Analyzing data using basic SQL statistical functions
- Look at a subset of the analytical and window SQL functions
Exercise 1 – Analyzing data
Use the statistical function to answer the following question, using the EMPT, DEPT, etc tables.
[The following exercises will take approx 25-35 minutes to complete]
1-1 – What is the maximum and minimum salary
1-2 – What is the difference between the maximum and the average salary
1-3 – Count the number of employees per grade
1-4 – What is the average salary per department
1-5 – Select the department and number of employees, who have a department count greater than 2
1-6 – List the employees who have a salary greater than the average salary
1-7- Use the SUMMARY package to gather statistics
Use the in-database package to gather summary statistics about one of the attributes in a table. This package and function is called DBMS_STAT_FUNCS.SUMMARY. Check out my blog post about this function.
Exercise 2 – Analytical SQL Functions
This lab exercise requires you to practice using some of the hundreds of analytical SQL functions. In addition to the exercises/tasks listed below, you should spend some time exploring and experimenting with the many of these analytical SQL functions
[Open the Oracle SQL Language Reference Manual = RTFM]
[Also see the examples in the slides]
[50-75 minutes to complete Tasks 1-6. Complete remaining Tasks if time or as homework]
Task 1 – Practice using basic SQL Stats functions
Experiment with using the following stats functions in SQL, on the data sets you have in your Oracle Schema
AVG(): returns the mean
COUNT(): returns the population (or sample, depending on the row source)
POWER(a,n): returns the value of a to the nth power
SQRT(n): returns the square root of n
SUM(): returns the sum of the values in a set
STDDEV(): returns the standard deviation of a sample
STDDEV_POP(): returns the standard deviation of a population
VARIANCE(): returns the variance of a sample
VAR_POP(): returns the variance of a population
CORR(): correlation analysis
Task 2 – Having Clause
Take the SQL stats functions used in Task 1 and modify them to include the HAVING Clause.
You will need to think carefully on How and Why you are using the HAVING clause in each case.
What are you trying to show or demonstrate in each case.
Write 1-3 line description of what the benefit is for using the HAVING clause for each example.
Task 3 – Sampling data
Sample = 5%
Sample = 10%
Sample = 15%
Sample = 20%
What differences can you find and/or observations that you see. Can you explain these?
Task 4 – PIVOT
Write a query that returns the average salary per department (full name)
Rewirte the query using the PIVOT command to display the display the departments as columns in the output.
Write a query that returns some of the details of each employee, ordered by salary in ascending order
Alter the query to add an additional column that displays the salary of the previous person in the results set
Alter the query to add an additional column that displays the salary of the next person in the results set
Add the following columns
– Difference in salary between the employee and the person who earns just below them
– Difference in salary between the employee and the person who earns just more than them
Task 6 – Ranking Staff
Using your basic query that returns the details of employees and their salary
– Use the RANK function applied to all the data
– Use the PERCENT_RANK function applied to all the data
– Use the CUME_DIST function applied to all the data
Using the PARTITION BY clause change the above queries to rank the results based on each department.
Exercise 3 (Optional)
Perform some regression analysis on the Employee salary data using the following function
Alter these questions to partition the results by
These should be done as separate queries
Use the ROLLUP function, on your employee salary query, to calculate aggregated results by
– department and job_id
Modify your previous query to use the CUBE function install of the ROLLUP function.
Examine how the results are different between the usage of the ROLLUP and CUBE functions.
Can you think of other scenarios when you would use one of these functions instead of the other.