In previous article on SQL, I focused only on some basic queries. This article will talk about the built-in-functions. Built-in-functions are used in the SQL statement to speed up the data retrieval process. These functions help shape up the data more precisely, so significantly reduce the output size. We can write functions as well but focus of this article is built-in-functions in SQL. Here I am discussing few functions in detail:
Aggregate functions take an entire column or an entire row and output a single value or null value. Some examples of aggregate functions are sum(), max(), min(), avg() etc.
It’s time to see some use cases. We already have a petsale table in our database. I used IMB DB2-warehouse on a cloud database for today’s work also. Click here to get the table. By running this statements you will get the exact same table that I will use here. Or use any other table you have available to practice. Petsale table looks like this:
Let’s use this table to see how we can use aggregate functions. Starting with sum function. This is how we sum up all the values in SALEPRICE column. Say, we want to know the total revenue from the pet sale.
Select SUM(SALEPRICE) from PETSALE
This was my output. I cut only the result portion of the page to display here. Full page won’t look clear.
What if we need to have that sum in a new column called sum_of_saleprice.
Select sum(SALEPRICE) as SUM_OF_SALEPRICE from PETSALE
Output table looks like this:
Get the maximum quantity of any animal:
Select max(QUANTITY) from PETSALE
Here we get the max quantity:
All of the functions can be used on the subset of a data table as well. For example, get the minimum value of ID column for Cats:
Select min(ID) from PETSALE where ANIMAL = ‘Cat’
This is the result. Min cat id is 1.
I am providing some more examples here. Specify the average value of all the dogs in.
Select avg(SALEPRICE) from PETSALE where animal = ‘Dog’
Output of the query shoes the average dog price was $266.
We can perform mathematical operations between the columns and use average function on that. This kind of complex queries are very helpful to minimize steps and achieve more better results. Here is an example:
Select avg(SALEPRICE / QUANTITY) from PETSALE where ANIMAL = ‘Dog’
Scaler and String Functions
These functions perform their operations on every element of the selected portion of the table. Some examples of scaler and string functions are: length(), round(), ucase, lcase. It’s better understandable with examples. Here are some examples:
Round Up or Down all the values in SALEPRIE column
Select round(saleprice) from petsale
Retrieve the length of each value in animal. It may not seem too important in this table but when dealing with text data this function may be useful.
Select length(animal) from petsale
Retrieve animal values in uppercase
Select ucase(animal) from petsale
We can also use two function in one statement. Suppose we want to find out what are name of animals that we sold till now and we want to get the result in uppercase. Here is how you should write the queries:
Select distinct(ucase(animal)) from petsale
Date, Time Functions
Most Databases have unique datatypes for dates and times. The general format for dates and times are:
Date: YYYYMMDD YearMonthDay
Time: HHMMSS HourMinuteSecond
Timestamp: YYYYXXDDHHMMSSZZZZZZ Year Month Day Hour Minute Second Microsecond
Commonly used Date, time functions are Year(), Month(), Day(), DAYOFMONTH(), DAYOFWEEK(), DAYOFYEAR(), WEEK(), HOUR(), MINUTE(), SECOND(). Time to practice with some example again.
Extract the day portion from a date:
Select day(saledate) from petsale where animal=’Dog’
We need to know how was the sales performance in the month of May.
Select count(*) from petsale where month(saledate) = ‘05’
Just for a practice, let’s find the date of 3 days later from each sale date.
Select (saledate + 3 days) from petsale
Find out how many days have passed since each sale date till now
Select (current_date – saledate) from petsale
Output of this query is gong to be is days:
Sub-queries and Nested Selects
A sub query is a query nested inside another query. Almost two full length query to perform one big task. Basic syntax may look something like this:
select column1 from table Where column2 = (select max(column2) from table)
Some real examples will make it clear. If you saw my previous article on SQL, you already know about employees table. If not click here to get the SQL statement to create the employees table and also a csv file called employees.csv for the rest of the data. This is how my employees table looks like:
Now I will this table to work on some nested queries. We need to make a list containing the information of employees who earn more than the average salary. This is how I did it:
Select emp_id, f_name, l_name, salary from employees where salary < (select avg(salary) from employees);
Substituting column name with a sub_query is called a column expressions.
Select emp_id, salary, (select avg(salary) from employees) as avg_salary from employees
This will output emp-id, corresponding salary and a new column named avg_salary :
Substitute the table name with a sub-query called derived tables or table expressions
Select * from (select emp_id, f_name, l_name, dep_id from employees) as emp4all;
Output will be a new table named emp4all that has emp_id, f_name, l_name, dep_id columns from employees table.