how to write subqueries in SQL

In SQL a lot of the times there will be a need to write subqueries that will extract data that you can analyse.

But what if you want to use the data as input to another query that can then be used to give you the result you desire?

This is is the power of using a subquery, the subquery can gather the information you need, the main query will then provide the output.

But why would somebody approach it this way?

(A) Complex SQL and joins are used in the sub query.

(B) The main query that uses the sub query , summaries the data.

(C) Lots of data is returned in the subquery, the programmer chooses out of it what they want.

(D) Flexibility – You can adjust the main query to pull out what you need from it, not everything is required to be used, but can be quickly if a decision is made to utilise that data.

(E) As long as the subquery does not need to be changed at all, it can reduce maintenance and coding required to keep it up to date.

So what are the different ways we can use sub queries?

We will use these table values to return output that will allow us to provide output for analysis.

CUSTOMER

SALES

Selecting specific values that you only want to look at

In the below code the second select statement is the subquery. In this instance, we are using the output of that as an input to the first query on the left-hand side of the equals sign.

This therefore allows us to alter the right hand side for the values we want to return in the output, by just adding customer nos.

Note by adding customer nos we will need to change the equals sign, that will be covered off in the next section

select * FROM DBO.SALES
where CUSTOMER_NO = (select CUSTOMER_NO FROM dbo.CUSTOMER WHERE CUSTOMER_NO = '123456' )

Giving the below output:

Returning more than one value from the sub query

In the above section, we where focused on returning one value. That used the operator “equals”.

When we have more than one value to return that will not work, hence we use the like operator in the sub query.

As a result of this, the sub query will now return more than one value, this is the desired output we are looking for

select * FROM DBO.SALES
where CUSTOMER_NO IN (select CUSTOMER_NO FROM dbo.CUSTOMER WHERE INVOICE_NO LIKE 'INV%')

The output now that will be produced is as follows:

As can be seen the subquery returns through the wild card ‘INV%’ all those rows with that value. Then the main query will match all the customers it has with the customer nos returned in the subquery to produce the final output.

Use a subquery to UPDATE values

There may be a scenario where a subquery is used to update values on a table.

The subquery below checks for a NULL INVOICE_NO and the customer no = 654654, and if both are true, then the main query updates the table column INVOICE_NO value to ‘ERROR’ where CUSTOMER_NO = 654654.

SELECT * FROM dbo.SALES;
UPDATE dbo.Sales
SET INVOICE_NO = 'ERROR'
WHERE CUSTOMER_NO in (SELECT CUSTOMER_NO FROM dbo.SALES where INVOICE_NO is null AND CUSTOMER_NO = 654654);
SELECT * FROM dbo.SALES;

Before the update is applied:

After the update is applied:

Use a subquery to DELETE values

The final use of a subquery we will use is to look at a scenario where we want to delete values from the database table.

If we look at the above output from the previous section, we will see that the below code should remove rows 5-10 and 12.

Here we will use:

DELETE FROM dbo.Sales
WHERE CUSTOMER_NO in (SELECT CUSTOMER_NO FROM dbo.SALES where INVOICE_NO is null);
SELECT * FROM dbo.SALES;



Have you seen these great posts?

how to groupby in a select statement
how to update records in sql
Deleting table records in SQL

Deleting table records with SQL

Estimated reading time: 4 minutes

Often you will hear about deleting table records with SQL, but how is this achieved, if you have not done it before?

In previous videos on SQL we discussed how to update records in SQL , how to insert data into a table in SQL.

You could be faced with the scenario where there is data that you no longer need, as a result, it needs to be deleted.

So lets work through three different scenarios as follows:

(A) Delete a record based on a unique identifier, in this instance CUSTOMER_NO.

(B)Delete multiple customer records at once using CUSTOMER_NO as the column that will identify the records.

(C) Delete a record where it equals a particular column value.

Our initial dataset will look like this:

This image has an empty alt attribute; its file name is image-14.png

Delete a record based on a unique identifier

DELETE FROM dbo.CUSTOMER WHERE CUSTOMER_NO = 987981

select * from dbo.CUSTOMER

When the above code is run it gives the following output.

As can be seen it now has ten records and the row for CUSTOMER_NO = 987981 has been removed.

This image has an empty alt attribute; its file name is image-15.png

Delete multiple customer records at once

In the above scenario we have deleted just one record, but most likely you will want to remove more than one.

This code below will complete this at once. Remember the column we are using is a primary key column, hence all the values are unique.

It may be the case that you are doing to have to do this on a column that does not have unique values, the code will still work.

I would caution though that using the primary key column value always allows you to only remove the ones that are not needed.

It could be the case that there are genuine values that need to remain, even if they are duplicated.

DELETE FROM dbo.CUSTOMER WHERE CUSTOMER_NO in(3,25897458,25898633)

select * from dbo.CUSTOMER
This image has an empty alt attribute; its file name is image-16.png

As can be seen running the above code now leaves us with seven rows, down from ten and the three rows we had in the SQL have been removed.

Deleting a record where it equals a particular column value

So our final scenario is where we need to find a particular value in a column , and it is exists , then delete the whole record for that row.

Here we are not relying on the primary key, but basically a string search in the customer_type column.

When this SQL is run, it supposed remove two rows and leave us with five records.

The problem that you will encounter

There will be a problem with this as the primary key , even though is not been referenced will throw an error.

The error in this instance relates to the the CUSTOMER_NO on the table customer

  • It has a foreign key equivalent column in the sales table.

In essence you cant leave out the primary key value when completing this SQL statement.

As a result both tables need to be updated, where the primary key value exists.

DELETE FROM dbo.CUSTOMER WHERE customer_type = 'POSTAL'

select * from dbo.CUSTOMER
** the above code will fail with the below error:
The DELETE statement conflicted with the REFERENCE constraint "FK__SALES__CUSTOMER___267ABA7A". The conflict occurred in database "SALES", table "dbo.SALES", column 'CUSTOMER_NO'.
The statement has been terminated.

Our table in sales looks like this:

This image has an empty alt attribute; its file name is image-17.png

Updating our logic as follows:

DELETE FROM dbo.SALES
where SALES.CUSTOMER_NO = 123458 ---> This line needs to be run first as the next line checks if this exists as a foreign key.
DELETE FROM dbo.CUSTOMER
where CUSTOMER.CUSTOMER_NO = 123458
and customer_type = 'POSTAL'


select * from dbo.CUSTOMER
select * from dbo.SALES

Results in:

This image has an empty alt attribute; its file name is image-18.png

To summarise there are a number of things to consider:

  • It is important to always check if there is a primary and foreign key relationship.
  • Specifically not checking these relations will result in the SQL outputting an error.
  • Under these circumstances a review of all tables before proceeding is advised.
  • Ensure you have your tables backed up before doing any deletions!

how to update records in SQL

Estimated reading time: 3 minutes

There are multiple reasons for wanting to update records in a database table, from data quality to enriching your data for any machine learning projects.

One example maybe where you want to add a creation date to a record.

In this blog post we will look at how to update database records, and also where you try to apply updates and they cannot be completed.

The where clause will become important as it allows you to identify specific records.

It will be updating the records below:

Updating a single record

--Update one record on its own
UPDATE dbo.CUSTOMER
SET customer_type = 'WEB'
WHERE CUSTOMER_NO = 125872;

Running the above code gives this output, resulting in the value been updated per our SQL statement.

Updating multiple rows at once

There maybe scenarios where you want to have multiple records/rows updated at once, the below code will provide you with an example of how this can be achevied:

--Update multiple rows at once
UPDATE dbo.CUSTOMER
SET customer_type = 'WEB'
WHERE CUSTOMER_NO in(333333,456789)

This will now give you updated values in the output below. It is a very handy way to at once get multiple records updated.

Show an update that is valid but will not change anything

--Show an update request that does not complete
UPDATE dbo.CUSTOMER
SET CUSTOMER_NO = 2,customer_type = 'ERROR'
WHERE CUSTOMER_NO = 456789222 ---> will not execute as this value does not exist, it ignores.

With the above code, we have a valid update statement. In this scenario it will not complete as the CUSTOMER_NO does not exist on the database table.

In essence the output is exactly the same as in the previous example.

Update multiple column values

In our final example here, we are going to show how to update the CUSTOMER_No and the customer_type values.

The below SQL will change both values, but it is important to note that as the CUSTOMER_NO is a primary key, it cannot be changed to a number that already exists as primary keys can’t be null or have duplicates.

UPDATE dbo.CUSTOMER
SET CUSTOMER_NO = 3,customer_type = 'ERROR'
WHERE CUSTOMER_NO = 333333

So to wrap up there are a number of ways to update your database table. Here we have introduced some examples. There are also some additional steps you could take:

  1. Count all the records before and after to ensure they remain the same.
  2. Create a copy of the table before updates are applied, and then compare that to the updated table. The differences should be what you expect to see.

how to write subqueries in SQL

how to groupby in a select statement

how to insert data into a table in SQL

how to join tables in SQL