how to write subqueries in SQL

In SQL a lot of the times there will be a need to write subqueries that will extract data that you can analyse.

But what if you want to use the data as input to another query that can then be used to give you the result you desire?

This is is the power of using a subquery, the subquery can gather the information you need, the main query will then provide the output.

But why would somebody approach it this way?

(A) Complex SQL and joins are used in the sub query.

(B) The main query that uses the sub query , summaries the data.

(C) Lots of data is returned in the subquery, the programmer chooses out of it what they want.

(D) Flexibility – You can adjust the main query to pull out what you need from it, not everything is required to be used, but can be quickly if a decision is made to utilise that data.

(E) As long as the subquery does not need to be changed at all, it can reduce maintenance and coding required to keep it up to date.

So what are the different ways we can use sub queries?

We will use these table values to return output that will allow us to provide output for analysis.

CUSTOMER

SALES

Selecting specific values that you only want to look at

In the below code the second select statement is the subquery. In this instance, we are using the output of that as an input to the first query on the left-hand side of the equals sign.

This therefore allows us to alter the right hand side for the values we want to return in the output, by just adding customer nos.

Note by adding customer nos we will need to change the equals sign, that will be covered off in the next section

select * FROM DBO.SALES
where CUSTOMER_NO = (select CUSTOMER_NO FROM dbo.CUSTOMER WHERE CUSTOMER_NO = '123456' )

Giving the below output:

Returning more than one value from the sub query

In the above section, we where focused on returning one value. That used the operator “equals”.

When we have more than one value to return that will not work, hence we use the like operator in the sub query.

As a result of this, the sub query will now return more than one value, this is the desired output we are looking for

select * FROM DBO.SALES
where CUSTOMER_NO IN (select CUSTOMER_NO FROM dbo.CUSTOMER WHERE INVOICE_NO LIKE 'INV%')

The output now that will be produced is as follows:

As can be seen the subquery returns through the wild card ‘INV%’ all those rows with that value. Then the main query will match all the customers it has with the customer nos returned in the subquery to produce the final output.

Use a subquery to UPDATE values

There may be a scenario where a subquery is used to update values on a table.

The subquery below checks for a NULL INVOICE_NO and the customer no = 654654, and if both are true, then the main query updates the table column INVOICE_NO value to ‘ERROR’ where CUSTOMER_NO = 654654.

SELECT * FROM dbo.SALES;

UPDATE dbo.Sales
SET INVOICE_NO = 'ERROR'
WHERE CUSTOMER_NO in (SELECT CUSTOMER_NO FROM dbo.SALES where INVOICE_NO is null AND CUSTOMER_NO = 654654);


SELECT * FROM dbo.SALES;

Before the update is applied:

After the update is applied:

Use a subquery to DELETE values

The final use of a sub query we will use is to look at a scenario where we want to delete values from the database table.

If we look at the above output from the previous section, we will see that the below code should remove rows 5-10 and 12.

Here we will use:

DELETE FROM dbo.Sales
WHERE CUSTOMER_NO in (SELECT CUSTOMER_NO FROM dbo.SALES where INVOICE_NO is null);

SELECT * FROM dbo.SALES;

And this will give us the following updated table:

To summarise the sections above:

(A) Using a sub query has many applications, and its use is varied.

(B) Using a sub query can be useful to help altering table values.

(C) Sub queries can be as complex or as simple as you want to make them.

how to update records in SQL

Estimated reading time: 3 minutes

There are multiple reasons for wanting to update records in a database table, from data quality to enriching your data for any machine learning projects.

In this blog post we will look at how to update database records, and also where you try to apply updates and they cannot be completed.

The where clause will become important as it allows you to identify specific records.

It will be updating the records below:

Updating a single record

--Update one record on its own
UPDATE dbo.CUSTOMER
SET customer_type = 'WEB'
WHERE CUSTOMER_NO = 125872;

Running the above code gives this output, resulting in the value been updated per our SQL statement.

Updating multiple rows at once

There maybe scenarios where you want to have multiple records/rows updated at once, the below code will provide you with an example of how this can be achevied:

--Update multiple rows at once
UPDATE dbo.CUSTOMER
SET customer_type = 'WEB'
WHERE CUSTOMER_NO in(333333,456789)

This will now give you updated values in the output below. It is a very handy way to at once get multiple records updated.

Show an update that is valid but will not change anything

--Show an update request that does not complete
UPDATE dbo.CUSTOMER
SET CUSTOMER_NO = 2,customer_type = 'ERROR'
WHERE CUSTOMER_NO = 456789222 ---> will not execute as this value does not exist, it ignores.

With the above code, we have a valid update statement. In this scenario it will not complete as the CUSTOMER_NO does not exist on the database table.

In essence the output is exactly the same as in the previous example.

Update multiple column values

In our final example here, we are going to show how to update the CUSTOMER_No and the customer_type values.

The below SQL will change both values, but it is important to note that as the CUSTOMER_NO is a primary key, it cannot be changed to a number that already exists as primary keys can’t be null or have duplicates.

UPDATE dbo.CUSTOMER
SET CUSTOMER_NO = 3,customer_type = 'ERROR'
WHERE CUSTOMER_NO = 333333

So to wrap up there are a number of ways to update your database table. Here we have introduced some examples. There are also some additional steps you could take:

  1. Count all the records before and after to ensure they remain the same.
  2. Create a copy of the table before updates are applied, and then compare that to the updated table. The differences should be what you expect to see.

how to create and drop a table in SQL

If you are working on data analytics , and have access to databases that you can load data into and manipulate, then read on.

There maybe a reason for you have your own partitioned table with data in it, but you need to know how to create a database.

The creation of a table can be achieved as follows;

  • Decide on the table name.
  • Within the table you will have columns, they need to be named.
  • Next for each column name a datatype needs to be decided upon.

It is important to note that the data type you decide is very important before you create the table.

This is because once you start loading data into the table if you decide to change the data type for a column, it could change the data contained within that column.

As a result a review of the data to be loaded should be completed, looking at the type of data, its length and any attributes of the data that need careful consideration.

Lets create our table and talk through the different aspects of the SQL used.

So in the below we have the create table statement

The SQL is broken down as follows:

  1. The first line creates the table within the schema. It has its own unique area in there.
  2. Lines 2,3,4 create the columns with their datatypes.

When this is run , the schema is updated, one thing to note it is empty, and requires an insert statement to populate the columns.

And if we do a select all from the table, it returns empty:

How to drop a table that you have created?

Sometimes you may have a need to drop tables from your schema.

Before proceeding there are a couple of things to consider:

  • Have you backed up the data contained within it?
  • You understand the deletion is permanent ( unless you can rebuild from backup)
  • Any batch runs that run off the table will fail once it is deleted.
  • Any tables that have queries that join to the table been removed will fail.
  • This might be part of a wider database cleanup and may improve performance.

Below is the one line of code ( yes you are reading that correct), that will delete the table.

In most roles the ability to drop tables will be restricted by the database administrator.

After the deletion it will look like this. As you can see the table no longer exists.

In conclusion the deletion of a database table is very easy to do , but comes with caveats as outlined above, so proceed with caution.

How to data cleanse a database table

In Data Analytics, that is a very relevant question, and something I look to implement in most projects, sometimes it is too easy to click the shortcut icon to your spreadsheet application!

Here we are looking to bring a little automation into these videos. Building on How to import data from files and Removing characters from an imported CSV file this video connects to a Microsoft Azure cloud database table, brings in the data with errors on it, fixes the errors and displays the correct output on the screen.

What can this do for organisations?

There are several benefits to automating this step:

  • Less manual intervention if there is a need to fix data issues.
  • Better productivity.
  • Better data flows with no errors and quicker reporting.

 

Moving away from files

The process of moving away from files and into automation has several steps:

  • Be clear on your data needs.
  • Understand what you are trying to achieve.
  • Build a process that is repeatable but can be updated easily.
  • Ensure that you build in data quality checks, helps deliver the better output to the users.

Thanks for stopping by!

Data Analytics Ireland