Querying Teradata and SQL Server - Tutorial 2 Filtering and Sorting

2460 views 2 comments posted at about 3 years ago Raymond

Tutorial –1

For Teradata developers, if you have no SQL Server installed, please go to the following link to download the SQL Server 2014 Expression Edition.

http://www.microsoft.com/en-us/server-cloud/Products/sql-server-editions/sql-server-express.aspx

image

It is easy to get started and free to use. Follow the ‘Install instruction’ section in the download page to install.

Download page: http://www.microsoft.com/en-us/download/details.aspx?id=42299

About Filtering and Sorting Data

Filtering and sorting data is very common in querying database. In this tutorial, I will cover the following items:

  • Filtering data by predicates
  • Sorting data
  • The TOP statement

Prerequisites

Querying Teradata and SQL Server - Tutorial 0: Preparation

Filter data by predicates

Task

Retrieve all the employees whose salary is below 120,000 from Employee table; return these columns: (Employee ID, Employee Name, Salary).

Query – Teradata

 DATABASE TD_MS_SAMPLE_DB;
 SELECT EmployeeID,
    EmployeeName,
    Salary
    FROM employee
    WHERE Salary < 120000;

Query – SQL Server

 USE TD_MS_SAMPLE_DB
 GO
 SELECT EmployeeID,
    EmployeeName,
    Salary
    FROM employee
    WHERE Salary < 120000;

Result

image

Task

Retrieve the following employees from Employee table; return these columns: (Employee ID, Employee Name, Salary).

  • Derek
  • Nancy
  • David

Query – Teradata

 DATABASE TD_MS_SAMPLE_DB;
 SELECT EmployeeID,
    EmployeeName,
    Salary
    FROM employee
    WHERE EmployeeName =ANY ('Derek','Nancy','David');

Query – SQL Server

 USE TD_MS_SAMPLE_DB
 GO
 SELECT EmployeeID,
    EmployeeName,
    Salary
    FROM employee
    WHERE EmployeeName IN ('Derek','Nancy','David');
-- OR
 SELECT EmployeeID,
    EmployeeName,
    Salary
    FROM employee
    WHERE EmployeeName = ANY (select 'Derek' union select 'Nancy' union select 'David');

Result

image

Comments

Both Teradata and SQL Server supports IN, =ANY, NOT = ALL, = SOME operators. Sometimes, using these operators will be easier.

To learn more, visit:

https://technet.microsoft.com/en-us/library/ms187074(v=sql.105).aspx

http://www.info.teradata.com/HTMLPubs/DB_TTU_13_10/index.html#page/SQL_Reference/B035_1145_109A/ch11.15.012.html

Task

Retrieve all the employees whose age are above 50 from Employee table ; return these columns: (Employee ID, Employee Name, Age ).

SQL – Teradata

SELECT EmployeeID,
    EmployeeName,
    (EXTRACT (YEAR
    FROM CURRENT_TIMESTAMP) - EXTRACT(YEAR
    FROM Birthday)) AS Age
    FROM employee
    WHERE Age > 50;

SQL – SQL Server

 USE TD_MS_SAMPLE_DB
 GO
 SELECT EmployeeID,
    EmployeeName,
    DATEDIFF(YEAR,Birthday, CURRENT_TIMESTAMP) AS Age 
    FROM employee
    WHERE DATEDIFF(YEAR,Birthday, CURRENT_TIMESTAMP) > 50;

Result

image

Comments

In Teradata, once you give the column alias, you can use the alias in SELECT, WHERE, ORDER BY, GROUP BY… In SQL Server, you can only use it in ORDER BY statement due to different execution sequence.

In SQL Server, we write our query in the sequence of SELECT … FROM … WHERE … GROUP BY… HAVING… ORDER BY while the engine will execute in different order: FROM … WHERE … GROUP BY… HAVING… SELECT… ORDER BY.

Task

Retrieve all the employees who were born between ‘1970-01-01’ and ‘1980-01-01’ from Employee table ; return these columns: (Employee ID, Employee Name, Birthday).

SQL – Teradata

SELECT EmployeeID,
    EmployeeName,
    Birthday
    FROM employee
    WHERE Birthday BETWEEN '1970-01-01' AND '1980-01-01';

SQL – SQL Server

 USE TD_MS_SAMPLE_DB
 GO
 SELECT EmployeeID,
    EmployeeName,
    Birthday
    FROM employee
    WHERE Birthday BETWEEN '1970-01-01' AND '1980-01-01';

Result

image

Comments

When querying character data, it is important to understand whether the column you are filtering on is case sensitive. You can specify case sensitive when querying Teradata.

For example, the following query will not return any records.

SELECT EmployeeID,
    EmployeeName,
    Birthday
    FROM employee
    WHERE 'Name'='name' (CASESPECIFIC);

In SQL Server, you can specify the collations.

https://technet.microsoft.com/en-us/library/ms180175(v=sql.100).aspx

Task

Retrieve all the employees who have no managers from Employee table ; return these columns: (Employee ID, Employee Name).

SQL – Teradata

SELECT EmployeeID,
    EmployeeName
    FROM employee
    WHERE  EmployeeManagerEmployeeID IS NULL ;

SQL – SQL Server

 USE TD_MS_SAMPLE_DB
 GO
 SELECT EmployeeID,
    EmployeeName
    FROM employee
    WHERE EmployeeManagerEmployeeID IS NULL;

Result

image

Comments

Always be careful when you deal with NULLs.

Sorting data

Task

Retrieve all the employees from Employee table order by Gendar ASC, EmployeeName DESC; return these columns: (Employee ID, Employee Name).

SQL – Teradata

SELECT EmployeeID
      ,EmployeeName
      ,EmployeeManagerEmployeeID
      ,DepartmentID
      ,Gendar
      ,Birthday
      ,JobGrade
      ,Salary
  FROM Employee
  ORDER BY Gendar, EmployeeName DESC;

SQL - SQL Server

USE TD_MS_SAMPLE_DB
GO
SELECT [EmployeeID]
      ,[EmployeeName]
      ,[EmployeeManagerEmployeeID]
      ,[DepartmentID]
      ,[Gendar]
      ,[Birthday]
      ,[JobGrade]
      ,[Salary]
  FROM [dbo].[Employee]
  ORDER BY [Gendar], [EmployeeName] DESC


Result

image

The TOP Statement

Both SQL Server and Teradata support the SELECT TOP statement.

Task

Retrieve the top three employees who have the highest salary from Employee table ; return these columns: (Employee ID, Employee Name, Salary).

SQL – Teradata

SELECT TOP 3 EmployeeID
      ,EmployeeName
      ,Salary
  FROM Employee
  ORDER BY Salary DESC;

SQL – SQL Server

USE TD_MS_SAMPLE_DB
GO
SELECT TOP 3 [EmployeeID]
      ,[EmployeeName]
      ,[Salary]
  FROM [dbo].[Employee]
  ORDER BY [Salary] DESC;

Result

image

Comments

In Teradata, the TOP function is slightly different.

TOP {decimal | integer} [PERCENT] [WITH TIES]

    Some samples:
  • TOP 10 – Return the top ten rows
  • TOP 10 PERCENT – Return the top 10% of rows
  • TOP 10 WITH TIES – If more that one row has the same criteria value, return all
  • TOP 10 PERCENT WITH TIES – If more than one row has the same criteria value, return all

Task

Retrieve the 2nd, 3rd and 4th employees who have the highest salary from Employee table ; return these columns: (Employee ID, Employee Name, Salary).

SQL – SQL Server

USE TD_MS_SAMPLE_DB
GO
SELECT  [EmployeeID]
      ,[EmployeeName]
      ,[Salary]
  FROM [dbo].[Employee]
  ORDER BY [Salary] DESC
  OFFSET 1 ROW FETCH NEXT 3 ROWS ONLY;

Result

image

Comments

After SQL Server 2012, the OFFSET FETECH statement can help you retrieve only a window or page of data from the result set (must be used with ORDER BY clause ).

For the detailed syntax, please refer to https://technet.microsoft.com/en-us/library/gg699618(v=sql.110).aspx.

There is not equivalent OFFSET FETECH statement in Teradata.

Task

Retrieve 3 sample employees from Employee table ; return these columns: (Employee ID, Employee Name, Salary).

SQL – Teradata

SELECT EmployeeID
      ,EmployeeName
      ,Salary
  FROM Employee
  SAMPLE 3;

SQL – SQL Server

USE TD_MS_SAMPLE_DB
GO
SELECT  [EmployeeID]
      ,[EmployeeName]
      ,[Salary]
  FROM [dbo].[Employee] TABLESAMPLE(3 ROWS);


Result

image

Comments

Both Teradata and SQL Server supports retrieve sample data from result set (by actual number or percentage).

-- SQL Server

USE TD_MS_SAMPLE_DB
GO
SELECT  [EmployeeID]
      ,[EmployeeName]
      ,[Salary]
  FROM [dbo].[Employee] TABLESAMPLE(20 percent);

-- Teradata
SELECT EmployeeID
      ,EmployeeName
      ,Salary
      ,SAMPLEID
  FROM Employee
  SAMPLE 0.2;

The physical implementation of TOP function data in Teradata is not randomized while for SAMPLE you can control whether to return the sample data by AMP proportionally.

* About AMP:

The AMP is a vproc in the Teradata Database's shared-nothing architecture that is responsible for managing a portion of the database. Each AMP will manage some portion of each table on the system. AMPs do the physical work associated with generating an answer set which includes sorting, aggregating, formatting, and converting. The AMPs retrieve and perform all database management functions on the required rows from a table. An AMP accesses data from its single associated vdisk. AMPs is also able to redistribute a copy of the data to other AMPs.

Summary

I didn’t cover all the details for filtering and sorting data since I am only focusing on the most commonly used ones. You can learn all the others through the official tutorials.

In next tutorial, I will talk about how to combine sets in SQL Server and Teradata.

Due to limited knowledge, there might be errors/issues in my tutorials; please let me know if any information is inaccurate.

Add comment

Comments (2)

Raymond about 2 years ago Re:Querying Teradata and SQL Server - Tutorial 2 Filtering and Sorting

@Tiru INSERT INTO TARGET_TABLE SELECT * FROM SOURCE_TABLE is a standard ANSI SQL. For lock statement, it is the syntax for Teradata. ACCESS lock is similar to set transaction isolation level as READ UNCOMMITTED in other database, i.e. SQL Server.

Tiru about 2 years ago

what is the exact query in ANSI sql for below mentioned query ? LOCKING SOURCE_TABLE FOR ACCESS INSERT INTO TARGET_TABLE SELECT * FROM SOURCE_TABLE;

Tiru about 2 years ago Re:Querying Teradata and SQL Server - Tutorial 2 Filtering and Sorting

what is the exact query in ANSI sql for below mentioned query ? LOCKING SOURCE_TABLE FOR ACCESS INSERT INTO TARGET_TABLE SELECT * FROM SOURCE_TABLE;
In this Page