what is hash join



what is hash join..

Answer / guest

The Hash join will be used, if there are no adequate indexes
on the joined columns. This is a worst situation. In this
case, hash table will be created. Hash join is most
efficient when one of the tables is significantly differ in
size than another one.

The query optimizer makes a Hash join in two phases: build
and probe. So, Hash join has two inputs: the build input and
the probe input.

On the build phase, hash table will be created by scanning
each value in the build input and applying the hashing
algorithm to the key.

Let me to describe Hash join on the example with two tables.
Look at this example:

if object_id('dbo.Table1') is not null drop table Table1
GO
CREATE TABLE Table1 (id int, Name char(10))
GO
if object_id('dbo.Table2') is not null drop table Table2
GO
CREATE TABLE Table1 (id int, Name char(20))
GO

DECLARE @i int
SELECT @i = 1
WHILE @i < 1000
BEGIN
INSERT INTO Table1 VALUES (@i, LTRIM(str(@i)))
SELECT @i = @i + 1
END
GO

DECLARE @i int
SELECT @i = 1
WHILE @i < 1000
BEGIN
INSERT INTO Table2 VALUES (@i, LTRIM(str(@i)))
SELECT @i = @i + 1
END
GO

SET SHOWPLAN_TEXT ON
GO
SELECT a.Name FROM Table1 a INNER JOIN Table2 b
ON a.Name = b.Name
GO
SET SHOWPLAN_TEXT OFF
GO

This is the result:


StmtText
-------------------------------------------------------------------------------------------
|--Hash Match(Inner Join, HASH:([a].[Name])=([b].[Name]),
RESIDUAL:([a].[Name]=[b].[Name]))
|--Table Scan(OBJECT:([pubs].[dbo].[Table1] AS [a]))
|--Table Scan(OBJECT:([pubs].[dbo].[Table2] AS [b]))

The smaller table will be build input, the other - probe
input. Field Name (column that joins the tables) is called
hash key. The hash table consists of linked lists called
hash buckets. The result of using a hash function on a hash
key is called hash value. Hash value and RID (row
identifier) will be placed into hash table.

Hash value must be smaller than hash key. So, query
processor economies on the size of the hash table. The real
example of hashing is notebook. You can open notebook on the
appropriate letter and scan all surnames on this letter to
find necessary ones. So, notebook is the example of hash
table, and pages on the appropriate letter are the example
of hash bucket.

During the probe phase, the entire probe input is scanned,
and for each probe row computes the same hash value on the
hash key to find any matches in the corresponding hash bucket.

There are two main kinds of Hash join:

# In-memory Hash join
# Grace Hash join


In-memory Hash Join will be used if entire build input can
be placed into memory.

Grace Hash join will be used if your server has not enough
memory to hold the entire build input. In this case, query
processor proceeds Hash Join in several steps (hash table
will be divided into multiple partitions and relevant
partition will be loaded as need).

Because the query optimizer usually selects the best
execution plan for a given select statement, it is not
necessary to enforce the desirable join type, but sometimes
it can be useful. You can enforce the desirable join type by
using the OPTION clause.

This is the example to enforce Hash join:

USE pubs
GO
SET SHOWPLAN_TEXT ON
GO
SELECT a.au_id FROM authors a JOIN titleauthor b
ON a.au_id = b.au_id OPTION (HASH JOIN)
GO
SET SHOWPLAN_TEXT OFF
GO

This is the result:


StmtText
-----------------------------------------------------------------------------------------------
SELECT a.au_id FROM authors a JOIN titleauthor b
ON a.au_id = b.au_id OPTION (HASH JOIN)

(1 row(s) affected)

StmtText
-----------------------------------------------------------------------------------------------
|--Hash Match(Inner Join, HASH:([a].[au_id])=([b].[au_id]),
RESIDUAL:([a].[au_id]=[b].[au_id]))
|--Index Scan(OBJECT:([pubs].[dbo].[authors].[aunmind]
AS [a]))
|--Index
Scan(OBJECT:([pubs].[dbo].[titleauthor].[titleidind] AS [b]))

(3 row(s) affected)

Is This Answer Correct ?    2 Yes 4 No

Post New Answer

More SQL PLSQL Interview Questions

Is pl sql a programming language?

0 Answers  


Can we use rowid as primary key?

0 Answers  


What does count (*) do in sql?

0 Answers  


How do I edit a stored procedure?

0 Answers  


Display the total debit counts, total credit counts, sum of total debits, sum of total credits from an employee's transaction table (a single table) containing the following columns. Transaction_number Transaction_amount Transaction_date Transaction_type --> tells if the amount is a credit or a debit. As the query alone is important, the table is left with specifying just the field's name. Pls help me with this query.

3 Answers   iNautix, Oracle,






When we give SELECT * FROM EMP; How does oracle respond?

24 Answers   Accenture, HCL, Infosys,


Can i possible to see Table Details ? Ex : Table Name Date Time User Emp May/18/2010 12:59pm Scott

3 Answers  


Why do we need cursors in pl sql?

0 Answers  


what is the difference between cluster and non cluster index? : Sql dba

0 Answers  


What is composite data type in pl sql?

0 Answers  


i have a table t1 a math 20 b phy 30 cchemisty 10 a math 40 b phy 23 c che 21 a math15 bphy 33 c che 56 write a quire to find out the max markr of each subject

8 Answers  


What is bind reference and how can it be created?

0 Answers  


Categories