Why do Hash joins usually perform better than Merge Joins?
Answer Posted / narayana
In a merge join, the rows to be joined must be present on the same AMP. If they are not, Teradata either redistributes the data or duplicates it in spool, based on the row hash of the columns in the join condition, to make that happen. A hash join takes place when one or both of the tables can fit completely inside each AMP's memory. The AMP holds the small table in memory so that the join can happen on row hash.
Usually the optimizer first identifies the smaller table and sorts it by the row hash of the join column. If the smaller table is small enough to fit in memory, performance is best. Otherwise, the sorted smaller table is duplicated to all the AMPs. The larger table is then processed one row at a time, with a binary search against the smaller table for matching records. The larger table does not have to be sorted, which is the main reason a hash join usually beats a merge join.
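The steps described above can be sketched in Python. This is only an illustration of the idea on a single AMP, not Teradata's actual implementation: `row_hash` uses `zlib.crc32` as a stand-in for the real row hash function, and `hash_join_sketch`, the row layout (dicts), and the `key` parameter are all hypothetical names chosen for the example.

```python
import bisect
import zlib


def row_hash(value):
    # Stand-in for Teradata's row hash (assumption: crc32 is used only for illustration).
    return zlib.crc32(str(value).encode("utf-8"))


def hash_join_sketch(small_rows, large_rows, key):
    # Sort the smaller table once by the hash of its join column.
    small_sorted = sorted(small_rows, key=lambda r: row_hash(r[key]))
    small_hashes = [row_hash(r[key]) for r in small_sorted]

    matches = []
    # Read the larger table in whatever order it arrives; it is never sorted.
    for big in large_rows:
        h = row_hash(big[key])
        # Binary search for the first small-table row with this hash.
        i = bisect.bisect_left(small_hashes, h)
        # Several rows may share a hash, so walk the run and compare real values.
        while i < len(small_hashes) and small_hashes[i] == h:
            if small_sorted[i][key] == big[key]:
                matches.append((small_sorted[i], big))
            i += 1
    return matches
```

Only the small table pays the sorting cost here; each row of the large table costs one binary search.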
In a merge join, by contrast, when the join columns are non-indexed columns, Teradata redistributes the table rows into spool and sorts them by hash code so that matching rows lie on the same AMP. The join then runs on the redistributed data.
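For comparison, here is a minimal Python sketch of the merge join step on one AMP, after redistribution has already placed matching rows together. It is an illustration under assumptions, not Teradata's code: `row_hash` again uses `zlib.crc32` as a stand-in for the real row hash, and `merge_join_sketch` and its dict-based rows are hypothetical.

```python
import zlib


def row_hash(value):
    # Stand-in for Teradata's row hash (assumption: crc32 is used only for illustration).
    return zlib.crc32(str(value).encode("utf-8"))


def merge_join_sketch(left_rows, right_rows, key):
    # Both inputs are sorted by row hash, mirroring the sort done in spool.
    left = sorted(left_rows, key=lambda r: row_hash(r[key]))
    right = sorted(right_rows, key=lambda r: row_hash(r[key]))

    matches = []
    i = j = 0
    while i < len(left) and j < len(right):
        hl = row_hash(left[i][key])
        hr = row_hash(right[j][key])
        if hl < hr:
            i += 1
        elif hl > hr:
            j += 1
        else:
            # Find the run of rows on each side that share this hash.
            i_end = i
            while i_end < len(left) and row_hash(left[i_end][key]) == hl:
                i_end += 1
            j_end = j
            while j_end < len(right) and row_hash(right[j_end][key]) == hl:
                j_end += 1
            # Pair the runs, comparing real values to rule out hash collisions.
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    if l_row[key] == r_row[key]:
                        matches.append((l_row, r_row))
            i, j = i_end, j_end
    return matches
```

Unlike the hash join, both tables are sorted before any rows are matched, which is the extra work the question is asking about.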
Differentiate database data and data warehouse data?
What does the sleep function do in FastLoad?
what are the uses of fact table and dimension table in banking project?
Describe primary index in teradata?
What is the command in bteq to check for session settings ?
What are the uses of client software involved in teradata?
Can we have an unconnected lookup to look up a DB2 record against a Teradata record? It doesn't seem to work. I could be wrong.
What is a node in teradata? Explain
What are sysdba and sysdbc? Which has higher priority?
How to eliminate product joins in a teradata sql query?
Difference between multiload and tpump?
What are normalization, first normal form, second normal form and third normal form?
What are default access rights in teradata? What explicit right can be given to a user?
What are the components used in smp and massively parallel processing (mpp) machines?
What is object-level locking? Where does this type of locking appear?