How do you check the performance of a Teradata query, and
what basic performance tuning steps do you use?
Answer / hanumanth
APPROACHES
A. In product join scenarios, check for
- Proper usage of aliases: once a table is aliased, refer to
it only by the alias, since mixing the alias and the table
name can cause a product join
- Joining on matching columns
- Usage of join keywords, i.e. stating the join type
explicitly (e.g. INNER or OUTER)
- Using UNION in place of "OR" conditions
- Statistics collected on the join columns; this is
especially important if the columns you are joining on are
not unique
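As a small illustration of the UNION-for-OR rewrite, here is a sketch that uses Python's built-in sqlite3 module as a stand-in for Teradata (an assumption made only so the example runs anywhere). The `customers` and `orders` tables and their columns are hypothetical. It checks only that both forms return the same rows; whether the UNION form actually avoids a product join has to be confirmed with EXPLAIN on the real system.

```python
import sqlite3

# SQLite is only a stand-in here; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, alt_id INTEGER);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_ref INTEGER);
    INSERT INTO customers VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO orders VALUES (10, 1), (11, 200), (12, 999), (13, 3);
""")

# Original form: a single join whose condition contains OR.
or_query = """
    SELECT o.order_id, c.cust_id
    FROM orders o
    INNER JOIN customers c
      ON o.cust_ref = c.cust_id OR o.cust_ref = c.alt_id
"""

# Rewritten form: one equality join per branch, combined with UNION.
union_query = """
    SELECT o.order_id, c.cust_id
    FROM orders o INNER JOIN customers c ON o.cust_ref = c.cust_id
    UNION
    SELECT o.order_id, c.cust_id
    FROM orders o INNER JOIN customers c ON o.cust_ref = c.alt_id
"""

or_rows = sorted(conn.execute(or_query).fetchall())
union_rows = sorted(conn.execute(union_query).fetchall())
print(or_rows == union_rows)
```

UNION removes duplicate rows, so the two forms are only guaranteed to match when the selected columns identify each row uniquely, as the two key columns do here.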
B. Collect stats
- Run the command "DIAGNOSTIC HELPSTATS ON FOR SESSION"
- Gather information on the columns on which stats have to be
collected
- Collect stats on the suggested columns
- Also check for stats missing on PI, SI or columns used in
joins: "HELP STATS <databasename>.<tablename>"
- Make sure stats are re-collected when at least 10% of the
data changes
- Remove unwanted stats, or stats which hardly improve the
performance of the queries
- Collect stats on columns instead of indexes, since dropping
an index drops its stats as well
- Collect stats on multi-column indexes; this might be
helpful when these columns are used together in join
conditions
- Check that stats are re-created for tables whose structure
has changed
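The 10% re-collection rule can be turned into a simple check. The sketch below is Python; the function names, the way row counts are obtained, and the `sales_db.orders` example are all assumptions for illustration. The generated statement uses the column form of COLLECT STATISTICS; check the exact syntax against your Teradata release before using it.

```python
def needs_recollect(rows_at_last_collect, rows_changed, threshold=0.10):
    """Return True when the changed share of rows reaches the threshold."""
    if rows_at_last_collect <= 0:
        # No usable baseline (empty table or stats never collected).
        return True
    return rows_changed / rows_at_last_collect >= threshold


def collect_stats_sql(database, table, columns):
    """Build a column-level COLLECT STATISTICS statement as text."""
    column_list = ", ".join(columns)
    return f"COLLECT STATISTICS COLUMN ({column_list}) ON {database}.{table};"


# Hypothetical numbers: 120,000 of 1,000,000 rows changed (12%).
stale = needs_recollect(1_000_000, 120_000)
fresh = needs_recollect(1_000_000, 50_000)
statement = collect_stats_sql("sales_db", "orders", ["cust_id"])
print(stale, fresh, statement)
```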
C. Full table scan (FTS) scenarios
- Try to avoid FTS scenarios, as it might take a very long
time to access all the data on every AMP in the system
- Make sure an SI is defined on the columns which are used in
joins or as an alternate access path
- Collect stats on SI columns; otherwise the optimizer might
go for an FTS even when an SI is defined on that particular
column
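The effect of an index on the access path can be seen in miniature with Python's sqlite3 module. This is only an analogy: SQLite has no AMPs, its indexes are not Teradata secondary indexes, and ANALYZE merely plays the role that collecting stats plays on Teradata. The `orders` table and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(i, i % 50) for i in range(1000)]
)

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is the readable step text.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT order_id FROM orders WHERE cust_id = 7"

before = plan(query)  # no index on cust_id yet: the whole table is scanned

conn.execute("CREATE INDEX idx_orders_cust ON orders (cust_id)")
conn.execute("ANALYZE")  # refresh the optimizer's statistics

after = plan(query)   # the index now provides the access path

print(before)
print(after)
```

On Teradata the equivalent check is to run EXPLAIN on the query before and after defining the SI and collecting stats on it.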
2. If intermediate tables are used to store results, make
sure that
- They have the same PI as the source and destination tables
3. Tune to get the optimizer to join on the Primary Index of
the largest table, when possible, to ensure that the large
table is not redistributed across the AMPs
4. For a large list of values, avoid using IN / NOT IN in SQL.
Write the large list of values to a temporary table and use
this table in the query
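A minimal sketch of this rewrite, using Python's sqlite3 module as a stand-in (an assumption; on Teradata the list would typically go into a volatile or global temporary table). The `orders` and `wanted_cust` tables are hypothetical. The sketch only confirms that the join returns the same rows as the IN list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(i, i % 100) for i in range(1, 501)]
)

# The list of values that would otherwise appear inside IN (...).
wanted = list(range(0, 100, 3))

# Write the list to a temporary table once ...
conn.execute("CREATE TEMP TABLE wanted_cust (cust_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO wanted_cust VALUES (?)", [(c,) for c in wanted])

# ... and join to it instead of using the IN list.
joined = conn.execute("""
    SELECT o.order_id
    FROM orders o
    INNER JOIN wanted_cust w ON o.cust_id = w.cust_id
    ORDER BY o.order_id
""").fetchall()

# The original IN-list form, kept only for comparison.
placeholders = ", ".join("?" for _ in wanted)
in_list = conn.execute(
    f"SELECT order_id FROM orders WHERE cust_id IN ({placeholders}) ORDER BY order_id",
    wanted,
).fetchall()

print(joined == in_list)
```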
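5. Be deliberate about when to use EXISTS / NOT EXISTS
instead of IN / NOT IN. EXISTS and NOT EXISTS treat an
unknown comparison (e.g. a NULL value in the column) as no
match, whereas NOT IN returns no rows at all when the list
contains a NULL. On nullable columns the two forms can
therefore give different results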
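The NULL behaviour is standard SQL and can be reproduced with Python's sqlite3 module, used here only as a convenient stand-in for Teradata. The `customers` and `blocked` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cust_id INTEGER);
    CREATE TABLE blocked (cust_id INTEGER);
    INSERT INTO customers VALUES (1), (2), (3);
    INSERT INTO blocked VALUES (2), (NULL);
""")

# NOT IN: the NULL in the subquery makes every comparison unknown,
# so no row satisfies the predicate.
not_in = conn.execute("""
    SELECT cust_id FROM customers
    WHERE cust_id NOT IN (SELECT cust_id FROM blocked)
    ORDER BY cust_id
""").fetchall()

# NOT EXISTS: an unknown comparison simply produces no matching row,
# so customers 1 and 3 are returned.
not_exists = conn.execute("""
    SELECT c.cust_id FROM customers c
    WHERE NOT EXISTS (SELECT 1 FROM blocked b WHERE b.cust_id = c.cust_id)
    ORDER BY c.cust_id
""").fetchall()

print(not_in)      # []
print(not_exists)  # [(1,), (3,)]
```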
6. Inner vs Outer Joins
Check which join works efficiently in the given scenario. Some
examples are
- Outer joins can be used when a large table is joined
with small tables (like a fact table joined with a dimension
table on a reference column)
- Inner joins can be used when only matching rows are needed,
so that no extra data is loaded into spool for processing
Please note for outer join conditions:
1. Filter conditions for the inner table should be present in
the "ON" condition
2. Filter conditions for the outer table should be present in
the "WHERE" condition
Answer / tdguy
Refer to the explain plan.
1. Check for product or nested joins and try to avoid those
joins, as they affect performance.
2. Check the confidence levels reported by TD. Collect stats
for the columns recommended by TD.
3. In case of long queries, try to use SI appropriately. It
should be carefully chosen.
4. Check for the usage of temporary tables. These tables can
be used in case any aggregate results are needed.