Datastage Interview Related Questions and Answers V1.2
You will get Answers for below mentioned questions, in this Post:
# What is the main differences
between Lookup, Join and Merge stages ?
# What
are the different types of lookup? When one should use sparse lookup in a job?
# Use and Types of Funnel Stage in Datastage
?
# What is the Diffrence Between Link Sort and
Sort Stage?
# what
is main difference between change capture and change apply stages?
# Difference between Transformer and
Basic Transfomer stage ?
----------------------------------------
# What is the main differences
between Lookup, Join and Merge stages ?
All are used to join tables, but find the difference.
Lookup: when the reference data is very less we use lookup. bcoz the
data is stored in buffer. if the reference data is very large then it wl take
time to load and for lookup.
Join: if the reference data is very large then we wl go for join.
bcoz it access the data directly from the disk. so the
processing time wl be less when compared to lookup. but here in
join we cant capture the rejected data. so we go for merge.
Merge: if we want to capture rejected data (when the join key is not
matched) we use merge stage. for every detailed link there is a reject link to
capture rejected data.
Significant
differences that I have noticed are:
1) Number Of Reject Link
(Join) does not support reject link.
(Merge) has as many reject link as the update links (If there are n-input links then 1 will be master link and n-1 will be the update link).
2) Data Selection
(Join) There are various ways in which data is being selected. e.g. we have different types of joins inner outer( left right full) cross join etc. So you have different selection criteria for dropping/selecting a row.
(Merge) Data in Master record and update records are merged only when both have same value for the merge key columns.
1) Number Of Reject Link
(Join) does not support reject link.
(Merge) has as many reject link as the update links (If there are n-input links then 1 will be master link and n-1 will be the update link).
2) Data Selection
(Join) There are various ways in which data is being selected. e.g. we have different types of joins inner outer( left right full) cross join etc. So you have different selection criteria for dropping/selecting a row.
(Merge) Data in Master record and update records are merged only when both have same value for the merge key columns.
# What are the different types of lookup? When one should use sparse lookup in a job?
In
DS 7.5 we have 2 types of lookup options are avilable: 1. Normal
2. Sparce
In DS 8.0.1 Onwards, we have 3 types of lookup options are available 1. Normal 2. Sparce 3. Range
In DS 8.0.1 Onwards, we have 3 types of lookup options are available 1. Normal 2. Sparce 3. Range
Normal lkp: To perform this
lkp data will be stored in the memory first and then lkp will be performed due
to which it takes more execution time if reference data is high in volume.
Normal lookup it takes the entiretable into memory and perform lookup.
Sparse lkp: Sql query will be
directly fired on the database related record due to which execution is faster
than normal lkp. sparse lookup it directly perform the lookup in database
level.
i.e If reference link is directly connected to
Db2/OCI Stage and firing one-by-one query on the DB table to fetcht the result.
Range lookup: this will help you to search records based on perticular range. it will serch only that perticuler range records and provides good performance insted of serching the enire record set.
i.e
Define the range expression by selecting
the upper bound and lower bound range columns and the required operators.
For
example:
Account_Detail.Trans_Date >=
Customer_Detail.Start_Date AND
Account_Detail.Trans_Date <= Customer_Detail.End_Date
# Use and Types of Funnel Stage in Datastage
?
The Funnel
stage is a processing stage. It copies multiple input data sets to a single
output data set. This operation is useful for combining separate data sets into
a single large data set. The stage can have any number of input links and a
single output link.
The Funnel stage
can operate in one of three modes:
- Continuous Funnel combines the records of the input data in no guaranteed order. It takes one record from each input link in turn. If data is not available on an input link, the stage skips to the next link rather than waiting.
- Sort Funnel combines the input records in the order defined by the value(s) of one or more key columns and the order of the output records is determined by these sorting keys.
- Sequence copies all records from the first input data set to the output data set, then all the records from the second input data set, and so on.
For all methods the meta data of all
input data sets must be identical. Name of columns should be same in all input
links.
#What is the Diffrence Between Link Sort and
Sort Stage?
Or Diffrence Between Link sort and Stage Sort ?
If the volume
of the data is low, then we go for link sort.
If the volume of the data is high, then we go for sort stage.
If the volume of the data is high, then we go for sort stage.
"Link
Sort" uses scratch disk (physical location on disk), whereas
"Sort
Stage" uses server RAM (Memory). Hence we can change the default memory
size in "Sort Stage"
Using
SortStage you have the possibility to create a KeyChangeColumn - not possible in link sort.
Within a SortStage you have the possibility to increase the memory size per partition,
Within a SortStage you have the possibility to increase the memory size per partition,
Within a
SortStage you can define the 'don't
sort' option on sort key they are already sorted.
Link
Sort and stage sort,both do the same thing.Only the Sort Stage provides you
with more options like the amount of memory to be used,remove duplicates,sort
in Ascending or descending order,Create change key columns and etc.These
options will not be available to you while using Link Sort.
# what
is main difference between change capture and change apply stages?
Change Capture stage : compares two data set(after and before) and makes a record of the differences.
change apply stage : combine the changes
from the change capture stage with the original before data set to reproduce
the after data set.
Change capture stage
catch holds of changesfrom two different datasets and generates a new column called
change code.... change code has values
0-copy
1-insert
2-delete
3-edit/update
Change apply stage
applies these changes back to those data sets based on the chanecode column.
# Difference between Transformer and
Basic Transfomer stage ?
Basic Difference
between Transformer and BASIC transfomer stage in parallel jobs ?
Basic transformer used in server jobs and Parallel
Jobs but
It supports
one input link, 'n' number of output links, and only one reject link.
Basic
transformer will be operating in Sequential mode.
All
functions, macros, routines are writtened by using BASIC language.
Parallel Transformer stages
Can have
one primary input link, multiple reference input links, and multiple output
links.
The link
from the main data input source is designated as the primary input link.
PX
Transformer all functions, macros are written in C++ language.
It Supports
Partioning of Data.
No comments:
Post a Comment