DataStage Interview Questions and Answers V1.1
This post answers the following questions:
A. What is DataStage Parallel Extender (PE) / Enterprise Edition (EE)?
B. What is a conductor node?
C. How do you execute a DataStage job from the command line?
D. What is the difference between a sequential file, a dataset, and a fileset?
----------------
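A. What is DataStage Parallel Extender (PE) / Enterprise Edition (EE)?
Parallel Extender is the DataStage engine for parallel processing of data extraction and transformation applications. There are two types of parallel processing:
1) Pipeline parallelism: a downstream stage starts processing rows as soon as the upstream stage produces them, instead of waiting for the upstream stage to finish.
2) Partition parallelism: the data is split into partitions, and the same operation runs on each partition at the same time on separate processing nodes.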
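The two forms of parallelism can be illustrated with a minimal Python sketch. It is only an analogy, not how the DataStage engine is implemented: the stage functions `extract` and `transform` and the helper `modulus_partition` are hypothetical names, the generators run in a single process (the real engine runs stages as separate operating-system processes), and modulus partitioning on an integer key is just one possible partitioning method.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List


def extract(rows: Iterable[dict]) -> Iterator[dict]:
    """Upstream stage: hands rows downstream one at a time."""
    for row in rows:
        yield row


def transform(rows: Iterable[dict]) -> Iterator[dict]:
    """Downstream stage: works on each row as soon as it arrives
    (the idea behind pipeline parallelism)."""
    for row in rows:
        yield {**row, "name": row["name"].upper()}


def modulus_partition(rows: Iterable[dict], key: str,
                      num_partitions: int) -> Dict[int, List[dict]]:
    """Split rows into partitions by an integer key (the idea behind
    partition parallelism). Each partition could then be processed
    independently on its own node."""
    partitions: Dict[int, List[dict]] = defaultdict(list)
    for row in rows:
        partitions[row[key] % num_partitions].append(row)
    return dict(partitions)


if __name__ == "__main__":
    data = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
    print(list(transform(extract(data))))
    print(modulus_partition(data, "id", 2))
```

In the sketch, rows with the same key value always land in the same partition, which is why key-based partitioning matters for operations such as joins and aggregations.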
B. What is a conductor node?
Ans-> The conductor node is the node on which the conductor process runs. Every parallel job has a conductor process, where execution starts; a section leader process for each processing node; a player process for each set of combined operators; and an individual player process for each uncombined operator.
To kill a job's processes, terminate the player processes first, then the section leader processes, and finally the conductor process.
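The teardown order described above can be written down as a small Python sketch. It only models the ordering; it does not find or kill real processes. The function name `shutdown_order` and the process labels are hypothetical.

```python
from typing import Dict, List


def shutdown_order(players_by_node: Dict[str, List[str]]) -> List[str]:
    """Return process labels in teardown order:
    all players first, then one section leader per node, then the conductor."""
    players = [p for node in players_by_node for p in players_by_node[node]]
    section_leaders = [f"section_leader[{node}]" for node in players_by_node]
    return players + section_leaders + ["conductor"]


if __name__ == "__main__":
    # Two processing nodes; node2 runs two player processes.
    print(shutdown_order({"node1": ["player_a"], "node2": ["player_b", "player_c"]}))
```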
C. How do you execute a DataStage job from the command line?
Use the "dsjob" command as follows:
dsjob -run -jobstatus projectname jobname
Other dsjob options include:
-stop - stops a running job
-lprojects - lists the projects
-ljobs - lists the jobs in a project
-lstages - lists the stages in a job
-llinks - lists the links
-projectinfo - returns project information (host name and project name)
-jobinfo - returns job information (job status, job run time, end time, etc.)
-stageinfo - returns stage information (stage name, stage type, input rows, etc.)
-linkinfo - returns link information
-lparams - lists the parameters of a job
-paraminfo - returns information about a parameter
-log - adds a text message to the log
-logsum - displays a summary of the log
-logdetail - displays log entries in detail (event ID, time, message)
-lognewest - displays the ID of the newest log entry
-report - displays a report containing the generated time, start time, elapsed time, status, etc.
-jobid - returns job ID information
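For scheduling or scripting, the run command can be wrapped in a small Python helper. This is a sketch under stated assumptions, not an official interface: the function names `build_run_command` and `run_job` are hypothetical, `dsjob` is assumed to be on the PATH of an environment where the DataStage engine settings are already loaded, the `-param name=value` option is not mentioned in the list above, and the exit-code meanings in `JOB_STATUS` are the values commonly documented for `-jobstatus`; verify them against your installation.

```python
import subprocess
from typing import Dict, List, Optional

# Assumed meaning of the dsjob exit code when -jobstatus is used.
JOB_STATUS = {1: "finished OK", 2: "finished with warnings", 3: "aborted"}


def build_run_command(project: str, job: str,
                      params: Optional[Dict[str, str]] = None) -> List[str]:
    """Build the argument list for: dsjob -run -jobstatus project job."""
    cmd = ["dsjob", "-run", "-jobstatus"]
    for name, value in (params or {}).items():
        # Assumption: job parameters are passed as -param name=value.
        cmd += ["-param", f"{name}={value}"]
    return cmd + [project, job]


def run_job(project: str, job: str,
            params: Optional[Dict[str, str]] = None) -> str:
    """Run the job, wait for it to finish, and describe the result."""
    result = subprocess.run(build_run_command(project, job, params))
    return JOB_STATUS.get(result.returncode,
                          f"unexpected exit code {result.returncode}")
```

Passing the command as a list rather than a single string avoids shell quoting problems when parameter values contain spaces.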
D. What is the difference between a sequential file, a dataset, and a fileset?
Sequential File:
1. Extracting from or loading to a sequential file is limited to a maximum of 2 GB.
2. When used as a source, its data is converted from ASCII into DataStage's native format.
3. Does not support null values.
4. A sequential file can be accessed on only one node.
Dataset:
1. It preserves partitioning. The data is stored on the nodes, so reading from a dataset does not require repartitioning.
2. It stores data in binary, in DataStage's internal format, so reading from or writing to a dataset takes less time than with any other source or target.
3. You cannot view the data without DataStage.
4. It creates two types of files to store the data:
A) Descriptor file: created in the folder/path you define.
B) Data files: created in the dataset folder named in the configuration file.
5. A dataset (.ds) file cannot be opened directly. Instead, use the Data Set Management utility in the client tools (such as Designer and Manager) or the orchadmin command-line utility.
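As a rough illustration of the command-line route, the Python sketch below builds and runs an orchadmin call against a dataset descriptor. The helper names `orchadmin_command` and `view_dataset` are hypothetical, and the sketch assumes that `orchadmin` is on the PATH, that the parallel engine environment (including the configuration file setting) is already set up, and that the `describe` and `dump` subcommands behave as in your installed version.

```python
import subprocess
from typing import List

# Subcommands this sketch allows: "describe" shows the schema and layout,
# "dump" prints the records.
ALLOWED_ACTIONS = ("describe", "dump")


def orchadmin_command(action: str, descriptor_path: str) -> List[str]:
    """Build the argument list for: orchadmin <action> <descriptor.ds>."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return ["orchadmin", action, descriptor_path]


def view_dataset(action: str, descriptor_path: str) -> str:
    """Run orchadmin and return whatever it prints."""
    result = subprocess.run(orchadmin_command(action, descriptor_path),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

The descriptor path passed in is the .ds file from point 4A, not the data files that the engine manages.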
Fileset:
1. It stores data in a format similar to that of a sequential file. The only advantage of a fileset over a sequential file is that it preserves the partitioning scheme.
2. You can view the data, but in the order defined by the partitioning scheme.
3. A fileset creates a .fs file, which is stored in ASCII format, so you can open it directly to see the paths of the data files and the schema.