DataStage Interview Questions and Answers V1.4
This post is very useful for DataStage beginners.
You will find answers to the questions listed below in this post:
# What are the main features of DataStage?
# What are Stage Variables, Derivations and Constraints?
# What are the types of views in DataStage Director?
# How do you execute a DataStage job from the command line?
# What is the functionality of the Link Partitioner and Link Collector?
# What are the types of jobs available in DataStage?
# What is the difference between Server jobs and Parallel jobs?
# What is a project?
# What is a sequencer?
# What is a container?
# What are the different types of containers?
# What are mainframe jobs?
# What are parallel jobs?
# How do you use a stored procedure in a DataStage job?
# What is the ODBC stage?
# What is a hash file? What are its types?
# What type of hash file is used in general in DataStage jobs?
# What is a stage variable?
# What are constraints and derivations?
# How do you reject records in a Transformer?
# Why do you need stage variables?
# What is the precedence of stage variables, derivations, and constraints?
# What are data elements?
# What are routines?
# What are transforms, and what is the difference between routines and transforms?
# What is a DataStage macro?
# What is job control?
# How many types of stages are there?
# What is the difference between active and passive stages?
# What plug-in stages have you used in your projects?
# What is a plug-in stage, and how do you install plug-in stages?
# What is the Sort plug-in?
# What is the Aggregator stage?
# What are the Hashed File stage and Sequential File stage?
# What types of flat files have you used? Have you used tab-delimited files?
# What is job control code?
-------------------------------------------------------------------------
DataStage has the following features to aid the design and processing required to build a data warehouse:
- Uses graphical design tools. With simple point-and-click techniques you can draw a scheme to represent your processing requirements.
- Extracts data from any number or type of database.
- Handles all the metadata definitions required to define your data warehouse.
- Lets you view and modify the table definitions at any point during the design of your application.
- Aggregates data.
- Lets you modify the SQL SELECT statements used to extract data.
- Transforms data. DataStage has a set of predefined transforms and functions you can use to convert your data, and you can easily extend this functionality by defining your own transforms.
- Loads the data warehouse.
A. Stage Variable - An intermediate processing variable that retains its value during a read and does not pass that value on to a target column.
Derivation - An expression that specifies the value to be passed on to the target column.
Constraint - A condition that evaluates to true or false and controls the flow of data along a link.
There are 3 types of views in DataStage Director:
a) Job View - dates on which jobs were compiled.
b) Status View - the status of a job's last run.
c) Log View - warning messages, event messages, and program-generated messages.
A) Use the "dsjob" command as follows:
dsjob -run -jobstatus projectname jobname
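A few other common dsjob invocations, as a hedged sketch (the project name, job name, and parameter below are hypothetical, and the exact options available can vary by DataStage release):
dsjob -run -param SRC_DIR=/data/in -wait -jobstatus dstage1 LoadCustomers
dsjob -jobinfo dstage1 LoadCustomers
dsjob -logsum dstage1 LoadCustomers
The -wait option blocks until the job finishes, and -jobstatus makes the command's exit code reflect the job's finishing status.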
Functionality of Link Partitioner and Link Collector?
A) Link Partitioner: splits data into a number of partitions or data flows, using one of several partitioning methods.
Link Collector: collects the data coming from the partitions, merges it into a single data flow, and loads it to the target.
- Server jobs
- Parallel jobs
- Sequencer jobs
- Container jobs
A. Server jobs do not support partitioning techniques, but parallel jobs do.
B. Server jobs do not support SMP or MPP, but parallel jobs support both.
C. Server jobs run on a single node, but parallel jobs run on multiple nodes.
D. Server jobs are preferred when the source data volume is low; when the data is huge, prefer parallel jobs.
DataStage Projects: a complete project may contain several jobs and user-defined components.
A project contains:
- DataStage jobs.
- Built-in components. These are predefined components used in a job.
- User-defined components. These are customized components created using the DataStage Manager. Each user-defined component performs a specific task in a job.
All work is done inside a project. Projects are created during and after the installation process; you can add a project on the Projects tab of the Administrator.
A project is associated with a directory. The project directory is used by DataStage to store jobs and other DataStage objects and metadata.
A sequencer lets you graphically create a controlling job, without using the job control function.
A group of stages and links in a job design is called a container.
There are two kinds of containers: Local and Shared.
Local containers only exist within the single job in which they are used. Use shared containers to simplify complex job designs.
Shared containers exist outside of any specific job. They are listed in the Shared Containers branch in the Manager and can be added to any job. Shared containers are frequently used to share a commonly used set of job components.
A container contains two unique stages: the Container Input stage is used to pass data into the container, and the Container Output stage is used to pass data out of the container.
Local containers. These are created within a job and are only accessible by that job. A local container is edited in a tabbed page of the job's Diagram window.
Shared containers. These are created separately and are stored in the Repository in the same way that jobs are. Instances of a shared container can be inserted into any server job in the project. Shared containers are edited in their own Diagram window.
A mainframe job is compiled and run on the mainframe. Data extracted by such jobs is then loaded into the data warehouse.
These are compiled and run on the DataStage server in a similar way to server jobs, but they support parallel processing on SMP, MPP, and cluster systems.
Use the ODBC stage: pass one dummy column and give the procedure name in the SQL tab.
A stage that extracts data from, or loads data into, a database that implements the industry-standard Open Database Connectivity API. It is used to represent a data source, an aggregation step, or a target data table (server jobs only).
A hash file is just like an indexed sequential file; the file is internally indexed on a particular key value. There are two types of hash file: the static hash file and the dynamic hash file.
In general, static hash files are used in DataStage jobs.
In a DataStage transformation we can define variables and assign them values derived from the source.
We can create constraints and derivations using DataStage variables.
Records are rejected through a DataStage constraint.
That depends upon the job requirements; through stage variables we can filter data.
The order of evaluation is: stage variables => constraints => derivations.
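A hedged sketch of how the three fit together in a Transformer (the link, column, and variable names are hypothetical):
Stage variable svAmount: If IsNull(DSLink3.AMOUNT) Then 0 Else DSLink3.AMOUNT
Constraint on the output link: svAmount > 0
Derivation of the target column: svAmount * 1.1
The stage variable is evaluated first for each row, the constraint then decides whether the row flows down the link, and the derivation finally computes the target column value.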
A specification that describes the type of data in a column and how the data is converted.
In DataStage, a routine is just like a function that we call in a DataStage job. There are built-in routines, and we can also create our own routines.
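A minimal sketch of a user-written server routine in DataStage BASIC (the argument name and logic are hypothetical; Ans is the standard return variable for transform functions):
* Returns "Y" if Arg1 is a non-empty string, otherwise "N".
If Len(Trim(Arg1)) > 0 Then
   Ans = "Y"
End Else
   Ans = "N"
End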
A transform is used to manipulate data within a DataStage job. The difference is that a transform is defined as a single expression used in column derivations, whereas a routine is a BASIC function that can be called from expressions or run as a before/after subroutine.
In DataStage, macros can be used in expressions, job control routines, and before/after subroutines. The available macros are concerned with ascertaining job status.
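For example, job-status macros such as DSJobName and DSJobStartDate can be combined in a column derivation (a hedged sketch; the ":" operator performs string concatenation in DataStage expressions):
DSJobName : "_" : DSJobStartDate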
A job control routine provides the means of controlling other jobs from the current job. A set of one or more jobs can be validated, run, reset, stopped, and scheduled in much the same way as the current job can be.
There are three basic types of stage:
Built-in stages: supplied with DataStage and used for extracting, aggregating, transforming, or writing data. All types of job have these stages.
Plug-in stages: additional stages that can be installed in DataStage to perform specialized tasks that the built-in stages do not support. Server jobs and parallel jobs can make use of these.
Job Sequence stages: special built-in stages which allow you to define sequences of activities to run. Only job sequences have these.
There are two kinds of stages:
Passive stages define read and write access to data sources and repositories.
• Sequential File
• ODBC
• Hashed File
Active stages define how data is filtered and transformed.
• Transformer
• Aggregator
• Sort plug-in
Plug-in stage: a stage that performs specific processing that is not supported by the standard server job stages. Plug-ins used: ORAOCI8, Orabulk.
Installation: plug-in stages are added to the Diagram window in the same way as built-in stages. To add a plug-in stage:
A. Do one of the following:
· Choose the stage type from the Insert menu.
· Click the appropriate stage button on the tool palette.
B. Click in the Diagram window where you want to position the stage. The stage appears as a square in the Diagram window.
---- Some information about plug-in stages
You may find that the built-in stage types do not meet all your requirements for data extraction or transformation. In this case, you need to use a plug-in stage. The function and properties of a plug-in stage are determined by the plug-in used when the stage is inserted. Plug-ins are written to perform specific tasks, for example, to bulk load data into a data mart. Two plug-ins are always installed with DataStage: BCPLoad and Orabulk. You can also choose to install a number of other plug-ins when you install DataStage.
You can write your own plug-ins. However, contact your local Ascential office before creating one, as there is a range of plug-ins already available that may meet your needs.
Difference between ORAOCI8 and Orabulk?
ORAOCI8: this stage allows you to connect to an Oracle database.
Orabulk: the Orabulk plug-in generates control and data files for bulk loading into a single table on an Oracle target database. The files are suitable for loading into the target database using the Oracle command sqlldr.
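For instance, the generated files might be loaded like this (a hedged sketch; the user ID, password, and file names are hypothetical):
sqlldr userid=scott/tiger control=orders.ctl data=orders.dat log=orders.log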
A mainframe processing stage that sorts input columns.
A stage type that computes totals or other functions over sets of data.
The Hashed File stage extracts data from, or loads data into, a database that contains hashed files. The Sequential File stage extracts data from, or writes data to, a text file.
Sequential flat files with comma-separated values.
Job control code is used in a job control routine to create a controlling job, which invokes and runs other jobs.
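A minimal sketch of such job control code in DataStage BASIC (the job name is hypothetical; DSAttachJob, DSRunJob, DSWaitForJob, DSGetJobInfo, and DSDetachJob are the standard job-control functions):
* Attach the job, run it, wait for it to finish, then check its status.
hJob = DSAttachJob("LoadCustomers", DSJ.ERRFATAL)
ErrCode = DSRunJob(hJob, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob)
Status = DSGetJobInfo(hJob, DSJ.JOBSTATUS)
ErrCode = DSDetachJob(hJob)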