IBM DataStage 11.3.x Newly Added Features

Please go through this post to understand the new features and stages introduced in IBM DataStage 11.3.x. This post covers the DataStage part only. Apart from that, there are many changes at the Information Server suite level. I will try to cover those in my next posts.


Writing data to Excel files : DK®
DataStage now can write data to Microsoft Excel files by using the Unstructured Data stage. This is a great new feature especially when clients/business users want data extracts in Excel format rather than CSV files.

Hierarchical Data Stage (earlier XML stage): DK®
It can now process not only XML files but also other hierarchical data such as JSON. It can be used to design jobs that interact with REST (Representational State Transfer) web services by using HTTP methods. For example, you can design jobs that perform tasks such as posting messages to social networking sites, interacting with systems such as Microsoft SharePoint, or using maps and directions.
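To make "hierarchical data" concrete, here is a minimal Python sketch of flattening a nested JSON document into flat rows, which is conceptually the kind of reshaping needed when hierarchical data meets row-based links. The payload and the `flatten_orders` helper are hypothetical. This is not how the Hierarchical Data stage is implemented or configured.

```python
import json

# Hypothetical JSON payload, e.g. as returned by a REST service.
PAYLOAD = """
{
  "customer": {"id": 42, "name": "Acme"},
  "orders": [
    {"order_id": 1, "amount": 10.5},
    {"order_id": 2, "amount": 99.0}
  ]
}
"""

def flatten_orders(document):
    """Return one flat row per order, repeating the parent customer fields."""
    customer = document["customer"]
    return [
        {
            "customer_id": customer["id"],
            "customer_name": customer["name"],
            "order_id": order["order_id"],
            "amount": order["amount"],
        }
        for order in document["orders"]
    ]

rows = flatten_orders(json.loads(PAYLOAD))
for row in rows:
    print(row)
```

Each nested order becomes its own row and the parent customer fields are repeated on every row.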

Data Rules Stage : DK®
You can validate your data and ensure that its quality conforms to business expectations for data cleanliness by using the Data Rules stage, which is provided by IBM InfoSphere Information Analyzer. For example, you can check that a supplier has a supplier ID in the correct format, a supplier name, and a tax ID that is nine numeric characters. The data rules that are used in the Data Rules stage are typically created by a data analyst by using InfoSphere Information Analyzer.
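The supplier example can be sketched in plain Python to show what such rules check. This is only an illustration of the logic, not Information Analyzer rule syntax. The `SUP-` plus six digits ID format and the field names are assumptions made up for the example.

```python
import re

# Assumed supplier ID format for illustration: "SUP-" followed by six digits.
SUPPLIER_ID_PATTERN = re.compile(r"SUP-[0-9]{6}")
# Tax ID rule from the text: exactly nine numeric characters.
TAX_ID_PATTERN = re.compile(r"[0-9]{9}")

def check_supplier(record):
    """Return the names of the rules that the record violates."""
    failures = []
    if not SUPPLIER_ID_PATTERN.fullmatch(record.get("supplier_id") or ""):
        failures.append("supplier_id_format")
    if not (record.get("supplier_name") or "").strip():
        failures.append("supplier_name_present")
    if not TAX_ID_PATTERN.fullmatch(record.get("tax_id") or ""):
        failures.append("tax_id_nine_digits")
    return failures

good = {"supplier_id": "SUP-000123", "supplier_name": "Acme", "tax_id": "123456789"}
bad = {"supplier_id": "123", "supplier_name": " ", "tax_id": "12345678A"}
print(check_supplier(good))
print(check_supplier(bad))
```

A record that passes every rule yields an empty list. Otherwise the list names each failed rule, similar in spirit to routing failing records to a separate output.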

Big Data File Stage : DK®
The Big Data File stage is used to read and write to files on Hadoop (HDFS). The Big Data File stage is now compatible with Hortonworks 2.1, Cloudera 4.5, and InfoSphere BigInsights 3.0.

Greenplum Connector Stage : DK®
This stage can be used to create a native connection for accessing data located in a Greenplum database. Table definitions can also be imported using the Greenplum Connector framework.

InfoSphere Master Data Management Connector Stage : DK®
This stage can be used to read data from and write data to IBM's master data management solution, InfoSphere MDM. It can be configured for Member read and Member write interactions with the MDM server.

Amazon S3 Connector Stage : DK®
Amazon S3 (Simple Storage Service) is a low-cost cloud file storage system that is accessible through web service interfaces (REST, SOAP, and BitTorrent). It offers scalability, high availability, and low latency at competitive prices. The Amazon S3 Connector stage can be used to read and write data residing in Amazon S3.

Unstructured Data Stage – Microsoft Excel (.xls and .xlsx) : DK®
The Unstructured Data stage was first introduced in DataStage v9.1, where it was used to read Excel files through a native interface. Before that, Excel data had to be staged as a .csv file or accessed through ODBC. The stage can now also be used to write data to Excel files.

Sort Stage Optimization : DK®
The Sort stage now tries to optimize your DataStage sort operations by converting length-bounded columns to variable length before the sort and then converting them back to length-bounded columns after the sort. When the actual size of a record's data is smaller than the defined upper bound, the optimization results in reduced disk I/O.
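A rough back-of-the-envelope sketch in Python shows why this can reduce I/O. It assumes a bounded column occupies its full upper bound per record, and that a variable-length value costs its actual bytes plus a small length prefix. The 4-byte prefix and the `sort_bytes` helper are assumptions for illustration, not documented figures.

```python
def sort_bytes(values, upper_bound, length_prefix_bytes=4):
    """Compare bytes handled for bounded vs variable-length representation.

    length_prefix_bytes is an assumed per-value overhead, not a documented figure.
    """
    # Bounded: every value is assumed to occupy the full upper bound.
    bounded = len(values) * upper_bound
    # Variable: actual encoded size plus the assumed length prefix.
    variable = sum(len(v.encode("utf-8")) + length_prefix_bytes for v in values)
    return bounded, variable

names = ["Ann", "Bob", "Christopher"]
bounded, variable = sort_bytes(names, upper_bound=100)
print(bounded, variable)  # 300 29
```

The shorter the real values are relative to the declared bound, the larger the saving under these assumptions.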

Improved Flexibility in Record Delimiting : DK®
The Sequential File stage now gives developers more flexibility in how a source flat file is delimited. A new environment variable, APT_IMPORT_HANDLE_SHORT, can be set to enable the import operator to read in records that do not contain all of the fields defined in the import schema. Previously, these records were rejected by the stage. The values assigned to any missing fields depend on the data type and nullability.
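The behaviour can be pictured with a small Python sketch that pads short records instead of rejecting them. The schema, the `import_record` helper, and the chosen defaults (null for nullable fields, 0 or an empty string otherwise) are assumptions for illustration. The actual values assigned by the import operator should be checked against the IBM documentation.

```python
# Hypothetical import schema: (field name, type, nullable).
SCHEMA = [
    ("id", int, False),
    ("name", str, False),
    ("city", str, True),
    ("score", int, False),
]

# Assumed defaults for missing non-nullable fields (illustrative only).
TYPE_DEFAULTS = {int: 0, str: ""}

def import_record(line, delimiter=","):
    """Parse a delimited line, padding missing trailing fields instead of rejecting."""
    parts = line.rstrip("\n").split(delimiter)
    record = {}
    for index, (name, ftype, nullable) in enumerate(SCHEMA):
        if index < len(parts):
            record[name] = ftype(parts[index])    # field present in the record
        elif nullable:
            record[name] = None                   # missing and nullable
        else:
            record[name] = TYPE_DEFAULTS[ftype]   # missing and not nullable
    return record

print(import_record("7,Ann"))
print(import_record("7,Ann,Paris,3"))
```

The first record has only two of the four fields, yet it is still imported, with the missing fields filled according to type and nullability.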

Operations Console/Workload Management : DK®
IBM lists the Operations Console and Workload Management as new features in the 11.3 release documentation, even though these components were already introduced in previous releases. Both components are now part of the base Information Server installation, and Workload Management is now enabled by default.

-url option for commands : DK®
The InfoSphere DataStage CLI now includes a new option, -url, for the logon clause of the dsjob and dsadmin commands. The option specifies a full format URL for the domain to log on to.
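As a hedged illustration, the Python sketch below only assembles an argument list for a `dsjob` call that uses `-url` in the logon clause. It does not run anything. The host, port, credentials, project, and job names are placeholders, and the combination of `-url` with `-user`, `-password`, and `-server` is an assumption that should be verified against the dsjob command reference for your version.

```python
def build_dsjob_run(url, user, password, server, project, job):
    """Build a dsjob argument list that logs on with -url and runs a job.

    The option ordering is an assumption; confirm it in the dsjob reference.
    """
    return [
        "dsjob",
        "-url", url,            # full-format URL of the domain to log on to
        "-user", user,
        "-password", password,
        "-server", server,
        "-run", project, job,
    ]

# Placeholder values for illustration only.
args = build_dsjob_run(
    url="https://services.example.com:9443",
    user="dsadm",
    password="secret",
    server="ENGINEHOST",
    project="MyProject",
    job="MyJob",
)
print(" ".join(args))
```

Building the command as a list keeps each option and its value separate, which avoids quoting problems if the list is later handed to a process launcher.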

Operations Console : DK®
If the capturing of monitoring data is enabled, the AppWatcher process is automatically started when the engine tier computer is started.

Workload management: DK®
The workload management system is now enabled by default.

Collection Courtesy : Mohit Saini & Devendra

