New features and changes on IBM InfoSphere Information Server, Version 9.1
New features and changes were introduced for IBM InfoSphere
Information Server, Version 9.1, along with documentation updates. The new and
changed features and documentation updates are described in the following
sections.
Table of contents
InfoSphere Information Server, Version 9.1 new features and
changes:
- Common capabilities across the InfoSphere Information Server suite
- Administering
- Connecting to external sources
- InfoSphere Blueprint Director
- InfoSphere Metadata Asset Manager
- InfoSphere Metadata Workbench
- Migrating
- InfoSphere Information Server for Data Integration:
- InfoSphere Data Click
- InfoSphere DataStage
- InfoSphere Information Server for Data Quality:
- InfoSphere Data Quality Console
- InfoSphere Information Analyzer
- InfoSphere QualityStage
- InfoSphere Business Information Exchange:
- InfoSphere Business Glossary
- InfoSphere Business Glossary Client for Eclipse
Documentation changes included in the Version 9.1 release:
- Documentation introduced or enhanced with Version 9.1
Administering
New repository administration tool
The InfoSphere DataStage and QualityStage operations database
and the InfoSphere QualityStage Standardization Rules Designer database are
typically installed by the installation program unless you are using a database
other than DB2 or unless you want to create them yourself. To assist in the
management of repositories that are not installed by the installation program,
the RepositoryAdmin command line tool is provided. You can also use the
RepositoryAdmin tool for other purposes, such as to assist you in relocating a
repository to another server or to update a connection to a repository. For
more information, see RepositoryAdmin
tool reference.
New database for InfoSphere QualityStage Standardization Rules
Designer
The InfoSphere QualityStage Standardization Rules Designer is
supported by an additional database for your Version 9.1 installation.
Connecting to external sources
Stage for IBM Operational Decision Management
IBM Operational Decision Management allows customers to
externalize complex business rules from applications. With the new ILOG JRules
stage, you can invoke complex business rules within the context of a job.
InfoSphere Streams connector
The new InfoSphere Streams connector enables integration between
InfoSphere Streams and InfoSphere DataStage. You can use the InfoSphere Streams
connector to send data from an InfoSphere DataStage job to an InfoSphere
Streams job, and also to send data from an InfoSphere Streams job to an
InfoSphere DataStage job.
Unstructured Data stage
Use the new Unstructured Data stage to extract information, such
as formulas or document authors, from Microsoft Excel files. The stage supports
style sheets for .xls and .xlsx file types.
Java™ Integration stage
You can use the new Java Integration stage to integrate your
code into your job design by writing your Java code using the Java Integration
stage API. The Java Integration stage API defines interfaces and classes for
writing Java code that can be invoked from within InfoSphere DataStage and
QualityStage parallel jobs.
Support for new data sources
The following connectors and stages are now available:
- DB2 connector for IBM DB2 for Linux, UNIX, and Microsoft Windows, Version 10.1.x
- DB2 connector for IBM DB2 for z/OS, Version 10
- MQ connector for IBM WebSphere MQ, Version 7.1.x and 7.5.x
- Informix stage for IBM Informix, Version 11.7
- Streams connector for IBM InfoSphere Streams 3.0
- Teradata connector for Teradata Database 13.10 and 14.0
- Oracle connector for Oracle Database 11g Release 2
- Sybase stage for Sybase ASE, Version 15.7 and Sybase IQ, Version 15.4
- Netezza connector for Netezza 4.6x, 6.0.x, and 7.0.x
- ODBC connector for DataDirect ODBC, Version 7.0.x
- ILOG JRules stage for ILOG-JRules 7.1.x and WODM 8.0.x
- Big Data File stage for IBM InfoSphere BigInsights 1.4 and Cloudera CDH 4.0
InfoSphere Blueprint Director
Publication of blueprints
Blueprints can now be published to the metadata repository
of InfoSphere Information Server so that other users can view or use
them. For more information, see Publishing a
blueprint.
InfoSphere Metadata Asset Manager
Import metadata by bridge from additional tools
Import support was added for the following tools and types of
metadata:
- CA ERwin Data Modeler 8. Logical and physical data models.
- IBM Cognos, Version 10. Business intelligence (BI) models, BI reports, and related implemented data resources.
- IBM InfoSphere Streams MetaBroker. Endpoints and tuples.
- Oracle BI Enterprise Edition. Business intelligence (BI) models, BI reports, and related data resources.
For more information, see Import bridges.
Export metadata
You can now use the OMG CWM 1 XMI 1 bridge to export the
contents of databases and database schemas to XML files that are compliant with
the OMG CWM XMI file format. For more information, see Exporting assets
by using InfoSphere Metadata Manager.
Create and edit data connections
When you import by using a connector, you can now create a data
connection, use an existing data connection, or edit an existing data
connection. Data connections are saved to the metadata repository. For more
information, see Data connections.
Automatic creation of metadata interchange servers
Metadata interchange servers that enable import from bridges and
connectors are now created automatically during installation. For more
information, see Metadata
interchange servers.
InfoSphere Metadata Workbench
Enhancements in Manage Lineage utility
You can now select or clear InfoSphere DataStage projects to be
included in lineage. Previously, the Manage Lineage utility included all jobs
in a selected project. In addition, you can run the Manage Lineage utility on
database views without selecting an InfoSphere DataStage project to link the
database view to its source database table. For more information, see Manage Lineage
services.
Integration with IBM InfoSphere Blueprint Director
You can browse, query, and display published blueprints. You can
display the blueprint diagram. For more information, see Viewing
blueprints.
Integration with IBM InfoSphere Information Analyzer
- You can browse, query, and display published rule definitions and published rule set definitions.
- You can browse, query, display, and include for lineage the InfoSphere DataStage Data Rules stage and its relationship to the published data rule.
Integration with Big Data platform
You can browse, query, display, and include for lineage the
InfoSphere DataStage Unstructured Data, Big Data File, and Streams Connector
stages.
Integration with IBM InfoSphere Business Glossary
- You can browse, query, display, and assign assets to information governance rules and information governance policies.
- You can query and display the new Is A and Has A term relationships. For more information, see Terms.
- The complete category hierarchy of a term is displayed on the Details page for the term. In previous versions, only the parent category of the term was displayed.
Integration with IBM InfoSphere DataStage
- You can browse, query, display, and include for lineage the InfoSphere DataStage Java Client and Java Transformer stages.
- You can display additional database stage properties: the server, database, schema, and table properties of the stage.
- You can display additional data file stage properties: the file and location properties of the stage.
Integration with IBM InfoSphere Data Click
You can browse, query, display, and include for lineage
published Change Data Capture (CDC) subscriptions from InfoSphere Data Click.
In addition, you can invoke the CDC subscription process from a blueprint
diagram.
Importing assets into the metadata repository
You can generate database, data file, and business intelligence
(BI) report assets from a CSV file for later import into the metadata
repository. For more information, see Generating an
ISX file to import database, data file, BI report, and BI model assets from the
command line.
Migrating
New migration functions
To help you to migrate automatically, you can now use two new
migration wizards. The wizards automate the process of exporting and importing
databases, profiles, and directories that are associated with InfoSphere
Information Server. The wizards collect information about your computer
and InfoSphere Information Server configuration. The information is
then used to export and import your system. The migration wizards support all
three server tiers: the services tier, the engine tier, and the metadata
repository tier. When you export or import by using the wizards, all tiers that
are installed on the computer are backed up simultaneously. For more
information, see Migrating to IBM
InfoSphere Information Server, Version 9.1.
InfoSphere Data Click
InfoSphere Data Click helps users retrieve data and provision
systems with agility. Users can offload individual tables or entire schemas to
generate sandbox environments for personal or group development work. The
simple interface enables users of any skill level to complete the data
integration task.
InfoSphere Data Click inherits the built-in data governance
features of the InfoSphere Information Server platform. InfoSphere Data Click generates
both design and operational metadata to support data lineage and impact
analysis. InfoSphere Data Click assets also support linkages to the business
glossary so that users can establish trust in the sources of information that
are used. Also, administrators can define policies that control the data
integration activity so that users cannot exceed limits that are based on
enterprise requirements.
InfoSphere Data Click is installed when you
install InfoSphere Information Server for Data Integration.
InfoSphere Data Click activities are governed from InfoSphere Blueprint
Director. You install InfoSphere Data Click as a plug-in into InfoSphere
Blueprint Director.
The following screen capture shows the summary of an offload
request in InfoSphere Data Click:
InfoSphere DataStage
Workload management
Administrators can now use the workload management service
in InfoSphere Information Server to set system resource policies and
prioritize workload classes. The policies
and workload classes control the execution of parallel and server jobs. For
more information, see Administering
workload management.
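As an illustration only, the following sketch models how workload classes and a resource policy might interact. It is not the InfoSphere workload management API: the class name WorkloadQueue, the priority numbers, and the max_running limit are all hypothetical names chosen for this example.

```python
import heapq
from itertools import count


class WorkloadQueue:
    """Toy queue: jobs start in class-priority order, capped by a job-count policy."""

    def __init__(self, class_priority, max_running):
        self.class_priority = class_priority  # lower number starts first (assumption)
        self.max_running = max_running        # hypothetical resource policy
        self._heap = []
        self._ticket = count()                # keeps submit order within a class
        self.running = []

    def submit(self, job_name, workload_class):
        priority = self.class_priority[workload_class]
        heapq.heappush(self._heap, (priority, next(self._ticket), job_name))

    def start_eligible(self):
        """Start queued jobs until the policy limit is reached."""
        started = []
        while self._heap and len(self.running) < self.max_running:
            _, _, job_name = heapq.heappop(self._heap)
            self.running.append(job_name)
            started.append(job_name)
        return started

    def finish(self, job_name):
        self.running.remove(job_name)


queue = WorkloadQueue({"high": 0, "medium": 1, "low": 2}, max_running=2)
queue.submit("nightly_load", "low")
queue.submit("fraud_check", "high")
queue.submit("report_refresh", "medium")

first_batch = queue.start_eligible()   # the two highest-priority jobs
queue.finish("fraud_check")
second_batch = queue.start_eligible()  # the queued low-priority job
```

In this sketch the low-priority job waits until a running job finishes, which is the general effect that a job-count policy combined with class priorities is meant to have.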
Web-based job runtime management
Administration and management of the operational environment are
simplified by extensions to the Operations Console. Authorized users can now define
the workload management policies, and can run, stop, and reset integration jobs
within the projects that they administer. For more information, see Overview of the
Operations Console.
Balanced optimization for Hadoop
Extending the HDFS features in Version 8.7, you can now use the
Balanced Optimization features of InfoSphere DataStage to push sets of data
integration processing and related data I/O into a Hadoop cluster. InfoSphere
DataStage adds integration with Oozie workflows, as well as real-time
integration with InfoSphere Streams.
For more information about Balanced Optimization, see Introduction to
InfoSphere DataStage Balanced Optimization. For more information
about integration with Oozie workflows, see InfoSphere
DataStage.
Support for IBM Rational Team Concert™ as a source control
system
You can now use Rational Team Concert as a source control system
in IBM InfoSphere Information Server Manager. For more information, see Source control
of InfoSphere DataStage and QualityStage assets.
XML design and performance optimization enhancements
InfoSphere DataStage 9.1 includes new features to help you work
with the type of large XML schemas that are often seen in industry standards.
You can use one new feature, the schema view, to narrow the scope of a large
XSD to only the subset of the schema tree that you want to work with. When you
narrow the scope, you can focus on a particular business challenge and parse
and compose XML documents more easily. Other new features include
user-specified parallelization for greater performance, extended support for
XSD typing, and usability and productivity improvements in XML job editing
through schema search and mapping intelligence.
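The idea of narrowing a large schema to a subtree can be sketched as follows. This is a conceptual illustration, not the product's schema view: the dictionary layout, the element names, and the functions find_subtree and count_nodes are assumptions made for the example.

```python
def find_subtree(node, path):
    """Follow element names in path from the root; return the node reached or None."""
    if not path or node["name"] != path[0]:
        return None
    if len(path) == 1:
        return node
    for child in node.get("children", []):
        found = find_subtree(child, path[1:])
        if found is not None:
            return found
    return None


def count_nodes(node):
    """Count this node plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node.get("children", []))


# Hypothetical schema tree standing in for a large XSD.
schema = {"name": "Order", "children": [
    {"name": "Customer", "children": [{"name": "Name"}, {"name": "Address"}]},
    {"name": "Items", "children": [
        {"name": "Item", "children": [{"name": "Sku"}, {"name": "Quantity"}]},
    ]},
]}

view = find_subtree(schema, ["Order", "Items"])
full_size = count_nodes(schema)
view_size = count_nodes(view)
```

The narrowed view contains only the Items branch, so later parsing or composing work touches fewer nodes than the full tree.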
InfoSphere Data Quality Console
InfoSphere Data Quality Console is a new unified, browser-based
interface that you can use to monitor and track data quality exceptions that
are generated by InfoSphere Information Server products and
components. Exceptions are entities that are generated by a condition or event
and that might require additional information or investigation. For example,
records that do not meet the conditions of data rules in InfoSphere Information
Analyzer might be considered exceptions. The following screen capture shows how
you can view a subset of exception descriptors by specifying search criteria,
which include search terms and attributes.
For more information, see InfoSphere Data
Quality Console.
InfoSphere Information Analyzer
Predefined rule definitions
A key challenge in assessing and monitoring information quality
is starting the process to validate key business requirements. Instead of
starting that process without assistance, you can start by using predefined
data quality rule definitions. New installations of this release include more
than 100 predefined rule definitions for basic and common domains. Also
included are more than 60 predefined rule definitions that are designed to
validate standardized address data. Although the rule definitions are optimized
for US data, they can be modified for any country or region. For more
information, see Accelerating
data quality analysis by starting with predefined rule definitions.
The data domains that are represented include the following
domains:
- Personal identity, such as age, date of birth, and national identifier
- Asset identity, such as IP address information
- Financial
- Orders and sales
- Data classification, such as identifier, indicator, code, date, and quantity
- Completeness, which checks whether a field exists
- Data format, such as alphabetic and numeric
- Address data
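To make the completeness and data format domains concrete, here is a minimal sketch of what such checks do. It does not use InfoSphere Information Analyzer rule syntax; the function names, field names, and sample records are hypothetical.

```python
import re


def is_complete(record, field_name):
    """Completeness: the field is present and not blank."""
    value = record.get(field_name)
    return value is not None and str(value).strip() != ""


def is_alphabetic(value):
    """Data format: letters only."""
    return re.fullmatch(r"[A-Za-z]+", value) is not None


def is_numeric(value):
    """Data format: digits only."""
    return re.fullmatch(r"[0-9]+", value) is not None


# Hypothetical input records.
records = [
    {"id": "1001", "country_code": "US"},
    {"id": "10A2", "country_code": "US"},
    {"id": "1003", "country_code": ""},
]

# A record fails if any one of the checks fails.
failing_ids = [
    r["id"] for r in records
    if not (is_numeric(r["id"])
            and is_complete(r, "country_code")
            and is_alphabetic(r["country_code"]))
]
```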
User-named output tables for data rules
When you create data rules, you can specify that you want a
user-named rule output table to be created in addition to the system rule
output tables. User-named output tables can be simple or advanced. Use a simple
table if you plan to use the rule output from one rule to create subsequent
rules. Use an advanced table if you want to collect rule output from multiple
data rules into one table. Also, you might want to create an advanced
user-named table if you plan to use the rule output from multiple rules to
create subsequent rules. An advanced user-named table is an additional physical
table with copied records, which means that it requires additional storage
space. For more information, see Setting output
tables for a data rule.
Distinct output records
You can now specify whether you want only distinct output
records or all output records in the rule output table. For more information,
see Setting the
output content for a data rule.
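The difference between the two settings can be pictured with a small sketch. The record layout and the function distinct_records are hypothetical; the product applies this choice to the rule output table itself.

```python
def distinct_records(records):
    """Return records with duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique


# Hypothetical rule output: each tuple is one output record.
all_output = [
    ("C001", "invalid postal code"),
    ("C002", "missing name"),
    ("C001", "invalid postal code"),
]
distinct_output = distinct_records(all_output)
```

Keeping all output records preserves how often a problem occurred, while keeping only distinct records yields a smaller table.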
Task sequencing
You can now use task sequences to group multiple InfoSphere
Information Analyzer jobs that are to be executed sequentially. In this
release, task sequencing is available only by using the HTTP API and CLI, and
only rules, rule sets, and metrics are supported for task sequencing. For more
information, see Task sequences.
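Sequential execution of grouped tasks can be sketched as follows. This is not the HTTP API or the CLI; the task names and the function run_sequence are hypothetical and only show the ordering behavior.

```python
from typing import Callable, List, Tuple

Task = Tuple[str, Callable[[], None]]


def run_sequence(tasks: List[Task]) -> List[str]:
    """Run each task one at a time, in list order, and return names as completed."""
    completed = []
    for name, action in tasks:
        action()  # the next task starts only after this call returns
        completed.append(name)
    return completed


log: List[str] = []
sequence: List[Task] = [
    ("rule_check_age", lambda: log.append("ran rule")),
    ("rule_set_address", lambda: log.append("ran rule set")),
    ("metric_error_rate", lambda: log.append("ran metric")),
]
order = run_sequence(sequence)
```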
InfoSphere QualityStage
Standardization Rules Designer
The new Standardization Rules Designer provides an
intuitive and efficient framework that you can use to enhance standardization
rule sets. You can use the browser-based interface to add or modify
classifications, lookup tables, and rules. You can also import sample data to
validate that the enhancements to the rule set work with your data. The
following screen capture shows a part of the Standardization Rules Designer in
which you can add or modify a rule by mapping input values from an example
record to output columns. This rule splits concatenated values in an input
address record by mapping each part of the input value to a different output
column.
For more information, see Enhancing
standardization rule sets by using the Standardization Rules Designer.
New rule sets
The following rule sets are now available:
- The PHPROD rule set is a rule set for pharmaceutical data. The rule set demonstrates how you can use rules to standardize description data from the health industry.
- The RUNAMEL rule set can be used to standardize Russian names.
- The RUADDRL rule set can be used to standardize Russian addresses and area information.
Rule set enhancements
The predefined rule sets are enhanced in the following ways:
- The domain-specific rule sets can be used with the Standardization Rules Designer.
- The CNNAME, HKCNAME, and HKNAME rule sets now have special options for name processing.
- The CNADDR, CNAREA, CNPHONE, HKADDR, HKCADDR, and HKPHONE rule sets now have user modification subroutines.
- The CNPHONE and HKPHONE rule sets are enhanced in several ways. For example, input data can be converted to half-width characters.
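For readers unfamiliar with half-width conversion, the following sketch shows the general technique. It does not describe how the CNPHONE or HKPHONE rule sets implement it; the function name and the sample value are assumptions. Full-width ASCII variants occupy U+FF01 through U+FF5E, each exactly 0xFEE0 above its half-width counterpart.

```python
# Map each full-width ASCII variant (U+FF01..U+FF5E) to its half-width form.
FULL_TO_HALF = {code: code - 0xFEE0 for code in range(0xFF01, 0xFF5F)}
# U+3000 is the ideographic (full-width) space.
FULL_TO_HALF[0x3000] = 0x20


def to_half_width(text: str) -> str:
    """Replace full-width ASCII variants and the ideographic space; leave the rest."""
    return text.translate(FULL_TO_HALF)


# Hypothetical phone fragment written with full-width digits and a full-width hyphen.
full_width_input = "\uff10\uff11\uff10\uff0d\uff11\uff12\uff13\uff14"
converted = to_half_width(full_width_input)
```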
Sample data available for predefined jobs and tutorial
Sample data is now provided for the predefined standardization
jobs that you can use to generate standardized data and the frequency
information for that data. For more information, see Predefined
standardization jobs.
The installation media also now contains sample data and other
files that are required for the InfoSphere QualityStage tutorial. For more
information, see the InfoSphere
QualityStage parallel job tutorial.
InfoSphere Business Glossary
Expanded enterprise information governance with information
governance policies and information governance rules
Now, in addition to creating and managing terms and categories,
you can create and manage information governance policies and information
governance rules. Information governance policies and rules describe the way
that information should be used and managed to comply with business objectives.
You can define relationships among the policies and rules and between the
policies and rules and other metadata information assets. For more information,
see Information
governance policies and Information
governance rules.
Advanced term relationships
You can use new relationships between terms to express
hierarchies of type and containment. The relationships enable consumers of the
information to understand the meaning of terminology more fully, in the context
of other terms. For more information, see Is A and Has A
relationships.
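A small data structure can illustrate hierarchies of type (Is A) and containment (Has A). The Term class, its field names, and the sample terms below are hypothetical and do not reflect the glossary's internal model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    name: str
    is_a: Optional["Term"] = None                       # supertype, if any
    has_a: List["Term"] = field(default_factory=list)   # contained terms


def supertypes(term: Term) -> List[str]:
    """Walk the Is A chain upward; return supertype names, nearest first.

    Assumes the chain has no cycles.
    """
    names = []
    current = term.is_a
    while current is not None:
        names.append(current.name)
        current = current.is_a
    return names


account = Term("Account")
savings = Term("Savings Account", is_a=account)
address = Term("Address")
customer = Term("Customer", has_a=[address])

savings_supertypes = supertypes(savings)
customer_parts = [part.name for part in customer.has_a]
```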
Single sign-on for Windows users
Integration with Windows desktop authentication enables users
who are logged in to Windows to work with InfoSphere Business
Glossary immediately, without requiring a separate login process. For more
information, see Configuring Windows
desktop single sign-on support.
Web-based access to blueprints
You can now define information about blueprints and view
published blueprints directly from the business glossary. For more information,
see Viewing
blueprints.
Dynamic display of external content from OSLC providers
OSLC (Open Services for Lifecycle Collaboration) is a method of
communicating among different systems. InfoSphere Business
Glossary can now be a consumer of OSLC services from Rational Asset
Manager and Rational Software Architect Data Manager. The metadata content that
is stored in these OSLC providers is displayed dynamically in the business glossary.
The dynamic display ensures that data is synchronized and eliminates the need
for separate data transfer procedures. For more information, see Configuring
cross-server communication for external assets.
Enhanced integration with InfoSphere Information Analyzer
In previous releases, you were able to view the results of table
and column analysis, including valid values for columns. You can now browse,
search, view details of, and assign published data rule definitions and data
rule set definitions to business glossary assets. For more information,
see Integration with
other IBM InfoSphere products.
InfoSphere Business Glossary Client for Eclipse
Information governance policy and information governance rule
assets
You can now browse, search, and display the properties of two
new InfoSphere Business Glossary assets: information governance policies and
information governance rules. You can assign an information governance rule to
an asset, such as a database table, so that the information governance rule
governs the asset.
Import and export of glossary assignments
Earlier versions supported the import and export of term assignments.
In Version 9.1, you can import and export glossary assignments, which include
both term assignments and information governance rule assignments.
Advanced term relationships from InfoSphere Business Glossary
Two new term relationships, Is A and Has A, are included in the
Properties view of a term. You can view the supertype and subtype relationship
between terms in the Term Type Hierarchy view.
Business Process Modeling Notation (BPMN) model elements
You can now view and remove term assignments in BPMN model
elements that are displayed in IBM Rational Software Architect. With the Business
Process Model Integration API, you can build functions to add, remove, and get
term assignments to BPMN model elements.
Local indexing
Local term assignments and local information governance rule
assignments are now indexed to improve search and display performance.
Documentation introduced or enhanced with Version 9.1
Introduction to InfoSphere Information Server
This information is more complete and streamlined to help you
understand how the suite and its components interact. Diagrams show where each
component fits in the suite architecture, and scenarios explain how each
component might be used to solve real business problems. For more information,
see Introduction to
InfoSphere Information Server.
InfoSphere Business Glossary
New topics provide information about populating your business
glossary by using the command line:
- Generating business glossary content from InfoSphere Data Architect glossary model (*.ndm) files
- Generating business glossary content from logical data models
InfoSphere DataStage
- The quality of information is improved and task steps are clarified in the InfoSphere DataStage tutorial. For more information, see Tutorial: Creating parallel jobs.
- More troubleshooting information, with focus on client login and job runtime issues, is provided. The enhanced troubleshooting information includes information about specific operating systems and information about how to prevent errors. For more information, see Troubleshooting InfoSphere DataStage.
InfoSphere Metadata Asset Manager
Enhanced documentation of import and export bridges
- Individual reference topics for each bridge contain prerequisites, frequently asked questions, troubleshooting information, and detailed help for each parameter. For more information, see Import bridges.
- Individual PDF guides to using BI bridges contain customized information for imports from IBM Cognos, SAP BusinessObjects, Microsoft, and Oracle BIEE.
- Mapping documents for each import bridge show how each metadata class in the source tool is displayed in InfoSphere Information Server.
Asset interchange and istool command line
The following functions are documented:
- Exporting and importing InfoSphere Streams assets
- Exporting and importing InfoSphere Data Quality Console assets
- Generating business glossary content from InfoSphere Data Architect glossary models
- Generating business glossary content from logical data models
InfoSphere QualityStage
- To help you learn about the new Standardization Rules Designer, tutorials are provided that use data from the product and address domains. For more information, see Tutorial: Enhancing a product rule set in the Standardization Rules Designer and Tutorial: Enhancing an address rule set in the Standardization Rules Designer.
- New and updated topics provide information about the standardization process and standardization rule sets:
- Standardization workflow
- Developing rule sets
- Enhancing standardization rule sets by using the Standardization Rules Designer