Data Profiling Jump-Start Package

Data Profiling
The introduction of the CA ERwin® Data Profiler (Data Profiler) software brings data profiling into mainstream data modeling environments. Data profiling adds tremendous value to any data modeling development environment. The following bullets identify some of the potential benefits:

  • Elimination of the code-load-explode development methodology. The code-load-explode cycle occurs when ETL or other data movement specifications are written solely from the knowledge of subject matter experts (SMEs). These specifications are often flawed because they are written without detailed knowledge of the data content, which causes problems during unit, system, integration, and quality assurance testing. Each time a problem is identified, the entire development process is repeated. Leveraging data profiling during the analysis or feasibility phase of the project helps ensure that accurate data movement specifications are created.
  • Enhanced data models. Data profiling infers metadata from the data content. This inferred metadata is ideal for determining the exact metadata necessary to capture the data content of interest. Whether creating a new data model or refining an existing data model, profiling existing implementations of the entities/tables ensures that the metadata defined in the data models is accurate.
  • Data source validation. Leverage data profiling to validate that source systems contain the correct data and evaluate the validity of the data content from these source systems.
  • Simplified mapping. The results from data profiling are ideal for mapping exercises between source and target systems. All of the inferred metadata provides the intimate knowledge about the data content necessary to determine how data should be mapped.
  • Harmonization of the data for master data management (MDM) or data warehousing (DW). Whether talking about MDM or DW, the data content needs to be understood to determine how to consolidate the data content into a single target. This includes identifying the correct metadata for the target DW or MDM data hub.
  • Establishment of data quality standards. Data profiling provides the intimate knowledge of the data content needed to establish valid values for that content. The profiling results provide the information necessary to identify data anomalies and data quality problems.
  • Enhanced metadata efforts. The results of data profiling provide inferred metadata that enhances the model metadata. Leverage the profiling results to establish metadata standards as part of data governance or source the profiling metadata into a metadata repository to enhance the model metadata with inferred metadata.
  • Data quality and metadata validation. Leverage data profiling to validate that the data content matches the metadata defined for a schema. At the same time, validate that the data content matches original design specifications. Identify data anomalies and data quality issues to be resolved as necessary.
  • Leverage data profiling throughout the development process. Data profiling is useful for validating data content during unit, system, integration, and quality assurance testing. Leverage the overlap analysis to validate source-to-target migrations and transformations of data content.
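
To make the ideas of inferred metadata and overlap analysis in the bullets above more concrete, here is a minimal Python sketch. It is an illustration only and does not represent how the CA ERwin® Data Profiler works internally. The function names (infer_type, profile_column, overlap), the sample values, and the assumed YYYY-MM-DD date format are all hypothetical.

```python
from collections import Counter
from datetime import datetime


def infer_type(value):
    """Classify one raw string as 'integer', 'decimal', 'date', or 'text'."""
    try:
        int(value)
        return "integer"
    except ValueError:
        pass
    try:
        float(value)
        return "decimal"
    except ValueError:
        pass
    try:
        # Assumption: dates arrive as YYYY-MM-DD; a real profiler tries many formats.
        datetime.strptime(value, "%Y-%m-%d")
        return "date"
    except ValueError:
        return "text"


def profile_column(values):
    """Infer simple metadata for one column (None or blank strings count as missing)."""
    present = [v for v in values if v is not None and v.strip() != ""]
    type_counts = Counter(infer_type(v) for v in present)
    return {
        "row_count": len(values),
        "null_count": len(values) - len(present),
        "distinct_count": len(set(present)),
        "max_length": max((len(v) for v in present), default=0),
        "inferred_types": dict(type_counts),
    }


def overlap(source_values, target_values):
    """Compare the distinct values of a source column and a target column."""
    src, tgt = set(source_values), set(target_values)
    return {
        "in_both": len(src & tgt),
        "source_only": sorted(src - tgt),
        "target_only": sorted(tgt - src),
    }


# Hypothetical sample data: a postal code column with two missing values
# and one value that does not look numeric.
zip_codes = ["30301", "30302", None, "3030A", "", "30301"]
column_profile = profile_column(zip_codes)
print(column_profile)

# Hypothetical source and target key columns after a migration.
migration_check = overlap(["A", "B", "C"], ["B", "C", "D"])
print(migration_check)
```

In this sketch, a column whose inferred types are mixed (mostly integer with a few text values) points to a possible data anomaly, and the source-only and target-only lists show values that were lost or introduced during a migration.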

Clearly, data profiling has many uses in a data modeling development environment. Realizing these benefits ensures a strong and complete return on investment (ROI) for the data profiling software. However, most organizations and data modelers are new to data profiling. It takes time for most companies to integrate the software and establish the environment, let alone begin to use the software and realize the benefits of the technology.

Jump-Start Package
Data Innovations (DI) is now offering a jump-start package to help our clients realize a strong and complete return on investment for the CA ERwin® Data Profiler software as quickly as possible. The following bullets describe the package:

  • The package requires the purchase of five or more CA ERwin® Data Profiler licenses from DI. The license cost is discounted when the licenses are purchased as part of the jump-start package.
  • A minimum of 160 hours of consulting services, which include the following:
    • Establish the profiling environment. This includes creating the profiling database, installing and deploying the software, and establishing the connectivity to data sources.
    • Identify short-term profiling goals. The DI consultant will work closely with the client to identify projects and tasks to leverage data profiling to realize immediate ROI.
    • Identify data profiling best practices. Ultimately, data profiling should be integrated into the organization's software development life cycle (SDLC). DI consultants have helped numerous clients establish best practices, data quality factories, and methodologies to integrate data profiling into the organization's SDLC. This is a long-term goal and this jump-start package will help establish the approach. However, the approach may not be completely integrated into the SDLC during the engagement.
    • Custom training. DI consultants are experts with data profiling software. Their knowledge and experience are transferred to the client during the engagement through formal training and mentoring. This includes training the administrators and users of the software. The training is customized to meet the specific needs of the client and its projects.
    • Methodology development. Executing the data profiling is simple. The challenge is to interpret and utilize the inferred metadata to solve specific business and IT problems for data modeling, data quality, data mapping, and other initiatives. DI methodologies identify the profiling workflow and include step-by-step instructions for utilizing the software to solve specific problems.