Data Quality: Data Domain Discovery Accelerator
Posted by: Informatica Data Quality
Data Domain
A data domain is a predefined or user-defined Model repository object based on the semantics of column data or a column name. For example, Social Security number, credit card number, email ID, and phone number can each be an individual data domain.

A data domain helps you find important data that remains undiscovered in a data source. For example, you may have legacy data systems that contain Social Security numbers in a Comments field. You need to find this information and protect it before you move it to new data systems.

You can group logical data domains into data domain groups. A data domain glossary lists all the data domains and data domain groups. You use rules to define data and column name patterns that match source data and metadata. When you create a data domain, the Analyst tool or Developer tool copies associated rules and other dependent objects to the data domain glossary.

Use the Developer tool to manage data domains, including their rule logic. To import and export data domains to and from the data domain glossary, use the Preferences menu in the Developer tool.

Create a profile to perform data domain discovery so that you can identify critical data characteristics within an enterprise. You can then apply further data management policies, such as data quality or data masking, to the data. For example, discover product codes or descriptions to analyze which data quality standardization or parsing rules you need to apply to make the data useful and trustworthy. Another example is to find sensitive customer data, such as credit card numbers, email IDs, and phone numbers, which you may then want to mask to protect it. You can create and run a profile to perform data domain discovery in both the Analyst tool and the Developer tool.
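To make the legacy Comments field example concrete, here is a minimal Python sketch of the general idea: scan free text for values that look like Social Security numbers, then mask them. It is not part of the accelerator and does not use any Informatica API. The pattern, the function names, and the sample comments are all assumptions made for illustration, and the pattern only covers the hyphenated NNN-NN-NNNN layout.

```python
import re

# Hypothetical pattern: three digits, two digits, four digits, hyphen-separated.
# Real-world SSN detection usually needs more variants and validity checks.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_like(text):
    """Return every SSN-like substring found in a free-text value."""
    return SSN_PATTERN.findall(text)


def mask_ssn_like(text):
    """Replace every SSN-like substring with a fixed mask."""
    return SSN_PATTERN.sub("XXX-XX-XXXX", text)


# Made-up sample rows standing in for a legacy Comments column.
comments = [
    "Customer called about invoice 4471.",
    "Verified identity with SSN 123-45-6789 on file.",
]

for comment in comments:
    if find_ssn_like(comment):
        print(mask_ssn_like(comment))
```

In practice, discovery and masking are separate steps: discovery tells you which columns need protection, and a masking policy is applied afterwards.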
You can define a profile to perform data domain discovery based on the following rules:
- Data rule. Finds columns with data that matches specific logic defined in the rule.
- Column name rule. Finds columns that match column name logic defined in the rule.

Data Domain Discovery
When you create a profile to perform data domain discovery, select the source columns, the data domains against which you want to match column data and column names, and the sampling options. You can also specify the maximum number of rows on which to run data domain discovery and the minimum conformance percentage.

The download contains two components:
- 27 data domains (for example, age, SSN, credit card number, and email).
- The domain rules (data rules and column name rules) used by these data domains.
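As a rough illustration of how a data rule, a column name rule, a row limit, and a minimum conformance percentage fit together, here is a minimal Python sketch. It is not the accelerator's rule logic and does not reflect how the Analyst or Developer tool evaluates profiles. The `DOMAINS` dictionary, the regular expressions, the `discover_domains` function, its default thresholds, and the sample table are all hypothetical.

```python
import re

# Hypothetical domain definitions: each domain pairs a data rule (a pattern
# applied to values) with a column name rule (a pattern applied to the name).
DOMAINS = {
    "Email": {
        "data_rule": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
        "column_name_rule": re.compile(r"e-?mail", re.IGNORECASE),
    },
    "SSN": {
        "data_rule": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
        "column_name_rule": re.compile(r"ssn|social", re.IGNORECASE),
    },
}


def discover_domains(table, max_rows=1000, min_conformance_pct=80.0):
    """Return (column, domain, name_match, conformance_pct) for each match.

    A column is reported for a domain when its name satisfies the column
    name rule, or when the share of sampled values satisfying the data rule
    reaches min_conformance_pct.
    """
    results = []
    for column, values in table.items():
        # Limit the number of rows inspected, then ignore missing values.
        sample = [v for v in values[:max_rows] if v is not None]
        for domain, rules in DOMAINS.items():
            name_match = bool(rules["column_name_rule"].search(column))
            matched = sum(1 for v in sample if rules["data_rule"].match(str(v)))
            pct = 100.0 * matched / len(sample) if sample else 0.0
            if name_match or pct >= min_conformance_pct:
                results.append((column, domain, name_match, round(pct, 1)))
    return results


# Made-up sample data keyed by column name.
table = {
    "contact_email": ["a@example.com", "b@example.org", "not available", "c@example.net"],
    "tax_id": ["123-45-6789", "987-65-4321", "111-22-3333", "444-55-6666"],
    "notes": ["call back", "sent invoice", "n/a", "ok"],
}

for row in discover_domains(table):
    print(row)
```

With this sample, `contact_email` is reported through its column name even though only 75% of its values conform, while `tax_id` is reported through its data alone. This is why the two rule types complement each other.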
Features
The package contains data domains that include Age, Birthday, Country, CreditCardNumber, DrivingLicenseNumber, Email, Gender, SSN, State, Salary, URL, VehicleRegPlateNumber, and Zip code.
- PowerCenter 9.1, PowerCenter 9.5