Tuesday, 23 September 2014

Oracle Enterprise Data Quality – About Profilers


Oracle Enterprise Data Quality – About Profilers:

·          Profilers are intended purely to analyze data
·          They help you understand the data quickly
·          They help find issues in the data provided
·          Profilers do not 'check' data against business rules
·          Profilers do not have output filters such as ‘valid’ or ‘invalid’ records

The following are some of the available profilers:

1.     Quickstats Profiler: Analyzes the data overall, i.e. high-level completeness, duplication, and value frequency across many attributes, and highlights possible issues. Input: ‘any’ data type.

2.     Character Profiler: Analyzes a number of attributes and counts the instances of each character. Input: ‘String’ data type only.

3.     Contained Attributes Profiler: Analyzes records to find pairs of attributes where one attribute value commonly contains the other. Input: ‘any’ data type.

4.     Data Types Profiler: Analyzes attribute values for their data type (e.g. String, Number or Date) and assesses data type consistency. Input: ‘any’ data type.

5.     Date Profiler: Analyzes a Date attribute for date distribution by day of week, day of month, day of year, month and year. Input: ‘Date’ data type only.

6.     Equal Attributes Profiler: Analyzes records to find pairs of attributes that commonly have the same values. Input: ‘any’ data type.

7.     Frequency Profiler: Analyzes value frequency across many attributes. Input: ‘any’ data type.

8.     Length Profiler: Analyzes a number of attributes and measures the length of their values in characters. Input: ‘any’ data type.

9.     Max/Min Profiler: Finds minimum and maximum values, e.g. longest, shortest, lowest and highest. Input: ‘any’ data type.

10.  Number Profiler: Analyzes a Number attribute for number distribution across user-defined bands. Input: ‘Number’ data type only.

11.  Patterns Profiler: Analyzes character patterns, and pattern frequency, across many attributes. Input: ‘any’ data type.

12.  Record Completeness Profiler: Analyzes records for their completeness across many attributes. Input: ‘any’ data type.

13.  Record Duplication Profiler: Analyzes records for duplicates across many attributes. Input: ‘any’ data type.

14.  RegEx Patterns Profiler: Analyzes a number of attributes for values that match a list of regular expressions.
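
To give a feel for the kind of summary a profiler produces, here is a minimal Python sketch of two of the ideas in the list above: high-level completeness and frequency statistics (in the spirit of the Quickstats Profiler) and character-pattern frequency (in the spirit of the Patterns Profiler). This is not EDQ code and does not use any EDQ API. The function names, the sample values, and the pattern notation (A for an uppercase letter, a for any other letter, N for a digit) are assumptions made purely for illustration.

```python
from collections import Counter


def quick_stats(values):
    """Summarize one attribute: completeness, distinct count, top value.

    Hypothetical helper for illustration only; not part of EDQ.
    """
    # Treat None and the empty string as "no data".
    populated = [v for v in values if v not in (None, "")]
    freq = Counter(populated)
    return {
        "total": len(values),
        "populated": len(populated),
        "nulls": len(values) - len(populated),
        "distinct": len(freq),
        "most_common": freq.most_common(1)[0] if freq else None,
    }


def char_pattern(value):
    """Map a string to a pattern: A = uppercase letter, a = other letter,
    N = digit; every other character is kept as-is (assumed notation)."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("N")
        elif ch.isalpha():
            out.append("A" if ch.isupper() else "a")
        else:
            out.append(ch)
    return "".join(out)


def pattern_frequency(values):
    """Count how often each character pattern occurs, skipping empty values."""
    return Counter(char_pattern(v) for v in values if v)


# Made-up sample attribute values.
postcodes = ["SW1A 1AA", "M1 1AE", None, "12345", "M1 1AE", ""]

stats = quick_stats(postcodes)
patterns = pattern_frequency(postcodes)

print(stats)
print(patterns.most_common())
```

Note that the sketch only reports what is in the data. It does not label any record as valid or invalid, which matches the point above that profilers analyze data rather than check it against business rules.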
