Oracle Enterprise Data Quality – About Profiler:
· The profilers are intended purely to analyze data
· They are used to quickly understand the data
· They are used to help find the issues in provided data
· Profilers do not 'check' data for business rules
· Profilers do not have output filters like ‘valid’ or ‘invalid’ records
The following are some of the different Profilers:
1. Quickstats Profiler: it is used to analyze overall data, i.e. high-level completeness, duplication, and value frequency across many attributes, and to highlight possible issues. Input as ‘any’ data type
2. Character Profiler: it is used to analyze a number of attributes and count the instances of each character. Input as ‘String’ data type only
3. Contained Attributes Profiler: it is used to analyze records to find pairs of attributes where one attribute value commonly contains another. Input as ‘any’ data type
4. Data Types Profiler: it is used to analyze attribute values for their data type, e.g. String, Number or Date, and to assess data type consistency. Input as ‘any’ data type
5. Date Profiler: it is used to analyze a Date attribute for date distribution by day of week, day of month, day of year, month and year. Input as ‘Date’ data type only
6. Equal Attributes Profiler: it is used to analyze records to find pairs of attributes that commonly have the same values. Input as ‘any’ data type
7. Frequency Profiler: it is used to analyze value frequency across many attributes. Input as ‘any’ data type
8. Length Profiler: it is used to analyze a number of attributes and measure the length of values by number of characters. Input as ‘any’ data type
9. Max/Min Profiler: it is used to find minimum and maximum values, e.g. longest, shortest, lowest and highest. Input as ‘any’ data type
10. Number Profiler: it is used to analyze a Number attribute for number distribution across user-defined bands. Input as ‘Number’ data type only
11. Patterns Profiler: it is used to analyze character patterns, and pattern frequency, across many attributes. Input as ‘any’ data type
12. Record Completeness Profiler: it is used to analyze records for their completeness across many attributes. Input as ‘any’ data type
13. Record Duplication Profiler: it is used to analyze records for duplicates across many attributes. Input as ‘any’ data type
that match a list of regular expressions.
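To make the idea of profiling more concrete, here is a minimal Python sketch of the kind of statistics that the Quickstats, Patterns and Length profilers are described as producing. This is not EDQ code and does not use any EDQ API: the function names, the sample data and the pattern notation (`N` for a digit, `A`/`a` for upper-case and lower-case letters) are assumptions made purely for illustration. Note that, like a profiler, it only reports what is in the data and does not mark any record as valid or invalid.

```python
from collections import Counter


def populated(values):
    """Return only the values that are not null and not blank."""
    return [v for v in values if v is not None and str(v).strip() != ""]


def quickstats(values):
    """Completeness, duplication and top value frequency for one attribute."""
    present = populated(values)
    counts = Counter(present)
    return {
        "total": len(values),
        "populated": len(present),
        "nulls": len(values) - len(present),
        "distinct": len(counts),
        "duplicates": len(present) - len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


def char_pattern(value):
    """Replace digits with N and letters with A/a; keep other characters."""
    out = []
    for ch in str(value):
        if ch.isdigit():
            out.append("N")
        elif ch.isalpha():
            out.append("A" if ch.isupper() else "a")
        else:
            out.append(ch)
    return "".join(out)


def pattern_frequency(values):
    """Count how often each character pattern occurs."""
    return Counter(char_pattern(v) for v in populated(values))


def length_frequency(values):
    """Count how often each value length (in characters) occurs."""
    return Counter(len(str(v)) for v in populated(values))


# Hypothetical sample attribute with a duplicate, a null and a blank value
phones = ["555-0101", "555-0101", "5550102", None, "55-0103", ""]

print(quickstats(phones))
print(pattern_frequency(phones))
print(length_frequency(phones))
```

Running this on the sample list shows one dominant pattern (`NNN-NNNN`) and two less frequent ones, which is the sort of result that points to possible issues worth investigating in the data.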