Scarus Data Quality Server (SDQ)
The Scarus Data Quality Server (SDQ) is our all-round solution for high-quality master data. The proven intelliSearch API, which is part of SDQ, offers various services that can easily be integrated into a heterogeneous system landscape. Integration takes place via SOAP or REST web services and is therefore universally applicable.
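To give an idea of what a REST-based integration might look like, the following minimal sketch posts a record to a hypothetical SDQ endpoint; the host, path, payload fields, and response format are assumptions and have to be adapted to the actual installation and service configuration.

```python
import requests

# Hypothetical SDQ host and service path; payload field names and the
# response format are assumptions, not the documented API.
SDQ_URL = "https://sdq.example.com/api/v1/duplicate-check"

record = {"firstName": "Erika", "lastName": "Mustermann", "city": "Nuernberg"}

response = requests.post(SDQ_URL, json=record, timeout=10)
response.raise_for_status()

# The result is assumed to be a JSON structure with a ranked hit list.
print(response.json())
```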
Choose from among our checking modules
- Duplicate Check
- Address Validation
- Fault-tolerant Search
- Enrichment
- Business Rule Check
The selected services can be integrated individually into the desired systems. Depending on the platform, this can be done by the customer or the ISO-Gruppe. With our 15 years of experience in the field of master data quality, we are happy to support you conceptually so that you can develop a high-quality data pool from unclean master data.
For connecting to SAP, we offer tried-and-tested products from our company for the various checking modules and services. Not only do they integrate fully with SAP standard transactions, they also let you process the check results from the various modules.
All modules are based on ISO's own intelliSearch® search technology for Enterprise Search & Matching. The search technology is memory efficient and scales both vertically and horizontally. The basic application provides these core functionalities:
- Adjustable near-real-time search using various search methods (fuzzy, phonetic, wildcard, phrase, date range, geo-distance, numeric, autocomplete & autosuggest, etc.)
- Data Ingestion Pipeline (preprocessing during data acquisition and other processing steps)
- Duplicate Matching Engine for defining record similarity and batch mass processing
The modules extend these functions and make them available as web services (a sketch of such a search call follows below).
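As a rough illustration of how several of the listed search methods could be combined in a single query, the sketch below assembles a hypothetical search request; the endpoint and all parameter names are assumptions, not the documented intelliSearch API.

```python
import requests

# Hypothetical search endpoint; clause structure and parameter names are illustrative only.
SEARCH_URL = "https://sdq.example.com/api/v1/search"

query = {
    "index": "customers",
    "clauses": [
        {"field": "lastName", "method": "fuzzy", "value": "Meier"},     # also finds Meyer, Maier, ...
        {"field": "street", "method": "wildcard", "value": "Haupt*"},   # prefix wildcard
        {"field": "city", "method": "autocomplete", "value": "Nuer"},   # autosuggest-style prefix
    ],
    "maxResults": 20,
}

hits = requests.post(SEARCH_URL, json=query, timeout=10).json()
print(hits)
```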
Duplicate Check
The duplicate check module consists of three components:
The first component allows you to create an index for the duplicate check, either individually or in batches. This is usually run once in the system to be integrated. During ongoing operation, an update function keeps the index up to date according to your requirements.
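A possible indexing workflow, sketched under the assumption of hypothetical endpoints for the initial batch load and subsequent single-record updates, might look like this:

```python
import requests

BASE = "https://sdq.example.com/api/v1"  # hypothetical base URL

# Initial load: index the existing master data in batches (typically run once).
batch = [
    {"id": "1001", "name": "Muster GmbH", "city": "Nuernberg"},
    {"id": "1002", "name": "Muster Gesellschaft mbH", "city": "Nürnberg"},
]
requests.post(f"{BASE}/index/customers/batch", json=batch, timeout=30).raise_for_status()

# Ongoing operation: push single-record updates to keep the index current.
changed = {"id": "1001", "name": "Muster GmbH & Co. KG", "city": "Nuernberg"}
requests.put(f"{BASE}/index/customers/1001", json=changed, timeout=10).raise_for_status()
```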
The second component lets you call individual checks on demand. You can perform a fault-tolerant search or a duplicate check against all indexed data. Various algorithms are available for this: classic comparison methods such as Jaro-Winkler and Damerau-Levenshtein as well as the ISO-Gruppe's own algorithms, each with different strengths and advantages depending on the requirements.
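To give a feel for how such comparison methods rate typical name variants, the following sketch implements the restricted Damerau-Levenshtein (optimal string alignment) distance in its generic textbook form; it is an illustration, not the tuned implementation used inside SDQ.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance:
    counts insertions, deletions, substitutions, and adjacent transpositions."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent transposition
    return d[len(a)][len(b)]

# Typical name variants: one substitution or one adjacent transposition -> distance 1.
print(osa_distance("Maier", "Meier"))  # 1
print(osa_distance("Meier", "Meeir"))  # 1
```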
The third component is an inventory check module that makes it possible to check an index in its entirety for duplicates. Here you benefit in particular from the scalability and high performance of the in-memory technology we use to process the check. Even with large amounts of data, high data throughput can be achieved with a correspondingly adapted system landscape.
Address Validation
To check the correctness of postal addresses, we rely on reference data from our partners Deutsche Telekom, Arvato Bertelsmann, or Informatica (AddressDoctor). At a frequency defined by you, we provide the current reference data, which you store in the SDQ directory. The next time the server instance is restarted, the new data is ready for checking.
Here, too, we offer an easily integrated and universally usable web service that you can incorporate into any application. Search algorithms specially optimized for address validation find potential hits in the reference database as an address is entered and provide you with a hit list of correctly spelled entries.
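A call to such a validation service could be sketched as follows; the endpoint, field names, and response structure are assumptions for illustration only.

```python
import requests

# Hypothetical address validation endpoint; field and parameter names
# depend on the SDQ configuration and the licensed reference data.
VALIDATE_URL = "https://sdq.example.com/api/v1/address/validate"

address = {
    "street": "Hauptstrase 12",  # misspelled on purpose
    "postalCode": "90402",
    "city": "Nurnberg",
    "country": "DE",
}

result = requests.post(VALIDATE_URL, json=address, timeout=10).json()

# The service is assumed to return a ranked hit list of correctly spelled entries.
for hit in result.get("hits", []):
    print(hit["street"], hit["postalCode"], hit["city"])
```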
Fault-tolerant Search
Fault-tolerant search makes it easy to find entries within an SDQ data pool. For each attribute, you can activate or deactivate general fault tolerance, define threshold values, and parameterize the comparison algorithms to be used. In addition to the field-specific search, you can also find search strings in long texts and define cross-field searches. Wildcards can also be used to filter search results more precisely. The search is called via web service.
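The following sketch shows how such a parameterized search request might be expressed; the endpoint, parameter names, algorithm identifiers, and threshold values are assumptions, not the documented web service interface.

```python
import requests

SEARCH_URL = "https://sdq.example.com/api/v1/search"  # hypothetical endpoint

# Per-attribute configuration: fault tolerance on or off, threshold values,
# and the comparison algorithm to use; all names and values are illustrative.
query = {
    "index": "customers",
    "clauses": [
        {"field": "lastName", "value": "Schmidt", "faultTolerant": True,
         "algorithm": "jaro-winkler", "threshold": 0.85},
        {"field": "companyName", "value": "Muster*", "faultTolerant": False},  # wildcard filter
    ],
    "fullText": "delivery address",  # cross-field / long-text search term
}

print(requests.post(SEARCH_URL, json=query, timeout=10).json())
```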
Enrichment
With this module, you feed one of your data sources into an SDQ data pool as a reference source. This makes it easy to add valuable additional information to existing data. The data records are then automatically compared using the duplicate check module. If the match is successful, the module copies field data from the reference source to the assigned master record.
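The copy-on-match idea behind the enrichment step can be pictured roughly as follows; the field names, score handling, and helper function are made up for illustration and are not SDQ code.

```python
# Illustration of the copy-on-match idea with made-up field names.
ENRICH_FIELDS = ["industryCode", "vatId", "dunsNumber"]

def enrich(master: dict, reference: dict, min_score: float, score: float) -> dict:
    """Copy selected fields from a matched reference record into the master
    record, but only if the duplicate-check score is high enough and the
    master field is still empty."""
    if score < min_score:
        return master
    enriched = dict(master)
    for field in ENRICH_FIELDS:
        if not enriched.get(field) and reference.get(field):
            enriched[field] = reference[field]
    return enriched

master = {"name": "Muster GmbH", "vatId": ""}
reference = {"name": "Muster GmbH", "vatId": "DE123456789", "industryCode": "4711"}
print(enrich(master, reference, min_score=0.9, score=0.96))
```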
Business Rule Check
This additional filter package provides further ready-made filters that contain business logic for formatting person and company master data. The module extends the generic field-processing and thesaurus functions of the basic module with domain-specific business rules.
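As a simple illustration of what such a formatting rule for company master data might do, the sketch below normalizes a few common spellings of the German legal form "GmbH"; the rules actually shipped with the filter package are configurable and far more comprehensive.

```python
# Illustrative formatting rule: normalize common spellings of the legal
# form "GmbH" at the end of a company name. Variants and canonical form
# are examples only.
LEGAL_FORM_VARIANTS = {
    "gesellschaft mit beschränkter haftung": "GmbH",
    "gesellschaft mbh": "GmbH",
    "ges.m.b.h.": "GmbH",
    "gmbh.": "GmbH",
}

def normalize_company_name(name: str) -> str:
    name = " ".join(name.split())  # collapse whitespace
    lowered = name.lower()
    for variant, canonical in LEGAL_FORM_VARIANTS.items():
        if lowered.endswith(variant):
            return name[: len(name) - len(variant)].rstrip() + " " + canonical
    return name

print(normalize_company_name("Muster  Ges.m.b.H."))                            # Muster GmbH
print(normalize_company_name("Muster Gesellschaft mit beschränkter Haftung"))  # Muster GmbH
```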