Merge / Purge Records
The Merge / Purge Records feature is a data management tool used for eliminating duplicate records from your marketing database. This feature allows you to identify duplicate records for the purpose of combining them into a single record (the "merge"), then deleting all of the other duplicates (the "purge"). You can set up a Merge / Purge job to execute automatically, or as a manual process that allows you to review and modify the results prior to executing the Merge / Purge.
Hard vs. Soft Deletes
When the platform runs a Merge / Purge job, the records identified as duplicates are initially "soft deleted," meaning they're flagged for deletion, but are still present in the database table. Then, every hour, a process runs looking for these "soft deleted" records that were flagged more than 24 hours ago. These records are then "hard deleted," meaning the records are permanently removed from the table.
While records are in a "soft deleted" state, they can't be selected as recipients of a Campaign. However, "soft deletes" will still show up in Filter counts. If you run a Merge / Purge job for a table, then run a Filter count off that same table, the numbers generated by the Filter count may be misleading, as the count will continue to show "soft deletes" until they are "hard deleted" 24 hours later.
Access
The Merge / Purge Records screen is accessible by the following method:
-
From the Main menu, select Data > Execution > Merge / Purge Records
Create an Automatic Merge / Purge Job
Missing variable reference allows you define a Merge / Purge process that automatically identifies and deletes duplicate records based on the specifications you provide. Or, you can create a manual job that allows you to review and modify the duplicate records (see "Create a Manual Merge / Purge Job" below for more details on that process).
To create an automatic Merge / Purge job:
-
Select the Merge / Purge Jobs tab.
-
From the Data Source drop-down menu, select the table that you want to search for duplicates.
-
From the Match Field drop-down menu, select the desired field on the selected Data Source table. The system will identify duplicate values in this field, using exact byte-for-byte matching logic.
-
From the Sort Field drop-down menu, select the field by which you want to sort the set of duplicate records (sometimes referred to as the "dupe set").
-
From the Sort Order drop-down menu, select how you want to sort the dupe set -- either Ascending or Descending.
The Sort Field and the Sort Order are used to determine the "winner" (i.e., the record in the dupe set that gets kept). The system will keep the record that appears at the top of the dupe set, after sorting the records based on Sort Field and Sort Order. As an example, let's say you want to find duplicate email addresses in your database, and keep the record with the most recent purchase date. You would select Purchase Date as the "Sort Field," then sort it in Descending order (most recent first).
-
The Fall-back Sort Order field is used to select a sort method in case the records in the dupe set don't have values in the selected Sort Order field. From this drop-down menu, select an alternate sort method -- either Newest Record First or Oldest Record First.
-
Optionally, you can limit the set of potential records by using a Filter. Instead of looking through the entire Data Source table, the system will look only within the Filter's result set. This feature allows you to utilize more complex business rules and logic than just a single Match Field. To use an existing Filter, click the browse button (magnifying glass icon) to browse for and select it. You can also create a new Filter by clicking the new button (plus-sign icon). Please see Filters for more details on this process.
Note: The Filter must be defined using the same Data Source table (selected above in step 2) in order to be available for selection.
-
Click create job. A warning dialog is displayed, telling you that the Merge / Purge results can't be rolled back or undone. Click ok.
-
The log of Automatic Merge / Purge Jobs is updated to include your new job, along with its status. Click refresh to see updated status for this job.
View Automatic Merge / Purge Job Status
The Merge / Purge Records screen provides a log of all the finished and in-progress automatic Merge / Purge Jobs. To view the statistics and status for an automatic Merge / Purge job:
-
Select the Merge / Purge Jobs tab.
-
The bottom of the screen shows a log of all the automatic Merge / Purge jobs. This log shows the match field, the Filter (if used), the job status, the date / timestamp that the job finished, and the number of records that were merged. Optionally, you can click the Filter name to jump to the Filter details screen.
-
Optionally, click the down-arrow icon next to a job to see additional information, including the sort field and sort order.
-
If necessary, click refresh to see updated status and statistics.
Create a Manual Merge / Purge Job
Cheetah Digital allows you define a Merge / Purge process that automatically identifies and deletes duplicate records based on the specifications you provide (see "Create an Automatic Merge / Purge Job" above for more details on that process). Or, you can create a manual job, that allows you to review the duplicate records, and determine which values from which records you want to keep.
To create a manual Merge / Purge job:
-
Select the Manually Merge / Purge Records tab.
-
From the Data Source drop-down menu, select the table that you want to search for duplicates.
-
From the Match Field drop-down menu, select the desired field on the selected Data Source table. The system will identify duplicate values in this field, using exact byte-for-byte matching logic.
-
Optionally, you can limit the set of potential records by using a Filter. Instead of looking through the entire Data Source table, the system will look only within the Filter's result set. This feature allows you to utilize more complex business rules and logic than just a single Match Field. To use an existing Filter, click the browse button (magnifying glass icon) to browse for and select it. You can also create a new Filter by clicking the new button (plus-sign icon). Please see Filters for more details on this process.
Note: The Filter must be defined using the same Data Source table (selected above in step 2) in order to be available for selection.
-
Click find duplicates. The system displays the results of the matching process at the bottom of the screen. Each set of duplicate records (sometimes referred to as a "dupe set") is listed, showing the duplicate value, and the number of records included within the dupe set.
-
Click on a dupe set. The system displays a pop-up window showing the details of each record that was included within this dupe set.
-
Optionally, select specific values from specific records that you want to retain. For example, you could pick a "First Name" value from Record 1, an "Email Address" from Record 2, and a "Phone Number" value from Record 3. If you want to select all of the values within a particular record, click the header row for that record. Within the "Resultant Record" section, the system displays what data the final record will contain.
-
When finished, click save new merged record. The system creates a single record consisting of the values in the Resultant Record section, and deletes all of the other records in the dupe set.