The Engage DataBase Model

Anybody familiar with databases will understand relational database models, used by many applications due to its efficiency from both the perspective of data storage and information retrieval. Relational data models store data in a normalized way, eliminating the overhead potentially being caused by creating double entities.

Because of technical and functional reasons, Marigold Engage utilizes a star data model – basically a de-normalized relational model and very easy to understand and query for an end user.

At the center of the star lies an Audience List containing contacts as the intended target for campaigns – everything related tracks back a specific record in that list. Data that cannot be related back to the Audience List has no added value in Marigold Engage.

 

Deciding on the number of Audience Lists

Aim to have the least amount of Audience Lists possible – ideally one - providing a single system of truth across all campaigns. Multiple Audience Lists can be used when there is a complete segregation between records in each list (for example: B2B & B2C) because Audience Lists that contain overlapping contacts give rise to a number of issues :

  • Changes to a record in one list must propagate to the other. Consider the situation in which a contact amends a mobile phone entry. This change must be synchronized across multiple Audience Lists for multiple entries. Having just the one contact record ensures the change takes place in one location.
  • Consolidating duplicate accounts across multiple Audience Lists is near-impossible, as they may not share the same related data structures.
  • Potential legal issues when opting out from one list does not cascade to removal from another list, giving the impression that the unsubscription request has been ignored.
  • A single invalid email address in one Audience List is flagged (according to predefined bounce rules) as invalid. If the same address exists multiple times across multiple Audience Lists, the same error is repeated and lowers the sender reputation each time, causing delays in delivery or no delivery at all.
  • A contact existing in multiple Audience Lists may be confused when receiving conflicting messages from different campaigns.

Consequently, Make sure that your decision about multiple Audience Lists is founded as reversing this in the future is very difficult :

  • Data consolidation of separate Audience Lists (each with their own campaigns) requires a considerable amount of custom development work.
  • Historical data of previous campaigns may be lost or rendered useless.

Hence it is strongly recommended to keep it to a single Audience List and use segments to sub-divide the records according to business needs.

 

Determine the content of the list

When contacts are interacting with campaigns/emails, almost every time the contact's profile information is fetched from the database. As the Audience List is the center of all the information, that's the main entry to all the profile information. However, if the Audience List is locked (because of maintenance, indexing, update of the main table, etc...) the retrieval of information will be on hold. Therefore, it's absolutely necessary to keep this (and all other) tables as narrow as possible.

Data retrieval

What is the maximum amount of fields (datatypes + lengths) that is supported by Marigold Engage and will render performant results on the part of data retrieval?

It's important to realize that the upper limit is not set by Marigold Engage but rather the underlying database.
Marigold Engage uses Microsoft SQL (> 2016 SP1).
Just like most databases there is a recommendation on the maximum theoretical data size of a single record. One record should fit within a single page (8060 bytes). If it doesn’t – at least part of the data gets moved to a secondary location within the database data file (LOB). This hurts performance tremendously, as the address pointer to the secondary allocation unit that stored the main page holding the record, needs to be retrieved and the additional information needs to be fetched from the secondary location.

The 8060 bytes limit is described in much more detail on this page: https://technet.microsoft.com/en-us/library/ms186981(v=sql.105).aspx
It is worth noting that this upper limit of 8060 bytes should not be used as the upper limit for maximum record size in Marigold Engage. A rule of thumb would be to keep tables as narrow as possible. Spreading data over several tables should be done by grouping it together in logical entities & in some cases by function (selection, personalization). The narrower the table, the easier it will become to apply performance enhancers like indexes, even later on.

Only when a certain field is used in almost every segment or email for the purpose of either selecting or personalizing data, it's worth to consider placing it in the Audience List. For all other fields (> 95%) it's best to group fields together in logical entities and create 1:1 profile extensions for them.

One of the main reasons is performance when retrieving information. Maintenance and import tasks will be significantly faster as only one linked table is targeted at once and operations complete faster as there is less fields to take into account.

In conclusion: There are around 15 default fields being added automatically when creating a user list. Usually we would consider adding only a single (integer) field that contains the customer identification number. Any other information is most likely better suited to be stored in a profile extension (1:1).

 

Deciding on the relations

The type of relation you create between your main Audience List and any other linked list determines how you can use that data for personalization and selection :

  • 1:1 linked data can be used directly in any type of message for personalization.
  • 1:N and 1:N:1 can be used for personalization in Custom Journeys, by means of Journey data.
  • 1:1, 1:N and 1:N:1 linked data can be used in selections, such as the lookup.

When data is retrieved from the profile extensions, the relation between the Audience List and these profile extensions is used. Therefore it's very important that this relation uses a unique key of which the datatype is as small as possible, typically a numeric type.

If this is not possible for any reason, please take the following into account :
Use one key (the customer key) for the synchronization and one key for the Engage data model.

Some explanation :
If the customer has its own set of key fields, do know that we only need those to perform the synchronization with the Engage data model. Engage can still define its own keyset for building selections for example. The keys defined by the customer can be stored once in the Audience List. Once that is the case, every table linked to this Audience List will be linked through the Engage User ID or the field in the Audience List that holds the unique ID of the customer.