Identity Clusters

When two distinct_ids are merged, Mixpanel creates a mapping that records that both distinct_ids refer to the same user. A collection of two or more distinct_ids mapped to the same user is called an identity cluster.

From the distinct_ids in an identity cluster, Mixpanel chooses a single distinct_id under which all of the user's events are stored. This value is known as the canonical distinct_id for the cluster.
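To make the terms concrete, here is a minimal conceptual sketch of an identity cluster in Python. It is not Mixpanel's implementation: the `IdentityClusters` class, its method names, and the sample ids are hypothetical, and the rule used to pick the canonical distinct_id (the first member of the cluster) is an assumption made only so the example runs. Mixpanel makes that choice itself.

```python
class IdentityClusters:
    """Toy model: tracks which distinct_ids refer to the same user."""

    def __init__(self):
        # distinct_id -> list of cluster members, shared by every member
        self._cluster_of = {}

    def merge(self, id_a, id_b):
        """Record that id_a and id_b refer to the same user."""
        cluster_a = self._cluster_of.setdefault(id_a, [id_a])
        cluster_b = self._cluster_of.setdefault(id_b, [id_b])
        if cluster_a is cluster_b:
            return  # already in the same cluster
        # Fold cluster_b into cluster_a and repoint its members.
        for member in cluster_b:
            cluster_a.append(member)
            self._cluster_of[member] = cluster_a

    def members(self, distinct_id):
        """All distinct_ids in the same cluster as distinct_id."""
        return list(self._cluster_of.get(distinct_id, [distinct_id]))

    def canonical(self, distinct_id):
        # Assumption for illustration only: the first member is canonical.
        return self._cluster_of.get(distinct_id, [distinct_id])[0]


clusters = IdentityClusters()
clusters.merge("anon-1", "user-42")
clusters.merge("user-42", "anon-2")

# Every member of the cluster resolves to the same canonical distinct_id.
print(clusters.canonical("anon-2") == clusters.canonical("user-42"))  # True
```

Note that the sketch has no unmerge operation, which mirrors the fact that merged distinct_ids cannot be separated again.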

As new events and profile updates are ingested, Mixpanel consults these mappings so that each one is attributed to the canonical distinct_id and its profile.

Additionally, when a cluster is created or updated, Mixpanel looks through your historical project data for events ingested with any distinct_id in the identity cluster. Events that were ingested with a value other than the current canonical distinct_id are remapped to the canonical distinct_id, which stitches together all event activity for the distinct_ids in the cluster.
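The two behaviors described above, the lookup on newly ingested events and the remapping of historical events, can be sketched as follows. This is an illustration under stated assumptions, not Mixpanel's internals: `resolve`, `ingest`, `remap_history`, the `canonical_of` dictionary, and the in-memory `store` list are all hypothetical names, and the sketch rewrites stored events in place only for simplicity.

```python
def resolve(distinct_id, canonical_of):
    """Return the canonical distinct_id, or the id itself if it is in no cluster."""
    return canonical_of.get(distinct_id, distinct_id)


def ingest(event, canonical_of, store):
    """Attribute a newly ingested event to the canonical distinct_id."""
    store.append({**event, "distinct_id": resolve(event["distinct_id"], canonical_of)})


def remap_history(store, canonical_of):
    """Re-attribute events that were stored before the cluster existed."""
    for event in store:
        event["distinct_id"] = resolve(event["distinct_id"], canonical_of)


store = []
canonical_of = {}  # hypothetical mapping: distinct_id -> canonical distinct_id

# An anonymous event arrives before any merge has happened.
ingest({"event": "Page View", "distinct_id": "anon-1"}, canonical_of, store)

# The ids are merged; assume "user-42" is chosen as canonical.
canonical_of = {"anon-1": "user-42", "user-42": "user-42"}
remap_history(store, canonical_of)

# A later event sent with the old id is attributed to the canonical id.
ingest({"event": "Purchase", "distinct_id": "anon-1"}, canonical_of, store)

print([e["distinct_id"] for e in store])
```

After the merge, both the historical event and the new event carry the same distinct_id, so reports see one user rather than two.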

If profiles exist for multiple distinct_ids in a cluster, all profiles are hidden except the one with the canonical distinct_id. Properties from hidden profiles are not automatically synced to the visible profile. For this reason, it is best practice to create only a single profile for a user, once they can be identified.
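A small sketch shows why multiple profiles in one cluster can lose information from view. The `split_profiles` function and the sample data are hypothetical and only model the visibility rule described above.

```python
def split_profiles(cluster_ids, canonical_id, profiles):
    """Return (shown, hidden): the canonical profile, and the other cluster profiles."""
    shown = profiles.get(canonical_id)
    hidden = {
        distinct_id: profiles[distinct_id]
        for distinct_id in cluster_ids
        if distinct_id != canonical_id and distinct_id in profiles
    }
    return shown, hidden


# Two profiles were created for the same user under different distinct_ids.
profiles = {
    "anon-1": {"plan": "free"},
    "user-42": {"email": "a@example.com"},
}

shown, hidden = split_profiles(["anon-1", "user-42"], "user-42", profiles)

print(shown)   # only the canonical profile is displayed
print(hidden)  # still stored, but not merged into the displayed profile
```

In this example the `plan` property lives only on the hidden profile, so it does not appear on the displayed one. Creating a single profile per user avoids this situation.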

There are two important things to note about the ID merge system. First, once distinct_ids have been merged, they cannot be unmerged. Second, the canonical distinct_id for a cluster can change. When it does, all events and profile updates are still mapped together for the same user, just under a different distinct_id. The only thing that can change is the user profile that is displayed, and that is possible only if multiple distinct_ids in the cluster have profiles.
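The effect of a canonical distinct_id change can be illustrated with a short sketch. The `user_view` function and the sample data are hypothetical. The sketch assumes only what is stated above: events stay grouped by cluster, and the displayed profile follows the canonical distinct_id.

```python
def user_view(cluster_ids, canonical_id, events, profiles):
    """What is shown for one user: all cluster events plus the canonical profile."""
    return {
        "distinct_id": canonical_id,
        "events": [e["event"] for e in events if e["distinct_id"] in cluster_ids],
        "profile": profiles.get(canonical_id),
    }


cluster = {"anon-1", "user-42"}
events = [
    {"event": "Page View", "distinct_id": "anon-1"},
    {"event": "Purchase", "distinct_id": "user-42"},
]
profiles = {
    "anon-1": {"plan": "free"},
    "user-42": {"email": "a@example.com"},
}

before = user_view(cluster, "anon-1", events, profiles)
after = user_view(cluster, "user-42", events, profiles)

print(before["events"] == after["events"])    # True: event history is unchanged
print(before["profile"] == after["profile"])  # False: a different profile is shown
```

If only one distinct_id in the cluster had a profile, the displayed profile could not be swapped for another one in this way, which is one more reason to keep a single profile per user.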