In this post, we’ll discuss how to deal with migrating hashed passwords from your current identity provider into Azure AD B2C. Password migrations in which you either have access to the users’ passwords in clear text (terrifying!) or have access to the legacy IDP for real-time credential validation are simpler problems to handle. So instead, let’s dive into how you might approach a situation in which you have decided to part ways with your existing identity provider, and find yourself with a giant exported list of user objects with their password hashes and perhaps salts.
Migrating user objects into Azure AD B2C
Before you start migrating your list of users into Azure AD B2C, make sure you’ve created attributes in B2C’s user repository that match the schema of your user list. In addition to mirroring your schema, you will also need to create a few more attributes:
- ‘User Migrated’ – A flag that indicates whether a particular user’s migration is complete
- Salt (Optional) – If the legacy IDP used a separate salt per password, create an attribute in B2C to store the password salt. This can also apply to some schemes that embed the salt within the password hash string.
Now that you have the schema set up, use MS Graph to bulk upload your exported list into Azure AD B2C. Make sure to map the old password hash into B2C’s encrypted ‘password’ attribute rather than a regular extension attribute, which is less secure. B2C runs its own hashing algorithm on any string stored in the password attribute, so after migration B2C holds a hash of a hash of each user’s password.
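As a rough illustration of the upload step, here is a minimal Python sketch that builds the request bodies for Microsoft Graph’s JSON batching endpoint (POST /$batch, which accepts at most 20 requests per batch). It only constructs the payloads and does not call Graph. The record field names (display_name, email, password_hash, salt), the extension attribute names UserMigrated and Salt, and the extension prefix are assumptions made for this example. The passwordPolicies value is also an assumption: a hash string may not satisfy the default password complexity rules, so verify the setting against your tenant before relying on it.

```python
import json
from typing import Dict, Iterator, List

# Microsoft Graph JSON batching accepts at most 20 requests per batch.
GRAPH_BATCH_LIMIT = 20


def build_user_payload(record: Dict, tenant_domain: str, ext_prefix: str) -> Dict:
    """Build a create-user body for one exported record.

    `record` keys and the `<ext_prefix>_UserMigrated` / `<ext_prefix>_Salt`
    attribute names are hypothetical; adapt them to your own schema.
    """
    payload = {
        "accountEnabled": True,
        "displayName": record["display_name"],
        "identities": [
            {
                "signInType": "emailAddress",
                "issuer": tenant_domain,
                "issuerAssignedId": record["email"],
            }
        ],
        # The legacy hash goes into the password attribute, not an extension.
        "passwordProfile": {
            "password": record["password_hash"],
            "forceChangePasswordNextSignIn": False,
        },
        # Assumption: relax complexity checks so a hash string is accepted.
        "passwordPolicies": "DisablePasswordExpiration,DisableStrongPassword",
        f"{ext_prefix}_UserMigrated": False,
    }
    if record.get("salt"):
        payload[f"{ext_prefix}_Salt"] = record["salt"]
    return payload


def build_batches(records: List[Dict], tenant_domain: str, ext_prefix: str) -> Iterator[Dict]:
    """Yield $batch request bodies, each holding up to 20 create-user calls."""
    for start in range(0, len(records), GRAPH_BATCH_LIMIT):
        chunk = records[start:start + GRAPH_BATCH_LIMIT]
        yield {
            "requests": [
                {
                    "id": str(index + 1),
                    "method": "POST",
                    "url": "/users",
                    "headers": {"Content-Type": "application/json"},
                    "body": build_user_payload(rec, tenant_domain, ext_prefix),
                }
                for index, rec in enumerate(chunk)
            ]
        }
```

Each yielded dictionary can be serialized with json.dumps and posted to the batch endpoint with whatever HTTP client and token acquisition you already use. Throttling and retry handling are left out of this sketch.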
Reverse engineering the password hash API
Next, you will need to set up an API that implements the legacy hashing algorithm. In most cases, you will be able to find documentation describing the hashing algorithm your existing credential store uses, the number of iterations it runs, and similar parameters. With that, you should be able to borrow some code from https://github.com/topics/hashing-algorithms and deploy a cloud-scale instance of the hashing API. Most salted hashing algorithms either embed the salt in the hashed password string itself or keep a separate salt on each user object. Test your implementation with a few known passwords to confirm that encoding a known password string produces the same output from the legacy system and from your hashing API.
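To make this concrete, here is a minimal sketch of the core of such a hashing API in Python. It assumes, purely for illustration, that the legacy system used PBKDF2-HMAC-SHA256 with 10,000 iterations and base64 output, and that an embedded salt is stored as "salt$hash". Your legacy IDP’s algorithm, iteration count, and encoding will likely differ, so treat every parameter here as a placeholder.

```python
import base64
import hashlib
import hmac
from typing import Tuple

# Assumed legacy parameters; replace with the values documented for your IDP.
LEGACY_ITERATIONS = 10_000


def legacy_hash(password: str, salt: bytes, iterations: int = LEGACY_ITERATIONS) -> str:
    """Recompute the (assumed) legacy hash: PBKDF2-HMAC-SHA256, base64 encoded."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return base64.b64encode(digest).decode("ascii")


def split_embedded_salt(stored: str) -> Tuple[bytes, str]:
    """Split a hypothetical 'base64(salt)$base64(hash)' string into its parts."""
    salt_b64, hash_b64 = stored.split("$", 1)
    return base64.b64decode(salt_b64), hash_b64


def matches_legacy(password: str, salt: bytes, expected_hash: str,
                   iterations: int = LEGACY_ITERATIONS) -> bool:
    """Check a known password against a hash exported from the legacy system."""
    actual = legacy_hash(password, salt, iterations)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(actual, expected_hash)
```

Running matches_legacy against a handful of test accounts whose passwords you know is the quickest way to confirm that the algorithm, salt handling, and output encoding all line up with the legacy system.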
B2C Policy Updates
Having set up the building blocks for this migration, it is now time to update B2C’s custom policies to orchestrate a user journey for users who are being migrated.
When a user enters their password on your app or website, your policy will need to execute the following sequence:
- Pass the clear text password to the hashing API and obtain Hash(Old_PW)
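- Validate Hash(Old_PW) against the value stored in Azure AD B2C’s password attribute. Because B2C hashes whatever is written to that attribute, this means presenting Hash(Old_PW) to B2C as the password rather than comparing the strings yourself.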
- If the values match, overwrite B2C’s password attribute that currently is storing Hash(Old_PW) with the clear-text password from Step #1 instead. If the values don’t match, the user has entered a bad password and you can throw an error back to the user.
- Toggle the ‘User Migrated’ flag and clear the salt attribute for that user. After completing Step #3 and Step #4, no relics of the legacy password hash remain on this user’s profile.
On subsequent user logins, your policy can look up the ‘User Migrated’ flag, determine that a user has already completed migration, and compare their clear-text password directly against the stored password in B2C instead of invoking the Password Hashing API.
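The sequence above, including the shortcut for users who have already migrated, can be modeled in a few lines of Python. This is a simulation of the logic only, not B2C custom policy XML: directory_hash is a stand-in for whatever B2C does internally to the password attribute, and the User class and its field names are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable, Optional


def directory_hash(value: str) -> str:
    """Stand-in for B2C's own hashing of the password attribute (assumption)."""
    return hashlib.sha256(("directory:" + value).encode("utf-8")).hexdigest()


@dataclass
class User:
    password: str                # directory_hash of whatever was written
    migrated: bool = False       # the 'User Migrated' flag
    salt: Optional[str] = None   # the optional salt attribute


def sign_in(user: User, clear_pw: str,
            legacy_hash: Callable[[str, Optional[str]], str]) -> bool:
    """Return True on a successful login, migrating the user on first success."""
    if user.migrated:
        # Subsequent logins: validate the clear-text password directly.
        return hmac.compare_digest(directory_hash(clear_pw), user.password)

    # Step 1: obtain Hash(Old_PW) from the hashing API.
    candidate = legacy_hash(clear_pw, user.salt)
    # Step 2: present Hash(Old_PW) as the password.
    if not hmac.compare_digest(directory_hash(candidate), user.password):
        return False  # bad password; nothing is changed
    # Step 3: overwrite the stored value with the clear-text password.
    user.password = directory_hash(clear_pw)
    # Step 4: flip the flag and drop the salt.
    user.migrated = True
    user.salt = None
    return True
```

Note that a failed attempt leaves the record untouched, so a user who mistypes their password on the first try can still be migrated on the next attempt.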
Reducing migration downtime
One last thing to be aware of is that if you are dealing with a massive user migration of several million records, the MS Graph write to Azure AD B2C can take several hours. On average, you can expect to write about 500K records per hour. If your migration takes several hours, there is a small likelihood that some customers will change their passwords in the legacy system before your migration to Azure AD B2C completes. Scheduling a lengthy maintenance window is usually not great for business. A better option is to let users keep logging into the legacy system during the main migration effort, and then schedule a far shorter maintenance window in which you run a delta query against the old system to capture only the records that were updated and write those user records into Azure AD B2C prior to the cut-over.
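As a back-of-the-envelope aid, the sketch below estimates the bulk write duration from the 500K records per hour figure above and selects the delta set for the short maintenance window. It assumes each exported record carries an updated_at timestamp from the legacy system; that field name is hypothetical, and your legacy store may expose change tracking differently.

```python
from datetime import datetime, timezone
from typing import Dict, List

# Average write rate quoted above; measure your own tenant before planning.
RECORDS_PER_HOUR = 500_000


def estimated_hours(record_count: int) -> float:
    """Rough duration of the bulk write at the average rate."""
    return record_count / RECORDS_PER_HOUR


def delta_records(records: List[Dict], export_started_at: datetime) -> List[Dict]:
    """Keep only records changed in the legacy system after the export began.

    Assumes a timezone-aware 'updated_at' value on every record (hypothetical).
    """
    return [rec for rec in records if rec["updated_at"] > export_started_at]


# Example: 3 million users and an export that began at midnight UTC.
export_start = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
legacy_rows = [
    {"email": "a@example.com", "updated_at": datetime(2023, 12, 30, tzinfo=timezone.utc)},
    {"email": "b@example.com", "updated_at": datetime(2024, 1, 1, 3, 15, tzinfo=timezone.utc)},
]
hours_needed = estimated_hours(3_000_000)
changed = delta_records(legacy_rows, export_start)
```

Only the rows returned by delta_records need to be rewritten during the maintenance window, which is what keeps that window short.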
If you have questions, comments, or interesting ideas for how else you’ve approached migration patterns in Azure AD B2C, please leave a note in the comments. We’d love to hear from you!