
The Ultimate Guide to Migrating from SharePoint to Dataverse (Part Five - Transforming the Data)

Updated: Feb 2, 2024

After all of the planning, it's finally time to really begin the process of moving data into the Dataverse!

In this post, we'll discuss the import ETL (Extract, Transform, Load) process. ETL refers to the process of retrieving data from a source, modifying it as needed, and loading it into a destination data store.

We will once again reference our mapping document to transform data from our SharePoint data types into our Dataverse types. Most of them will likely be basic text-to-text or date-to-date conversions, but there are still some considerations you need to be aware of before beginning.

  1. Dataflows do not auto-detect data types

    1. Most data types will need to be set in the Dataflow before importing.

  2. Timezones

    1. Dataverse has three options for date-time columns - date only, time zone independent, and user local. See details here

  3. Special characters

    1. There are cases where invalid Unicode characters can cause problems during import. It's sometimes easier to spot-check your data and then see if you run into any issues during import than to try to find all issues manually.

  4. Embedded data

    1. Some columns, such as lookup or person/group columns, return a nested record that must be expanded. Nested lists behave similarly, but beware that expanding a nested list or table can cause record duplication.

  5. Case sensitivity

    1. Matching on string keys such as email addresses is case sensitive. It's best to convert them all to lowercase to make matching reliable.
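To illustrate the duplication point above, here's a small sketch using a made-up multi-value column named Tags. Expanding the nested list turns one record into one row per value:

let
    // One record holding a nested list of two values
    Source = Table.FromRecords({[Id = 1, Tags = {"A", "B"}]}),
    // Expanding the list produces one row per list item - both rows keep Id = 1
    Expanded = Table.ExpandListColumn(Source, "Tags")
in
    Expanded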

Next, I'll break down the conversion process for each SharePoint column type. This outlines how to clean up your data and stage it for a successful import.


Choice

Converting a choice column is one of the more complicated transformations from SharePoint to Dataverse, as the SharePoint value is a string and the Dataverse column will want the numeric value for the choice value. For example, with our Warehouse table, we have a Stock Type column that is a choice field in SharePoint. If I wanted to convert that to a choice value in Dataverse, I would need the numeric values of the choices in Dataverse.


This will require us to make a blank query so we can import the choice values.


Below is the query for our Warehouse Stock Type choices from our example.


let
    // One record per choice, using the numeric values from Dataverse
    Data = {
        [Label="Fulfillment", Value=122270000, DisplayName="Stock Type"],
        [Label="Seasonal", Value=122270001, DisplayName="Stock Type"],
        [Label="Bulk Only", Value=122270002, DisplayName="Stock Type"],
        [Label="Intake", Value=122270003, DisplayName="Stock Type"]
    },
    Source = Table.FromRecords(Data)
in
    Source
    

By default, the query will simply be named Query. You can rename it by right-clicking the query and selecting Rename.


The value of a choice column in the Dataverse is an integer that is unique within the scope of that choice. Choice IDs are seeded by the Choice value prefix setting on the publisher followed by 0000, and each additional choice increments from there (0001, 0002, etc). You can get these values from the choice column in the Dataverse by navigating to the column, editing it, and scrolling down to the Choices section.
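With the choice query in place, one way to convert the text labels to their numeric values is a merge. This is a sketch, assuming your main query is named Warehouse and the choice query has been renamed StockTypeChoices:

let
    // Merge the SharePoint text value against the choice lookup query
    Merged = Table.NestedJoin(Warehouse, {"Stock Type"}, StockTypeChoices, {"Label"}, "Choice", JoinKind.LeftOuter),
    // Expand only the numeric Value, renaming it for the Dataverse mapping
    Expanded = Table.ExpandTableColumn(Merged, "Choice", {"Value"}, {"Stock Type Value"})
in
    Expanded

A left outer join keeps rows whose label has no match, which makes bad or retired choice labels easy to spot as nulls before import.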

Copy these values, then supply ChatGPT with my example schema above along with your choice's display name, and it will generate the query for you.



Rename the query after the choice column to keep all of your choice queries organized.


For multi-select choice columns, see the separate article here.


Date and Time

Date and time columns are generally ready to go immediately, but consider time zones and whether to convert date-time values to date only.
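As a sketch of that conversion (assuming a datetime column named Modified), keeping only the date portion for a Dataverse date-only column looks like this:

let
    // Sample data standing in for a SharePoint date-time column
    Source = Table.FromRecords({[Modified = #datetime(2024, 2, 2, 14, 30, 0)]}),
    // Strip the time portion for a Dataverse "date only" column
    DateOnly = Table.TransformColumns(Source, {{"Modified", DateTime.Date, type date}})
in
    DateOnly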


Multiple lines of text

Multi-line text columns are generally good to go without any conversions. I would recommend checking the character lengths to make sure all data will fit within the Dataverse column's maximum length, and spot-checking fields for invalid characters using a service such as https://freetools.textmagic.com/unicode-detector

You can also clean the text using the Transform column > Text transformations > Clean function.
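Both checks can also be done in the query itself. This is a sketch, assuming a column named Notes and an assumed maximum length of 4000:

let
    // Sample row containing line breaks and a non-printable control character
    Source = Table.FromRecords({[Notes = "Line one#(cr)#(lf)Line two#(0000)"]}),
    // Text.Clean removes non-printable control characters
    Cleaned = Table.TransformColumns(Source, {{"Notes", Text.Clean, type text}}),
    // Flag rows that exceed the Dataverse column's maximum length
    Flagged = Table.AddColumn(Cleaned, "TooLong", each Text.Length([Notes]) > 4000, type logical)
in
    Flagged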


Person or Group

Person or group columns in SharePoint are nested records that contain all of the fields related to the person or group referenced in the record (name, email, Id, etc). We are only interested in the email.

When you need to reference a user record, it's best to use their email as we know it is a unique identifier within the organization.

I would recommend converting emails to lowercase just to remove any case-sensitivity mismatches.
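A sketch of both steps together, assuming a person column named Owner whose nested record includes an Email field (the sample record values are made up):

let
    // Sample row standing in for a SharePoint person column
    Source = Table.FromRecords({[Owner = [DisplayName = "Jane Doe", Email = "Jane.Doe@Contoso.com"]]}),
    // Keep only the email from the nested person record
    Expanded = Table.ExpandRecordColumn(Source, "Owner", {"Email"}, {"OwnerEmail"}),
    // Lowercase for reliable matching against the Dataverse Users table
    Lowercased = Table.TransformColumns(Expanded, {{"OwnerEmail", Text.Lower, type text}})
in
    Lowercased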



Number

Numbers are generally ready to go without transformations. It's a good idea to check the high and low values in your data to ensure there are no outliers or invalid numbers.
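A quick way to check the extremes is a small summary record. This is a sketch, assuming a numeric column named Qty:

let
    // Sample data standing in for an imported number column
    Source = Table.FromRecords({[Qty = 5], [Qty = 9000], [Qty = -2]}),
    // Inspect the extremes for outliers or invalid values before import
    Stats = [Min = List.Min(Source[Qty]), Max = List.Max(Source[Qty])]
in
    Stats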


Yes/No

Converting a Yes/No field from SharePoint to Dataverse takes just a couple of clicks. Navigate to the column, right-click it, and select Change type > True/False.


Hyperlink

Hyperlinks are nested records. You simply need to expand the Url column from the nested record.

By default, the expanded column will be named Url. You can rename it after expanding by changing the rightmost argument of the ExpandRecordColumn function.


Image/Thumbnail

There are a couple of options, outlined below.



Option One - Import the URL path to the image in SharePoint

SharePoint images are stored as a URL reference to the SharePoint image. To extract the URL from the value, we'll create a custom column called ImageURL that parses the JSON, extracts the base URL, and appends the path to the image. To create a custom column, navigate to the Add column tab in your Dataflow editor and click Custom column within the General section.


Enter a name and this code, changing the YourImageColumn reference to the name of your image column.


let
    // The image column stores a JSON string describing the file location
    jsonString = [YourImageColumn],
    parsedJson = Json.Document(jsonString),
    serverUrl = parsedJson[serverUrl],
    serverRelativeUrl = parsedJson[serverRelativeUrl]
in
    // Combine the site URL with the server-relative path to the image
    serverUrl & serverRelativeUrl

Your custom column configuration should look similar to this:

This will give you the path to your image. If you are planning on inserting the image itself into the Dataverse and not just a URL path, this step is not necessary.

Option Two - Import the image content into a Dataverse file column

To set a Dataverse file column with content from a SharePoint attachment, you will need to access the file content via a Power Automate flow and set the Dataverse file column with your image data. That flow is laid out below. At a high level, it is triggered when the Dataflow completes, gets all Products from SharePoint, and searches Dataverse using an OData query to filter the Product records. It then loops through them: for each record it reaches back to SharePoint to get the record's attachments, retrieves the content from each attachment, and uploads the file content to the Dataverse record.

The process for setting image content from an image column is similar; just use the image path from Option One to get the file content.


This is a complex flow, please comment below if you have questions!


Currency

Converting a currency field from SharePoint to Dataverse takes just a couple of clicks. Navigate to the column, right-click it, and select Change type > Currency.

After that, it's a good idea to check the high and low values just to make sure they fall within your Dataverse column configuration. This is also a good way to find potential invalid data or outliers you may not want to import.


Managed Metadata

Managed Metadata is a tricky one because it can hold so many different types and structures of data. Managed Metadata columns come through as a nested list, meaning records with multiple values will expand into multiple rows, causing duplicate data.

For this guide, we'll stick with our example. If you have questions about the Managed Metadata process, comment below!


In our example, we had managed metadata columns on our Products list that contained Safety and Environmental Certifications for each Product. Our goal was to migrate those to one table containing all certification types.


To do this, I'll first duplicate the Products list

I will then rename it Certifications

Pressing Ctrl + A in the Dataflow selects all columns. I then hold Ctrl and click the Id column as well as my two Certification columns to deselect them. Then I right-click and select Remove columns.

This leaves me with only the Id and Certification columns. I'll now expand the first Certification column.


After that, I'll expand the column once more and select the relevant columns.

This gives me a table of the values and their IDs as well as the ID of the associated Product record.

From here, apply the other methods outlined in this article to continue transforming the data how you need. In my case, I'll populate a table with the Certification values by removing the duplicates and loading the Certification table in Dataverse. Then in my next round of imports, I can establish the Lookups by referencing the Product ID with the Certification Product ID key.
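The de-duplication step can be sketched like this, assuming the expanded columns came through as Label and TermGuid (hypothetical names; use whatever your expansion produced):

let
    // Sample expanded rows - two Products sharing one certification
    Source = Table.FromRecords({
        [Id = 1, Label = "ISO 14001", TermGuid = "a1"],
        [Id = 2, Label = "ISO 14001", TermGuid = "a1"],
        [Id = 2, Label = "OSHA", TermGuid = "b2"]
    }),
    // Keep one row per certification for loading the Certification table
    Certifications = Table.Distinct(Table.SelectColumns(Source, {"Label", "TermGuid"}))
in
    Certifications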



Lookup

In a Dataflow, a lookup column will appear as a nested record of the main record. We will need the ID of the nested record to populate the key field in our main record to set the new Dataverse lookup value. To do this, we'll need to expand the column by clicking the Expand button on the column:

Then uncheck Select all and select the Id field (or whichever field you will be using to set the key column we created in the previous Relationships section of this series). Then click OK.

You can then rename the column by changing the last argument of the ExpandRecordColumn function.
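As a sketch of the expansion (assuming a lookup column named Warehouse whose nested record includes an Id field, with made-up sample values):

let
    // Sample row standing in for a record with a SharePoint lookup column
    Source = Table.FromRecords({[Product = "Widget", Warehouse = [Id = 42, Title = "East Coast"]]}),
    // Pull the Id out of the nested record; the last argument renames it to match the key column
    Expanded = Table.ExpandRecordColumn(Source, "Warehouse", {"Id"}, {"WarehouseId"})
in
    Expanded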


Missing User Records

If you are relating records to users, you will likely run into an issue where previous employees do not show up in your Dataverse Users table. To fix this, you will need to use Power Query to identify the users who do not appear in the Users table. You can do this by connecting to your Dataverse environment.


In the top right, click the gear icon and select Session details to get your environment's URL.


Paste the URL into the Environment domain field, remove the 'https://' and the trailing slash, then click Sign in.

Once you connect, select the systemuser table and click Create

You can now merge the systemuser table with any other table with a column of emails and then filter the table down to records with no match.
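That merge can be sketched with a left anti join, which keeps only the rows with no match. This assumes a query named Products with an OwnerEmail column (a hypothetical name) and the systemuser table's internalemailaddress column:

let
    // Sample data standing in for your imported table and the systemuser table
    Products = Table.FromRecords({[OwnerEmail = "jane.doe@contoso.com"], [OwnerEmail = "former.employee@contoso.com"]}),
    systemuser = Table.FromRecords({[internalemailaddress = "jane.doe@contoso.com"]}),
    // Left anti join keeps only emails with no match in the Users table
    Missing = Table.NestedJoin(Products, {"OwnerEmail"}, systemuser, {"internalemailaddress"}, "Match", JoinKind.LeftAnti)
in
    Missing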


This will give you a list of emails that need to be imported. To import them, create an Excel file with the headers FirstName, LastName, Email, and Invitation. The Invitation value for all users should be 'Invitation Accepted'. Then fill out the first and last name as well as the email for each missing user. Save this as a CSV.
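For example, with hypothetical names and addresses, the file might look like:

FirstName,LastName,Email,Invitation
Jane,Doe,jane.doe@contoso.com,Invitation Accepted
John,Smith,john.smith@contoso.com,Invitation Accepted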

Once you have your csv file, go to your environment's Advanced settings by clicking the gear in the top right.

Then navigate to Settings > System > Data Management

Then navigate to Imports and click Import data.


Select your csv and click Next.

Continue to click Next until you get to the Map Record Types page. Map the csv to the User table.

Map the fields like this and click Next. Keep all other options as default as you click Next until you can finally click Submit.

You can now monitor the import.

If there are errors, you can check them by clicking the import's name and navigating to the Failures tab.


Conclusion

We covered a LOT in this post. The ETL process to get data from SharePoint to Dataverse is not perfect and you may find that you need to convert a string to a Yes/No or add some logic to get the final data output you'd like. This guide will hopefully equip you with some ideas on how to tackle that based on all of the column types. We didn't even scratch the surface on the power of the Power Query editor that the Dataflow uses for imports, for more info about that, click here.


In my next post, we'll discuss the mapping process in the Dataflow and how to configure the final import settings! Until then, set your data types and do some spot-checking on the values once they're set for the Dataverse.

 
 
 
