A first look at the new Import Export Mailbox API in the Microsoft Graph Part 3 - Importing Items
In the first two parts of this series I covered the basics of getting going with the Import Export API endpoint in the Microsoft Graph and, in part 2, how to export items. In this post I'll be looking at the import side of the API.
Permissions
This endpoint supports both Delegate and Application permissions. If you are using Delegate permissions, these aren't shared permissions, so you can't use them to import (or export) from a mailbox other than the Primary/Archive of the associated account. This is different from EWS, where you could use Delegate permissions to access shared mailboxes as long as you had the underlying Exchange mailbox permissions. This means that if you're going to export or import from different mailboxes or shared mailboxes, you need to use Application permissions.
Creating an Import Session
Importing, unlike exporting, requires that you create a pre-authenticated URL (an import session) to which you upload the items you wish to import. Critically, this isn't a normal Graph endpoint but one in the https://outlook.office365.com namespace (the actual underlying infrastructure is unknown), eg https://outlook.office365.com/api/gbeta/Mailboxes('MBX:….')/importItem?authtoken=eyJh…
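As a rough illustration, creating the session is just a POST against the mailbox's Graph resource. A minimal sketch with the Graph SDK is below; the createImportSession action name is my assumption based on it following the same pattern as the exportItems action, so check the documentation for the exact name.

# Minimal sketch, assuming the session action sits alongside exportItems on the
# beta admin/exchange endpoint (the action name here is an assumption)
$MailboxId = "MBX:...."   # taken from the mailbox settings response
$Session = Invoke-MgGraphRequest -Method POST -Uri "https://graph.microsoft.com/beta/admin/exchange/mailboxes/$MailboxId/createImportSession"
# The response carries the pre-authenticated outlook.office365.com URL
$Session.importUrl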
As the authtoken is part of the query string, it does generate quite a large URI, and anything that is capturing the URI of the request is capturing your authtoken. If you want or need to debug the JWT token, you can decode it like any other JWT token using jwt.io, and it should look something like
If you have used the large attachment upload in the Microsoft Graph, these tokens will look pretty familiar, and the actual process of generating a session and getting a pre-authenticated URL to an (outlook.office.com) endpoint is boilerplate stuff shared between the two endpoints.
Token Expiration
You need to track token expiration yourself, as you can't use an auth library like MSAL that would handle token renewal; this means your code needs to handle it all with custom logic.
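For example, a quick way to know when the session expires is to read the exp claim out of the token yourself. A minimal sketch is below; the Get-ImportUrlExpiry helper name is mine, not part of the API or SDK.

# Minimal sketch: pull the JWT out of the importUrl's authtoken parameter and
# read its exp claim (the helper name is mine, not part of the API or SDK)
function Get-ImportUrlExpiry {
    param([string]$ImportUrl)
    if ($ImportUrl -notmatch 'authtoken=([^&]+)') { throw "No authtoken found in URL" }
    # The JWT payload is the second dot-separated, Base64Url-encoded segment
    $payload = $Matches[1].Split('.')[1].Replace('-', '+').Replace('_', '/')
    switch ($payload.Length % 4) { 2 { $payload += '==' } 3 { $payload += '=' } }
    $claims = [System.Text.Encoding]::UTF8.GetString([Convert]::FromBase64String($payload)) | ConvertFrom-Json
    # exp is seconds since the Unix epoch
    [DateTimeOffset]::FromUnixTimeSeconds($claims.exp).UtcDateTime
}
# Renew the session with another create-session call before this time passes
Get-ImportUrlExpiry -ImportUrl $Session.importUrl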
Data Formats
I talked a lot about the data formats in Part 2, but the format this endpoint will accept is the Fast Transfer stream generated by either the Graph exportItems request or an EWS ExportItems request (the latter may or may not work).
Import vs Export Considerations
Before I go into the weeds of how this all works, one thing you should understand, to get a more 360-degree view of the whole process, is that importing an item into Exchange asks the server (or the service/cloud) to do a lot more work than exporting one. When exporting an item you are just reading existing data; when importing an item you're asking the Exchange database to do a transaction and commit new data into the Information Store, which involves transaction logs and ultimately replication (though that happens in the background). Exports will always be faster than imports because of the impact imports have on server resources.
Import/Uploading Items
In EWS you had an UploadItems operation that allowed you to add one or more items in a single request. The Graph/Exchange endpoint only allows the submission of a single item at a time, which is not optimal where you have a large number of smaller items to import, like contacts, tasks or calendar items.
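To make that concrete, a single-item import is one POST of one Fast Transfer stream to the session URL. A rough sketch follows; the JSON shape is based on the update example later in this post, and the "create" mode value and file path are my assumptions.

# Rough sketch of importing one exported item; the "create" mode value and the
# file path are assumptions, the JSON shape follows the update example below
$Bytes = [System.IO.File]::ReadAllBytes("C:\temp\ImportExport\item1.bin")
$Body = @{
    FolderId = $TargetFolder.id                   # destination folder id in the target mailbox
    Mode     = "create"                           # the "update" variant is shown later in this post
    Data     = [Convert]::ToBase64String($Bytes)  # the Fast Transfer stream, base64 encoded
} | ConvertTo-Json
# The session URL is pre-authenticated, so a plain REST call works with no auth header
Invoke-RestMethod -Method POST -Uri $Session.importUrl -Body $Body -ContentType "application/json"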
Batching and Throttling
One of the biggest challenges when importing items into Exchange in any sort of volume is the throttling of requests. In "Import vs Export Considerations" above we talked about the cost imports place on server resources; throttling is the way for the service to limit the impact importing data has on a particular server (or piece of infrastructure).
You can't batch the import items request (like you can batch the export items request). While Graph has a batching endpoint, the URLs it accepts are relative to Graph endpoints, eg (for export) "/admin/exchange/mailboxes/{mailboxId}/exportItems", whereas the import URL is always "https://outlook.office365.com/api/gbeta/Mailboxes('MBX:….')/" with a token as part of the URL whose audience is outlook.office.com.
Looking at just this part, it is much less efficient than EWS, where you could have many items batched in one request, which reduces both latency (the time between each imported item) and the overhead of processing multiple requests on both the client and the server.
From an engineering perspective this isn't great, because you always want to optimize your code as much as you can, and when you're doing anything at volume it becomes a critical factor. The reality of actually using the service negates this a little: throttling caps throughput at such a low level that even if you could batch, it would just mean hitting the throttling limit quicker, and your overall throughput in terms of uploading data would be the same.
Throttling impacts
This is certainly one area that differs from EWS, and the limits and headers/responses are (kind of) in line with the Graph limits. Eg the following is a throttling response I received from the service.
From this you can tell that the Graph request limit is in place, so no more than 10,000 requests in 10 minutes. The actual throttling limit I exceeded was the IncomingBytes limit, which is documented as "150 megabytes (MB) upload (PATCH, POST, PUT) in a 5-minute period". This means my effective maximum rate (based on ChatGPT's calculations) works out as follows: to transmit 150 MB in 5 minutes you need a sustained rate of approximately 4.19 Mbps, so the theoretical maximum you can transfer over that link is approximately 1.76 GB of data in one hour.
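If you want to sanity-check those figures yourself, the arithmetic is just unit conversion (PowerShell's MB/GB suffixes are binary units, which is where 4.19 rather than 4.0 comes from):

# The arithmetic behind the 4.19 Mbps / 1.76 GB-per-hour figures
$bytesPer5Min = 150MB                    # PowerShell's MB suffix = 1024 * 1024 bytes
($bytesPer5Min * 8) / (5 * 60) / 1e6     # ≈ 4.19 megabits per second sustained
($bytesPer5Min * 12) / 1GB               # ≈ 1.76 GB transferred in one hour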
With my single-threaded, un-rate-limited PowerShell script I only managed to achieve around 1.2 GB/h, but if you optimize the process, getting closer to the theoretical maximum is doable. One problem is if (or more likely when) you come across a folder with thousands of really small items (eg say someone has saved every DMARC report they have ever received). In EWS, with the ability to batch, this was slow going; with a single thread and single-item uploads this type of import is going to be painfully slow. There is nothing stopping you running multiple import threads up to the 4-concurrent limit, but if that just means hitting the throttling limits quicker and harder, it can end up being less efficient than optimizing a single thread.
One thing to keep in mind is that the limits apply per mailbox, so if you are doing any sort of at-scale migration, processing multiple mailboxes at once (like you could with EWS) is probably going to be the path you would take.
Throttling headers and trying to avoid the limit
For overall throughput you're always better off not hitting a throttling limit than hitting it, processing the backoff, and starting again. This is especially true if you have multiple threads: say you have two concurrent import threads, you breach the IncomingBytes limit, and you receive a backoff on one thread. If you don't have good inter-thread communication and your second thread submits a new request during the backoff period, you may then get a more punitive backoff period. The backoff period is also usually longer than the budget recharge. Think of it like a road with traffic lights: if you stay under the throttling limits you get greens all the way and reach your destination at the best arrival time. If you start hitting red lights, you have to wait a random amount of time before going again, so the overall journey takes longer than if you had got all greens, even if your speed at certain points along the path was faster.
There are two headers that can currently help you avoid hitting the request throttling limit on the number of requests you can make in a 10-minute period.
For normal email messages it's unlikely you would hit that limit before the IncomingBytes limit, but for smaller items like contacts, tasks etc it could be a real possibility. So this is something you should always process in your responses, pausing when needed. It would be really useful if Microsoft also included the remaining IncomingBytes budget in the response, as that would help avoid hitting that limit and let you optimize the import process as much as one could.
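As a sketch of what that processing might look like (the exact header names aren't reproduced here, so the Rate-Limit-* names below are assumptions to adjust against what the service actually returns; -SkipHttpErrorCheck needs PowerShell 7):

# Sketch only: the rate-limit header names are assumptions, check a real response;
# reuses the $Session and $Body variables from the earlier sketches
$r = Invoke-WebRequest -Method POST -Uri $Session.importUrl -Body $Body -ContentType "application/json" -SkipHttpErrorCheck
if ($r.StatusCode -eq 429) {
    # Throttled: honour the Retry-After backoff before resubmitting the same item
    Start-Sleep -Seconds ([int]@($r.Headers["Retry-After"])[0])
}
elseif ($r.Headers.ContainsKey("Rate-Limit-Remaining") -and [int]@($r.Headers["Rate-Limit-Remaining"])[0] -lt 10) {
    # Close to the request budget: pause until it recharges rather than hit a red light
    Start-Sleep -Seconds ([int]@($r.Headers["Rate-Limit-Reset"])[0])
}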
Import Challenges - Ex/X500 addressing
When you import an item with full fidelity into a new Exchange organization (eg a different tenant or on-premises Exchange org), some of the address properties of the item you are importing will now be invalid. This holds true whether you import from a MSG file, a PST, or EWS, so it's not a new problem, just one you should always be aware of. It happens because Exchange uses X500 (EX) addresses to route messages internally (SMTP was an add-on in the early versions of Exchange). If you were to look at the recipients collection of a message for any local Exchange recipients, you would see something similar to
In Exchange 5.5 (and earlier) this would have been the directory object (pre-Active Directory) that the message was delivered to. In the modern world the directory is now Entra, and the EX address is just used to resolve to the correct Entra user object and the mailbox (or other object) associated with that EX address. So, to maintain reply-ability of older messages, you need to add the old EX/X500 address as a proxy address; the process is explained in https://learn.microsoft.com/en-us/microsoft-365/enterprise/cross-tenant-mailbox-migration?view=o365-worldwide. This just means these old addresses will resolve in the new/target Exchange tenant.
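For example, with the Exchange Online management shell you could stamp the source org's legacy EX address onto the migrated mailbox; the identity and address values below are made up for illustration.

# Illustrative only: add the source org's legacy EX address to the target mailbox
# as an x500 proxy so old reply chains still resolve (values below are made up)
Set-Mailbox -Identity jcool@contoso.com -EmailAddresses @{
    add = "x500:/o=SourceOrg/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Recipients/cn=1ab2cd3-gscales"
}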
Multiple passes and Deleting, Moving and resyncing objects
Generally with mailbox migrations or folder synchronization, because mailboxes are so big and the rate you can move the data at is so low, it is unlikely that you can just do one copy of the mailbox and be done. And if you're doing long-term synchronization between two different endpoints, you're going to need the ability to do a second-pass copy of the data: copying anything that is new, modifying anything that has changed (or deleting and recopying it), and deleting from the target mailbox or archive anything that has been deleted or moved in the source.
Updating an item that has changed is covered by the ImportItem endpoint using the update mode, eg
{
  "FolderId": "EDSVrdi3lRAAE…",
  "Mode": "update",
  "Data": "AQA..",
  "ChangeKey": ""
}
Deleting (or moving) Items
The single largest problem with this endpoint is the lack of the ability to delete items, and for a number of applications this is a showstopper. Deleting is a very important part of an import or synchronization process, and you can't really have such a process without the ability to delete or move things. It's like designing a gearbox that can only shift up: you can drive the car away with it, but at some point you need to shift down. The same goes for a migration or synchronization: at some point you will have to delete an item that was copied or moved. This endpoint is the only mechanism you can use to access particular items in a mailbox and the only way you can access the archive store, so the lack of this ability is going to be a real blocker for a lot of applications, and it makes no sense to leave it out. Even just the ability to move an item would unblock most use cases, because in Exchange a delete is effectively just a move to the retained items folder anyway. In primary mailboxes you can mitigate this a little by using the regular Graph endpoints to access the item and delete it, but you can't access every item this way. Another workaround might be to update the item type to a type you can then access using the regular endpoints, but that won't work for the archive store. It is just really disappointing that Microsoft left out deletes and moves.
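For reference, the partial workaround for reachable primary-mailbox items is just the standard Graph delete; the user and message id below are placeholders, and this does not reach the archive store.

# Partial workaround sketch: delete a reachable item via the standard Graph
# endpoint (placeholder user/id values); does not work for the archive store
Invoke-MgGraphRequest -Method DELETE -Uri "https://graph.microsoft.com/v1.0/users/jcool@contoso.com/messages/$MessageId"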
Sample doing an Export and Import between two mailboxes
As with the first post, I've been updating my test PowerShell script that uses the PowerShell Graph SDK. This script is available on GitHub at https://github.com/gscales/Powershell-Scripts/blob/master/Graph101/GraphSDK/Import-ExportMod.ps1
This is a run-through of how you can export the last 100 items in the Archive folder of one mailbox and import them into a targeted folder in another mailbox (both mailboxes are in the same tenant in this example).
First, connect using Connect-MgGraph. Because this will copy data between two mailboxes, you need to use Application permissions.
# Define the Application (Client) ID and Secret
$ApplicationClientId = "eeedd474-f96e-4603-bf8d-…."
$ApplicationClientSecret = "-lE8Q~.."
$TenantId = "1c3a18bf-da31-4f6c-a404-.."
# Convert the Client Secret to a Secure String
$SecureClientSecret = ConvertTo-SecureString -String $ApplicationClientSecret -AsPlainText -Force
# Create a PSCredential Object Using the Client ID and Secure Client Secret
$ClientSecretCredential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $ApplicationClientId, $SecureClientSecret
# Connect to Microsoft Graph Using the Tenant ID and Client Secret Credential
Connect-MgGraph -TenantId $TenantId -ClientSecretCredential $ClientSecretCredential
Get the mailbox settings and the source and target folders for both the source and target mailboxes
$SourceMailbox = Invoke-GetMailboxSettings -Upn gscales@….com
$TargetMailbox = Invoke-GetMailboxSettings -Upn jcool@….com
$SourceFolder = Invoke-GetMailboxFolder -MailboxId $SourceMailbox.primaryMailboxId -FolderId Archive
$TargetFolder = Invoke-GetMailboxFolderFromPath -FolderPath \Inbox\importtest1 -MailboxId $TargetMailbox.primaryMailboxId
Enumerate the items you're going to export from the source mailbox
$ExportItems = Invoke-ListMailboxFolderItems -MailboxId $SourceMailbox.primaryMailboxId -MailFolderId $SourceFolder.Id -ItemCount 100
Export those items to a local folder on disk
Invoke-BatchExportItems -Items $ExportItems -MailboxId $SourceMailbox.primaryMailboxId -ExportPath C:\temp\ImportExport\ -Verbose
Create an Import session for the Target Mailbox
$TargetImportSession = Invoke-CreateImportSession -MailboxId $Targetmailbox.primaryMailboxId
Import those items into the Target Mailbox
Invoke-ImportItemFromDirectory -ImportURL $TargetImportSession.importUrl -FolderName C:\temp\ImportExport -FolderId $TargetFolder.id -Verbose
Conclusion
Without the ability to delete (or even move items to the retained/deleted items folder), this import endpoint is a fail, because it essentially can't be used to replace EWS migration or synchronization apps. The throttling limits are pretty low but kind of in line with EWS; the IncomingBytes limit is very easy to hit, but you can engineer around it (you can't engineer around the lack of delete; as an aside, you can delete a folder, which deletes all the items in that folder, but you can't delete a single item...). Also, some good documentation around the import side, and especially around throttling, would greatly help anybody trying to use it (this is really a few hours' work, and I don't understand why you wouldn't do it if you're trying to encourage people to migrate to a new endpoint).
For Gmail and most other platforms, IMAP migration is still going to be the best solution; the FTS approach is really only for native Exchange-to-Exchange migrations: https://learn.microsoft.com/en-us/exchange/mailbox-migration/migrating-imap-mailboxes/migrating-imap-mailboxes