How to sync large amounts of local documents with Office 365
With the Layer2 Cloud Connector, you can make use of your documents and files stored on a local file server or NAS in the Microsoft Cloud, such as Office 365, OneDrive for Business, Office Groups, or Microsoft Teams. But what should you do when the number of files you want to sync reaches 50,000 or more?
In general, the Layer2 Cloud Connector has no hard limit on the number of files it can keep in sync between local file servers and the Microsoft Cloud. However, depending on your requirements, synchronizing all files in one big chunk into a single SharePoint library can be a bad idea. The initial migration of all these files into the cloud can take quite a long time. Even once it has succeeded, other issues can arise from having too much content in a single library. Furthermore, keeping several large libraries in sync can tie up a lot of system resources. To overcome these issues, here are our best practices.
Fig.: Layer2 Cloud Connector Connection Manager to set up local data and file sync with Office 365.
Best practices to overcome Office 365 large file sync issues
Break your document sets up into smaller logical units
Depending on your company, try to break up the content into units that fit into
your organizational structure. For example, the data belonging to HR would
migrate into a corresponding HR site. Even a single department can be split into
smaller units, if necessary, for example to host specific project groups.
Instead of trying to use different folders in a single library, use different
libraries for each unit or even libraries on different sites. SharePoint, by
design, is meant to hold different units of content in different libraries, so
by splitting the content this way you are helping SharePoint perform better, as
well as the Layer2 Cloud Connector. Maintaining one very big library in
SharePoint can be a pain to both administrators and users, not to mention the
performance issues that can be caused by accessing large libraries during peak
usage hours.
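Before deciding how to split the content, it can help to see how many files and how much data sit under each top-level folder of the file share. The following is a minimal planning sketch, not part of the Layer2 Cloud Connector. It assumes that each top-level folder is a candidate for its own library, and the function name `summarize_top_level` is hypothetical.

```python
import os
from collections import defaultdict


def summarize_top_level(root):
    """Return {top_level_folder: (file_count, total_bytes)} for the tree under root.

    Files that sit directly in root are grouped under the key ".".
    """
    totals = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        # First path component relative to root identifies the candidate unit.
        top = os.path.relpath(dirpath, root).split(os.sep)[0]
        for name in filenames:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # skip files that disappear or cannot be read
            totals[top][0] += 1
            totals[top][1] += size
    return {folder: (count, size) for folder, (count, size) in totals.items()}
```

Calling `summarize_top_level` on the share root and sorting the result by file count gives a quick view of which units are large enough to deserve their own library or site.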
Manage access rights when migrating documents
Additionally, using different libraries for different organizational units makes it as easy as possible to apply appropriate access rights to documents. Note that you cannot sync NTFS access rights to SharePoint, because permissions work very differently on the two sides. It is best practice to assign access rights to SharePoint libraries only, not to individual files. Documents inherit access rights from the SharePoint library on creation (by online users or by the Layer2 Cloud Connector sync).
Be sure to create a migration plan before you start with file sync
It is better to split the content into different libraries and use search or managed metadata to find documents in one result list. Once you’ve planned how you want to split out the content in SharePoint, you can create a Layer2 Cloud Connector connection for each library, either manually or automatically with PowerShell scripting. The connector can help you filter the local data with the SQL-like filtering options of the File System Provider to make sure the right content goes to the right place. You can also include or exclude specific folders and files by name, type, date, or size.
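To check in advance which files a given include/exclude rule would pick up, you can preview the selection on the file share. The sketch below is only an illustration in plain Python. It does not use the connector's SQL-like filter syntax, and the function `select_files` and its parameters are hypothetical names chosen for this example.

```python
import fnmatch
import os
from datetime import datetime


def select_files(root, include=("*",), exclude=(), max_bytes=None, modified_after=None):
    """Return sorted paths under root that pass name, size, and date rules."""
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Name/type rules: must match an include pattern and no exclude pattern.
            if not any(fnmatch.fnmatch(name, pattern) for pattern in include):
                continue
            if any(fnmatch.fnmatch(name, pattern) for pattern in exclude):
                continue
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            # Size rule: drop files larger than max_bytes, if a limit is given.
            if max_bytes is not None and info.st_size > max_bytes:
                continue
            # Date rule: drop files last modified before the cutoff, if given.
            if modified_after is not None and datetime.fromtimestamp(info.st_mtime) < modified_after:
                continue
            selected.append(path)
    return sorted(selected)
```

For example, `select_files(root, include=("*.pdf", "*.docx"), exclude=("~*",), max_bytes=4 * 1024**2)` would list PDF and DOCX files up to 4 MB while skipping names that start with a tilde.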
The Layer2 Cloud Connector can handle approximately 100 connections configured on one machine. Because this is a soft cap, you can run more connections given more system resources, but if you need more than that we recommend scaling out your Layer2 Cloud Connector environment by installing it on additional machines (note that each installation needs to be licensed separately).
The SharePoint List View Threshold
The List View Threshold of SharePoint is a long-known problem for users working with large lists. SharePoint can store millions of records or files in one library, but it has problems displaying them in views. The Layer2 Cloud Connector can overcome this limit during synchronization, but your users will still be affected by the threshold when they work with the library. Do not use filtered or sorted views in the connection settings. Instead, point the URL parameter of the Layer2 Cloud Connector connection string at a flat view, or apply a specific view using the VIEW parameter. If you use filtered views in SharePoint, make sure to index the filtered fields and do not exceed the limit of 5,000 items per view.
See here for guidance from Microsoft about how to deal with large lists and libraries:
https://support.office.com/en-us/article/Manage-large-lists-and-libraries-in-SharePoint-B8588DAE-9387-48C2-9248-C24122F07C59
See here for more details about the SharePoint limits and boundaries:
https://technet.microsoft.com/en-us/library/cc262787(v=office.16).aspx#Boundaries
See here for the List View Threshold explained in more technical detail:
https://technet.microsoft.com/en-us/library/cc262813.aspx#Throttling
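To spot likely threshold problems before migrating, you can look for local folders that already hold more than 5,000 items. This is a rough sketch under two assumptions that are not stated in the text above: that each local folder ends up as a folder in the target library, and that a default folder view shows only the items directly inside that folder. The function name `folders_over_threshold` is hypothetical.

```python
import os

LIST_VIEW_THRESHOLD = 5000  # items-per-view limit mentioned above


def folders_over_threshold(root, limit=LIST_VIEW_THRESHOLD):
    """Return (folder, direct_item_count) pairs exceeding limit, largest first."""
    over = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Count only direct children: subfolders and files in this folder.
        direct_items = len(dirnames) + len(filenames)
        if direct_items > limit:
            over.append((dirpath, direct_items))
    return sorted(over, key=lambda entry: entry[1], reverse=True)
```

Any folder this reports is a candidate for further splitting, or for an indexed and filtered view on the SharePoint side.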
More best-practice advice for Office 365 document migration, backup, and sync
For the initial content migration, we recommend running each connection separately as a manually triggered sync. Once all of them have completed successfully, you can schedule them to run automatically. It is best practice to schedule the connections so that they run one at a time, in series, with the first one finishing before the second one starts. You can do this by staggering the start times in the “First Synchronization” setting of the connections. Also make sure the interval is appropriate: it should be at least the average time the sync takes to run when it has content to update.
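The staggering described above can be worked out from the run times you observed during the manual syncs. The following sketch only computes candidate start times. It does not write anything into the connector configuration, and the function name `stagger_start_times` and the safety buffer are assumptions made for this example.

```python
from datetime import datetime, timedelta


def stagger_start_times(first_start, durations_min, buffer_min=10):
    """Return [(connection_name, start_time)] so that runs do not overlap.

    durations_min is a list of (name, expected_minutes) in the desired run order.
    Each connection starts after the previous one's expected duration plus a buffer.
    """
    schedule = []
    start = first_start
    for name, minutes in durations_min:
        schedule.append((name, start))
        start = start + timedelta(minutes=minutes + buffer_min)
    return schedule


# Example with hypothetical connection names and observed run times in minutes.
plan = stagger_start_times(
    datetime(2024, 1, 1, 20, 0),
    [("HR", 32), ("Finance", 67), ("Projects", 11)],
)
for name, start in plan:
    print(f"{name}: first synchronization at {start:%H:%M}")
```

With these inputs the connections would start at 20:00, 20:42, and 21:59. The buffer is a design choice: it leaves some slack for runs that take longer than average.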
If you have content that you need to sync more
often, then see if it is possible to break that out into another connection that
can run more frequently, and then the other connections can be run during the
evening or other non-peak times. This will allow you to wisely use the resources
of the machine to get the best performance and timely updates for the more
important content.
Make use of SharePoint’s robust search capabilities. SharePoint provides a great search engine with nice web parts to manage and find your content. Having a strong content management plan that takes advantage of SharePoint’s search features will greatly help the usability and findability of your synced content. Define your faceted search features or switch to a Managed Metadata search with the Term Store feature of SharePoint. If you are on an on-premises system, the Layer2 Auto Tagger can greatly help you with applying Managed Metadata tags to your content. In the case of SharePoint Online, you can apply Managed Metadata to documents using the "Dynamic Columns" feature of the Layer2 Cloud Connector directly in C#.
Office 365 Document Synchronization Performance Test Results
To give an estimate of the effort involved in Office 365 document migration, backup, and sync using the Layer2 Cloud Connector, here is an example:
Hardware and Software Specifications:
The test server used to sync files with SharePoint Online was a Microsoft Azure A3 Standard machine.
- Quad-core 2.10 GHz processor
- 7 GB RAM
- 8x500 max IOPS
- Load Balancing
- Windows Server 2012 R2
- Layer2 Cloud Connector Professional Edition 64-bit (Version 7.6.2)
Connection Settings:
- Data Entity 1 (Source) – Folder on drive C, no subfolders. The folder includes files with sizes between 4 KB and 4 MB. File formats are PNG, TXT, PDF, DOCX.
- Data Entity 2 (Target) – Office 365 SharePoint library inside an E3 plan. Empty standard document library with no additional settings.
- One-way synchronization, e.g. for migration (Source -> Target)
- Data Provider: Layer2 Data Provider for Office365 Fast File Sync / Layer2 Data Provider for File System
- Auto Mapping enabled
- Ignore changes within target: TRUE (because it is a one-way sync)
Amount of Files and Data Volume:
- 1,000 files in one folder (490 MB), 0 in SP list
- 5,000 files in one folder (2.38 GB), 0 in SP list
- 10,000 files in one folder (4.78 GB), 0 in SP list
Performance Test Procedure:
Each connection was run separately with manual start.
Performance Test Results for initial migration, and for sync after a few files are changed:
These are the results for each test, capturing the run time of the initial sync (migration from file share to Office 365) in minutes [min], the RAM used during the initial sync in megabytes [MB], and the run time of an update sync (after some data changes) in minutes [min]. The values are averages over several runs per test. Additional notes about the results follow below the table.
Amount of Files | 1,000 Files | 5,000 Files | 10,000 Files |
---|---|---|---|
Data Volume in MB | 490 | 2,380 | 4,780 |
Duration of Initial Sync in Min | 11 | 32 | 67 |
Number of Files per Minute* | 90.9 | 156.3 | 149.3 |
Sync Duration per File* in sec | 0.7 | 0.4 | 0.4 |
Data Volume MB per Minute* | 44.5 | 74.4 | 71.3 |
Duration of a Library Update in Min (no data changes) | < 1 | < 1 | < 1 |
Duration of a Library Update in Min (100 - 200 files changed) | < 1 | 4 | 4 |
Details about the RAM consumption are available in the Cloud Connector User Documentation.
* Rates are all related to the initial sync.
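As a rough planning aid, the measured rates can be extrapolated to a larger file set. The sketch below assumes that throughput stays roughly linear beyond the 10,000 files actually tested, which the results above do not confirm, and it takes the rates from the 10,000-file column as defaults. The function name `estimate_initial_sync_minutes` is hypothetical.

```python
def estimate_initial_sync_minutes(file_count, volume_mb,
                                  files_per_min=149.3, mb_per_min=71.3):
    """Estimate initial sync time in minutes from the measured rates.

    The slower of the two bounds (per-file rate or per-MB rate) determines
    the estimate. Defaults come from the 10,000-file test run above.
    """
    by_files = file_count / files_per_min
    by_volume = volume_mb / mb_per_min
    return max(by_files, by_volume)


# Hypothetical example: 50,000 files with the same average file size as the test set.
minutes = estimate_initial_sync_minutes(50_000, 23_900)
print(f"Estimated initial sync: {minutes:.0f} min ({minutes / 60:.1f} h)")
```

For this hypothetical data set the estimate comes to about 335 minutes, or roughly five and a half hours. Treat it as an order-of-magnitude figure only, since network bandwidth, file size distribution, and service throttling can all change the real result.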
Performance Test Results for initial backup, and for sync after a few files are changed:
The backup test includes an initial sync from a document library in SharePoint Online (SPO) to an empty local folder. For this test we extended the test environment with one client machine (Surface 3 Pro, Windows 10 Anniversary Update, 8GB RAM).
In this case we used the Layer2 Data Provider for SharePoint (CSOM), not the Layer2 Data Provider for Office365 Fast File Sync.
Amount of Files | 1,000 Files | 1,000 Files | 1,000 Files |
---|---|---|---|
Backup Target | Azure | Local Disk | Net Share |
Data Volume in MB | 490 | 490 | 490 |
Duration of Initial Sync in Min | 8 | 12 | 16 |
Used RAM * | 300-500 MB | 300-500 MB | 300-500 MB |
CPU Average Utilization | 17% | 12% | 12% |
Number of Files per Minute** | 128 | 81 | 60 |
Sync Duration per File** in sec | 0.5 | 0.7 | 1 |
Data Volume MB per Minute** | 63 | 40 | 30 |
Duration of a Library Update in Min (no data changes) | < 1 | < 1 | < 1 |
Duration of a Library Update in Min (10 files changed) | < 1 | < 1 | < 1 |
* RAM usage fluctuated during run, depending on load. It was not increasing during run.
** Rates are all related to the initial sync.
Office 365 Document Synchronization - Next Steps
Learn more about the features and benefits of Office 365 document synchronization via Layer2 Cloud Connector. Please register for download and evaluation of the Layer2 Cloud Connector here.