Why a Document Migration is Complex

Document migration is often underestimated, both in complexity and in criticality, and we warn against underestimating it in several places here on the blog. So what are the complexities of migration, and what are the paths through them? In this post we dive into them.

Copy – Paste

When you haven’t tried it before, moving documents from one system to another doesn’t sound that difficult. Files need to be copied: copy – paste.

There is much more to it than that, and it is often underestimated. It is a shame when a new system that is otherwise good is poorly received in the organization because the documents from the old system are difficult or impossible to find, or when the new system is delayed because the migration is difficult. Worse still, of course, is when the integrity of the documents is compromised in the process.

Metadata

The complexity stems from the fact that documents consist of two parts: one or more content files and metadata. Document migration is about moving the document in its entirety – files and metadata – and keeping it coherent. Read more about this in MIGRATE THE ENTIRE DOCUMENT.

Next, systems differ in their need for and use of metadata. Often, metadata therefore needs to be transformed on the way from the old to the new system, so that the document fits into the new system and the way you want to use it.

So now we face a dilemma: we need to preserve both files and metadata in the migration, and yet we have just said that metadata needs to be transformed. We have to make the transformations so that the documents can work in the new system, but we also have to make sure that the transformations are consistent and documented, and that this documentation is stored for future reference.
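One way to keep transformations consistent and documented is to drive them from a single table of rules and to write an audit record for every document. The following is a minimal sketch under our own assumptions: the field names, the status codes and the `transform_metadata` function are all hypothetical and do not come from any particular system.

```python
from datetime import datetime, timezone

# Hypothetical mapping rules: old field name -> (new field name, value converter).
# Keeping every rule in one table is what makes the transformation consistent.
FIELD_RULES = {
    "doc_title": ("title", str.strip),
    "status": ("lifecycle_state", lambda v: {"APPR": "approved", "DRFT": "draft"}[v]),
    "author": ("owner", str.lower),
}


def transform_metadata(doc_id, old_meta):
    """Return (new_meta, audit_record) for one document."""
    new_meta = {}
    changes = []
    for old_field, value in old_meta.items():
        if old_field not in FIELD_RULES:
            # Fields without a rule are recorded in the audit trail, not silently dropped.
            changes.append({"field": old_field, "action": "unmapped", "old": value})
            continue
        new_field, convert = FIELD_RULES[old_field]
        new_value = convert(value)
        new_meta[new_field] = new_value
        changes.append({
            "field": old_field,
            "action": "mapped",
            "to": new_field,
            "old": value,
            "new": new_value,
        })
    audit = {
        "doc_id": doc_id,
        "migrated_at": datetime.now(timezone.utc).isoformat(),
        "changes": changes,
    }
    return new_meta, audit
```

The audit records are the documentation: stored alongside the migrated documents, they let you answer later what a given value looked like in the old system and which rule changed it.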

Versions, Renditions and Other Relationships

Documents often come from document management systems or ESDH systems with versioning, where the same document may exist in many versions, and perhaps also in many renditions (the same content in different formats, e.g. a PDF rendition of a Word document).

There are often relationships between documents other than version or rendition relationships. A document is based on a template, a file is an attachment / appendix to another, a number of documents belong to the same case, etc. These types of relationships may also be important to preserve.

The documents must be loaded into the new system in version order for the version tree to be created correctly. The original rendition must be loaded before the secondary rendition if the relationship between the two renditions is to be established. A document cannot be added to a case before the case exists, etc.

There is a logic to the relationships in the old system which must be re-established in the new system, and that usually places strict demands on the order in which things are loaded.
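The ordering requirement can be treated as a dependency graph: every item lists what must already exist before it can be loaded, and a topological sort gives a valid load order. Here is a small sketch using Python's standard `graphlib` module (Python 3.9 or later). The item names and the `load_order` helper are invented for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical items: each key maps to the set of items that must exist before it.
depends_on = {
    "case-42": set(),                  # the case itself has no prerequisites
    "report-v1": {"case-42"},          # a document needs its case
    "report-v2": {"report-v1"},        # a version needs the previous version
    "report-v2-pdf": {"report-v2"},    # a rendition needs its original
    "appendix-a": {"report-v1"},       # an attachment needs its parent document
}


def load_order(depends_on):
    """Return the items in an order where every prerequisite comes first."""
    # static_order() raises graphlib.CycleError if the relationships are circular.
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for item in load_order(depends_on):
        print(item)
```

Several orders can be valid at once (here `appendix-a` may come before or after `report-v2`), so the point is the guarantee, not one specific sequence. A circular relationship in the old system shows up as an error, which is worth knowing before the real load starts.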

Migration of Documents Takes Time

There is a physical limit to how quickly a volume of data can be moved from one place to another. You can adjust the physical circumstances – extra servers, more bandwidth, etc. – but under any given circumstances there is a limit. It is smart to find the bottleneck early on and measure the speed you can realistically expect.
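A measured sample is enough for a first estimate of the total duration. The sketch below is simple arithmetic, and the numbers in it are invented for illustration, not measurements from a real project. It also assumes that throughput stays roughly constant across the whole volume, which is worth verifying.

```python
def estimate_hours(total_docs, sample_docs, sample_seconds):
    """Extrapolate total migration time from a timed test run."""
    docs_per_second = sample_docs / sample_seconds
    return total_docs / docs_per_second / 3600


if __name__ == "__main__":
    # Invented example: a test run moved 1,000 documents in 250 seconds,
    # and 2,000,000 documents are in scope.
    hours = estimate_hours(total_docs=2_000_000, sample_docs=1_000, sample_seconds=250)
    print(f"{hours:.1f} hours, or about {hours / 24:.1f} days of continuous running")
```

With these example numbers the estimate is close to 139 hours, almost six days of uninterrupted running, which is far longer than a typical weekend closing window.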

One way out is often delta migration, which means migrating in stages. An example might be to migrate approved documents under calm circumstances in the run-up to go-live, while the closing window just before go-live is then about getting the last ones in – those approved since the last migration. The “delta” in delta migration thus refers to the increment: you do a first migration, and then migrate only the documents that have qualified since. The example makes sense if a criterion for migration is that a document is approved. Of course, there could be all sorts of other criteria and realities that would cause the “delta” to be defined quite differently.
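For the approved-documents example, selecting the delta comes down to comparing each document's approval time with the time of the previous run. This is a minimal sketch; the `approved_at` field and the `select_delta` function are our own hypothetical names, and a real system would define the criterion to suit its own data.

```python
from datetime import datetime


def select_delta(documents, last_run):
    """Return the documents approved after last_run.

    last_run=None means the first, full pass: every approved document qualifies.
    Documents that are not yet approved (approved_at is None) are always skipped.
    """
    return [
        doc for doc in documents
        if doc["approved_at"] is not None
        and (last_run is None or doc["approved_at"] > last_run)
    ]


if __name__ == "__main__":
    documents = [
        {"id": "A", "approved_at": datetime(2024, 1, 10)},
        {"id": "B", "approved_at": datetime(2024, 3, 5)},
        {"id": "C", "approved_at": None},  # still in process
    ]
    first_pass = select_delta(documents, last_run=None)
    closing_window = select_delta(documents, last_run=datetime(2024, 2, 1))
    print([d["id"] for d in first_pass])       # A and B
    print([d["id"] for d in closing_window])   # B only
```

Recording the cut-off time of every run is what makes the stages add up: each document is picked up by exactly one run, provided approvals are never backdated.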

We have also seen scenarios where, after the new system is launched, the business is allowed to complete and approve “in process” documents in the old system; we then (delta) migrate them over and finally close the old system.

This type of strategy – or one that is much more complex – is often necessary because of the sheer time a migration takes, and it complicates the administration of the migration enormously.

Conclusion

Document migration has some aspects which are particularly complex. This is because documents are not just files, but files with metadata – including relationships to other documents. In addition, it often involves large volumes and therefore takes significant time. When the migration takes longer than the closure time the business can live with, things are further complicated by having to migrate in stages.

Migration isn’t just complex – it can also be highly problematic if the process isn’t controlled enough to ensure that you have documents with integrity afterwards.

Our experience and best advice for a controlled migration, where the documents are in good shape afterwards, is gathered in our Guide to Document Migration.