Joey on SQL Server

Making Managed Instance Better with NoRecovery

Microsoft has two primary platform as a service (PaaS) offerings for SQL Server running Azure: Azure SQL Database, which is primarily used for new applications that were developed and grown in the cloud and Managed Instance. The latter is designed for on-premises applications --it's closer to full-code compatibility and is deployed into a private network architecture. Beyond code compatibility, there are some other differences between the services, with migration being a key pain point that I want to highlight.

Each service also takes a different approach to data migration. Azure SQL DB only allows data to be imported using a file format called bacpac, which contains object definitions and insert statements. While importing a bacpac is compatible with nearly all versions of SQL Server, it is not an efficient way to ingest data. If you are doing a switchover with no data loss, you have to account for the time it takes to generate the bacpac, and then the time it takes to ingest that data into your target. This can take many hours for larger databases, as all those insert operations need to be written through SQL Server's normal transaction log mechanism.

Azure SQL Database also supports using transactional replication, which can greatly reduce migration-related downtime, but also has a number of limitations, including that all tables must have a primary key. Microsoft does not allow you to restore a backup to Azure SQL Database-- which is a more efficient process than the other two mechanisms as restore simply writes the files to disk, which bypasses the transaction log.

When Managed Instance was introduced, DBAs everywhere were thrilled mainly because features like SQL Server Agent jobs and cross-database queries were supported in a PaaS offering. You could also simply restore a database to a managed instance from an existing SQL Server to quickly get up and running. In the early days of private preview for the service, Managed Instance supported the with norecovery syntax of the backup command, which allowed DBAs to used the time-tested transaction log shipping process to migrate databases there. If you aren't a DBA, or you've never done log shipping, let me explain the process to you:

  1. Restore a full database backup to your target location, in norecovery mode, which allows additional differential and transaction log backups to be restored to that database.
  2. Depending on the timeframe, restore either differential or transaction log backups to keep the database in sync on the primary.
  3. At time of cutover, take a final log backup from the source and apply it to the target server and recover the target database.

DBAs like this process, because it's robust, simple and can easily be executed with T-SQL commands that can be automated. It also incurs minimal downtime -- a final log backup is an extremely small file, and the restore time should be less than a minute. At some point Microsoft removed the ability to run a restore command with the norecovery option. This left us two options for Managed Instance migration: utilizing a single backup/restore process (which meant extended downtime, especially for larger databases that might take hours to backup and restore) or using the Database Migration Service (which had a long list of requirements, and was rather complex to get up and running).

Last year, Microsoft introduced both the Log Replay Service, which we are discussing here, and the Link feature which allowed for the use of Distributed Availability Group feature to be used for zero downtime migrations. Conceptually the Link feature is ideal, but currently the feature is limited to using one database at time, which limits its usefulness.

What Is the Log Replay Service?
The log replay service executes the same concepts that our log shipping process does. You take a backup into Azure Blob Storage, and as you take differential and log backups, they are applied to your target database in a Managed Instance. There are a couple of different ways to execute the cutover, including passing in the name of the last log backup and the autocomplete flag. The process is managed via PowerShell or the Azure CLI, and three of the four times I've tried to use it I've had to open support tickets. One of these was related to a documentation bug, while another was related to the error handling from Azure storage, where the support engineer did the ticket of equivalent of ¯\_(ツ)_/¯.

There's another problematic requirement with managed instances. Docs say you should complete the migration process within 36 hours of kickoff because the LRS process cannot survive system updates (presumably patches that might require a instance to be failed over). This is a big gap between log shipping and LRS -- many organizations use log shipping as a disaster recovery mechanism, and it stays in place for years at a time. One of the main benefits of log shipping as an approach is that it is extremely fault tolerant of system and network outages. The 36-hour requirement on LRS is extremely restrictive on top of a slightly brittle process. Most of the issues I've had with LRS have been related to not passing back errors from Azure Storage correctly to PowerShell. I do feel like that experience could be improved.

PaaS services are all about making things easier for customers. After all, you are paying the cloud provider extra money to take care of things like patching and high availability. The cloud provider should do their best effort to make the migration process as frictionless as possible. Introduce complex migration processes only if you can support well established practices. I get it. A traditional log shipping approach is challenging because the platform attempts to backup all databases as soon as they are on the instance. While this represents a technical hurdle, it should be easy enough to ignore databases that don't have a state of online.

So, my ask for Microsoft is to simply enable "WITH NORECOVERY" for Managed Instance and allow DBAs to use their own log shipping for migrations.

About the Author

Joseph D'Antoni is an Architect and SQL Server MVP with over a decade of experience working in both Fortune 500 and smaller firms. He is currently Principal Consultant for Denny Cherry and Associates Consulting. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. Joey is the co-president of the Philadelphia SQL Server Users Group . He is a frequent speaker at PASS Summit, TechEd, Code Camps, and SQLSaturday events.

Featured

comments powered by Disqus

Subscribe on YouTube