In-Depth
Open but Not Equal: Microsoft SQL Server 2016 vNext for Linux
The Microsoft SQL Server database has come full circle now that the company is set to release a Linux-based version. In keeping with its push to embrace Linux, Microsoft announced its plans to port SQL Server to Linux last year, though the move required a major migration effort, given that it’s built to run optimally on Windows.
While introducing SQL Server for Linux makes the company’s database a cross-platform relational database management system (RDBMS), it’s also an important milestone. Offering it on Linux will remove the one obstacle among those shops considering a SQL database server seeking an alternative to Oracle and IBM, but unwilling or unable to choose a Windows-based RDBMS. Since the release of the public preview last fall, I’ve spent some time exploring the innards of the new database.
The first stop is the Microsoft Web site for the public preview of SQL Server to determine what the best way is to get up and running. Microsoft offers several options: Provision an Azure Linux virtual image with SQL Server installed, use an existing Linux system and install SQL Server yourself, or pull a Docker image from Microsoft. If you plan to do the installation onto an existing or fresh Linux server, the initial preview only supports three Linux distros: Red Hat Enterprise Linux (RHEL), Ubuntu and, with the latest Community Technical Preview (CTP) released in late January (1.2), SUSE Linux. In addition to releasing CTP 1.2, Microsoft announced its SQL Early Adoption Program (SQL EAP) for customers looking to validate new features of SQL Server 2016 v.Next for both Windows and Linux.
For this review, the first SQL Server 2016 v.Next CTP was installed onto a free, community-supported version based on RHEL known as CentOS. Relying on the CentOS download page under the Base Distribution section, Microsoft identified CentOS release 7 (1511) as RHEL 7.2. With the fresh Linux VM up and running, the v.Next version of SQL Server for Linux is ready to be downloaded and installed.
Download and Install
Both the download and installation steps for SQL Server on Linux are well-documented by Microsoft, and for this CentOS server, the installation went smoothly. Following the prescribed steps for RHEL, the install kit for SQL Server downloaded and then installed with no issues in only a matter of minutes. Looking over the user posts on the SQL Server instructions page, there were several queries about blocked firewall and port access. In general, access problems were resolved once testers opened the firewall for the listening port of SQL Server. The installation documentation details how to open the firewall port on RHEL, so I suggest running the firewall command. In my case, I used the default port of 1433.
After installing the software, I deployed the sqlcmd tools on the Linux server. Like the server installation, the steps to install the sqlcmd tools are prescribed on the SQL Server site and worked with no issues. Many shops work to keep admin tools to a minimum on production servers, so installing sqlcmd on the server isn’t required.
Unlike a generic GUI-based SQL Server install for Windows, the download and install commands for the new Linux version of the database are terminal-based. Nevertheless, the download, installation and configuration setup felt like a typical Linux package installation. I used curl for the download, yum for the install, and systemctl to verify the SQL Server service was up and running. During the review process, Microsoft released the CTP 1.1 update, allowing me to test out the latest update. The upgrade process was equally straightforward using the update command.
Configuration
As for configuration changes, the Linux preview version does offer a few important SQL Server configuration options. Using the installed mssql-config package you can change the port, collation, default directories for data, log and backups, the ability to enable trace flags, and the default dump directory. Using the mssql-config tool, I could easily change the port for SQL Server. I had equal and expedient success changing the default data directory following the prescribed terminal scripts provided by Microsoft.
Access and Tools
With SQL Server installed and updated, it was time to start looking around. and to do this I needed tools to both access and manage the database server. Fortunately, there are options that include using the terminal or the command line, as well as numerous GUI tools.
If all you want is a command-line tool you can use sqlcmd remotely from your local workstation, or directly on the server in a terminal window. With no problems, I connected and used sqlcmd from both a remote workstation and on the server. My preference is for a more visual tool, and here you have several options (see Microsoft’s connection requirements).
Having worked with SQL Server for more than 20 years, my preferred visual tool is SQL Server Management Studio (SSMS). If you have access to a Windows workstation, SSMS is perhaps the best GUI tool to use with SQL Server. Once my server was up and running, I easily connected to SQL Server with a production version of SSMS 2016 (in my case 13.0.15700.28). Still, for a proper review, I downloaded the v.Next version of SSMS (14.0.16000.64 ) targeted to fully support the Linux version of SQL Server. You might be wondering if a Linux version of SSMS is planned. It’s doubtful. I don’t have inside information, but one of the postings on the SQL Server site suggested that given the many Windows dependencies SSMS has, it would be unlikely that SSMS would be brought to Linux. I do know that a cross-platform database admin tool is in the works.
If you’re not running a Windows workstation, or have another preferred GUI tool for database management, all the tools I tried that relied on either the Microsoft JDBC driver or the jTDS driver successfully connected to the Linux SQL Server. On both Linux and Windows workstations, I made connections to SQL Server using DBVisualizer, DBeaver, SQuirrel SQL and Oracle SQL Developer. While it’s not a complete list, it gives a good indication that whatever tool you’re already using will likely continue to work with SQL Server on Linux.
Supported Options Features
As a product, SQL Server v.Next has multiple components. So with this early preview, officially listed as Microsoft SQL Server v.Next (CTP1.1) what do you get? For this early release, functionality is limited to the core database engine. It is known that Integration Services (SSIS) is part of the roadmap. In mid-January, I attended an online meeting that provided a demonstration for several use cases utilizing SSIS on Linux focused on installation, package execution, logging and troubleshooting. In a later meeting, SSIS for Linux wasn’t planned for release until after the initial general availability (GA) release.
The only clear indicator I’ve seen is a focus on server-side, non-GUI components. That could lead to the speculation that key server-side components for security, high availability, database and the supporting server services may eventually be supported. Microsoft announced in late January that Active Directory authentication and Always-On Availability groups are in the works. SQL Agent will also be part of the package. While user demand will impact the decision, a big unknown is just how difficult it will be to retrofit a Windows-focused product to run on Linux.
Many of you might remember that the original SQL Server codebase was a co-development project with Sybase Inc., which provided its version for the Unix and VMS platforms. In the early days of SQL Server, it could run not just on Intel x386 processors, but other hardware including PowerPC and the Digital Alpha 64-bit chip, via the hardware abstraction layer (HAL). It raises the question: Could Microsoft be using some form of abstraction layer?
Anthony Nocentino, an enterprise architect at Centino Systems, recently hypothesized how a combination of process virtualization via a pico-process in combination with a platform abstraction layer can support SQL Server on Linux. If correct, it could imply that many of the other, as-yet-unsupported server-side functions and features might eventually move into the supported category. And that is good news. Microsoft subsequently explained how the company approached and solved the problem, confirming that it did indeed utilize a platform abstraction layer (PAL).
The post suggests this was a significant undertaking, and one that is not yet complete. The initial CTP utilizes a multi-layered abstraction process, building on the SQL Operating System (SOS) introduced with the SQL Server 2005 release. The SQL Server team described its objective of expanding SOS into a single PAL. Because SOS wasn’t intended to be a PAL, the team made adjustments. Pulling in parts from the Microsoft Research Project Drawbridge, the team built out an initial patchwork PAL for the preview.
According to the post, in time the team intends to combine the disparate components from SOS and the Library OS of Drawbridge into a fully unified SQL Platform Abstraction Layer (SQL PAL), as shown in Figure 1. As such, I didn’t conduct any targeted performance testing. What I can say is that even now, the performance seems to be on par with a similar Windows-based system. I can only hope that the performance will only get better (see "What’s Missing in SQL Server v.Next?").
Exploring SQL Server for Linux
With the SQL Server 2016 v.Next CTP up and running, let’s take a look around. I’ve connected with my favorite admin tool and as I start to explore, it all looks just as I would expect from a Windows installation, be it system databases, security or the server objects. Pulling up server properties tells me that SQL Server is well aware it’s running on the Linux platform CentOS (see Figure 2).
Basic Tasks
My initial tests were basic, including creating a new database, creating and altering tables, or managing indexes. They all worked just as the previous versions of SQL Server. To easily access data, I set up a linked server back to a Windows-based SQL Server. Using the linked server, I ran inserts into the new tables using data from the linked server. Back on the Windows server, I set up a complementary linked server back to the Linux install, again with no problems.
Columnstore tables were equally straightforward. On one of the fact tables I altered it, dropping the clustered rowstore primary key and rebuilding it as a clustered columnstore index, along with a non-clustered primary key constraint. I added and tested functions and ran multiple queries in parallel against both the Linux installation and a similar Windows installation -- all with identical outcomes. I successfully enabled the query store functions and enhanced an existing table with the new temporal table features. It all just worked.
Database Create and Restore
The only point where I stumbled was with the CREATE DATABASE and RESTORE DATABASE commands. The respective FILENAME, FROM DISK and MOVE TO options require a declared file name. And here I had two separate issues to work through: permissions and file-naming convention.
For permissions, SQL Server on Linux relies on the user mssql, so it was just a matter of adjusting the folder permissions. If I used the default folder, /var/opt/mssql/data/, no permission adjustments were necessary for the database files. But if I wanted a different folder for the data or log files, I needed to review and adjust the permissions. And with a RESTORE, permissions for the source backup file must grant the mssql user read access to the file.
With permissions configured, SQL Server must locate the file on Linux. It’s straightforward, except that Windows and Linux use different file folder naming conventions. Windows uses a backslash, while Linux uses the forward slash. So, while the Linux target folder was /var/opt/mssql/data/, the T-SQL FILENAME format became N'C:\var\opt\mssql\data\Test.mdf'.
For this preview release, I found that the T-SQL commands worked best when the Windows format was used. Internally, the provided Windows file format is converted into the Linux format. Reading through the SQL Server online documentation, the SQL Server team acknowledged that this was an issue, and that it will be improved in future updates.
With permissions fully configured and the file folder format worked out, I successfully created new databases, as well as restored several databases that were backed up on a Windows SQL Server 2014 system.
Overall Assessment
With Microsoft’s utilization of a SQL Platform Abstraction Layer, the core database engine implemented on Linux seems solid. The installation, update and tests went without issues. And with the initial next steps to support the server side of SSIS, SQL Server Agent, Active Directory and Always-On Groups, it looks good. And we can all hope that other server-side processes are in the works. After I evaluated CTP 1.1, Microsoft in late January released CTP 1.2. Based on the release notes, Microsoft primarily added the SUSE Linux support and fixed bugs in this newest preview. It’s a great start to Linux support for SQL Server.
About the Author
Andrew Peterson is principal at Realized Design LLC, a business intelligence and SQL Server database consultancy. Peterson has been designing and building SQL Server-driven database systems for the past 15 years for Fortune 100 and Internet startup companies.