Joey on SQL Server

Why SQL Server Is Still Worth It

Despite its steep licensing costs, SQL Server continues to prove its worth over open-source alternatives in some key areas.

SQL Server is an expensive part of your IT stack -- SQL Server Enterprise Edition costs roughly $7,000 per CPU core (plus the likely cost of 23 percent for Software Assurance). And while there are cheaper ways to consume it, those cheaper ways may struggle to compete with "free as in beer" database solutions like PostgreSQL and MySQL, especially to a CIO/CTO who may only be looking at a budget line item and not the core functionality of the application.

What makes SQL Server worth it compared to those free, open-source options? I'm currently working on a development project using PostgreSQL -- it's a good database, but it has its issues -- and I wanted to try to highlight the places where SQL Server (and all the related AzureSQL services) deliver value.

Note: I'm mainly focusing on these relational database RDBMS solutions, not specialty databases like time-series or key-value stores, which you should consider if you have specific needs like telemetry or rapidly changing schemas.

High Availability and Backups
High availability is a solved problem if you run PostgreSQL or MySQL in a cloud service like Azure Database or Amazon RDS. While clustering SQL Server or building an availability group aren't trivial operations, they are mature, well-documented processes that operate with Windows Server Failover Clustering. For the most part, all of this just works. On the other hand, clustering in a Linux environment is a much more challenging endeavor.

At best, you are using two or three different packages to manage high availability, and at its worst, you are dealing with five disparate packages and a runbook that's a mile long to do a failover. This problem (and Microsoft's lack of investment in high availability in Linux) also really limited the growth of SQL Server on Linux. Some good third-party packages make this far less of a problem. Still, high availability is a significant advantage to SQL Server out of the box.

Backups are another place where commercial databases have long held major advantages over their open-source brethren. The PostgreSQL documentation lists three methods of backups:

  • SQL dump.
  • File system level backup (requires database shutdown).
  • Continuous archiving.

The first two options are not good -- SQL dumps require all operations to pass through the transaction log. File system backups are like the cold backups in SQL Server 6.5 in 1998 when I was still an undergrad. Continuous archiving is closest to the basic SQL Server backup/restore process. Still, it's complex to implement and configure. I've yet to find the Ola Hallengren of PostgreSQL. Once again, backup and restore are problems largely solved by cloud providers.

Another concern would be disaster recovery. While I wouldn't say this is the strongest suit for open-source databases, it's not awful. Setting up replication is a typical pattern for both PostgreSQL and MySQL (it's how you do high availability), and because failover for a DR database is nearly always manual, you don't have the same problems as dealing with a clustering solution in Linux.

Everything Built In, as Opposed to Everything Added On
Earlier this week, I worked on an automated database build process for my data warehouse running in Postgres. I have a function that uses a geography data type requiring the PostGIS extension. I had installed PostGIS on my server, but it also needs to be installed in each database that requires it. As part of my database creation process (which currently drops and recreates the DB), I installed that extension on the database before creating any objects. However, I still got an error about the geography data type not existing. I did not figure out the root cause of that error, but running the create function script in a subsequent database connection has resolved the issue.

While SQL Server has configurations, and not everything is enabled by default (for example, replication, PolyBase and R/Python all require additional installations from the SQL Server install media), the vast majority of the T-SQL programming surface area is available across all of the various flavors. A few quick examples for comparison are temporal tables and graph databases -- both are native solutions out of the box in SQL Server and all flavors of Azure SQL. In PostgreSQL, you must install extensions to make both features work. While some extensions, like the very popular PostGIS, are clearly labeled, other less popular features can be more confusing. For example, there are three or four different extensions that support temporal tables. It's unclear which one is the best. The graph extension (AGE) is an Apache open-source project; however, it is not currently available on AWS RDS but on Azure Database for Postgres.

Performance Is a More Complicated Topic
Both PostgreSQL and SQL Server support some of the busiest transaction processing workloads you can imagine, and it's tough to compare database performance between disparate workloads. An interesting design difference is that SQL Server caches execution plans, while MySQL and PostgreSQL do not. This is a decision that can reduce CPU for repeated queries, since plans do not have to be compiled at every execution, but can offer suboptimal plans for all plans. This isn't good or bad either way, just a design choice.  

However, the place where SQL Server truly shines here is the vast array of tools that are built into the database engine to help you better understand performance. Native features like the query store and robust query execution statistics allow DBAs and developers to analyze performance issues quickly. If you want to dive deeply into your performance, SQL Server's extended events engine and dynamic management views let you capture all manner of details surrounding performance.

There is a package called pg_stat that you can add to PostgreSQL for easier tracking, but once again, we are in the realm of having to add functionality. Another quick plug for Azure Database for PostgreSQL here -- Microsoft has added the query store functionality into Postgres, but only in the cloud.

Comparing database engines is always a challenging concept. I didn't even want to get into the topic of benchmarks, but as I mentioned in the performance topic, both PostgreSQL and SQL Server run big workloads that power some of the largest OLTP systems in the world. The challenge is how you manage those systems. Many companies running those large-scale open-source RDBMS installations contribute to the open-source databases' code base. A good example of this is comparing SQL Server Management Studio (SSMS) to PGAdmin. There is almost no comparison, for any complaints you have about SSMS, it's just light years better than PGAdmin. There is no question that PostgreSQL can support high-end workloads; it just makes life more challenging for the DBAs and developers, especially if they are used to a platform with more built-in quality-of-life tools like SQL Server.

My final bit of commentary as working as a developer/DBA/architect on Postgres for the last several months is that if you are coming from SQL Server or Oracle, there are many nice things that you will appreciate more after working in Postgres. One example is SQLPackage, a much-derided database deployment tool that is still much better than anything I've found freely available for Postgres. I also want to give a shout out to the SQL Server community. While open-source projects are known for their broad communities (which PostgreSQL and MySQL/MariaDB have), for a commercial product and just in general, the SQL Server community is second to none. Whether it be in-person at conferences, Slack, BlueSky or LinkedIn, knowledgeable folks are always happy to lend a hand. To close, all of the available database engines are mature and have much overlapping functionality; some, like SQL Server, are more equal than others.

About the Author

Joseph D'Antoni is an Architect and SQL Server MVP with over two decades of experience working in both Fortune 500 and smaller firms. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. He is a Microsoft Data Platform MVP and VMware vExpert. He is a frequent speaker at PASS Summit, Ignite, Code Camps, and SQL Saturday events around the world.

Featured

comments powered by Disqus

Subscribe on YouTube