In-Depth
Microsoft's Role in the New Data Culture
As enterprise IT pros face a business imperative to support the management and analysis of Big Data, Microsoft has answered the call with the latest evolution of SQL Server and support for NoSQL and Hadoop. The changes promise to alter the role of the traditional DBA.
Satya Nadella brought it all into perspective: It's about people, really. Take all the software, all the data and all the tools for what they're worth, but in the end, business success depends on people.
"Now, talking about technology is one thing, but if you really as an organization want to change anything, it's about culture," the Microsoft CEO said in officially announcing the release of SQL Server 2014 in April. "And when it comes to data and to be able to truly benefit from this platform, you need to have a data culture inside of your organization."
Nadella went on to explain how every part of Microsoft is being transformed by the new data culture enabled by the advent of Big Data and cloud-first computing, but the message resonates for everyone. "Business is being fundamentally transformed because of data," Nadella said. "And that doesn't happen because of technology, it happens because you have to build deeply into the fabric of the company a culture that thrives on data."
And you're the ones tasked with building that fabric -- the DBAs, the sys admins, the engineers and the IT decision makers. You have to grow the data culture. But first, you have to adapt to the new world, and Microsoft wants to help you do that with cutting-edge data tools such as its flagship relational database management system (RDBMS) and ancillary new-age technologies.
Many of those tools help you harness Big Data, of course, and Microsoft had to undergo a culture shift itself to accept the upstart technology. The staid, proprietary software giant long dependent on SQL Server and the $5 billion or so it generates annually had to first embrace Big Data in-house and then spread the word to customers.
Big Data Wave Soars
There aren't many doubters left. A few years after modern methods of managing and analyzing Big Data surfaced, the trend is still hitting IT like a hurricane. Recent research by Signals and Systems Telecom indicates Big Data vendors will generate nearly $30 billion in revenue this year alone, with a nearly 17 percent compound annual growth rate (CAGR) leading to a $76 billion market by the end of 2020. "Nearly every large-scale IT vendor maintains a Big Data portfolio," the research report states. On the customer adoption side, the 2014 IDG Enterprise Big Data research report from earlier this year found that nearly half of all respondents were already implementing Big Data projects or planning to do so.
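The growth figures in that research can be sanity-checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the rounded numbers quoted above and assumes six years of compounding (2014 through the end of 2020), which the report summary doesn't spell out. The function names are made up for this example.

```python
def project(start_value, annual_rate, years):
    """Compound start_value forward at annual_rate for the given number of years."""
    return start_value * (1 + annual_rate) ** years


def implied_cagr(start_value, end_value, years):
    """Annual growth rate that turns start_value into end_value over the given years."""
    return (end_value / start_value) ** (1 / years) - 1


# Rounded figures from the research cited above, in billions of dollars.
# Assumption: six years of growth, from 2014 through the end of 2020.
START, END, YEARS = 30.0, 76.0, 6

if __name__ == "__main__":
    print(f"Projected 2020 market: ${project(START, 0.17, YEARS):.1f} billion")
    print(f"Implied CAGR: {implied_cagr(START, END, YEARS):.1%}")
```

Under those assumptions the projection lands at roughly $77 billion and the implied rate comes out a little under 17 percent, which is consistent with the "nearly" qualifiers in the report's numbers.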
Microsoft has responded well to this shift. The company is one of those aforementioned large-scale IT vendors with a Big Data portfolio, and SQL Server 2014 -- though still a traditional RDBMS -- is a big part of that product suite, described by the company as "the foundation of Microsoft's comprehensive data platform."
Bradley Ball, a senior consultant at Pragmatic Works, which provides implementation services on the Microsoft BI platform, says that's a fair assessment. "I look at SQL Server essentially like the hub in the middle of the spokes," says Ball. Some of those data spokes include:
- The Microsoft Analytics Platform System (APS), described as Microsoft's solution for delivering "Big Data in a box." It's an evolution of the SQL Server Parallel Data Warehouse (PDW) appliance that builds upon the high performance and scale capabilities of the Massively Parallel Processing (MPP) version of SQL Server.
- Azure HDInsight, the new cloud-based service providing the foundation of the Big Data ecosystem: Apache Hadoop. In August, Microsoft announced HDInsight supported Apache HBase, a columnar NoSQL database.
- Hortonworks Data Platform for Windows, a partnership with Big Data vendor Hortonworks Inc. that provides its Hadoop distribution for implementation in Windows Server environments.
- Azure SQL Database, a relational Database as a Service (DBaaS) in the cloud that elastically provides scale-out capability across thousands of databases to support high-throughput application patterns.
- Azure DocumentDB, a fully managed transactional NoSQL document DBaaS offering announced as a preview in August.
- Azure Machine Learning, another preview that provides cloud-based predictive analytics and seamlessly connects to HDInsight for Big Data solutions.
- PolyBase, rather than a standalone product, is technology integrated with APS and other solutions that lets staffers use T-SQL to query their Big Data regardless of where it's stored -- in an on-premises Hadoop/HDFS cluster, Azure storage or PDW.
Also in the Microsoft data portfolio are ancillary products such as PowerPivot for Excel, Power BI for Office 365, Power Query and a host of others.
How SQL Server 2014 Fits In
"They're extending the platform they already have with SQL Server to integrate with Hadoop and cloud and in-memory," says Noel Yuhanna, a principal analyst at Forrester Research Inc. "So they're definitely making sure that SQL Server isn't running in isolation, in silos, but also integrates with the new next generation of data management technologies."
Industry expert Andrew Brust, a SQL Server MVP and research director at Gigaom Research, says this transformation of one of its bedrock products is "the thing that most impresses me about what Microsoft's doing."
Brust, who's coauthored several books on SQL Server over the years, has watched Microsoft develop and roll out the Big Data-friendly Azure HDInsight without marginalizing its flagship RDBMS platform. "It's very hard to have a legacy-entrenched product like SQL Server and then pivot toward something new. I think they're doing pretty well at it," says Brust, who will be presenting a session called "Big Data 101 with Azure HDInsight" at next month's SQL Server Live! conference in Orlando. (Like Redmond magazine, SQL Server Live! is produced by 1105 Media Inc.)
Alex Barnes, a DBA on the front lines who's been investigating SQL Server 2014, also applauds Microsoft's approach. "That's one thing I like a lot about Microsoft and the way they do their databases," says Barnes, an infrastructure architect and DBA at BoomTown, a Charleston, S.C.-based company that provides a platform for real estate professionals. "They create these new tools, and instead of releasing them as a separate data store, which they could pretty easily do, they just tack them onto SQL Server. You have this huge range of tools and options available within this one database system, which is multiple engines working in the background together."
That huge range of tools and options positions Microsoft as an established, familiar resource for organizations delving into the new age of data culture, especially for smaller shops without the resources to develop sophisticated solutions on their own.
"One of the interesting things that Microsoft brings to the table is a set of flexible reference architectures that allows customers to implement in four, five, six, 10 different ways using different pieces and parts of the product stack to meet their particular set of characteristics and circumstances," says Steve Palmer, senior vice president, Data & Analytics, at Avanade Inc., which provides Microsoft-focused consulting and other services.
Selling SQL Server 2014
For its part, SQL Server 2014 is still early in the adoption phase, having been made generally available just six months ago. To boost its adoption rate, Microsoft has been working with dozens of customers on various implementations and sharing the resulting success stories, with Big Data playing an important role in some of them. For example, Beth Israel Deaconess Medical Center in Boston is using the Big Data HDInsight service to improve access to decades of historical information.
The hospital is also using the in-memory capabilities of SQL Server 2014 to drastically reduce database query times. In fact, perhaps the most prominent innovation in SQL Server 2014 stems from Project Hekaton, which brought brand-new in-memory capabilities to the RDBMS. While some in-memory columnstore capabilities are baked into SQL Server 2012 -- to enhance analytics, for example -- SQL Server 2014 introduced in-memory Online Transaction Processing (OLTP).
"What this does is, it expands the capability of SQL Server from analytics in-memory, which is what a columnstore does, to a transactions and analytics in-memory," says Donald Feinberg, vice president and distinguished analyst, information management, at research firm Gartner Inc. "That means that this is the first of the traditional vendors to compete with HANA from SAP. They can now do transactions and analytics in-memory on the same database."
Feinberg says not everyone wants to do this today. It's most attractive to organizations that find the business value of processing database workloads faster or more real-time in-memory outweighs the increased cost of acquisition.
"The difference is that Microsoft had the vision to see that that's where the industry is headed," Feinberg says. Over the past 15 years, Microsoft SQL Server has been the No. 3 RDBMS in terms of installed base (much higher in Windows environments) behind IBM's DB2 and Oracle, the market leader. In terms of revenues, Feinberg says Microsoft jumped to No. 2 last year. In a strange pairing of bedfellows, Microsoft and Oracle last year formed a deep partnership under which Oracle databases can now run on the Microsoft Azure public cloud. Now in-memory databases like SAP's HANA, Teradata Intelligent Memory and Hewlett-Packard Co.'s Vertica, along with NoSQL and Hadoop technologies like MongoDB, Cloudera, Hortonworks, DataStax, 10gen, MarkLogic, MapR, Couchbase, Basho and Neo Technology, are among the upstarts altering the landscape; they have become competitors and, in some cases, partners to the incumbent RDBMS suppliers. While Oracle and IBM are also responding to these changes with such technologies as the Oracle Big Data Appliance and IBM Watson, Microsoft appears to be more closely aligned with the market shifts of Big Data and the move toward in-memory processing.
"Neither IBM nor Oracle even discuss that yet. So this truly is the first time that Microsoft and SQL Server have jumped ahead of both DB2 and Oracle in a feature function for the future," Feinberg says. "They've been playing catch-up since they got into the database business."
Also new is a wizard to help users host a SQL Server database instance in Azure Virtual Machines (VMs), which Microsoft says provides the benefits of an Infrastructure as a Service (IaaS) product in datacenters for use within the company's public cloud environment.
Other integration with the Microsoft cloud comes in the form of the new SQL Server Data Files in Azure, which provides native support for storing SQL Server database files as Azure blobs. This lets users create a database in SQL Server -- running on-premises or in a VM in Azure -- with a dedicated storage location in Azure Blob Storage.
New AlwaysOn feature enhancements are also getting some attention. "With SQL Server 2012 they added automated high availability with the feature called AlwaysOn," Feinberg says. "With 2014, that gets better."
Answering the NoSQL Challenge
Barnes, the BoomTown DBA who's been "playing around" with SQL Server 2014 quite a bit since it was first released in preview, is impressed by features that help the RDBMS remain relevant in the age of upstart NoSQL databases associated with Big Data.
"Over the last five years, or eight or nine years, depending on who you ask, this kind of NoSQL wave has crashed in and it's been the big deal that everyone's talking about," Barnes says. NoSQL databases generally aren't ACID-compliant, he says, using the computer science shorthand (atomicity, consistency, isolation, durability) for databases that process transactions reliably and, by some standards, enforce referential integrity. "Basically, they kind of play fast and loose with some of the ideas of what a database is supposed to do in order to get a lot of speed out of it. And Microsoft has done a great job of adapting to that" with features such as the in-memory technologies, compiled stored procedures and more.
A lot of IT setups use a relational database as a persistent tier, Barnes says, and have a "fancy NoSQL database" sitting in front of that, almost acting like a read cache to enhance quick lookups. This results in "polyglot persistence" or "polyglot systems" where data lives in multiple data stores for different reasons. "SQL Server has almost eliminated the need for that in 2014 by sticking this in-memory engine in there," Barnes says. "It does a lot of what those NoSQL databases do."
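The read-cache arrangement Barnes describes can be sketched in a few lines. This is a toy illustration, not a description of BoomTown's systems: Python's built-in sqlite3 module stands in for the relational persistent tier, a plain dictionary stands in for the NoSQL store sitting in front of it, and the `ListingStore` class, table and column names are all hypothetical.

```python
import sqlite3


class ListingStore:
    """Toy polyglot setup: a relational tier plus a key/value read cache."""

    def __init__(self):
        # Stand-in for the relational persistent tier (assumption: SQLite in memory).
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE listings (id INTEGER PRIMARY KEY, address TEXT)"
        )
        # Stand-in for the NoSQL store acting as a read cache.
        self.cache = {}
        # Counts how often a lookup had to go to the relational tier.
        self.db_reads = 0

    def save(self, listing_id, address):
        """Write to the relational tier, then drop any stale cached copy."""
        self.db.execute(
            "INSERT OR REPLACE INTO listings (id, address) VALUES (?, ?)",
            (listing_id, address),
        )
        self.db.commit()
        self.cache.pop(listing_id, None)

    def get(self, listing_id):
        """Serve from the cache when possible; otherwise read and cache the row."""
        if listing_id in self.cache:
            return self.cache[listing_id]
        self.db_reads += 1
        row = self.db.execute(
            "SELECT address FROM listings WHERE id = ?", (listing_id,)
        ).fetchone()
        if row is None:
            return None
        self.cache[listing_id] = row[0]
        return row[0]
```

Repeated calls to `get` for the same listing touch the relational tier only once, which is the appeal of the pattern. The cost is that the same data now lives in two stores that have to be kept in step, which is the overhead Barnes suggests an in-memory engine inside the database can reduce.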
Barnes is also impressed by "delayed durability," a new ability to reduce latency by returning control to a client before a transaction's log record is written to disk. Barnes says he learned more about the feature at a technology conference, where he spoke with Senior Program Manager Jos de Bruin and others on the Microsoft SQL Server team.
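The trade-off behind that feature can be shown with a toy model. The sketch below is not how SQL Server implements its transaction log; it is a simplified simulation in which every name (`ToyLog`, `flush_every`, `run`) is hypothetical and the batch size of three is an arbitrary assumption. It shows only the general idea: acknowledging a commit before its log record reaches disk saves disk writes, but exposes recent commits to loss in a crash.

```python
class ToyLog:
    """Toy transaction log: `buffer` is in memory, `disk` survives a crash."""

    def __init__(self, delayed_durability=False, flush_every=3):
        self.delayed_durability = delayed_durability
        self.flush_every = flush_every  # arbitrary batch size for this sketch
        self.buffer = []   # log records not yet on "disk"
        self.disk = []     # log records that would survive a crash
        self.flushes = 0   # number of (expensive) disk writes performed

    def flush(self):
        """Move all buffered records to disk in a single write."""
        if self.buffer:
            self.disk.extend(self.buffer)
            self.buffer.clear()
            self.flushes += 1

    def commit(self, record):
        """Log a commit and acknowledge it to the client."""
        self.buffer.append(record)
        if not self.delayed_durability:
            self.flush()  # full durability: wait for disk before acknowledging
        elif len(self.buffer) >= self.flush_every:
            self.flush()  # delayed durability: write in batches, after the fact
        return "ack"

    def crash(self):
        """Simulate a crash: anything still in the buffer is lost."""
        lost = list(self.buffer)
        self.buffer.clear()
        return lost


def run(delayed):
    """Commit five transactions, then crash; report flushes, survivors, losses."""
    log = ToyLog(delayed_durability=delayed)
    for i in range(1, 6):
        log.commit(f"txn-{i}")
    lost = log.crash()
    return log.flushes, log.disk, lost
```

In this model, `run(False)` performs five disk writes and loses nothing in the crash, while `run(True)` performs a single disk write and loses the last two acknowledged transactions. That is the bargain the feature offers: fewer waits on disk in exchange for a small window of possible data loss.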
Barnes says durability is what's at stake when a system acknowledges a write to a database. SQL Server, he says, waits until the write hits the log file before acknowledging it, which is called "write-ahead logging." NoSQL databases tend not to do that, he says, in order to increase speed. "They will just say, 'Yep, I've got it. Go ahead.' And that kind of makes your data less secure or less durable, while gaining the speed boost.
"SQL Server in 2014 has actually implemented an option to turn on delayed durability, which is basically, you could call it 'NoSQL mode,'" Barnes continues. "It's a way to say, 'I don't need write-ahead logging. I don't necessarily care if my system crashes and I lose a little bit of data. I'd prefer to have the speed up front.' They've really kind of added these options in there to compete with these guys who are playing fast and loose with these rules. You can basically, now -- by jamming a bunch of data into memory, creating these hash indexes, or range indexes, and turning on this delayed durability -- you can basically turn SQL Server into what people think of traditionally as a NoSQL database."
And the new NoSQL functionality doesn't stop there. "If you consider that Azure Storage tables is really a NoSQL database of the key/value variety," Brust says, "if you look at HBase, which is a NoSQL database of the column family variety and if you look at the new DocumentDB, which is a NoSQL database of the document store variety, well, Microsoft actually has three different NoSQL databases going on right now, which is pretty impressive."
With its embrace of Big Data and other new-world technologies, cutting-edge features, established track record, early customer success stories and "been there, done that" reference architectures and experience, Microsoft's job is now to evangelize its vision of an enterprise data culture and get customers to migrate to SQL Server 2014 to help them realize that vision.
Considering the SQL Server 2014 Leap
Most industry experts and pundits agree it's too early in the release cycle to judge how well that's going. "Typically after a year into production, or general release, that's the time when enterprises typically start to put plans together to upgrade to new releases," says Forrester analyst Yuhanna. He estimates about one-third of enterprises might be planning to migrate to the new platform this year, while another 30 percent or so might make the move in the next year or two.
"I think we're still early on in the adoption phase," adds David Torres, senior director, Data & Analytics, at Avanade. "Unless you have a very specific need for the feature set that's being released in that product, it'll take some time for adoption. What we're also seeing is that people are moving to SQL Server 2014 as existing systems that are using SQL 2005 and 2008 are coming to end of life and they know they're going to start moving to the next version of SQL, so they're beginning that move. So it's really kind of two camps: People [who] are needing to migrate because of end of life, and other ones [who] are looking to implement specific features that are in the release."
Avanade wouldn't give out specific numbers, but says the adoption rate is increasing. Ball, at Pragmatic Works, agreed. "This has been the fastest adoption rate that we've ever seen," says Ball, who will also be presenting at the upcoming SQL Server Live! conference.
The man in charge of much of this technology, T.K. "Ranga" Rengarajan, corporate vice president, Data Platform, Cloud & Enterprise at Microsoft, echoes their observations. "Analysts are predicting a very fast rate of adoption," he says. "We see the same -- they're talking about 40 percent CAGR and data growth." He says the adoption strategy of companies varies widely as they adapt to the new world of data culture and the cloud according to their own needs, using different components and techniques.
"Every company, in their own way, is finding the most opportunistic value for themselves in this new world that's emerging," Rengarajan says. "So different people are approaching the problem -- approaching the value -- in different ways." And he says that's perfect for Microsoft. "It doesn't have to be a one size that fits all."
Andy Leonard is providing expert leadership to companies thinking about moving to the new platform in his many roles as a consultant, data developer, chief security officer, SQL Server Integration Services trainer, blogger and more. "In my experience, SQL Server 2014 adoption is moving slowly but steadily," Leonard observes. "I work in business intelligence, and the compelling feature in SQL Server 2014 is updateable columnstore indexes. Combining table partitioning -- which was enhanced in SQL Server 2012 -- with columnstore indexes allows SQL Server 2014 to scale into the Big Data arena."
Clients migrating from SQL Server 2008 R2 should skip SQL Server 2012 and migrate directly to SQL Server 2014, Leonard believes. "The licensing model is roughly equivalent (it's going to cost about the same amount), and they are moving to a platform with an extra couple years of mainstream support in its future," he says.
Barnes, the BoomTown DBA, won't need to skip a version, as the company has used SQL Server 2012 since it was released. He hasn't migrated to 2014 yet because there hasn't been a compelling reason to do so or an opportunity to realize massive performance benefits with no risk. "I'll probably jump up to 2014 by the end of the year," Barnes says. "I just haven't done it yet because there hasn't been this dire need, so I was kind of waiting for it to be out a little bit and let everybody else find the bugs."
Looking Before the Leap
In addition to any possible bugs, a lot more problems can crop up during a migration. IT staffers have to be aware of all the work involved in a successful adoption, which requires planning, testing, and probably modification of existing apps and systems.
For example, the in-memory capabilities are attractive to many businesses, but adopting such new technology is complicated. "If you're going to use that and you're looking to take advantage of some of the memory optimization, you may need to rewrite some of your stored procedures so that you can compile them in-memory," Torres warns. "So there is some updating to the business logic that may take some time to make that migration. Being able to run the memory optimization advisor and so forth on your database and kind of figure out what tables may benefit are definitely some things people are looking at now, and that's a good first cut of looking at what tables in your database may benefit from using in-memory."
Torres advises organizations to proceed with caution. "You've got to be very careful about these kinds of things," he says. "You've really got to do some planning; you've got to do some testing; you've got to do a lot of things. You can't just flip a switch and start using in-memory tables."
Brust notes that any such major migration can be fraught with problems. "Migrating is really risky and expensive for most organizations," he says. "When everything is running, the last thing you want to do is mess with it -- that's just natural."
Challenges and Opportunities for DBAs
At whatever speed, the migration to SQL Server 2014 is happening, and it will come with new possibilities and challenges for DBAs and other IT pros running associated systems -- on top of the daunting task of responding to new business needs brought on by the advent of Big Data and other modern technologies.
Several experts believe DBAs and other staffers will see the new database offering as a welcome opportunity to learn and do new things, rather than a problematic challenge.
"My take is that they're actually very excited about the opportunity to do something new and interesting," says Avanade's Palmer. "I mean, think about it, if you've been a DBA in the SQL world, you've probably been one for a very long time, and it hasn't really changed that much in the last 15 years. I mean, SQL is SQL and there are new features and functions depending on the vendor, but the relational world has been pretty much the same. And we're seeing people getting very excited about the prospects of learning new stuff and applying it in new ways."
Gartner's Feinberg agrees that SQL Server 2014 provides a huge opportunity for DBAs and others. "First, the functionality for in-memory, once they learn how it works and how to use it and stuff like that, puts them ahead of DBAs working with Oracle and IBM DB2, because they don't even have it," Feinberg says. "So it's a new feature or function they have to learn. It's a benefit. Secondly, if they want or need for their organization to get into the other tools that manage data -- the modern [tools], like Hadoop -- OK, they now have the possibility of doing that and staying in the Microsoft family. And then with PolyBase, how to integrate it all together, again, stays within the family. I don't think this hurts the DBAs at all. On the contrary, I think it helps them because it's going to broaden their base of knowledge while keeping them working with what they obviously like to work with."
Ball also sees the new technologies as an opportunity rather than a threat to database pros. "A lot of DBAs are worried that all these new offerings are going to take the place of their job," he says. "And, honestly, if they like to learn, they're in a wonderful position, because we're at a point in our world where data is just growing at an accelerated rate."
Rengarajan echoes their thoughts. The head of Microsoft's data platform sees DBAs playing an important role in leading organizations into the future. "I think of the DBAs as the future scouts of the data culture," he says. "They are currently running databases, managing them, keeping them up-to-date and up and alive and running and all that stuff on-premises. But if you look at the world of hybrid -- of new data engines -- nobody is more ideally situated than DBAs to help lead their organizations forward to take advantage of the cloud and manage data in both realms," he says. "They are perfectly suited to show the way."
Ted Neward, an expert database developer, consultant and author, says it's actually more of a necessity for DBAs to get on board with the wave of the future. "It's essentially an evolve-or-die kind of industry," says Neward, another presenter at next month's SQL Server Live! conference. "If you aren't spending personal time trying to keep up on all of the new things that are happening, you're going to get run over. There's a train that's coming through IT -- there's always been a train coming through the IT industry -- and you either get on it or you get run over by it." It happened to the Cobol guys, he says, and then it happened to the C++ guys, and to some degree it's happening to the database guys, who have had a longer reign of doing things their own way than anyone.
"The relational database has been the queen of the datacenter for 20 years now," says Neward, adding that in some respects this has stifled innovation and new possibilities. While he has nothing against relational databases, he argues they aren't the solution to every data problem, something that the NoSQL movement has revealed and enabled. "DBAs are going to have to start to adjust to what I think is now the reality that they will have non-relational data within their purview, within their space," Neward says.
Barnes, though running a mostly Microsoft shop at BoomTown, based on the Microsoft .NET Framework and Windows Server, has already made this adjustment in his space. Along with SQL Server 2012, he says, the company uses the MongoDB and Couchbase NoSQL databases for caching purposes. "That's our big three," he says.
Neward notes the characteristics of traditional relational databases will continue to play a pivotal role in the datacenter. Even with the acceptance of NoSQL databases, "We don't want to throw the baby out with the bathwater," he says. "A lot of the things that DBAs have relied on for years are reporting tools and monitoring tools and application-performance management and all that good stuff. We don't want to lose that. That has been one of the principal criticisms leveled at the NoSQLs, and they need to respond."
What's Next?
And as they respond, so will Microsoft, with the Azure cloud as a big part of the response.
"Our goal is to have a comprehensive cloud data platform -- Azure data platform, if you will," says Microsoft's Rengarajan, who notes the data platform he oversees has different engines for different needs. "And the key question is: How many engines do you really need? And there are big debates in the research community and the database vendor community about how many engines are necessary. And different companies have gone different ways.
"We have taken the sort of minimal approach," Rengarajan continues. "That is, look, if there is a capability that you need and it's possible to provide that in SQL Server, let us just do that. In the case of in-memory Hekaton for example, it's just a config option for them. Similarly, the columnstore is a config option in the existing engine. So we don't need to complicate people's lives by creating a new engine for every new need."
However, if there is a clear need, as happened with the advent of Hadoop and NoSQL database capabilities, Microsoft won't shy away from providing new engines to provide solutions, the corporate vice president insists.
"We are on this mission in this brand-new world, right," Rengarajan says. "Cloud-first, mobile-first, multiple mobile devices providing multiple experiences to people and data being served from the cloud in a consistent way across these multiple experiences.
"We want to provide all the necessary capabilities for customers to create the data culture in their organizations. And whatever's necessary for that, we will do. We will minimize the complexity for our customers -- for ourselves -- in trying to do that."