'SQL Server 2012' and Hadoop Efforts Announced at PASS
Microsoft inched even closer to Hadoop at the PASS Summit 2011, held in Seattle this week, and announced a few product name changes along the way.
First off, Microsoft has dropped its "Denali" code name and now calls its next-generation relational database management system "SQL Server 2012." The current Denali release has been available as a third community technology preview (CTP), as announced in July. However, Microsoft today explained that the solution has advanced to "the final production stages." The final SQL Server 2012 product is expected to be released sometime in the first half of 2012, Microsoft announced at the event, which is sponsored by the Professional Association for SQL Server.
SQL Server 2012 Features
Other code names associated with the Denali CTP release also were dropped, Microsoft announced. For instance, Microsoft's "Crescent" code-named feature, which offers a simplified way for information workers to create data mashups, is now called "Power View." Microsoft also announced that it has added a new touch capability to Power View, which will allow users to drill down into data via touch-screens.
SQL Server Denali developers who were used to the old "Juneau" code name for Microsoft's integrated development environment can now say hello to Microsoft's new, more descriptive name, "SQL Server Data Tools."
Microsoft announced a new capability for businesses to share data in SQL Server 2012, and so introduced another code name, "Data Explorer." This feature, which will be available via SQL Azure Labs in November, will eventually leverage the Windows Azure Marketplace, although the details were lacking in Microsoft's announcement. Microsoft describes Data Explorer as providing "capabilities for data curation, collaboration, classification and mashup, opening new capabilities and opportunities around the data that you own or want to work with."
Microsoft continues to expand its support for the open source Hadoop technology, with interoperability being opened up for Windows Server and Windows Azure. Microsoft is partnering with Apache Hadoop core contributor Hortonworks on the effort. Hortonworks was founded by Yahoo and Benchmark Capital. Brent Ozar, a Microsoft Certified Master for SQL Server, joked on Twitter that "It'd be hilarious if Microsoft ends up buying Yahoo just for the Hadoop expertise."
Hadoop is more than just clustering technology, according to James Kobielus, a senior analyst at Forrester Research, who described it as "the nucleus of the next-generation enterprise data warehouse in the cloud." He called Hadoop an evolutionary path. It has a storage layer as well as an aggregation and query layer called "Hive." It also has an in-database analytics layer through MapReduce.
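For readers unfamiliar with the MapReduce model mentioned above, the following is a minimal single-process sketch in Python of its three steps (map, shuffle, reduce), here counting event types in a toy ad log. It is an illustration only and does not use Hadoop's actual APIs; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit a (key, 1) pair for each whitespace-separated token in a record."""
    return [(token.lower(), 1) for token in record.split()]


def shuffle(pairs):
    """Group all emitted values by key (a real framework does this between map and reduce)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse the values for one key into a single total."""
    return key, sum(values)


def run_job(partitions):
    """Run map over every record in every partition, then shuffle and reduce."""
    mapped = chain.from_iterable(
        map_phase(record) for partition in partitions for record in partition
    )
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample data: two partitions standing in for data held on two nodes.
partitions = [["ad click ad"], ["click ad view"]]
counts = run_job(partitions)
print(counts)
```

On an actual cluster, each partition would be mapped on a separate node and the framework would handle the shuffle across the network; this sketch runs everything sequentially in one process to show only the shape of the computation.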
"Hadoop is a petabyte-scalable complex data and analytics staging layer sitting behind an enterprise data warehouse or it can be a standalone data warehouse to some degree," Kobielus said in a phone call. He added that Hadoop is used by early adopters for things like social media analytics. It's used by AOL and Yahoo for ad analytics, for instance.
"Hadoop is an 'in-database analytics' approach, under which complex analytics -- including multivariate statistical analysis, data mining, predictive modeling, sentiment analysis, and content analytics -- are executed in parallel across MPP [massively parallel processing] clusters of distinct processing and storage nodes," Kobielus explained via e-mail. "Hadoop's power enables these functions to be executed with linear scaling across clouds that hold hundreds of petabytes of data and may distribute processing within individual data centers or even wide-area networks."
Kobielus described Microsoft's collaboration with Hortonworks as a key partnership, since that company has been pushing the vision for next-generation Hadoop.
"The Microsoft partnership,…I believe,…is providing professional services and consulting to ISVs and data warehousing companies and others that want to go down the road of Hadoop for big data," Kobielus said. "Hortonworks is very very principled in their commitment to the open source process. All of their development work is contributed back to the Apache open source community. Microsoft has indicated to me that that's a big reason why they're going with Hortonworks."
In the near future, Hadoop distributions will work with Microsoft's PowerPivot business intelligence tools on Windows Server and Windows Azure. Microsoft plans to release a CTP of the Hadoop service for Windows Azure at the end of this year, while the CTP of the Hadoop service for Windows Server is planned for sometime next year. Microsoft will offer code contributions to Hadoop, which is an open source project of the Apache Software Foundation.
Kobielus described the Hadoop work with Windows Server and Windows Azure as an "exciting" development.
"Hadoop then becomes the common technology, bridging the parallel data warehouse architecture with the Azure architecture, which are two entirely separate databases for big data," he said. "This is great. I look forward to seeing where they are going in terms of using Hadoop as the catalyzing converge layer between those two Microsoft initiatives."
Microsoft previously released CTPs of Hadoop connectors for SQL Server 2008 R2 and SQL Server Parallel Data Warehouse back in August. The one for SQL Server 2008 R2 has now advanced to "release-to-Web" status and is available for download. Hadoop is typically used to run "big data" business intelligence-type operations for applications such as supply-chain management, sales analytics, call-center record analysis, Web event analysis and financial reporting.
Those looking for more information on Hadoop may do well to track Kobielus' work. Yesterday, Forrester published his study, "Enterprise Hadoop Best Practices: Concrete Guidelines From Early Adopters In Online Services." Kobielus also is finishing up two more studies for publication this month, including one on Yahoo's use of Hadoop and a study on enterprise use of Hadoop for big data applications. Those wanting more can look for a future Forrester Wave study from Kobielus on data warehousing players.
Hadoop currently is being embraced by Oracle, NoSQL, IBM, Netezza, Teradata and EMC Greenplum, in addition to Microsoft.
Kurt Mackie is online news editor for the 1105 Enterprise Computing Group.