Machine-Readable Web Still a Ways Off -- Redmondmag.com

Machine-Readable Web Still a Ways Off

By Joab Jackson
11/02/2009

Despite recent initiatives, the possibility of a machine-readable Web extolled by World Wide Web creator Sir Tim Berners-Lee still faces many obstacles, he admitted during a talk at the International Semantic Web Conference, held this week in Chantilly, Va.

Formats such as spreadsheets or even application programming interfaces (APIs) don't do enough to help the reusability of data, he said. Nor are there enough commercial products available to make it easy for Web sites to transition to the new semantic Web formats.

"When you look at putting government data on the Web, one of the concerns is...to not just put it out there on Excel files," he said. "You should put these things in" the Resource Description Framework (RDF).

Berners-Lee has long extolled the virtues of annotating the Web with machine-readable data. This week's conference of semantic Web enthusiasts, however, offered him the chance to discuss in depth the challenges of getting the rest of the Web to start using the technologies and approaches he advocates.

Few Web site managers are trained in RDF, and not many Web development applications use the standard, Berners-Lee admitted. "I'm not sure we have a grasp of our needs for the next phase of products," he said. He implored the semantic Web community in the audience to educate and inspire their peers. The people they need to talk to, he said, "are not going to be found in these corridors," referring to the conference attendees themselves.

Part of the issue is the inherent complexity of the concept of the semantic Web, which was Berners-Lee's original name for his concept of a machine-readable Web. Even simple sets of data linked by RDF, which was one simple component of his grand vision, "is still remarkably difficult as a paradigm shift," he said.

During the Q&A period, an audience member asked why exposing the API isn't sufficient for exposing data. Berners-Lee pointed out that to use an API, a system administrator or developer must write some sort of program to get at the data. With RDF, the data should be able to be reused directly within the browser itself, involving no additional work on the part of the user.

Berners-Lee noted that if the Web manager uses common uniform resource identifiers to identify people, cities or countries in the data, the browser could automatically pull information from other Web sites about these entities. "So there is very much more value to data for me, if I'm just browsing," he said.
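The effect of shared identifiers can be illustrated with a minimal Python sketch. It assumes two sites each publish statements as simple (subject, property, value) triples. The URIs, property names and sample statements are hypothetical, and the `describe` helper is an illustration rather than part of any RDF toolkit.

```python
from collections import defaultdict

# Hypothetical identifier that both sites agree to use for the same city.
CITY = "http://example.org/id/chantilly"

# Statements published independently by two sites (illustrative only).
site_a = [
    (CITY, "name", "Chantilly"),
    (CITY, "state", "Virginia"),
]
site_b = [
    (CITY, "hostedEvent", "International Semantic Web Conference"),
    ("http://example.org/id/other", "name", "Other"),
]

def describe(uri, *sources):
    """Collect everything any source says about one URI."""
    merged = defaultdict(set)
    for triples in sources:
        for subject, prop, value in triples:
            if subject == uri:
                merged[prop].add(value)
    return {prop: sorted(values) for prop, values in merged.items()}

profile = describe(CITY, site_a, site_b)
for prop, values in profile.items():
    print(prop, values)
```

Because both sources use the same URI, no site-specific mapping code is needed to combine what they say about the city. Statements about other subjects are simply ignored.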

He said that the use of RDF should not require building new systems or changing the way site administrators work, reminiscing about how many of the original Web sites were linked back to legacy mainframe systems. Instead, scripts can be written in Python, Perl or other languages that can convert data in spreadsheets or relational databases into RDF for the end users. "You will want to leave the social processes in place, leave the technical systems in place," he said.
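A conversion script of the kind described might look something like the following Python sketch, which turns spreadsheet-style CSV rows into N-Triples, a line-based RDF syntax. It uses only the standard library. The `http://example.org/` namespace, the column names and the sample values are assumptions made for illustration, and every value is emitted as a plain string literal.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace; a real site would use URIs it controls.
BASE = "http://example.org/"

def escape_literal(value):
    """Escape a string for use inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))

def csv_to_ntriples(csv_text, key_column):
    """Emit one triple per non-key cell, with the key column as subject."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}record/{quote(row[key_column], safe='')}>"
        for column, value in row.items():
            if column == key_column:
                continue
            predicate = f"<{BASE}field/{quote(column, safe='')}>"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return "\n".join(lines)

# Made-up sample data standing in for an exported spreadsheet.
SAMPLE = "site,reading\nSite A,0.041\nSite B,0.057\n"
ntriples = csv_to_ntriples(SAMPLE, key_column="site")
print(ntriples)
```

The spreadsheet itself is left untouched; the script only reads an export of it, which is in keeping with the advice to leave existing systems in place.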

Conference attendees admitted that the idea of machine-readable data can be a hard sell to those unfamiliar with it. The idea of linked data, like the idea of a World Wide Web itself, "solves a problem we didn't know we had," said Ronald Reck, head of the consulting firm Rrecktek. In other words, many of the benefits offered by the then-nascent Web, such as the ability to share documents, were already offered through other technologies, such as the File Transfer Protocol. Likewise, it is difficult to understand the concept of a single format for Web-based data when plenty of formats such as relational databases and spreadsheets already annotate data in ways that make it reusable by other systems.

Nonetheless, the idea of enabling the semantic Web so it can be shared seems to be gaining at least some traction, not least because of efforts that disregard some of its more advanced notions, such as ontology-building, in favor of simply linking data sources.

Elsewhere at the conference, some researchers from the Rensselaer Polytechnic Institute demonstrated how they re-rendered all the data from the Data.gov Web site into RDF. Their work was partially funded by grants from the Defense Advanced Research Projects Agency and the National Science Foundation.

"Our goal is to make the whole thing shareable and replicable for others to re-use," said project researcher Li Ding.

Data rendered into RDF can be more easily interposed with other sets of data to create entirely new datasets and visualizations, Ding said. He showed a Google Map-based graphic that interposed RDF versions of two different data sources from the Environmental Protection Agency, originally rendered in .CSV files.

The graphic derives the new material by linking common elements from the two datasets, Ding explained. The map shows the levels of ozone depletion across the country, the severity of the depletion marked by the circumference of the bubbles over the area where the readings were taken. One data set contained the ozone readings, while the other data source contained the geographical locations of where the readings were taken. The map data was created by joining these two sets of data by their common element -- namely, the names of the locations where the readings took place.
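The join Ding described can be sketched in a few lines of Python. This is not the Rensselaer team's code, and the sample values below are made up rather than taken from the EPA data. It assumes two tables that share a `location` column and combines their rows wherever the location names match.

```python
import csv
import io

# Made-up sample values for illustration; not the EPA data.
READINGS_CSV = "location,ozone\nSite A,0.041\nSite B,0.057\nSite C,0.049\n"
LOCATIONS_CSV = "location,lat,lon\nSite A,38.9,-77.4\nSite B,40.7,-74.0\n"

def load(csv_text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def join_on(left, right, key):
    """Inner-join two lists of row dicts on a shared column."""
    index = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = index.get(row[key])
        if match is not None:
            joined.append({**row, **match})
    return joined

# Each joined row has both a reading and coordinates, ready to plot.
points = join_on(load(READINGS_CSV), load(LOCATIONS_CSV), "location")
for point in points:
    print(point["location"], point["lat"], point["lon"], point["ozone"])
```

A reading with no matching location ("Site C" in this sample) drops out of the result, since there is nowhere to place it on a map.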

The Rensselaer project is one of a number of interrelated efforts. Linked Open Data, a directory of RDF stores, has documented at last count over 4.2 billion assertions encoded in RDF across a wide variety of sources, such as GeoNames and DBpedia.

About the Author

Joab Jackson is the chief technology editor of Government Computing News (GCN.com).
