Q&A
Microsoft on How SharePoint Syntex Uses AI To Address Content Management Woes
Project Cortex team members Naomi Moneypenny and Dan Holme explain the origins of SharePoint Syntex and how it benefits orgs with content management problems, and address security and governance concerns.
Microsoft this week at Ignite announced SharePoint Syntex, its first product coming from Project Cortex, which uses artificial intelligence (AI) to sift through organizational data and gather insights that can be automated into processes. Project Cortex is a Microsoft 365 solution, with SharePoint as a principal component. Behind the scenes, Project Cortex taps information surfaced by the Microsoft Graph, a cloud-based data store with AI capabilities that runs through Microsoft 365 content.
Syntex specifically is a trainable AI product that will reach "general availability" commercial release on Oct. 1. Syntex provides AI access when processing three content types, namely "digital images, structured or semi-structured forms, and unstructured documents," according to an explanation by Seth Patton, a marketing team leader for Microsoft 365 productivity solutions, in a Tuesday Microsoft announcement.
It may be a little confusing that Syntex will arrive before Project Cortex. Microsoft officials explained during Ignite that various Project Cortex products will be released this year, which is Microsoft's newly expressed plan.
I got a moment to pose questions about Syntex to Naomi Moneypenny and Dan Holme, both on the Project Cortex team. They told me some things about the project's origins and how it benefits organizations with content management problems. They also offered clarifications on security and governance concerns.
Moneypenny is director of product development for Project Cortex. Holme is director of product marketing on the Microsoft 365 team. The Q&A that follows has been edited for clarity and length.
Redmond: What are the main concepts behind SharePoint Syntex?
Holme: The goal of SharePoint Syntex is really to do a couple of things. First, it's to scale individual expertise. It's to allow someone who is an expert at a specific content-centric process in an organization to capture their knowledge with no code and model that. Then, they and others can use it to automatically process content moving forward, unlocking the knowledge that's always locked inside of content and pulling it out as metadata. That metadata then can be used for three really primary steps forward: making content easier to find and discover, and making it a part of an integrated business process.
So taking, for example, metadata out of a contract to determine who needs to approve it next. And then finally, allowing an organization to improve their security and compliance stance by understanding content better and really seeing what's inside content. You can have situations where you need to understand fundamental differences between different contract types. We might want to apply different retention policies or different security policies.
So, SharePoint Syntex is about unlocking the knowledge stored in content by allowing individuals to teach SharePoint Syntex how to process content the way they do. That's the top line of it.
So if I am using Syntex, I can tell it what to look for and it will help surface some of this information. I don't have the whole Project Cortex at that point, right? I just have the ability to label and start training my AI, maybe?
Moneypenny: Yep, that's really what it comes down to. I think the Syntex project is really about making sure we can unlock the expertise you have in a business process -- so, for example, if my day job is processing a whole bunch of forms or I'm a project manager and I get sent a lot of responses to an RFP [request for proposal]. As a project manager, I have to read dozens and dozens of proposals. I have to figure out what's important out of each of those proposals. I have to extract the milestones of the project, the deliverables, the pricing, the experts -- all of those things together. And so, what if I want to basically automate that? How do I extract what's really important as part of that business process and extract what's important to me, and then make sure that I can connect it into the retention policies or security policies and extract the metadata that's required for that business process? That's kind of a time savings.
Also, I'd like to be able to understand how I can contribute the knowledge that I have to the rest of the organization. And so this is really where AI helps you in terms of understanding topics, figuring out how it can automatically identify topics inside your organization from all the content that you already have. And then helping you to collect the resources of the people connected to that.
Would organizations be concerned about this kind of metadata collection? Are there governance and security controls over this?
Holme: Great questions. First of all, SharePoint Syntex is available as an add-on for both [Microsoft 365] E3 and E5 customers. And secondly, this is about accelerating business processes that already exist. All of this respects and actually augments the security policies you have as an organization. So one example is that people have paper document scans that they're not managing at all. They don't even know what's in there. And if they knew that there was actually a Social Security number in one of these scans, it would actually help them improve their compliance and governance. So it is complementary to information governance as opposed to being at odds with it.
Moneypenny: For sure, we always respect the permissions that people have on their items, just to be very clear on that. So the same thing inside of the Microsoft Graph -- the same permissions they use for Microsoft Search, whether I could see a search result or not -- Project Cortex and SharePoint Syntex both apply those permissions automatically. You're never suggested or recommended something you don't have access to. And the same thing is true when you're processing content with SharePoint Syntex. You can apply to it a retention policy, even a sensitivity label if you want to in the future. Those are the areas where we want to make sure that it plugs in very well with our colleagues over in the Microsoft Information Protection area.
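As a rough illustration of the permission trimming Moneypenny describes, the sketch below filters candidate items against a per-item access list before anything is suggested to a user. It is not Microsoft's implementation; the `Item` type, its `allowed_users` field, and the `trim_for_user` function are hypothetical names chosen for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Item:
    """A hypothetical content item with a simple access list."""
    title: str
    allowed_users: set[str] = field(default_factory=set)


def trim_for_user(items: list[Item], user: str) -> list[Item]:
    """Return only the items the given user is permitted to see."""
    return [item for item in items if user in item.allowed_users]


# Made-up sample data for illustration only.
candidates = [
    Item("Vendor contract", {"alice", "bob"}),
    Item("HR scan", {"carol"}),
    Item("Project plan", {"alice"}),
]

# Only items alice can already open are eligible to be suggested to her.
visible = trim_for_user(candidates, "alice")
print([item.title for item in visible])
```

The point of the sketch is the ordering: the access check happens before any result or recommendation is shown, so a user with no rights to an item never learns it exists.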
What can you tell me about Syntex's machine teaching aspect?
Moneypenny: Within SharePoint Syntex, one of the new technologies that we're basically bringing to market for the first time is something we call "machine teaching." And that's a little bit distinct and separate from machine learning. Machine teaching is really great at understanding datasets inside of your organization -- the classic project manager problem where I process a lot of contracts in an organization, for example. If I have that kind of job, even though it's a lot of information to humans, to computers it's not very much at all.
So this approach that we call machine teaching is about how people help to basically encapsulate or encode their knowledge, so that they can extract what's important to them. We believe fundamentally that anything that can be taught to a person can be taught to a machine, but you do have to help explain that to the machine -- what it's looking for and why it's important to you. So we have a different way of teaching, and that relies on a smaller dataset. We give it things like five examples of a document. And importantly, you're also going to give it a counter-example of something that is not the document. Machine teaching really helps us to be able to have the person explain to the machine what it is that they're looking for. And then once you have that model, then multiple people inside of your organization can actually help to teach the machine, as well.
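To make the "five examples plus a counter-example" idea concrete, here is a toy sketch of learning from a handful of labeled documents. It is only an assumption-laden illustration, not the algorithm Syntex uses: the `teach` and `matches` functions, the word-overlap rule, and the sample texts are all invented for this example.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase a text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def teach(positives: list[str], negatives: list[str]) -> set[str]:
    """Keep words found in every positive example and in no counter-example."""
    if not positives:
        raise ValueError("at least one positive example is required")
    common = set.intersection(*(tokens(p) for p in positives))
    seen_in_negatives = set().union(*(tokens(n) for n in negatives))
    return common - seen_in_negatives


def matches(model: set[str], text: str, threshold: float = 0.5) -> bool:
    """Match when at least `threshold` of the learned words appear in the text."""
    if not model:
        return False
    return len(model & tokens(text)) / len(model) >= threshold


# Five made-up examples of the document type, plus one counter-example.
examples = [
    "Service agreement between Contoso and vendor, effective date and payment terms",
    "This agreement sets payment terms; vendor signs by the effective date",
    "Agreement: vendor obligations, effective date, payment terms",
    "Vendor agreement with effective date and agreed payment terms",
    "Payment terms and effective date for the vendor agreement",
]
counter_examples = ["Meeting notes: vendor demo went well, follow up by date"]

model = teach(examples, counter_examples)
print(sorted(model))
print(matches(model, "Draft agreement: payment terms effective next month"))
print(matches(model, "Lunch menu for the vendor visit"))
```

In this toy version the counter-example does real work: "vendor" and "date" appear in every positive example, but because they also appear in the meeting notes they are dropped as distinguishing signals.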
Syntex was described as the first Project Cortex product. What's next?
Holme: The rest of Project Cortex is about looking across the organization at all of the content and all of the activity and expertise that an organization has and identifying shared topics -- whether it's a project or a process or a customer. And then pulling all the information related to that topic into one place, creating what we've shown as topic pages, which are like a Wikipedia page for the enterprise about a particular topic. That topic understanding then can be augmented by human experts to make it even better. And then it gets pushed and delivered automatically into the apps you use every day.
So you can imagine being in an e-mail and seeing the name of a project or an acronym that you don't understand, and literally being able, in the e-mail, to hover over it and get a summary of what that means, and who are the experts and what resources are there in the organization about that topic. So that topic intelligence is the part of Cortex that will be available to customers later this year.
It sounds like Microsoft's early objectives for SharePoint are coming together. You had enterprise search, and Microsoft acquired a company called FAST. And then you had the Microsoft Graph.
Holme: I think, conceptually, you're heading in the right direction. SharePoint has always been able to manage metadata. You could create schemas, lists and libraries. Search could use the metadata to make search more accurate and rich. So those two layers have always been there.
What Syntex does is almost like making SharePoint better. In the past, people had to add metadata, people had to process content and say that for this one row of this contract the value is x, the vendor is y, the deadline is z. Now, Syntex can actually learn from people to do that automatically, saving those people time. The other thing that Syntex does is it captures that knowledge so that others can use it. So those middle layers of SharePoint and search provided a foundation for this concept of knowledge management and knowledge sharing in the organization. And Project Cortex is now applying AI to address the weak spots, which have been the actual gathering and unlocking of knowledge, the curation of it, and then the delivery into the context of your work.
Moneypenny: It might be helpful if you think about this as a layer of intelligence that exists within the Microsoft Graph. Microsoft's journey also included FAST and Yammer acquisitions. All of that, and the goodness of SharePoint for the last 20 years or so, has basically been going into the Microsoft Graph -- all of that collaboration data, the signals we have from people working together. And there's the search index that happens across the Microsoft Graph that understands relationships between documents and people. What we're trying to do with Project Cortex is really elevate that to the next level. To say, 'What's here is a knowledge index,' essentially.
And the Microsoft Graph is defined as a data store to which AI is applied?
Moneypenny: It's been the storage layer of all of the content we have in Microsoft 365. So all the different applications -- SharePoint, Teams, Yammer, Outlook, etc. -- all of that goodness goes in there. That's the base content, if you will, the primordial soup of everything that's in Microsoft 365. And on top of that you've got the signal data between those things, so all of that collaboration that happens inside your organization is kind of the beginnings of a brain.
It's in that process of connecting things together, of understanding how people are working together, what the traffic patterns are around e-mailing, around document creation, meetings, tasks -- all of those things are within the Microsoft Graph. And so, from there, I can extract intelligence essentially, and I can put a different level of processing on top of it. So, the first thing is obviously search. The next level is really about understanding knowledge inside of your organization.