In-Depth

The SharePoint Diaries

Itching to deploy Microsoft's powerful new SharePoint portal server technology? Better know what you're getting yourself into first.

Microsoft's SharePoint Portal Server 2003 lets enterprises gather, leverage and expose vast stores of knowledge. However, the process of deploying the software can overwhelm IT managers who find themselves working across a dizzying array of technical and business disciplines to tie it all together.

How do you make a SharePoint deployment fly at a large enterprise? Join me on a high-stakes deployment at a very large Food and Beverage Company that we will call FBC.

DAY 1: Genesis
I meet with the CIO for the first time since I've been hired. Nice guy. He asks me how much I really know about SharePoint because he saw my resume, which mentioned my expertise. It turns out one of his direct reports has been charged with finding a solution for document collaboration and management. One of the project managers has some SharePoint experience and FBC is looking seriously at the software. The project manager and I hit it off almost immediately and start geeking out on SharePoint technology. The project manager tells me he's a developer and is glad an IT guy like me can help with the deployment.

DAY 2: Know Thyself
I meet with key stakeholders in the project. I have been around the block enough times to know that leaving people out of the loop can build walls, and in the case of SharePoint, you need to know who all the players are from a business perspective. The taxonomy in SharePoint is key and this makes knowing who's who in the organization an immediate priority.

I make it clear in the meeting that SharePoint will not automatically organize the information, that care and planning on our part will be critical to success. Then I drop the bomb: "Where's the information that your organization needs to have and use to be successful?" I ask.

The silence is deafening. The most common answer after the silence is "Everywhere." Sorry, but that doesn't cut it. We need to know exactly where the information is. Don't know? Find it. And once you find it, figure out what's relevant and what's not relevant. Based on the looks I get in the meeting, answers won't be easy to come by. And yet, this is probably the No. 1 planning issue in deploying SharePoint Portal Server 2003 -- knowing where the relevant information resides and how much of it there is. (In Microsoft Office SharePoint Server 2007, the Knowledge Network feature makes this issue a lot less daunting.)

After the information is located and assessed, the decisions as to what content sources to include are all but made. In SharePoint, a content source can be a file share, an Exchange public folder, another SharePoint server or another Web site. Content sources are important because SharePoint uses them to build a content index, which is created when SharePoint crawls the locations where the information resides and stores what it finds. The content index is then accessed by search queries. In order to create and manage additional indices, you'll need to enable Advanced Search Administration mode.
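To make the crawl-then-query idea concrete, here is a toy sketch in Python. It is not how SharePoint implements its index; the source locations, file names and text are invented, and the point is only that queries are answered from the index rather than from the content sources themselves.

```python
from collections import defaultdict

# Hypothetical content sources: (source type, location) -> documents found there.
# Every name and string here is invented for illustration.
CONTENT_SOURCES = {
    ("file share", r"\\fileserver\marketing"): {
        "launch-plan.doc": "spring water launch plan",
    },
    ("web site", "http://intranet.example/hr"): {
        "benefits.htm": "employee benefits plan",
    },
}


def crawl(sources):
    """Visit every content source and build a word -> locations index."""
    index = defaultdict(set)
    for (_kind, location), documents in sources.items():
        for name, text in documents.items():
            for word in text.lower().split():
                index[word].add(f"{location}/{name}")
    return index


def search(index, term):
    """Answer a query from the index alone, without touching the sources."""
    return sorted(index.get(term.lower(), set()))


content_index = crawl(CONTENT_SOURCES)
print(search(content_index, "plan"))
```

A document that was never crawled never appears in the results, which is why choosing the content sources carefully matters so much.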

DAY 3: Short Stack
Everyone is on board. Actually, the word's gotten out and more people want to get on board! I'm starting to get e-mails about when the application will be deployed and when particular departments will receive their "portal." I share the e-mails with the project manager who moves them up the chain to his boss. As this is a pilot project, we need to keep the lid on things and limit the scope of the effort. This means no extra "stuff" or "tools," and all the hardware and software must be commercial, off the shelf (COTS).

Actually, all this makes the job more manageable. We know how many departments and groups will be allowed to participate in the project and we can avoid fighting with vendor support in this deployment. The server stack includes Microsoft Windows Server 2003 Enterprise Edition, SharePoint Portal Server, Microsoft SQL Server 2000 Enterprise Edition, Microsoft Cluster Server and Microsoft Network Load Balancing. Next, I interview the groups to find out exactly what their individual needs are, and how many documents they are planning on using to collaborate with others.

DAY 4: Infrastructure Dance
The interviews were informative. I now have enough information to start planning the logical structure. I've decided that a large server farm is appropriate, even for this pilot project. There are five business units in four distinct locations (all in North America) that make a convincing pitch for needing SharePoint.

Now for the physical infrastructure. I need to figure out where the servers will be placed geographically, what DNS entries to make (A records and CNAME records are used heavily in SharePoint) and the exact topology. I will have to work with the Active Directory team to ensure that the proper accounts are created and I will need to work with the Network team to get the appropriate IP addresses for each server. I wonder whether there is a database team. (I later find out that there is.) I will need to bring them into the loop since SharePoint must store its configuration information in the database. E-mails go out to about 40 people.

DAY 5: Hard Math
While I wait for the response to the e-mails, I use the time to think about the hardware. What type, how much and what are the specifications? Because FBC has a contract with a very large hardware vendor (LHV), my options are somewhat limited. This is a large server farm, so I will need servers to host the Web front-ends (WFE), Search, Index and a clustered database. I calculate how much space I will need for the implementation. This gets interesting. I will need about a 6.5:1 disk space ratio -- for every 1GB of data I want to put into SharePoint, I will need 6.5GB of free disk space. Where does that ratio come from? Let's do the numbers.

If I have 1GB of data, I'll need 1GB of free disk space, obviously. Because this space is in SQL server, I will also need 100 percent free disk space equal to the size of the database for SQL DB maintenance routines. We estimate we'll need about 300MB of storage for the internal index, and four times that amount for indexing external data, such as data from file servers, public folders, other Web sites and the like. Finally, we'll need plenty of available storage for the farm backup. The numbers for our project shake out as shown in the table below.

SP chart
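The arithmetic can be sketched in a few lines of Python. The components come from the text: the data itself, an equal amount for SQL maintenance, about 300MB of internal index per 1GB, and four times that for the external index. Two things are assumptions: that the index sizes scale linearly with the data, and the backup allowance, which the text does not state. The 3GB per 1GB used in the example is simply the value that reconciles the listed components with the 6.5GB figure, not a number from the article.

```python
def disk_space_gb(data_gb, backup_gb_per_data_gb):
    """Break down the free disk space needed for data_gb of SharePoint data."""
    internal_index = 0.3 * data_gb  # about 300MB per 1GB of data (assumed to scale)
    parts = {
        "data in SQL Server": data_gb,
        "SQL maintenance headroom": data_gb,  # 100 percent of the database size
        "internal index": internal_index,
        "external index": 4 * internal_index,  # four times the internal index
        # Assumption: the backup allowance is supplied by the reader.
        "farm backup": backup_gb_per_data_gb * data_gb,
    }
    parts["total"] = sum(parts.values())
    return parts


for name, size in disk_space_gb(1, backup_gb_per_data_gb=3).items():
    print(f"{name}: {size:.1f}GB")
```

Without any backup allowance the same components add up to roughly 3.5GB per 1GB of data, which shows how much of the ratio depends on the backup strategy.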

Because this is a large server farm, the server components need a high-speed connection between them (and a T1 is decidedly not high speed). The reason: WFE and Search servers talk to one another constantly, and a long delay (of, say, more than 30 seconds) could bring down the server farm. Also, FBC uses SSL on its Web servers and ISA Server as its proxy, which may be an issue. Incoming HTTPS requests reach SharePoint with an HTTP address, which makes it impossible to upload documents.

The folks in Information Security aren't likely to change their default configuration just for me. But when I show them that we can end run the problem by implementing host-header forwarding, the group gets on board. Now ISA Server can forward the HTTPS packets to the WFE without altering the original host header in the HTTPS packets.

Windows SharePoint Services Version 3

It gets better. The newest version of the SharePoint platform greatly improves the collaborative tools, content management, tracking capabilities and hosting of other services to the knowledge worker.

According to Kurt DelBene, Microsoft's corporate vice president of the Office Server Group, the extensibility of Windows SharePoint Services is a conscious design choice to create a product that takes advantage of the industry's rich ecosystem of solution providers and highly specialized software developers.

"We have designed Windows SharePoint Services with the foundational components that enable customers and partners to develop solutions for collaboration, content management, and portals," DelBene says.

As in WSS v2, the new version can provide a single workspace for teams to coordinate schedules, organize documents, and participate in discussions -- within the organization and over the extranet.

So what's new? To begin with, the new user interface is breathtaking. There's a ton of information on the starting page of the new Windows SharePoint Services version 3 (WSSv3).

Figure A. The new start page puts everything a mouse-click away.

It's all about organization. As you can see, the five main areas of Permissions, Look and Feel, Galleries, Site Administration and Site Collection Administration are all laid out for you on the same page, a real improvement over version 2. There are two key component improvements in WSSv3:

  • Improvements to collaboration workspaces. SharePoint sites now offer e-mail and directory integration, alerts, Really Simple Syndication (RSS) publishing, templates for building blogs (also known as weblogs) and wikis (Web sites that can be quickly edited by team members requiring no special technical knowledge), event and task tracking, improved usability, enhanced site navigation and more.
  • Enhancements to content storage. SharePoint lists and libraries now provide per-item security for better data control and integrity, a recycle bin and enhanced flexibility for storing more types of content. Row and column capacity has also been increased, as has retrieval speed. WSSv3 can be easily integrated with smart client tools. In particular, close integration with Microsoft Office Outlook 2007 provides offline access to events, contacts, discussions, tasks and documents.

One of the best interface updates is the breadcrumb feature, which always shows users where they are in the site hierarchy. It's a well-known fact that if users have to click more than six times to get somewhere, they'll become frustrated and give up. Breadcrumbs should eliminate this issue.

Microsoft shied away from the term "document management" in the last version of WSS, but no more. Version 3 is a full-blown document management environment, with workflow, scheduling, tracking, and other features vital to keeping tabs on document creation and archival.

The new SharePoint even boasts a built-in calendar, which can sync with Outlook and has an RSS feed that allows you to subscribe to sites. The new version also adds item level security in lists, providing much higher granularity when managing access to information. It's a long-awaited improvement that most IT managers will welcome.

The list goes on and on and the best part is that it's free. Go here to download your copy. -- R.T.

Another option is SSL bridging, though it's more problematic when troubleshooting search than host-header forwarding. ISA Server enables HTTPS-to-HTTP bridging, but the functionality is not supported when publishing with SharePoint. SharePoint uses absolute URLs, and the URL from the client and the URL sent to the server must match. To keep the URL sent from the client to ISA Server the same as the URL sent from ISA Server to the Web server, a new SSL connection must be established between ISA Server and the Web server.

Figure 1. ISA Server receives a secure HTTPS request, then uses host-header forwarding to send an HTTP request for the published site.
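The URL-matching requirement can be illustrated with a small Python check. This is only a sketch of the rule that the URL from the client and the URL sent to the server must match; it does not touch ISA Server or SharePoint, and the host names are hypothetical.

```python
from urllib.parse import urlsplit


def same_public_url(client_url, forwarded_url):
    """True when scheme, host and path survive the trip through the proxy."""
    a, b = urlsplit(client_url), urlsplit(forwarded_url)
    return (a.scheme, a.netloc.lower(), a.path) == (b.scheme, b.netloc.lower(), b.path)


client = "https://portal.fbc.example/sites/hr"    # hypothetical public URL
bridged = "http://wfe01.fbc.example/sites/hr"     # HTTPS-to-HTTP bridging rewrites it
forwarded = "https://portal.fbc.example/sites/hr" # original host header kept, new SSL hop

print(same_public_url(client, bridged))
print(same_public_url(client, forwarded))
```

The bridged request fails the check because both the scheme and the host differ from what the client asked for, which is the mismatch that absolute URLs cannot tolerate.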

DAY 6: CAS Deep Dive
I tally up the best estimates from all the groups as to how many documents they have and decide that I will need substantial hard drive space; somewhere in the neighborhood of three terabytes. I make a call to the hardware vendor and get the exact specifications for the servers and RAID arrays and receive a quote. I send the quote to the finance guy and he gives me an ETA on when I can get the hardware.

With infrastructure pieces of the deployment planned, it's time to pull in the project manager who has development experience to discuss possible development issues in SharePoint. Some groups will be developing custom Web parts, and as a responsible administrator I need to negotiate what should and should not be done in SharePoint from a development perspective. This means a discussion about Code Access Security (CAS).

CAS is an important aspect of SharePoint. If you, as an administrator, allow developers to write and deploy whatever they want, you are asking for problems. If a developer were to create an assembly that performs file I/O, you should ensure that the code is restricted to specific (and hopefully isolated) areas of the file system. CAS also means that you should prevent other code developed externally from calling internal code. CAS can also use an assembly's URL, or hash, to identify code. In the .NET framework, evidence is used to identify assemblies and grant appropriate permissions to those assemblies. This can be the URL, or Zone, from which the assembly was obtained. Evidence could also be a digital signature or hash. In addition to the default ASP.NET security policy files, Windows SharePoint Services (WSS) provides two policy files: wss_minimaltrust.config and wss_mediumtrust.config. Each policy file has a set of code groups which are used to assign permissions to assemblies.
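The relationship between evidence, code groups and permissions can be pictured with a toy model in Python. This is not the .NET API and does not reflect the actual contents of the WSS policy files; the group names, paths, hash value and permission set labels are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass(frozen=True)
class Evidence:
    """What is known about an assembly: where it came from and what it is."""
    url: str
    zone: str
    file_hash: str


@dataclass(frozen=True)
class CodeGroup:
    """A membership test paired with the permission set it grants."""
    name: str
    is_member: Callable[[Evidence], bool]
    permission_set: str


# Hypothetical code groups; every name and value below is made up.
POLICY: List[CodeGroup] = [
    CodeGroup("LocalBin", lambda e: e.url.startswith("file:///c:/portal/bin/"), "MinimalTrust"),
    CodeGroup("ReviewedPart", lambda e: e.file_hash == "ab12cd34", "MediumTrust"),
    CodeGroup("InternetZone", lambda e: e.zone == "Internet", "Nothing"),
]


def granted_permission_sets(evidence: Evidence, policy: List[CodeGroup]) -> Set[str]:
    """Collect the permission sets of every code group the evidence matches."""
    return {group.permission_set for group in policy if group.is_member(evidence)}


reviewed = Evidence("file:///c:/portal/bin/part.dll", "MyComputer", "ab12cd34")
unknown = Evidence("http://downloads.example/part.dll", "Internet", "ffff0000")
print(granted_permission_sets(reviewed, POLICY))
print(granted_permission_sets(unknown, POLICY))
```

The design point is that permissions follow from what the assembly is and where it came from, not from who happens to run it.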

What does this information mean to an IT administrator? For one, it means understanding and restricting the behavior of assemblies installed on your WFEs. A utility called PERMVIEW lets you view all declarative security used by an assembly. The syntax is: PERMVIEW [/output filename] [/decl] manifestfile

Let's say that your developer created an assembly called UBERassembly.exe and you want to know all the declarative security on this file. You would run PERMVIEW /output whatsitdoing.txt /decl UBERassembly.exe. Review the output and if you see RequestMinimum permission, understand that this lowers the security threshold required for the code to run. Also, if the output shows Unrestricted, you'll know that once it has obtained minimum permission, the code will enjoy unrestricted access to whatever resource it is calling. You are well advised to make sure that developers understand this code will not run on your WFEs. Here is a short list of questions you should be asking your developers:

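  • Is your assembly strong named? That is, is it signed with a public/private key pair, so that its name, version, culture and public key give it a verifiable identity, rather than being identified by a simple text name alone?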
  • Do you request minimum permissions? Minimum permissions make it much easier for code to run unrestricted.
  • Have you scanned your code for Assert calls? Remember, Asserts that are not handled carefully may allow malicious code to call your code through trusted code.

If your developer gets uppity or tries to dazzle you with dev talk, simply tell him/her that the code will stay in a dev environment until proven safe.
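The review of the PERMVIEW output described above can be partly automated. The Python sketch below only searches a saved report for the two terms the text calls out, RequestMinimum and Unrestricted. The sample text is an invented placeholder, not real PERMVIEW output, and the exact layout of a real report is not assumed.

```python
FLAGS = ("RequestMinimum", "Unrestricted")


def flag_lines(report_text):
    """Return (line number, flag, line) for every line that mentions a flag."""
    hits = []
    for number, line in enumerate(report_text.splitlines(), start=1):
        for flag in FLAGS:
            if flag.lower() in line.lower():
                hits.append((number, flag, line.strip()))
    return hits


# Invented placeholder text standing in for the contents of whatsitdoing.txt.
SAMPLE_REPORT = """\
(placeholder text, not real tool output)
assembly-level request: RequestMinimum
permission set state: Unrestricted
"""

for number, flag, line in flag_lines(SAMPLE_REPORT):
    print(f"line {number}: found {flag}: {line}")
```

In practice the text would come from the file named in the /output switch, for example by reading whatsitdoing.txt and passing its contents to flag_lines. A hit is a prompt for a conversation with the developer, not a verdict on its own.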

The rest of the day is spent developing Visio diagrams of the proposed SharePoint infrastructure, to be presented at the stakeholder meeting tomorrow. Figure 2 below shows how it might look.

Figure 2. An overview of FBC’s SharePoint environment reveals ample redundancy.

DAY 9: Search Savvy
Hardware will be here in a few days and I'm preparing to present to the stakeholders again. The number of stakeholders has grown from eight to more than 20. My goal at the meeting is to present the proposed infrastructure diagram, outline principles of governance, and hold a mini-training session on how to navigate the user interface. I will then break out of the meeting and meet with the future application owners to show them how to craft a useful search result, which may take half the day.

The mini training goes longer than expected. So many questions! The training aspect of this initiative suddenly hits me like a ton of bricks. Universal adoption requires that the end user be trained, but you can't expect everyone to acquire significant new skill sets.

The 5 Commandments of SharePoint Deployments

An effective SharePoint deployment must be built with a solid understanding of the organization's design needs. Here are some of the most common things you should take into account before (or as) you design your SharePoint infrastructure.

  1. Thou shalt not put all documents into SharePoint. This is a common mistake. SharePoint is a good document repository, but it should not replace your file servers. Keep non-collaborative documents on your file servers and point SharePoint to the file server as a content source. Dropping all documents into SharePoint unnecessarily grows your SQL database and makes a backup and restore more cumbersome, especially for a file-level restore.
  2. Thou shalt put processing power on the Web front-end. Architects often place the biggest, most powerful piece of hardware at the back-end with SQL. But if that database is dedicated to SharePoint, you are off course -- the "hoss" should be placed at the front-end with the WFE. That's the end that gets busy with crawling content and serving up user requests.
  3. Thou shalt not underestimate storage requirements. Obey the Golden Rule of SharePoint -- for every 1GB of data, set aside 5GB to 6GB of storage capacity. If you don't adequately size your disk space, you'll be forever adding space at inconvenient times.
  4. Thou shalt not scrimp on user training. What if you built a killer app and no one used it? Fail to train your users, and you'll find out. Develop an internal training program or pay for competent external training, but do not let your investment go down the drain.
  5. Thou shalt respect search. If you deployed SharePoint for its search, you must invest man-hours to make it work right. Expect to budget 0.5 FTE (full-time equivalent) for every 100 content sources SharePoint server must crawl. That half of a position will reflect time spent ensuring content sources are being correctly crawled, that filters are working and that quality results are being returned. -- R.T.

The breakout meeting was also interesting. The application owners were surprised that SharePoint's search functionality was as powerful as it was configurable. Unlike other search engines, SharePoint's search uses what is called "Free Text Queries" and ignores wildcards and Boolean expressions. SharePoint attempts to understand what you are searching for rather than matching the words you put in the search field. It uses different components to help best match your intent. One of those components is the Thesaurus. Thesaurus files are located in %systemdrive%\Program Files\SharePoint Portal Server\DATA\Config. The files are separated by language and if you are using English, be sure to edit the correct English file (ENU for USA, ENG for UK). For FBC, there were many words for which we needed to expand the Thesaurus. For example, "Water" was expanded to "Still," "Sparkling," "Spring" and "Drinking."
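The effect of a Thesaurus expansion can be sketched in Python. This is a simplified model of query expansion, not the format of the SharePoint Thesaurus files or the search engine's actual logic. The single entry is the "Water" example from FBC, and the lookup is exact-case to mirror the case sensitivity of the Thesaurus.

```python
# One expansion entry, taken from the FBC example; a real file would hold many.
THESAURUS = {"Water": ["Still", "Sparkling", "Spring", "Drinking"]}


def expand(query_terms, thesaurus):
    """Add the thesaurus expansions for each query term, matching exact case."""
    expanded = []
    for term in query_terms:
        expanded.append(term)
        expanded.extend(thesaurus.get(term, []))
    return expanded


print(expand(["Water"], THESAURUS))
print(expand(["water"], THESAURUS))
```

The lowercase query comes back unexpanded, which is the kind of surprise that makes both cases worth trying when a search misbehaves.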

It was also suggested that certain words be excluded, which is achieved using the Noise Word file. It tells SharePoint to exclude words from the Index, such as prepositions, conjunctions and articles. Just realize that if a library wanted to index the movie "The Way We Were," it would be invisible to SharePoint. Every word in the title is a default Noise Word. If changes are made to the file, you must restart the Microsoft SPS Search service. Troubleshooting the Search functionality is the most time-consuming and sometimes the most frustrating task of all. Since the Thesaurus is case sensitive, both cases of the word should be tried if necessary.
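A short Python sketch shows why that movie title disappears. The noise word set below is a small illustrative subset chosen to include the words in the title, following the claim above that all of them are default Noise Words. It is not the contents of the actual Noise Word file.

```python
# Illustrative subset only; the real Noise Word file is much longer.
NOISE_WORDS = {"a", "an", "and", "of", "the", "way", "we", "were"}


def indexable_terms(title):
    """Drop noise words and return what is left to put in the index."""
    return [word for word in title.lower().split() if word not in NOISE_WORDS]


print(indexable_terms("The Way We Were"))
print(indexable_terms("Spring Water of the Rockies"))
```

The first title leaves nothing to index, while the second keeps its three meaningful words.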

DAY 11: Next Steps
Now I just wait for the hardware, fill out the appropriate change request forms, hold the proper follow-up meetings, purchase the software licenses, do my normal day-to-day chores and work with the facilities managers at each location to actually get the hardware racked and cabled up. The hard part -- planning -- has been done. Then we'll have to work on training the users. Now that will be a major headache.
