Spring Cleaning for AD
Spring is here, and it's time to clean up some of the dark corners of your Active Directory forest.
Most of you could probably name a few areas within your Active Directory forest that could use some tidying up and cleaning out, if only you had the time. Just like your closets at home, there are many places for digital dust bunnies to hide in AD. There's a little-known tool that can greatly speed up the cleaning process.
Our company implemented AD in 2001, and designed and documented a fairly efficient architecture. As with most companies, though, people come and go. The ownership of AD changed hands a couple of times within the IT department. By the time it got to me early last year, it was long overdue for some cleaning. Several VB scripts I had written to audit our environment were time-consuming and incomplete at best. When I contacted our Microsoft rep to see if they could offer any assistance, I learned about a program called the Active Directory Health Check (ADHC).
The ADHC is available under the Microsoft Premier Support program. One of Microsoft's sharpshooters comes to your site, installs a toolkit on your server and then trains you how to use it as they make a clean sweep through your AD environment. The process usually lasts anywhere from three to five days, depending on the size of your organization.
To schedule an ADHC engagement, you can work through your technical account manager or contact Microsoft Services via their Web page. This is one of the more popular services they offer, so it may take up to three months to get on the schedule and get someone on site.
Before the Microsoft technicians arrive, they'll give you the AD Risk Assessment Program tool, which will verify that you meet all of the prerequisites. You'll have to run this tool from a forest root server with rights to all of the child domains. It will verify basic connectivity to all of your domain controllers (DCs) using several methods: WINS, DNS, WMI and tests of several standard AD ports. It generates a handy report that lists all of the forest DCs and which tests passed or failed.
Set the Ground Rules
We spent four weeks cleaning up name resolution issues and begging firewall admins across North America to adjust their rules so the DCs could all freely talk to the health-check server. Microsoft Knowledge Base article KB832017 lists all the ports you need to keep open for AD, and Microsoft Downloads has a tool called PortQryUI that makes it easy to check.
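If you want a rough first pass before reaching for PortQryUI, the same idea can be sketched in a few lines of Python. This is only an illustration and not part of the ADHC toolkit: the port table is a commonly cited subset of what KB832017 lists, it covers TCP only, and the function names are made up for this example.

```python
import socket

# Commonly cited TCP ports for AD traffic (a subset; KB832017 has the full list).
AD_TCP_PORTS = {
    53: "DNS",
    88: "Kerberos",
    135: "RPC endpoint mapper",
    389: "LDAP",
    445: "SMB",
    636: "LDAP over SSL",
    3268: "Global catalog",
    3269: "Global catalog over SSL",
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_dc(host, ports=AD_TCP_PORTS, timeout=2.0):
    """Map each port number to True (reachable) or False (blocked or closed)."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}


def report(results_by_dc, ports=AD_TCP_PORTS):
    """Build one text line per DC listing the ports that failed."""
    lines = []
    for dc, results in sorted(results_by_dc.items()):
        failed = [
            f"{port} ({ports.get(port, '?')})"
            for port, ok in sorted(results.items())
            if not ok
        ]
        status = "FAILED " + ", ".join(failed) if failed else "all ports reachable"
        lines.append(f"{dc}: {status}")
    return lines
```

Running check_dc against each DC name and feeding the results to report gives one line per DC, which is handy for showing a firewall admin exactly which rule still needs attention.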
You can save yourself a bit of time by pre-installing the Admin Pack and Resource Kit prior to the ADHC engagement. Microsoft will also ask you to run a script that drops a registry key on your DCs to enable AD database white space logging. It's harmless, and you'll see later on why it's so important.
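The article doesn't name the registry value Microsoft's script sets. A common way to get white space reporting is to raise the "6 Garbage Collection" diagnostic level under the NTDS Diagnostics key, which makes the directory service log its free database space after garbage collection. Assuming that is the setting involved, here is a minimal sketch that only builds the reg add command for each DC. The DC names are placeholders and nothing is executed.

```python
# Assumption: white space logging is enabled through the NTDS diagnostics
# level "6 Garbage Collection". The ADHC script itself may do it differently.
NTDS_DIAG_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics"
GC_VALUE = "6 Garbage Collection"


def whitespace_logging_command(dc, level=1):
    """Return the reg.exe command line that sets the diagnostic level on a remote DC."""
    return (
        f'reg add "\\\\{dc}\\{NTDS_DIAG_KEY}" '
        f'/v "{GC_VALUE}" /t REG_DWORD /d {level} /f'
    )


if __name__ == "__main__":
    # Hypothetical DC names, for illustration only.
    for dc in ["dc01", "dc02"]:
        print(whitespace_logging_command(dc))
```

Printing the commands rather than running them leaves room to review the list and push it through change control first.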
Once you've cleaned up all the rabbit trails in your forest, it's time for the big game hunters to step on the scene. Microsoft Services has a team of professionals dedicated to the ADHC program, and these guys know more AD command-line tools and switches than you ever knew existed. They're pros and they're good teachers.
The ADHC is divided into four basic areas -- reporting, results, settings and help:
- Reporting is where the action happens. There's a long list of tools you can launch from here.
- Results displays all your historical data so you can see the impact of your cleanup efforts and even annotate the output. You can copy and paste the results into Excel for more crunching and roll-up.
- Settings lets you customize basic ADHC options and narrow the scope from the entire forest down to a particular domain.
- Help is well written. It describes every report, column and field in detail.
Unless you have a very strict imaging process, it's easy for your DCs to fall out of alignment with your configuration standards. This is especially true when child domains are locally administered beyond your realm of influence. We have 55 DCs scattered across North America in 12 separate domains. Now we can check, at a glance, things like IP configuration (including DNS, WINS and so on), event log sizes, time service settings, operating system and patch levels, uptime and drive space. With this new level of visibility, it's easy to see where the problems exist.
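The at-a-glance check boils down to comparing every DC against one standard. A minimal sketch of that comparison follows, assuming you have already collected each DC's settings into a dictionary. The setting names and baseline values here are invented for illustration and are not ADHC field names.

```python
# Hypothetical baseline: the values every DC is expected to share.
BASELINE = {
    "dns_servers": ("10.0.0.10", "10.0.0.11"),
    "event_log_max_kb": 65536,
    "time_source": "NT5DS",
}


def find_drift(inventory, baseline=BASELINE):
    """Return {dc: {setting: (expected, actual)}} for every value off baseline.

    A setting missing from a DC's record is reported with an actual of None.
    """
    drift = {}
    for dc, settings in inventory.items():
        diffs = {
            key: (expected, settings.get(key))
            for key, expected in baseline.items()
            if settings.get(key) != expected
        }
        if diffs:
            drift[dc] = diffs
    return drift
```

DCs that match the baseline don't appear in the result at all, so with 55 DCs the output stays short enough to read.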
A large part of the ADHC is checking the replication health of sites, site links, connection objects and directory partitions. Since the original AD implementation, our site topology grew as more companies joined the forest. Unfortunately, we gave little thought to ensuring replication remained efficient.
We put in some static links as Band-Aids, and those were the first things to go. Our hub-and-spoke design had become overburdened in the center. We had to create a dual-site hub and evenly redistribute the spokes. I wrote scripts to delete and re-create all of the connection objects for the new design. Then Microsoft recommended we let the Knowledge Consistency Checker (KCC) and Inter-Site Topology Generator (ISTG) handle everything else dynamically. Manual configuration is needed only when the dynamic algorithm can't handle all of the permutations, and it takes an organization much larger than ours to reach that point.
The ADHC has many reports to confirm replication health. My favorite is the basic status report. It shows change latency for the directory partitions across all of your DCs. Another report shows how long it takes changes to traverse from one end of your site topology to the other. This is great for identifying DCs or sites that lag. The GUI tool makes it painless to run those command lines with umpteen switches across all of your DCs. You'll notice that your Flexible Single Master Operations (FSMO) role servers show many more naming contexts for replication.
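To show the kind of question the status report answers, here is a small sketch that flags partitions whose last successful replication is older than a threshold. It assumes you already have (DC, partition, last success time) records from whatever source you trust. The record layout and the one-hour default are assumptions for this example, not ADHC behavior.

```python
from datetime import datetime, timedelta


def lagging_partners(records, now, max_age=timedelta(hours=1)):
    """Return (dc, partition, age) tuples older than max_age, worst first.

    records is an iterable of (dc, partition, last_success) tuples, where
    last_success is a datetime in the same time zone as now.
    """
    late = []
    for dc, partition, last_success in records:
        age = now - last_success
        if age > max_age:
            late.append((dc, partition, age))
    return sorted(late, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample data.
    now = datetime(2024, 1, 1, 12, 0)
    sample = [
        ("dc01", "DC=corp,DC=example", now - timedelta(minutes=10)),
        ("dc02", "DC=corp,DC=example", now - timedelta(hours=5)),
    ]
    for dc, partition, age in lagging_partners(sample, now):
        print(f"{dc} {partition} last replicated {age} ago")
```

Sorting worst first puts the DCs or sites that lag the most at the top of the list.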
If you have DNS problems, then you really have problems. Fortunately, the ADHC tool gives you at-a-glance configuration and health data across the DNS configuration for all your DCs. The first thing we noticed was that our DNS forwarding was all over the map. We were able to realign our DNS forwarding addresses manually, and then confirm that none of the IP changes were fat-fingered by running the report a second time.
Stand and Delegate
Another big area for cleanup was zone delegations. Many of our admins weren't aware of how much manual maintenance AD DNS requires, particularly when adding or removing DCs. DCs come and go, and we had assumed that DCPROMO and dynamic registration would take care of all the details, but that's not the case.
Our forest root has zone delegations for each of the child domains. When you look at the properties of these delegations, they list all the DCs in those domains as DNS servers to answer query referrals. Stale or missing DC entries will make your DNS queries take longer as they time out trying to get replies from DCs that no longer exist. The DCDIAG-DNS and DNSLINT reports will help you identify and fix these issues.
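The delegation cleanup is really a set comparison: the name servers listed on the delegation versus the DCs that actually run DNS for the child domain today. A minimal sketch, assuming you have exported both lists by hand (the function name and the server names in the usage note are hypothetical):

```python
def audit_delegation(delegated_ns, current_dns_dcs):
    """Compare a delegation's name server list with the DCs running DNS now.

    Names are compared case-insensitively and without a trailing dot.
    Returns the stale entries to remove and the missing entries to add.
    """
    delegated = {name.lower().rstrip(".") for name in delegated_ns}
    current = {name.lower().rstrip(".") for name in current_dns_dcs}
    return {
        "stale": sorted(delegated - current),
        "missing": sorted(current - delegated),
    }
```

For example, a delegation that lists dc1 and an old dc9, checked against a domain that now runs dc1 and dc2, comes back with dc9 as stale and dc2 as missing.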
When AD first came out, I was teaching as a Microsoft Certified Trainer. There was a section in the course that touted a new feature called Distributed Link Tracking (DLT). This was supposed to keep track of shortcuts, so if someone moved a file that a shortcut pointed to, the shortcut would automatically know where to find the new location. It sounded fancy, but Microsoft didn't build out any functionality for that feature.
As a result, the DLT service created a ton of AD objects that have absolutely no value and waste database space. The Microsoft pro who does your ADHC will delete these objects for you. Then you must set a group policy option for all of your servers to disable the DLT service so it can't create any more of these objects.
This is why we had to turn on white space logging. Delete enough of these DLT objects, and there will be a ton of white space. The online defrag you see mentioned in the event log doesn't buy you much, because it doesn't shrink the database file. To truly reclaim this wasted space, you have to do an offline defrag as described in Knowledge Base article KB232122. This makes for loads of fun when you have a Friday night with nothing better to do.
So why is this important if the objects are useless anyway? We were able to shave our 1GB AD database to half its size. The actual benefit comes in replication and search time. A much lighter database will respond more quickly to queries and zing across your site links with newfound speed.
After a week of scrubbing and cleaning, our environment was just about sparkling. The Microsoft support engineer left us with a 100-page report that detailed his findings and prescribed additional cleanup tasks that we'd need to do on our own. The report was fully documented with instructions and KB references for the remaining items.
We spent the last afternoon reviewing all the procedures and looking at the difference in the ADHC reports we ran on the first day as compared to the last day. The extent of the cleanup was amazing. For the first time in a long time, I had confidence in our AD configuration. The ADHC process was very affirming for some of the good design choices we had made, and it was deeply educational. My knowledge of AD configuration and troubleshooting was doubled after watching the master at work all week.
We scheduled our ADHC for a midsummer week when the entire company shuts down for maintenance and 99 percent of the workforce is on vacation. This gave us the flexibility and convenience of cleaning up most of the environment while nobody was in the office. However, we still had to coordinate some changes, like the offline defrag, at a later date. We took about eight weeks after the engagement to plan and implement the remaining items through our normal change-control process.
The ADHC process includes a tool called the AD Topology Diagrammer that will automatically draw pictures of your environment in Visio. This makes documenting your fresh topology a snap. After your engagement is complete, the tool is yours to keep. It's licensed one-per-forest, so you'll need multiple licensed installs if you're responsible for more than one AD forest.
Following our ADHC engagement, it was time for us to do a hardware refresh on 10 DCs. A couple of them were basic DCs with DNS, but most had a combination of additional services like WINS, DHCP and FSMO roles.
We chose to build parallel DCs on new hardware, migrate the services and then decommission the old boxes. This plan worked quite smoothly, and the AD Health Check tool was our primary means of confirming the configuration before, during and after the refreshes. It caught minor configuration differences we might have forgotten to double-check during the migration.
If you have to justify the cost of the Active Directory Health Check to your superiors, just tell them that it will fix all of the AD problems you've been battling for the last year. That ought to make them sit up and take notice. The digital dust bunnies will have no place to hide in your AD forest.