Binary Exfoliation Using XML
Applies to: Word 2003, 2007, 2010
Over time, frequently reused documents and templates tend to gather 'binary dust'.
To quickly clear away the grime and 'rebuild' the document or template anew, the Microsystems Solutions Center uses a technique we refer to as "Binary Exfoliation". Learn how -- and when -- to use this process, brought to you by the File or Office Button | Save As | XML feature introduced in Word 2003.
Symptoms that indicate the need to 'exfoliate' your document or template:
Binary build-up is present if your document exhibits any of the following characteristics:
- your template or document is void of graphics, yet has ballooned in file size
- tracked changes ignite when you perform an Insert | File
- your Headers/Footers hang or misbehave when inserting a DocID
- you can't bear the thought of manually rebuilding your templates when you move to Word 2003, 2007 or 2010
- you need to exterminate missing -- yet unsubstitutable -- fonts
- Word crashes when you print the document
- you are unable to use comparison utilities
General instructions for binary exfoliation:
Using Word 2003, 2007 or 2010…
- File or Office Button | Open desired document or template
- File or Office Button | Save As | XML
- File or Office Button | Close
- File or Office Button | Open the XML file you just saved, then File or Office Button | Save As | document or template
- Save your document or template again using the DocXtools "Slow Save At End" function available from the Help tab of the Toolkit
File exfoliation, complete!
What you lose, what you retain:
For 2003 .doc and .dot files, this process round-trips everything except keyboard shortcuts, internal document versions and the "binary build-up" you want to eliminate anyway. For 2007 and 2010 .docx and .dotx file formats, this process also results in the loss of enhanced file features such as Quick Style attributes that are not available in Word 2003.
To remove or reassign unwanted fonts:
Open the resulting XML file in any plain-text editor and find the <w:fonts> tag. The list between this tag and the closing </w:fonts> tag identifies all fonts used in the document. Delete the font identifiers you don't want, or replace them with the font(s) you do. This can even help you tame those odd font choices ("Times New Roman Bold Bold") that tend to hang out around list templates.
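Before editing the font list by hand, it can help to see what it contains. The following is a minimal Python sketch, not part of the Microsystems process. It assumes each font is declared as a <w:font> element with a w:name attribute directly inside <w:fonts>, and it matches on element names only, so it does not depend on which namespace Word wrote. The function name list_fonts, the SAMPLE text and the file name exfoliated.xml are illustrative.

```python
import xml.etree.ElementTree as ET


def local(name):
    """Strip the '{namespace}' prefix ElementTree adds to tag and attribute names."""
    return name.rsplit("}", 1)[-1]


def list_fonts(root):
    """Return the w:name value of every <w:font> found directly inside a <w:fonts> element."""
    names = []
    for elem in root.iter():
        if local(elem.tag) != "fonts":
            continue
        for child in elem:
            # Skips siblings such as <w:defaultFonts>, which are not font declarations.
            if local(child.tag) != "font":
                continue
            for attr, value in child.attrib.items():
                if local(attr) == "name":
                    names.append(value)
    return names


# Hand-written sample for illustration only; a real file saved by Word is far larger.
SAMPLE = """<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:fonts>
    <w:defaultFonts w:ascii="Times New Roman"/>
    <w:font w:name="Tahoma"/>
    <w:font w:name="Times New Roman Bold Bold"/>
  </w:fonts>
</w:wordDocument>"""

if __name__ == "__main__":
    for name in list_fonts(ET.fromstring(SAMPLE)):
        print(name)
```

To run it against your own file, replace ET.fromstring(SAMPLE) with ET.parse("exfoliated.xml").getroot(). The sketch only reads the font list; the deletions and replacements themselves are still made in the text editor as described above.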
Microsystems' take on Word's XML result:
Think of the resulting XML file as "Reveal Codes on Steroids": finally, a way to peer into Word's psyche.