Importing Word Documents
What's covered?
This page covers general matters about importing Word documents into a RoboHelp project. Many of the points covered here were identified over a long period and may well have been fixed in later versions of RoboHelp. I suggest you start by reading the page for your version and come here if you hit problems not described there.
I recommend that you do not import Word documents into your carefully crafted main project. Import them into a project set up just for the purposes of the import.
- If it goes well, then import the htm files created into your main project.
- If you have problems, nothing is lost: just trash the temporary project and start again.
If you intend to create printed documentation from your project later, try it early on as problems from the import can manifest themselves there. Leave it until later and you may not make the connection.
Introduction
It is quite common for questions on the RoboHelp forum to show the poster's frustration when Word documents do not import without problems.
- The document looked good in Word so if it doesn't look right in RoboHelp, it is RoboHelp that gets the blame.
- Other times the imported document looks OK in RoboHelp, so other changes are made to the project. When something then goes wrong, RoboHelp gets the blame, while Word looks all sweet and innocent despite being at the root of the problem.
It is just too easy to blame RoboHelp. As my very good friend Rick Stone (RoboWizard) points out, you are mixing oil and water and that can be tricky!
Importing a Word document into RoboHelp can be a painless procedure if you follow some rules. Ignore them and what you see in RoboHelp may not be what you want.
Some years ago I was using the improved import features of RoboHelp 6, which carried into later versions, and I imported a number of documents that looked great in Word but were completely unacceptable in the online help. In identifying the causes, I found out more about what is behind people's comments about the import not working. The reality was less about RoboHelp's shortcomings and more about importing poorly formatted documents.
The documents had been created by someone who had put a lot of effort into making them look good and they truly were professional looking documents, but that person had not done things the right way. If they wanted an indented paragraph, they selected the paragraph and used the indent icon rather than apply a style that was indented. If they wanted different bullet symbols they applied them manually to those lists rather than use styles. On and on it went. I spent quite a time cleaning up the documents, reversing these malpractices and applying proper styles. After that the import was so clean that I only had a few minor changes to make in RoboHelp. What I found then remains valid when working in the later versions of RoboHelp.
In Word there are right ways of doing things and wrong ways, but the appearance of both is the same, so people think, wrongly, that their document is good and blame RoboHelp when the import does not work as expected.
Quite apart from such issues, there are many other factors that can affect your import. If you are not getting the results you expect, this page covers what to look at.
Reviewing and Fixing
Having imported the document(s), these are the things to look for, not necessarily in the order presented. Remember this is a list of many of the things that can cause problems; you are unlikely to encounter all of them.
Item |
Comment |
General appearance |
From a quick visual check in the WYSIWYG editor, your topics should look pretty much like they did in Word. If they don't and the styles seem to be messed up, look in Word for manual formatting that has been applied at paragraph level as opposed to character level.
Manual Formatting
What you should be aiming to have is a Word document that uses a style for each paragraph format that you want, rather than styles with variations created manually. For example, you should use Normal for most paragraphs and Normal Red for paragraphs that are to be formatted with red text, rather than selecting a paragraph and applying Red to the text. Unfortunately the latter is what most Word users will do but it is not the correct way. Any manual formatting will convert to an inline style in your online help. You should be aiming to have it convert to an HTML class, as that is controlled from your style sheet and makes mass changes much easier.
To find manually formatted paragraphs
In Word 2003, make sure the relevant check box is ticked in Tools | Options | Edit. That's the one that gives you the much longer list of styles that you see in Word 2003. At other times you might want to clear that one! I place the cursor in each paragraph in turn and make sure the style does not have a + symbol after it. For example, it should show as the plain style name rather than the style name with a + symbol after it.
In Word 2007 / 2010 display the Style Pod > Click Options > Tick Font Formatting. Word will then show manual formatting in the pod.
The + sign indicates you have taken the style and modified it for that paragraph. "Normal Red" or "Heading 1 Red" would be OK as that indicates your document contains styles with that name. The + symbol is what you are looking for. You may find that you have to repeat the process several times over in the Word document, as Word will ignore what we may regard as similar formatting because technically there is some difference.
The other way of doing this is to press Ctrl + A and then Ctrl + Spacebar, as that removes all manual formatting. You then get the document back to the required appearance using styles rather than manual formatting. Be careful with that one though, as it may make more changes than you really want. Check immediately and use the wonderful Undo feature if necessary.
It's OK to have individual words manually formatted as the import routine creates classes for those. That's useful as you can quickly change the class so that something bold in Word can be blue and bold in RoboHelp. |
Headings |
These should import with no unwanted changes to your HTML. |
Paragraphs |
Paragraphs should show in the style dropdown as either Normal or with a class that represents the style applied in Word. You should find any paragraphs with the Normal style in Word come through as standard <p> tags in HTML. If you find instances of the <p> tag being written as
Other paragraphs will have a class applied and will look something like <p class=TheNameApplied>, where the name is Indent, Bold, or whatever name you used in Word for the style or changed it to during the import. That is correct. |
Bullet Points and Numbered Lists |
From RoboHelp 6 onwards, these should import correctly so that if you place the cursor at the end of a list and press Enter, you will get a properly bulleted or numbered item. I did find the indentation was not right in one document but I think that was something unique to that document. If you have any similar problems, I would be interested to learn about them.
If you are using RoboHelp X5
They should have the gap shown in the screenshot. If they do not, then look in the code and you will probably find lots of spaces have been created, represented by &nbsp;, in which case you need to look at how many went wrong and decide whether to correct things in RoboHelp or start again. You could limit your corrections to just those areas where you will need to edit the bulleted and numbered paragraphs. For the others, as long as it looks OK you can take a view on whether it needs a clean up.
Your HTML for a bullet point should look something like this. This is what it will look like if it did not import correctly.
The problem in RoboHelp with bullet points that do not import properly comes when you try to add another bullet point. Try it and you'll see. You could cheat and copy and paste a paragraph and then change the text, but that's only OK if you are tweaking the text. With any serious volume of changes you will soon get fed up with that idea. Changing the bullet points in Word is probably quicker than cleaning up after the import. |
Custom Styles |
During the import RoboHelp will create classes and include these in the HTML. For example you may have a style in Word called Indent. Normally this will end up in the HTML as <p class=Indent> and in the style sheet RoboHelp creates, it will define this style so that it has the same appearance as in Word. It may be that after importing you are going to apply your own style sheet where the style is perhaps called Indented. Note that during the import RoboHelp tells you the style names it is going to apply and you can change them. In this case you would change it to Indented so that RoboHelp creates the HTML as <p class=Indented> and that will then work with your style sheet. |
Tables |
Tables should import cleanly using RoboHelp 6 or above. If you are using X5 or earlier then, as shown in Importing Using X5, tables look wrong in the WYSIWYG editor but they do display correctly in the output; at least they did in my tests. You can add rows and the result will be the same. Whether or not you can live with it is another matter. If necessary, roll up your sleeves and set about changing the borders. See Useful Tools if you have lots of tables to edit. |
Images |
Check for missing images. This can be caused by various things; check the following.
Word Settings - Rely on VML for displaying graphics in browsers
Over quite a long period there have been reports that when a Word document with images is imported, the images are missing in RoboHelp. The problem was that it was only an issue for some people. I got hold of a document where this happened every time for several users and sent it to Adobe. They advised that MS Word has a document-specific setting, Rely on VML for displaying graphics in browsers, which is unselected by default but was selected in the test document. If you clear that option and save the document, the images will import. You can find the Rely on VML for displaying graphics in browsers setting here: Word 2007/2010 Word 2003
Word Settings - Allow PNG as a graphic format
There is also a setting Allow PNG as a graphic format. That option will prevent any PNGs in your Word document from being converted to JPGs and imported. The PNGs will show in Project Manager but will have a cross to indicate they are missing. RoboHelp will not import any PNGs in your Word document as PNGs. Your options are either to make sure this option is not ticked and allow RoboHelp to import the images as JPGs, or to force the PNGs to show as missing by selecting this option and then manually adding them to the correct folder in Windows Explorer. (You can extract the images from the Word document by saving it as a web page, whereupon Word will save all the images to a sub-folder.)
Steve Cohen found that by changing his Word documents as above, the import worked fine but then the images did not appear in WebHelp. The Word document had spaces in the filename and when he replaced those with underscores, all was well.
Linked images v Embedded images in Word
I know that if you link to a document rather than import it, then if the document has linked images rather than embedded images, they will not be displayed in the RoboHelp output. I have yet to test if that is also a problem with imported documents.
Image alignment in Word
The following problem was reported on HATT: "The issue is that when the Word document is imported, blank spaces are left for the images and they are not included. What is curious is that he used version 7.0 to do this previously - without incident."
Claudia responded: "I recently ran into the problem importing Word 2003 docs into an earlier version of RoboHelp. What I found was that the pictures that disappeared were formatted differently than the ones that appeared. To see this formatting, open the Word doc, right-click the picture and select Format Picture. A tabbed property sheet appears. Click the Layout tab. If the picture is not set as Wrapping Style: In line with text, it will not be dealt with properly when imported into RoboHelp. My solution was to go through the Word docs I wanted to import and change each picture to In line with text."
I suspect the discrepancy between the question stating it did not occur in earlier versions and the response stating that it did could be that the documents concerned had not been formatted in the way described.
DocX Format
Katie Hanson reported that one of her tech guys suggested saving the document in DOCX format and then her problems went away. |
Bookmarks |
If after the import you see text with a grey background, the problem is the way the bookmarks were defined in Word. Instead of placing the cursor in front of some text and creating the bookmark, this problem occurs where the text has been selected and then the bookmark has been created. You will need to run the import again after doing one of two things: either clean up the bookmarks individually (try an interim import after you have changed some, just to make sure it is working) or strip out the bookmarks and recreate them later. My macros include one for this purpose, but do run it on a copy of the document, not the original. You can of course live with the grey background; it does no harm and simply means the topic does not look quite right in the RoboHelp editor. |
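Several of the checks above can be roughed out in a script rather than by eye. The following is only a sketch and not something RoboHelp provides: it assumes the imported topics are .htm files, that manual formatting shows up as a style attribute on block-level tags, that the stray spaces appear as &nbsp; entities, and that image src values are relative paths. The names ImportAudit, audit_topic, missing_images and audit_project are made up for this example.

```python
from html.parser import HTMLParser
from pathlib import Path

# Block-level tags to check for manual (inline) formatting; an assumed list.
BLOCK_TAGS = {"p", "li", "h1", "h2", "h3", "h4", "h5", "h6", "td"}


class ImportAudit(HTMLParser):
    """Collects signs of a troublesome Word import from one topic's HTML."""

    def __init__(self):
        # convert_charrefs=False so that &nbsp; reaches handle_entityref.
        super().__init__(convert_charrefs=False)
        self.inline_styled = []  # (tag, style) pairs for block tags with style="..."
        self.images = []         # src value of every <img>
        self.nbsp_count = 0      # non-breaking spaces seen in the text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in BLOCK_TAGS and attrs.get("style"):
            self.inline_styled.append((tag, attrs["style"]))
        if tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])

    def handle_entityref(self, name):
        if name == "nbsp":
            self.nbsp_count += 1

    def handle_data(self, data):
        # Count literal non-breaking space characters as well.
        self.nbsp_count += data.count("\xa0")


def audit_topic(html):
    """Return an ImportAudit filled in from the given HTML string."""
    audit = ImportAudit()
    audit.feed(html)
    audit.close()
    return audit


def missing_images(topic_path, srcs):
    """Return the src values that do not resolve to a file relative to the topic."""
    folder = Path(topic_path).parent
    return [s for s in srcs if not (folder / s).is_file()]


def audit_project(project_dir):
    """Yield (topic path, ImportAudit, missing image list) for each .htm file."""
    for topic in sorted(Path(project_dir).rglob("*.htm")):
        audit = audit_topic(topic.read_text(encoding="utf-8", errors="replace"))
        yield topic, audit, missing_images(topic, audit.images)
```

Running audit_project over the temporary import project gives a quick list of topics worth opening first. A topic with many inline-styled paragraphs usually points back to manual formatting in the Word document.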
Less obvious things that can cause problems later
Remove the meta tags that reference the source document. |
RoboHelp adds these meta tags when importing a Word document. They can cause various problems, mostly but not exclusively when creating printed documentation.
<meta name=Originator content=ImportDoc>
1] I was seeing tables with some borders twice the correct width and other borders correct. Removing the meta tags fixed that issue.
2] One poster found that the topic footers were appearing in his printed output. Many people would like to be able to do that but, ironically, this poster did not. Also the topic footers appeared at the bottom of the body of each page, not in the Word footer. |
After import you apply a different CSS but your content does not display the styles correctly. |
During the import an embedded style sheet may be created. I have found this in topics created during import and of course it overrode what was in my style sheet. <style><!-- A:visited { |
| Check the body tag | I found the language had been defined as below. I'm UK based so I removed the language reference. <body lang=EN-US> |
| Check for unwanted code. | I found this at the end of my topics. To be honest, I don't know why it was there but my hunch is that one day I'll regret leaving it in and find it is the cause of some problem. If all else fails, try removing such code. <implicit_p><b style="font-weight: bold;"><span style="font-size: 20.0pt; font-family: Verdana;"><br |
| Images may not show in Project Manager | Importing the images into a new project, I found they were in the topics but not shown in Project Manager. By looking with Windows Explorer, I found they were in a folder within the project. The folder had the name of the source document, Working_Copy_2008_Procedures, but that folder was not shown in Project Manager. I added a new folder named Working_Copy_2008_Procedures and suddenly the images appeared! |
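If many topics need the same clean-up, the first three items above (the Originator meta tag, the embedded style sheet and the language on the body tag) could be stripped with a small script. This is a minimal sketch under assumptions: it only targets a meta tag named Originator, as in the example shown, it removes every embedded style block it finds, and clean_topic is a name invented here. Try it on a copy of the project first.

```python
import re

# Matches <meta name=Originator ...>, quoted or unquoted, plus trailing whitespace.
META_ORIGINATOR = re.compile(
    r'<meta\s+name\s*=\s*["\']?Originator["\']?[^>]*>\s*', re.IGNORECASE)

# Matches a whole embedded <style>...</style> block, even across lines.
EMBEDDED_STYLE = re.compile(
    r'<style\b[^>]*>.*?</style>\s*', re.IGNORECASE | re.DOTALL)

# Matches a lang attribute inside the opening <body> tag.
BODY_LANG = re.compile(
    r'(<body\b[^>]*?)\s+lang\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+)',
    re.IGNORECASE)


def clean_topic(html, drop_embedded_styles=True, drop_body_lang=True):
    """Return the topic HTML with the import leftovers removed."""
    html = META_ORIGINATOR.sub("", html)
    if drop_embedded_styles:
        html = EMBEDDED_STYLE.sub("", html)
    if drop_body_lang:
        # Keep everything in the <body> tag except the lang attribute.
        html = BODY_LANG.sub(r"\1", html)
    return html
```

The two flags are there because not everyone will want both changes. If you are happy with the language setting, for instance, pass drop_body_lang=False.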
Using your own CSS
During the import, RoboHelp will create its own stylesheet for the document.
For some authors that style sheet will be sufficient, but if you are planning to change the topics to your own pre-existing style sheet after the import, then you should aim to match the styles. Normal in Word will map to <p> and heading styles will also map correctly. Other styles you need to test, and it may be easier to modify a class in your style sheet to match the class name that RoboHelp has applied during the import.
If you are going to use your own pre-existing style sheet (css file), make sure the names of the styles match in the Word document and the CSS. You can change the names to make them match either in Word or during the import.
If the HTML and the topics look good, you can now apply your own style sheet. You can either do that with the individual topics or you can select all the imported topics in the Topics tab, right click and then apply your style sheet to all the topics. If, after applying your own style sheet, you notice that the style dropdown in RoboHelp is displaying the style in CAPS, it means that style is not defined in the style sheet. Your HTML is saying something like <p class=YourStyleName> but that style is not defined in the css file you have just attached, so RoboHelp tells you by showing it in CAPS.
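Before attaching your own style sheet, you can get a rough idea of which classes would end up undefined by comparing the class names used in the topics with those defined in the css file. The sketch below is an illustration only: it uses simple pattern matching rather than a full HTML or CSS parser, treats class names as case-sensitive, and the function names are invented for the example.

```python
import re

# class=Name, class="Name Other" or class='Name'
CLASS_ATTR = re.compile(
    r'\bclass\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s>]+))', re.IGNORECASE)

# A class selector such as .Indent or p.Indent
CSS_CLASS = re.compile(r'\.(-?[A-Za-z_][\w-]*)')


def classes_used(html):
    """Return the set of class names applied anywhere in the HTML."""
    found = set()
    for match in CLASS_ATTR.finditer(html):
        value = match.group(1) or match.group(2) or match.group(3) or ""
        found.update(value.split())
    return found


def classes_defined(css):
    """Return the set of class names that appear in the CSS selectors."""
    css = re.sub(r'/\*.*?\*/', ' ', css, flags=re.DOTALL)  # drop comments
    selectors = re.sub(r'\{[^{}]*\}', ' ', css)            # drop declarations
    return set(CSS_CLASS.findall(selectors))


def undefined_classes(html, css):
    """Classes the topic uses that the style sheet does not define."""
    return classes_used(html) - classes_defined(css)
```

For example, a topic containing <p class=Indent> checked against a style sheet that only defines p.Indented would report Indent, which is the cue to rename the style in Word, during the import, or in the style sheet.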
Useful Tools
There are two main tools that you may find useful in this process. Macro Express and FAR. See Useful Tools and Links to find their websites.
Macro Express
At its simplest, you perform an action once, recording it as a macro. Then you repeat it as many times as you want. It's much the same as recording a macro in Word except this works with any program you use. Last time I looked it was free for one month so you can easily find out if it is for you. There are other programs around, some of them free, but in my opinion this one is far and away the best. It can do all sorts of other useful things, like remember all your standard paragraph wordings.
FAR
FAR stands for Find and Replace, which is in fact just one of the things it does. The beauty of this Find and Replace tool is that it finds strings across multiple lines (most do not and miss the string if it is not all on one line) and it works across multiple files. With careful use of this tool, you can quickly change all sorts of things that are not quite right. Bear in mind though that you can just as quickly wreck the whole project if you mess up, so take a copy of the project before you start. Again this tool is free to start with and it is my preferred tool.
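To illustrate the idea of a find and replace that works across line breaks and across many files, here is a minimal sketch. It is not how FAR works internally, just the same general technique with regular expressions. The function names are invented, and the byte handling assumes the search and replacement text are plain ASCII.

```python
import re
import shutil
from pathlib import Path


def replace_across_lines(text, pattern, replacement):
    """Return (new_text, count). DOTALL lets '.' span line breaks."""
    return re.subn(pattern, replacement, text, flags=re.DOTALL)


def replace_in_project(project_dir, pattern, replacement, glob="*.htm", dry_run=True):
    """Apply the replacement to every matching file under project_dir.

    Returns {path: number of replacements}. With dry_run=True nothing is
    written, so the counts can be checked first. When writing, a .bak copy
    of each changed file is kept beside it.
    """
    changed = {}
    for path in sorted(Path(project_dir).rglob(glob)):
        # latin-1 maps every byte to one character, so files in an unknown
        # encoding round-trip unchanged if the replacement is plain ASCII.
        text = path.read_bytes().decode("latin-1")
        new_text, count = replace_across_lines(text, pattern, replacement)
        if count:
            changed[path] = count
            if not dry_run:
                shutil.copy2(path, path.with_name(path.name + ".bak"))
                path.write_bytes(new_text.encode("latin-1"))
    return changed
```

Using \s+ in the pattern instead of a literal space is what lets a tag that has been wrapped onto two lines still match. The dry run default and the .bak copies are no substitute for taking a copy of the whole project first.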
Donations
If you find the information and tutorials on my site save you time figuring it out for yourself and help improve what you produce, please consider making a small donation.
Topic Revisions
Date | Changes to this page
19 Dec 2011 | What's Covered revised.
06 Jun 2011 | Topic extensively revised to amalgamate the old version specific topics.
14 Apr 2010 | Images not imported amended to add reference to Snippet 125.
20 Jul 2009 | Images not imported added.
04 Apr 2008 | Section on RoboHelp 7 added.
13 Apr 2007 | Topic completely revised to include RoboHelp 6 and other findings.
05 Jul 2005 | "What's covered" revised to include reference to a new topic covering the X5 import wizard. Associated revisions in text. No fundamental changes.
25 Jun 2005 | Paragraph added re the class=InLineNormal which is created during import. Also minor typo and clarification changes.
26 Feb 2005 | Most common import problems listed and topic amended to cover changes in RoboHelp X5.
14 Jan 2005 | Link added to HTML topic.
02 Nov 2004 | New topic.

