Reverse Engineering

 

 

What's covered?

Sometimes you only have the output files for WebHelp or a compiled CHM file, and nobody can locate the source files.

This topic covers how to recover from that situation.

I would like to thank Pete Lees, who reviewed the first version of this topic and made some very useful comments, not least the advice you will find under Loss Prevention.

Introduction

The first step must be a very thorough search for the source files. It is definitely worth the effort. If you have Copernic Desktop Search or Google Desktop Search installed, make sure they index the network locations you use, then search for the file types that are in every project. Enlist the help of IT in looking through backups. These instructions are for when you really have exhausted all avenues.
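If no desktop search tool is installed, a small script can do a similar sweep. The following is only a sketch: the function name `find_project_files` is made up for illustration, and the choice of `.xpj` and `.hhp` as the extensions to look for is an assumption based on the project files mentioned elsewhere in this topic.

```python
import os
from pathlib import Path


def find_project_files(roots, extensions=(".xpj", ".hhp")):
    """Walk each root folder and collect files whose extension marks a project.

    Assumption: .xpj and .hhp are the extensions worth looking for.
    Folders that do not exist or cannot be read are skipped silently,
    because os.walk ignores errors by default.
    """
    wanted = {ext.lower() for ext in extensions}
    matches = []
    for root in roots:
        for folder, _subfolders, files in os.walk(root):
            for name in files:
                if Path(name).suffix.lower() in wanted:
                    matches.append(Path(folder) / name)
    return matches
```

You would pass it the drive letters or network paths you want searched, for example `find_project_files([r"\\server\docs", r"D:\Projects"])`, where both paths are placeholders.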

WebHelp

You have two options: one is free and one you have to pay for.

The free option is Rick Stone's excellent topic on his RoboWizard site! Click here. The downside, as you will see, is that it involves a lot of manual cleaning up. If you use this method, when you have recovered your project, come back here and see the section below on Loss Prevention. Some of the tips will be relevant for WebHelp as well.

The paid option is to purchase a bundle of scripts written by Willam van Weelden. Click here. There are a number of scripts for different output types. The scripts take just a few minutes to work their magic and you will then have a project again. They also recover map IDs and the table of contents.

FlashHelp and Browser Based AIR Help

See the WebHelp section above. Scripts for these output types are included in the package on Willam's site.

CHM Help

The simplest solution is to use the scripts on Willam van Weelden's site. See the WebHelp section above. This type of output, though, is relatively easy to recover yourself by following the steps below. The basics have been covered across a number of posts on various forums. What I have done here is explain them a bit more fully and provide some options.

It is important with these methods that where you have to define the name of the CHM file, you use the name of the file you are decompiling exactly as it is. Do not, for example, change myhelp.chm to myhelp2.chm. If you change the name of the CHM file during the process, the glossary and browse sequence files will also be renamed. The necessary files will be in your folder, but the glossary and browse sequences will not show in your project.

See What will be lost? for information about context sensitive help and conditional tags. Any index words that were contained in the source html files and any Stop words will be lost.

Method 1 - Using FAR

If you have a copy of FAR, then you have the tools you need.

  1. Open FAR and click the Authoring tab.
  2. Click HH Utils.
  3. Click File | Open and browse to the CHM file.
  4. Click Decompile, select Extract All and browse to the folder to which you want to extract the files (create a new folder for this).
  5. Click OK when advised the process is complete.
  6. Now you need to create an HHP file that can be used to create a RoboHelp project. Click the Express icon back on the Authoring tab.
  7. In the first field, enter or browse to the folder used in Step 4.
  8. In the second field, enter the Title as it appears in the name of the CHM file you are decompiling.
  9. In the third field, enter the default topic. If you don't know what it is, select any file. You can change it later.
  10. In the fourth field enter the same name as the CHM file you are decompiling.
  11. Click Create Help.
  12. Double click the HHP file that has been created to open a RoboHelp project.
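To make it clearer what the HHP file from steps 6 to 11 contains, here is a minimal sketch that writes one by hand. It is not what FAR produces: FAR's file will almost certainly hold more options. The function name `build_hhp` is made up, and only the `[OPTIONS]` and `[FILES]` sections with the `Compiled file`, `Default topic` and `Title` keys from the HTML Help project format are used.

```python
def build_hhp(title, default_topic, chm_name, topic_files):
    """Return the text of a very small HTML Help project (HHP) file.

    chm_name must be the name of the CHM you decompiled, unchanged,
    for the reason given earlier in this topic.
    """
    lines = [
        "[OPTIONS]",
        f"Compiled file={chm_name}",
        f"Default topic={default_topic}",
        f"Title={title}",
        "",
        "[FILES]",
    ]
    lines.extend(topic_files)  # one extracted topic file per line
    return "\r\n".join(lines) + "\r\n"
```

Saving the returned text as, say, myhelp.hhp in the extraction folder gives you a file of the same general shape as the one you double click in step 12.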

Method 2 - Using Keytools

Pete Lees posted details of a free tool that decompiles the CHM file and creates the HHP file in one process. The tool is Key Tools and it was available from Ralph Walden's site - www.keyworks.net. That site appears to have been taken over by someone else and the official download is no longer there. If you want to download it from my site you can click here. Whilst I am not aware of any incompatibilities, it is your responsibility to ensure it works satisfactorily on your operating system.

  1. Open Keytools and select the decompile option.
  2. Enter or browse to the two folders requested.
  3. Click OK.
  4. When the process is complete, close Keytools and open Windows Explorer. Browse to the HHP file and double click it. A new RoboHelp project will be created.
  5. You will need to check for broken links and generally make sure the project is as you want it.

Method 3 - Using HTML Help Studio

HTML Help Studio can be found on the Tools tab of RoboHelp and the method is described on the Adobe site. Click here. You can either point HTML Help Studio to the CHM or you can right click the CHM and select Convert to Source.

You will also need to download the HHP builder. Click here.

Method 4 - The Man's Way

No pretty interface. Just enter this into a command prompt window, amended as necessary.

hh.exe -decompile <target_directory> <path>\<filename>.chm

I have not tried this method but if Pete Lees says it works, it works.
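If you have several CHM files to work through, the same command can be run from a script. This is a sketch only, assuming Windows with hh.exe on the path. The function names are made up for illustration, and the argument order simply mirrors the command shown above.

```python
import subprocess
from pathlib import Path


def build_decompile_command(chm_path, target_directory):
    """Build the argument list: hh.exe -decompile <target_directory> <chm file>."""
    return ["hh.exe", "-decompile", str(target_directory), str(chm_path)]


def decompile_chm(chm_path, target_directory):
    """Create the target folder if needed, then run the decompile command.

    Windows only. Check the target folder afterwards to confirm the files
    were extracted rather than relying on the exit code alone.
    """
    Path(target_directory).mkdir(parents=True, exist_ok=True)
    command = build_decompile_command(chm_path, target_directory)
    return subprocess.run(command, check=False).returncode
```

As with the other methods, pass the CHM under its original name, for example `decompile_chm(r"C:\help\myhelp.chm", r"C:\help\extracted")`, where both paths are placeholders.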

What will be lost?

Conditional Tags

Conditional Build Tags will be lost but any text that had a conditional build tag applied will, surprisingly, still be in the topics although not displayed. Use a multi file find and replace tool and search on condition: to find the names of the tags. Then create new tags with exactly the same names as those you find. Right click them and you will see the associated topics listed.
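If you do not have a multi file find and replace tool to hand, the search on condition: can be scripted. The sketch below rests on an assumption about the markup: that the tag name follows condition: and ends at a semicolon, quote or angle bracket. Check a few of your own topics first and adjust the pattern if yours look different. The function names are made up.

```python
import re
from pathlib import Path

# Assumption: the tag name follows "condition:" and stops at ; " ' < or >
CONDITION_PATTERN = re.compile(r"""condition:\s*([^;"'<>]+)""", re.IGNORECASE)


def find_condition_names(text):
    """Return the set of tag names found after 'condition:' in one topic."""
    return {name.strip() for name in CONDITION_PATTERN.findall(text) if name.strip()}


def scan_condition_tags(folder):
    """Map each tag name to the topic files (.htm or .html) that mention it."""
    found = {}
    for path in sorted(Path(folder).rglob("*.htm*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for name in find_condition_names(text):
            found.setdefault(name, []).append(path.name)
    return found
```

The keys of the returned dictionary are the names to use when you recreate the tags in the project.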

Context Sensitive Help

The ability to retrieve this information from a CHM was lost in RoboHelp 6 if you follow Methods 1 - 4 above. However, you should check with your developers as they should have a copy of the file that they can give you. Willam van Weelden's scripts do recover this information from a CHM, as well as the other outputs covered above.

Index Words and Stop Words

Any index words that were contained in the source html files and any Stop words will also be lost.

Index and TOC Lost

If you followed the instructions above but have still lost the index and table of contents, all is not lost. That is exactly what I found with a file that had been created using AuthorIt. Pete Lees identified the reason and recovering them is quite straightforward.

  1. Compile the help as extracted, that is without the TOC and index.
  2. Close RoboHelp and go to the root folder of your project. You will find two hhc files and two hhk files.
  3. Check the name of your xpj file. Let's say it is yourproject.xpj.
  4. Look for yourproject.hhc and yourproject.hhk and delete those files.
  5. Locate toc.hhc and index.hhk and rename them as yourproject.hhc and yourproject.hhk.
  6. Reopen RoboHelp and you should find the TOC and index restored.
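Steps 4 and 5 above can be sketched as a short script, to be run with RoboHelp closed and ideally on a copy of the project folder. The function name `restore_toc_and_index` is made up. It uses `Path.replace`, which overwrites the existing file, so the delete and the rename happen in one move.

```python
from pathlib import Path


def restore_toc_and_index(project_folder, project_name):
    """Replace <project>.hhc and <project>.hhk with toc.hhc and index.hhk.

    Returns the list of files that were restored. A pair is skipped if
    the extracted file (toc.hhc or index.hhk) is not in the folder.
    """
    folder = Path(project_folder)
    restored = []
    for extracted_name, extension in (("toc.hhc", ".hhc"), ("index.hhk", ".hhk")):
        source = folder / extracted_name
        target = folder / f"{project_name}{extension}"
        if not source.exists():
            continue
        source.replace(target)  # overwrites the empty file RoboHelp created
        restored.append(target)
    return restored
```

For the example in step 3 you would call `restore_toc_and_index(r"C:\projects\yourproject", "yourproject")`, where the folder path is a placeholder.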

Loss Prevention

This section came about thanks to a brilliant suggestion from Pete Lees.

The previous section covers what most people will lose if they have to recover their source files from a CHM output. If you've been through that loss, then you may want to be better protected if the situation arises again.

I have listed below the various things that will be lost and the file that contains that information in a RoboHelp project. Pete's suggestion is that those files are added to baggage so that the procedures above will recover them.

| Data | How to protect yourself |
| --- | --- |
| Stop List | Add yourproject.stp to the baggage files. |
| Map IDs / HHP file | If you add the .h file(s) to the baggage files that may cause them to show up in the results of a full-text search. You could make copies of the .h files and change the file name extensions to something else (say, .xh), and then add these copies to the project baggage or you could zip up the files and add the .zip file to the baggage. However that is a manual task so it is prone to error. |
| KLink and ALink keywords (that is, keywords embedded in the HTML source of topics) | KLink and ALink keywords are keywords that you have embedded in the HTML source of topics, rather than in the hhk file. The way to protect yourself against losing these keywords is not to use them! It's a trade off between the reason for using these keywords (they are automatically included in any project to which you may copy these topics) and the downside if the source files are lost. That's a choice you have to make. |
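Copying the map ID files under a different extension, as suggested for the .h files in this section, is less error prone if a script does it. This sketch assumes the .h files sit in the root of the project folder. The function name `copy_map_files` is made up, and .xh is just the example extension used in this topic. Adding the copies to baggage is still something you do in RoboHelp.

```python
import shutil
from pathlib import Path


def copy_map_files(project_folder, new_extension=".xh"):
    """Copy every .h file in the folder to a file with the new extension.

    The originals are left untouched. Returns the list of copies made.
    """
    copies = []
    for header in sorted(Path(project_folder).glob("*.h")):
        copy = header.with_suffix(new_extension)
        shutil.copy2(header, copy)  # copy2 also keeps the file timestamps
        copies.append(copy)
    return copies
```

Run it again whenever the map files change so that the copies in baggage do not go stale.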

Topic Revisions

| Date | Changes to this page |
| --- | --- |
| 20 Feb 2017 | Topic reviewed. Changed to show new location of Willam van Weelden's scripts. |
| 14 Sep 2013 | WebHelp section revised to cover Willam van Weelden's scripts. CSH section revised as not relevant from Rh6 onwards. |
| 12 Apr 2013 | Context Sensitive Help section revised. |
| 10 Jun 2011 | Key Tools download added. See Method 2. |
| 02 Dec 2007 | Reference to Convert to Source added to Method 3. |
| 09 Sep 2006 | Topic extensively revised to cover how Glossary, Browse Sequences, Conditional Tags, and Map IDs can be recovered. |
| 12 Feb 2005 | HHP file added to Map IDs row in table under Loss Prevention. |
| 04 Oct 2005 | Method 4 added. What Will Be Lost? revised. Loss Prevention added. |
| 03 Oct 2005 | New topic. |