Web Hosting Forums

This is a discussion on Word to HTML in the Hosting Talk & Chit-chat forum

  1. #1
    Loyal Client
    Join Date
    Aug 2001
    Posts
    8

    Word to HTML

    For work, we have to convert huge Word files to HTML, and Word does such a messy job of it.

    Does anyone know of a GOOD tool to do this?

    d

  2. #2
    Loyal Client
    Join Date
    Jul 2001
    Posts
    84
    Dreamweaver has an automated command to strip a lot of the Word junk from imported Word documents...

    Although not perfect, it's better than a kick in the teeth!

  3. #3
    Loyal Client
    Join Date
    Aug 2001
    Posts
    8
    Yeah, they're close but no cigar.

    I guess we need it to be close to perfect, but perhaps that's unrealistic.

    I downloaded a plethora of utils that convert .doc to .html, or do it via .rtf, but they all suffer from the same problems with tabbing etc.

    I don't know why Microsoft's own 'Save As' feature cannot give me adequate code. Forgetting for the moment the sheer ugliness of the HTML generated, Word can display the document perfectly in RTF and DOC formats, so it should be able to save perfectly for HTML (perfect output, not perfect code).

    Oh well, looks like a manual job (400 Word files, all with bloody tabbing).

  4. #4
    hell no, we won't go!
    Join Date
    Sep 2002
    Posts
    1,093
    You're forgetting that MS also makes FrontPage!!
    - Colin

    I like food.

  5. #5
    Loyal Client
    Join Date
    Aug 2001
    Posts
    8
    Aaahh.. Now if only it were that simple!!!

    FrontPage could load the files and I could save them, but the tabbing was out again.

    No biggie, I'll do it the old-fashioned way...

  6. #6
    Loyal Client
    Join Date
    Jul 2001
    Posts
    48
    If Word tab formatting is the only problem you're ending up with, here's what I'd try:

    1. Before converting the Word file to HTML, I'd do a search and replace of every tab character in the .doc file to five of something -- replace ^t with ***** or %%%%% or five of any characters that the document would not otherwise have five of together.

    2. Do the conversion to HTML.

    3. Take the HTML file and do a search and replace (either in a text editor like NoteTab Lite or by making a simple Perl script to run the file through) of the ***** string (or whatever five-character string you chose) to

         &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;

    (that's ampersand nbsp semicolon five times, the HTML code for five non-breaking spaces), turning all the old Word tabs into five non-breaking HTML spaces.

    Don't know if that would solve the problem, but if it helps, it seems better than doing it all manually.
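
    For step 3, here's a rough sketch of the search and replace as a small script. It's in Python rather than Perl, and the names in it (PLACEHOLDER, restore_tabs, convert_file) are made up for illustration. It assumes you used ***** as the marker in step 1 and that the converted files can be read as UTF-8; Word's HTML export often uses a different encoding, so check before running it over 400 files.

```python
from pathlib import Path

# Assumption: the five-character marker typed into Word's search and replace in step 1.
PLACEHOLDER = "*****"
# Five non-breaking spaces, written as HTML entities.
NBSP_RUN = "&nbsp;" * 5


def restore_tabs(html: str, placeholder: str = PLACEHOLDER) -> str:
    """Swap each placeholder left over from step 1 for five non-breaking spaces."""
    return html.replace(placeholder, NBSP_RUN)


def convert_file(path: str, encoding: str = "utf-8") -> None:
    """Rewrite one converted HTML file in place (the encoding is an assumption)."""
    p = Path(path)
    p.write_text(restore_tabs(p.read_text(encoding=encoding)), encoding=encoding)
```

    To do a whole folder, something like `for f in Path("converted").glob("*.html"): convert_file(str(f))` would work, where "converted" is whatever folder holds your output. Try it on copies first.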

    Sharon

  7. #7
    Loyal Client
    Join Date
    Aug 2001
    Posts
    8
    That's a great idea.

    Obvious when you think of it (although I never thought of it).

    cheers

    D
