Clean up bad html stops file import
I am trying to migrate a large, 20-year-old Plone site to WordPress. I mirrored http://www.redwatch.org.au with HTTrack and am importing it with HTML Import 2 into WP 4.6.29 with PHP 5.6.40 on XAMPP. I have tidied up the HTTrack files and the plugin is working fine with one exception: if I choose the "clean up bad HTML" option, I lose the text in all the old Plone directories, and the plugin does not recognise and import images that are not embedded in articles, or other files such as PDFs. The problem may only occur on old Plone sites because of how Plone handles files.
There are lots of old posts from MS Word that I would ideally like to clean up. Has anyone else had this problem and found a solution?
I have tried adding various tags and attributes to the settings that relate to the code the plugin is deleting, but without success. In the case of the files being lost, the <div class="entry-content"> section is being removed, which contains the relevant img src and other relevant information. src is listed as an allowed attribute, but it still disappears.
If it is not possible to fix this, does anyone know of a plugin that will tidy up MS Word tags in an imported website?
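To show the kind of clean-up I mean, here is a rough Python sketch that would run over a copy of the HTTrack files before import, instead of relying on the plugin's clean-up option. The function name and the patterns are my own assumptions and have nothing to do with HTML Import 2 itself. It is untested against the real site and only targets the most common Word leftovers.

```python
import re

# Hypothetical helper (not part of HTML Import 2): strips common MS Word
# markup from an HTML string and leaves everything else untouched.

# Word conditional comments, e.g. <!--[if gte mso 9]> ... <![endif]-->
WORD_COMMENT = re.compile(r"<!--\[if.*?<!\[endif\]-->", re.DOTALL | re.IGNORECASE)
# Office namespace tags such as <o:p>, </o:p>, <w:...>, <v:...>, <st1:...>
WORD_NS_TAG = re.compile(r"</?(?:o|w|v|st1):[^>]*>", re.IGNORECASE)
# class attributes whose value starts with "Mso" (MsoNormal, MsoListParagraph, ...)
MSO_CLASS = re.compile(r"\s+class=(\"Mso[^\"]*\"|'Mso[^']*'|Mso\w+)", re.IGNORECASE)
# style attributes containing any mso- property; note that this drops the
# whole style attribute, including any non-Word properties in it
MSO_STYLE = re.compile(r"\s+style=\"[^\"]*mso-[^\"]*\"", re.IGNORECASE)


def strip_word_markup(html):
    """Return html with Word-specific comments, tags and attributes removed."""
    html = WORD_COMMENT.sub("", html)
    html = WORD_NS_TAG.sub("", html)
    html = MSO_CLASS.sub("", html)
    html = MSO_STYLE.sub("", html)
    return html


sample = (
    '<div class="entry-content">'
    '<p class="MsoNormal" style="mso-margin-top-alt:auto">Hello<o:p></o:p></p>'
    '<img src="a.jpg"></div>'
)
cleaned = strip_word_markup(sample)
print(cleaned)
```

The idea is that the entry-content div and the img src stay in place, so the import could then be run with the "clean up bad HTML" option switched off. Regular expressions are a blunt tool for HTML, so this would need checking on a few pages first.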
Thanks in advance for any help with this old plugin.