Failed to translate HTML file with MS Document Translator #156

@laf-1226

We translated some HTML files with MS Document Translator. All of the HTML files are well-formed, but translation of some of them failed with the error message: Error while processing document: xxxx.html Object reference not set to an instance of an object.

Here is an example file that failed to be translated.
The sample file is a little complicated, so we created a small one to reproduce the error.

The small file contains a list like `<ul><li>xxxxxx</li></ul>`. If the list is really big (for example, text length > 5000 chars), the file fails with MS Document Translator. However, if we put some text between `<ul>` and `<li>`, the file is translated successfully.

We don't know why, but we are wondering whether something is wrong with the code below:

        private static void AddNodes(HtmlNode rootnode, ref List<HtmlNode> nodes)
        {
            string[] DNTList = { "script", "#text", "code", "col", "colgroup", "embed", "em", "#comment", "image", "map", "media", "meta", "source", "xml"};  //DNT - Do Not Translate - these nodes are skipped.
            HtmlNode child = rootnode;
            while (child != rootnode.LastChild)
            {
                if (!DNTList.Contains(child.Name.ToLowerInvariant())) {
                    if (child.InnerHtml.Length > maxRequestSize)
                    {
                        AddNodes(child.FirstChild, ref nodes);
                    }
                    else
                    {
                        if (child.InnerHtml.Trim().Length != 0) nodes.Add(child);
                    }
                }
                child = child.NextSibling;
            }
        }
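
One possible reading of the failure, offered as a guess rather than a confirmed diagnosis: the loop compares `child` against `rootnode.LastChild`, but `child` walks the siblings of `rootnode`, not its children. If `rootnode` has no children (for example, a whitespace text node), `LastChild` is null and the loop stops cleanly at the end of the sibling chain. If `rootnode` has children of its own, the condition never becomes false, `child` eventually becomes null, and `child.Name` throws. A minimal null-safe sketch that iterates the parent's `ChildNodes` instead is shown here. The class name `NodeCollector`, the method name `AddChildNodes`, and the `MaxRequestSize` value are assumptions for illustration, not the tool's actual code.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class NodeCollector
{
    // Assumption: placeholder limit. The real maxRequestSize is defined elsewhere in the tool.
    private const int MaxRequestSize = 5000;

    // DNT - Do Not Translate - these nodes are skipped (same list as in the code above).
    private static readonly HashSet<string> DntList =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "script", "#text", "code", "col", "colgroup", "embed", "em",
            "#comment", "image", "map", "media", "meta", "source", "xml"
        };

    // Collects the translatable descendants of 'parent' into 'nodes'.
    // Nodes whose InnerHtml exceeds the size limit are split by recursing into them.
    public static void AddChildNodes(HtmlNode parent, List<HtmlNode> nodes)
    {
        if (parent == null) return;

        // Iterating ChildNodes never yields null, so there is no sibling walk
        // that can run past the last node.
        foreach (HtmlNode child in parent.ChildNodes)
        {
            if (DntList.Contains(child.Name)) continue;

            if (child.InnerHtml.Length > MaxRequestSize)
            {
                AddChildNodes(child, nodes);   // too large: descend one level
            }
            else if (child.InnerHtml.Trim().Length != 0)
            {
                nodes.Add(child);              // small enough and not empty
            }
        }
    }
}
```

With this shape, a caller would pass the oversized parent itself (for example `AddChildNodes(body, nodes)`) rather than its `FirstChild`. Like the original, this sketch skips `#text` nodes, so text sitting directly inside an oversized element is still not collected.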
    

Sorry that I failed to upload the sample files, either in HTML or in .docx format.
Has anyone run into a similar issue with MS Document Translator? Does anyone know how to fix it? Many thanks!
