Script files with UTF-8 BOM aren't executed #436

@MarioLiebisch

While ChaiScript supports UTF-8 sequences in strings, files encoded in UTF-8 with a BOM at the start seem to fail silently. This drove me nuts until I actually stepped through the code.

  • If the user-provided input is a file (i.e. passed as a parameter to use() or eval_file()), the BOM should be skipped.
  • If the user-provided input is a string, the BOM should probably trigger "Unparsed Input".
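
To make the two cases above concrete, here is a minimal sketch of how the BOM could be detected and skipped. The helpers has_utf8_bom() and strip_utf8_bom() are hypothetical names made up for illustration; they are not part of ChaiScript, and where such a check would actually belong in the parser is an open question.

```cpp
#include <string>

// Hypothetical helper (not a ChaiScript API): true if the input starts with
// the three-byte UTF-8 byte order mark EF BB BF.
inline bool has_utf8_bom(const std::string &input) {
  // compare() clamps the length, so inputs shorter than 3 bytes never match.
  return input.compare(0, 3, "\xEF\xBB\xBF") == 0;
}

// Hypothetical helper (not a ChaiScript API): returns a copy of the input
// with a leading UTF-8 BOM removed; other input is returned unchanged.
inline std::string strip_utf8_bom(const std::string &input) {
  return has_utf8_bom(input) ? input.substr(3) : input;
}
```

Under this sketch, the file path would run the file contents through strip_utf8_bom() before parsing, while the string path could use has_utf8_bom() to decide whether to report "Unparsed Input". Until something like this exists in the library, calling such a helper on the script text before chai.eval() might serve as a workaround.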

As of right now (current master branch), Statements() does not match the first byte of the BOM to any known statement (which is correct, for obvious reasons) and therefore just returns false, which is then interpreted as a no-op.

Minimal example:

#include <iostream>
#include "chaiscript/chaiscript.hpp"

int main() {
  chaiscript::ChaiScript chai;

  chai.add(chaiscript::fun([](const std::string &text) { std::cout << text; }), "print");

  // The following expression is skipped rather than
  // causing an "Unparsed Input" exception or skipping
  // the BOM (in case of a file).
  chai.eval("\xef\xbb\xbfprint(\"Hello World!\");");

  return 0;
}
