File.stream is delivering the byte order marker when reading lines #5695

@mlankenau

Environment

  • Elixir & Erlang versions (elixir --version): elixir 1.4.0, Erlang 19
  • Operating system: OSX 10.12.1

Current behavior

I am opening a UTF-8 file with File.stream! using the default setting (reading line by line), like this:

File.stream!("my_file.txt")
|> Enum.map(...)

Matching the first line against an expected value fails. It took me quite a while to find out that the first line starts with 3 extra bytes (the UTF-8 BOM, https://de.wikipedia.org/wiki/Byte_Order_Mark).

I worked around it in my project by adding the BOM to the expectation:

<<239, 187, 191>> <> "HEADER" = line

This is brittle (the match breaks when the BOM changes, for example if the file is saved without one) and confusing.

Expected behavior

When :line mode is used, or when another option such as :strip_bom is given (as Jose suggested at elixirforum), the BOM should be skipped.
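
For illustration, the requested skipping could be approximated in user code. This is only a minimal sketch, not how File.stream! works or would implement it. The module name BomSketch and both function names are hypothetical, and it assumes the file is UTF-8, so only the three-byte UTF-8 BOM is handled.

```elixir
defmodule BomSketch do
  # Hypothetical helper module, not part of Elixir's standard library.

  # Drops a leading UTF-8 BOM (bytes 0xEF 0xBB 0xBF) from a binary, if present.
  def strip_bom(<<0xEF, 0xBB, 0xBF, rest::binary>>), do: rest
  def strip_bom(line) when is_binary(line), do: line

  # Lazily applies strip_bom/1 to the first element of an enumerable of
  # lines and passes every other element through unchanged.
  def strip_bom_from_first(lines) do
    lines
    |> Stream.with_index()
    |> Stream.map(fn
      {line, 0} -> strip_bom(line)
      {line, _index} -> line
    end)
  end
end
```

With this in place, the pipeline from above would become File.stream!("my_file.txt") |> BomSketch.strip_bom_from_first() |> Enum.map(...), and the first line could be matched against "HEADER" directly, whether or not the file starts with a BOM.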
