Encoding#308

Merged
lsf37 merged 6 commits into master from encoding on Sep 16, 2018
@lsf37 lsf37 commented Sep 16, 2018

Adding an encoding option for reading lexer specs. Addresses issue #164.

@lsf37 lsf37 merged commit beef325 into master Sep 16, 2018
@lsf37 lsf37 deleted the encoding branch September 16, 2018 01:46
@regisd regisd added this to the 1.7.0 milestone Sep 16, 2018
@regisd regisd added the enhancement Feature requests label Sep 16, 2018
@regisd regisd left a comment

Sorry, I didn't have time to review before you submitted.

* If true, dot (.) metachar matches [^\n] instead of [^\r\n\u000B\u000C\u0085\u2028\u2029]|"\r\n"
*/
public static boolean legacy_dot;
/** The encoding to use for input files. */

Member

What about the output?

Member Author

You are right, it'd be better to apply this encoding to both input and output, so that things are consistent. I don't think it's worth having two separate options: if somebody wants the output file in a different encoding from the one specified, they can still recode it, as long as it's clear what the output encoding was.
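
As a rough illustration of the "one encoding for both sides" idea, the sketch below decodes input bytes and encodes the output with the same `Charset`. It is not the code from this pull request: the class `SharedEncoding` and the method `generate` are hypothetical names, and the "generator" simply copies characters through so that the encoding handling is the only thing on display.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SharedEncoding {
  /**
   * Hypothetical stand-in for a generator run: decodes the spec bytes and
   * encodes the result using the SAME charset for reading and writing.
   */
  static byte[] generate(byte[] specBytes, Charset encoding) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (Reader reader = new InputStreamReader(new ByteArrayInputStream(specBytes), encoding);
         Writer writer = new OutputStreamWriter(out, encoding)) {
      char[] buffer = new char[4096];
      int n;
      // Copy characters through unchanged; a real generator would transform them.
      while ((n = reader.read(buffer)) != -1) {
        writer.write(buffer, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // The writer has been closed (and therefore flushed) at this point.
    return out.toByteArray();
  }

  public static void main(String[] args) {
    byte[] spec = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
    byte[] result = generate(spec, StandardCharsets.ISO_8859_1);
    System.out.println(result.length + " bytes written");
  }
}
```

Because both streams share one charset, a reader of the output only needs to know the single encoding that was passed in, which matches the point above about recoding afterwards if a different output encoding is wanted.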

encoding = Charset.forName(encodingName);
} else {
Out.error(ErrorMessages.CHARSET_NOT_SUPPORTED, encodingName);
throw new GeneratorException();

Member

We could default to the platform encoding or UTF-8 here.

Member Author

I think it's better to fail if the user provided a specific encoding and it's not available. Otherwise it'd be too easy to overlook and get wrong results.
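
The fail-fast behaviour argued for here can be sketched in isolation as follows. This is only an approximation of the excerpt under review: the class `EncodingOption` and the method `resolveEncoding` are hypothetical, and a plain `IllegalArgumentException` stands in for the `Out.error(...)` call and `GeneratorException` seen in the diff.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

public class EncodingOption {
  /**
   * Hypothetical helper: resolves a user-supplied encoding name, and fails
   * loudly instead of silently falling back to a default charset.
   */
  static Charset resolveEncoding(String encodingName) {
    try {
      if (Charset.isSupported(encodingName)) {
        return Charset.forName(encodingName);
      }
    } catch (IllegalCharsetNameException e) {
      // A syntactically illegal name is treated the same as an unsupported one.
    }
    throw new IllegalArgumentException("Charset not supported: " + encodingName);
  }

  public static void main(String[] args) {
    System.out.println(resolveEncoding("UTF-8"));
    try {
      resolveEncoding("no-such-charset");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

A fallback to the platform encoding or UTF-8 would instead return a default in the failure branch; the trade-off discussed above is that such a fallback can go unnoticed and produce wrongly decoded input.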

lsf37 pushed a commit that referenced this pull request Sep 20, 2018
@lsf37 lsf37 mentioned this pull request Sep 20, 2018
lsf37 pushed a commit that referenced this pull request Sep 21, 2018