Skip to content

Add option to format with Unicode syntax.#206

Merged
brandonchinn178 merged 8 commits intofourmolu:mainfrom
kindaro:add-unicode-preference
Nov 8, 2022
Merged

Add option to format with Unicode syntax.#206
brandonchinn178 merged 8 commits intofourmolu:mainfrom
kindaro:add-unicode-preference

Conversation

@kindaro
Copy link
Copy Markdown
Contributor

@kindaro kindaro commented Jun 18, 2022

Resolves #134

Unicode looks delicious (with an appropriate font), but there is no input method for it out of the box. Therefore, most people avoid it. Now we do not need an input method — it will input itself, solving the problem for good.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Jun 18, 2022

👋 @kindaro
Thank you for raising your pull request.
Please make sure you have followed our contributing guidelines. We will review it as soon as possible!

Reviewer: Please verify the following things have been done, if applicable.

  • CHANGELOG.md has been updated
  • Configuration docs in README.md have been updated
  • fourmolu.yaml updated to stay in sync with config in README.md
  • Tests have been added

@brandonchinn178
Copy link
Copy Markdown
Collaborator

Thanks! Can you make the code and config a bit less opinionated? A simple bool for the config, and rename the args to whenUnicodeOtherwise...

@georgefst
Copy link
Copy Markdown
Collaborator

This is cool, and I'm interested in trying it on some of my smaller personal projects. But I agree with @brandonchinn178 that more neutral language would be preferable, seeing as the current wording doesn't necessarily reflect all maintainers' views on unicode syntax.

@kindaro kindaro force-pushed the add-unicode-preference branch from b6d41bf to 938f118 Compare June 19, 2022 11:36
@kindaro
Copy link
Copy Markdown
Contributor Author

kindaro commented Jun 19, 2022

Like so?

@brandonchinn178
Copy link
Copy Markdown
Collaborator

Perfect, thanks! Can you follow the instructions in CONTRIBUTING.md to update docs + tests for your new option?

@kindaro
Copy link
Copy Markdown
Contributor Author

kindaro commented Jun 19, 2022

@brandonchinn178   There is no such file in this repository. Can you link me to it? Better even, tell me what you want me to do.

@brandonchinn178
Copy link
Copy Markdown
Collaborator

Sorry, DEVELOPER.md

@georgefst
Copy link
Copy Markdown
Collaborator

I think this might actually work best as a ternary option, with the values always, never and detect. Where detect will output unicode if and only if the UnicodeSyntax extension is enabled, similarly to what we do with ImportQualifiedPost. I feel detect would be the sensible default.

@georgefst
Copy link
Copy Markdown
Collaborator

Is there any particular reason why this doesn't cover all unicode syntax e.g. forall/``∀`?

I don't mind if the constructs currently implemented are the only ones you care about! But if so, we should open a follow-up issue tracking the fact that we should implement the rest.

@kindaro
Copy link
Copy Markdown
Contributor Author

kindaro commented Jun 20, 2022

  • Change the metaHelp text.
  • Detect the UnicodeSyntax extension.
  • Add missing Unicode symbols.

@kindaro
Copy link
Copy Markdown
Contributor Author

kindaro commented Jun 20, 2022

So, as George says, ImportQualifiedPost is automatic. And it does not have any command line flags to it. If the extension is enabled, the syntax is used — easy!

Why then have a command line option for UnicodeSyntax? Maybe we should hard-wire it in the same way as ImportQualifiedPost? You want it — you switch the extension on (you have to anyway, lest parse errors befall you). You want none of it — you switch the extension off.

@brandonchinn178
Copy link
Copy Markdown
Collaborator

That makes sense to me! Have you checked to see if this is something Ormolu wants? If so, it would be better to make the change upstream instead.

@georgefst
Copy link
Copy Markdown
Collaborator

I see little harm in it being configurable. But I'm not that bothered either way. If someone has turned on the extension, it does seem odd for them not to want unicode output. Then again, what if UnicodeSyntax is added to GHC202*?

And I agree that if we're going to go with the non-configurable version, then it's worth at least mentioning upstream.

@brandonchinn178
Copy link
Copy Markdown
Collaborator

I see little harm in it being configurable.

That's fair. I think this is something Ormolu would be interested in anyway, so I think we should wait and see how that conversation pans out. If they don't want it, we can merge it here. If they do, we could think about making an option to forcibly turn it on.

@kindaro
Copy link
Copy Markdown
Contributor Author

kindaro commented Jun 23, 2022

They do not want it, so we are back here.

I say there should only be two options: «automatic» and «never». If you try to add Unicode syntax to a module where the extension is not enabled, there will be a parse error, so it is pointless to add an option «always». Agreeable?

@georgefst
Copy link
Copy Markdown
Collaborator

If you try to add Unicode syntax to a module where the extension is not enabled, there will be a parse error, so it is pointless to add an option «always».

My concern was that a user's environment might be set up such that Fourmolu doesn't know the extension is active. But in that case, I guess they can just pass the extension to Fourmolu explicitly.

@brandonchinn178
Copy link
Copy Markdown
Collaborator

Personally, I dont think we need any option. If someone wants to disable it for a particular module, they can turn off the extension with NoUnicodeSyntax

@georgefst
Copy link
Copy Markdown
Collaborator

I have a weak preference towards the ternary option, but I don't want to bikeshed. I'm happy with no option if that's what others prefer. We can always add options later if someone has a use case.

@kindaro
Copy link
Copy Markdown
Contributor Author

kindaro commented Jun 25, 2022

  1. It makes sense that fourmolu should behave exactly as ormolu with some «compatibility» configuration But ormolu declined to add this feature. Therefore, we must have at least a setting to switch this feature off.
  2. The extension UnicodeSyntax can be supplied to fourmolu with a specialized option, which will result in this feature being switched on for all modules. Therefore, the option to switch this feature on is redundant.

I shall add settings «never» and «automatic». Hopefully soon.

@brandonchinn178 brandonchinn178 added the new config option Adds or would add new configuration option label Aug 5, 2022
@kindaro
Copy link
Copy Markdown
Contributor Author

kindaro commented Aug 12, 2022

Another argument for giving a flag to switch the extension off: currently, if you enable ImportQualifiedPost, fourmolu will format all your imports that way — then, if you remove the extension, it could not parse the code anymore! So, there is no way back, even though in principle fourmolu can as easily format the imports back.

By extension, we may want to set the UnicodeSyntax extension on with an appropriate option to fourmolu and yet ask it to write ASCII syntax. So, we cannot say that the option «always» is not needed either. Enabling the extension is different from setting the option to «always».

@kindaro kindaro force-pushed the add-unicode-preference branch from 3763321 to 1d6405f Compare August 12, 2022 16:11
@kindaro
Copy link
Copy Markdown
Contributor Author

kindaro commented Aug 12, 2022

Hey I have been busy but finally — rebased, added a three way option, and automatic detection of the extension. Ready for review!

@kindaro kindaro requested review from dpwiz and georgefst August 12, 2022 16:12
Copy link
Copy Markdown
Collaborator

@brandonchinn178 brandonchinn178 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks! Please make sure to do everything in this list: https://github.com/fourmolu/fourmolu/blob/main/DEVELOPER.md#adding-a-new-configuration-option

@brandonchinn178
Copy link
Copy Markdown
Collaborator

brandonchinn178 commented Aug 12, 2022

Also, it looks like there are test failures. The biggest one being that fourmolu.yaml should be updated to include unicode: never (being that fourmolu.yaml represents the configuration to match ormolu's style)

@kindaro
Copy link
Copy Markdown
Contributor Author

kindaro commented Aug 13, 2022

@brandonchinn178   Done!

@kindaro kindaro force-pushed the add-unicode-preference branch 2 times, most recently from 4114918 to e3c7144 Compare August 13, 2022 19:06
Copy link
Copy Markdown
Collaborator

@brandonchinn178 brandonchinn178 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like there are still some test failures

_ == _ = False

add1 ∷ Quote m ⇒ Int → m Exp
add1 x = [|x + 1⟧
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the opening quasiquote here be unicode?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it should. I shall investigate.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, when I make the left bracket into Unicode, idempotence breaks. It will take some time to get to the bottom of this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is in Ormolu as well — the parser parses as [e| and then we must print it as [e| as well. Issue to Ormolu: tweag/ormolu#934.

Copy link
Copy Markdown
Contributor Author

@kindaro kindaro Nov 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ormolu fixed their problem. Once it it merged to fourmolu, I can tweak the code so that ⟦x⟧ is handled correctly. Ideally I should like to have this pull request merged first — I had enough of rebasing it over and over.

@kindaro
Copy link
Copy Markdown
Contributor Author

kindaro commented Aug 13, 2022

Seems like there are still some test failures

Yes, I changed some white space in input.hs and forgot to refresh the output files because I thought it should not make difference but turns out it does.

It is the small things that will be my undoing.

@kindaro kindaro force-pushed the add-unicode-preference branch from deb7e62 to 4c5726c Compare November 8, 2022 17:05
@kindaro kindaro force-pushed the add-unicode-preference branch from 4c5726c to 0c6f31e Compare November 8, 2022 17:46
@kindaro
Copy link
Copy Markdown
Contributor Author

kindaro commented Nov 8, 2022

I rebased, there are no test failures, changelog.d updated, ready to ship.

@kindaro kindaro requested review from brandonchinn178, dpwiz and georgefst and removed request for brandonchinn178, dpwiz and georgefst November 8, 2022 18:40
Comment on lines +192 to +193
UnicodeDetect | unicodeExtensionIsEnabled -> unicodeText
UnicodeDetect | otherwise -> asciiText
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Followup PR: this could be either

UnicodeDetect -> if unicodeExtensionIsEnabled then unicodeText else asciiText

or

UnicodeDetect
  | unicodeExtensionIsEnabled -> unicodeText
  | otherwise -> asciiText

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if would be good.

breakpoint,
breakpoint',
getPrinterOpt,
whenUnicodeOtherwise,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't need to be exported anymore, right? Followup PR could keep this as an internal helper

closeQuote = "⟧" `whenUnicodeOtherwise` "|]"

-- | Print @⊸@ or @%1 ->@ as appropriate.
lolly = "⊸" `whenUnicodeOtherwise` "%1 ->"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Followup PR: let's rename this to linearArrow

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure? This naming is not my idea. It is called lolly in GHC's source code.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh interesting. Ok 🤷

@brandonchinn178 brandonchinn178 merged commit 44389c1 into fourmolu:main Nov 8, 2022
@brandonchinn178
Copy link
Copy Markdown
Collaborator

Thanks!

@kindaro kindaro mentioned this pull request Nov 8, 2022
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

new config option Adds or would add new configuration option

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Preserve Unicode Syntax

4 participants