Skip to content

No trailing space for BuiltStyledStreamWriter#1154

Closed
innerlee wants to merge 9 commits intoopen-source-parsers:masterfrom
innerlee:zz/trailingspace
Closed

No trailing space for BuiltStyledStreamWriter#1154
innerlee wants to merge 9 commits intoopen-source-parsers:masterfrom
innerlee:zz/trailingspace

Conversation

@innerlee
Copy link

Before pr, the positions with <-HERE annotation has a trailing space:

[
  {
    "headshoulders_info" : <-HERE
    [
      {
        "det_score" : 0.73144500000000001,
        "rect" : <-HERE
        {
          "bottom" : 205,
          "left" : 200,
          "right" : 309,
          "top" : 102
        }
      }
    ],
    "image" : <-HERE
    {
      "height" : 368,
      "id" : 0,
      "uuid" : "/workspace/face/data/shot_0001/img_00000.jpg",
      "width" : 640
    },
    "status" : 0
  }
]

After:

[
  {
    "headshoulders_info" :
    [
      {
        "det_score" : 0.73144500000000001,
        "rect" :
        {
          "bottom" : 205,
          "left" : 200,
          "right" : 309,
          "top" : 102
        }
      }
    ],
    "image" :
    {
      "height" : 368,
      "id" : 0,
      "uuid" : "/workspace/face/data/shot_0001/img_00000.jpg",
      "width" : 640
    },
    "status" : 0
  }
]

code:

Json::StreamWriterBuilder builder;
builder["indentation"] = "  ";
string document = Json::writeString(builder, root);

Signed-off-by: lizz <lizz@sensetime.com>
Signed-off-by: lizz <lizz@sensetime.com>
Signed-off-by: lizz <lizz@sensetime.com>
Signed-off-by: lizz <lizz@sensetime.com>
Signed-off-by: lizz <lizz@sensetime.com>
Signed-off-by: lizz <lizz@sensetime.com>
Signed-off-by: lizz <lizz@sensetime.com>
@coveralls
Copy link

coveralls commented Apr 10, 2020

Coverage Status

Coverage increased (+0.01%) to 93.869% when pulling eda6a60 on innerlee:zz/trailingspace into 3beb37e on open-source-parsers:master.

@dota17
Copy link
Member

dota17 commented Apr 11, 2020

Seems good, but if you want to skip trailing space of the colonSymbol, why not use settings item enableYAMLCompatibility, which defines like this, and here are its usage. IMHO, it has the effect you want, without changing the existing code, increasing the complexity of the code, am I right?

@innerlee
Copy link
Author

The enableYAMLCompatibility was the first thing I tried. Unfortunately it doe not remove trailing spaces in the positions as shown in the first thread. You may try them out.

BillyDonahue added a commit to BillyDonahue/jsoncpp that referenced this pull request Apr 15, 2020
Copy link
Contributor

@BillyDonahue BillyDonahue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made some suggestions in this patch.
You should be able to merge this change into your PR branch.

BillyDonahue@16a0ca5

name.data(), static_cast<unsigned>(name.length()), emitUTF8_));
*sout_ << colonSymbol_;
if ((childValue.type() == objectValue &&
!childValue.getMemberNames().empty()) ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getMemberNames() returns a full std::vector. It's expensive.
All we do is see if it's empty and throw it away. There's a better way to test this condition.
you can test childValue.empty() directly.

Can you explain the rest of this conditional expression though?
I'm a little concerned about how it interacts with comments.
Let's add a test for that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first part deals with objects and the second part deals with list.

Object example:
image

List example:
image

The conditions are copied from
https://github.com/open-source-parsers/jsoncpp/blob/master/src/lib_json/json_writer.cpp#L697-L699
and
https://github.com/open-source-parsers/jsoncpp/blob/master/src/lib_json/json_writer.cpp#L1002-L1006
, respectively.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the interaction with comments, I'm not very clear. Could you please add them if appropriate?

((cs_ == CommentStyle::All) || isMultilineArray(childValue))))
*sout_ << colonSymbolNoTrailingSpace_;
else
*sout_ << colonSymbol_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to just write a part of the colonSymbol_ range here instead of having another member variable to hold a copy with the space cut off.

useSpecialFloats_(useSpecialFloats), emitUTF8_(emitUTF8),
precision_(precision), precisionType_(precisionType) {}
precision_(precision), precisionType_(precisionType) {
if (colonSymbol_[colonSymbol_.size() - 1] == ' ') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if colonSymbol_ is empty this is undefined behavior. Shouldn't need this new member anyway.
Can use find_last_not_of(' ') or something.

BillyDonahue and others added 2 commits April 15, 2020 09:57
Signed-off-by: lizz <lizz@sensetime.com>
@cdunn2001
Copy link
Contributor

Good idea, but there is a much simpler way to do this.

Let's treat the "trailing colon" and "embedded colon" as different symbols, and let's inject colonTrailing and colonEmbedded as 2 new "features" with proper default values. Other features may override either or both. Specifically, if indentation==0, then these are both trimmed, and if yaml-mode, then these are set to : and : automatically.

Features add UI complexity, but they are transparent. As a user, I would prefer transparency and control.

@innerlee
Copy link
Author

The trailing space is a kind of bug imho. It does not related to yaml-mode or indentation. The extra trailing spaces does not have any functionality and they are not part of any standard. Removing them is purely for aesthetic reasons so I'm not sure if exposing them as new "features" is worth the mental effort.

@cdunn2001
Copy link
Contributor

The trailing space is a kind of bug imho.

First, I don't think the JSON spec calls that a bug. I respect your opinion, but we need to concentrate on engineering here. Let's just consider this a reasonable preference that you have.

Second, there are good coders in the world who write actual regression tests, and I respect them as well. It is reasonable for those people to record the exact output in their test. You're giving yourself the right to change their output. I have seen Issues filed over changes in the past.

Third, to the extent that you are right, we should not even support indentation or spacing. You're treating JSON as a human-readable format, but that is not part of the spec either. My life would be easier if we compressed whitespace.

I honestly do not understand your point of view anymore. You think the whitespace is significant, but you don't want a user to have control over it? How is that logically consistent?

I don't know the right thing to do here. Certainly, there are always trade-offs, but that's not what you're saying at all. You assert that this is a bug of a sort, but what about its prevalence? Are you familiar with the standard Python json parser? By default, it includes trailing spaces. Here is the API:

json.dump(obj, fp, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)

If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace.

Like you, I have always been bothered by that. I use that all the time for work. When a co-worker used : , I typically used a Perl program to strip trailing whitespace after dumping from Python. (That way, my git would not complain about trailing whitespace.) When he left, I switched everything to : (no space after colon, but with linefeeds and indentation) and updated my tests. But I can tell you that I have not submitted a change-request to Python maintainers. They probably have already heard the same complaint many times!

What would you think of the Python solution? In other words, expose the colonSymbol but don't distinguish "trailing" colon from "embedded" colon?

@innerlee
Copy link
Author

Hi I haven't used python's json very much. Could you give an example that python's json.dump produce trailing spaces? My tries:

In [7]: json.dumps(json.loads(a), indent=2, separators=(',', ': '))
Out[7]: '[\n  {\n    "headshoulders_info": [\n      {\n        "det_score": 0.731445,\n        "rect": {\n          "bottom": 205,\n          "left": 200,\n          "right": 309,\n          "top": 102\n        }\n      }\n    ],\n    "image": {\n      "height": 368,\n      "id": 0,\n      "uuid": "/workspace/face/data/shot_0001/img_00000.jpg",\n      "width": 640\n    },\n    "status": 0\n  }\n]'

In [8]: json.dumps(json.loads(a), indent=2, separators=(',', ' : '))
Out[8]: '[\n  {\n    "headshoulders_info" : [\n      {\n        "det_score" : 0.731445,\n        "rect" : {\n          "bottom" : 205,\n          "left" : 200,\n          "right" : 309,\n          "top" : 102\n        }\n      }\n    ],\n    "image" : {\n      "height" : 368,\n      "id" : 0,\n      "uuid" : "/workspace/face/data/shot_0001/img_00000.jpg",\n      "width" : 640\n    },\n    "status" : 0\n  }\n]'

Note that there are no spaces before any of \n

@cdunn2001
Copy link
Contributor

Oh, you're right, because Python puts [ or { on the same line as the colon. Maybe that's a better solution for us? If Python put those on new lines, then it would have this problem.

Here is why I had a problem at work:

The default is (', ', ': ') if indent is None and (',', ': ') otherwise.

My co-worker specified indent but not separators.

Now that I recall, my problem at work with was trailing whitespace after commas, which happens if you want the whitespace for short lists. It's essentially the same problem. And oddly, when I try it now, I find that there is a difference between python2 and python3....

... Aha!

Note: Since the default item separator is ', ', the output might include trailing whitespace when indent is specified. You can use separators=(',', ': ') to avoid this.

So this was "fixed" in python3. Note the major-version bump. And I'm not kidding about that. Guido threw a ton of changes into python3, knowing that a major-version bump does not happen often. But you are not arguing for a major-version bump of jsoncpp, right? And yet you also argue against adding a "feature". Nobody likes the "features". Everybody thinks they add too much complexity. But I observe that everybody also thinks backward-compatibility is somebody else's problem.

@cdunn2001
Copy link
Contributor

Anyway, maybe the best solution is to add a feature to keep { and [ on the same line as the :. In that case, we could modify the whitespace around : without worrying about backward-compatibility. All we have to do is document how the feature works. And we could use colonSymbol directly, rather than scanning it for whitespace.

@innerlee
Copy link
Author

I understand the argument for compatibility. If changing test case is considered breaking, then it is better wait for the next major revision.
If someone want to implement a non-breaking version, that is also very cool.
I also do not very satisfied with this current implementation. It will be also very cool if anyone want to write a cleaner version.

Some background info: For me, it is purely for aesthetic reasons. I accidentally used jsoncpp for some output and noticed this little annoyance. My wish is to let it have a clean output :)

@cdunn2001
Copy link
Contributor

Yes, we should re-open this when we bump the major version someday. Thanks for the suggestions.

@cdunn2001 cdunn2001 closed this Sep 2, 2020
@daravi
Copy link

daravi commented Jan 23, 2021

Hi this is also an issue to us. For now we have a script that removes them but it would be nice to get rid of that script. Please add a version of this when you bump to version 2.

DHowett added a commit to microsoft/terminal that referenced this pull request Feb 6, 2026
DHowett added a commit to microsoft/terminal that referenced this pull request Feb 6, 2026
DHowett added a commit to microsoft/terminal that referenced this pull request Feb 6, 2026
DHowett added a commit to microsoft/terminal that referenced this pull request Feb 6, 2026
DHowett added a commit to microsoft/terminal that referenced this pull request Feb 9, 2026
…9836)

The maintainers of jsoncpp chose to decline
open-source-parsers/jsoncpp#1154 until they make a new major release.
We are not burdened by that requirement.

Closes #10887
Refs #19834
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants