Remove duplicated DataPipe reference from bucketbatcher #176

Closed
ejguan wants to merge 1 commit into gh/ejguan/15/base from gh/ejguan/15/head

@ejguan ejguan commented Jan 21, 2022

Stack from ghstack:

Summary:
Fixes #173

Note that the [input to `strip`](https://docs.python.org/3/library/stdtypes.html#str.strip)

> is a string specifying the **set of characters** to be removed. [Emphasis mine]

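Thus, stripping works something like

```python
# Loose analogy: each character in chars is handled individually
# (str.strip itself only removes them from the ends of the string)
for char in chars:
    string = string.replace(char, "")
```

rather than

```python
# chars is not treated as one contiguous substring
string = string.replace(chars, "")
```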

This means that always stripping `"\r\n"` is harmless even if the line terminator is only `"\n"` or `"\r"`.
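A minimal sketch of this behavior, using only the built-in `str.rstrip` and `str.strip`. The helper name `strip_newline` and the sample strings are hypothetical and are not taken from the patched readers:

```python
# Characters to strip; str.strip/rstrip treat this as a set, not a substring.
LINE_ENDING_CHARS = "\r\n"


def strip_newline(line: str) -> str:
    # Hypothetical helper: drop any trailing CR or LF characters.
    return line.rstrip(LINE_ENDING_CHARS)


# The same call handles Windows, Unix, and old Mac terminators,
# as well as a line with no terminator at all.
samples = ["text\r\n", "text\n", "text\r", "text"]
results = [strip_newline(s) for s in samples]

# Only the ends are affected; interior line breaks are left alone.
interior = "a\r\nb\r\n".strip(LINE_ENDING_CHARS)

print(results)
print(repr(interior))
```

Note that because the argument is a set of characters, several consecutive trailing terminators are all removed, which may or may not be what a given reader wants.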

Reviewed By: ejguan

Differential Revision: D33684458

Pulled By: NivekT

fbshipit-source-id: 9821b77d60d3afe038ae698965beefe319783aa1

[ghstack-poisoned]
@facebook-github-bot facebook-github-bot added the CLA Signed label Jan 21, 2022
ejguan added a commit that referenced this pull request Jan 21, 2022
ghstack-source-id: 37a119b
Pull Request resolved: #176
@ejguan ejguan changed the title from "fix newline stripping in plain text readers (#174)" to "Remove duplicated DataPipe reference from bucketbatcher" Jan 21, 2022
@ejguan ejguan closed this Jan 21, 2022
@facebook-github-bot facebook-github-bot deleted the gh/ejguan/15/head branch February 21, 2022 15:17