Skip to content

bpo-30597: Shows expected input in custom 'print' error message#2009

Merged
serhiy-storchaka merged 12 commits into
python:masterfrom
CuriousLearner:fix-issue30597
Jun 20, 2017
Merged

bpo-30597: Shows expected input in custom 'print' error message#2009
serhiy-storchaka merged 12 commits into
python:masterfrom
CuriousLearner:fix-issue30597

Conversation

@CuriousLearner

Copy link
Copy Markdown
Member

This is a WIP for issue: http://bugs.python.org/issue30597

@lisroach lisroach left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some unit tests for this?

@lisroach

lisroach commented Jun 8, 2017

Copy link
Copy Markdown
Contributor

There appears to be a newline character between the error_msg and the error_msg_end:

>>> print "helllo world"
  File "<stdin>", line 1
    print "helllo world"
                       ^
SyntaxError: Missing parentheses in call to 'print'. Did you mean 'print("hello world"
)'?

@CuriousLearner

Copy link
Copy Markdown
Member Author

Hi @lisroach, yes, I'm on my way to figuring that out; so it's still a work in progress. Also, I need to do additional checks which @ncoghlan mentioned on the bug. I'll update this patch soon.

About the unit test, what is the best place to write them at? Probably I can refer to some earlier test cases written and would be able to write this up too.

@ncoghlan ncoghlan left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The basic approach here looks good to me, just a few structural and technical comments inline.

Comment thread Objects/exceptions.c Outdated
start, text_len, -1)) {
Py_XSETREF(self->msg,
PyUnicode_FromString("Missing parentheses in call to 'print'"));
char *error_msg_start = "Missing parentheses in call to 'print'. Did you mean 'print(";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is getting complex enough now that I think it makes sense to separate it out as a second static helper: _set_legacy_print_statement_msg(PySyntaxErrorObject *self, Py_ssize_t start)

Comment thread Objects/exceptions.c Outdated
strlen(error_msg_end)) + 1;

strcat(error_msg, error_msg_start);
strcat(error_msg, msg);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the discussion on the issue tracker, we want to check for a trailing , here, and if so, also append end=' ' to request the Python 3 equivalent of Python 2's soft-space behaviour.

(That will also impact the working memory allocation - the simplest option would be to always allocate enough space to handle the suggest_end_arg case, since the working string will be null-terminated anyway)

Comment thread Objects/exceptions.c Outdated
char *msg = data + PRINT_OFFSET;
char *error_msg_end = ")'?";

char *error_msg = malloc(strlen(error_msg_start) + \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. This should use PyMem_Malloc and PyMem_Free rather than the direct C APIs. Also, it should actually free the allocated working memory :)
  2. We may want to include a short block comment explaining that data is UTF-8 encoded, so we mostly treat it as an indivisible opaque blob, except that we know it starts with print and might end with ,. This is also why we work directly with char * and convert to a Python Unicode object at the end, rather than pre-allocating the object via PyUnicode_New and populating it incrementally.

@ncoghlan

ncoghlan commented Jun 9, 2017

Copy link
Copy Markdown
Contributor

@CuriousLearner My guess would be that the misbehaviour @lisroach noted comes from data including the statement's trailing newline, so you'll need to deal with that. I'd also suggest testing a case like print "message"; pass to see how that behaves.

As far as test cases go, it looks like I didn't add any the first time around ("Missing parentheses" doesn't show up anywhere in the test directory for me).

So I'd suggest adding a new test class to Lib/test/test_print.py (e.g. TestPy2MigrationHint) that uses exec() to provoke SyntaxError and checks for the expected details.

@CuriousLearner

Copy link
Copy Markdown
Member Author

@ncoghlan I've revised the patch. I've also included test for the new code. Can you take a pass?

Also, on the bug you mentioned that the right shift operator case is distinctive enough. So, would it be needed to included here?

Please let me know and I'll do the changes. Thanks :)

@ncoghlan

Copy link
Copy Markdown
Contributor

@CuriousLearner Could you file a separate bug on bugs.python.org about the right-shift operator question? The search results indicate that people are hitting it reasonably frequently, so we may be able to save them a trip to Google by adding a custom 'Did you mean "print(<output>, file={:100!r})'.format(rhs)"? message when the LHS is the print builtin.

@CuriousLearner CuriousLearner changed the title bpo-30597: [WIP] Shows expected input in custom 'print' error message bpo-30597: Shows expected input in custom 'print' error message Jun 10, 2017
Comment thread Objects/exceptions.c Outdated
analogous to Python3 print syntax.
*/

PyObject *strip_sep_obj = PyUnicode_FromString("\r\n");

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Virtually any API call can fail and raise an exception. Always check the result.

Comment thread Objects/exceptions.c Outdated
*/

PyObject *strip_sep_obj = PyUnicode_FromString("\r\n");
PyObject *data = _PyUnicode_XStrip(self->text, 1, strip_sep_obj);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call _PyUnicode_XStrip() after PyUnicode_Substring().

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@serhiy-storchaka Any specific reason to it that way?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More than one space after "print".

Comment thread Objects/exceptions.c Outdated
data = PyUnicode_Substring(data, PRINT_OFFSET, text_len);
// gets the modified text_len after stripping `print `
text_len = PyUnicode_GET_LENGTH(data);
char *maybe_end_arg = " end=\" \"";

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to use const char * for immutable C strings.

Comment thread Objects/exceptions.c Outdated
PyObject *error_msg = PyUnicode_FromFormat(
"Missing parentheses in call to 'print'. Did you mean print(%U)?",
data);
if (PyUnicode_Tailmatch(data, PyUnicode_FromString(","), start, text_len, 1))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could just read the last character of the string by PyUnicode_READ_CHAR(str, PyUnicode_GET_LENGTH(str)-1) (don't forget to check that PyUnicode_GET_LENGTH(str) > 0).

Comment thread Objects/exceptions.c Outdated
"Missing parentheses in call to 'print'. Did you mean print(%U)?",
data);
if (PyUnicode_Tailmatch(data, PyUnicode_FromString(","), start, text_len, 1))
error_msg = PyUnicode_FromFormat(

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is almost duplicates the former formatting. Both cases can be merged if use an empty string for maybe_end_arg in the first case.

Comment thread Objects/exceptions.c Outdated
// PRINT_OFFSET is to remove print word from the data.
const int PRINT_OFFSET = 6;
Py_ssize_t text_len = PyUnicode_GET_LENGTH(data);
data = PyUnicode_Substring(data, PRINT_OFFSET, text_len);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old value of data is leaked.

Comment thread Objects/exceptions.c
data, maybe_end_arg);

Py_XSETREF(self->msg, error_msg);
return 1;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At that point multiple results of PyUnicode_XXX calls are leaked.

Comment thread Objects/exceptions.c Outdated
_set_legacy_print_statement_msg(PySyntaxErrorObject *self, Py_ssize_t start)
{
/*
`data` is UTF-8 encoded and treated as an indivisible opaque blob. All

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is outdated.

Comment thread Objects/exceptions.c Outdated

PyObject *strip_sep_obj = PyUnicode_FromString("\r\n");
PyObject *data = _PyUnicode_XStrip(self->text, 1, strip_sep_obj);
// PRINT_OFFSET is to remove print word from the data.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be more explicit about what is removed ("print ").

Comment thread Objects/exceptions.c Outdated
{

PyObject *strip_sep_obj = PyUnicode_FromString("\r\n");
Py_UCS4 soft_space_check = PyUnicode_READ_CHAR(PyUnicode_FromString(","), 0);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result of PyUnicode_FromString(",") is not checked for NULL and leaked.

You don't need this. You can use just soft_space_check = ','. Actually I think the named variable is not needed.

Comment thread Objects/exceptions.c
PyObject *data = PyUnicode_Substring(self->text, PRINT_OFFSET, text_len);

if (data == NULL)
return -1;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strip_sep_obj is leaked here.

Comment thread Objects/exceptions.c Outdated
const char *maybe_end_arg = "";

if (text_len > 0) {
if (PyUnicode_READ_CHAR(data, text_len-1) == soft_space_check) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two if's can be merged.

Comment thread Objects/exceptions.c Outdated
_set_legacy_print_statement_msg(PySyntaxErrorObject *self, Py_ssize_t start)
{

PyObject *strip_sep_obj = PyUnicode_FromString("\r\n");

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add also a space and a tab.

Comment thread Objects/exceptions.c Outdated
if (data == NULL)
return -1;

data = _PyUnicode_XStrip(data, 1, strip_sep_obj);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result of _PyUnicode_XStrip() is not checked for error. Old value of data is leaked.

Comment thread Objects/exceptions.c
}
}

PyObject *error_msg = PyUnicode_FromFormat(

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result is not checked for error.

@CuriousLearner

Copy link
Copy Markdown
Member Author

@serhiy-storchaka , @ncoghlan , @lisroach

Possible, if you can take a look now?

@serhiy-storchaka serhiy-storchaka left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an entry in Misc/NEWS and your name in Misc/ACKS.

Comment thread Objects/exceptions.c Outdated
static int
_set_legacy_print_statement_msg(PySyntaxErrorObject *self, Py_ssize_t start)
{

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove excessive empty lines.

Comment thread Lib/test/test_print.py Outdated
with self.assertRaises(SyntaxError) as context:
exec(python2_print_str)

self.assertTrue('print("Hello World")' in str(context.exception))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use assertIn().

Or use assertRaisesRegex() with the argument r'print\("Hello World"\)' or re.escape('print("Hello World")').

Comment thread Lib/test/test_print.py Outdated
with self.assertRaises(SyntaxError) as context:
exec(python2_print_str)

self.assertTrue('print("Hello World", end=" ")' in str(context.exception))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same as above.

Comment thread Lib/test/test_print.py
exec(python2_print_str)

self.assertTrue('print("Hello World", end=" ")' in str(context.exception))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add also tests for excessive whitespaces: 'print "Hello World", '.

Comment thread Objects/exceptions.c Outdated
return -1;
}

PyObject *new_data = _PyUnicode_XStrip(data, 1, strip_sep_obj);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use 2 instead of 1.

@ncoghlan ncoghlan dismissed their stale review June 20, 2017 06:31

I don't have anything to add to @serhiy-storchaka's latest review, but also agree with Serhiy's requests for further updates

@CuriousLearner

Copy link
Copy Markdown
Member Author

@serhiy-storchaka I've updated the patch. Also I've updated the branch to get even with master of upstream.

Can you please take a look now?

Also, for the ACKS, my name is already there.

@serhiy-storchaka serhiy-storchaka left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C code LGTM. Added few last minor comments.

May be it is worth to reformulate the Misc/NEWS entry (I don't like the titlecased Print). Nick, your thoughts?

Comment thread Lib/test/test_print.py Outdated
self.assertIn('print("Hello World", end=" ")', str(context.exception))

def test_string_with_excessive_whitespace(self):
python2_print_str = 'print "Hello World", '

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add yet a space between print and its argument.

Comment thread Misc/NEWS Outdated
-----------------

- bpo-30597: `Print` now shows expected input in custom error message when
used as a Python 2 statement.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add "Patch by Sanyam Khurana."

Comment thread Misc/NEWS Outdated
Core and Builtins
-----------------

- bpo-30597: `Print` now shows expected input in custom error message when

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use print here.

@ncoghlan

Copy link
Copy Markdown
Contributor

I adjusted the NEWS entry directly, so I think this is good to go once the CI gives it a passing grade.

@CuriousLearner

Copy link
Copy Markdown
Member Author

Thanks @ncoghlan @serhiy-storchaka ! :)

On a second note, I just noticed that on the bug, Python 3.6 is also selected for this patch. So would this be back-ported? (As I don't see a backport tag with the PR right now).

@serhiy-storchaka serhiy-storchaka added the type-feature A feature request or enhancement label Jun 20, 2017
@serhiy-storchaka

Copy link
Copy Markdown
Member

This is a minor new feature. I don't think it should be backported.

@serhiy-storchaka serhiy-storchaka merged commit 3a7f035 into python:master Jun 20, 2017
@serhiy-storchaka

Copy link
Copy Markdown
Member

Thank you for your contribution Sanyam!

@CuriousLearner

Copy link
Copy Markdown
Member Author

Thanks for all your time @serhiy-storchaka :)

ncoghlan pushed a commit to ncoghlan/cpython that referenced this pull request Jul 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type-feature A feature request or enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants