Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-111495: Add PyFile_* CAPI tests #111709

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

sobolevn
Copy link
Member

@sobolevn sobolevn commented Nov 3, 2023

Looks like PyFile_SetOpenCodeHook is already tested here:

cpython/Programs/_testembed.c

Lines 1177 to 1232 in 20cfab9

static int test_open_code_hook(void)
{
int result = 0;
/* Provide a hook */
result = PyFile_SetOpenCodeHook(_open_code_hook, &result);
if (result) {
printf("Failed to set hook\n");
return 1;
}
/* A second hook should fail */
result = PyFile_SetOpenCodeHook(_open_code_hook, &result);
if (!result) {
printf("Should have failed to set second hook\n");
return 2;
}
Py_IgnoreEnvironmentFlag = 0;
_testembed_Py_InitializeFromConfig();
result = 0;
PyObject *r = PyFile_OpenCode("$$test-filename");
if (!r) {
PyErr_Print();
result = 3;
} else {
void *cmp = PyLong_AsVoidPtr(r);
Py_DECREF(r);
if (cmp != &result) {
printf("Did not get expected result from hook\n");
result = 4;
}
}
if (!result) {
PyObject *io = PyImport_ImportModule("_io");
PyObject *r = io
? PyObject_CallMethod(io, "open_code", "s", "$$test-filename")
: NULL;
if (!r) {
PyErr_Print();
result = 5;
} else {
void *cmp = PyLong_AsVoidPtr(r);
Py_DECREF(r);
if (cmp != &result) {
printf("Did not get expected result from hook\n");
result = 6;
}
}
Py_XDECREF(io);
}
Py_Finalize();
return result;
}

@sobolevn
Copy link
Member Author

sobolevn commented Nov 3, 2023

Tests fail on Windows (I have a very limited experience with this platform):

 ======================================================================
ERROR: test_file_get_line (test.test_capi.test_file.TestPyFileCAPI.test_file_get_line)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\a\cpython\cpython\Lib\test\test_capi\test_file.py", line 40, in test_file_get_line
    f.writelines([first_line])
  File "D:\a\cpython\cpython\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode characters in position 10-15: character maps to <undefined>

Is it correct?

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This family has little functions, but they should be tested with many cases.

Lib/test/test_capi/test_file.py Outdated Show resolved Hide resolved
Lib/test/test_capi/test_file.py Outdated Show resolved Hide resolved
Lib/test/test_capi/test_file.py Show resolved Hide resolved
Lib/test/test_capi/test_file.py Outdated Show resolved Hide resolved
Lib/test/test_capi/test_file.py Outdated Show resolved Hide resolved
Lib/test/test_capi/test_file.py Outdated Show resolved Hide resolved
Lib/test/test_capi/test_file.py Outdated Show resolved Hide resolved
Lib/test/test_capi/test_file.py Show resolved Hide resolved
Lib/test/test_capi/test_file.py Outdated Show resolved Hide resolved
Lib/test/test_capi/test_file.py Outdated Show resolved Hide resolved
@serhiy-storchaka
Copy link
Member

Tests fail on Windows

Because the default encoding on Windows is not UTF-8. Always specify encoding for text files.

@sobolevn
Copy link
Member Author

sobolevn commented Nov 5, 2023

@serhiy-storchaka thanks a lot for your detailed review! You are one of the best reviewers I know :)

@sobolevn
Copy link
Member Author

sobolevn commented Nov 5, 2023

Address sanitizer build fails with:

 ======================================================================
FAIL: test_string_args_as_invalid_utf (test.test_capi.test_file.TestPyFile_FromFd.test_string_args_as_invalid_utf) (arg_pos=5)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/cpython/cpython/Lib/test/test_capi/test_file.py", line 77, in test_string_args_as_invalid_utf
    self.assertRaises(
AssertionError: (<class 'ValueError'>, <class 'LookupError'>) not raised by file_from_fd

----------------------------------------------------------------------

Maybe I should use a different string? Suggestions?

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@@ -0,0 +1,296 @@
import unittest
import io
import os
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: sort imports :-)

@vstinner
Copy link
Member

Address sanitizer build fails with:
FAIL: test_string_args_as_invalid_utf (test.test_capi.test_file.TestPyFile_FromFd.test_string_args_as_invalid_utf) (arg_pos=5)
AssertionError: (<class 'ValueError'>, <class 'LookupError'>) not raised by file_from_fd

It's unrelated to Address sanitizer. It's just that this CI builds Python is release mode. And in release mode, the error handler is only used if the string cannot be decoded (decoding error). In debug mode, the error handler is always checked.

You can skip this test if support.Py_DEBUG is false.

@vstinner
Copy link
Member

To reproduce the Address Sanitizer issue, I used:

./configure --with-address-sanitizer CC=clang
ASAN_OPTIONS='detect_leaks=0:allocator_may_return_null=1:handle_segv=0' make
ASAN_OPTIONS='detect_leaks=0:allocator_may_return_null=1:handle_segv=0' ./python -m test test_capi.test_file -v

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I have not finished the review yet. It is difficult with so many tests. So I can find other issues later.

The main problem is that they incorrectly create non-decodable files. You should use binary files to write them.

It would be nice also to reduce the number of lines where it is possible.

def test_name_invalid_utf(self):
with open(os_helper.TESTFN, "w", encoding="utf-8") as f:
file_obj = _testcapi.file_from_fd(
f.fileno(), "abc\xe9", "w",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not invalid UTF-8. When you pass the Python string, it is encoded to UTF-8, therefore the C string is always valid UTF-8. You have to pass a bytes object, e.g. b'\xff'. See for example tests for PyDict_GetItemString() or PyObject_GetAttrString().

TypeError,
r"open\(\) argument 'mode' must be str, not None",
_testcapi.file_from_fd,
f.fileno(), "abc\xe9", NULL,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use TESTFN?

When the assertRaisesRegex() call is multiline, it may be clearer to write it in the context manager form. The functional form is convenient when you can fit all in a single line.

Comment on lines +24 to +27
file_obj = _testcapi.file_from_fd(
f.fileno(), os_helper.TESTFN, "w",
1, "utf-8", "strict", "\n", 0,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would write this in more compact form. First, add shorted name from_fd = _testcapi.file_from_fd, then this call can fit in just two lines:

            file_obj = from_fd(f.fileno(), os_helper.TESTFN, "w",
                               1, "utf-8", "strict", "\n", 0)

f.fileno(), os_helper.TESTFN, "w",
1, "utf-8", "strict", "\n", 0,
)
self.assertIsInstance(file_obj, io.TextIOWrapper)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a check for the .name attribute.

)

def test_string_args_as_null(self):
for arg_pos in (4, 5, 6):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to write it without loop. So you can see what case is crashed if there is a crash. And it is easier to modify every case.

For example, I suggest to add checks for corresponding attributes (.encoding, .errors, .newlines).

def test_file_empty_line(self):
first_line = ""
with open(os_helper.TESTFN, "w", encoding="utf-8") as f:
f.writelines([first_line])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to write an empty line.

Comment on lines +125 to +127
first_line = "\xc3\x28\n"
with open(os_helper.TESTFN, "w", encoding="utf-8") as f:
f.writelines([first_line])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, it does not create invalid UTF-8.

)
self.assertEqual(self.write_and_return(False), "False")

def test_file_write_custom_obj(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a test for object that raises an exception in __str__() or __repr__()?

Comment on lines +249 to +250
self.assertRaises(AttributeError, self.write, NULL, object(), 0)
self.assertRaises(TypeError, self.write, NULL, NULL, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please test these cases also with Py_PRINT_RAW.

Add cases for writing NULL to StringIO.

Comment on lines +147 to +148
with open(os_helper.TESTFN, "w", encoding="utf-8") as f:
f.writelines([first_line, second_line])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many tests can use StringIO. E.g.

f = io.StringIO('first_line\nsecond_line\n')

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants