Skip to content

Log data fetched from URLs to make debugging easier (at trace log level).#1589

Merged
liiight merged 7 commits intoFlexget:developfrom
tobinjt:develop
Dec 27, 2016
Merged

Log data fetched from URLs to make debugging easier (at trace log level).#1589
liiight merged 7 commits intoFlexget:developfrom
tobinjt:develop

Conversation

@tobinjt
Copy link
Copy Markdown
Contributor

@tobinjt tobinjt commented Dec 25, 2016

Motivation for changes:

Make it easier to debug Flexget not getting useful results from a URL.

Detailed changes:

To make debugging easier, log contents of fetched URLs at trace level.
Browsers process Javascript, handle iframes, and more that flexget
doesn't; seeing the actual data processed by flexget rather than looking
at the URL in your browser makes it easier to figure out why you're not
getting any results.

Addressed issues:

  • Fixes # .

Config usage if relevant (new plugin or updated schema):

paste_config_here

Log and/or tests output (preferably both):

The test failures appear to be completely unrelated to my change.

================================================================================================= FAILURES =================================================================================================
______________________________________________________________________________________ TestFilesystem.test_non_ascii _______________________________________________________________________________________

self = <flexget.tests.test_filesystem.TestFilesystem object at 0x1067a4290>, execute_task = <function execute at 0x1072da0c8>

    def test_non_ascii(self, execute_task):
        task_name = 'non_ascii'
        should_exist = ['\u0161 dir', '\u0152 file.txt']
        task = execute_task(task_name)

>       self.assert_check(task, task_name, 'positive', should_exist)

test_filesystem.py:210:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <flexget.tests.test_filesystem.TestFilesystem object at 0x1067a4290>, task = <flexget.task.Task object at 0x107ed7210>, task_name = 'non_ascii', test_type = 'positive'
filenames = ['š dir', 'Œ file.txt']

    def assert_check(self, task, task_name, test_type, filenames):
        for file in filenames:
            file = Path(file)
            if test_type == 'positive':
                assertion_error = 'Failed %s %s test, did not find %s' % (test_type, task_name, file)
>               assert task.find_entry(title=file.namebase), assertion_error
E               AssertionError: Failed positive non_ascii test, did not find š dir
E               assert None
E                +  where None = <bound method Task.find_entry of <flexget.task.Task object at 0x107ed7210>>(title=Path(u'\u0161 dir'))
E                +    where <bound method Task.find_entry of <flexget.task.Task object at 0x107ed7210>> = <flexget.task.Task object at 0x107ed7210>.find_entry
E                +    and   Path(u'\u0161 dir') = Path(u'\u0161 dir').namebase

test_filesystem.py:91: AssertionError
----------------------------------------------------------------------------------------------- Captured log -----------------------------------------------------------------------------------------------
manager.py                 115 INFO     last manager was not torn down correctly
manager.py                 172 DEBUG    sys.defaultencoding: ascii
manager.py                 173 DEBUG    sys.getfilesystemencoding: utf-8
manager.py                 174 DEBUG    flexget detected io encoding: utf-8
manager.py                 175 DEBUG    os.path.supports_unicode_filenames: True
conftest.py                309 DEBUG    database_uri: sqlite:///:memory:
plugin.py                  383 DEBUG    Trying to load plugins from: [u'/Users/johntobin/src/upstream-flexget/flexget/tests/plugins', '/Users/johntobin/src/upstream-flexget/flexget/plugins']
plugin.py                  404 DEBUG    (u'Plugin `%s` requires `%s` to load.', u'memusage', u'ext lib `guppy`')
plugin.py                  468 DEBUG    Plugins took 0.07 seconds to load. 271 plugins in registry.
manager.py                 713 DEBUG    Connecting to: sqlite:///:memory:
db_schema.py                92 DEBUG    Initializing plugin feed schema version to 0
db_schema.py                92 DEBUG    Initializing plugin api_trakt schema version to 7
db_schema.py                92 DEBUG    Initializing plugin trakt_auth schema version to 0
db_schema.py                92 DEBUG    Initializing plugin make_rss schema version to 0
db_schema.py                92 DEBUG    Initializing plugin series schema version to 13
db_schema.py                92 DEBUG    Initializing plugin pending_approval schema version to 0
db_schema.py                92 DEBUG    Initializing plugin alpharatio schema version to 0
db_schema.py                92 DEBUG    Initializing plugin seen schema version to 4
db_schema.py                92 DEBUG    Initializing plugin pogcal_acquired schema version to 0
db_schema.py                92 DEBUG    Initializing plugin archive schema version to 0
db_schema.py                92 DEBUG    Initializing plugin digest schema version to 1
db_schema.py                92 DEBUG    Initializing plugin remember_rejected schema version to 3
db_schema.py                92 DEBUG    Initializing plugin simple_persistence schema version to 4
db_schema.py                92 DEBUG    Initializing plugin myepisodes schema version to 0
db_schema.py                92 DEBUG    Initializing plugin delay schema version to 2
db_schema.py                92 DEBUG    Initializing plugin failed schema version to 3
db_schema.py                92 DEBUG    Initializing plugin tail schema version to 0
db_schema.py                92 DEBUG    Initializing plugin api_tvdb schema version to 7
db_schema.py                92 DEBUG    Initializing plugin imdb_lookup schema version to 8
db_schema.py                92 DEBUG    Initializing plugin import_series schema version to 0
db_schema.py                92 DEBUG    Initializing plugin template_hash schema version to 0
db_schema.py                92 DEBUG    Initializing plugin status schema version to 2
db_schema.py                92 DEBUG    Initializing plugin movie_list schema version to 0
db_schema.py                92 DEBUG    Initializing plugin torrent411 schema version to 0
db_schema.py                92 DEBUG    Initializing plugin api_tmdb schema version to 5
db_schema.py                92 DEBUG    Initializing plugin log_once schema version to 0
db_schema.py                92 DEBUG    Initializing plugin discover schema version to 0
db_schema.py                92 DEBUG    Initializing plugin entry_list schema version to 1
db_schema.py                92 DEBUG    Initializing plugin regexp_list schema version to 1
db_schema.py                92 DEBUG    Initializing plugin api_bluray schema version to 0
db_schema.py                92 DEBUG    Initializing plugin subtitle_list schema version to 1
db_schema.py                92 DEBUG    Initializing plugin api_rottentomatoes schema version to 2
db_schema.py                92 DEBUG    Initializing plugin version_checker schema version to 0
db_schema.py                92 DEBUG    Initializing plugin secrets schema version to 0
db_schema.py                92 DEBUG    Initializing plugin morethantv schema version to 0
db_schema.py                92 DEBUG    Initializing plugin t411 schema version to 0
db_schema.py                92 DEBUG    Initializing plugin input_cache schema version to 1
db_schema.py                92 DEBUG    Initializing plugin telegram_chat_ids schema version to 0
db_schema.py                92 DEBUG    Initializing plugin tvmaze schema version to 6
db_schema.py                92 DEBUG    Initializing plugin imdb_list schema version to 0
db_schema.py                92 DEBUG    Initializing plugin rutracker_auth schema version to 0
db_schema.py                92 DEBUG    Initializing plugin backlog schema version to 2
manager.py                 639 DEBUG    New config data loaded.
db_schema.py                39 DEBUG    entering flexget version 2.8.12.dev to db
plugin_parsing.py           32 DEBUG    setting default movie parser to internal. (options: {u'internal': <flexget.plugins.parsers.parser_internal.ParserInternal object at 0x1042f5fd0>, u'guessit': <flexg
et.plugins.parsers.parser_guessit.ParserGuessit object at 0x1043e4090>})
plugin_parsing.py           32 DEBUG    setting default series parser to internal. (options: {u'internal': <flexget.plugins.parsers.parser_internal.ParserInternal object at 0x1042f5fd0>, u'guessit': <flex
get.plugins.parsers.parser_guessit.ParserGuessit object at 0x1043e4090>})
conftest.py                 89 INFO     ********** Running task: non_ascii **********
task.py                    576 DEBUG    executing non_ascii
status.py                  110 DEBUG    Adding new task non_ascii
logger.py                  129 VERBOSE  Starting to scan folders.
logger.py                  129 VERBOSE  Scanning folder filesystem_test_dir//Test3. Recursion is set to True.
filesystem.py              165 DEBUG    Scanning filesystem_test_dir//Test3
filesystem.py              170 DEBUG    Checking if filesystem_test_dir//Test3/š dir qualifies to be added as an entry.
filesystem.py              170 DEBUG    Checking if filesystem_test_dir//Test3/s dir/Œ file.txt qualifies to be added as an entry.
backlog.py                 182 DEBUG    0 entries purged from backlog
logger.py                  129 VERBOSE  Produced 2 entries.
urlrewriting.py             30 DEBUG    Checking 0 entries
logger.py                  129 VERBOSE  Summary - Accepted: 0 (Rejected: 0 Undecided: 2 Failed: 0)
simple_persistence.py      150 DEBUG    Flushing simple persistence for task non_ascii to db.
========================================================================================== pytest-warning summary ==========================================================================================
WC1 None [pytest] section in setup.cfg files is deprecated, use [tool:pytest] instead.
WI1 /Users/johntobin/src/upstream-flexget/lib/python2.7/site-packages/pytest_capturelog.py:171 'pytest_runtest_makereport' hook uses deprecated __multicall__ argument
WC1 None pytest_funcarg__cov: declaring fixtures using "pytest_funcarg__" prefix is deprecated and scheduled to be removed in pytest 4.0.  Please remove the prefix and use the @pytest.fixture decorator in
stead.
WC1 None pytest_funcarg__caplog: declaring fixtures using "pytest_funcarg__" prefix is deprecated and scheduled to be removed in pytest 4.0.  Please remove the prefix and use the @pytest.fixture decorator
 instead.
WC1 None pytest_funcarg__capturelog: declaring fixtures using "pytest_funcarg__" prefix is deprecated and scheduled to be removed in pytest 4.0.  Please remove the prefix and use the @pytest.fixture decor
ator instead.
======================================================= 1 failed, 1065 passed, 26 skipped, 3 xfailed, 1 xpassed, 5 pytest-warnings in 421.41 seconds =======================================================```

To make debugging easier, log contents of fetched URLs at trace level.
Browsers process Javascript, handle iframes, and more that flexget
doesn't; seeing the actual data processed by flexget rather than looking
at the URL in your browser makes it easier to figure out why you're not
getting any results.
Copy link
Copy Markdown
Member

@liiight liiight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not even sure what data will be added here and what benefit this has.

log.debug('Fetching URL %s with args %s and kwargs %s', url, args, kwargs)
result = super(Session, self).request(method, url, *args, **kwargs)
try:
log.trace('Contents for URL %s start on next line', url)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a log entry that says that there's more log in another line is just silly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collapsed to one line.

result = super(Session, self).request(method, url, *args, **kwargs)
try:
log.trace('Contents for URL %s start on next line', url)
log.trace('%s', result.text)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The safe way to do this is test if result even has text:

if hasattr(result, 'text'):
   ...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requests/models:Request defines the text property, so this will always be available.

try:
log.trace('Contents for URL %s start on next line', url)
log.trace('%s', result.text)
except UnicodeDecodeError:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no reason to catch a unicode error here, stuff like that should be handled (and are handled) in larger scale.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two problems if this error isn't caught here, and I've added a comment to the code describing them.

  1. The call stack unwinds until the error is caught, so turning up the log level causes a change in behaviour, which is bad.
  2. This stack trace is dumped to stderr:
    Traceback (most recent call last):
    File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
    File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/logging/handlers.py", line 156, in shouldRollover
    msg = "%s\n" % self.format(record)
    File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/logging/init.py", line 734, in format
    return fmt.format(record)
    File "/Users/johntobin/src/upstream-flexget/flexget/logger.py", line 142, in format
    return logging.Formatter.format(self, record)
    File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/logging/init.py", line 465, in format
    record.message = record.getMessage()
    File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/logging/init.py", line 329, in getMessage
    msg = msg % self.args
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 9604: ordinal not in range(128)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I under that this raises exception without the except but we have an entirely different issue that deals with this problem.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand your comment :( Would you prefer that I remove the exception handling?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tobinjt I think he just meant we are dealing with that logging crash in #1558, and we shouldn't sprinkle code that handles that exception around every logging call.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gazpachoking Thanks, that cleared it up, and I've removed the exception handling.

@tobinjt
Copy link
Copy Markdown
Contributor Author

tobinjt commented Dec 26, 2016

@liiight
Copy link
Copy Markdown
Member

liiight commented Dec 26, 2016

OK and what does that data provide you with? What problem is this solving?

@tobinjt
Copy link
Copy Markdown
Contributor Author

tobinjt commented Dec 26, 2016

I can log the exact data that Flexget processed when I'm trying to debug why my config isn't working. E.g. with extratorrent.cc they started returning a block of encoded Javascript rather than HTML; my browser shows me useful results but Flexget doesn't get anything useful out of the page.

#1558 will systemically handle
this problem.
@liiight liiight merged commit 64c9a76 into Flexget:develop Dec 27, 2016
try:
log.trace('Contents for URL %s start on next line', url)
log.trace('%s', result.text)
except UnicodeDecodeError:
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

gazpachoking added a commit that referenced this pull request Jan 23, 2017
@gazpachoking gazpachoking mentioned this pull request Jan 23, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants