Getting 404 when trying to scrape hashtag #2144

@mapto

Describe the bug
When I try to scrape a hashtag, I get "JSON Query to explore/tags/foodie/: 404 Not Found [retrying; skip with ^C]".
However, https://www.instagram.com/explore/tags/foodie/ is clearly a normal, active hashtag.
Other hashtags I tried produce the same error.

To Reproduce

from instaloader import Instaloader, Hashtag
from secret import USER

L = Instaloader()
L.load_session_from_file(USER)  # session file duly created
print(f"Logged in as: {L.test_login()}")

h = Hashtag.from_name(L.context, "foodie")

Expected behavior
Hashtag.from_name() should return the hashtag so that I can iterate over h.get_posts_resumable().

Error messages and tracebacks

JSON Query to explore/tags/foodie/: 404 Not Found [retrying; skip with ^C]
JSON Query to explore/tags/foodie/: 404 Not Found [retrying; skip with ^C]

---------------------------------------------------------------------------
QueryReturnedNotFoundException            Traceback (most recent call last)
File /usr/local/lib/python3.10/dist-packages/instaloader/instaloadercontext.py:405, in InstaloaderContext.get_json(self, path, params, host, session, _attempt, response_headers)
    404 if resp.status_code == 404:
--> 405     raise QueryReturnedNotFoundException("404 Not Found")
    406 if resp.status_code == 429:

QueryReturnedNotFoundException: 404 Not Found

During handling of the above exception, another exception occurred:

QueryReturnedNotFoundException            Traceback (most recent call last)
File /usr/local/lib/python3.10/dist-packages/instaloader/instaloadercontext.py:405, in InstaloaderContext.get_json(self, path, params, host, session, _attempt, response_headers)
    404 if resp.status_code == 404:
--> 405     raise QueryReturnedNotFoundException("404 Not Found")
    406 if resp.status_code == 429:

QueryReturnedNotFoundException: 404 Not Found

During handling of the above exception, another exception occurred:

QueryReturnedNotFoundException            Traceback (most recent call last)
File /usr/local/lib/python3.10/dist-packages/instaloader/instaloadercontext.py:405, in InstaloaderContext.get_json(self, path, params, host, session, _attempt, response_headers)
    404 if resp.status_code == 404:
--> 405     raise QueryReturnedNotFoundException("404 Not Found")
    406 if resp.status_code == 429:

QueryReturnedNotFoundException: 404 Not Found

The above exception was the direct cause of the following exception:

QueryReturnedNotFoundException            Traceback (most recent call last)
Cell In[5], line 6
      2 hashtag = "foodie"
      4 print(L.context)
----> 6 h = Hashtag.from_name(L.context, hashtag)
      7 print(h)
      8 h

File /usr/local/lib/python3.10/dist-packages/instaloader/structures.py:1662, in Hashtag.from_name(cls, context, name)
   1660 # pylint:disable=protected-access
   1661 hashtag = cls(context, {'name': name.lower()})
-> 1662 hashtag._obtain_metadata()
   1663 return hashtag

File /usr/local/lib/python3.10/dist-packages/instaloader/structures.py:1676, in Hashtag._obtain_metadata(self)
   1674 def _obtain_metadata(self):
   1675     if not self._has_full_metadata:
-> 1676         self._node = self._query({"__a": 1, "__d": "dis"})
   1677         self._has_full_metadata = True

File /usr/local/lib/python3.10/dist-packages/instaloader/structures.py:1671, in Hashtag._query(self, params)
   1670 def _query(self, params):
-> 1671     json_response = self._context.get_json("explore/tags/{0}/".format(self.name), params)
   1672     return json_response["graphql"]["hashtag"] if "graphql" in json_response else json_response["data"]

File /usr/local/lib/python3.10/dist-packages/instaloader/instaloadercontext.py:435, in InstaloaderContext.get_json(self, path, params, host, session, _attempt, response_headers)
    433         if is_other_query:
    434             self._rate_controller.handle_429('other')
--> 435     return self.get_json(path=path, params=params, host=host, session=sess, _attempt=_attempt + 1,
    436                          response_headers=response_headers)
    437 except KeyboardInterrupt:
    438     self.error("[skipped by user]", repeat_at_end=False)

File /usr/local/lib/python3.10/dist-packages/instaloader/instaloadercontext.py:435, in InstaloaderContext.get_json(self, path, params, host, session, _attempt, response_headers)
    433         if is_other_query:
    434             self._rate_controller.handle_429('other')
--> 435     return self.get_json(path=path, params=params, host=host, session=sess, _attempt=_attempt + 1,
    436                          response_headers=response_headers)
    437 except KeyboardInterrupt:
    438     self.error("[skipped by user]", repeat_at_end=False)

File /usr/local/lib/python3.10/dist-packages/instaloader/instaloadercontext.py:423, in InstaloaderContext.get_json(self, path, params, host, session, _attempt, response_headers)
    421 if _attempt == self.max_connection_attempts:
    422     if isinstance(err, QueryReturnedNotFoundException):
--> 423         raise QueryReturnedNotFoundException(error_string) from err
    424     else:
    425         raise ConnectionException(error_string) from err

QueryReturnedNotFoundException: JSON Query to explore/tags/foodie/: 404 Not Found
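
To make the chained traceback easier to read, here is a minimal stand-alone model of the retry pattern visible in the frames above: get_json catches the 404, calls itself with _attempt + 1, and re-raises with the "JSON Query to ..." prefix once _attempt reaches max_connection_attempts. This is a simplified sketch, not Instaloader's actual code. The fetch and log parameters are hypothetical stand-ins for the HTTP request and the error output, and the exception class is a local stand-in for the library's class of the same name.

```python
class QueryReturnedNotFoundException(Exception):
    """Local stand-in for instaloader's exception of the same name."""


def get_json(path, fetch, max_connection_attempts=3, _attempt=1, log=print):
    """Simplified model of the retry loop seen in the traceback.

    `fetch` (hypothetical) takes a path and returns an HTTP status code.
    `log` (hypothetical) receives the retry messages.
    """
    try:
        status = fetch(path)
        if status == 404:
            raise QueryReturnedNotFoundException("404 Not Found")
        return status
    except QueryReturnedNotFoundException as err:
        error_string = f"JSON Query to {path}: {err}"
        # Last attempt: give up and chain the final error to its cause.
        if _attempt == max_connection_attempts:
            raise QueryReturnedNotFoundException(error_string) from err
        # Otherwise report the failure and retry from inside the handler,
        # which is why each retry adds a "During handling of the above
        # exception" section to the traceback.
        log(error_string + " [retrying; skip with ^C]")
        return get_json(path, fetch, max_connection_attempts, _attempt + 1, log)
```

With a fetch that always returns 404 and a limit of three attempts, this model logs two retry messages and then raises, which is consistent with the two "[retrying; skip with ^C]" lines and the three nested "404 Not Found" exceptions shown above. It only illustrates why the output looks the way it does; it says nothing about why Instagram answers 404 for this endpoint.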

Instaloader version
4.10.2

Labels: bug, good first issue
