• Resolved speechfree

    (@speechfree)


    The SEO Framework is adding X-Robots-Tag: noindex to sitemap.xml files. I need to remove this header but there’s no filter for it.

    I found filters like the_seo_framework_robots_meta_array for meta tags, but nothing for HTTP headers on sitemaps.

    Can you add a filter so we can unset it, or explain why this is happening? Is there a setting in the plugin that I am missing?

    Thanks!

  • Plugin Author Sybre Waaijer

    (@cybr)

    Hello!

    The X-Robots-Tag: noindex header on sitemaps is intentional: it prevents the sitemap itself from appearing in search results while still allowing search engines to discover and use the URLs within it.

    This header is output on the virtual sitemap.xml, sitemap.xsl, and robots.txt ‘files’.

    If you truly need to remove this header, you can use the the_seo_framework_set_noindex_header filter. Hook a little early into the_seo_framework_sitemap_header to specifically disable it for sitemaps:

    add_action(
    	'the_seo_framework_sitemap_header',
    	function () {
    		add_filter( 'the_seo_framework_set_noindex_header', '__return_false' );
    	},
    	9, // before default priority 10, where the header is set
    );
    

    However, please be aware that removing this header may cause the sitemap to be indexed in search results, which is generally unwanted.
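
    If you want to double-check the result, here is a minimal sketch (it assumes the default /sitemap.xml location and must run in a WordPress context, e.g. via WP-CLI’s wp eval-file). It requests the sitemap and prints whatever X-Robots-Tag header is still being sent; inspecting the response headers in your browser’s developer tools works just as well:

    // Verification sketch: assumes the snippet above is active and the
    // sitemap lives at the default /sitemap.xml location.
    $response = wp_remote_head( home_url( '/sitemap.xml' ) );

    if ( is_wp_error( $response ) ) {
    	echo 'Request failed: ' . $response->get_error_message() . PHP_EOL;
    } else {
    	$header = wp_remote_retrieve_header( $response, 'x-robots-tag' );
    	echo $header
    		? 'X-Robots-Tag is still present: ' . $header . PHP_EOL
    		: 'No X-Robots-Tag header found.' . PHP_EOL;
    }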

    Thread Starter speechfree

    (@speechfree)

    Hello,

    Thanks for the quick reply. We did not know that it was intentional, but we are having a problem with SE Ranking saying that the XML sitemap is missing. Also, Google Search Console cannot fetch the sitemap even though it is publicly accessible. So we thought that X-Robots-Tag: noindex was the issue.

    Do you have any idea about this issue? What would you suggest?

    Thanks

    Plugin Author Sybre Waaijer

    (@cybr)

    Hello!

    The X-Robots-Tag: noindex header is not the cause of this issue — it’s enabled for all plugin users, and we rarely see this issue pop up.

    If you do a URL Inspection for the sitemap, it will tell you it’s not indexable — this is expected and correct behavior.

    If you go to the Sitemaps Report and Google states it cannot fetch the sitemap, there can be many reasons:

    1. Google Search Console lag: GSC reports are often delayed by a few days. The sitemap may already be processed successfully, but the report hasn’t updated yet. Check the “Last read” date in the Sitemaps Report.

    2. Robots.txt blocking: Check if your robots.txt file is blocking access to the sitemap. Visit /robots.txt on your site and ensure there’s no Disallow rule affecting the sitemap path or Googlebot.

    3. Server timeouts: If your site is slow or was temporarily offline, Google’s crawler may time out. Check your server error logs.

    4. Security plugins: Some security plugins block requests that don’t have typical browser user agents. Temporarily disable security plugins to test, or tweak their settings to allow Googlebot and /sitemap.xml access (see the sketch after this list).

    5. Caching issues: We are aware that some caching plugins can interfere with the output (a patch is pending). Try clearing all caches or temporarily disabling the caching plugin, then test again.

    6. CDN or firewall blocking: Services like Cloudflare or Sucuri may block Googlebot. Check your firewall/CDN logs for blocked requests.

    7. Incorrect sitemap URL: Ensure you’re submitting the correct URL. With The SEO Framework, the default sitemap is at /sitemap.xml.
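
    As a quick way to test points 4 and 6, the sketch below (again assuming the default /sitemap.xml location) fetches the sitemap with a Googlebot-like User-Agent and prints the response code, so you can compare it against a normal browser request. Note that a request made from the server itself may bypass an external CDN, so for point 6 also test from an outside machine:

    // Diagnostic sketch: fetch the sitemap with a Googlebot-like User-Agent
    // to see whether a security plugin or firewall responds differently.
    $response = wp_remote_get(
    	home_url( '/sitemap.xml' ),
    	array(
    		'user-agent' => 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    		'timeout'    => 15,
    	)
    );

    if ( is_wp_error( $response ) ) {
    	echo 'Fetch failed: ' . $response->get_error_message() . PHP_EOL;
    } else {
    	echo 'HTTP status: ' . wp_remote_retrieve_response_code( $response ) . PHP_EOL;
    	echo 'Content-Type: ' . wp_remote_retrieve_header( $response, 'content-type' ) . PHP_EOL;
    }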

    Could you share the specific error message Google Search Console displays, along with the sitemap URL? That will help narrow down the cause.
