szepeviktor/multipart-robotstxt-editor

Multipart robots.txt editor

Useful robots.txt-reading bot categories

  • search engine crawlers
  • feed bots
  • SEO crawlers
  • social crawlers
  • advertisement checker bots
  • archivers

Search engine crawler user agent IDs

  • Googlebot
  • Googlebot-Image
  • Googlebot-Mobile
  • bingbot
  • BingPreview
  • msnbot
  • Yahoo! Slurp
  • YandexBot
  • YandexImages
  • MJ12bot
  • Baiduspider

Feed bots user agent IDs

  • Feedfetcher-Google
  • Feedstripes

SEO crawlers user agent IDs

  • AhrefsBot

Social crawler user agent IDs

  • facebookexternalhit

Advertisement checker bot user agent IDs

  • AdsBot-Google

Archivers user agent IDs

  • ia_archiver

Command line tools and HTTP libraries user agent IDs

  • Wget
  • curl
  • libwww-perl
  • Python-urllib
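
As an illustration of how the user agent IDs above could be grouped in a robots.txt file, here is a minimal Python sketch. It is not the plugin's code (the plugin targets WordPress); the `POLICY` mapping, the disallowed paths, and the `build_robots_txt` helper are assumptions made up for this example. The result is checked with the standard library's `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy for illustration only: category -> (user agent IDs, disallowed paths).
POLICY = {
    "seo": (["AhrefsBot"], ["/"]),
    "tools": (["Wget", "curl", "libwww-perl", "Python-urllib"], ["/"]),
    "default": (["*"], ["/wp-admin/"]),
}


def build_robots_txt(policy):
    """Emit one group per category: its User-agent lines followed by its Disallow lines."""
    groups = []
    for agents, paths in policy.values():
        lines = [f"User-agent: {ua}" for ua in agents]
        lines += [f"Disallow: {path}" for path in paths]
        groups.append("\n".join(lines))
    # Groups are separated by a blank line.
    return "\n\n".join(groups) + "\n"


robots_txt = build_robots_txt(POLICY)

# Parse the generated text to see which rules apply to which user agent.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
```

With this assumed policy, `parser.can_fetch("AhrefsBot", "/page")` is false, while a crawler without its own group, such as Googlebot, falls back to the `*` group and is only kept out of `/wp-admin/`.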

TODO

  • vs_api: needs to save once to create the option -> admin_notice

  • vs_api: option autoload on/off option

  • vs_api: required fields

  • vs_api: settings: return admin_notice

  • vs_api: phpdoc

  • vs_api: i18n

  • vs_api: row, col, size args

  • vs_api: legends! where?

  • vs_api: TODO radios

  • vs_api: TODO multi checkboxes

  • vs_api: HTML textarea + editor (look at settings-api-tabs-demo)

  • vs_api: issue: ideas for Tabs (separate pages, 1 page + js hide/show, 1 page + ?tab=)

  • admin notice in case of subdir, parse_url(home URL)

  • At least one "Disallow" field must be present in the robots.txt file. - check for that
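
The last two TODO items describe checks that are easy to sketch. The Python below is only an illustration under assumptions: the function names are hypothetical, and the plugin itself would do this in PHP (the TODO mentions `parse_url`). The first helper detects a home URL that carries a path, the second looks for at least one "Disallow" field, where an empty value still counts.

```python
from urllib.parse import urlparse


def home_is_in_subdir(home_url):
    """Return True when the home URL has a non-root path, e.g. https://example.com/blog."""
    path = urlparse(home_url).path
    return path.strip("/") != ""


def has_disallow_field(robots_txt):
    """Return True when at least one line is a Disallow field (its value may be empty)."""
    for raw in robots_txt.splitlines():
        # Drop comments, then split the "name: value" pair on the first colon.
        line = raw.split("#", 1)[0].strip()
        name, sep, _value = line.partition(":")
        if sep and name.strip().lower() == "disallow":
            return True
    return False
```

A subdirectory result is what would trigger the admin notice, since a robots.txt served from a subdirectory is not the one crawlers request from the host root.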

README

  • one day transient with fallback to WP records
  • file creation instruction: wget -O ABSPATH . "robots.txt" home . "robots.txt"
  • subdir installs with path in Site Address (Settings / General)
  • no run on Settings / Reading / "Discourage search engines from indexing this site"
  • about FIXME: several UAs, one of which is "*"
  • recommended sitemaps: http://smythies.com/robots.txt http://www.lemgo.net/robots.txt
  • video: (you can drag&drop it into the URL field below after emptying that field)
  • video: "URL of the remote robots.txt" default value is local (more about other defaults)
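
One possible reading of the first README item (a one-day transient with a fallback) is a cache that keeps the remote robots.txt for a day and serves locally stored content when the remote fetch fails. The Python sketch below illustrates that reading only. The `DayCache` class, its parameters, and the injected `fetch` callable are hypothetical; in WordPress the storage would presumably be a transient rather than an in-memory object.

```python
import time

ONE_DAY = 24 * 60 * 60  # cache lifetime in seconds


class DayCache:
    """Cache a fetched value for `ttl` seconds; serve `fallback` when fetching fails."""

    def __init__(self, fetch, fallback, ttl=ONE_DAY, clock=time.time):
        self._fetch = fetch        # callable returning the remote content, may raise OSError
        self._fallback = fallback  # locally stored content used when the fetch fails
        self._ttl = ttl
        self._clock = clock        # injectable so expiry can be exercised without waiting
        self._value = None
        self._expires = 0.0

    def get(self):
        now = self._clock()
        if self._value is not None and now < self._expires:
            return self._value  # still fresh, no remote request
        try:
            value = self._fetch()
        except OSError:
            # Network errors (urllib.error.URLError is an OSError) fall back to local content.
            return self._fallback
        self._value = value
        self._expires = now + self._ttl
        return value
```

A failed fetch is not cached in this sketch, so the next call tries the remote again.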

About

Customize your WordPress site's robots.txt and include remote content in it.
