Wikipedia:Controlling search engine indexing
Wikipedia controls search engine indexing in several ways. The practice is commonly termed "noindexing" on Wikipedia, because the default position is that pages are indexed. Most of the methods rely on the noindex HTML meta tag, which tells search engines not to index a page; MediaWiki:Robots.txt instead asks crawlers not to fetch the listed paths. Respecting these signals, especially in terms of removing already indexed content, is up to the individual search engine, and in theory they may be ignored entirely.
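As an illustration of what the noindex meta tag looks like to a crawler, the following is a minimal Python sketch that checks whether a page's HTML carries a robots meta tag with a noindex directive. The class name `RobotsMetaParser`, the function `is_noindexed` and the sample HTML are hypothetical and exist only for this example; real search engines apply their own parsing and policies.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser already lowercases attribute names; values may be None.
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list such as "noindex,follow".
        for directive in attributes.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def is_noindexed(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Sample markup is an assumption for illustration, not copied from a live page.
sample = '<html><head><meta name="robots" content="noindex,follow"/></head></html>'
print(is_noindexed(sample))
```

The check only reports what the page requests; whether a search engine acts on it is outside the page's control.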
The control methods are:
- Controlling an entire namespace, via MediaWiki software settings
- Controlling classes of pages, via MediaWiki:Robots.txt (Wikipedia's Robots.txt file)
- Controlling individual pages by adding the __NOINDEX__ magic word into them, either directly or using the {{NOINDEX}} template.
- Controlling pages by adding the __NOINDEX__ magic word into standard templates used in certain situations.
Namespace and robots.txt
Namespace control
On English Wikipedia the entire User talk: namespace is automatically noindexed via a software setting.[1]
At the same time, __NOINDEX__ is disabled in article space and consequently has no effect there.[2]
Robots.txt noindexing
MediaWiki:Robots.txt noindexes sensitive or potentially sensitive types of page, primarily in the Wikipedia namespace - for example deletion debates.
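A robots.txt rule works by asking crawlers not to fetch matching paths. The sketch below uses Python's standard `urllib.robotparser` module to show how a well-behaved crawler might evaluate such a rule. The rule in `SAMPLE_RULES`, the user agent name `ExampleBot` and the helper `may_fetch` are assumptions made for illustration and are not copied from the live MediaWiki:Robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule in the style of a deletion-debate exclusion; illustrative only.
SAMPLE_RULES = [
    "User-agent: *",
    "Disallow: /wiki/Wikipedia:Articles_for_deletion/",
]


def may_fetch(path: str, rules=SAMPLE_RULES, user_agent: str = "ExampleBot") -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(rules)  # parse rules from memory instead of fetching them over the network
    return parser.can_fetch(user_agent, path)


print(may_fetch("/wiki/Wikipedia:Articles_for_deletion/Example"))
print(may_fetch("/wiki/Example"))
```

Because the rule matches by path prefix, it covers a whole class of pages at once, which is why this method suits groups such as deletion debates rather than individual pages.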
NOINDEX magic word
Individual pages
Individual pages can be noindexed by adding the __NOINDEX__ magic word into that page, either directly or using the {{NOINDEX}} template. However, __NOINDEX__ is disabled in article space and consequently has no effect there. Therefore this list of articles containing NOINDEX should be empty (although it does little harm if it is not). Pages with the keyword are listed in Category:Noindexed pages.[3]
Standard template noindexing
Some standard templates include the __NOINDEX__ keyword, thereby noindexing pages to which the templates are applied. Such templates should be listed in Category:Wikipedia templates which apply NOINDEX.
Biographies of Living Persons talkpage noindexing
The templates {{BLP}} and {{BLP others}} include the {{NOINDEX}} template. The {{BLP}} template is added automatically by the {{WikiProject Biography}} talkpage template, if given the parameter |living=yes; see the documentation of that template for more details. Pages using these templates are automatically categorised in Category:Biography articles of living people.
Other templates
These templates include {{NOINDEX}} together with a relevant message:
- {{User sandbox}}, {{Userspace draft}}
- {{Sockpuppet}}, {{Sockpuppeteer}}, {{IPsock}}, {{Banned user}}, {{Blocked user}} and others
- {{Db-meta}} and hence the various speedy deletion templates built on it
See also Category:Wikipedia templates which apply NOINDEX.
- {{Uw-userspacenoindex}} provides a user warning message for inappropriate use of userspace which required noindexing.
INDEX magic word
Individual pages
Individual pages can override namespace noindexing by adding the __INDEX__ magic word into that page, either directly or using the {{INDEX}} template. Such pages appear in Category:Indexed pages. However, __INDEX__ does not override noindexing via MediaWiki:Robots.txt.[4]
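The interaction between the namespace default and the two magic words can be summarised in a small Python model. This is a simplified sketch built only from the rules stated on this page, not MediaWiki's actual implementation. The names `Page`, `robots_policy`, `NOINDEXED_NAMESPACES` and `EXEMPT_NAMESPACES` are hypothetical. Two further assumptions are made: that __INDEX__ is ignored in article space in the same way as __NOINDEX__, and that __INDEX__ wins when both magic words are present. Robots.txt is deliberately left out because it acts on crawling separately and is not overridden by either magic word.

```python
from dataclasses import dataclass

# Namespaces noindexed by the software setting described above (English Wikipedia).
NOINDEXED_NAMESPACES = frozenset({"User talk"})
# Namespaces where the magic words have no effect; "" stands for article (main) space.
EXEMPT_NAMESPACES = frozenset({""})


@dataclass(frozen=True)
class Page:
    namespace: str
    has_noindex: bool = False  # page contains __NOINDEX__ or {{NOINDEX}}
    has_index: bool = False    # page contains __INDEX__ or {{INDEX}}


def robots_policy(page: Page) -> str:
    """Return "index" or "noindex" for a page under this simplified model."""
    # Step 1: start from the namespace default.
    policy = "noindex" if page.namespace in NOINDEXED_NAMESPACES else "index"
    # Step 2: in exempt namespaces the magic words are ignored.
    if page.namespace in EXEMPT_NAMESPACES:
        return policy
    # Step 3: magic words override the namespace default.
    if page.has_noindex:
        policy = "noindex"
    if page.has_index:
        # Assumption of this sketch: __INDEX__ wins if both words are present.
        policy = "index"
    return policy


print(robots_policy(Page("User talk")))
print(robots_policy(Page("User talk", has_index=True)))
print(robots_policy(Page("", has_noindex=True)))
```

Under this model a user talk page is noindexed unless it carries __INDEX__, and an article stays indexed even if someone adds __NOINDEX__ to it.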
Past discussions
Namespace discussions
- Wikipedia:Requests for comment/User page indexing
- Wikipedia:Search engine indexing – Proposal to change the namespace settings for indexing
- Wikipedia:NOINDEX of noticeboards – Dead/moot proposal to NOINDEX noticeboards
- Wikipedia:Village pump (proposals)/Archive_35#Namespaces in Robot.txt – Proposal to noindex several obscure namespaces like "Image talk." Strong majority opposed.
- Wikipedia:Village pump (proposals)/Archive 36#Re-enable searches in the user talk space – Proposal to re-index user talk pages. Majority opposed.
- Wikipedia:Village pump (policy)/Archive 59#NOINDEX of all non-content namespaces – Mixed discussion to exclude all non-content namespaces from indexing.
- Wikipedia:Village_pump_(policy)/Archive_62#Where_and_when_to_use_NOINDEX_to_remove_pages_from_search_engines – Proposal to exclude certain pages from indexing.
- Wikipedia:Talk pages not indexed by Google – A proposal to tell Google not to index the Talk: namespace.
Individual template discussions
- Template talk:Non-free_media#Adding NOINDEX – Proposal to NOINDEX non-free images. No consensus.
- Template talk:WPBiography#Noindex – Proposal to NOINDEX BLP talk page template
- Template talk:Administrators' noticeboard navbox all – NOINDEX on AN archives template
Current issues
- bug 24169 – "Create an __NOINDEX__ equivalent to prevent indexing by internal search engine"
Notes
- ^ This is $wgNamespaceRobotPolicies. See Wikimedia's $wgNamespaceRobotPolicies setting for enwiki
- ^ This is controlled by the MediaWiki software setting $wgExemptFromUserRobotsControl, which defaults to $wgContentNamespaces, which is set to main space on almost all Wikimedia projects – see here and here.
- ^ The listing is done by MediaWiki tracking the keyword. The category name is determined by MediaWiki:Noindex-category.
- ^ It does override mw:Manual:$wgArticleRobotPolicies, but on English Wikipedia this is only used for two pages anyway: Wikimedia's $wgArticleRobotPolicies setting for enwiki