What If You Don't Want Your Pages To Be Crawled and Cached by Search Engines
Nowadays there are many ways to create web pages, and you don't need a full-blown website to publish one. You may have web pages that you want to hide from the general public. The pages meet the following criteria:
Consider this scenario:
One day you created a page and you didn't put a link to it on your site. Then you told your family members about the page's URL. You thought nobody else would find it.
You just made a mistake. Google and Yahoo will find your page if you or anyone in your family ever visits it with either the Google Toolbar (with PageRank enabled) or the Yahoo Companion Toolbar installed.
The Google PageRank function records the URL you're visiting
When you use the Google Toolbar with PageRank enabled, the toolbar automatically sends the URL of the page you're visiting to Google, where it is recorded in Google's database. If the URL is not already in that database, Googlebot, Google's web crawler, will visit the page later to index it.
When you install the Google Toolbar, Google does remind you that information about the pages you visit will be sent to Google. The following is quoted from Google Toolbar installation step 2, Choose Your Configuration:
By using the advanced features of the Google Toolbar, you may be sending information about the sites you visit to Google. This is needed to make some features of the Toolbar available to you.
Your surfing activity is tracked whether you use the Google Toolbar to search the web or type a page's URL directly into the Google search page. Google records your visits either way.
One day, when you check which pages on your site have been indexed by Google, your hidden page comes up and you are worried. Furthermore, the page is cached: even if you remove it from your site, it can still be found and viewed from the cached version.
How to check which pages have been indexed?
Go to Google and type in "site:www.yoursite.com" without the quotes. This query lists all the pages that have been indexed, but it will only display up to 999 records, as this is the limit set by Google for any query.
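If you check several sites regularly, the query above can be turned into a search URL with a few lines of Python. This is only a small sketch: the function name `site_query_url` is made up for this example, and it assumes Google's usual `https://www.google.com/search?q=...` address format.

```python
from urllib.parse import urlencode


def site_query_url(hostname):
    """Build a Google search URL for the query site:<hostname>.

    Hypothetical helper for illustration; it assumes the standard
    https://www.google.com/search?q=... URL format.
    """
    query = f"site:{hostname}"
    # urlencode percent-encodes the colon, so the query survives as one parameter.
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Open the printed address in a browser to see the indexed pages.
    print(site_query_url("www.yoursite.com"))
```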
How to prevent your hidden pages from being indexed and cached?
One simple but not sound solution is to disable the PageRank function on the toolbar. To stop Google from automatically tracking your surfing information, you can uncheck the PageRank checkbox to disable it.
Steps to disable PageRank function:
Unfortunately, disabling the PageRank function is not going to solve your problem completely because, in our example, your other family members could still have PageRank enabled.
A sound solution
Your problem can be tackled by using the robots meta tag in HTML. The following two tags are what you need. Put them in the <head> section of each HTML document you want to hide.
<meta name="robots" content="noindex,nofollow,noimageindex">
<meta name="robots" content="noarchive">
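Once the tags are in place, it can be worth confirming that a page really contains them. The following is a minimal sketch in Python using the standard library's `html.parser`. The names `RobotsMetaParser`, `robots_directives`, and `missing_directives` are hypothetical, chosen for this example, and the choice of which directives count as required is an assumption you can adjust.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for item in attr_map.get("content", "").split(","):
            item = item.strip().lower()
            if item:
                self.directives.add(item)


def robots_directives(html_text):
    """Return the set of robots directives declared in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.directives


def missing_directives(html_text, required=("noindex", "nofollow", "noarchive")):
    """Return the required directives that the page does not declare."""
    found = robots_directives(html_text)
    return [name for name in required if name not in found]


if __name__ == "__main__":
    sample = (
        "<html><head>"
        '<meta name="robots" content="noindex,nofollow,noimageindex">'
        '<meta name="robots" content="noarchive">'
        "</head><body>Hidden page</body></html>"
    )
    # An empty list means every required directive is present.
    print(missing_directives(sample))
```

The check only reads the HTML you give it. It does not tell you whether a search engine has already indexed the page.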
How to remove an indexed and cached page
If your page has already been indexed and cached, you can remove it from search engine databases using one of the following two methods:
For a detailed explanation of the URL Removals tool, read the Google Webmaster Central Blog post "Requesting removal of content from our index".
One last note
Is your page now 100% hidden? Not really. If the hidden page has outbound links and you click them to navigate to other websites, the hidden page's URL will appear in those sites' web traffic logs as the HTTP referrer.
You can remove outbound links from your hidden pages if that's suitable.
Copyright © 2017 GeeksEngine.com. All Rights Reserved.