You might not know it, but some of your web pages might be blocking Google. If Google cannot access all of your web pages, you’re losing visitors and sales. Here are five reasons why Google might not be able to access your pages:
1. Errors in the robots.txt file of your website keep Google away
The disallow directive of the robots.txt file is an easy way to exclude single files or whole directories from indexing. To exclude individual files, add this to your robots.txt file:
User-agent: *
Disallow: /directory/name-of-file.html
To exclude whole directories, use this:
User-agent: *
Disallow: /first-directory/
Disallow: /second-directory/
Note that your website visitors can still see the pages that you exclude in the robots.txt file. Check your website with the website audit tool in SEOprofiler to find out if there are any issues with the robots.txt file.
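If you want to double-check a single URL against your robots.txt rules, Python’s standard urllib.robotparser module applies them the same way a well-behaved robot would. This is a minimal sketch; the example.com URLs are placeholders for your own site:

from urllib.robotparser import RobotFileParser

# Load and parse the live robots.txt file of your site (placeholder URL).
robots = RobotFileParser()
robots.set_url("https://www.example.com/robots.txt")
robots.read()

# Check whether a specific page is blocked for a given user agent.
url = "https://www.example.com/first-directory/some-page.html"
for agent in ("*", "Googlebot"):
    allowed = robots.can_fetch(agent, url)
    print(f"{agent}: {'allowed' if allowed else 'blocked'} -> {url}")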
2. Your pages use the meta robots noindex tag
If a page contains this tag in its HTML head section, search engines won’t index the page and they also won’t follow the links on the page:
<meta name="robots" content="noindex, nofollow">
If you want search engines to follow the links on the page, use this tag instead:
<meta name="robots" content="noindex, follow">
The page won’t appear on Google’s result pages then, but the links on it will be followed. If you want to make sure that Google indexes all of your pages, remove this tag.
The meta robots noindex tag only influences search engine robots. Regular visitors of your website can still see the pages. The website audit tool in SEOprofiler will also inform you about issues with the meta robots noindex tag.
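To spot-check a single page, you can also fetch its HTML and look for the meta robots tag yourself. This is a minimal sketch; the URL is a placeholder and the pattern only covers the common meta name="robots" form:

import re
import urllib.request

# Fetch the raw HTML of the page you want to check (placeholder URL).
url = "https://www.example.com/some-page.html"
html = urllib.request.urlopen(url).read().decode("utf-8", errors="replace")

# Look for a meta robots tag and report whether it contains "noindex".
match = re.search(r'<meta[^>]+name=["\']robots["\'][^>]*>', html, re.IGNORECASE)
if match and "noindex" in match.group(0).lower():
    print("noindex found:", match.group(0))
else:
    print("no meta robots noindex tag found")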
3. Your pages send the wrong HTTP status code
The HTTP status code that a page returns tells Google how to handle that page. For example:
- 301 moved permanently: this request and all future requests should be sent to a new URL.
- 403 forbidden: the server refuses to respond to the request.
The website audit tool in SEOprofiler shows the different status codes that are used by your website and it also highlights pages with problematic status codes.
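If you want to spot-check the status codes of a few important URLs yourself, a short script with Python’s standard http.client module will show the raw code each URL returns. This is a minimal sketch with placeholder URLs; redirects are not followed, so you see the exact code (200, 301, 403, …) that each URL sends:

import http.client
from urllib.parse import urlparse

# Placeholder URLs: replace with pages from your own site (https assumed).
urls = [
    "https://www.example.com/",
    "https://www.example.com/old-page.html",
]

for url in urls:
    parts = urlparse(url)
    # A plain HEAD request keeps the check lightweight and shows the
    # status code that this exact URL returns.
    connection = http.client.HTTPSConnection(parts.netloc, timeout=10)
    connection.request("HEAD", parts.path or "/")
    response = connection.getresponse()
    print(url, response.status, response.reason)
    connection.close()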
4. Your pages are password protected
If your pages are password protected, search engine robots won’t be able to access them. Password protected pages can also have a negative influence on the user experience, so test this thoroughly.
5. Your pages require cookies or JavaScript
Cookies and JavaScript can also keep search engine robots away from your door. For example, you can hide content by making it only accessible to user agents that accept cookies.
It might be that your web pages use very complex JavaScript to render your content. Most search engine robots do not execute complex JavaScript code, so they won’t be able to read your pages. Google can parse these pages to some extent, but you’re making it unnecessarily difficult.
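A quick way to see whether important content depends on JavaScript is to fetch the raw HTML, which is roughly what a simple robot sees, and check whether a key phrase is present. This is a minimal sketch; the URL and the phrase are placeholders:

import urllib.request

# Placeholder URL and phrase: use a page from your site and a piece of
# text that matters on that page (e.g. a product name or headline).
url = "https://www.example.com/javascript-heavy-page.html"
phrase = "important product name"

# Fetch the raw HTML without executing any JavaScript and without cookies,
# which is roughly what a simple search engine robot sees.
html = urllib.request.urlopen(url).read().decode("utf-8", errors="replace")

if phrase.lower() in html.lower():
    print("The phrase is in the raw HTML; robots can see it.")
else:
    print("The phrase is missing; it is probably only added by JavaScript.")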
How to find these problems on your website
In general, you want Google to index your pages. For that reason, it is important to find potential problems on your site. The website audit tool in SEOprofiler locates all issues on your site and it also shows you how to fix these problems. If you haven’t done it yet, try SEOprofiler now: