Before you begin an SEO campaign designed to improve the search engine rankings of your website, or of any individual page on it, you need to make sure Google has indexed the page in the first place. Often, when people make these checks, they find that large chunks of their site seem to be missing from the Google Index. If the site is indexed, why are pages missing?
What Google Needs to Index a Web Page
As you know, the Google Bots (not real bots, just lines of code, but the imagery is cool) are at work 24/7, crawling around the Internet and accessing web pages to add to the Google Index. However, just because you create a web page does not mean that they will automatically index it.
Before Google can rank your content, its bots need to find it, be permitted to ‘read’ it, evaluate it, and then finally index it. If anything goes wrong at any point in this process, a page may not be indexed into search results, even if other pages from the same website are. So contrary to popular opinion, just posting your content online is not enough.
As clever as they are, Google Bots are not psychic, nor do they have ‘spidey senses’ that alert them to the creation of your new page.
To index a page, Google has to be able to find it. This means that something has to link to it, whether that link comes from other pages on the same site or from other websites.
Depending on the quality of the pages that link to the new content, and how relevant they are to yours, it might take time for Google to get around to following those links and finding your pages.
This also means that the page cannot be ‘hidden’ in any way–which, for example, might mean content is password protected, blocked via robots.txt, or only available to users in certain countries.
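If you suspect robots.txt is the culprit, you can test a URL against your rules before waiting on Google. The following is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content and the example.com URLs are made up for illustration, and the standard-library parser only approximates how Google itself interprets the file.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks everything under /private/
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for path in ("/blog/new-post", "/private/draft"):
        url = "https://www.example.com" + path
        print(url, "allowed" if is_crawl_allowed(EXAMPLE_ROBOTS, url) else "blocked")
```

To check a live site instead of a pasted string, the same class offers `set_url()` and `read()` to download the file first.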
This is the reason SEOs will always advise that you create and submit sitemaps to Google regularly, and that you link any content you would like to be indexed quickly from an existing menu or from content that you are certain already appears in the Google Index. The more of a guiding hand you give the Google Bots, the more likely they are to find your content.
Just because a Google Bot finds a page does not automatically mean the page will be added to the index. Google takes the time to evaluate the page’s suitability first, and in doing so takes a number of factors into account:
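A sitemap is simply an XML file listing the URLs you would like crawled. As a rough sketch, assuming you already have your page URLs in a Python list, the snippet below builds a bare-bones file in the sitemaps.org format using only the standard library. The example.com addresses are placeholders, and optional fields such as last-modified dates are left out.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML as a string, with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped on output
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    pages = [
        "https://www.example.com/",
        "https://www.example.com/blog/new-post",
    ]
    print(build_sitemap(pages))
```

The resulting text would typically be saved as sitemap.xml at the root of the site and then submitted through Google Search Console.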
Low Quality Content
Just how Google determines that a page is low quality has something of a secret sauce element to it, and its definition changes often. The company itself often points to the Search Quality Rater Guidelines as a good guide these days, but some known factors that it considers include:
Low word count
While there is no ‘set’ word count considered perfect, anything under 300 words is more likely to be judged ‘thin content’ and rejected from Google’s Index.
Excessive use of the same keywords
Keywords are still important, but it is careful and sparing use that works, not repeating the same terms over and over.
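Both of these checks are easy to approximate yourself. The sketch below counts the visible words in a page’s HTML and measures how often a single keyword appears. The 300-word cutoff is the rule of thumb mentioned above, not an official Google threshold, and the function names and word-splitting rule are illustrative assumptions only.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text content, skipping <script> and <style> blocks."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def visible_words(html):
    """Return the lowercase words found in the page's visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.findall(r"[a-z0-9']+", " ".join(parser.parts).lower())


def content_report(html, keyword, min_words=300):
    """Summarise word count and single-word keyword density for one page."""
    words = visible_words(html)
    count = len(words)
    hits = Counter(words)[keyword.lower()]
    return {
        "word_count": count,
        "below_min": count < min_words,  # rule-of-thumb 'thin content' flag
        "keyword_hits": hits,
        "keyword_density": hits / count if count else 0.0,
    }
```

A page that comes back with a low word count and a high density for one term is worth a second look before you ask Google to index it.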
Duplicate content
If the page is a close or even direct duplicate of content found elsewhere on the web (including on the same site), it is unlikely that Google will include it in its index.
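Google does not publish how it detects duplicates, but you can get a feel for how close two pieces of text are with a simple overlap measure. The sketch below compares overlapping three-word sequences (‘shingles’) using Jaccard similarity. This is a generic text-comparison technique chosen for illustration, not Google’s method, and the shingle size is an arbitrary assumption.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Return shared shingles divided by total distinct shingles (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    original = "the quick brown fox jumps over the lazy dog"
    near_copy = "the quick brown fox jumps over the lazy cat"
    print(jaccard_similarity(original, near_copy))
```

Scores close to 1.0 suggest that one page is largely a copy of the other and probably needs rewriting or consolidating.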
Slow loading speed
Bots know humans don’t like to wait for a slow page, so they don’t wait either. Speed is known to be an increasingly important factor in the quality of a page, but it is still something that many webmasters overlook. Finding out just how fast, or slow, Google considers your page, and what needs to be done to fix any problems, is easy to do though. Simply enter the URL into Google’s PageSpeed Insights tool and it will report on the page’s performance and suggest fixes.
These are just some of the factors; there are many more. It’s one of the things that makes SEO so challenging, and one of the biggest reasons people may need to seek professional SEO help for their site right from the start.
Google Bots read code, not words. If a web page is coded in such a way that it is hard for them to read, they will not hang around trying to do so. They will decide either that the page is empty, if they can’t see its content at all, or that the page is low quality, and move on.
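If you have more than a handful of pages to test, PageSpeed Insights also has a web API. The sketch below builds a request for version 5 of that API and pulls the overall performance score out of the JSON it returns. The response field names reflect my understanding of the v5 format and should be checked against Google’s API reference, and regular use may require an API key, which is not shown here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def psi_request_url(page_url, strategy="mobile"):
    """Build the API request URL for one page ('mobile' or 'desktop')."""
    return PSI_ENDPOINT + "?" + urlencode({"url": page_url, "strategy": strategy})


def performance_score(psi_response):
    """Return the 0-1 performance score from a parsed response, or None if absent."""
    try:
        # Assumed response layout; verify against the official API reference.
        return psi_response["lighthouseResult"]["categories"]["performance"]["score"]
    except (KeyError, TypeError):
        return None


def fetch_performance_score(page_url, strategy="mobile"):
    """Call the live API (needs network access) and return the score."""
    with urlopen(psi_request_url(page_url, strategy), timeout=60) as resp:
        return performance_score(json.load(resp))
```

Calling `fetch_performance_score("https://www.example.com/")` would make a live request, so expect it to take several seconds per page.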
Indexing Is Not Forever
Congratulations! Your page passed the test and is appearing in Google. But – and this is another bit many people do not know – that does not mean it will stay there. Google repeatedly crawls and re-evaluates content – so if your quality drops, or if you accidentally prevent Google from evaluating the content, then your page might get dropped out of the index.
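One way to accidentally shut Google out, offered here as an illustration rather than something covered above, is a leftover robots meta tag containing `noindex`, for example one carried over from a staging site. The sketch below scans a page’s HTML for that directive. The class and function names are hypothetical, and it does not look at HTTP headers, which can carry the same instruction.

```python
from html.parser import HTMLParser


class _RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for item in attr.get("content", "").split(","):
                self.directives.add(item.strip().lower())


def has_noindex(html):
    """Return True if the page's robots meta tags include 'noindex'."""
    finder = _RobotsMetaFinder()
    finder.feed(html)
    finder.close()
    return "noindex" in finder.directives


if __name__ == "__main__":
    page = '<head><meta name="robots" content="noindex, nofollow"></head>'
    print(has_noindex(page))
```

Running a check like this against your key pages after each site update is a cheap way to catch the problem before Google’s next crawl does.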
How Do I Know Which of My Pages Google Has Indexed?
Now that you know that Google may not have indexed all of your website’s pages, you are no doubt keen to head off and check. But what is the best way to do that? It is actually very simple:
- Click in Google’s search field.
- Type site: followed by the website URL you want to limit the search to. Ensure that there’s no space between site: and the website address.
- To list every indexed page, run the search as it is. To narrow the results, follow the website URL with a single space and then type a search phrase.
You don’t need to use the http:// or https:// portion of the website’s URL, but it doesn’t change the result if you include it. Google will then return a list of the pages under that URL that are in the Google Index.
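If you check the same sites often, the steps above can be turned into a small helper that builds the query for you. This is a minimal sketch; the function names are my own, and it simply produces the text you would otherwise type, plus a link that opens the search in a browser.

```python
from urllib.parse import quote_plus, urlsplit


def site_query(site, phrase=""):
    """Build a site: query, dropping any http:// or https:// prefix."""
    target = site.strip()
    parts = urlsplit(target)
    if parts.scheme in ("http", "https"):
        target = parts.netloc + parts.path
    target = target.rstrip("/")
    query = "site:" + target  # no space between site: and the address
    if phrase.strip():
        query += " " + phrase.strip()  # single space, then the search phrase
    return query


def google_search_url(site, phrase=""):
    """Return a Google search link for the site: query."""
    return "https://www.google.com/search?q=" + quote_plus(site_query(site, phrase))


if __name__ == "__main__":
    print(site_query("https://www.example.com/"))
    print(google_search_url("example.com", "indexing tips"))
```

Leaving the phrase empty gives the plain site: search, which is the one to use when you want to see which pages are indexed.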