Google’s John Mueller said on Reddit that you should start with the most important pages if you want to translate your website into different languages.
According to John Mueller, language and country pages that are created only because it is possible to create them “get very little traffic, add very little value, and they add a significant overhead (crawling, indexing, canonicalization, ranking, maintenance, hreflang, structured data, etc.).”
Here’s John’s full statement:
“You definitely shouldn’t block / disallow these in robots.txt — if they’re disallowed from crawling, we wouldn’t be able to canonicalize them at all, or see any of the metadata on them.
It’s easy to dig into endless pits of complexity with hreflang. ‘Let’s create all languages! Let’s make pages for all countries! What if someone in Japan wants to read it in Swahili? Let’s make even more pages!’ My guess is most of these ‘pages created because you can’ get very little traffic, add very little value, and they add a significant overhead (crawling, indexing, canonicalization, ranking, maintenance, hreflang, structured data, etc.).
My recommendation would be first to limit the number of pages you create to those that are absolutely critical & valuable — maybe that already cuts the pages you’re thinking about. Think big here; if you’re talking about individual pages within a medium-sized site, it’s probably a non-issue. On the other hand, if you’re considering copying your whole site into 20 languages x 10 countries, that’s something else.
Past that, for hreflang, I’d focus first on pages where you’re seeing wrong-language traffic — often these are pages that get a lot of global, branded queries, where it’s hard to determine which language content they want. A search for ‘google’ can match a lot of language pages, hreflang can help to differentiate. On the other hand, a search for ‘search engine’ is pretty clear & matches pages where you write about ‘search engine’ already, so pages like that don’t need as much help being language-targeted. That said, sometimes the balance between ‘save effort by thinking’ and ‘just do it everywhere’ is not that straightforward to determine :).”
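To make the advice more concrete, here is a minimal Python sketch of how hreflang annotations could be generated for a short list of critical pages only, together with a check that none of the translated URLs are disallowed in robots.txt. The URLs, the `TRANSLATIONS` mapping, the robots.txt content, and the function names are hypothetical examples, and the `x-default` fallback is an assumption that John Mueller's statement does not mention.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical data: only pages judged critical get translated versions.
TRANSLATIONS = {
    "/pricing": {
        "en": "https://example.com/pricing",
        "de": "https://example.com/de/preise",
        "fr": "https://example.com/fr/tarifs",
    },
}

# Hypothetical robots.txt content, parsed locally without any network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""


def hreflang_links(alternates, default_lang="en"):
    """Return <link> tags for every language version plus an x-default entry."""
    tags = [
        f'<link rel="alternate" hreflang="{lang}" href="{url}" />'
        for lang, url in sorted(alternates.items())
    ]
    # Assumption: the default-language page doubles as the x-default fallback.
    tags.append(
        f'<link rel="alternate" hreflang="x-default" '
        f'href="{alternates[default_lang]}" />'
    )
    return tags


def blocked_urls(alternates, robots_txt, agent="Googlebot"):
    """Return the language URLs that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        url for url in alternates.values()
        if not parser.can_fetch(agent, url)
    ]


if __name__ == "__main__":
    for path, alternates in TRANSLATIONS.items():
        print(path)
        for tag in hreflang_links(alternates):
            print("  " + tag)
        for url in blocked_urls(alternates, ROBOTS_TXT):
            print("  WARNING: disallowed in robots.txt:", url)
```

Keeping the mapping limited to a handful of important pages reflects the recommendation to translate selectively. The same set of tags would be placed on every language version of a page, so that each version references all of the others.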
Check your web pages
The website audit tool in SEOprofiler can check your website in many different languages.