An SEO professional was concerned about duplicate content on their website, specifically whether translated versions of a page count as duplicate content.
They wanted to publish the same content in two languages, English and Swahili, for a better user experience.
They asked whether doing so would cause duplicate content issues, and what the best way to implement it would be.
They also asked how best to use the canonical and hreflang tags.
John explained that anything that is translated is considered entirely different content.
Google would not treat a page as duplicate content just because it is a translated version of another page.
From Google’s point of view, content is duplicate only when the words and everything else on the pages actually match.
In that case, John confirmed, Google might pick one of the pages to show and not show the other.
He reiterated that translated content is definitely not duplicate content.
He then described the ideal hreflang configuration: hreflang annotations between the translated pages on a per-page basis. He added that in a case like this, hreflang is almost optional.
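As a rough illustration of per-page hreflang annotations, the sketch below builds the link tags for one page that exists in English and Swahili. The function name, the example.com URLs, and the /en/ and /sw/ path layout are assumptions made for this example and do not come from the hangout. The language codes "en" and "sw" are the standard two-letter codes for English and Swahili.

```python
from html import escape


def hreflang_links(variants):
    """Return <link> tags for every language variant of one page.

    `variants` maps a language code (for example "en" or "sw") to the
    absolute URL of that translation. The same full set of tags would go
    in the <head> of every variant, so each page lists itself and its
    alternates.
    """
    return [
        '<link rel="alternate" hreflang="{}" href="{}" />'.format(
            escape(lang, quote=True), escape(url, quote=True)
        )
        for lang, url in sorted(variants.items())
    ]


# Hypothetical URLs, for illustration only.
page_variants = {
    "en": "https://example.com/en/blue-running-shoes",
    "sw": "https://example.com/sw/blue-running-shoes",
}

for tag in hreflang_links(page_variants):
    print(tag)
```

Because the annotations are set per page, a site that only sees problems on a few pages (such as the homepage) could generate tags for just those pages.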
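Before investing in a large hreflang implementation, check the performance report in Search Console: look at the top queries, estimate which language each query is in, and see which pages were shown in the search results for them.
If Google is already showing the right language version, you can probably save yourself the effort of implementing hreflang. If it is showing the wrong pages, hreflang annotations will help.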
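The check described above can be sketched in a few lines. This is only one possible approach, under several assumptions: the query and page pairs have been copied by hand from the performance report, each query has been labeled with a language based on your own knowledge, and the page language can be read from the first URL path segment (/en/ or /sw/). The function names, URLs, and sample rows are hypothetical.

```python
from urllib.parse import urlparse


def page_language(url, known=("en", "sw")):
    """Infer a page's language from its first path segment (assumed URL layout)."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return segments[0] if segments and segments[0] in known else None


def find_mismatches(rows, query_language):
    """Return rows where the shown page is in a different language than the query.

    rows: iterable of (query, page_url) pairs taken from the performance report.
    query_language: dict mapping a query to a hand-assigned language code.
    Queries without a label (for example a company name, whose language
    cannot be guessed) are skipped.
    """
    mismatches = []
    for query, url in rows:
        expected = query_language.get(query)
        actual = page_language(url)
        if expected and actual and expected != actual:
            mismatches.append((query, url, expected, actual))
    return mismatches


# Hypothetical sample data; the Swahili query is a placeholder string.
rows = [
    ("blue running shoes", "https://example.com/en/blue-running-shoes"),
    ("blue running shoes", "https://example.com/sw/blue-running-shoes"),
    ("(a Swahili query)", "https://example.com/sw/blue-running-shoes"),
]
labels = {"blue running shoes": "en", "(a Swahili query)": "sw"}

for query, url, expected, actual in find_mismatches(rows, labels):
    print(f"{query!r}: expected {expected}, shown {actual} ({url})")
```

If the list of mismatches is empty or limited to a handful of pages, that would suggest hreflang is needed on few pages or none at all.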
This happens at approximately the 22:12 mark in the video.
John Mueller Hangout Transcript
John (22:12)
I’m working on a Tanzanian website. Users search in two languages, English and Swahili. We would like to publish the same content in both languages for better UX. Would that cause any duplicate content issues? Search results in general show a mix of English and Swahili content. How would we best use a canonical tag and hreflang?
John (22:32)
So the good news is anything that is translated is completely different content. So it’s definitely not something where we would say this is duplicate content, just because it’s a translated version of a piece of content. From our point of view, duplicate content is really if the words and everything match and are really duplicates. And then in cases like that, we might pick one of these pages and show it, and we might not show the other one. But if they’re translated, they’re completely different words or different pages, essentially. So it’s definitely not something we would consider duplicate content.
The ideal configuration here is to use hreflang between these pages on a per-page basis. And this is something that I would assume is almost optional in a case like this. Before you go off and do a lot of implementation work for hreflang, which is a lot of work, especially for a larger website, I would double check if you’re actually seeing any issues where users with a specific language are going to the wrong page.
And you can kind of see that in Search Console in the performance report. When you look at the queries that reach your website, especially if you’re looking at the top queries, you can kind of, based on your knowledge, estimate which language that query is in and then look at the pages that were shown in the search results or that were visited from there.
And based on that, you can kind of make an estimation of “Is Google showing the right pages in the search results?” And if Google is already showing the right pages in the search results, then I think you can probably save yourself the effort with hreflang. But if we’re showing the wrong pages in the search results, then definitely the hreflang annotations would help here.
Usually, this is something that is more of an issue on, I’d say, generic queries where people are searching for your company, for example. Then just based on someone searching for a company name, we might not really know which language this user is searching in. And then we might show the wrong version of the page. So it might make sense, especially if you’re setting these annotations manually, to first of all double check: is it a problem at all? And if it is a problem, does it just affect individual pages? And if it does just affect individual pages, then put the hreflang annotations there, which might mean that for your homepage, or your main category page, you add those annotations.
And for everything else, probably it’s working, it might be working well. So in particular, if someone is looking for something somewhat broad or generic, like, I don’t know, for example, blue running shoes, then obviously, in English, they’ll be typing in blue running shoes. And then we can match that to your blue running shoes pages in English. If they’re searching in Swahili, I don’t know what the term is. But I imagine it’s a different term.
And because it’s a different term, we can automatically match that to your existing Swahili pages. So for many cases, you might not need to do anything special here, but I would kind of double check.