During a hangout, one SEO professional asked John Mueller about URLs in structured data markup.
Their question was: does Google crawl URLs located in structured data markup, or does Google just store the data?
John explained that, for the most part, when Google processes an HTML page and sees something that looks like a link, it may try to extract that URL and fetch it as well.
For example, if Google finds a URL in JavaScript, it can try to pick it up and use it.
If Google finds a link in a text file on a site, it can try to crawl and use that too. But none of these is really a normal link. So if you want Google to crawl that URL, John recommends making sure there’s an actual HTML link to it, with clear anchor text that gives some information about the destination page.
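As a rough way to check your own pages against this advice, the sketch below lists URLs that appear in structured data but have no actual HTML link. It is only an illustration, not a description of how Google extracts URLs. It assumes the structured data is JSON-LD with absolute URLs, and the names LinkAudit and urls_without_html_link, along with the example.com URLs, are hypothetical.

```python
import json
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collects <a href> links with their anchor text, plus URLs inside JSON-LD."""

    def __init__(self):
        super().__init__()
        self.anchors = {}         # href -> anchor text
        self.jsonld_urls = set()  # URLs that appear as structured-data values
        self._href = None
        self._in_jsonld = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self._href = attrs["href"]
            self._buf = []
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        # Buffer text only while inside a link or a JSON-LD script block.
        if self._href is not None or self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors[self._href] = "".join(self._buf).strip()
            self._href = None
        elif tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self._collect(json.loads("".join(self._buf)))
            except json.JSONDecodeError:
                pass  # ignore malformed structured data in this sketch

    def _collect(self, node):
        """Walk the parsed JSON-LD and keep every string that looks like a URL."""
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "@context":  # vocabulary reference, not a page URL
                    continue
                self._collect(value)
        elif isinstance(node, list):
            for value in node:
                self._collect(value)
        elif isinstance(node, str) and node.startswith(("http://", "https://")):
            self.jsonld_urls.add(node)


def urls_without_html_link(html):
    """Return structured-data URLs that have no matching <a href> on the page."""
    audit = LinkAudit()
    audit.feed(html)
    audit.close()
    return sorted(audit.jsonld_urls - set(audit.anchors))


page = """<html><body>
<a href="https://example.com/a">Guide to A</a>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article",
 "url": "https://example.com/a", "image": "https://example.com/b.png"}
</script></body></html>"""

print(urls_without_html_link(page))  # ['https://example.com/b.png']
```

Any URL this reports is one you are leaving to chance. If you want it crawled, give it a real link with descriptive anchor text.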
If you don’t want Google to crawl that specific URL, then maybe block it with robots.txt, or on that page, use a rel=canonical pointing to your preferred version, anything like that.
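If you go the robots.txt route, it is worth confirming that your rules actually cover the URL in question. The sketch below uses the robots.txt parser from Python's standard library. That parser is not Google's own, so treat the result as an approximation. The function name is_crawl_blocked, the sample rules, and the example.com URLs are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_blocked(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text disallows `url` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical rules: block everything under /internal/ for all crawlers.
rules = """User-agent: *
Disallow: /internal/
"""

print(is_crawl_blocked(rules, "https://example.com/internal/feed.json"))  # True
print(is_crawl_blocked(rules, "https://example.com/blog/post"))           # False
```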
So those are the directions John would take. He would not blindly assume that a URL will be found, or that it will not be found, just because it’s in structured data.
It might be found, it might not. John would instead focus on what you want to have happen there.
If you want to have it seen as a link, then make it a link. If you don’t want to have it crawled or indexed, then block crawling or indexing.
This is all totally up to you.
This happens at approximately the 23:20 mark in the video.
John Mueller Hangout Transcript
John (Question)
Let me see, does Google crawl URLs located in structured data markup, or does Google just store the data?
John (Answer)
So for the most part, when we look at HTML pages, if we see something that looks like a link, we might go off and kind of like try to get your URL out as well. That’s something where if we find a URL in JavaScript, we can try to pick that up and try to use it. If we find a link in kind of a text file on a site, we can try to crawl that and use it.
But it’s, it’s not really a normal link. So it’s something where I would recommend if you want Google to go off and crawl that URL, make sure that there’s an actual HTML link to that URL, with a clear anchor text as well that you give some information about the destination page. If you don’t want Google to crawl that specific URL, then maybe block it with robots.txt, or on that page, use a rel=canonical pointing to your preferred version, anything like that. So those are kind of the directions I would go there. I would not blindly assume that just because it’s in structured data, it will not be found.
Nor would I blindly assume that just because it’s in structured data, it will be found. It might be found, it might not be found. I would instead focus on what you want to have happen there. If you want to have it seen as a link, then make it a link. If you don’t want to have it crawled or indexed then block crawling or indexing. That’s all totally up to you.