In a Search Central Lightning Talks video, Google Developer Relations Engineer Martin Splitt explains how source HTML can differ from rendered HTML, and how this can impact your SEO.
He asks at the beginning of the video: "Did you ever inspect the source of a website and wondered why it looks different from what you see in your browser when you open the same website?
Or perhaps you heard about the rendered HTML or the DOM and don’t know how these are different from what you see in view source, for instance? Then this video is for you."
He talks about the following:
- The very beginning of the document loading process as you hit return in the address bar,
- HTTP requests to the server hosting the website,
- How the server processes and responds to the HTTP request,
- How the server sends HTML back to the browser,
- Why many people assume the HTML sent from the server is the same as what they see on the page,
- How the browser turns the source HTML into the rendered page,
- The differences between the source HTML, the rendered HTML, and how the DOM plays into this rendering process,
- How JavaScript might change the DOM,
- Where the DOM lives and how it relates to the processing of the document,
- and much more.
We highly recommend watching this video if you care at all about how your source HTML and rendered HTML can impact your SEO efforts.
Source HTML vs. Rendered HTML Transcript
Martin 00:00
Did you ever inspect the source of a website and wondered why it looks different from what you see in your browser when you open the same website? Or maybe you heard about the rendered HTML or the DOM and don’t know how these are different from what you see in view source, for instance, then this video is for you. In this video, I would like to take you on a journey behind the scenes of your browser. So let’s jump in.
Martin 00:26
To understand what happens between a server sending some HTML into your browser, and the browser showing you the website in all its glory, we need to start at the moment where you hit return in the address bar and ask the server somewhere on the internet to give a website to you. The first step to get a website into your browser is to send an HTTP request to the server that is hosting the website.
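As a rough illustration of this first step, the sketch below requests a page and returns the response body as text, which is the source HTML. The function name `fetchSourceHtml` and the `fetchImpl` parameter are hypothetical and not from the video; `fetch` itself is the standard API available in browsers and recent Node.js versions.

```javascript
// Requests a URL and resolves with the response body as text: the source HTML.
// `fetchImpl` defaults to the global fetch; it is a parameter only so that
// another implementation can be swapped in (an assumption of this sketch).
async function fetchSourceHtml(url, fetchImpl = fetch) {
  const response = await fetchImpl(url, { headers: { Accept: "text/html" } });
  if (!response.ok) {
    // Non-2xx status codes are treated as failures here.
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.text();
}

// Possible usage (the URL is a placeholder):
// fetchSourceHtml("https://example.com/").then((html) => console.log(html));
```

Note that this only retrieves what the server sends. No JavaScript from the page runs, so the result corresponds to the source HTML rather than the rendered HTML.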
Once the server receives the request, the server chooses how to respond to it. That can mean running a program like a PHP script or a Java program, or whatever, or simply sending back the content of a file. For this video, we will skip over what exactly the server does and focus on the fact that the server usually starts sending a bunch of HTML back to the browser. Many people think of the HTML that was sent from the server as the same as the website you will see in your browser. But that isn’t really accurate.
Instead, we can say that what you see in the browser is a dish and the HTML that was sent is the recipe. The browser needs to prepare or cook the dish based on the recipe it received from the server. It turns the HTML into a number of visual and sometimes interactive elements that then show up on your screen. The interesting part is what happens between the HTML and what you see in your browser. This is called rendering. Let’s look a little closer at rendering to see what happens in between the beginning, the HTML that comes from the server, and the end, the interactive website that you, well, interact with.
To understand the difference between the HTML that we received from the server, the so-called source HTML, the rendered HTML and the DOM, we need to look closer at this rendering process. The rendering process is a series of steps that starts with the source HTML and, if included, the styling information in the CSS. The browser starts by parsing the source HTML and, if available, any CSS found inside, which creates two tree-like structures called the Document Object Model, DOM for short, and the CSS Object Model, CSSOM for short.
For us, the DOM is the more important of the two models, so we’ll focus on it from now on. For our website, for example, the browser would create the following DOM tree. The browser needs this tree to identify individual elements on the website and their relationship to other elements, like, for instance, which text belongs to the heading and which to the paragraph, or what file is the source of this image. The browser then takes the DOM and CSSOM and figures out how to fit all this stuff into the browser window. This is called layouting, and it creates a render tree. The render tree basically contains the sizes and positions on the screen for each of the elements in the DOM. The browser uses this tree to paint the actual pixels that make up what we see in our browser window. It is important to understand that once the browser shows us the website, it may use JavaScript to allow us to interact with the website.
The JavaScript might then change the DOM by adding, changing or removing elements in the tree. For example, when I click this button, it adds an image to the DOM and thus to the website on the screen. Here we can see this in action. On the left, you see the interactive website. On the right, you see a representation of the DOM that the browser uses for this website. Whenever the button is clicked, the JavaScript adds a new image to the DOM tree, and the browser renders it. In the previous example, the DOM tree might look different at any moment, depending on what happened before.
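The code behind such a button is not shown in the transcript. A minimal sketch of what it could look like is below, assuming a page with a button and a container element. The function name `addImageOnClick`, its parameters, and the example IDs and image path are hypothetical; only `createElement`, `appendChild`, and `addEventListener` are standard DOM APIs.

```javascript
// Hypothetical helper: wires a button so that each click appends a new <img>
// to a container. `doc` is the page's `document`, passed in explicitly so the
// function does not depend on hidden globals.
function addImageOnClick(doc, button, container, src) {
  button.addEventListener("click", () => {
    const img = doc.createElement("img"); // new element node, not yet in the DOM tree
    img.src = src;
    img.alt = "Added by JavaScript";
    container.appendChild(img); // now part of the DOM tree, so the browser renders it
  });
}

// In a browser, usage might look like this (element IDs and path are assumptions):
// addImageOnClick(
//   document,
//   document.getElementById("add"),
//   document.getElementById("gallery"),
//   "/cat.jpg"
// );
```

Nothing in the source HTML changes when this runs. Only the in-memory DOM gains an extra node per click.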
The DOM itself only lives in the browser memory and isn’t visible per se, but we can represent it in different forms. One such form is the interactive tree view that we saw in the browser developer tools on the right hand side of the previous video. Alternatively, we could turn the DOM tree back into HTML. We call this the rendered HTML.
When we turn the DOM tree back into the rendered HTML, we might get different results depending on what happened to the DOM tree before. For example, the rendered HTML of the previous example website is exactly the same as the source HTML, as long as the button was not clicked.
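To make the relationship between the DOM and the rendered HTML concrete, here is a toy model in plain JavaScript. It is not how a browser implements the DOM: the node shape (`tag`, `attrs`, `children`, `text`) and the names `el`, `toHtml`, and `addImage` are made up for illustration, and the serializer ignores escaping and treats only `img` as a void element.

```javascript
// Toy DOM node: { tag, attrs, children } for elements, { text } for text nodes.
function el(tag, attrs = {}, children = []) {
  return { tag, attrs, children };
}

// Turn the tree back into an HTML string: the "rendered HTML" of the toy model.
function toHtml(node) {
  if (node.text !== undefined) return node.text;
  const attrs = Object.entries(node.attrs)
    .map(([name, value]) => ` ${name}="${value}"`)
    .join("");
  if (node.tag === "img") return `<img${attrs}>`; // void element, no closing tag
  return `<${node.tag}${attrs}>${node.children.map(toHtml).join("")}</${node.tag}>`;
}

// Stand-in for what the button's JavaScript does: mutate the tree in memory.
function addImage(body, src) {
  body.children.push(el("img", { src }));
}

const body = el("body", {}, [
  el("h1", {}, [{ text: "Hello" }]),
  el("button", {}, [{ text: "Add image" }]),
]);

const sourceHtml = toHtml(body);   // snapshot before any click
addImage(body, "cat.jpg");         // simulate one click
const renderedHtml = toHtml(body); // snapshot after the click

console.log(sourceHtml === renderedHtml); // false: the snapshots now differ
```

Before the simulated click, the serialized tree matches the original markup. Afterwards it contains the extra `<img>`, which is the difference between source HTML and rendered HTML described here.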
As the button is clicked, JavaScript will run and it will change the DOM and in turn the rendered HTML. Now, you may already wonder, what is it that I see when I click View Source in my browser? Well, is it the source HTML, or is it the rendered HTML? There are three ways to see the source HTML coming from the server. First, and probably best known, is to right-click on a web page and select View Source or type view-source: in front of the URL. That shows you the source HTML. Alternatively, you can also go into the developer tools on your browser, select the Sources tab and see the source HTML there.
Here, you can see that the right side shows the original HTML without the images that are currently visible on the website. A third way is to use the network tab in the browser developer tools, where you can also see the HTML that was sent back from the server. But what options do I have if we want to see what’s in the DOM?
Well, again, the browser developer tools have us covered. The Dev Tools contain a tree-like representation of the DOM that we can explore and interact with. We find this in the Elements tab in Chrome. You may notice that, unlike the other Dev Tools panels and the view source, this shows us the current DOM content, including the images that were added by JavaScript. If we wanted to, we could turn the DOM back into HTML.
This can be quite complex, thanks to things like the Shadow DOM or cross-origin iframes. But for simpler websites, you can get the rendered HTML by going into the Dev Tools Console and running this JavaScript snippet. Again, please note that this isn’t always working, especially for more complex websites. For debugging, I recommend that you use the URL Inspection Tool in Search Console to get the rendered HTML that Google Search uses for indexing of a page.
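The transcript does not reproduce the snippet shown on screen. One common way to serialize the current DOM from the console is the standard `outerHTML` property of the document's root element. The wrapper function `getRenderedHtml` below is a hypothetical name, added only so the sketch is a reusable function rather than a bare expression.

```javascript
// Serializes the current DOM of `doc` back into an HTML string.
// Caveats: the doctype is not included, and content inside shadow roots
// or cross-origin iframes is not part of the result.
function getRenderedHtml(doc) {
  return doc.documentElement.outerHTML;
}

// In the Dev Tools Console of an open page:
// getRenderedHtml(document)
// or simply the bare expression:
// document.documentElement.outerHTML
```

The result is a snapshot of the DOM at the moment the expression runs, so running it before and after interacting with the page can give different output.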
All right, so that was quite a ride. Let’s summarize what we learned today.
We started with a source HTML, that’s the HTML that a server sends to our browser when we open a web page. Then we learned that the browser turns this HTML into the DOM. That is an interactive element by element representation of the website constructed from the source HTML. The DOM can change as JavaScript might modify it while the page is loading, upon the user interaction, or other events while it is open in the browser.
And then, last but not least, we identified rendered HTML as a snapshot of that DOM turned back into HTML. The rendered HTML reflects the DOM content on the page at the time that snapshot was taken. To see the rendered HTML, you can use the URL inspection tool in the Google Search Console. So now we’ve looked at what’s the source HTML, the rendered HTML and the DOM, and explored what tools you could use to debug issues on your websites. What tools are you using to debug issues on your websites?
Let us know in the comments below. And also, thanks a lot for watching and please like and subscribe to stay in the loop with our latest and greatest content around Google Search. Thanks a lot. And bye bye. Want more technical SEO antics? Catch me on the Search Off the Record podcast, where we talk about all things Google Search, go behind the scenes and who knows, maybe discuss my love for JavaScript.
Yay! Join me, Martin and the Google Search relations house and check us out wherever you download podcasts.