The Programmer's Guide to Better SEO: Semantic Markup and HTML5
Disclaimer: Content is king. Provide value, make yourself accessible, make your website easy to navigate and easy to understand for your users. Everything detailed below is about Search Engine OPTIMIZATION, and is not worth a thought until you have something worth optimizing.
Search Engines. Their goal is to provide quality, up-to-date, and relevant content. To accomplish this, they send out little bots that crawl and index trillions of web pages. Bots read these pages to better understand their content, and to help ensure they’re serving up the best possible results. How they weigh pages is a little bit of a mystery, but one thing is for sure:
Bots need to be able to read your web pages to properly index them.
Unfortunately, most websites are unnecessarily hard for these bots to read. Because of this, the bots are forced to make “best guesses” about the content, which often leads to a less than optimal page rank (and thereby hurts SEO).
Semantic vs. Non-Semantic
Humans can see web page styles. This allows us to group content and understand each element’s context regardless of the underlying DOM elements. Robots lack this ability. When a bot reads a web page, it ignores all of the styling and uses the HTML tags to understand what it’s looking at.
Non-Semantic
With all of the styling ripped away, a page built from non-semantic elements (like <span> and <div>) leaves a bot with nothing but anonymous containers. Going off of this structure alone, the bot is forced to make “best guesses” about the context of the page’s content.
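To make that concrete, here is a rough sketch of such a page once the styling is gone. The page, its class names, and its text are all invented for illustration; the point is only that nothing but <div> and <span> is left to go on.

```html
<!-- Hypothetical blog page: every class name and string is made up for this sketch -->
<div class="top">
  <span class="logo">My Blog</span>
  <div class="links">
    <span>Home</span>
    <span>About</span>
  </div>
</div>
<div class="content">
  <div class="post">
    <span class="big">Why Semantic Markup Matters</span>
    <div>Bots read tags, not styles.</div>
  </div>
  <div class="side">
    <span>Popular posts</span>
  </div>
</div>
<div class="bottom">
  <span>Contact: me@example.com</span>
</div>
```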
Do you know where to find the unique content on this page? Do you know which content isn’t relevant?
I’m sure you can make some educated guesses. But it’s hard to know for sure.
Semantic
By using semantic elements, how and where to find the different kinds of information on our web page becomes much clearer. Bots can now find:
- Content unique to our page
- Where one article stops and another begins
- The components that make up each article
- Etc.
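A minimal sketch of a blog listing page marked up with semantic elements might look like the following. The content and URLs are invented for illustration, and this is one reasonable arrangement rather than the only correct one.

```html
<!-- Hypothetical blog page: content and URLs are made up for this sketch -->
<header>
  <a href="/">My Blog</a>
  <nav>
    <a href="/">Home</a>
    <a href="/about">About</a>
  </nav>
</header>
<main>
  <!-- Content unique to this page lives inside <main> -->
  <article>
    <h2>Why Semantic Markup Matters</h2>
    <p>Bots read tags, not styles.</p>
  </article>
  <!-- A new <article> marks where one post stops and the next begins -->
  <article>
    <h2>Picking the Right Element</h2>
    <p>When in doubt, ask what the content means.</p>
  </article>
</main>
<aside>
  <h2>Popular posts</h2>
</aside>
<footer>
  <p>Contact: me@example.com</p>
</footer>
```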
We can use semantic elements (basically “more descriptive” <div> tags) to provide context and clarity. But what elements do we have at our disposal? What are the rules that govern when and how we should use them?
Being Semantically Correct
There are two different kinds of semantic elements: Sectioning Elements and.. uh.. all of the other ones. Sectioning Elements are more descriptive versions of the classic <div>, while the remaining elements imply or add additional behavior (<a href=""></a>, <p></p>, etc.).
Sectioning Elements
- <main>
- <article>
- <section>
- <nav>
- <aside>
- <header>
- <footer>
- <address>
For sectioning elements, there are only a few non-obvious rules regarding their usage:
- <main>: Represents the main content of the page. There should only be one. You should not place <main> inside another semantic element.
- <article>: Should make sense if distributed outside the context of the page. Can contain other articles. Should have a heading.
- <aside>: Much like a play’s aside, content in an aside breaks away from the normal flow. It is usually related, and expands on something. Commonly contains <figure>s, groups of <nav>s, and quotes.
- <address>: Should contain the author’s contact information. Author of… the website, of the article, of the comment, etc.
The remaining sectioning elements can be used any number of ways so long as their meaning is preserved. For example: your webpage can have a <footer>, but an <article> can also have a <footer>.
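Here is a small sketch of how these rules can play out together. The names, email addresses, and text are placeholders invented for this example.

```html
<body>
  <main> <!-- only one, and not nested inside another semantic element -->
    <article> <!-- should still make sense on its own, e.g. in a feed reader -->
      <header>
        <h1>Why Semantic Markup Matters</h1>
      </header>
      <p>Bots read tags, not styles.</p>
      <aside> <!-- related to the article, but breaks away from the normal flow -->
        <blockquote>Content is king.</blockquote>
      </aside>
      <footer> <!-- footer for this article only -->
        <address>
          Written by <a href="mailto:jane@example.com">Jane Doe</a>
        </address>
      </footer>
    </article>
  </main>
  <footer> <!-- footer for the whole page -->
    <address>
      Site contact: <a href="mailto:hello@example.com">hello@example.com</a>
    </address>
  </footer>
</body>
```

Each <address> describes the author of its nearest enclosing <article>, or of the page as a whole when it sits outside any article.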
Still Confused?
Treehouse has a great article that examines each sectioning element, and HTML5 Doctor has a great flowchart that walks you through picking the right element.
Other Semantic Elements
Most of HTML’s elements are semantic, but arguably the most important ones (in order) are:
- Title: <title>
- Headers: <h1> through <h6>
- Hyperlinks: <a>
- Emphasis: <b> and <strong>
Google bots are reading your web page trying to figure out which keywords best correlate with your content. Like humans, they look to the <title> of the document, and at the main headings (<h1>, <h2>, and <h3>). They’ll likely assume emphasized and bold words are important, and that any other web pages you link to are related to your content.
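As a rough sketch, a page that gives a bot all four of these signals could look like this. The title, headings, and the example.com link are placeholders, not a recommendation for any particular wording.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- The <title> is the first place to state what the page is about -->
  <title>Semantic Markup and HTML5: A Programmer's Guide to SEO</title>
</head>
<body>
  <!-- Headings outline the page, from most to least important -->
  <h1>Semantic Markup and HTML5</h1>
  <h2>Semantic vs. Non-Semantic</h2>
  <p>
    Bots <strong>ignore styling</strong> and rely on HTML tags.
    See this <a href="https://example.com/html-elements">list of HTML elements</a>
    for more detail.
  </p>
</body>
</html>
```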
It is less common to see these elements misused. Switching out an <aside> with an <article> makes no visual difference, but swap a hyperlink <a> for <span> and you will immediately notice.
<aside> == Ignorance ==> <article>
<a> == Willful Ignorance ==> <span>
Still Confused?
HTML5 Doctor has a list of all HTML elements, as well as descriptions and examples of how and when to use them.
FAQ
Why not <insert-any-name-here> elements?
Actually… The newest W3C spec for Web Components leaves room for programmers to create their own DOM elements. Projects like Google’s Polymer seek to implement the new spec, and groups like webcomponents.org are helping facilitate discussion and education on the topic.
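For a feel of what that looks like, here is a minimal sketch using the browser's Custom Elements API (customElements.define). The element name post-summary and the PostSummary class are invented for this example. Keep in mind that a made-up element carries no built-in meaning, so a bot will likely treat it as just another anonymous container.

```html
<!-- "post-summary" is a hypothetical name; custom element names must contain a hyphen -->
<post-summary>Why Semantic Markup Matters</post-summary>

<script>
  // Hypothetical class backing the custom element.
  class PostSummary extends HTMLElement {
    // Called by the browser when the element is attached to the document.
    connectedCallback() {
      // Expose the element's text as a tooltip, purely as a demonstration.
      this.title = 'Summary: ' + this.textContent.trim();
    }
  }

  // Tell the browser which class implements <post-summary>.
  customElements.define('post-summary', PostSummary);
</script>
```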
Why should I take the time to care about Semantic Markup?
Going through your application and replacing lame duck <div>s with cool, hip-happenin’ Sectioning Elements is a quick process. It improves your SEO, makes your website more accessible to modern screen readers, and leaves you with code that is easier to maintain.
What does a Google bot look like?
Next Steps
MicroData & Schema.org
Semantic Markup on steroids. MicroData adds attributes (such as itemscope, itemtype, and itemprop) that search engines can use to pull specific data from your page. The Schema.org vocabulary is not a part of the W3C spec, but it is standardized by the major search engines. Treehouse has a post on MicroData.
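A minimal sketch of MicroData on a blog post, using the Schema.org BlogPosting and Person types, might look like this. The author name, date, and text are placeholders invented for the example.

```html
<!-- itemscope starts a new item; itemtype says which Schema.org type it is -->
<article itemscope itemtype="https://schema.org/BlogPosting">
  <!-- itemprop labels a specific piece of data inside the item -->
  <h1 itemprop="headline">Why Semantic Markup Matters</h1>
  <p>
    By
    <span itemprop="author" itemscope itemtype="https://schema.org/Person">
      <span itemprop="name">Jane Doe</span>
    </span>
    on
    <time itemprop="datePublished" datetime="2015-01-15">January 15, 2015</time>
  </p>
  <div itemprop="articleBody">
    <p>Bots read tags, not styles.</p>
  </div>
</article>
```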
MetaData & Social Share
Your website’s metadata is important. It tells social websites how to format specific pages when they’re shared, it helps to further define your page’s keywords and relevance for SEO, and so. much. more.
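As a rough sketch, the <head> of a page might carry metadata like the following. The description meta tag, the Open Graph og: properties, and twitter:card are real conventions, but every value and URL here is a placeholder.

```html
<head>
  <meta charset="utf-8">
  <title>Semantic Markup and HTML5</title>
  <!-- Short summary that search engines may show under the page title -->
  <meta name="description" content="How semantic HTML5 elements help bots read your pages.">

  <!-- Open Graph tags: used by social websites to format a shared link -->
  <meta property="og:type" content="article">
  <meta property="og:title" content="Semantic Markup and HTML5">
  <meta property="og:description" content="How semantic HTML5 elements help bots read your pages.">
  <meta property="og:url" content="https://example.com/semantic-markup">
  <meta property="og:image" content="https://example.com/images/semantic-markup.png">

  <!-- Tells Twitter which card layout to use for the shared link -->
  <meta name="twitter:card" content="summary">
</head>
```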
Resources & More Reading:
- Overview: W3Schools Semantic Elements
- In-Depth: HTML5 Doctor: Let’s Talk Semantics
- Official: W3C