Search engines have made it clear: a vitally important part of the future of search is “rich results.” While controversial among SEOs (see Ahrefs’ “Are Google’s SERP Features Stealing Traffic From Your Site?”), it seems like every few months another box is added to Google’s Search Gallery.
Google is pretty good at understanding the general context of a site’s content. When it comes to intuiting the specifics of a page, though (usually the most important information to a searcher), crawlers need some help. This is where structured data comes in!
First, “structured data” is a general term that refers to any organized data that conforms to a certain format. It is hardly just an SEO thing: relational databases, a foundation of modern computing, rely on structured data, and SQL (Structured Query Language) exists to query and manage it.
When a website wants a piece of content to be representative of a “thing” (like a profile page, an event page, or a job posting), its code needs to be marked up properly. By adding structured data, a site turns its HTML from an unstructured, general blob into a document that machines can parse unambiguously. The more your webpage reads like XML or a JSON object to a search engine, the cooler the things it can do with your content.
On the internet, the de facto “language” of structured data is schema.org. Schema.org is a democratic library of internet things. Take, for example, an airline flight: schema.org has a lexicon to notate the type of aircraft, the departure gate, and even a description of the meal service.
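To make the flight example concrete, here’s a minimal sketch that builds a schema.org Flight object and prints it as JSON-LD. The property names (flightNumber, aircraft, departureGate, mealService) come from the schema.org Flight type; the flight itself and all of its values are made up for illustration.

```python
import json

# Hypothetical flight: every value here is invented for illustration.
# Only the property names come from the schema.org Flight type.
flight = {
    "@context": "https://schema.org",
    "@type": "Flight",
    "flightNumber": "EX110",
    "aircraft": "Boeing 787",
    "departureGate": "11",
    "mealService": "Dinner service with a vegetarian option",
}

# Serialize to the JSON-LD text a crawler would read.
flight_json = json.dumps(flight, indent=2)
print(flight_json)
```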
The project was originally founded as a joint effort between Google, Microsoft, Yahoo, and Yandex. It remains open source and it is technically editable by anyone… but, like most anything run through the W3C, don’t expect that process to be simple. If the type of schema you want to use doesn’t actually exist, there is a technical and bureaucratic process you can go through to eventually get a new type of markup democratically included in the Schema.org library.
Schema.org was born from the spirit of “The Semantic Web,” a term coined by Tim Berners-Lee (the inventor of the World Wide Web). In the words of the W3C:
The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.
The 4 Ways of Structuring Data
There are four main semantic annotation formats you can use to structure data on the internet.
JSON-LD – JSON-LD is the newest player in town, and it is the format that Google regularly recommends. It is novel in how it exists on a page; instead of “tagging” individual HTML elements, think of JSON-LD as one big blob of informational code near the head that says to the crawler, “Ok! The type of aircraft is this, its departure gate is this, and the in-flight meal is this. Now, please continue on to the content.”
JSON-LD is also cool in that it can supply information about a page without there actually being any “visual” content to represent that info (no corresponding containers are mandated).
RDFa + GoodRelations – The OG counterparts of JSON-LD are “HTML extensions.” They are conceptually different: instead of having your structured data in one digestible block, HTML extension syntaxes are sprinkled throughout a page’s content, structuring your data on-the-fly. Think of this syntax as just another attribute, like a class, appended to your HTML containers.
Whenever I work with a client that is using a backend-restrictive platform (like Shopify), I still find RDFa useful in marking up very dynamic elements, like individual “review” objects. While elements can be injected as JSON-LD asynchronously, simply writing in HTML extensions is often cleaner and quicker.
Microdata – Another HTML extension syntax, microdata extends HTML5. It has largely fallen out of favor next to JSON-LD, though Google still supports it. It still pops up occasionally and is good to be familiar with.
Microformat aka μF – Microformat is most commonly seen in the form of hAtom/hentry. This may be me projecting, but I feel like microformat is a pariah. Probably the most common appearance of microformat is as an error in Search Console: many WordPress webmasters suffer rogue microformat injections via sloppy theme development.
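To see the difference between the “one big blob” style and the “sprinkled attributes” style, here’s a rough sketch that renders the same review both ways: once as a JSON-LD script block and once as RDFa attributes (vocab, typeof, property) on the visible HTML. The review data and the helper names as_json_ld and as_rdfa are hypothetical.

```python
import json
from html import escape

# Hypothetical review data, invented for illustration.
review = {"author": "Jane Doe", "reviewBody": "Great pizza.", "ratingValue": "5"}


def as_json_ld(r):
    """Render the review as a single JSON-LD block, separate from the visible HTML."""
    data = {
        "@context": "https://schema.org",
        "@type": "Review",
        "author": {"@type": "Person", "name": r["author"]},
        "reviewBody": r["reviewBody"],
        "reviewRating": {"@type": "Rating", "ratingValue": r["ratingValue"]},
    }
    return '<script type="application/ld+json">\n' + json.dumps(data, indent=2) + "\n</script>"


def as_rdfa(r):
    """Render the same review as visible HTML annotated inline with RDFa attributes."""
    return (
        '<div vocab="https://schema.org/" typeof="Review">\n'
        '  <span property="author" typeof="Person">'
        f'<span property="name">{escape(r["author"])}</span></span>\n'
        f'  <p property="reviewBody">{escape(r["reviewBody"])}</p>\n'
        '  <span property="reviewRating" typeof="Rating">'
        f'<span property="ratingValue">{escape(r["ratingValue"])}</span>/5</span>\n'
        "</div>"
    )


print(as_json_ld(review))
print(as_rdfa(review))
```

The JSON-LD version can live anywhere on the page, while the RDFa version only exists where the review is actually displayed.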
Honorable Mention: Data Highlighter
For sites “with just a few things to mark up,” Google also offers a tool within Search Console that allows a site owner to quickly click-and-drag to apply structured data. There are a couple big reasons to not use the Data Highlighter, though:
- Your data highlighter markup will break when anything in your pages’ formatting changes
- Your highlighting will only apply to Google and will be invisible to other search engines
How Structured Data Helps Your SEO
- Rich Snippets – Rich snippets, like the coveted gold stars, come from proper implementations of structured data. They tend to boost CTR by making a result stand out.
- Knowledge Graph – A brand or individual’s Knowledge Graph card can be influenced by the inclusion of structured data. The sameAs property can help your social profiles be displayed here.
- AMP, Google News, etc. – For successful inclusion into AMP or other Google programs like Google News, a site must include many different types of structured data. If your site is marked up well, you’ll also be well positioned for beta releases, like the new “events” card.
- Contextual Understanding – Search engines state they are better able to “understand” the context and intent of your page if you include strong structured data, even if there’s not a direct visible result. This affects how your site shows up in search and in what indexes you are included.
- Other search engines – Every search engine treats structured data in different ways. Yandex has some fields that are required for successful processing that aren’t required by Google. Baidu’s first page results rely heavily on structured data.
The Ranking Factor Myth
Structured data is not a ranking factor, full stop. This must be clearly understood before bringing it up to clients.
What we have seen in the past, though, is the “cheating” of search results due in part to structured data. Google will pull branded SERPs to the top of the stack when it thinks a searcher is querying for that business directly. So, say you own Tim’s Pizzeria in Brooklyn, and I search “tims pizza brooklyn” — your site usually appears first even if your backlink profile is crappy, content is light, etc.
If Google does not yet understand that your site equals the site of Tim’s Pizzeria, local structured data can help with that. And as I mentioned above, it can help with the Knowledge Graph, which is kind of a SERP of its own (that would be Organization markup).
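As a sketch of what that local markup could look like for Tim’s Pizzeria: the block below uses the schema.org Restaurant type (a subtype of LocalBusiness) with a sameAs list pointing at social profiles. The URL, address details, and profile links are placeholders, not real listings.

```python
import json

# Hypothetical local business markup. The URL, address and social
# profile links are placeholders invented for this example.
local_business = {
    "@context": "https://schema.org",
    "@type": "Restaurant",  # Restaurant is a subtype of LocalBusiness
    "name": "Tim's Pizzeria",
    "url": "https://www.example.com/",
    "servesCuisine": "Pizza",
    "address": {
        "@type": "PostalAddress",
        "addressLocality": "Brooklyn",
        "addressRegion": "NY",
        "addressCountry": "US",
    },
    # sameAs ties the site to other profiles that describe the same business.
    "sameAs": [
        "https://www.facebook.com/example-tims-pizzeria",
        "https://www.instagram.com/example-tims-pizzeria",
    ],
}

local_business_json = json.dumps(local_business, indent=2)
print(local_business_json)
```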
Structured data is not magic and it doesn’t add to a site’s ‘quality’ in the eyes of Google. It’s important SEOs understand its usefulness and impact.
On the Other Hand…
That said… there is an indirect way that structured data could in theory help with rankings: if dwell time and CTR are ranking factors, as many SEOs have suggested, rich snippets can significantly improve both of those metrics, which would hypothetically have a positive effect on rankings.
I have repeatedly seen search traffic increase with the implementation of Schema for clients because CTR pops up a few percent.
Let’s Try It!
Probably the easiest piece of JSON-LD that any site can install is “WebSite” structured data. This markup tells the world that your site “is a set of related web pages and other items typically served from a single web domain and accessible via URLs.” …AKA pretty much worthless information, but a good starting point!
Paste the WebSite markup into your site’s <head>, just like you would Google Analytics code, with ahrefs.com replaced by your site’s root canonical URL.
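A minimal sketch of that WebSite block, generated with a small helper so the URL is easy to swap. The function name website_json_ld and the name value are assumptions for illustration; the output is the script tag you would paste into the page.

```python
import json


def website_json_ld(root_url, name):
    """Build a WebSite JSON-LD script tag (hypothetical helper, for illustration)."""
    data = {
        "@context": "https://schema.org",
        "@type": "WebSite",
        "url": root_url,
        "name": name,
    }
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Swap in your own root canonical URL and site name.
snippet = website_json_ld("https://ahrefs.com/", "Ahrefs")
print(snippet)
```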
Many readers might also be wondering how they apply this for eCommerce – here’s an over-expanded product JSON-LD block to steal:
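Here’s a rough sketch of what an expanded product block can contain, built as a Python dict and printed as JSON-LD. The property names come from the schema.org Product, Offer, AggregateRating, and Review types; the product, prices, and review are all invented, and which properties a given search engine actually uses is not something this example asserts.

```python
import json

# Hypothetical product: names, prices, URLs and the review are invented.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",
    "sku": "EX-001",
    "description": "A sample widget used to illustrate product markup.",
    "image": ["https://www.example.com/images/widget.jpg"],
    "brand": {"@type": "Brand", "name": "Example Brand"},
    "offers": {
        "@type": "Offer",
        "url": "https://www.example.com/products/widget",
        "price": "19.99",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
        "itemCondition": "https://schema.org/NewCondition",
    },
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.6",
        "reviewCount": "27",
    },
    "review": [
        {
            "@type": "Review",
            "author": {"@type": "Person", "name": "Jane Doe"},
            "reviewBody": "Does exactly what it says.",
            "reviewRating": {"@type": "Rating", "ratingValue": "5"},
        }
    ],
}

product_json = json.dumps(product, indent=2)
print(product_json)
```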
One cool thing: Google can understand JSON-LD even when rendered asynchronously, so you can inject it into the page via a data layer (GTM), AJAX, etc. Here’s a great guide for that.
Structured Data Tools
For WordPress users, I’ll tentatively recommend “Schema” for a quick fix to the most crucial structured data needs. My disclaimer: a lot of SEO plugins’ structured data output is thin, garbage, or accidentally damaging.
What I mean by thin: Basically, because they are plugins, these tools often work by what is “inferred” from the page rather than what is specified directly. That means that they are beholden to WordPress hooks (author, datePublished, Featured Image, etc.)… which in turn makes their usefulness dependent on the theme’s developer. And when a site’s SEO is strictly dependent on developers, things usually get missed!
Also, Schema via plugins is never expansive. Google understands much, much more structured data than the search engine necessarily uses at any given time, so valid markup it doesn’t use yet won’t throw errors in the Testing Tool. This pool of understanding follows the expansion of the Schema.org library. I have had times where I’ve implemented super niche markup that is in Schema.org but that is not yet recognized by Google.
Sites that implement “experimental” schema find themselves winning immediately when G rolls out a new card because they covered all their bases. For example, look at how incredibly expansive Sephora’s product markup is — only half of those items are actively used in rich snippets, but others have been toyed with in the past (+ will be in the future). Take a peek at all the strange markup items that the NYT employs.
Here’s an example of granular experimental event markup I’ve implemented for a client:
This puts my client’s site in a few very exclusive clubs (for example, suggestedMinAge is used by just 100 to 1000 domains per Schema.org).
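For a sense of what granular event markup of this kind can look like, here’s a hypothetical sketch (not the client’s actual markup). It assumes suggestedMinAge is attached through an audience of type PeopleAudience, which is where schema.org defines that property; the event, venue, and dates are invented.

```python
import json

# Hypothetical event: the name, venue, dates and age limit are invented.
event = {
    "@context": "https://schema.org",
    "@type": "Event",
    "name": "Example Late Night Show",
    "startDate": "2030-06-01T20:00:00-04:00",
    "doorTime": "2030-06-01T19:00:00-04:00",
    "isAccessibleForFree": False,
    "typicalAgeRange": "21+",
    "location": {
        "@type": "Place",
        "name": "Example Venue",
        "address": {
            "@type": "PostalAddress",
            "addressLocality": "Brooklyn",
            "addressRegion": "NY",
        },
    },
    # suggestedMinAge lives on PeopleAudience, so it hangs off "audience".
    "audience": {"@type": "PeopleAudience", "suggestedMinAge": 21},
}

event_json = json.dumps(event, indent=2)
print(event_json)
```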
Another big problem with SEO plugins and schema… they are all trigger happy to implement basic schema, which can lead to structured data duplicates. Most of the time this isn’t a problem, but for some types of pages, like products, Google might assume that you have more than one product on that page rather than the same product with two different markups.
It’s an issue I’m working through with another client at the moment: Shopify injects its own basic product schema, which duplicates our expansive and rich product schema that has aggregateRating and reviews inline.
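One way to catch this kind of duplication early is to count the top-level types in a page’s JSON-LD blocks. The sketch below uses only the Python standard library; the class JsonLdExtractor, the function count_types, and the sample page are all hypothetical, and it deliberately ignores nested objects and @graph arrays.

```python
import json
from collections import Counter
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks rather than crash


def count_types(html):
    """Count top-level @type values across all JSON-LD blocks in the HTML."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    counts = Counter()
    for block in parser.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if isinstance(item, dict):
                types = item.get("@type", [])
                counts.update([types] if isinstance(types, str) else types)
    return counts


# Sample page: a basic platform-injected Product plus a richer custom one.
page = """
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Widget"}
</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Widget",
 "aggregateRating": {"@type": "AggregateRating", "ratingValue": "4.6", "reviewCount": "27"}}
</script>
"""

duplicates = {t: n for t, n in count_types(page).items() if n > 1}
print(duplicates)
```

Any type that shows up more than once is worth a manual look; on a product page, two Product blocks usually means one of them should be removed or merged.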
Some might also suggest https://www.schemaapp.com/… I’ve never used it so I can’t really vouch one way or another! But I see:
“Schema App is a suite of tools that allows digital marketers to create and manage Schema markup without requiring them to be an expert in the Schema.org language or writing code.”
Which leaves me cautiously optimistic for all the reasons listed above.
This Seems Overly Complicated
For immediate impact, just the baseline-level stuff will float most SEOs’ boats. The basics can usually be safely handled by plugins or add-ons. Expect to deal with the issues we covered earlier if you go that route!
Incoming bias (I’ve been on this train since the infamous 1.4.8 update debacle): please consider the beautifully transparent and lightweight TSF (The SEO Framework) as your SEO plugin solution over Yoast!
For those of us working in-house or on a bigger site, I feel the SEO industry should give more attention than it does to expansive markup. Think about it: a strong understanding of structured data is like a golden ticket into beta search engine experiments. It helps ensure that your organization is understood. And it’s not really something that needs to be actively maintained. If you get it right once, then (barring redesigns) it’s pretty much done forever.
Because it is code-driven, structured data is very much a boogeyman that SEOs love to hate and ignore. I fully expect the “Technical SEO is makeup” crowd to push back here. But a good bit of SEO is covering our bases, and Schema is under-served in that sense.
There is a complicated and infinitely vast underbelly to technical SEO, and a strong understanding of structured data is foundational. In the end, The Semantic Web might well be our own undoing; the more data we spoon-feed Google, the more Google can create cool modules and sap traffic away from site owners.
It’s also worth noting that whenever we structure our data well, we’re training search engines to better do it without us in the future. Don the tinfoil hats: Data Highlighter, while helpful, is a big ol’ machine learning ploy ☺.
In the meantime, though, the benefits of structured data are too huge to ignore. Despite the potential for lost traffic, good markup puts your site on the bleeding edge of new rich feature tech that Google is developing. I encourage all SEOs to dive in!