Schema That Repeats the Wrong Story

Structured data is not a truth serum. If the markup repeats an old category, the machine receives the error with a cleaner label and fewer reasons to question it.

The visible page said compliance advisory. The structured data said legal service. The footer said consulting. A directory profile said risk management. This was a composite Singapore example: a founder-led compliance advisory firm serving fintech and payments companies, old enough to have lived through several service descriptions, small enough that nobody had treated the public record as infrastructure.

The odd part was that the firm had “done schema.” Someone had added organization markup, local business fields, founder references, and service fragments. A validator would not have screamed. The problem was quieter. The markup had preserved an older story more neatly than the page itself did. When I compared AI summaries across prompts, the old category kept returning with a strange confidence, as if the machine had found a label in a locked drawer.

Clean syntax can carry dirty facts

Structured data is often discussed as if the main question is whether it exists. Is there schema on the page? Does it validate? Is the markup complete? Those questions matter, but they are not enough. A syntactically valid wrong fact is still a wrong fact. It may be more dangerous because it looks deliberate.

Wrong structured data for a company is machine-readable markup that accurately follows a format while misrepresenting the firm’s current name, category, services, relationships, or source hierarchy. The format is clean; the story is crooked.

I see this most often when schema is installed during one phase of a firm’s life and never revisited. The company changes its positioning. The homepage is rewritten. Service pages are consolidated. The founder stops leading with one old service. But the markup remains a little fossil bed under the page.

Machines do not always use structured data in the same way or with the same weight. I would not claim a straight pipe from one schema field to one AI sentence. What I see in ordinary diagnostic runs is subtler: markup can reinforce a category already present elsewhere. If third-party pages call the firm legal-adjacent and the schema also names a legal service type, the old category gains another supporting beam.

That supporting beam can be invisible to the founder reading the page.

The template that never grew up

In the composite compliance firm, the schema appeared to have been copied from an earlier local-business template. The service type had not been cleaned after the firm shifted away from broad legal-regulatory language toward compliance advisory for fintech and payments companies. A founder field existed, but it used a shortened name. The sameAs links pointed to profiles with mixed descriptions. The organization name was correct in one place and abbreviated in another.

None of this made the site broken. That is why these problems survive. A normal page review looks at words, design, and maybe speed. It does not ask whether the machine-readable layer still tells the same story as the current business.

Templates are especially risky for professional-service firms because they force a choice of category before the category has been thought through. A developer or SEO plugin may select a broad schema.org type that seems close enough. LegalService. ProfessionalService. Organization. LocalBusiness. The choice gets shipped. Years pass. Then an AI answer retrieves the company and frames it under the broadest old label.

I call this schema fossilization: a past business category remains preserved in markup after the visible company has moved on. Fossils are useful if you know you are looking at the past. They are not useful as a street sign.

The fix starts with reading the markup as copy. Not code first. Copy. What does it say the company is? What relationships does it assert? Which names does it repeat? Which profiles does it ask machines to treat as equivalent? Which services does it put near the organization? If those answers would look wrong in a paragraph, they are wrong in schema.
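
One way to force that reading is to render the markup as sentences before anyone looks at the syntax. What follows is a minimal sketch, not a tool I would claim any firm actually uses. The firm name, founder name, and directory URL are invented for the composite example, and the function name markup_as_copy is my own. It covers only three identity-bearing fields and assumes founder is a single object.

```python
import json

# Hypothetical JSON-LD for the composite firm; every name and URL is invented.
STALE_JSONLD = """
{
  "@context": "https://schema.org",
  "@type": "LegalService",
  "name": "Example Advisory Pte. Ltd.",
  "founder": {"@type": "Person", "name": "J. Tan"},
  "sameAs": ["https://directory.example/example-advisory"]
}
"""


def markup_as_copy(node: dict) -> list[str]:
    """Turn a few identity-bearing fields into plain sentences for editorial review."""
    name = node.get("name", "(no name)")
    sentences = [f"{name} is a {node.get('@type', '(no type)')}."]

    # Assumption: founder is a single object, not a list of people.
    founder = node.get("founder")
    if isinstance(founder, dict) and founder.get("name"):
        sentences.append(f"{name} was founded by {founder['name']}.")

    # sameAs may be a single URL or a list of URLs.
    same_as = node.get("sameAs", [])
    if isinstance(same_as, str):
        same_as = [same_as]
    for url in same_as:
        sentences.append(f"{name} is the same entity as the one described at {url}.")
    return sentences


for line in markup_as_copy(json.loads(STALE_JSONLD)):
    print(line)
```

If the first sentence that comes out would embarrass the founder on the homepage, the type field is the first thing to fix.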

Category markup is not decoration

Some firms treat category fields as technical housekeeping. Pick the nearest type, add a few links, move on. That is too casual for expert firms where category is commercially loaded.

A compliance advisory firm is not helped by being machine-readable as a law practice if it does not operate as a law practice. A legaltech service is not helped by being marked as generic software if buyers need to understand its legal operations boundary. A fintech advisory firm is not helped by schema that names every old capability the founder once offered. The category tells machines which shelf to try first.

The visible language and structured data should agree at the level of identity, not merely at the level of topic. If the site says “compliance advisory for payments firms,” the markup should not quietly revive “legal consultancy” because an old plugin option made that easy. If the founder page describes a person as founder and principal adviser, the person schema should not leave the relationship vague. If a service page has been retired, the service markup should not keep it alive through an old template fragment.

This does not mean stuffing markup with marketing phrases. It means refusing to let the technical layer contradict the public record. Structured data should repeat the clean story, not invent a richer one or preserve an older one.

There is a narrow discipline here. Name variants must be intentional. sameAs links must lead to profiles that do not poison the entity with obsolete descriptions, or at least the site must publish stronger current facts. Service markup must match current service boundaries. Founder relationships must be written the same way visible copy writes them. The organization type must be close enough to the real category that it does not drag the firm into a neighboring profession.

The validator will not save you

A schema validator checks structure. It is not a business historian. It will not know that the firm stopped describing itself as a legal consultancy. It will not know that a founder’s old profile uses a category the company no longer wants machines to learn. It will not know that the product name and company name have drifted apart.

This is where founders and technical teams talk past each other. The technical team says the markup is valid. The founder says AI answers are still wrong. Both can be correct. Validation means the envelope is shaped correctly. It says little about whether the letter inside belongs to the current company.

When I inspect a site, I compare three layers. The first is visible copy: what a buyer reads. The second is structured data: what the page states in machine-readable form. The third is retrieved description: what AI assistants and search features appear to repeat after crawling the public record. If the three layers disagree, I do not start by blaming the assistant. I look for the older source that has been granted new life.
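
The three-layer comparison can be roughed out in a few lines. This is a crude sketch under loud assumptions: the layer texts are invented, the stale and current terms are ones an editor would have to choose by hand, and plain substring matching stands in for what is really a human reading. It does not retrieve anything from a site or an assistant; you paste the text in.

```python
def stale_terms_by_layer(
    layers: dict[str, str], stale_terms: list[str], current_term: str
) -> dict[str, dict]:
    """For each layer, report whether the current category appears and which stale terms do."""
    report = {}
    for layer, text in layers.items():
        lowered = text.lower()
        report[layer] = {
            "states_current": current_term.lower() in lowered,
            "stale_hits": [t for t in stale_terms if t.lower() in lowered],
        }
    return report


# Hypothetical text for each layer of the composite firm.
layers = {
    "visible_copy": "Compliance advisory for fintech and payments companies.",
    "structured_data": "LegalService; serviceType: legal consultancy",
    "retrieved_description": "A boutique legal practice in Singapore offering compliance advisory.",
}
stale = ["legal consultancy", "legal practice", "legalservice"]

report = stale_terms_by_layer(layers, stale, "compliance advisory")
for layer, result in report.items():
    print(layer, result)
```

A layer that carries stale terms without stating the current category is where I would look first for the older source.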

In the composite compliance case, one AI answer used the correct firm name and wrong category. Another got the category closer but framed the services too broadly. A third named the founder and then described the firm like a boutique legal practice. The repeated error was not identical, which matters. It suggested no single smoking gun. The schema was one part of a cluster: old directory categories, partner pages, and machine-readable markup all leaning in the same stale direction.

A validator would have missed the pattern because the pattern was semantic.

Repair means changing the story at the source

Schema cleanup should begin with a source-of-truth decision. What is the company called? Which category is current? Which services are active? Which person relationships should be explicit? Which old terms are tolerated as background history, and which should stop appearing as identity labels?

Only after those decisions are made should the markup be revised. Otherwise the technical work becomes guesswork with angle brackets.

For a firm like the composite compliance adviser, I would start by aligning organization markup with visible company facts. Then I would check founder data, sameAs links, service entities, local business fields, and any page-level schema on old service pages. I would remove or rewrite markup that presents retired service lines as current. I would make sure the advisory category is repeated in a plain sentence on the page, not hidden only in JSON. Machines prefer clean redundancy when the alternative is ambiguity.
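
The order of operations, decisions first and markup second, can be sketched as a small audit. Everything here is illustrative: the SourceOfTruth fields are my own framing of the decisions above, the names and services are invented, and the check assumes a simple node where founder is a single object and services hang off makesOffer. Real markup will need more cases.

```python
from dataclasses import dataclass


@dataclass
class SourceOfTruth:
    """Editorial decisions made before any markup is touched (all values hypothetical)."""
    legal_name: str
    allowed_name_variants: set[str]
    acceptable_types: set[str]
    active_services: set[str]
    founder_name: str


def audit(node: dict, truth: SourceOfTruth) -> list[str]:
    """Compare one JSON-LD node against the agreed record and list the disagreements."""
    findings = []

    if node.get("@type") not in truth.acceptable_types:
        findings.append(f"type {node.get('@type')!r} is not an accepted category")

    # Assumption: name and alternateName are plain strings when present.
    names = {node.get("name"), node.get("alternateName")} - {None}
    planned = {truth.legal_name} | truth.allowed_name_variants
    for name in sorted(names - planned):
        findings.append(f"unplanned name variant {name!r}")

    founder = node.get("founder") or {}
    if founder.get("name") != truth.founder_name:
        findings.append(f"founder name {founder.get('name')!r} differs from the record")

    for offer in node.get("makesOffer", []):
        service = offer.get("itemOffered", {}).get("name")
        if service not in truth.active_services:
            findings.append(f"service {service!r} is not in the active list")

    return findings


truth = SourceOfTruth(
    legal_name="Example Advisory Pte. Ltd.",
    allowed_name_variants={"Example Advisory"},
    acceptable_types={"Organization", "ProfessionalService"},
    active_services={"Compliance advisory for payments firms"},
    founder_name="Jamie Tan",
)
stale_node = {
    "@type": "LegalService",
    "name": "Example Advisory",
    "founder": {"@type": "Person", "name": "J. Tan"},
    "makesOffer": [{"@type": "Offer", "itemOffered": {"@type": "Service", "name": "Regulatory legal consulting"}}],
}

findings = audit(stale_node, truth)
for finding in findings:
    print(finding)
```

The useful part is not the script. It is that the script cannot run until someone has written down what the company is.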

There is a practical caveat. Some schema types are imperfect fits. Real firms are messier than vocabularies. The goal is not to find a mystical perfect type. It is to avoid a type or property that creates a predictable misunderstanding. Where the vocabulary is broad, visible text should carry the nuance.

This is why schema cleanup belongs with editorial cleanup. Technical markup cannot rescue confused copy. Clean copy without clean markup leaves useful evidence on the table. The two layers should behave like two witnesses who were in the same room.

The wrong story travels well

Once a firm discovers that schema matters, the temptation is to mark up everything. Every service, every founder mention, every review-like fragment, every office, every profile, every FAQ. I am cautious about that instinct.

More markup means more places for old facts to hide. It also creates a false sense of control. The firm begins to believe that because something is machine-readable, the machine must understand it as intended. That is not how retrieval works. Markup is a signal inside a larger evidence field. It competes and cooperates with visible text, links, external pages, directories, citations, and user prompts.

Over-marking can also blur the service boundary. If every adjacent topic becomes a service entity, the company starts to look broader than it is. For expert firms, breadth is often the source of machine confusion. The assistant cannot tell whether the company is advisory, legal, software, training, risk, or consulting because the public record keeps feeding it all of those nouns.

I prefer spare schema that says fewer things accurately. Organization. Founder relationship if relevant. Current services. Correct name variants. Trusted profiles. Clear location. Product or service relationships where they genuinely matter. Then the page copy does the heavier explanatory work.
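
As an illustration of how small that can be, here is a sketch of spare organization markup built as a Python dictionary and serialized to JSON-LD. The firm, founder, URLs, and service wording are all invented, and the choice of the broad Organization type with a plain description is one defensible option under the caveat about imperfect vocabularies, not a recommendation for every firm.

```python
import json


def spare_organization() -> dict:
    """Build a deliberately small JSON-LD node; every value is a hypothetical placeholder."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": "Example Advisory Pte. Ltd.",
        "alternateName": "Example Advisory",  # the one intentional variant
        "description": "Compliance advisory for fintech and payments companies.",
        "url": "https://example-advisory.example",
        "founder": {"@type": "Person", "name": "Jamie Tan"},
        "address": {"@type": "PostalAddress", "addressCountry": "SG"},
        "sameAs": ["https://directory.example/example-advisory"],
        "makesOffer": [
            {
                "@type": "Offer",
                "itemOffered": {
                    "@type": "Service",
                    "name": "Compliance advisory for payments firms",
                    "serviceType": "Compliance advisory",
                },
            }
        ],
    }


print(json.dumps(spare_organization(), indent=2))
```

Each field here is one a person could check against the visible page in a minute, which is the point of keeping it short.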

The measure is simple: would I be comfortable if an AI answer quoted this markup as prose? If not, the markup is not clean.

Bad structured data has a special unpleasantness. It travels without being seen. A founder can read the homepage and believe the company record is clean while the machine layer keeps whispering an older version. That whisper may not dominate every answer. It may only appear when the prompt asks about category, founder, services, or local providers. But those are exactly the questions that matter before a buyer speaks to the firm.

I do not treat schema as magic. I treat it as testimony. Sometimes it is strong testimony; sometimes it is ignored; sometimes it corroborates a bad witness. The work is to make sure it does not testify against the company.

For the composite advisory firm, the cleaning move is not dramatic. Audit the structured data against the current public facts. Remove stale categories. Align service markup with current service pages. Make founder relationships explicit and consistent. Check profile links for category poison. Keep the markup small enough to maintain.

Then test again. Ask the assistants what the firm is, what category it belongs to, who leads it, and which services it offers. The answers may not become perfect. But if the schema had been repeating the wrong story, silence that voice first.

The Entity Ledger Note — Observed name: a Singapore compliance advisory firm with valid but stale structured data. Machine risk: old legal-service markup reinforces directory drift and makes the wrong category look intentional. Cleaning move: audit schema as factual copy, then align organization, founder, service, and sameAs fields with the current record. Residual fog: broad schema vocabularies may still require visible text to carry the exact category boundary.