That's not a semantic layer

The Colosseum

Google the term “Semantic Layer” and you’ll see a raft of sponsored answers from vendors of BI and data virtualization tools. The essential idea is that the semantic layer provides a mapping of business terminology to the underlying data assets.

The term is increasingly being added to the marketing language for numerous tools and, while correlation is not causation, interest in the topic has also risen recently according to Google Trends:

Google Trends for “Semantic Layer” since 2009

It’s an axiom in Computer Science that every layer of abstraction brings a cost. In the case of a semantic layer, we must endeavor to keep the layer current with the underlying physical data stores. Should that currency be lost, business users won’t trust it: instead, they will work around it and create new silos of information.

Thus, the semantic layer had better deliver real value that outweighs that cost. I am deeply skeptical that the current ones, as defined by the industry, do.

The value of a semantic layer is to map underlying data to business terminology that everyone agrees on. But whose terminology? The products out there let the business create its own. This is misguided. Creating our own terminology should be the exception, not the rule.

For business terminology to really deliver value, it must be shared by a community of practitioners that spans multiple businesses. If I describe an oil well using OSDU, a temperature using QUDT, or a grace period on a loan via FIBO, then I describe it in a way that people new to my business can easily consume (since the terminology is well documented) and that they perhaps already know. If I use my own definitions, they have to learn them. The same applies when I share data outside of my organizational boundary.

A true semantic layer product would have knowledge of these open standards and guide the user to describe their data using them, rather than creating more definitions. But that’s a harder product to implement. The industry takes the easy way out and gives the customer what they think they want, rather than guiding them to what they really need.
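
To make that kind of guidance concrete, here is a minimal sketch of a "standards first" lookup. Everything in it is an assumption for illustration: the tiny hand-built catalog, the function and column names, and the `https://example.com/terms/` fallback namespace are all hypothetical, and the two QUDT IRIs are quoted from memory and should be checked against the published vocabulary. The idea is only that a tool searches the open standards before it lets anyone mint a local definition.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical catalog of terms drawn from open standards.
# The IRIs are illustrative; verify them against the published vocabularies.
STANDARD_TERMS = {
    "temperature": "http://qudt.org/vocab/quantitykind/Temperature",
    "degree celsius": "http://qudt.org/vocab/unit/DEG_C",
}


@dataclass(frozen=True)
class Mapping:
    column: str        # physical column in the underlying data store
    term: str          # IRI of the term that describes the column
    is_standard: bool  # True if the term comes from an open standard


def suggest_terms(label: str) -> list[str]:
    """Return IRIs of standard terms whose name shares a word with `label`."""
    words = set(label.lower().replace("_", " ").split())
    return [iri for name, iri in STANDARD_TERMS.items() if words & set(name.split())]


def map_column(
    column: str,
    label: str,
    local_namespace: str = "https://example.com/terms/",  # hypothetical namespace
) -> Mapping:
    """Prefer a standard term; mint a local one only when nothing matches."""
    matches = suggest_terms(label)
    if matches:
        return Mapping(column, matches[0], True)
    local_term = local_namespace + label.lower().replace(" ", "_")
    return Mapping(column, local_term, False)


standard = map_column("well_temp_c", "Temperature")
custom = map_column("cust_tier", "Loyalty Tier")
print(standard)
print(custom)
```

In this sketch the exception path (a locally minted term) is still available, but it is flagged with `is_standard=False` so it can be reviewed rather than silently becoming the default. A real product would need far better matching than shared words, which is part of why this is the harder product to build.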

A semantic layer that implements this? That I can get behind - at least until the underlying datastores solve this. 😃