Archive for the ‘Standards’ Category

A Brief Ontology Of Time

May 23, 2018

Preamble

The melting of digital fences between enterprises and business environments is putting a new light on the way time has to be taken into account.

Joseph_Koudelka_time

Time is what happens between events (Josef Koudelka)

The shift can be illustrated by the EU GDPR : by introducing legal constraints on the notifications of changes in personal data, regulators put systems’ internal events on the same standing as external ones and make all time-scales equal whatever their nature.

Ontological Limit of WC3 Time Recommendation

The W3C recommendation for OWL time description is built on the well accepted understanding of temporal entity, duration, and position:

Cake_time

While there isn’t much to argue with what is suggested, the puzzle comes from what is missing, namely the modalities of time: the recommendation makes use of calendars and time-stamps but ignores what is behind, i.e time ontological dimensions.

Out of the Box

As already expounded (Ontologies & Enterprise Architecture) ontologies are at their best when a distinction can be maintained between representation and semantics. That point can be illustrated here by adding an ontological dimension to the W3C description of time:

  1. Ontological modalities are introduced by identifying (#) temporal positions with regard to a time-frame.
  2. Time-frames are open-ended temporal entities identified (#) by events.
Cake_timeOnto

How to add ontological modalities to time

It must be noted that initial truth-preserving properties still apply across ontological modalities.

Conclusion: OWL Descriptions Should Not Be Confused With Ontologies

Languages are meant to combine two primary purposes: communication and symbolic representation, some (e.g natural, programming) being focused on the former, other (e.g formal, specific) on the latter.

The distinction is somewhat blurred with languages like OWL (Web Ontology Language) due to the versatility and plasticity of semantic networks.

 

Ontologies and profiles are meant to target domains, profiles and domains are modeled with languages, including OWL.

That apparent proficiency may induce some confusion between languages and ontologies, the former dealing with the encoding of time representations, the latter with time modalities.

Further Readings

External Links

Advertisements

GDPR Ontological Primer

May 10, 2018

Preamble

European Union’s General Data Protection Regulation (GDPR), to come into effect  this month, is a seminal and momentous milestone for data privacy .

Nothing Personal (Arthur Szyk)

Yet, as reported by Reuters correspondents, European enterprises and regulators are not ready; more worryingly, few (except consultants) are confident about GDPR direction.

Misgivings and uncertainties should come as no surprise considering GDPR’s two innate challenges:

  • Regulating privacy rights represents a very ambitious leap into a digital space now at the core of corporate business strategies.
  • Compliance will not be put under a single authority but be overseen by an assortment of national and regional authorities across the European Union.

On that account, ontologies appear as the best (if not the only) conceptual approach able to bring contexts (EU nations), concerns (business vs privacy), and enterprises (organization and systems) into a shared framework.

A workbench built with the Caminao ontological kernel is meant to explore the scope and benefits of that approach, with a beta version (Protégé/OWL 2) available for comments on the Stanford/Protégé portal using the link: Caminao Ontological Kernel (CaKe_GDPR).

Enterprise Architectures & Regulations

Compared to domain specific regulations, GDPR  is a governance-oriented regulation set across business concerns and enterprise organization; but unlike similarly oriented ones like accounting, GDPR is aiming at the nexus of business competition, namely the processing of data into information and knowledge. With such a strategic stake, compliance is bound to become a game-changer cutting across business intelligence, production systems, and decision-making. Hence the need for an integrated, comprehensive, and consistent approach to the different dimensions involved:

  • Concepts upholding businesses, organizations, and regulations.
  • Documentation with regard to contexts and statutory basis.
  • Regulatory options and compliance assessments
  • Enterprise systems architecture and operations

Moreover, as for most projects affecting enterprise architectures, carrying through GDPR compliance is to involve continuous, deep, and wide ranging changes that will have to be brought off without affecting overall enterprise performances.

Ontologies arguably provide a conclusive solution to the problem, if only because there is no other way to bring code, models, documents, and concepts under a single roof. That could be achieved by using ontologies profiles to frame GDPR categories along enterprise architectures models and components.

CakeGDPR_00.jpg

Basic GDPR categories and concepts (black color) as framed by the Caminao Kernel

Compliance implementation could then be carried out iteratively across four perspectives:

  • Personal data and managed information
  • Lawfulness of activities
  • Time and Events
  • Actors and organization.

Data & Information

To begin with, GDPR defines ‘personal data’ as “any information relating to an identified or identifiable natural person (‘data subject’)”. Insofar as logic is concerned that definition implies an equivalence between ‘data’ and ‘information’, an assumption clearly challenged by the onslaught of big data: if proofs were needed, the Cambridge Analytica episode demonstrates how easy raw data can become a personal affair. Hence the need to keep an ontological level of indirection between regulatory intents and the actual semantics of data as managed by information systems.

CakeGDPR_data

Managing the ontological gap between regulatory understandings and compliance footprints

Once lexical ambiguities set apart, the question is not so much about the data bases of well identified records than about the flows of data continuously processed: if identities and ownership are usually set upfront by business processes, attributions may have to be credited to enterprises know-how if and when carried out through data analytics.

Given that the distinctions are neither uniform, exclusive or final, ontologies will be needed to keep tabs on moves and motives. OWL 2 constructs (cf annex) could also help, first to map GDPR categories to relevant information managed by systems, second to sort out natural data from nurtured knowledge.

Activities & Purposes

Given footprints of personal data, the objective is to ensure the transparency and traceability of the processing activities subject to compliance.

Setting apart (see below for events) specific add-ons for notification and personal accesses,  charting compliance footprints is to be a complex endeavor: as there is no reason to assume some innate alignment of intended (regulation) and actual (enterprise) definitions, deciding where and when compliance should apply potentially calls for a review of all processing activities.

After taking into account the nature of activities, their lawfulness is to be determined by contexts (‘purpose limitation’ and ‘data minimization’) and time-frames (‘accuracy’ and ‘storage limitation’). And since lawfulness is meant to be transitive, a comprehensive map of the GDPR footprint is to rely on the logical traceability and transparency of the whole information systems, independently of GDPR.

That is arguably a challenging long-term endeavor, all the more so given that some kind of Chinese Wall has to be maintained around enterprise strategies, know-how, and operations. It ensues that an ontological level of indirection is again necessary between regulatory intents and effective processing activities.

Along that reasoning compliance categories, defined on their own, are first mapped to categories of functionalities (e.g authorization) or models (e.g use cases).

CakeGDPR_activ1

Compliance categories are associated upfront to categories of functionalities (e.g authorization) or models (e.g use cases).

Then, actual activities (e.g “rateCustomerCredit”) can be progressively brought into the compliance fold, either with direct associations with regulations or indirectly through associated models (e.g “ucRateCustomerCredit” use case).

CakeGDPR_activ2

Compliance as carried out through Use Case

The compliance backbone can be fleshed out using OWL 2 mechanisms (see annex) in order to:

  • Clarify the logical or functional dependencies between processing activities subject to compliance.
  • Qualify their lawfulness.
  • Draw equivalence, logical, or functional links between compliance alternatives.

That is to deal with the functional compliance of processing activities; but the most far-reaching impact of the regulation may come from the way time and events are taken into account.

Time & Events

As noted above, time is what makes the difference between data and information, and setting rules for notification makes that difference lawful. Moreover, by adding time constraints to the notifications of changes in personal data, regulators put systems’ internal events on the same standing as external ones. That apparently incidental departure echoes the immersion of systems into digitized business environments, making all time-scales equal whatever their nature. Such flattening is to induce crucial consequences for enterprise architectures.

That shift together with the regulatory intent are best taken into account by modeling events as changes in expectations, physical objects, processes execution, and symbolic objects, with personal data change belonging to the latter.

Gdpr events

Mapping internal (symbolic) and external (actual) events is a critical element of GDPR compliance

Putting apart events specific to GDPR (e.g data breaches), compliance with regard to accuracy and storage limitation regulations will require that all events affecting personal data:

  • Are set in time-frames, possibly overlapping.
  • Have notification constraints properly documented.
  • Have likelihood and costs of potential risks assessed.

As with data and activities, OWL 2 constructs are to be used to qualify compliance requirements.

Actors & Organization

GDPR introduces two specific categories of actors (aka roles): one (data subject) for natural persons, and one for actors set by organizations, either specifically for GDPR assignment, or by delegation to already defined actors.

Gdpr actors

GDPR roles can be set specifically or delegated

OWL 2 can then be used to detail how regulatory roles can be delegated to existing ones, enabling a smooth transition and a dynamic adjustment of enterprise organization with regulatory compliance.

It must be stressed that the semantic distinction between identified agents (e.g natural persons) and the roles (aka UML actors) they play in processes is of particular importance for GDPR compliance because who (or even what) is behind an actor interacting with a system is to remain unknown to the system until the actor can be authentically identified. If that ontological lapse is overlooked there is no way to define and deal with security, confidentiality or privacy regulations.

Conclusion

The use of ontologies brings clear benefits for regulators, enterprise governance, and systems architects.

Without shared conceptual guidelines chances are for the European regulatory orchestra to get lost in squabbles about minutiae before sliding into cacophony.

With regard to governance, bringing systems and regulations into a common conceptual framework is to enable clear and consistent compliance strategies and policies, as well as smooth learning curves.

With regard to architects, ontology-based compliance is to bring cross benefits and externalities, e.g from improved traceability and transparency of systems and applications.

Annex A: Mapping Regulations to Models (sample)

To begin with, OWL 2 can be used to map GDPR categories to relevant resources as managed by information systems:

  • Equivalence: GDPR and enterprise definitions coincide.
  • Logical intersection, union, complement: GDPR categories defined by, respectively, a cross, merge, or difference of enterprise definitions.
  • Qualified association between GDPR and enterprise categories.

Assuming the categories properly identified, the language can then be employed to define the sets of regulated instances:

  • Logical property restrictions, using existential and universal quantification.
  • Functional property restrictions, using joints on attributes values.

Other constructs, e.g cardinality or enumerations, could also be used for specific regulatory constraints.

Finally, some OWL 2 built-in mechanisms can significantly improve the assessment of alternative compliance policies by expounding regulations with regard to:

  • Equivalence, overlap, or complementarity.
  • Symmetry or asymmetry.
  • Transitivity
  • etc.

Annex B: Mapping Regulations to Capabilities

GDPR can be mapped to systems capabilities using well established Zachman’s taxonomy set by crossing architectures functionalities (Who,What,How, Where, When) and layers (business and organization), systems (logical structures and functionalities), and platforms (technologies).

Rules_GDPR

Regulatory Compliance vs Architectures Capabilities

These layers can be extended as to apply uniformly across external ontologies, from well-defined (e.g regulations) to fuzzy (e.g business prospects or new technologies) ones, e.g:

Ontologies, capabilities (Who,What,How, Where, When), and architectures (enterprise, systems, platforms).

Such mapping is to significantly enhance the transparency of regulatory policies.

Further Reading

External Links

Business Intelligence & Semantic Galaxies

March 26, 2018

Given the number and verbosity of alternative definitions pertaining to enterprise and systems architectures, common sense would suggest circumspection if not agnosticism. Instead, fierce wars are endlessly waged for semantic positions built on sand hills bound to crumble under whoever tries to stand defending them.

Nature & Nurture (Wang Xingwei)

Such doomed attempts appear to be driven by a delusion seeing concepts as frozen celestial bodies; fortunately, simple-minded catalogs of unyielding definitions are progressively pushed aside by the need to understand (and milk) the new complexity of business environments.

Business Intelligence: Mapping Semantics to Circumstances

As long as information systems could be kept behind Chinese walls semantic autarky was of limited consequences. But with enterprises’ gates crumbling under digital flows, competitive edges increasingly depend on open and creative business intelligence (BI), in particular:

  • Data understanding: giving form and semantics to massive and continuous inflows of raw observations.
  • Business understanding: aligning data understanding with business objectives and processes.
  • Modeling: consolidating data and business understandings into descriptive, predictive, or operational models.
  • Evaluation: assessing and improving accuracy and effectiveness of understandings with regard to business and decision-making processes.

BI: Mapping Semantics to Circumstances

Since BI has to take into account the continuity of enterprise’s objectives and assets, the challenge is to dynamically adjust the semantics of external (business environments) and internal (objects and processes) descriptions. That could be explained in terms of gravitational semantics.

Semantic Galaxies

Assuming concepts are understood as stars wheeling across unbounded and expanding galaxies, semantics could be defined by gravitational forces and proximity between:

  • Intensional concepts (stars) bearing necessary meaning set independently of context or purpose.
  • Extensional concepts (planets) orbiting intensional ones. While their semantics is aligned with a single intensional concept, they bear enough of their gravity to create a semantic environment.

On that account semantic domains would be associated to stars and their planets, with galaxies regrouping stars (concepts) and systems (domains) bound by gravitational forces (semantics).

Galax_00

Conceptual Stars & Planets

Semantic Dimensions & the Morphing of Concepts

While systems don’t leave much, if any, room for semantic wanderings, human languages are as good as they can be pliant, plastic, and versatile. Hence the need for business intelligence to span the stretch between open and fuzzy human semantics and systems straight-jacketed modeling languages.

That can be done by framing the morphing of concepts along Zachman’s architecture description: intensional concepts being detached of specific contexts and concerns are best understood as semantic roots able to breed multi-faceted extensions, to be eventually coerced into system specifications.

Galax_Dims

Framing concepts metamorphosis along Zachman’s architecture dimensions

The Alignment of Planets

As stars, concepts can be apprehended through a mix of reason and perception:

  • Figured out from a conceptual void waiting to be filled.
  • Fortuitously discovered in the course of an argument.

The benefit in both cases would be to delay verbal definitions and so to avoid preempted or biased understandings: as for the Schrödinger’s cat, trying to lock up meanings with bare words often breaks their semantic integrity, shattering scraps in every direction.

In contrast, making room for semantic alignments would help to consolidate overlapping definitions within conceptual galaxies, as illustrated by the examples below.

Example: Data

Wikipedia: Any sequence of one or more symbols given meaning by specific act(s) of interpretation; requires interpretation to become information.

Merriam-Webster: Factual information such as measurements or statistics; information in digital form that can be transmitted or processed; information and noise from a sensing device or organ that must be processed to be meaningful.

Cambridge Dictionary: Information, especially facts or numbers; information in an electronic form that can be stored and used by a computer.

Collins: Information that can be stored and used by a computer program.

TOGAF: Basic unit of information having a meaning and that may have subcategories (data items) of distinct units and values.

Galax_DataInfo

Example: System

Wikipedia: A regularly interacting or interdependent group of items forming a unified whole; Every system is delineated by its spatial and temporal boundaries, surrounded and influenced by its environment, described by its structure and purpose and expressed in its functioning.

Merriam-Webster: A regularly interacting or interdependent group of items forming a unified whole

Business Dictionary: A set of detailed methods, procedures and routines created to carry out a specific activity, perform a duty, or solve a problem; organized, purposeful structure that consists of interrelated and interdependent elements.

Cambridge Dictionary: A set of connected things or devices that operate together

Collins Dictionary: A way of working, organizing, or doing something which follows a fixed plan or set of rules; a set of things / rules.

TOGAF: A collection of components organized to accomplish a specific function or set of functions (from ISO/IEC 42010:2007).

Further Reading

External Links

Unified Architecture Framework Profile (UAFP): Lost in Translation ?

July 2, 2017

Synopsis

The intent of Unified Architecture Framework Profile (UAFP) is to “provide a Domain Meta-model usable by non UML/SysML tool vendors who may wish to implement the UAF within their own tool and metalanguage.”

Detached Architecture (Víctor Enrich)

But a meta-model trying to federate (instead of bypassing) the languages of tools providers has to climb up the abstraction scale above any domain of concerns, in that case systems architectures. Without direct consideration of the domain, the missing semantic contents has to be reintroduced through stereotypes.

Problems with that scheme appear at two critical junctures:

  • Between languages and meta-models, and the way semantics are introduced.
  • Between environments and systems, and the way abstractions are defined.

Caminao’s modeling paradigm is used to illustrate the alternative strategy, namely the direct stereotyping of systems architectures semantics.

Languages vs Stereotypes

Meta-Models are models of models: just like artifacts of the latter represent sets of instances from targeted domains, artifacts of the former represent sets of symbolic artifacts from the latter. So while set higher on the abstraction scale, meta-models still reflect the domain of concerns.

Meta-models takes a higher view of domains, meta-languages don’t.

Things are more complex for languages because linguistic constructs ( syntax and semantics) and pragmatic are meant to be defined independently of domain of discourse. Taking a simple example from the model above, it contains two kinds of relationships:

  • Linguistic constructs:  represents, between actual items and their symbolic counterparts; and inherits, between symbolic descriptions.
  • Domain specific: played by, operates, and supervises.

While meta-models can take into account both categories, that’s not the case for languages which only consider linguistic constructs and mechanisms. Stereotypes often appear as a painless way to span the semantic fault between what meta-models have to do and what languages use to do; but that is misguided because mixing domain specific semantics with language constructs can only breed confusion.

Stereotypes & Semantics

If profiles and stereotypes are meant to refine semantics along domains specifics, trying to conciliate UML/SysML languages and non UML/SysML models puts UAFP in a lopsided position by looking the other way, i.e towards one-fits-all meta-language instead of systems architecture semantics. Its way out of this conundrum is to combine stereotypes with UML constraint, as can be illustrated with PropertySet:

UAFP for PropertySet (italics are for abstract)

Behind the mixing of meta-modeling levels (class, classifier, meta-class, stereotype, meta-constraint) and the jumble of joint modeling concerns (property, measurement, condition), the PropertySet description suggests the overlapping of two different kinds of semantics, one looking at objects and behaviors identified in environments (e.g asset, capability, resource); the other focused on systems components (property, condition, measurement). But using stereotypes indifferently for both kind of semantics has consequences.

Stereotypes, while being the basic UML extension mechanism, comes without much formalism and can be applied extensively. As a corollary, their semantics must be clearly defined in line with the context of their use, in particular for meta-languages topping different contexts.

PropertySet for example is defined as an abstract element equivalent to a data type, simple or structured, a straightforward semantic that can be applied consistently for contexts, domains or languages.

That’s not the case for ActualPropertySet which is defined as an InstanceSpecification for a “set or collection of actual properties”. But properties defined for domains (as opposed to languages) have no instances of their own and can only occur as concrete states of objects, behaviors, or expectations, or as abstract ranges in conditions or constraints. And semantics ambiguities are compounded when inheritance is indifferently applied between a motley of stereotypes.

Properties epitomize the problems brought about by confusing language and domain stereotypes and point to a solution.

To begin with syntax, stereotypes are redundant because properties can be described with well-known language constructs.

As for semantics, stereotyped properties should meet clearly defined purposes; as far as systems architectures are concerned, that would be the mapping to architecture capabilities:

Property must be stereotyped with regard to induced architecture capabilities.

  • Properties that can be directly and immediately processed, symbolic (literal) or not (binary objects).
  • Properties whose processing depends on external resource, symbolic (reference) or not (numeric values).

Such stereotypes could be safely used at language level due to the homogeneity of property semantics. That’s not the case for objects and behaviors.

Languages Abstractions & Symbolic Representations

The confusion between language and domain semantics mirrors the one between enterprise and systems, as can be illustrated by UAFP’s understanding of abstraction.

In the context of programming languages, isAbstract applies to descriptions that are not meant to be instantiated: for UAFP “PhysicalResource” isAbstract because it cannot occur except as “NaturalResource” or “ResourceArtifact”, none of them isAbstract.

“isAbstract” has no bearing on horses and carts, only on the meaning of the class PhysicalResource.

Despite the appearances, it must be reminded that such semantics have nothing to do with the nature of resources, only with what can be said about it. In any case the distinction is irrelevant as long as the only semantics considered are confined to specification languages, which is the purpose of the UAFP.

As that’s not true for enterprise architects, confusion is to arise when the modeling Paradigm is extended as to include environments and their association with systems. Then, not only that two kinds of instances (and therefore abstractions) are to be described, but that the relationship between external and internal instances is to determine systems architectures capabilities. Extending the simple example above:

  • Overlooking the distinction between active and passive physical resources prevents a clear and reliable mapping to architecture technical capabilities.
  • Organizational resource lumps together collective (organization), individual and physical (person), individual and organizational (role), symbolic (responsibility), resources. But these distinctions have a direct consequences for architecture functional capabilities.

Abstraction & Symbolic representation

Hence the importance of the distinction between domain and language semantics, the former for the capabilities of the systems under consideration, the latter for the capabilities of the specification languages.

Systems Never Walk Alone

Profiles are supposed to be handy, reliable, and effective guides for the management of specific domains, in that case the modeling of enterprise architectures. As it happens, the UAF profile seems to set out the other way, forsaking architects’ concerns for tools providers’ ones; that can be seen as a lose-lose venture because:

  • There isn’t much for enterprise architects along that path.
  • Tools interoperability would be better served by a parser focused on languages semantics independently of domain specifics.

Hopefully, new thinking about architecture frameworks (e.g DoDAF) tends to restyle them as EA profiles, which may help to reinstate basic requirements:

  • Explicit modeling of environment, enterprise, and systems.
  • Clear distinction between domain (enterprise and systems architecture) and languages.
  • Unambiguous stereotypes with clear purposes

A simple profile for enterprise architecture

On a broader perspective understanding meta-models and profiles as ontologies would help with the alignment of purposes (enterprise architects vs tools providers), scope (enterprise vs systems), and languages (modeling vs programming).

Back to Classics: Ontologies

As introduced long ago by philosophers, ontologies are meant to make sense of universes of discourse. To be used as meta-models and profiles ontologies must remain neutral and support representation and contents semantics independently of domains of concern or perspective.

With regard to neutrality, the nature of semantics should tally the type of nodes (top):

  • Nodes would represent elements specific to domains (bottom right).
  • Connection nodes would be used for semantically neutral (aka syntactic) associations to be applied uniformly across domains (bottom left).

That can be illustrated with the simple example of cars:

KM_CaseRaw

RDF graphs (top) support formal (bottom left) and domain specific (bottom right) semantics.

With regard to contexts, ontologies should be defined according to the nature of governance and stability:

  • Institutional: Regulatory authority, steady, changes subject to established procedures.
  • Professional: Agreed upon between parties, steady, changes subject to accords.
  • Corporate: Defined by enterprises, changes subject to internal decision-making.
  • Social: Defined by usages, volatile, continuous and informal changes.
  • Personal: Customary, defined by named individuals (e.g research paper).

Ontologies set along that taxonomy could also be refined as to be aligned with enterprise architecture layers: enterprise, systems, platforms, e.g:

Ontologies, capabilities (Who,What,How, Where, When), and architectures (enterprise, systems, platforms).

With regard to concerns ontologies should  focus on the epistemic nature of targeted items: terms, documents, symbolic representations, or actual objects and phenomena. That would outline four basic concerns that may or may not be combined:

  • Thesaurus: ontologies covering terms and concepts.
  • Document Management: ontologies covering documents with regard to topics.
  • Organization and Business: ontologies pertaining to enterprise organization, objects and activities.
  • Engineering: ontologies pertaining to the symbolic representation of products and services.
KM_OntosCapabs

Ontologies: Purposes & Targets

More generally, understanding meta-models and profiles as functional ontologies is to bring all EA business and engineering concerns within a comprehensive and consistent conceptual framework.

A workbench built with the Caminao ontological kernel is meant to explore the scope and benefits of that approach, with a beta version (Protégé/OWL 2) available for comments on the Stanford/Protégé portal using the link: Caminao Ontological Kernel (CaKe_).

Further Reading

Models
Architectures
Enterprise Architecture
UML

External Links

NIEM & Information Exchanges

January 24, 2017

Preamble

The objective of the National Information Exchange Model (NIEM) is to provide a “dictionary of agreed-upon terms, definitions, relationships, and formats that are independent of how information is stored in individual systems.”

(Alfred Jensen)

NIEM’s model makes no difference between data and information (Alfred Jensen)

For that purpose NIEM’s model combines commonly agreed core elements with community-specific ones. Weighted against the benefits of simplicity, this architecture overlooks critical distinctions:

  • Inputs: Data vs Information
  • Dictionary: Lexicon and Thesaurus
  • Meanings: Lexical Items and Semantics
  • Usage: Roots and Aspects

That shallow understanding of information significantly hinders the exchange of information between business or institutional entities across overlapping domains.

Inputs: Data vs Information

Data is made of unprocessed observations, information makes sense of data, and knowledge makes use of information. Given that NIEM is meant to be an exchange between business or institutional users, it should have no concern with data mining or knowledge management.

Data is meaningless, information meaning is set by semantic domains.

As an exchange, NIEM should have no concern with data mining or knowledge management.

The problem is that, as conveyed by “core of data elements that are commonly understood and defined across domains, such as person, activity, document, location”, NIEM’s model makes no explicit distinction between data and information.

As a corollary, it implies that data may not only be meaningful, but universally so, which leads to a critical trap: as substantiated by data analytics, data is not supposed to mean anything before processed into information; to keep with examples, even if the definition of persons and locations may not be specific, the semantics of associated information is nonetheless set by domains, institutional, regulatory, contractual, or otherwise.

Data is meaningless, information meaning is set by semantic domains.

Data is meaningless, information meaning is set by semantic domains.

Not surprisingly, that medley of data and information is mirrored by NIEM’s dictionary.

Dictionary: Lexicon & Thesaurus

As far as languages are concerned, words (e.g “word”, “ξ∏¥” ,”01100″) remain data items until associated to some meaning. For that reason dictionaries are built on different levels, first among them lexical and semantic ones:

  • Lexicons take items on their words and gives each of them a self-contained meaning.
  • Thesauruses position meanings within overlapping galaxies of understandings held together by the semantic equivalent of gravitational forces; the meaning of words can then be weighted by the combined semantic gravity of neighbors.

In line with its shallow understanding of information, NIEM’s dictionary only caters for a lexicon of core standalone items associated with type descriptions to be directly implemented by information systems. But due to the absence of thesaurus, the dictionary cannot tackle the semantics of overlapping domains: if lexicons alone can deal with one-to-one mappings of items to meanings (a), thesauruses are necessary for shared (b) or alternative (c) mappings.

vv

Shared or alternative meanings cannot be managed with lexicons

With regard to shared mappings (b), distinct lexical items (e.g qualification) have to be mapped to the same entity (e.g person). Whereas some shared features (e.g person’s birth date) can be unequivocally understood across domains, most are set through shared (professional qualification), institutional (university diploma), or specific (enterprise course) domains .

Conversely, alternative mappings (c) arise when the same lexical items (e.g “mole”) can be interpreted differently depending on context (e.g plastic surgeon, farmer, or secret service).

Whereas lexicons may be sufficient for the use of lexical items across domains (namespaces in NIEM parlance), thesauruses are necessary if meanings (as opposed to uses) are to be set across domains. But thesauruses being just tools are not sufficient by themselves to deal with overlapping semantics. That can only be achieved through a conceptual distinction between lexical and semantic envelops.

Meanings: Lexical Items & Semantics

NIEM’s dictionary organize names depending on namespaces and relationships:

  • Namespaces: core (e.g Person) or specific (e.g Subject/Justice).
  • Relationships: types (Counselor/Person) or properties (e.g PersonBirthDate).
vvv

NIEM’s Lexicon: Core (a) and specific (b) and associated core (c) and specific (d) properties

But since lexicons know only names, the organization is not orthogonal, with lexical items mapped indifferently to types and properties. The result being that, deprived of reasoned guidelines, lexical items are chartered arbitrarily, e.g:

Based on core PersonType, the Justice namespace uses three different schemes to define similar lexical items:

  • “Counselor” is described with core PersonType.
  • “Subject” and “Suspect” are both described with specific SubjectType, itself a sub-type of PersonType.
  • “Arrestee” is described with specific ArresteeType, itself a sub-type of SubjectType.

Based on core EntityType:

  • The Human Services namespace bypasses core’s namesake and introduces instead its own specific EmployerType.
  • The Biometrics namespace bypasses possibly overlapping core Measurer and BinaryCaptured and directly uses core EntityType.
Lexical items are meshed disregarding semantics

Lexical items are chartered arbitrarily

Lest expanding lexical items clutter up dictionary semantics, some rules have to be introduced; yet, as noted above, these rules should be limited to information exchange and stop short of knowledge management.

Usage: Roots and Aspects

As far as information exchange is concerned, dictionaries have to deal with lexical and semantic meanings without encroaching on ontologies or knowledge representation. In practice that can be best achieved with dictionaries organized around roots and aspects:

  • Roots and structures (regular, black triangles) are used to anchor information units to business environments, source or destination.
  • Aspects (italics, white triangles) are used to describe how information units are understood and used within business environments.
nformation exchanges are best supported by dictionaries organized around roots and aspects

Information exchanges are best supported by dictionaries organized around roots and aspects

As it happens that distinction can be neatly mapped to core concepts of software engineering.

P.S. Thesauruses & Ontologies

Ontologies are systematic accounts of existence for whatever is considered, in other words some explicit specification of the concepts meant to make sense of a universe of discourse. From that starting point three basic observations can be made:

  1. Ontologies are made of categories of things, beings, or phenomena; as such they may range from simple catalogs to philosophical doctrines.
  2. Ontologies are driven by cognitive (i.e non empirical) purposes, namely the validity and consistency of symbolic representations.
  3. Ontologies are meant to be directed at specific domains of concerns, whatever they can be: politics, religion, business, astrology, etc.

With regard to models, only the second one puts ontologies apart: contrary to models, ontologies are about understanding and are not supposed to be driven by empirical purposes.

On that basis, ontologies can be understood as thesauruses describing galaxies of concepts (stars) and features (planets) held together by semantic gravitation weighted by similarity or proximity. As such ontologies should be NIEM’s tool of choice.

Further Reading

External Links

Zebras cannot be saddled or harnessed

September 23, 2016

As far as standards go, the more they are, the less they’re worth.

nn

Read my code, if you can …

What have we got

Assuming that modeling languages are meant to build abstractions, one would expect their respective ladders converging somewhere up in some conceptual or meta cloud.

Assuming that standards are meant to introduce similarities into diversity, one would expect clear-cut taxonomies to be applied to artifacts designs.

Instead one will find bounty of committees, bloated specifications, and an open-minded if clumsy language confronted to a number of specific ones.

What is missing

Given the constitutive role of mathematical logic in computing systems, its quasi absence in modeling methods of their functional behavior is dumbfounding. Formal logic, set theory, semiotics, name it, every aspect of systems modeling can rely on a well established corpus of concepts and constructs. And yet, these scientific assets may be used in labs for research purposes but they remain overlooked for any practical use; as if the laser technology had been kept out of consumers markets for almost a century.

What should be done

The current state of affairs can be illustrated by a Horse vs Zebra metaphor: the former with a long and proved track record of varied, effective and practical usages, the latter with almost nothing to its credit except its archetypal idiosyncrasy.

Like horses, logic can be harnessed or saddled to serve a wide range of purposes without loosing anything of its universality. By contrast, concurrent standards and modeling languages can be likened to zebras: they may be of some use for their owner, but from an outward perspective, what remains is their distinctive stripes.

So the way out of the conundrum seems obvious: get rid of the stripes and put back the harness of logic on all the modeling horses.

What Can Be Done

Frameworks are meant to promote consensus and establish clear and well circumscribed common ground; but the whopping range of OMG’s profiles and frameworks doesn’t argue in favor of meta-models.

Ontologies by contrast are built according to the semantics of domains and concerns. So whereas meta-models have to mix lexical, syntactic, and semantic constructs, ontologies can be built on well delineated layers.

An ontological kernel has been developed as a Proof of Concept of the benefits of ontologies for enterprise architecture, the purpose being to extend the Caminao enterprise architecture paradigm to contexts and environments. That kernel has been built on two principles:

  • A clear-cut distinction between truth-preserving representation and domain specific semantics.
  • Profiled ontologies designed according to the nature of contents (concepts, documents, or artifacts), layers (environment, enterprise, systems, platforms), and contexts (institutional, professional, corporate, social.

A Protégé/OWL 2 implementation ensures portability and interoperability. A beta version is available for comments on the Stanford/Protégé portal with the link: Caminao Ontological Kernel (CaKe).

Further Readings