NIEM & Information Exchanges

Preamble

The objective of the National Information Exchange Model (NIEM) is to provide a “dictionary of agreed-upon terms, definitions, relationships, and formats that are independent of how information is stored in individual systems.”

(Alfred Jensen)

NIEM’s model makes no difference between data and information (Alfred Jensen)

For that purpose NIEM’s model combines commonly agreed core elements with community-specific ones. Weighted against the benefits of simplicity, this architecture overlooks critical distinctions:

  • Inputs: Data vs Information
  • Dictionary: Lexicon and Thesaurus
  • Meanings: Lexical Items and Semantics
  • Usage: Roots and Aspects

That shallow understanding of information significantly hinders the exchange of information between business or institutional entities across overlapping domains.

Inputs: Data vs Information

Data is made of unprocessed observations, information makes sense of data, and knowledge makes use of information. Given that NIEM is meant to be an exchange between business or institutional users, it should have no concern with data mining or knowledge management.

Data is meaningless, information meaning is set by semantic domains.

As an exchange, NIEM should have no concern with data mining or knowledge management.

The problem is that, as conveyed by “core of data elements that are commonly understood and defined across domains, such as person, activity, document, location”, NIEM’s model makes no explicit distinction between data and information.

As a corollary, it implies that data may not only be meaningful, but universally so, which leads to a critical trap: as substantiated by data analytics, data is not supposed to mean anything before processed into information; to keep with examples, even if the definition of persons and locations may not be specific, the semantics of associated information is nonetheless set by domains, institutional, regulatory, contractual, or otherwise.

Data is meaningless, information meaning is set by semantic domains.

Data is meaningless, information meaning is set by semantic domains.

Not surprisingly, that medley of data and information is mirrored by NIEM’s dictionary.

Dictionary: Lexicon & Thesaurus

As far as languages are concerned, words (e.g “word”, “ξ∏¥” ,”01100″) remain data items until associated to some meaning. For that reason dictionaries are built on different levels, first among them lexical and semantic ones:

  • Lexicons take items on their words and gives each of them a self-contained meaning.
  • Thesauruses position meanings within overlapping galaxies of understandings held together by the semantic equivalent of gravitational forces; the meaning of words can then be weighted by the combined semantic gravity of neighbors.

In line with its shallow understanding of information, NIEM’s dictionary only caters for a lexicon of core standalone items associated with type descriptions to be directly implemented by information systems. But due to the absence of thesaurus, the dictionary cannot tackle the semantics of overlapping domains: if lexicons alone can deal with one-to-one mappings of items to meanings (a), thesauruses are necessary for shared (b) or alternative (c) mappings.

vv

Shared or alternative meanings cannot be managed with lexicons

With regard to shared mappings (b), distinct lexical items (e.g qualification) have to be mapped to the same entity (e.g person). Whereas some shared features (e.g person’s birth date) can be unequivocally understood across domains, most are set through shared (professional qualification), institutional (university diploma), or specific (enterprise course) domains .

Conversely, alternative mappings (c) arise when the same lexical items (e.g “mole”) can be interpreted differently depending on context (e.g plastic surgeon, farmer, or secret service).

Whereas lexicons may be sufficient for the use of lexical items across domains (namespaces in NIEM parlance), thesauruses are necessary if meanings (as opposed to uses) are to be set across domains. But thesauruses being just tools are not sufficient by themselves to deal with overlapping semantics. That can only be achieved through a conceptual distinction between lexical and semantic envelops.

Meanings: Lexical Items & Semantics

NIEM’s dictionary organize names depending on namespaces and relationships:

  • Namespaces: core (e.g Person) or specific (e.g Subject/Justice).
  • Relationships: types (Counselor/Person) or properties (e.g PersonBirthDate).
vvv

NIEM’s Lexicon: Core (a) and specific (b) and associated core (c) and specific (d) properties

But since lexicons know only names, the organization is not orthogonal, with lexical items mapped indifferently to types and properties. The result being that, deprived of reasoned guidelines, lexical items are chartered arbitrarily, e.g:

Based on core PersonType, the Justice namespace uses three different schemes to define similar lexical items:

  • “Counselor” is described with core PersonType.
  • “Subject” and “Suspect” are both described with specific SubjectType, itself a sub-type of PersonType.
  • “Arrestee” is described with specific ArresteeType, itself a sub-type of SubjectType.

Based on core EntityType:

  • The Human Services namespace bypasses core’s namesake and introduces instead its own specific EmployerType.
  • The Biometrics namespace bypasses possibly overlapping core Measurer and BinaryCaptured and directly uses core EntityType.
Lexical items are meshed disregarding semantics

Lexical items are chartered arbitrarily

Lest expanding lexical items clutter up dictionary semantics, some rules have to be introduced; yet, as noted above, these rules should be limited to information exchange and stop short of knowledge management.

Usage: Roots and Aspects

As far as information exchange is concerned, dictionaries have to deal with lexical and semantic meanings without encroaching on ontologies or knowledge representation. In practice that can be best achieved with dictionaries organized around roots and aspects:

  • Roots and structures (regular, black triangles) are used to anchor information units to business environments, source or destination.
  • Aspects (italics, white triangles) are used to describe how information units are understood and used within business environments.
nformation exchanges are best supported by dictionaries organized around roots and aspects

Information exchanges are best supported by dictionaries organized around roots and aspects

As it happens that distinction can be neatly mapped to core concepts of software engineering.

Further Reading

External Links

Advertisements

Tags:

One Response to “NIEM & Information Exchanges”

  1. batdorfr Says:

    I would suggest that NIEM will go the way of the dinosaurs as Natural Language algorithms take hold in a few years. The problem with NIEM is that it isn’t finding and developing new concepts that it can support and build upon but instead it grows knowledge in the rear view mirror. It isn’t a total waste because you can put the NIEM dictionary within the algorithms, but why spend all that time and energy plus money to do so when you can get better integration of data from recognizing the patterns of the language instead of the language itself.

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: