Archive for the ‘Semantic Web’ Category

A Brief Ontology Of Time

May 23, 2018

Preamble

The melting of digital fences between enterprises and business environments is putting a new light on the way time has to be taken into account.


Time is what happens between events (Josef Koudelka)

The shift can be illustrated by the EU GDPR: by introducing legal constraints on the notification of changes in personal data, regulators put systems’ internal events on the same footing as external ones, making all time-scales equal whatever their nature.

Ontological Limit of the W3C Time Recommendation

The W3C recommendation for OWL time description is built on the well accepted understanding of temporal entity, duration, and position:


While there isn’t much to argue with in what is suggested, the puzzle comes from what is missing, namely the modalities of time: the recommendation makes use of calendars and time-stamps but ignores what lies behind them, i.e. time’s ontological dimensions.

Out of the Box

As already expounded (Ontologies & Enterprise Architecture), ontologies are at their best when a distinction can be maintained between representation and semantics. That point can be illustrated here by adding an ontological dimension to the W3C description of time:

  1. Ontological modalities are introduced by identifying (#) temporal positions with regard to a time-frame.
  2. Time-frames are open-ended temporal entities identified (#) by events.

How to add ontological modalities to time

It must be noted that initial truth-preserving properties still apply across ontological modalities.
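The scheme can be sketched with plain triples; only the `time:` terms below come from the W3C OWL-Time vocabulary, while the `ex:` time-frame terms are hypothetical illustrations of the proposed ontological extension.

```python
# Plain (subject, predicate, object) triples; no RDF library required.
TIME = "http://www.w3.org/2006/time#"   # W3C OWL-Time vocabulary
EX = "http://example.org/onto#"         # hypothetical extension namespace

graph = {
    # Standard OWL-Time: a temporal position set by a time-stamp.
    (EX + "gdprNotification", "rdf:type", TIME + "Instant"),
    (EX + "gdprNotification", TIME + "inXSDDateTimeStamp", "2018-05-25T00:00:00Z"),
    # Ontological extension: the position is identified (#) with regard to a
    # time-frame, itself an open-ended temporal entity identified by an event.
    (EX + "gdprEnforcementFrame", "rdf:type", EX + "TimeFrame"),
    (EX + "gdprEnforcementFrame", EX + "identifiedBy", EX + "gdprComingIntoForce"),
    (EX + "gdprNotification", EX + "positionedIn", EX + "gdprEnforcementFrame"),
}
```

The extension leaves the original triples untouched, which is why the initial truth-preserving properties keep applying across modalities.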

Conclusion: OWL Descriptions Should Not Be Confused With Ontologies

Languages are meant to combine two primary purposes: communication and symbolic representation, some (e.g. natural and programming languages) being focused on the former, others (e.g. formal and specific ones) on the latter.

The distinction is somewhat blurred with languages like OWL (Web Ontology Language) due to the versatility and plasticity of semantic networks.

 

Ontologies and profiles are meant to target domains; profiles and domains are modeled with languages, including OWL.

That apparent proficiency may induce some confusion between languages and ontologies, the former dealing with the encoding of time representations, the latter with time modalities.

Further Readings

External Links


Collaborative Systems Engineering: From Models to Ontologies

April 9, 2018

Given the digitization of enterprise environments, engineering processes have to be entwined with business ones while being kept in sync with enterprise architectures. That calls for new threads of collaboration taking into account the integration of business and engineering processes as well as the extension to business environments.


Collaboration can be personal and direct, or collective and mediated (Wang Qingsong)

Whereas models are meant to support communication, traditional approaches are already straining when used beyond software generation, that is, beyond collaboration between humans and CASE tools. Ontologies, which can be seen as a higher form of models, could enable a qualitative leap for systems collaborative engineering at enterprise level.

Systems Engineering: Contexts & Concerns

To begin with contents, collaborations should be defined along three axes:

  1. Requirements: business objectives, enterprise organization, and processes, with regard to systems functionalities.
  2. Feasibility: business requirements with regard to architectures capabilities.
  3. Architectures: supporting functionalities with regard to architecture capabilities.

Engineering Collaborations at Enterprise Level

Since these axes are usually governed by different organizational structures and set along different time-frames, collaborations must be supported by documentation, especially models.

Shared Models

In order to support collaborations across organizational units and time-frames, models have to bring together perspectives which are by nature orthogonal:

  • Contexts, concerns, and languages: business vs engineering.
  • Time-frames and life-cycle: business opportunities vs architecture stability.

Harnessing MBSE to EA

That could be done if engineering models were harnessed to enterprise ones for contexts and concerns, which is to be achieved through the integration of processes.

Processes Integration

As already noted, the integration of business and engineering processes is becoming a key success factor.

For that purpose collaborations would have to take into account the different time-frames governing changes in business processes (driven by business value) and engineering ones (governed by assets life-cycles):

  • Business requirements engineering is synchronic: changes must be kept in line with architectures capabilities (full line).
  • Software engineering is diachronic: developments can be carried out along their own time-frame (dashed line).

Synchronic (full) vs diachronic (dashed) processes.

Application-driven projects usually focus on users’ value and just-in-time delivery; that can be best achieved with personal collaboration within teams. Architecture-driven projects usually affect assets and non-functional features and therefore require collaboration between organizational units.

Collaboration: Direct or Mediated

Collaboration can be achieved directly or through some mediation, the former being a default option for applications, the latter a necessary one for architectures.


Both can be defined according to basic cognitive and organizational mechanisms and supported by a mix of physical and virtual spaces to be dynamically redefined depending on activities, projects, locations, and organization.

Direct collaborations are carried out between individuals with or without documentation:

  • Immediate and personal: direct collaboration between 5 and 15 participants with shared objectives and responsibilities. That would correspond to agile project teams (a).
  • Delayed and personal: direct collaboration across teams with shared knowledge but with different objectives and responsibilities. That would tally with social network circles (c).

Collaborations

Mediated collaborations are carried out between organizational units through unspecified individual members, hence the need for documentation, models or otherwise:

  • Direct code generation from platform- or domain-specific models (b).
  • Model transformation across architecture layers and business domains (d).

Depending on scope and mediation, three basic types of collaboration can be defined for applications, architecture, and business intelligence projects.


Projects & Collaborations

As it happens, collaboration archetypes can be associated with these profiles.

Collaboration Mechanisms

The agile development model (under various guises) is the option of choice whenever shared ownership and continuous delivery are possible. Application projects can thus be carried out autonomously, with collaborations circumscribed to team members and relying on the backlog mechanism.

The OODA (Observation, Orientation, Decision, Action) loop (and its avatars) can epitomize projects combining operations, data analytics, and decision-making.


Collaboration archetypes

Projects set across enterprise architectures cannot be carried out without taking phasing constraints into account. While ill-fated Waterfall methods have demonstrated the pitfalls of procedural solutions, phasing constraints can be dealt with through a roundabout mechanism combining iterative and declarative schemes.

Engineering vs Business Driven Collaborations

With collaborative engineering upgraded at enterprise level, the main challenge is to iron out frictions between application and architecture projects and ensure the continuity, consistency and effectiveness of enterprise activities. That can be achieved with roundabouts used as a collaboration mechanism between projects, whatever their nature:

  • Shared models are managed at roundabout level.
  • Phasing dependencies are set in terms of assertions on shared models.
  • Depending on constraints, projects are carried out directly (1,3) or enter roundabouts (2), with exits conditioned by the availability of models.

Engineering driven collaboration: roundabout and backlogs
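The roundabout mechanism above can be sketched as a small gatekeeper; all model and project names are hypothetical, and the point is only to show how phasing dependencies expressed as assertions on shared models condition entries and exits.

```python
# Hypothetical sketch: shared models are managed at roundabout level; a
# project proceeds directly when its required models are available, and
# otherwise waits in the roundabout until they are published.

class Roundabout:
    def __init__(self):
        self.shared_models = set()   # models managed at roundabout level
        self.waiting = {}            # project -> models it still depends on

    def submit(self, project, dependencies):
        """Carry out directly (True) or enter the roundabout (False)."""
        missing = set(dependencies) - self.shared_models
        if not missing:
            return True
        self.waiting[project] = set(dependencies)
        return False

    def publish(self, model):
        """A project releases a shared model; waiting projects may exit."""
        self.shared_models.add(model)
        exited = [p for p, deps in self.waiting.items()
                  if deps <= self.shared_models]
        for p in exited:
            del self.waiting[p]
        return exited

r = Roundabout()
r.publish("logical data model")
print(r.submit("app project", {"logical data model"}))   # direct → True
print(r.submit("arch project", {"services model"}))      # enters → False
print(r.publish("services model"))                       # exits → ['arch project']
```

The declarative part lies in the dependency assertions; the iterative part in repeatedly publishing models and letting waiting projects exit.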

Moreover, with engineering embedded in business processes, collaborations must also bring together operational analytics, decision-making, and business intelligence. Here again, shared models are to play a critical role:

  • Enterprise descriptive and prescriptive models for information maps and objectives.
  • Environment predictive models for data and business understanding.

Business driven collaboration: operations and business intelligence

Whereas both engineering and business driven collaborations depend on sharing information and knowledge, the latter have to deal with open and heterogeneous semantics. As a consequence, collaborations must be supported by shared representations and proficient communication languages.

Ontologies & Representations

Ontologies are best understood as models’ backbones, to be fleshed out or detailed according to context and objectives, e.g:

  • Thesaurus, with a focus on terms and documents.
  • Systems modeling, with a focus on integration, e.g. Zachman Framework.
  • Classifications, with a focus on range, e.g Dewey Decimal System.
  • Meta-models, with a focus on model based engineering, e.g models transformation.
  • Conceptual models, with a focus on understanding, e.g legislation.
  • Knowledge management, with a focus on reasoning, e.g semantic web.

As such they can provide the pillars supporting the representation of the whole range of enterprise concerns:


Taking a leaf from Zachman’s matrix, ontologies can also be used to differentiate concerns with regard to architecture layers: enterprise, systems, platforms.

Last but not least, ontologies can be profiled with regard to the nature of external contexts, e.g:

  • Institutional: Regulatory authority, steady, changes subject to established procedures.
  • Professional: Agreed upon between parties, steady, changes subject to established procedures.
  • Corporate: Defined by enterprises, changes subject to internal decision-making.
  • Social: Defined by usage, volatile, continuous and informal changes.
  • Personal: Customary, defined by named individuals (e.g research paper).

Cross profiles: capabilities, enterprise architectures, and contexts.

Ontologies & Communication

If collaborations have to cover engineering as well as business descriptions, communication channels and interfaces will have to combine the homogeneous and well-defined syntax and semantics of the former with the heterogeneous and ambiguous ones of the latter.

With ontologies represented as RDF (Resource Description Framework) graphs, the first step would be to sort out truth-preserving syntax (applied independently of domains) from domain-specific semantics.


RDF graphs (top) support formal (bottom left) and domain specific (bottom right) semantics.

On that basis it would be possible to separate representation syntax from contents semantics, and to design communication channels and interfaces accordingly.

That would greatly facilitate collaborations across externally defined ontologies as well as their mapping to enterprise architecture models.
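A toy illustration of that separation, assuming that predicates drawn from the formal RDF/RDFS/OWL vocabularies carry truth-preserving representation syntax while all the others carry domain-specific contents semantics (the triples themselves are illustrative):

```python
# Formal, domain-independent vocabularies used for representation syntax.
REPRESENTATION = ("rdf:", "rdfs:", "owl:")

triples = [
    ("ex:Claim", "rdf:type", "owl:Class"),                 # representation
    ("ex:Claim", "rdfs:subClassOf", "ex:BusinessObject"),  # representation
    ("ex:claim01", "rdf:type", "ex:Claim"),                # representation
    ("ex:claim01", "ex:filedBy", "ex:customer42"),         # domain semantics
    ("ex:claim01", "ex:coveredBy", "ex:policy007"),        # domain semantics
]

def split(triples):
    """Partition a graph into representation syntax vs contents semantics."""
    syntax = [t for t in triples if t[1].startswith(REPRESENTATION)]
    semantics = [t for t in triples if not t[1].startswith(REPRESENTATION)]
    return syntax, semantics

syntax, semantics = split(triples)
print(len(syntax), len(semantics))  # → 3 2
```

Communication channels could then process the two partitions differently: the first with generic truth-preserving operations, the second with domain-specific interpreters.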

Conclusion

To summarize, the benefits of ontological frames for collaborative engineering can be articulated around four points:

  1. A clear-cut distinction between representation semantics and truth-preserving syntax.
  2. A common functional architecture for all user interfaces, human or otherwise.
  3. Modular functionalities for specific semantics on one hand, generic truth-preserving and cognitive operations on the other hand.
  4. Profiled ontologies according to concerns and contexts.

Clear-cut distinction (1), unified interfaces architecture (2), functional alignment (3), crossed profiles (4).

A critical fifth benefit could be added with regard to business intelligence: combined with deep learning capabilities, ontologies would extend the scope of collaboration to explicit as well as implicit knowledge, the former already framed by languages, the latter still open to interpretation and discovery.

Further Reading

 

Business Intelligence & Semantic Galaxies

March 26, 2018

Given the number and verbosity of alternative definitions pertaining to enterprise and systems architectures, common sense would suggest circumspection if not agnosticism. Instead, fierce wars are endlessly waged over semantic positions built on sand hills bound to crumble under whoever stands to defend them.

Nature & Nurture (Wang Xingwei)

Such doomed attempts appear to be driven by a delusion seeing concepts as frozen celestial bodies; fortunately, simple-minded catalogs of unyielding definitions are progressively pushed aside by the need to understand (and milk) the new complexity of business environments.

Business Intelligence: Mapping Semantics to Circumstances

As long as information systems could be kept behind Chinese walls, semantic autarky was of limited consequence. But with enterprises’ gates crumbling under digital flows, competitive edges increasingly depend on open and creative business intelligence (BI), in particular:

  • Data understanding: giving form and semantics to massive and continuous inflows of raw observations.
  • Business understanding: aligning data understanding with business objectives and processes.
  • Modeling: consolidating data and business understandings into descriptive, predictive, or operational models.
  • Evaluation: assessing and improving accuracy and effectiveness of understandings with regard to business and decision-making processes.

BI: Mapping Semantics to Circumstances

Since BI has to take into account the continuity of enterprise’s objectives and assets, the challenge is to dynamically adjust the semantics of external (business environments) and internal (objects and processes) descriptions. That could be explained in terms of gravitational semantics.

Semantic Galaxies

Assuming concepts are understood as stars wheeling across unbounded and expanding galaxies, semantics could be defined by gravitational forces and proximity between:

  • Intensional concepts (stars) bearing necessary meaning set independently of context or purpose.
  • Extensional concepts (planets) orbiting intensional ones. While their semantics is aligned with a single intensional concept, they bear enough of its gravity to create a semantic environment.

On that account semantic domains would be associated to stars and their planets, with galaxies regrouping stars (concepts) and systems (domains) bound by gravitational forces (semantics).


Conceptual Stars & Planets
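The galaxy metaphor can be sketched as a toy data structure (all names are illustrative): an intensional concept bears context-free meaning, extensional concepts orbit it with their contexts, and the semantic domain regroups the star and everything bound by its gravity.

```python
from dataclasses import dataclass, field

@dataclass
class Star:                      # intensional concept: necessary meaning
    name: str
    planets: list = field(default_factory=list)   # extensional concepts

    def orbit(self, planet_name, context):
        """Attach an extensional concept, aligned with this star's meaning."""
        self.planets.append((planet_name, context))

    def domain(self):
        """The semantic domain: the star and everything bound to its gravity."""
        return {self.name} | {p for p, _ in self.planets}

system = Star("System")
system.orbit("IT system", context="enterprise architecture")
system.orbit("system of rules", context="organization")
print(sorted(system.domain()))  # → ['IT system', 'System', 'system of rules']
```

Overlapping dictionary definitions, as in the examples below, would then be consolidated as planets of the same star rather than fought over as competing verbal absolutes.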

 

Semantic Dimensions & Concepts Metamorphosis

While systems don’t leave much, if any, room for semantic wanderings, human languages are as pliant, plastic, and versatile as they come. Hence the need for business intelligence to span the stretch between open and fuzzy human semantics and systems’ straitjacketed modeling languages.

That can be done by framing concepts’ metamorphosis along Zachman’s architecture description: intensional concepts, being detached from specific contexts and concerns, are best understood as semantic roots able to breed multi-faceted extensions, eventually to be coerced into system specifications.


Framing concepts metamorphosis along Zachman’s architecture dimensions

The Alignment of Planets

Like stars, concepts can be apprehended through a mix of reason and perception:

  • Figured out from a conceptual void waiting to be filled.
  • Fortuitously discovered in the course of an argument.

The benefit in both cases would be to delay verbal definitions and so avoid preempted or biased understandings: as with Schrödinger’s cat, trying to lock up meanings with bare words often breaks their semantic integrity, scattering scraps in every direction.

In contrast, making room for semantic alignments would help to consolidate overlapping definitions within conceptual galaxies, as illustrated by the examples below.

Example: Data

Wikipedia: Any sequence of one or more symbols given meaning by specific act(s) of interpretation; requires interpretation to become information.

Merriam-Webster: Factual information such as measurements or statistics; information in digital form that can be transmitted or processed; information and noise from a sensing device or organ that must be processed to be meaningful.

Cambridge Dictionary: Information, especially facts or numbers; information in an electronic form that can be stored and used by a computer.

Collins: Information that can be stored and used by a computer program.

TOGAF: Basic unit of information having a meaning and that may have subcategories (data items) of distinct units and values.


Example: System

Wikipedia: A regularly interacting or interdependent group of items forming a unified whole; Every system is delineated by its spatial and temporal boundaries, surrounded and influenced by its environment, described by its structure and purpose and expressed in its functioning.

Merriam-Webster: A regularly interacting or interdependent group of items forming a unified whole

Business Dictionary: A set of detailed methods, procedures and routines created to carry out a specific activity, perform a duty, or solve a problem; organized, purposeful structure that consists of interrelated and interdependent elements.

Cambridge Dictionary: A set of connected things or devices that operate together

Collins Dictionary: A way of working, organizing, or doing something which follows a fixed plan or set of rules; a set of things / rules.

TOGAF: A collection of components organized to accomplish a specific function or set of functions (from ISO/IEC 42010:2007).

Further Reading

Open Ontologies: From Silos to Architectures

January 1, 2018

To be of any use for enterprises, ontologies have to embrace a wide range of contexts and concerns, often ill-defined for environments, rather well expounded for systems.

Circumscribed Contexts & Crossed Concerns (Robert Goben)

And now that enterprises have to compete in open, digitized, and networked environments, business and systems ontologies have to be combined into modular knowledge architectures.

Ontologies & Contexts

If open-ended business contexts and concerns are to be taken into account, the first step should be to characterize ontologies with regard to their source, justification, and the stability of their categories, e.g:

  • Institutional: Regulatory authority, steady, changes subject to established procedures.
  • Professional: Agreed upon between parties, steady, changes subject to accords.
  • Corporate: Defined by enterprises, changes subject to internal decision-making.
  • Social: Defined by usage, volatile, continuous and informal changes.
  • Personal: Customary, defined by named individuals (e.g research paper).

Assuming such an external taxonomy, the next step would be to see what kind of internal (i.e. enterprise architecture) ontologies can be fitted into it, as is the case for the Zachman framework.

Zachman’s taxonomy is built on well-established concepts (Who, What, How, Where, When) applied across architecture layers for enterprise (business and organization), systems (logical structures and functionalities), and platforms (technologies). These layers can be generalized and applied uniformly across external contexts, from well-defined (e.g. regulations) to fuzzy (e.g. business prospects or new technologies) ones, e.g.:

Ontologies, capabilities (Who, What, How, Where, When), and architectures (enterprise, systems, platforms).

That “divide and conquer” strategy is to serve two purposes:

  • By bridging the gap between internal and external taxonomies it significantly enhances the transparency of governance and decision-making.
  • By applying the same motif (Who, What, How, Where, When) across the semantics of contexts, it opens the door to a seamless integration of all kinds of knowledge: enterprise, professional, institutional, scientific, etc.

As can be illustrated using Zachman concepts, the benefits are straightforward at enterprise architecture level (e.g procurement), due to the clarity of supporting ontologies; not so for external ones, which are by nature open and overlapping and often come with blurred semantics.

Ontologies & Concerns

A broad survey of RDF-based ontologies demonstrates how semantic overlaps and folds can be sorted out using built-in differentiation between domains’ semantics on the one hand, and the structure and processing of symbolic representations on the other. But such schemes are proprietary, and evidence shows their lines seldom tally, with dire consequences for interoperability: even without taking relationships and integrity constraints into account, weaving together ontologies from different sources is bound to be cumbersome, the costs substantial, and the outcome often reduced to a muddy maze of ambiguous semantics.

The challenge would be to generalize those principles so as to set a basis for open ontologies.

Assuming that a clear line can be drawn between representation and contents semantics, with standard constructs (e.g predicate logic) used for the former, the objective would be to classify ontologies with regard to their purpose, independently of their representation.

The governance-driven taxonomy introduced above deals with contexts and consequently with coarse-grained modularity. It should be complemented by a fine-grained one driven by concerns, more precisely by the epistemic nature of the individual instances to be denoted. As it happens, that could also tally with Zachman’s taxonomy:

  • Thesaurus: ontologies covering terms and concepts.
  • Documents: ontologies covering documents with regard to topics.
  • Business: ontologies of relevant enterprise organization and business objects and activities.
  • Engineering: symbolic representation of organization and business objects and activities.

Ontologies: Purposes & Targets

Enterprises could then pick and combine templates according to domains of concern and governance. Taking an on-line insurance business for example, enterprise knowledge architecture would have to include:

  • Medical thesaurus and consolidated regulations (Knowledge).
  • Principles and resources associated with the web-platform (Engineering).
  • Description of products (e.g vehicles) and services (e.g insurance plans) from partners (Business).

Such designs of ontologies according to the governance of contexts and the nature of concerns would significantly reduce blanket overlaps and improve the modularity and transparency of ontologies.

On a broader perspective, that policy will help to align knowledge management with EA governance by setting apart ontologies defined externally (e.g. regulations) from the ones set through decision-making, whether strategic (e.g. platform) or tactical (e.g. partnerships).
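The insurance example can be sketched as a cross-profiled catalog of templates (all template names hypothetical), with each template characterized by concern (purpose) and governance (context) so that blanket overlaps show up before assembly:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OntologyTemplate:
    name: str
    concern: str      # thesaurus, documents, business, engineering
    governance: str   # institutional, professional, corporate, social, personal

# Knowledge architecture for the hypothetical on-line insurance business.
architecture = [
    OntologyTemplate("medical thesaurus", "thesaurus", "professional"),
    OntologyTemplate("consolidated regulations", "documents", "institutional"),
    OntologyTemplate("web-platform resources", "engineering", "corporate"),
    OntologyTemplate("partners' products & services", "business", "professional"),
]

# Modularity check: no two templates occupy the same (concern, governance)
# cell, hence no blanket overlap in the knowledge architecture.
cells = {(t.concern, t.governance) for t in architecture}
assert len(cells) == len(architecture)
```

Governance then falls out of the profiling: institutional templates change through external procedures, corporate ones through internal decision-making.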

Open Ontologies’ Benefits

Benefits from open and formatted ontologies built along an explicit distinction between the semantics of representation (aka ontology syntax) and the semantics of context can be directly identified for:

Modularity: the knowledge basis of enterprise architectures could be continuously tailored to changes in markets and corporate structures without impairing enterprise performances.

Integration: the design of ontologies with regard to the nature of targets and stability of categories could enable built-in alignment mechanisms between knowledge architectures and contexts.

Interoperability: limited overlaps and finer granularity are to greatly reduce frictions when ontologies bearing out business processes are to be combined or extended.

Reliability: formatted ontologies can be compared to typed programming languages with regard to transparency, internal consistency, and external validity.

Last but not least, such reasoned design of ontologies may open new perspectives for the collaboration between cognitive humans and pretending ones.

Further Reading

External Links

Transcription & Deep Learning

September 17, 2017

Humans looking for reassurance against the encroachment of artificial brains should try YouTube subtitles: whatever Google’s track record in natural language processing, the way its automated scribe writes down what is said in the movies is essentially useless.

A blank sheet of paper was copied on a Xerox machine.
This copy was used to make a second copy.
The second to make a third one, and so on…
Each copy as it came out of the machine was re-used to make the next.
This was continued for one hundred times, producing a book of one hundred pages. (Ian Burn)

Experience directly points to the probable cause of failure: the usefulness of real-time transcriptions is not a linear function of accuracy because every slip can be fatal, without backup or second chance. It’s like walking a line: for all practical purposes a single misunderstanding can throw away the thread of understanding, without a chance of retrieve or reprieve.

Contrary to Turing machines, listeners have no finite states; and contrary to the sequence of symbols on tapes, tales are told by weaving together semantic threads. It ensues that stories are works in progress: readers can pause to review and consolidate meanings, but listeners have no other choice than to punt on what comes to their mind, hoping that the fabric of the story will carry them through.

So, whereas automated scribes can deep learn from written texts and recorded conversations, there is no way to do the same from what listeners understand. That’s the beauty of story telling: words may be written but meanings are renewed each time the words are heard.

Further Reading

Things Speaking in Tongues

January 25, 2017

Preamble

Speaking in tongues (aka glossolalia) is the fluid vocalizing of speech-like syllables without any recognizable association with a known language. Such an experience is perhaps best understood as the actual speaking of a gutted language, with grammatical ghosts inhabited by meaningless signals.


Silent Sounds (Herbert List)

Usually set in religious contexts or circumstances, speaking in tongues looks like souls having their own private conversations. Yet, contrary to extraterrestrial languages, the phenomenon is not fictional and could therefore point to offbeat clues for natural language technology.

Computers & Language Technology

From its inception, computer technology has been a matter of language, from machine code to domain-specific languages. As a corollary, the need to be on speaking terms with machines (dumb or smart) has put a new light on interpreters (parsers in computer parlance) and opened new perspectives for linguistic studies. In due return, computers have greatly improved the means to experiment with and implement new approaches.

During recent years, advances in artificial intelligence (AI) have brought language technologies to a critical juncture between speech recognition and meaningful conversation, the former leaping ahead with deep learning and signal processing, the latter limping along with the semantics of domain-specific languages.

Interestingly, that juncture neatly coincides with the one between the two intrinsic functions of natural languages: communication and representation.

Rules Engines & Neural Networks

As exemplified by language technologies, one of the main developments of deep learning has been to bring rules engines and neural networks under a common functional roof, turning the former’s unfathomable schemes into smart conceptual tutors for the latter.

In contrast to their long and successful track record with computer languages, rule-based approaches have fallen short in human conversations. And while these failings have hindered progress in the semantic dimension of natural language technologies, speech recognition has forged ahead on the back of neural networks fueled by increasing computing power. But the rift between processing and understanding natural languages is now being bridged by deep learning technologies. And with the leverage of rule engines harnessing neural networks, processing and understanding can be carried out within a single feedback loop.

From Communication to Cognition

From a functional point of view, natural languages can be likened to money: first as a medium of exchange, then as a unit of account, finally as a store of value. Along that understanding, natural languages would be used respectively for communication, information processing, and knowledge representation. And like the economics of money, these capabilities are to be associated with phased cognitive developments:

  • Communication: languages are used to trade transient signals; their processing depends on the temporal persistence of the perceived context and phenomena; associated behaviors are immediate (here-and-now).
  • Information: languages are also used to map context and phenomena to some mental representations; they can therefore be applied to scripted behaviors and even policies.
  • Knowledge: languages are used to map contexts, phenomena, and policies to categories and concepts to be stored as symbolic representations fully detached from original circumstances; these surrogates can then be used, assessed, and improved on their own.

As it happens, advances in technologies seem to follow these cognitive distinctions, with the internet of things (IoT) for data communications, neural networks for data mining and information processing, and the addition of rules engines for knowledge representation. Yet paces differ significantly: with regard to language processing (communication and information), deep learning is bringing the achievements of natural language technologies beyond 90% accuracy; but when language understanding has to take knowledge into account, performance still lags a third below: for computer knowledge to be properly scaled, it has to be confined within the semantics of specific domains.

Sound vs Speech

Humans listening to the Universe are confronted with a question that can be unfolded in two ways:

  • Is there someone speaking, and if so, what’s the language?
  • Is that speech, and if so, who’s speaking?

In both cases intentionality is at the nexus, but whereas the first approach has to tackle some existential questioning upfront, the second can put philosophy on the back burner and focus on technological issues. Nonetheless, even the language-first approach has been challenging, as illustrated by the difference in achievements between processing and understanding language technologies.

Recognizing a language has long been the job of parsers looking for the corresponding syntactic structures, the hitch being that a parser has to know beforehand what it’s looking for. Parsers of parsers using meta-languages have been effective with programming languages but are quite useless with natural ones without some universal grammar rules to sort out Babel’s conversations. But the “burden of proof” can now be reversed: compared to rules engines, neural networks with deep learning capabilities don’t have to start with any knowledge. As illustrated by Google’s Multilingual Neural Machine Translation System, such systems can now build multilingual proficiency from sufficiently large samples of conversations, without prior grammatical knowledge.

To conclude, “Translation System” may even be self-effacing, as it implies language-to-language mappings when in principle such systems could be fed with raw sounds and still parse the wheat of meaning from the chaff of noise. And, who knows, eventually be able to decrypt the language of tongues.


NIEM & Information Exchanges

January 24, 2017

Preamble

The objective of the National Information Exchange Model (NIEM) is to provide a “dictionary of agreed-upon terms, definitions, relationships, and formats that are independent of how information is stored in individual systems.”


NIEM’s model makes no difference between data and information (Alfred Jensen)

For that purpose NIEM’s model combines commonly agreed core elements with community-specific ones. Weighted against the benefits of simplicity, this architecture overlooks critical distinctions:

  • Inputs: Data vs Information
  • Dictionary: Lexicon and Thesaurus
  • Meanings: Lexical Items and Semantics
  • Usage: Roots and Aspects

That shallow understanding of information significantly hinders the exchange of information between business or institutional entities across overlapping domains.

Inputs: Data vs Information

Data is made of unprocessed observations, information makes sense of data, and knowledge makes use of information. Given that NIEM is meant to be an exchange between business or institutional users, it should have no concern with data mining or knowledge management.



The problem is that, as conveyed by “core of data elements that are commonly understood and defined across domains, such as person, activity, document, location”, NIEM’s model makes no explicit distinction between data and information.

As a corollary, it implies that data may not only be meaningful, but universally so, which leads to a critical trap: as substantiated by data analytics, data is not supposed to mean anything before being processed into information; to keep with the examples, even if the definitions of persons and locations may not be domain-specific, the semantics of the associated information is nonetheless set by domains, institutional, regulatory, contractual, or otherwise.

Data is meaningless, information meaning is set by semantic domains.


Not surprisingly, that medley of data and information is mirrored by NIEM’s dictionary.

Dictionary: Lexicon & Thesaurus

As far as languages are concerned, words (e.g “word”, “ξ∏¥”, “01100”) remain data items until associated with some meaning. For that reason dictionaries are built on different levels, first among them lexical and semantic ones:

  • Lexicons take items at their word and give each of them a self-contained meaning.
  • Thesauruses position meanings within overlapping galaxies of understandings held together by the semantic equivalent of gravitational forces; the meaning of words can then be weighted by the combined semantic gravity of neighbors.

In line with its shallow understanding of information, NIEM’s dictionary only caters for a lexicon of core standalone items associated with type descriptions to be directly implemented by information systems. But in the absence of a thesaurus, the dictionary cannot tackle the semantics of overlapping domains: if lexicons alone can deal with one-to-one mappings of items to meanings (a), thesauruses are necessary for shared (b) or alternative (c) mappings.


Shared or alternative meanings cannot be managed with lexicons

With regard to shared mappings (b), distinct lexical items (e.g qualification) have to be mapped to the same entity (e.g person). Whereas some shared features (e.g a person’s birth date) can be unequivocally understood across domains, most are set through shared (professional qualification), institutional (university diploma), or specific (enterprise course) domains.

Conversely, alternative mappings (c) arise when the same lexical items (e.g “mole”) can be interpreted differently depending on context (e.g plastic surgeon, farmer, or secret service).

Whereas lexicons may be sufficient for the use of lexical items across domains (namespaces in NIEM parlance), thesauruses are necessary if meanings (as opposed to uses) are to be set across domains. But thesauruses, being just tools, are not sufficient by themselves to deal with overlapping semantics. That can only be achieved through a conceptual distinction between lexical and semantic envelopes.
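The lexicon/thesaurus difference can be illustrated with a minimal Python sketch (entries and weights are invented for the purpose): a lexicon resolves items one-to-one, while a thesaurus weighs the candidate meanings of “mole” by the semantic gravity of neighboring context terms:

```python
# Lexicon: one-to-one, self-contained meanings (mapping a).
lexicon = {"birth date": "date a person was born"}

# Thesaurus: meanings positioned among weighted semantic neighbors, so the
# same lexical item can resolve differently (mapping c). Weights are invented.
thesaurus = {
    "mole": {
        "skin blemish":      {"surgeon": 0.9, "dermatology": 0.8},
        "burrowing animal":  {"farmer": 0.9, "garden": 0.7},
        "undercover agent":  {"secret service": 0.9, "espionage": 0.8},
    }
}

def resolve(item, context):
    """Weight each candidate meaning by the combined gravity of context terms."""
    candidates = thesaurus[item]
    scores = {meaning: sum(w for term, w in neighbors.items() if term in context)
              for meaning, neighbors in candidates.items()}
    return max(scores, key=scores.get)
```

A lexicon alone would have to pick one of the three meanings for good; the thesaurus keeps all of them and lets context decide.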

Meanings: Lexical Items & Semantics

NIEM’s dictionary organizes names depending on namespaces and relationships:

  • Namespaces: core (e.g Person) or specific (e.g Subject/Justice).
  • Relationships: types (Counselor/Person) or properties (e.g PersonBirthDate).

NIEM’s Lexicon: Core (a) and specific (b) and associated core (c) and specific (d) properties

But since lexicons know only names, the organization is not orthogonal, with lexical items mapped indifferently to types and properties. As a result, deprived of reasoned guidelines, lexical items are chartered arbitrarily, e.g:

Based on core PersonType, the Justice namespace uses three different schemes to define similar lexical items:

  • “Counselor” is described with core PersonType.
  • “Subject” and “Suspect” are both described with specific SubjectType, itself a sub-type of PersonType.
  • “Arrestee” is described with specific ArresteeType, itself a sub-type of SubjectType.

Based on core EntityType:

  • The Human Services namespace bypasses core’s namesake and introduces instead its own specific EmployerType.
  • The Biometrics namespace bypasses possibly overlapping core Measurer and BinaryCaptured and directly uses core EntityType.

Lexical items are chartered arbitrarily

Lest expanding lexical items clutter up dictionary semantics, some rules have to be introduced; yet, as noted above, these rules should be limited to information exchange and stop short of knowledge management.
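The Justice-namespace schemes above can be sketched as a toy Python hierarchy; the type and item names come from the NIEM examples just given, but the class layout itself is purely illustrative, not NIEM’s actual XML schemas:

```python
# Core namespace
class PersonType: ...

# Justice namespace
class SubjectType(PersonType): ...      # specific sub-type of core PersonType
class ArresteeType(SubjectType): ...    # specific sub-type of SubjectType

# The same kind of lexical item ends up chartered three different ways:
justice_lexicon = {
    "Counselor": PersonType,    # described directly with core PersonType
    "Subject":   SubjectType,   # own specific type
    "Suspect":   SubjectType,   # shares SubjectType with "Subject"
    "Arrestee":  ArresteeType,  # yet another level of sub-typing
}

# All four items denote persons, yet nothing in the lexicon states that
# uniformly -- the semantics has been folded into arbitrary type choices.
assert all(issubclass(t, PersonType) for t in justice_lexicon.values())
```

The final assertion holds only by inspection of each entry; that implicit, case-by-case uniformity is precisely what reasoned charter rules would make explicit.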

Usage: Roots and Aspects

As far as information exchange is concerned, dictionaries have to deal with lexical and semantic meanings without encroaching on ontologies or knowledge representation. In practice that can be best achieved with dictionaries organized around roots and aspects:

  • Roots and structures (regular, black triangles) are used to anchor information units to business environments, source or destination.
  • Aspects (italics, white triangles) are used to describe how information units are understood and used within business environments.

Information exchanges are best supported by dictionaries organized around roots and aspects

As it happens, that distinction can be neatly mapped to core concepts of software engineering.
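The roots-and-aspects organization maps naturally to familiar data structures. A hypothetical sketch of such a dictionary entry, with a root anchored across environments and aspects set per business environment (all names and values below are invented):

```python
# Root: anchors the information unit to business environments, source or
# destination. Aspects: how the unit is understood within each environment.
person_entry = {
    "root": {"id": "Person", "name": "J. Doe", "birth_date": "1970-01-01"},
    "aspects": {
        "justice":        {"role": "suspect", "case": "J-001"},
        "human_services": {"role": "beneficiary", "program": "housing"},
    },
}

def view(entry, environment):
    """An exchange ships the shared root plus only the aspects that the
    destination environment understands."""
    return {**entry["root"], **entry["aspects"].get(environment, {})}
```

An environment with no registered aspects still receives the root unchanged, which is what keeps the exchange within information boundaries and out of knowledge management.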

P.S. Thesauruses & Ontologies

Ontologies are systematic accounts of existence for whatever is considered, in other words some explicit specification of the concepts meant to make sense of a universe of discourse. From that starting point three basic observations can be made:

  1. Ontologies are made of categories of things, beings, or phenomena; as such they may range from simple catalogs to philosophical doctrines.
  2. Ontologies are driven by cognitive (i.e non empirical) purposes, namely the validity and consistency of symbolic representations.
  3. Ontologies are meant to be directed at specific domains of concerns, whatever they can be: politics, religion, business, astrology, etc.

With regard to models, only the second point sets ontologies apart: contrary to models, ontologies are about understanding and are not supposed to be driven by empirical purposes.

On that basis, ontologies can be understood as thesauruses describing galaxies of concepts (stars) and features (planets) held together by semantic gravitation weighted by similarity or proximity. As such ontologies should be NIEM’s tool of choice.


Things Behavior & Social Responsibility

October 27, 2016

Contrary to security breaches and information thefts that can be kept from public eyes, crashes of business applications or internet access are painfully plain for whoever is concerned, which means everybody. And as illustrated by the latest episode of massive distributed denial of service (DDoS), they often come as confirmation of hazards long calling for attention.


Device & Social Identity (Wayne Miller)

Things Don’t Think

To be clear, orchestrated attacks through hijacked (if unaware) computers have been a primary concern for internet security firms for quite some time, bringing about comprehensive and continuous reinforcement of software shields consolidated by systematic updates.

But while the right governing hand was struggling to make a safer net, the other hand thoughtlessly brought in connected objects to a supposedly new brand of internet. As if adding things with software brains cut to the bone could have made networks smarter.

And that’s the catch because the internet of things (IoT) is all about making room for dumb ancillary objects; unfortunately, idiots may have their use for literary puppeteers with canny agendas.

Think Again, or Not …

For old-timers with some memory of fingering through library cardboard, googling topics may have looked like dreams: knowledge at one’s fingertips, immediately and comprehensively. But that vision has never been more than a fleeting glimpse in a symbolic world; in actuality, even at its semantic best, the web was to remain a trove of information to be sifted by knowledge workers safely seated in their gated symbolic world. Crooks of course could sneak in as knowledge workers, armed with fountain pens, but without guns covered by the second amendment.

So, from its inception, the IoT has been a paradoxical endeavor: trying to merge actual and symbolic realms in a way that would bypass thinking processes and obliterate any distinction between them. For sure, that conundrum was supposed to be dealt with by artificial intelligence (AI), with neural networks and deep learning weaving semantic threads between human minds and network brains.

Not surprisingly, brainy hackers have caught sight of that new wealth of chinks in internet armour and swiftly added brute force to their paraphernalia.

But in addition to the technical aspect of internet security, the recent Dyn DDoS attack puts the spotlight on its social dimension.

Things Behavior & Social Responsibility

As far as it remained intrinsically symbolic, the internet has been able to carry on with its utopian principles despite bumpy business environments. But things have drastically changed the situation, with tectonic frictions between symbolic and real plates wreaking havoc with any kind of smooth transition to internet.X, whatever x may be.

Yet, as the diagnosis is clear, so should be the remedy.

To begin with, the internet was never meant to become the central nervous system of human societies. That it has happened in half a generation has defied imagination and, as a corollary, sapped the validity of traditional paradigms.

As things happen, the epicenter of the paradigm collision can be clearly identified: whereas the internet is built from systems, architecture taxonomies are purely technical and ignore what should be the primary factor, namely what kind of social role a system could fulfil. That may have been irrelevant for communication networks, but is obviously critical for social ones.


Brands, Bots, & Storytelling

May 2, 2016

As illustrated by the recent Mashable “pivot”, meaningful (i.e unbranded) contents appear to be the main casualty of new communication technologies. Hopefully (sic), bots may point to a more positive perspective, at least if their want for no-nonsense gist is to be trusted.


Could bots repair gibberish ? (Latifa Echakhch)

The Mashable Pivot to “branded” Stories

Announcing Mashable’s recent pivot, Pete Cashmore (Mashable’s founder and CEO) was very candid about the motives:

“What our advertisers value most about Mashable is the same thing that our audience values: Our content. The world’s biggest brands come to us to tell stories of digital culture, innovation and technology in an optimistic and entertaining voice. As a result, branded content has become our fastest growing revenue stream over the past year. Content is now at the core of our ad offering and we plan to double down there.”

Also revealing was the semantic shift in a single paragraph: from “stories”, to “stories told with an optimistic and entertaining voice”, and finally to “branded stories”; as if there was some continuity between Homer’s Iliad and Outbrain’s gibberish.

Spinning Yarns

From Lacan to Seinfeld, it has often been said that stories are what props up our world. But that was before Twitter, Facebook, YouTube and others ruled over the waves and screens. Nowadays, under the combined assaults of smart dummies and instant messaging, stories have been forced to spin advertising schemes, and scripts replaced by subliminal cues entangled in webs of commercial hyperlinks. And yet, somewhat paradoxically, fictions may retrieve some traction (if not spirit) of their own, reprieved not so much by human cultural thirst as by smartphones’ hunger for fresh technological contraptions.

Apps: What You Show is What You Get

As far as users are concerned, apps often make phones too smart by half: with more than 100 billion apps already downloaded, users face an embarrassment of riches compounded by the inherent limitations of packed visual interfaces. Enticed by constantly renewed flows of tokens with perfunctory guidelines, human handlers can hardly separate the wheat from the chaff and have to let their choices be driven by the hypothetical wisdom of the crowd. Whatever the outcomes (crowds may be right but are often volatile), the selection process is both wasteful (choices are ephemeral, many apps are abandoned after a single use, and most are sparely used) and hazardous (too many redundant dead-ends open doors to a wide array of fraudsters). That trend is rapidly facing the physical as well as business limits of a zero-sum playground: smarter phones appear to make for dumber users. One way out of the corner would be to encourage intelligent behaviors from both parties, humans as well as devices. And that’s something that bots could help to bring about.

Bots: What You Text Is What You Get

As software agents designed to help people find their ways online, bots can be differentiated from apps on two main aspects:

  • They reside in the cloud, not on personal devices, which means that updates don’t have to be downloaded on smartphones but can be deployed uniformly and consistently. As a consequence, and contrary to apps, the evolution of bots can be managed independently of users’ whims, fostering the development of stable and reliable communication grammars.
  • They rely on text messaging to communicate with users instead of graphical interfaces and visual symbols. Compared to icons, text puts writing hands on driving wheels, leaving much less room for creative readings; given that bots are not to put up with mumbo jumbo, they will prompt users to mind their words as clearly and efficiently as possible.

Each aspect reinforces the other, making room for a non-zero playground: while the focus on well-formed expressions and unambiguous semantics is bots’ key characteristic, it could not be achieved without the benefits of stable and homogeneous distribution schemes. When both are combined they may reinstate written languages as the backbone of communication frameworks, even if it’s for the benefits of pidgin languages serving prosaic business needs.

A Literary Soup of Business Plots & Customers Narratives

Given their need for concise and unambiguous textual messages, the use of bots could bring back some literary considerations to a latent online wasteland. To be sure, those considerations are to be hard-headed, with scripts cut to the bone, plots driven by business happy ends, and narratives fitted to customers phantasms.

Nevertheless, good storytelling will always bring some selective edge to businesses competing for top tiers. So, whatever the dearth of fictional depth, the spreading of bot scripts could make up some kind of primeval soup and stir the emergence of a literature untainted by its fouled nourishing earth.


Out of Mind Content Discovery

April 20, 2016

Content discovery and the game of Go can be used to illustrate the strengths and limits of artificial intelligence.


Now and Then: contents discovery across media and generations (Pavel Wolberg)

Game of Go: Closed Ground, Non Semantic Charts

The conclusive successes of Google’s AlphaGo against the world’s best players are best understood when related to the characteristics of the game of Go:

  • Contrary to real life competitions, games are set on closed and standalone playgrounds detached from actual concerns. As a consequence players (human or artificial) can factor out emotions from cognitive behaviors.
  • Contrary to games like Chess, Go’s playground is uniform and can be mapped without semantic distinctions for situations or moves. Whereas symbolic knowledge, explicit or otherwise, is still required for good performances, excellence can only be achieved through holistic assessments based on intuition and implicit knowledge.

Both characteristics fully play to the strengths of AI, in particular computing power (to explore playground and alternative strategies) and detachment (when decisions have to be taken).

Content Discovery: Open Grounds, Semantic Charts

Content discovery platforms like Outbrain or Taboola are meant to suggest further (commercial) bearings to online users. Compared to the game of Go, that mission clearly goes in the opposite direction:

  • Channels may be virtual but users are humans, with real emotions and concerns. And they are offered proxy grounds not so much to be explored as to be endlessly redefined and made more alluring.
  • Online strolls may be aimless and discoveries fortuitous, but if content discovery devices are to underwrite themselves, they must bring potential customers along monetized paths. Hence the hitch: artificial brains need some cues about what readers have in mind.

That makes content discovery a challenging task for artificial coaches as they have to usher wanderers with idiosyncratic but unknown motivations through boundless expanses of symbolic shopping fields.

What Would Eliza Say

When AI was still about human thinking, Alan Turing thought of a test that could check the ability of a machine to exhibit intelligent behaviors. As computing power was then several orders of magnitude below today’s capacities, the test was not about intelligence itself, but about the ability to conduct text-based dialogues equivalent to, or indistinguishable from, those of a human. That approach was famously illustrated by Eliza, a software able to beguile humans in conversations without any understanding of their meaning.
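Eliza’s trick can be reproduced in a few lines: surface pattern matching with canned templates and zero understanding. The rules below are illustrative stand-ins, not Weizenbaum’s original script:

```python
import re

# Ordered rules: first matching pattern wins; the catch-all comes last.
rules = [
    (re.compile(r"i am (.*)", re.I), "Why do you say you are {0}?"),
    (re.compile(r"i feel (.*)", re.I), "How long have you felt {0}?"),
    (re.compile(r".*"), "Please tell me more."),
]

def eliza(utterance):
    """Echo the user's own words back inside a template -- no semantics involved."""
    for pattern, template in rules:
        m = pattern.match(utterance)
        if m:
            return template.format(*m.groups())
```

The responses feel attentive only because they recycle the user’s words; the program has no model of what any of them mean, which is exactly the gap the content-discovery examples below expose.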

More than half a century later, here are some suggestions of leading content discovery engines:

  • After reading about the Ecuador quake or Syrian rebels, one is supposed to be interested in 8 tips to keep one’s liver healthy, or 20 reasons for unsuccessful attempts at losing weight.
  • After reading about growing coffee in Ethiopia, one is supposed to be interested in the mansions of world billionaires, or a shepherd pup surviving after being lost at sea for a month.

It’s safe to assume that both would have flunked the Turing Test.



Caminao's Ways

Do systems know how symbolic they are ?