Data Mining & Requirements Analysis

Preamble

Data mining explores business opportunities and competitive advantage, requirements analysis describe supporting applications. Both use models, the former’s are predictive and ephemeral, the latter’s descriptive (or prescriptive) and perennial.

(Andreas Gursky)

Data mining: sorting business wheat from world chaff (Andreas Gursky)

Understanding how they are related could significantly improve processes maturity.

Data vs Requirements Analysis

Nowadays the success of a wide range of enterprises critically depends on two achievements:

  1. Mapping business models to changing environments by sorting through facts, capturing the relevant data, and processing the whole into meaningful and up to date information. That can be achieved through analysis models meant to described business expectations with regard to supporting systems.
  2. Putting that information into effective use through their business processes and supporting systems. That is done by systems architecture and design models meant to prescribe how to build software artifacts.
vv

From data analysis to systems requirements and software design

Those challenges are converging: under the pressure of markets forces and technological advances most of traditional fences between business channels and IT systems are crumbling, putting the focus on the functional integration between data mining and production systems. That’s where predictive models can help by anchoring descriptive models to moving markets and by cross-feeding analysis and operations. How that can be achieved has been the bread and butter of good corporate governance for some time, but there has been less interest for the third branch, namely how data analysis (predictive models) could “inform” business requirements (descriptive models).

From Data to Information

Facts are not given but must be captured through a symbolic description of actual observations. That entails some observer set on task using a mix of conceptual and technical apparatus. Data mining and requirements analysis are practical realizations of that process:

  • Data mining relies on analytic tools to extract revealing information that could be used to chart opportunities along business models.
  • Requirements analysis relies on business processes and users’ practice to extract symbolic descriptions that will be used to build models of supporting applications.

If both walk the path from data to information, their objectives are different: the former’s is to improve business decisions by making sense of actual observations; the latter’s is to build system surrogates from the symbolic descriptions of actual business objects and activities.

Anchors & Structures: Plasticity of  Business Entities

Perhaps paradoxically, business agility calls for terra firma because nimble trades must be rooted in corporate identity and business continuity. As a consequence, the first step of requirements analysis should be to associate individuals business objects or activities with stable and consistent identification mechanisms, and to group them with regard to that mechanism:

  • External entities with natural (person) or designed identity (car).
  • Symbolic entities for roles (customer) or commitments (maintenance contract).
  • Actual activities (promotion campaign) and events (sale) or business logic (promotion).
Anchors

Anchors

Conversely, as the aim of data analysis is to explore every business angle, individual observations are supposed to be moved across groups; yet, since the units identified by data analysis will have to be aligned with the ones described by requirements analysis, moves must also keep track of identities. That dilemma between continuity of identified structures on one side, plasticity of functional aspects on the other side, can be illustrated by banks which, in response to marketing requirements, had to shift from account (internal identification) to customer (external identification) based systems.

From account (left) to customer (right) centered systems

It’s easier to market insurance from customer centered systems (right) than from account centered ones (left)

That challenge can be overcome by linking the identification of symbolic entities to external anchors.

Profiles & Features: Versatility of Business Opportunities

As noted above, requirements and data analysis are set on the same road but driven by different forces: the former tries to group individuals with regard to identification mechanisms before fleshing them out with relevant features; the latter tries to group individuals with given identities according to features and opportunity profiles. Yet, what could appear as collision courses may become a meeting of minds if both courses are charted with regard to variants analysis.

From the requirements perspective the primary concern is to distinguish between structural and functional variants:

  • Structural variants are bound to identities, i.e set up-front for the respective life-cycle of individual business objects or transactions. As a consequence they cannot be changed without undermining business continuity. Moreover, being part and parcel of descriptors (e.g  types and use cases) their change will affect engineering processes.
  • Functional variants may vary during the respective life-cycle of individual business objects or transactions. As a consequence they can be changed without undermining business continuity, and changes in descriptors (e.g partitions and scenarii) can be managed without affecting engineering processes.

From the data mining perspective the objective is to improve the benefits of information systems for decision-making processes:

  • Static: how to classify individuals as to reduce the uncertainty of predictions
  • Dynamic: how to classify business options as to reduce the uncertainty of decisions.

Since those objectives are set for individuals, constraints on continuity and consistency can be dealt with independently of the description of symbolic surrogates.

Identified individuals with profiles for customers (a), their behaviors (b), and conciliatory gestures (c)

Identified individuals with profiles for customers (a), their behaviors (b), and promotional gestures (c)

It ensues that perspectives can be adjusted by factoring out the constraints of continuity and consistency for business objects (e.g cars), agents (e.g customer) and processes (e.g repair). Profiles for agents (a), behaviors (b), and business options (c) could then be freely explored and tailored with regard to changes in business environment and objectives.

Applying Data Analysis to Requirements

Not surprisingly data analysis techniques can be used to adjust perspectives. For that purpose a sample of individuals (business objects and operations) representing the population targeted by requirements would have to be submitted to basic mining routines. Borrowing a catalog from F. Provost & T. Fawcett:

  1. Classification: estimates the probability for each individual (objects or operations) to belong to a set of classes; can be used to assess the closeness of the variants (respectively power-types or execution paths) identified by requirements analysis.
  2. Regression: reverse classification; estimates how much of individual features valuations can be explained by the proposed classifications.
  3. Similarity: a shallow version of classification; can be used to assess the distance between variants and consolidate the proposed classifications.
  4. Clustering: a deep version of classification; can be used to distinguish between shallow and natural classifications.
  5. Co-occurrence: deals with behavioral variants; can be used to distinguish between functional and structural classifications.
  6. Profiling: reverse of co-occurrence; can be used to consolidate functional and structural classifications.
  7. Links prediction: can be used to define relationships.
  8. Data reduction: eliminate redundant individuals; can be used to consolidate requirements and refine tests scenarii.
  9. Causal modeling: brings together business logic (events and rules) and users decisions; should provide the backbone of tests scenarii.

Besides the direct benefits for requirements, such procedures may help to bridge the span between data and requirements analysis and significantly improve processes’ capability and maturity level.

Business Objectives & Enterprise Architecture Capabilities

Data mining being first and foremost about competitive edge, it relies on a timely and effective coupling between enterprises capabilities and business opportunities. But the dilemma between continuity and plasticity described above for business objects and processes reappears at enterprise level: how to conciliate architecture, by nature perennial, with the agility needed to make the best of changing and competitive environments ?

As architectural big bang is arguably a last resort option, answers to that question must be progressive and local: if changes are to be swift and pertinent they must be both circumscribed and leveraged to the relevant parts of architecture. Taking an (amended) leaf of the Zachman framework, its sixth column (“Why” ) could be reset as a line for business and operational objectives that would cross the original five columns instead of the architecture layers. Using a pentagonal representation of enterprise architecture, that line would be set on the outer range.

ccc

Enterprise Architecture and the loci of change

It must be reminded that setting objectives on a line crossing the columns of capabilities instead of a column crossing the lines of layers means that objectives are set at enterprise level and their cascading impact traced and managed through layers.

Symbolic Systems vs World

Nowadays the life of enterprises fully depends on the ability of their systems to deal with their environment by making sense of data and supporting production systems. As long as environments were a hotchpotch of actual and symbolic artifacts the pros and cons of integration could be balanced. But the generalization of digital facts and transactions has upended the balance: there is no more room or time for latency and enterprises must unify the symbolic representation of their business models, organization, and systems. That should be the role of conceptual models but the challenge is to avoid flights to abstraction and rainbow chases.

OpenConcepts_00

Conceptual models as bridges between environments, processes, and systems.

That could be done by introducing a conceptual indexing scheme open to extensions but with its footprint defined by business processes and systems functionalities.

Selected Readings

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: