Semantic Database Technology Hits Compliance

Have you ever considered how your compliance challenges could be addressed by leading-edge database technology? The new era of semantic database technologies offers an impressive array of near-term possibilities for compliance, for those companies that get started now. We are one such organization, and we'd like to share some of our thinking and activities in a series of blog posts on semantic database technology and how it can address regulatory compliance challenges. These thoughts extend our prior posts on GRC data lakes.

Currently, we have two industry benchmarking platforms in production, which address 1) Financial Benchmarking and 2) Corporate Resilience Benchmarking (i.e., an Enterprise Resilience Ratings System). The third industry benchmarking solution, now under development, is titled the "BreakPoint KPI Performance Measurement System". Here's our description:

BreakPoint KPI™ is a benchmarking platform that provides Members with custom Key Performance Indicator (KPI) industry benchmarking, in which peer assessment information is delivered to Members of an industry cluster in near real time. Our Governance Execution Framework (GEF) is designed to take advantage of BreakPoint KPI's enterprise-class performance and is ideal for deploying GRC Data Lakes of federated GRC data. Note that BreakPoint KPI is a core member service positioned for use by members of an Industry Cluster, a SIG or, alternatively, individual Member firms. Here are some GRC application areas that need strong KPI industry benchmarking: COSO Enterprise Risk Management, anti-fraud, crisis response, defense, cybersecurity, insider threat, intelligence, legal intelligence, law enforcement, and your own custom applications.

Examples of similar member-based communities (i.e., collectives or consortiums like GRC Sphere) that have a need for our BreakPoint KPI platform:

  • The Operational Risk eXchange – www.ORX.org
  • Shared Assessments – www.SharedAssessments.org

1.   Technology Platform Specifics

Our federated data management platform relies upon the following core functions and two key use cases, which are identified within this functional specification:

a. A data model that ideally extends a GRC vendor’s data model so that we, as an intermediary Industry Benchmarking Consortium, can perform crowdsourcing and benchmarking data management functions.

b. Crowdsourcing functions (i.e., collecting, aggregating and synthesizing both quantitative and qualitative data).

c. A ratio-based scalar measurement function with sliders as the input mechanism.

d. Sensitivity analysis to assess the crowdsourced data.

e. Industry benchmarking functions (taking the synthesized ratio-based output, trending the data on a monthly, quarterly and annual basis, and outputting T- and Z-scored data).
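To make item (e) concrete, here is a minimal sketch of the Z- and T-scoring step in Python. The sample slider inputs and function names are ours for illustration, not part of the platform:

```python
import statistics

def z_scores(values):
    """Standardize peer submissions: z = (x - mean) / stdev."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [(x - mean) / stdev for x in values]

def t_scores(values):
    """T-scores rescale z-scores to mean 50, stdev 10: T = 50 + 10z."""
    return [50 + 10 * z for z in z_scores(values)]

# Hypothetical slider inputs (0-100 ratio scale) from five peer firms
peer_inputs = [62, 71, 55, 80, 67]
print([round(t, 1) for t in t_scores(peer_inputs)])
```

Because T-scores are a fixed linear rescaling of z-scores, each Member can see where it sits relative to the peer mean (T = 50) without any peer's raw submission being disclosed.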

2.   Technology Platform Independence

Our federated data model (API) is purpose-built to support any of the following without jeopardizing any GRC partner, any consultancy or any of their customers or clients. Specifically, we need to support:

A.    Any Consultancy

B.    Any GRC platform

C.   Any ERP platform

D.   Specific Crowdsourcing model input definitions

E.  Delivery of industry benchmarking data directly into a web client that integrates with three types of Professionals’ Networked Community (PNC) web platforms.

3.   Core Standards

A.    Financial Industry Business Ontology® (FIBO®) 

The OMG Finance Domain Task Force (OMG FDTF) was created to develop sustainable business and technology standards that will promote the notion that Data and its Semantics are the DNA of financial services. The FDTF is currently working on FIBO.

A joint effort by the OMG and the Enterprise Data Management (EDM) Council, FIBO is an industry initiative to define financial industry terms, definitions and synonyms using semantic web principles such as RDF/OWL and widely adopted OMG modeling standards such as UML. FIBO will contribute to transparency in the global financial system, provide industry firms with a cost-effective means of integrating disparate technical systems and message formats, and aid regulatory reporting by providing clear and unambiguous meaning for data from authoritative sources.

B.    ODATA - http://www.odata.org/

OData (Open Data Protocol) is an OASIS standard that defines a set of best practices for building and consuming RESTful APIs. OData helps you focus on your business logic while building RESTful APIs without having to worry about the various approaches to define request and response headers, status codes, HTTP methods, URL conventions, media types, payload formats, query options, etc. OData also provides guidance for tracking changes, defining functions/actions for reusable procedures, and sending asynchronous/batch requests.

OData RESTful APIs are easy to consume. The OData metadata, a machine-readable description of the data model of the APIs, enables the creation of powerful generic client proxies and tools.
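As a small illustration of the URL conventions and query options mentioned above, the following Python sketch assembles an OData query URL using the standard system query options ($filter, $select, $top). The service root and entity set names are hypothetical:

```python
from urllib.parse import quote

def odata_query(service_root, entity_set, **options):
    """Build an OData query URL from system query options
    ($filter, $select, $top, ...), percent-encoding the values."""
    parts = [f"${name}={quote(str(value))}" for name, value in options.items()]
    return f"{service_root}/{entity_set}?" + "&".join(parts)

url = odata_query(
    "https://example.org/odata",   # hypothetical service root
    "BenchmarkScores",             # hypothetical entity set
    filter="Industry eq 'Banking'",
    select="MemberId,KpiScore",
    top=10,
)
print(url)
```

In practice a generic OData client library would generate such requests from the service's published metadata document rather than by hand, which is exactly the "generic client proxies and tools" benefit described above.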

C.    SPARQL

SPARQL (pronounced "sparkle") is a recursive acronym for SPARQL Protocol and RDF Query Language. SPARQL is a query language for the Resource Description Framework (RDF), that is, a semantic query language for databases, able to retrieve and manipulate data stored in RDF format. It was standardized by the RDF Data Access Working Group (DAWG) of the World Wide Web Consortium (W3C) and is recognized as one of the key technologies of the semantic web.

SPARQL allows a query to consist of triple patterns, conjunctions, disjunctions, and optional patterns. Implementations exist for multiple programming languages. Tools such as ViziQuer let one connect to a SPARQL endpoint and semi-automatically construct a query, and other tools translate SPARQL queries into other query languages, for example SQL and XQuery.

SPARQL allows users to write queries against what can loosely be called "key-value" data or, more specifically, data that follows the RDF specification of the W3C. The entire database is thus a set of "subject-predicate-object" triples. This is analogous to some NoSQL databases' usage of the term "document-key-value", such as MongoDB.

SPARQL thus provides a full set of analytic query operations such as JOIN, SORT, AGGREGATE for data whose schema is intrinsically part of the data rather than requiring a separate schema definition. Schema information (the ontology) is often provided externally, though, to allow different datasets to be joined in an unambiguous manner. In addition, SPARQL provides specific graph traversal syntax for data that can be thought of as a graph.

D.    RDF

RDF data can also be considered in SQL relational database terms as a table with three columns: the subject column, the predicate column, and the object column. Unlike relational databases, the object column is heterogeneous: the per-cell data type is usually implied (or specified in the ontology) by the predicate value. Alternatively, again comparing to SQL relations, all of the triples for a given subject could be represented as a row, with the subject as the primary key, each possible predicate as a column, and the object as the value in the cell. However, SPARQL/RDF becomes easier and more powerful for columns that can contain multiple values (like "children") for the same key, and where the column itself can be a joinable variable in the query rather than being directly specified.
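The three-column table view, including the multi-valued "children" case, can be sketched in a few lines of plain Python. The data and the `match` helper are purely illustrative:

```python
# A triple store as a plain set of (subject, predicate, object) rows --
# the three-column table described above. Names are illustrative.
triples = {
    ("alice", "children", "bob"),
    ("alice", "children", "carol"),   # multi-valued: two rows, same key
    ("alice", "age", 42),             # heterogeneous object column
    ("bob",   "age", 12),
}

def match(s=None, p=None, o=None):
    """Basic triple-pattern match: None acts like a SPARQL variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

print(sorted(match("alice", "children")))  # both values for one key
```

Where a relational row would need a separate child table and a join, the triple form simply stores one row per value, and the predicate position itself can be left as a variable.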

Next time we will dive a bit deeper now that we've reviewed the core standards for our semantic database technology initiative for compliance, crowdsourcing, benchmarking and GRC Data Lakes.
