Thursday, November 13, 2008

Business Intelligence as a Service

“There is a service waiting to be born.”

Peter Drucker, The Daily Drucker


How does Web 2.0 fit into the growing world of Business Intelligence? This post attempts to answer this question by making two points:

  1. Business Intelligence as a Service - For most companies Business Intelligence needs to be provided as a service, with the technical aspects outsourced to experts.
  2. Web 2.0 - The best way to provide Business Intelligence as services is through secured internet pages using Web 2.0 technology.

The combination of these two factors makes Business Intelligence (or BI) a perfect opportunity for a small group of very skilled people that are able to provide what most businesses cannot provide on their own.

1. Business Intelligence as a Service

Business is an information intensive activity with a powerful competitive advantage going to those companies can best determine their customer’s needs and the means of satisfying those needs. Business Intelligence is the application of data analysis to finding the business factors that provide this competitive advantage. Business Intelligence offers insightful and timely information to a company’s strategic decision-makers. Patterns and correlations between customers, products, markets, and other information can be provided to executives that will help them allocate the company’s resources to where they can be most productive for both the company and the economy within which it functions.

BI, however, requires specialized knowledge that does not exist within most companies, even including those companies that can afford their own sizeable IT staffs. How to structure a data warehouse, create data mining models, or using statistical techniques to classify customers or predict sales are skills that are rare in the typical IT staff and require a large amount of specialized training to grow within the company. To date, most BI ventures have either failed completely or have proven to be disappointing in the amount of information produced and the expenses incurred.

In addition to the inherent difficulty in producing a BI staff, there is the problem of scale that limits the development of BI to only those companies that are large enough to allocate an enormous budget to produce an automated extension to their enterprise decision makers. While the international conglomerate can afford to throw millions of dollars at a BI effort, the vast majority of businesses will find an in-house BI effort prohibitive.

According to Adam Smith, the “wealth of nations” was primarily a resultt of a division of labor that allowed the baker to bake, the painter to paint, and the pin-maker to produce pins. The production of Business Intelligence, while requiring a commitment from the business man, requires an expertise that is affordable only to the man of leisure, the corporate executive with an infinite budget, or the business person who is resourceful enough to allocate the effort to an expert service. As smaller firms began to see the gains made by their bigger competitors, they will need to turn to a firm that can provide BI as a service.


2.
Business Intelligence and Web 2.0

A service company needs to deliver a service, and in the case of Business Intelligence, the service to be delivered is information. The consumer needs to receive information in a manner that it will provide him insights to his business. In addition, the consumer needs to deliver raw data, representing the activities of the company, to the service-provider to be stored in such a manner that it can easily be turned into intelligence. If Business Intelligence is to be provided as a service, the service needs to be able to communicate data to and from the service consumer and it must be able to present the data in a manner that is user-friendly, efficient, and intuitive.


The communications requirement immediately suggests the internet. By way of the internet, data can be both uploaded to or downloaded from a service-provider’s database. There is no need for a dedicated network; the user only needs to access the database through an internet address and whatever security is needed to protect the data’s privacy. The URL is the application-download location, and the browser is the application platform. With the internet, the complete process of storing and reading data in an outsourced database is easily resolved. Users and developers have embraced the web because it offers centralized management and delivery, no local installation, instant updates to both functionality and content, and elements such as shareable bookmarks.


The problem with the internet is the classic render-call-render pattern that dominates most web applications. The traditional way of providing an interactive web site is to enter data in a page and wait for the page’s host to return a new updated page. This lacks the smooth interaction that a user has with an application that is running on his local machine. On his local machine he is able to enter data and have the program respond without rewriting the screen; the application is working in an interactive manner directly with the user and there is no delay time while a page and its bulky graphics are retransmitted across the internet. Conventional web applications solve the problem of data communications but leave the problem of a clumsy interface that interrupts the users thought and confronts him with interactive ambiguity.


This brings us to Web 2.0 and its ability to turn an internet browser into a virtual desktop. Web 2.0 teams the experience of interactivity with the underlying connectivity that comes from the internet. According to Tim O’Reilly, “Web 2.0 is the business revolution in the computer industry caused by the move to the internet as a platform, and an attempt to understand the rules for success on that new platform.”


At the center of the Web 2.0 phenomena are the interactive technologies AJAX, Adobe Flash, Flex, and Java. AJAX, for example, has prompted the development of websites that mimic desktop applications. Within a page based upon AJAX, there is embedded JavaScript code that can actually operate as a full-blown application, giving the user the same experience that he gets from word processing and spreadsheet applications that sit on his own machine.


With the combination of expert Business Intelligence designers combined with the technology of Web 2.0, there is a service out there waiting to be born.

Friday, November 7, 2008

The Universal OLAP Query

One of the impediments to providing Business Intelligence to management is the technical expertise required to make a request to the data warehouse. A busy executive cannot be expected to be able to use a database language like SQL and must therefore depend upon standard rigid reports from OLAP reporting tools. In this post, I will demonstrate how easy it is to provide a technically untrained executive with a graphical tool that will allow him to generate millions of combinations of queries to standard data warehouse.

The challenge appears difficult because, not only is SQL a technically complex language based upon mathematical set theory, but the SQL queries that go to a database designed as a data warehouse are particularly complex to code. However, within these challenges lies the very solution to providing real data warehouse power and flexibility to the non-technical user.

Every request for data from an orthodox data warehouse is complex but exactly the same in form—only the parameters change in a statement that must have the same components in the same places to utilize the analytical power of the data warehouse. The vehicle must be the same while the options differ depending upon the specific questions asked by the analyst. Let’s dissect the standard data warehouse query and factor out the generalities from the particulars of a specific query and thereby determine what are the parameters are to a “Universal OLAP Query.”

First of all, data warehouses have a particular architecture to their data that remains the same whether we are using a specialized “multidimensional database” or a standard relational database designed to implement a multidimensional “star schema.” This architecture provides the analyst with a set of facts that can be grouped and summed according to the dimensions of the facts. For example, where a company keeps a data warehouse of retail sales, the design of the data warehouse would appear as follows where the blocks are tables (a single “fact” table that is linked to four “dimension tables” that contain the sets of objects that the retail sales are related to):

Although the dimensions are specific to this data warehouse, the structure of a fact table attached to a set of dimensions remains the same in any orthodox data warehouse, accounting for its description as a “multidimensional database” or a “star schema.” Much of the complexity of an OLAP data query is a result of the joins that must occur between the Fact table and one or more of the dimensional tables. The solution to creating a universal OLAP query is automate this join process and provide the non-technical analyst with a user interface that allows him to simply select the facts and dimension that those facts will be rolled up by. Know this, we can see that the star schema query will always appear in the following form:

select Fact, Dimension Criteria
from Fact Table, Dimensional Tables
where Join Criteria, Slicing Criteria
group by Dimensional Criteria

This structure remains the same for each OLAP query, the things that change from query to query are the underlined terms that we can turn into parameters to our universal OLAP query generator. Reviewing these terms in the above example gives us the following:

This reduces the syntax of SQL (or some other technical language) to a set of parameters that can be chosen from a graphical user interface that uses drag-and-drop or some other selection process for our non-technical user, however, there are a lot of parameters here and we can reduce them somewhat. The identity of the Fact Table can be programmatically inferred from the selection of Facts, eliminating it as a parameter. Likewise, the Dimension Tables and the Join Criteria can be inferred from Dimensional Criteria and the Slicing Criteria (We assume that the Universal Query Generator has been initialized with the metadata that describes the data warehouse). This reduces the input to the generator as follows:

A user interface that has been configured with the description of the data warehouse can then provide a simple wizard that allows the user to select the facts that he wants rolled up (summed, averaged, etc.), the attributes of the dimensions by which the data will be rolled up (for example, customer age group, product color, time periods, etc.), and the limits of the data (slicing criteria). The Facts and Dimensional Criteria are simple selection operations, while the slicing criteria can involve simple to more involved logic as data is sliced with “and” and “or” conditions. However, the whole thing can be done with an intuitive user interface that requires no technical language skills.

Thursday, November 6, 2008

Drucker on Integrating Decision Support Data

“…most businesses have two information systems. One is organized around the data stream; the other, far older one, around the accounting system. The accounting system, though, is a five-hundred-year-old information system that is in terrible shape. The changes we will see in information technologies over the next twenty years are nothing compared to the changes we will see in accounting.”

Peter Drucker, Management in the Next Society

The heart of Business Intelligence is the data warehouse, with its historical record of the business events. It typically includes data about customers, products, business units, and other classes of objects that are critical to the company’s decision-making. Unfortunately, it has not included the most critical dimension of the business event—the movement of financial resources that occur during that business event.

This data, historically the most important of all business information, has been sequestered from the world of Business Intelligence by accountants. The accountants use this data to produce reports that satisfy government regulations while providing almost no strategic assistance to the company’s management. According to Peter Drucker, decision support will have to include the transactional data that is produced by the accounting department.

“At present, these two systems [data stream and accounting] are separate. They will not be separate for our grandchildren’s generation. Today’s CEOs still depend on the accounting model. I don’t know of a single business that bases its decisions on the data processing stream. Everyone bases their decisions on the accounting model, even though most of us have learned how easy this model is to manipulate.”

Peter Drucker, Management in the Next Society

Drucker is right; what we call Business Intelligence will have to be integrated with accounting data. It simply has to; the competitive advantages of doing it are too great. More importantly, the competitive disadvantages of not doing it will make it essential.