What is a Data Compiler?

So, what is a Data Compiler?  Is it like any other software compiler?  The answer is, “almost but not quite.”

We know that Java compilers compile Java source code in order to generate Java libraries and executables.  We also know that C++ compilers compile C++ source code in order to generate C++ libraries and executables.  You would think it’s pretty safe to assume that Data Compilers compile Data source code in order to generate Data libraries and executables.  Right?

Well, there’s no such thing as “data executables” so there is a difference.

All compilers have some similar traits:

  1. They take inputs,
  2. They process and transform these inputs, and
  3. They generate expected and controlled outputs.

For example, if we apply this to Java Compilers, we know that:

  1. Java Compilers take in Java Source Code and Build Rules,
  2. They use the Build Rules to process and transform the Source Code,
  3. And, assuming no issues, they generate Libraries and/or Executables, based on those Build Rules.

Data Compilers are similar, yet a little different.  In fact, there are different permutations of Data Compilers.  For example:

  • Big Data analytics tools are Data Compilers.  1) They ingest Data & Rules, 2) they process and transform that Data based on those Rules, and 3) they automatically generate Reports, Charts, and Visualizations.
  • Web Site and Digital Library Synthesizers, such as IF4IT NOUNZ, 1) ingest Data and Rules, 2) process and transform that Data based on those Rules, and 3) automatically generate Web Sites and Digital Libraries.
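The three-step pattern above (ingest, transform, generate) can be sketched in a few lines of code.  This is a minimal, hypothetical illustration of the pipeline, not the implementation of any particular tool; all names are invented for the example.

```python
# Minimal sketch of a data compiler: ingest data and rules,
# transform the data, and emit an output artifact (here, a report).
# All names are hypothetical.

def compile_data(records, rules):
    """Apply each rule (a transform function) to the records, then render a report."""
    for rule in rules:
        records = [rule(r) for r in records]
    # "Generate" the output artifact: a simple plain-text report.
    lines = [", ".join(f"{k}={v}" for k, v in r.items()) for r in records]
    return "\n".join(lines)

# Example: a single "rule" that uppercases every name field.
data = [{"name": "ada"}, {"name": "grace"}]
rules = [lambda r: {**r, "name": r["name"].upper()}]
print(compile_data(data, rules))
```

Real data compilers apply far richer rules (categorization, linking, layout), but the control flow is the same: data in, rules applied, artifact out.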

This process is called “Data Driven Synthesis.”  In other words, you use Data as fuel to automatically generate (i.e. synthesize) some other construct.

Why would you use a Data Compiler?

Data Compilers allow you to transform very large quantities of Data, very quickly, in a manner that makes changes easy.  This means you can “fail fast” and leverage agile processes to iteratively get to solutions you like, faster, with higher levels of quality, and for a much lower investment.

If you’ve ever dealt with a very large database, like a Data Warehouse, you know that changing what’s in that repository can be very complex, time consuming, and expensive, especially as the Data volumes grow to be very large.

Data Compilers use no databases, although some can dump their outputs into databases if they’re programmed to do so.  More often than not, Data Compilers generate an “artifact” like a report.  Some that are more advanced are programmed to generate far more complex artifacts like Documents, Web Sites, and/or Digital Libraries.  IF4IT NOUNZ is an example that knows how to generate stand-alone HTML documentation, in the form of a deployable Web Site that is structured as a Digital Library.

Trade-Offs

Databases are transactional and can be used for real-time operational data.  Because Data Compilers are not transactional, they don’t usually work well in real-time implementations.

Data Compilers are a batch-based paradigm.  They take some time to compile, and that time can grow based on the quantity and/or complexity of the input.  However, unlike databases, which require you to design and use a Schema to control data structure, you can rapidly change Data structure, at will, to meet your needs, without the requirement of complex, time-consuming, and expensive modeling.
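That schema-free flexibility can be shown in a small sketch.  Because a data compiler simply renders whatever fields each record carries, adding a new attribute requires no migration step.  This is a hedged illustration (hypothetical field names, not any specific tool’s behavior), contrasted with a database’s ALTER TABLE requirement.

```python
# Sketch: a data compiler renders whatever fields each record carries,
# so adding a new attribute needs no schema migration.
# Field names and values are hypothetical.

def render_record(record):
    """Render one record as an HTML definition list, field by field."""
    rows = "".join(f"<dt>{k}</dt><dd>{v}</dd>" for k, v in record.items())
    return f"<dl>{rows}</dl>"

old = {"Name": "Ada"}
new = {"Name": "Ada", "Role": "Engineer"}  # extra field; no ALTER TABLE needed

print(render_record(old))  # <dl><dt>Name</dt><dd>Ada</dd></dl>
print(render_record(new))  # the new field just appears in the output
```

The same renderer handles both shapes of record; in a schema-bound database, the second record would first require a schema change.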

Databases, together with all the infrastructure associated with them and all the development you need to do around them, yield far more complex solutions that take longer to implement and change, and that are much more expensive.  Data Compilers are far simpler, quicker to get to a result, and far cheaper.

Summary

Data Compilers are tools used to dynamically and automatically generate complex Data and Information artifacts like Documents, Web Sites, and Digital Libraries, directly from Data and from Processing Rules.  They are not meant to replace database-oriented systems but act as alternatives that help solve certain problems more efficiently and effectively, especially in the areas of Business Intelligence, Analytics, and Knowledge Management.

What is a Data Driven Web Site?

If you’re still building web sites like your Intranet, manually and one content page at a time, your competition is about to crush you.

The old way of building Web Sites is one Web Page at a time

Traditional Web Sites are built and curated, manually and one content page at a time. This means that, after you write your content, you have to do things like associate tags, categorize it, link it to other pages, and then publish it.  If you have to do this for thousands of Web Pages, the time to do all these things is significant and is one of the primary reasons that Web Sites like company/organizational Intranets fail.

What if you didn’t have to manually write and curate all your content but, instead, could let a computer do the bulk of the work for you… at least all the work that goes beyond the authoring of the content, itself? Just think of all the time you’d save and the amount of content you could publish.  This leads us to the new way of creating Web Content, which is through a paradigm called Data Driven Synthesis (a.k.a. Data Compilation).

What is Data Driven Synthesis (a.k.a. Data Compilation)?

Data Driven Synthesis, otherwise known as Data Compilation, is the process of ingesting data, along with rules that describe how to process that data, and generating an output that conforms to those rules.  In other words, you use Data as building blocks to build other things.  Because a computer is doing the work, you can get far more advanced features that surround your Data and Information (i.e. your Web Content).  For example:

Example of NOUNZ Synthesized Web Views

Imagine how long it would take you and how expensive it would be for you or your organization to build all those views, for hundreds of thousands or even millions of Data elements.  And, that’s assuming you have the skills to do it all, which most organizations don’t.

What are Data Driven Web Sites?

Unlike traditional Web Sites that are built one content page at a time, like your Intranet or even Wikipedia, Data Driven Web Sites are built automatically and directly from data (including embedded content).  This means that in the time it normally takes you to build, curate, and publish just one Web Page (including links to other pages), a computer can build, curate, and publish hundreds of thousands and even millions of pages that include complex features you couldn’t even dream of creating yourself.

How is it possible to use Data to drive the automatic synthesis or compilation of Web Sites?

Your data is composed of “things.”  Most of these things have importance to people and systems, under specific contexts.  These “things” also have relationships to other things, and even to themselves, also with descriptive context.  Data Driven Synthesizers or Compilers are programmed to know how to read data, extract the “things,” categorize them, organize them, create relationships between them, and turn them into other things… like web pages (at a minimum) or Digital Libraries (in a more advanced state and structure).
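The extraction step described above can be sketched concretely: pull the distinct “things” out of flat records and record the relationships between them, the way a synthesizer might before generating pages.  The field names and values here are hypothetical, chosen only to mirror the family example used later in this article.

```python
# Sketch: extract "things" and their relationships from flat records,
# as a data-driven synthesizer might do before generating pages.
# Field names and values are hypothetical.

people = [
    {"name": "Alice", "mother": "Carol"},
    {"name": "Bob", "mother": "Carol"},
]

# Each distinct name becomes a "thing"; mother fields become relationships.
things = set()
relations = []
for p in people:
    things.update([p["name"], p["mother"]])
    relations.append((p["name"], "child_of", p["mother"]))

print(sorted(things))  # ['Alice', 'Bob', 'Carol']
print(relations)       # each tuple could become a hyperlink between pages
```

Note that Carol becomes a “thing” even though she has no record of her own: she was discovered through relationships, which is how a synthesizer can generate more pages than it has input rows.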

A Simple Example

Imagine a spreadsheet (2 dimensional view) that represents a list of people with descriptive attributes.  For each person, we have the following descriptive attributes:

  • Last Name
  • First Name
  • Age
  • Gender
  • Mother
  • Father

If you were asked to turn every record into a Web Page, there are two options for generating each page…

  1. Option #1: Manually generate one web page for each person that shows the descriptive attributes for that person.
  2. Option #2: Use a computer to read the data and automatically generate one Web Page for each person.

Option #2 would clearly be faster.  However, even Option #2 has different permutations for implementation…

  • Option 2a: Build a transactional solution that leverages a database and dedicated code that knows how to read the database and render web pages, on the fly.
  • Option 2b: Build a compiler that simply reads the data and spits out static Web Pages that you can publish.

The problem with Option 2a is that you have to build it all and manage it all, yourself.  If you change an attribute, you need to change the code.  If you change the look and feel of the user interface, you have to change the code.  If you want to build advanced data joins (for queries), you have to build them yourself.  Changing things gets very expensive and is extremely time consuming.

In Option 2b there is no database and no web server.  The Synthesizer or Compiler reads the data and spits out a complex Web Site that you can carry with you on a Flash/Thumb Drive and read/view without a Web Server.
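Option 2b can be sketched in a few lines: read the person records and write one static HTML file per person, with no database and no web server.  The field names follow the spreadsheet example above; the filenames, output directory, and page layout are invented for illustration and are not how any particular product works.

```python
import os
import tempfile

# Sketch of Option 2b: "compile" person records into static HTML pages.
# One page per person; no database, no web server. Layout is hypothetical.

PEOPLE = [
    {"Last Name": "Smith", "First Name": "Jane", "Age": "34",
     "Gender": "F", "Mother": "Ann Smith", "Father": "Joe Smith"},
]

def build_page(person):
    """Render one person record as a stand-alone HTML page."""
    rows = "".join(f"<li>{k}: {v}</li>" for k, v in person.items())
    title = f"{person['First Name']} {person['Last Name']}"
    return f"<html><body><h1>{title}</h1><ul>{rows}</ul></body></html>"

def compile_site(people, out_dir):
    """Write one static .html file per record into out_dir."""
    for p in people:
        path = os.path.join(out_dir, f"{p['Last Name']}_{p['First Name']}.html")
        with open(path, "w") as f:
            f.write(build_page(p))

out = tempfile.mkdtemp()
compile_site(PEOPLE, out)
print(os.listdir(out))  # ['Smith_Jane.html']
```

The resulting directory of files is the whole deliverable: copy it to a thumb drive or any static host and the pages open directly in a browser.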

Data Driven Synthesis is a quick way to feed data into a software Synthesizer/Compiler that spits out a complex Web Site, faster than it would take humans to install infrastructure, customize that infrastructure, and manage data for that infrastructure.

A common example of failure that almost everyone has experienced: Your Intranet

The simplest example of failure is the traditional Intranet.  Ask most people and they will tell you that their Intranets are pretty bad.  It’s rarely because of the features but more because of the lack of content (Data and Information) that they need to perform their jobs.  The fact is that those who build Intranets can never produce content quickly enough, or in sufficient quantity and quality, to meet the needs of the employees and consultants who consume it.  As a result, the Intranet sits around barely being used… a noticeable waste of investment.

Data Driven Web Site Builders/Synthesizers like NOUNZ allow you to create Web Sites that contain massive quantities of neatly categorized, organized, and highly inter-linked content, in minutes. And, for every one HTML link you create, manually, these Data Driven Web Site Builders can find and create many thousands.

So, if you’re not writing your content, where is it coming from?

Believe it or not, you can quickly and very easily get access to and/or pull data for content from many sources.  Examples include but are not limited to: Spreadsheets, Microsoft SharePoint List Structures, and Systems that have Databases.

Imagine feeding a simple spreadsheet with 1,000 products, along with all product details, into a Data Driven Synthesizer and, with the click of a button, the Synthesizer generates far more than 1,000 pages, all neatly categorized, organized, and inter-linked to other related data and topics… in seconds.  And, imagine that the output automatically includes indexable views of data, dynamic charts and dashboards, and interactive graphic visualizations that help your end users see and use the Data and Information in many different ways and under different contexts.  This is the power of Data Driven Synthesis.

Example of a Data Driven Web Site Synthesizer

IF4IT NOUNZ is a prominent example of a Web Site Synthesizer.  It takes data in First Normal Form (1NF), along with some rules, and generates a massive and very powerful Web Site that is structured in the form of a Digital Library.  In the time it normally takes a person to create, curate, and publish one traditional content page, NOUNZ can do the same for tens of thousands and even hundreds of thousands of pages.

Summary

The simple truth is that if you’re building Web Sites using traditional Content Management Systems (CMSs) that require you to do so one page at a time, you should consider that your competition is now building their Web Sites automatically, with Data Compilers, hundreds of thousands (and even millions) of pages at a time.  While you worry about how to build and strap in new features, your competition is letting Data Driven Synthesizers do the bulk of the work for them.  This means that for the cost it takes you to build one page, they perform many hundreds of thousands (and even millions) of times more work, simultaneously, yielding far more features.