So, what is a Data Compiler? Is it like any other software compiler? The answer is, “almost but not quite.”
We know that Java compilers compile Java source code in order to generate Java libraries and executables. We also know that C++ compilers compile C++ source code in order to generate C++ libraries and executables. You would think it's pretty safe to assume that Data Compilers compile Data source code in order to generate Data libraries and executables. Right?
Well, there's no such thing as a "data executable," so there is a difference.
All compilers have some similar traits:
- They take inputs,
- They process and transform these inputs, and
- They generate expected and controlled outputs.
For example, if we apply this to Java Compilers, we know that:
- Java Compilers take in Java Source Code and Build Rules,
- They use the Build Rules to process and transform the Source Code,
- And, assuming no issues, they generate Libraries and/or Executables, based on those Build Rules.
Data Compilers are similar, yet a little different. In fact, there are different permutations of Data Compilers. For example:
- Big Data analytics tools are Data Compilers. 1) They ingest Data & Rules, 2) they process and transform that Data based on those Rules, and 3) they automatically generate Reports, Charts, and Visualizations.
- Web Site and Digital Library Synthesizers, such as IF4IT NOUNZ, 1) ingest Data and Rules, 2) process and transform that Data based on those Rules, and 3) automatically generate Web Sites and Digital Libraries.
This process is called "Data Driven Synthesis." In other words, you use Data as fuel to automatically generate (i.e., synthesize) some other construct.
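To make the pattern concrete, here is a minimal sketch of Data Driven Synthesis in Python. The record fields, rule names, and the compile_report function are hypothetical, invented purely for illustration and not taken from any particular product:

```python
# Input Data: plain records with no database or schema behind them.
data = [
    {"type": "Server", "name": "srv-01", "status": "Active"},
    {"type": "Server", "name": "srv-02", "status": "Retired"},
    {"type": "Application", "name": "Billing", "status": "Active"},
]

# Processing Rules: declare what to group by and which records to count.
rules = {"group_by": "type", "count_only_status": "Active"}

def compile_report(records, rules):
    """Transform the input Data according to the Rules and synthesize a report."""
    counts = {}
    for record in records:
        if record["status"] != rules["count_only_status"]:
            continue  # the Rules say to skip this record
        key = record[rules["group_by"]]
        counts[key] = counts.get(key, 0) + 1
    return "\n".join(f"{key}: {count} active" for key, count in sorted(counts.items()))

print(compile_report(data, rules))
# Application: 1 active
# Server: 1 active
```

Note that changing the output only requires editing the Data or the Rules and recompiling; nothing in a database has to be migrated.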
Why would you use a Data Compiler?
Data Compilers allow you to transform very large quantities of Data, very quickly, in a manner that makes changes easy. This means you can "fail fast" and leverage agile, iterative processes to get to solutions you like, faster, with higher levels of quality, and for a much lower investment.
If you've ever dealt with a very large database, like a Data Warehouse, you know that changing what's in that repository can be complex, time consuming, and expensive, especially as Data volumes grow.
Data Compilers use no databases, although some can dump their outputs into databases if they're programmed to do so. More often than not, Data Compilers generate an "artifact," like a report. More advanced ones are programmed to generate far more complex artifacts, like Documents, Web Sites, and/or Digital Libraries. IF4IT NOUNZ is an example that knows how to generate stand-alone HTML documentation, in the form of a deployable Web Site that is structured as a Digital Library.
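As a rough illustration of that idea (and emphatically not the NOUNZ implementation), the following sketch "compiles" a handful of records into a stand-alone set of HTML pages; the file names, fields, and output folder are all hypothetical:

```python
# A toy "compile Data into a deployable Web Site" step.
import html
import pathlib

data = [
    {"id": "srv-01", "name": "Primary Web Server", "owner": "Infrastructure"},
    {"id": "app-billing", "name": "Billing Application", "owner": "Finance IT"},
]

output_dir = pathlib.Path("site")  # hypothetical output location
output_dir.mkdir(exist_ok=True)

index_links = []
for record in data:
    # One stand-alone HTML page per record.
    page = output_dir / f"{record['id']}.html"
    body = "".join(
        f"<p><b>{html.escape(key)}:</b> {html.escape(value)}</p>"
        for key, value in record.items()
    )
    page.write_text(f"<html><body><h1>{html.escape(record['name'])}</h1>{body}</body></html>")
    index_links.append(f'<li><a href="{page.name}">{html.escape(record["name"])}</a></li>')

# An index page that ties the pages together into a small "Digital Library."
(output_dir / "index.html").write_text(
    "<html><body><h1>Digital Library</h1><ul>" + "".join(index_links) + "</ul></body></html>"
)
```

The resulting folder needs no database or application server; it can be opened locally in a browser or copied to any web server.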
Trade-Offs
Databases are transactional and can be used for real-time operational data. Data Compilers, by contrast, don't usually work well in real-time implementations.
Data Compilers are a batch-based paradigm. They take some time to compile, and that time can grow based on the quantity and/or complexity of the input. However, unlike databases, which require you to design and use a Schema to control data structure, Data Compilers let you change Data structure at will, to meet your needs, without complex, time-consuming, and expensive modeling.
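A small sketch of that schema-free flexibility (again, purely illustrative and not tied to any specific tool): a new attribute shows up in the input Data, and the next compile simply picks it up, with no schema change or migration:

```python
old_data = [{"name": "srv-01", "status": "Active"}]
new_data = [{"name": "srv-01", "status": "Active", "datacenter": "NYC"}]  # field added at will

def compile_rows(records):
    """Derive the output structure from whatever attributes the Data actually carries."""
    columns = sorted({key for record in records for key in record})
    header = " | ".join(columns)
    rows = [" | ".join(str(record.get(col, "")) for col in columns) for record in records]
    return "\n".join([header] + rows)

print(compile_rows(old_data))  # name | status
print(compile_rows(new_data))  # datacenter | name | status  (new column appears automatically)
```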
Databases, the infrastructure associated with them, and all the development you need to do around them yield far more complex solutions that take longer to implement and change, and that are much more expensive. Data Compilers are far simpler, quicker to produce results, and far cheaper.
Summary
Data Compilers are tools used to dynamically and automatically generate complex Data and Information artifacts like Documents, Web Sites, and Digital Libraries, directly from Data and from Processing Rules. They are not meant to replace database-oriented systems but act as alternatives that help solve certain problems more efficiently and effectively, especially in the areas of Business Intelligence, Analytics, and Knowledge Management.