
Bring Unprecedented Order To Your Data At Any Scale

by David Enga

We live in the era of exploding volumes of data. In many ways, data is eating the world.

We believe that data-centric organizations will accomplish extraordinary things when they have an exponentially better way to organize their data. Not many people realize that computer science is still in its infancy and that there are many new breakthrough algorithms yet to be discovered. Our customers know, because we've shared with their CTOs, Chief Product Officers, Chief Data Officers, and Software Architects how our remarkable algorithms bring order to data at any scale. We call the technology Black Forest.

Today, data-centric organizations do not have the right algorithms for organizing, querying, and integrating data so that they can efficiently unlock its value. For instance, imagine trying to find a book in the 190-million-item Library of Congress without a card catalog.

Sure, a staff of ten million librarians could split up the 190 million books so that each librarian only needs to look at 19 books to find the book you want. But the search would still take the same total amount of work (190 million books looked at) and cost an outrageous amount. This approach seems comical, until you realize that this is exactly how many data management tools operate, because they can't maintain indexes (digital card catalogs) at scale.
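The arithmetic behind the librarian example can be checked with a short Python sketch. The constant names are illustrative, and the perfectly even split of books across librarians is a simplifying assumption.

```python
# Figures taken from the librarian example in the text.
TOTAL_BOOKS = 190_000_000
LIBRARIANS = 10_000_000

# Assumption: the books are divided evenly among the librarians.
books_per_librarian = TOTAL_BOOKS // LIBRARIANS

# Splitting the work shortens the wait, but every book is still examined once.
total_books_examined = books_per_librarian * LIBRARIANS

print(f"Books per librarian:  {books_per_librarian}")
print(f"Total books examined: {total_books_examined:,}")
```

Adding librarians reduces the time each one spends, not the total effort: the work remains proportional to the size of the collection.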

The following graph shows the notional mathematical differences between various ways of organizing and querying data at different levels of scale.

Relational databases have long been able to efficiently index data using B-trees and other indexing technology. These indexes are very efficient, with a computational cost of O(log N). However, these types of indexes become too expensive to maintain when you reach enough scale. At some point, shown by the red rectangle in the illustration, it becomes too expensive to manage the underlying data structure. This problem is why Google didn't use relational databases to power their search engine. Instead, they custom-built their own distributed indexing software with just the limited functionality they really needed: mapping a keyword to the pages it occurs on.
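To make "map a keyword to the pages it occurs on" concrete, here is a minimal single-machine sketch of that kind of index, commonly called an inverted index. It is only an illustration of the idea, not Google's software and not Black Forest; the function name, the sample pages, and the whitespace tokenization are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each lowercase word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        # Assumption: words are separated by whitespace, with no punctuation handling.
        for word in text.lower().split():
            index[word].add(page_id)
    return index


# Hypothetical sample pages, for illustration only.
PAGES = {
    "page-1": "data is eating the world",
    "page-2": "indexes bring order to data",
    "page-3": "a card catalog is an index",
}

index = build_inverted_index(PAGES)

# A keyword lookup is a single dictionary access, not a scan of every page.
print(sorted(index["data"]))
print(sorted(index["catalog"]))
```

The lookup is cheap because the work was done when the index was built. Keeping such a structure up to date as the collection grows is the maintenance cost the paragraph above describes.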

To achieve enough performance at scale, many products take a massively parallel processing O(N) approach. Massively parallel processing sounds very cool. However, this brute-force approach is the same as if every query were a dreaded full table scan. By applying massive computational resources, these products can provide fairly fast results by checking each of the 190 million items to see whether it is the book you are looking for. Hmm, our ten million librarians were doing massively parallel processing! Massively parallel processing is very cool for certain tasks, but it is a very uncool (literally uncool, because of the heat produced by the computation) and unsustainable way to query data.

What if, instead of a query effort cost of 190 million, your book could be found by performing around 28 operations (roughly the base-2 logarithm of 190 million, which is what an O(log N) index lookup costs)? Of course, everyone would choose this approach if it were cost-effective to maintain the indexes at the necessary scale.
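The figure of about 28 operations can be reproduced with a small Python sketch. It uses plain binary search over a sorted catalog as a stand-in for an O(log N) index; this is an assumption made for illustration and is not a description of how Black Forest works. The function name is hypothetical, and positions 0 to N-1 stand in for sorted catalog entries so that no 190-million-element list has to be allocated.

```python
N = 190_000_000  # catalog size from the text


def probes_to_find(target, n):
    """Count how many entries binary search inspects to locate `target`
    among the sorted positions 0 .. n-1."""
    lo, hi = 0, n - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if mid == target:
            return probes
        if mid < target:
            lo = mid + 1
        else:
            hi = mid - 1
    raise KeyError(target)


# Worst case for binary search is floor(log2(N)) + 1 probes,
# which for a positive integer equals its bit length.
worst_case = N.bit_length()

print(f"Worst-case probes for {N:,} items: {worst_case}")
print(f"Probes to find the last item: {probes_to_find(N - 1, N)}")
```

Every lookup finishes within `worst_case` probes, compared with up to 190 million checks for a full scan.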

Maintaining indexes cost-effectively is the most important problem in all of data management, and our O(1) Black Forest algorithms can do it at any scale.

Craxel has created a suite of products with standard interfaces that you can plug into your enterprise so that you can bring unprecedented order to your data.

Black Forest Cloud Data Platform™ decouples the organization of data from compute so that you can accelerate your analytics, AI algorithms, data fabrics, and data lakes in the cloud, on-premises, or at the edge, without throwing out your investments in your favorite tools and frameworks such as Hadoop and Apache Spark. Instead of expensive and time-consuming brute-force computation, Black Forest organizes your data for rapid and cost-effective query at any scale. This is an incredibly easy way to get started with Black Forest, but we have even more ways for you to bring unprecedented order to your data...

Black Forest Database™ is an OLTP database with extraordinary performance and data security features. This product is for your hardest transactional, data security, and real-time analytics challenges.

Black Forest Distributed Ledger™ is a distributed database that provides trustless and immutable ACID transactions at unprecedented scale.

Everything we do and everything we build is about making it as easy as possible for you to bring unprecedented order to your organization's data. We believe that is fundamental to your success and the success of your data-centric organization.

Contact us at info@craxel.com to discuss how this technology can help solve your most challenging use cases.