DHEE: An Intelligent Data Orchestrator and Observer

July 2023

With the explosion of interest in AI (thanks in no small part to ChatGPT), businesses the world over are looking at how they can leverage this technology to grow. The best part is that this is not technology for technology's sake: there are myriad ways to apply it readily in a business. From empowering product, marketing and customer support teams internally, to giving management key data instantly and giving customers up-to-date information directly, the business value of AI-driven search and retrieval alone is immense.

As with every technology, of course, the devil is in the details. There is no magic wand that AI/ML can wave to produce predictive insights if relevant, clean and sufficient data is not centrally available and ready to be used. This is where the legacy data warehouses of yesterday are giving way to modern cloud data lakes (Azure, Google, AWS, Databricks, Snowflake etc.). Many of these platforms promise key benefits such as:

  • the ability to secure and centralize all data in a cost-effective data lake
  • access for many stakeholders: data scientists, analysts and so on.

It is a natural product extension for these same platforms to then provide standard machine-learning models and intuitive (conversational) user interfaces. As they like to say, ‘The hottest new programming language is English’, meaning that the learning curve for asking questions of the data and getting meaningful answers is flattening, making this accessible all the way up to business users.

So, to begin with, how do we help organizations get data from their multiple sources, in varied formats, into a central cloud data lake platform? From data in their ERPs (Oracle, SAP), to online data (web clicks, Google Analytics), to modern CRM data (Salesforce), the formats and types vary and are owned by different departments.

The team at DataLens, a born-in-the-data-cloud company, has had the privilege of helping many large enterprises build connectors, data validators, automated data pipelines and BI templates. Born out of these experiences is ‘Dhee’ (intelligence in Sanskrit), a ‘data project accelerator’ that helps customers migrate, validate and transform their data to a central platform.

The Dhee Value Proposition

Many of the features of Dhee – Connectors, DQM, Orchestration and Observability – are available in many products and platforms today. So what makes Dhee different?

First, Dhee is not a ‘SaaS only’ platform. Customers can install Dhee inside their own cloud account and customize it for their requirements. Dhee does not store customer data; it works with AWS, Azure or Databricks data lakes.

The key features of Dhee are:

Connectors

Dhee is built on open source and adds connectors every day; there are connectors for just about every data source, and where one is missing, it can be built quickly.

Data Quality

Built on open source, Dhee offers an intuitive, low-code interface for data engineers to build their data validation, cleaning and transformation routines. This addresses a key project inhibitor, i.e., the time taken to build and use data quality routines.
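To make the idea of a data quality routine concrete, here is a minimal sketch in plain Python. It is not Dhee's interface or code; the `Rule` class, the `validate` function, the rule names and the sample rows are all hypothetical, chosen only to show the kind of check-and-split logic a low-code tool would generate or wrap.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


@dataclass
class Rule:
    """A named check that returns True when a row passes (hypothetical helper)."""
    name: str
    check: Callable[[Row], bool]


def validate(rows: list[Row], rules: list[Rule]) -> tuple[list[Row], list[tuple[int, str]]]:
    """Split rows into clean rows and (row_index, failed_rule_name) pairs."""
    clean: list[Row] = []
    failures: list[tuple[int, str]] = []
    for index, row in enumerate(rows):
        failed = [rule.name for rule in rules if not rule.check(row)]
        if failed:
            failures.extend((index, name) for name in failed)
        else:
            clean.append(row)
    return clean, failures


# Illustrative rules: a required key and a non-negative numeric amount.
RULES = [
    Rule("customer_id_present", lambda r: bool(r.get("customer_id"))),
    Rule(
        "amount_non_negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]

# Made-up sample data, not taken from any real system.
SAMPLE_ROWS = [
    {"customer_id": "C001", "amount": 120.5},
    {"customer_id": "", "amount": 40},
    {"customer_id": "C003", "amount": -5},
]

clean_rows, failed_checks = validate(SAMPLE_ROWS, RULES)
```

Keeping each rule as a named, independent check means the rejected rows can be reported by rule, which is usually what a data engineer needs when deciding whether to fix the source or relax the rule.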

Multi-Platform Orchestration

Dhee works with AWS, Databricks etc. Jobs built in AWS Glue or Databricks can be orchestrated from inside Dhee. This also allows for enhanced logging across platforms, as we will see in the next point.

Observability

Dhee is built for easy monitoring and control of job and pipeline status. It captures logs from the AWS and Databricks platforms and also lets data engineers script detailed logging mechanisms for visibility. Coupled with a graphical interface (Grafana), Dhee’s monitoring capabilities stand out from other platforms that offer similar features.
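As an illustration of what a scripted logging mechanism might look like, the sketch below emits one JSON line per pipeline step using only the Python standard library. It does not show Dhee's logging API; the `job_event` function, the logger name and the field names (`job`, `step`, `status`, `rows_in`, `rows_rejected`) are assumptions made for this example.

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")  # hypothetical logger name


def job_event(job: str, step: str, status: str, **fields) -> str:
    """Log one pipeline event as a single JSON line and return that line."""
    record = {"ts": time.time(), "job": job, "step": step, "status": status, **fields}
    line = json.dumps(record, sort_keys=True)
    logger.info(line)
    return line


# Example event for a made-up job; extra keyword arguments become extra fields.
line = job_event("orders_daily", "validate", "succeeded", rows_in=3, rows_rejected=2)
```

Emitting one self-describing record per step keeps events from different platforms in a common shape, so they can be filtered by job, step or status in whichever dashboarding tool sits on top.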

Above all, Dhee is a constantly evolving platform, built from the DataLens team’s field experience and continually gaining features built for customers. This ‘built-by-engineers-for-engineers’ approach is the real value that Dhee provides to customers.

Follow us on LinkedIn for more updates.