Organizations face increasing competition and compressed time frames that require intelligent use of all available data. Reports of yesterday’s sales and operational figures must be accompanied by analyses of a variety of structured and unstructured data, including streams of sensor data, social media posts, and data from many other internal and external sources.
Organizations also need to perform sophisticated artificial intelligence and machine learning (AI/ML) analyses on the data they collect, and they must be able to do so in real time. Maintaining separate systems for each of these requirements makes it difficult to be agile and responsive in today’s fast-moving markets. New data and analytics architectures are now available that can support all of these requirements on a common platform.
Line-of-business functions have traditionally relied on data warehouses to provide information organized in a business-oriented format, ready for analysis. More recently, data lakes have emerged as a way to manage massive amounts of data, including unstructured data. Organizations need both of these capabilities, and the two are often used together. In fact, nearly three-quarters (72%) of our research participants report that their data lakes are associated with their data warehouses. In nearly one-quarter of cases (23%), the data lake serves as a superset of these capabilities, with the data warehouse built within the data lake, giving rise to the term “lakehouse.” This combination streamlines integration, reducing complexity and minimizing both the risk of inconsistencies and any delay in accessing the information.