In computing, a data warehouse is a data management system designed specifically to support business intelligence tasks, particularly analytics. Data warehouses play a vital role in helping managers make strategic decisions for their companies, since they assist in the large-scale collection of data (Moscoso-Zea et al., 2018). The idea of a data warehouse has been around since the 1980s, when it was introduced to move data out of the systems that power day-to-day operations and into decision support systems that reveal business insight (Bhatia, 2019). The vast amount of data in these systems originates from various sources, including internal applications such as finance, sales, and marketing. This essay explores how data warehousing can be used to manage structured data by examining its origin, its categories, and its relevance in the 21st century.
Origin of Data Warehousing
Data warehousing has a rich history that can be traced back to the early days of computing. Its architecture was created in the 1980s to facilitate converting data from operational systems into models that are essential for decision-making (Bhatia, 2019). The development of relational databases in the early 1980s paved the way for this kind of data processing. It was quickly realized that databases designed to be efficient for transactional processing were not necessarily suited to sophisticated reporting or analytical requirements. The demand for decision support systems, however, precedes the first relational model and Structured Query Language (SQL) (Costa & Santos, 2018). At the beginning of the 1970s, ACNielsen, a market research and television ratings giant, offered customers a concept referred to as a “data mart” to boost their sales initiatives. Present-day data warehousing saw its inception in the late 1980s, when the phrase “business data warehouse” was coined in an International Business Machines (IBM) Systems Journal article published in 1988 (Hooper, 2018). The birth and growth of data warehousing is often attributed to one pioneer in particular.
Data warehousing is considered today a significant component of computing. Bill Inmon is regarded by many as the founder of data warehousing, since he formulated its principles and coined its name (Bhatia, 2019). In 2007, Computerworld recognized Inmon as one of the figures who mattered most in computing over the previous forty years. Throughout the 1970s and 1980s, Inmon worked as a data specialist, refining his competence in all categories of data modeling (Costa & Santos, 2018). His work as a data warehousing innovator started in the early 1990s, when he created his firm, Prism Solutions. Moreover, he published one of his most influential volumes (Bhatia, 2019). Later, Inmon formulated the concept of the Corporate Information Factory (CIF), an enterprise-level view of a firm’s data in which data warehousing plays a significant part. His website dedicated to the CIF serves as a source for Inmon’s writing and white papers on all aspects of his work. Inmon’s approach to data warehousing emphasizes a centralized data repository modeled in third normal form. In this way, Inmon laid the foundation for the modern-day use of data warehouses across business industries.
Categories of Data Warehousing
Data warehouses fall into several categories. The three main types are the enterprise data warehouse, the operational data store, and the data mart. An enterprise data warehouse is a unified warehouse that offers decision support services across the firm. It provides a consistent approach for organizing and representing data, and it can categorize data by subject and grant access according to those divisions (Chandra & Gupta, 2018). The second category, the operational data store (ODS), is used when a data warehouse does not meet a company’s reporting needs. An ODS is refreshed in real time, which makes it a preferred choice for routine tasks such as storing employee details. Lastly, a data mart is a subset of the data warehouse designed for a specific business area such as sales or finance (Bhatia, 2019). In an independent data mart, data can also be collected directly from the sources.
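The idea of a data mart as a subset of the warehouse can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for a real warehouse platform; the table name warehouse_facts, the sample rows, and the helper create_data_mart are hypothetical and chosen only for illustration. The data mart is modeled here as a view that exposes a single department's rows.

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Create an in-memory stand-in for an enterprise data warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE warehouse_facts ("
        " department TEXT NOT NULL,"
        " metric TEXT NOT NULL,"
        " amount REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
        [
            ("sales", "revenue", 1200.0),
            ("sales", "revenue", 800.0),
            ("finance", "expenses", 450.0),
            ("marketing", "ad_spend", 300.0),
        ],
    )
    return conn


def create_data_mart(conn: sqlite3.Connection, department: str) -> str:
    """Expose one department's rows as a view (a hypothetical data mart)."""
    # SQLite does not allow bound parameters inside a view definition, so the
    # department name is validated before being placed in the SQL text.
    if not department.isidentifier():
        raise ValueError(f"invalid department name: {department!r}")
    view_name = f"{department}_mart"
    conn.execute(
        f"CREATE VIEW {view_name} AS "
        f"SELECT metric, amount FROM warehouse_facts "
        f"WHERE department = '{department}'"
    )
    return view_name


warehouse = build_warehouse()
sales_mart = create_data_mart(warehouse, "sales")
sales_total = warehouse.execute(
    f"SELECT SUM(amount) FROM {sales_mart}"
).fetchone()[0]
print(sales_mart, sales_total)
```

Users of the sales mart see only sales rows, while the warehouse table remains the single source of the data. A production system would more likely use separate schemas or materialized tables than a plain view.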
Data Warehousing in the 21st Century
Today, the practice of data warehousing has incorporated new trends and changes. For example, cloud computing has been a revolutionary force in reshaping data warehousing worldwide. In addition, real-time data analysis has contributed to significant changes in the way people manage data. From an end user’s perspective, web-based and mobile access are considered imperative requirements on many projects (Costa & Santos, 2018). Improvements in ontology practice have strengthened the ability of extract, transform, load (ETL) systems to extract information from both structured and unstructured data sources.
Big data also shapes modern-day practices of data warehousing. Since compliance became a significant factor with the advent of the Sarbanes-Oxley Act, data quality and governance have grown in importance in the handling of data warehouses (Bhatia, 2019). In essence, data warehousing relies heavily on solid enterprise integration, like any other component of data management practice. Whether a firm adheres to Inmon’s top-down, centralized view of warehousing or to Kimball’s bottom-up star-schema technique, integrating the warehouse with the corporation’s data architecture remains a pivotal principle (Costa & Santos, 2018). Three decades after the inception of data warehousing, the principles set out by Bill Inmon and Ralph Kimball are still relevant in the industry. Their seminal contributions to computing have defined a section of the data profession that continues to develop today.
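To make the star-schema technique mentioned above more concrete, the following sketch builds a very small star: one fact table surrounded by two dimension tables. It assumes Python's built-in sqlite3 module; the table names (fact_sales, dim_product, dim_date), the columns, and the sample rows are hypothetical and serve only as an illustration, not as a reproduction of any schema from the cited sources.

```python
import sqlite3

STAR_SCHEMA = """
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    category   TEXT NOT NULL
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,
    year    INTEGER NOT NULL,
    quarter INTEGER NOT NULL
);
-- The fact table sits at the center of the star and references each dimension.
CREATE TABLE fact_sales (
    product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
    date_id    INTEGER NOT NULL REFERENCES dim_date(date_id),
    units      INTEGER NOT NULL,
    revenue    REAL NOT NULL
);
INSERT INTO dim_product VALUES
    (1, 'laptop', 'electronics'),
    (2, 'phone', 'electronics'),
    (3, 'desk', 'furniture');
INSERT INTO dim_date VALUES (1, 2023, 4), (2, 2024, 1);
INSERT INTO fact_sales VALUES
    (1, 1, 2, 2000.0),
    (2, 1, 5, 2500.0),
    (3, 2, 1, 300.0),
    (1, 2, 1, 1000.0);
"""


def revenue_by_category_and_year(conn: sqlite3.Connection) -> list:
    """Join the fact table to both dimensions and aggregate revenue."""
    query = """
        SELECT p.category, d.year, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_date AS d ON d.date_id = f.date_id
        GROUP BY p.category, d.year
        ORDER BY p.category, d.year
    """
    return conn.execute(query).fetchall()


star = sqlite3.connect(":memory:")
star.executescript(STAR_SCHEMA)
report = revenue_by_category_and_year(star)
for row in report:
    print(row)
```

Analytical queries in this layout follow one pattern: start from the fact table, join outward to the dimensions that supply the labels, and aggregate the measures.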
Data Warehouse for Structured Data
Structured data is a category of data that follows a predefined data model and is therefore easy to examine. It follows a tabular format with relationships between columns and rows. Excel spreadsheets and SQL databases are common examples of structured data. Structured data is predicated on the presence of a data model, which is a description of how data may be stored, analyzed, and retrieved. Because of this model, each field is distinct and may be accessed independently or in conjunction with data from other fields (Bhatia, 2019). Structured data is powerful since it enables individuals to easily combine data from multiple areas of the database. It is also regarded as the most “conventional” form of storing data, since the earliest database management systems were able to store, analyze, and access structured data.
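As a small illustration of what a predefined data model provides, the sketch below declares two record types in Python and then reads one field on its own and combines fields across the two sets of records. The Employee and Department types, their fields, and the sample values are hypothetical and are used only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    """One row of a hypothetical employee table."""
    employee_id: int
    name: str
    department_id: int


@dataclass(frozen=True)
class Department:
    """One row of a hypothetical department table."""
    department_id: int
    title: str


def join_employees_to_departments(employees, departments):
    """Combine fields from both tables using the shared department_id key."""
    titles = {d.department_id: d.title for d in departments}
    return [
        (e.name, titles[e.department_id])
        for e in employees
        if e.department_id in titles
    ]


employees = [
    Employee(1, "Ada", 10),
    Employee(2, "Grace", 20),
    Employee(3, "Edgar", 10),
]
departments = [Department(10, "Finance"), Department(20, "Sales")]

# A single field can be read on its own because the model names it.
names = [e.name for e in employees]

# Fields from different tables can be combined through a shared key.
staff_directory = join_employees_to_departments(employees, departments)
print(names)
print(staff_directory)
```

Because every record has the same named fields, both operations work without inspecting the individual records first, which is what makes structured data straightforward to query.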
Today, cloud computing has revolutionized the way structured data is stored, processed, and accessed. Cloud data warehouses collect data from multiple sources into a centralized hub to support various enterprise needs, analytics, and visualization (Chandra & Gupta, 2018). The new generation of data warehouses is built to run in the cloud rather than requiring a corporation to own on-premises servers. They are provided to customers as a managed service, with the physical infrastructure maintained by the cloud provider. Customers therefore do not need to invest in hardware or software, nor do they need to worry about server maintenance. In recent years, the use of cloud-based databases has become widespread as more firms adopt cloud services and seek to minimize or eliminate their on-premises investments (Hooper, 2018). These platforms come with several advantages, such as scalability, cost savings, speed, security, and availability.
Reasons for Storing Structured Data
It is essential for individuals and organizations to store structured data, even though doing so is challenging and resource-intensive because of the data’s increasing volume and complexity. Data professionals store structured data for various reasons. For example, it facilitates easy manipulation and querying of raw information (Mehmood & Anees, 2019). One of the leading benefits of storing structured data is that it can be easily used by machine learning algorithms. The specific and orderly nature of this form of data also makes it easy to manage. Another factor is that structured data can be easily used by business users. It also benefits from a large number of tools, since it has been in use for a long time and was historically the only alternative (Mehmood & Anees, 2019). This implies that many tools for using and examining structured data have been thoroughly tested. In essence, data managers have a wide range of products to choose from when working with structured data.
Despite the benefits of storing structured data in data warehouses, there are several limitations. For example, the predefined nature of structured data limits its use. Since this category of data is stored in data warehouses or relational databases, the inflexibility of storage, as seen in the rigid schemas of these systems, is a drawback (Mehmood & Anees, 2019). If an organization changes its data requirements, it will probably have to update all of its structured data. The update process, in turn, consumes additional resources and time. Time and cost are therefore the key factors that suffer when the platforms in which structured data is stored are altered. However, some expenses can be eliminated by using a cloud-based data warehouse, since it facilitates scalability and removes the maintenance costs of hosting on-premises servers.
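The cost of changing a fixed schema can be seen in a minimal sketch. It assumes Python's built-in sqlite3 module and a hypothetical orders table; the new currency column and its default value are likewise invented for the example. Adding the column is a single statement, but giving the existing records a meaningful value means rewriting every stored row.

```python
import sqlite3


def create_orders_table() -> sqlite3.Connection:
    """Create a hypothetical orders table with three existing rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " amount REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO orders (amount) VALUES (?)",
        [(19.99,), (5.00,), (42.50,)],
    )
    conn.commit()
    return conn


def add_currency_column(conn: sqlite3.Connection,
                        default_currency: str = "USD") -> int:
    """Add a column, backfill old rows, and return how many were rewritten."""
    # Step 1: change the schema. Existing rows receive NULL in the new column.
    conn.execute("ALTER TABLE orders ADD COLUMN currency TEXT")
    # Step 2: backfill. Every row stored before the change must be updated.
    cursor = conn.execute(
        "UPDATE orders SET currency = ? WHERE currency IS NULL",
        (default_currency,),
    )
    conn.commit()
    return cursor.rowcount


orders = create_orders_table()
rows_rewritten = add_currency_column(orders)
print(rows_rewritten)
```

With three rows the backfill is trivial. On a warehouse table holding billions of rows, the same update is the step that consumes the time and resources described above.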
Relevance of Data Warehousing for Structured Data in Real-Life
Data warehousing has been widely used to store, examine, and process structured data. This function is imperative in society since it improves practices in multiple industries. For example, in the airline industry, these systems are used for operational functions such as frequent flyer program promotions, route profitability analysis, and crew assignment (Bhatia, 2019). The banking sector also uses data warehousing to manage its available resources. Some banking institutions have used these systems to perform market research and performance analysis of products and operations. A banking data warehouse plays an intermediary role, acting as the link between operational data and the professionals who rely on it day to day. In the healthcare sector, data warehousing also contributes to development and growth. For example, it can be used to forecast outcomes, inform decision-making, and generate patient treatment reports (Chandra & Gupta, 2018). It is also used to share data with affiliated insurance firms and healthcare aid services.
The public sector can also use data warehousing for intelligence gathering. In particular, it plays a vital role in enabling government bodies to maintain and monitor tax records and health policy records (Bhatia, 2019). The investment and insurance industries have also benefited from data warehouses, which are used to examine data and consumer trends and to monitor market movements. The retail sector is one of the leading users of data warehousing. Retailers use it for distribution and promotions. It also assists in monitoring inventory, tracking customer buying habits, and determining pricing policy.
Moreover, supermarkets and other grocery stores can use data warehousing for improved business intelligence. Since these systems extract data from multiple divisions and organize it in the form of dashboards, retailers can use them for informed decision-making. The self-service portal provided by a data warehouse allows individuals to segment and analyze data, which enables firms to generate critical business insights (Bhatia, 2019). In essence, retailers can use data warehousing for demand forecasting and for scaling operations. Retailers require demand forecasting since their scope of operations changes throughout the year depending on the season. For example, a significant portion of a retail corporation’s annual revenue is often attributed to the holiday season. As a result, it is imperative for retail outlets to strengthen their position to meet the increased demand during peak seasons and to adjust accordingly to minimize losses.
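The seasonal pattern described above can be quantified with a simple aggregation, which is typically the first step before any forecasting model is applied. The sketch below is written in plain Python; the sample sales figures, the function names, and the choice of November and December as the holiday months are assumptions made purely for illustration.

```python
from collections import defaultdict


def monthly_totals(sales):
    """Sum revenue per calendar month from (month, revenue) records."""
    totals = defaultdict(int)
    for month, revenue in sales:
        totals[month] += revenue
    return dict(totals)


def holiday_share(totals, holiday_months=(11, 12)):
    """Return the fraction of annual revenue earned in the holiday months."""
    annual = sum(totals.values())
    if annual == 0:
        return 0.0
    holiday = sum(totals.get(month, 0) for month in holiday_months)
    return holiday / annual


# Hypothetical sales records: (month number, revenue in whole currency units).
sample_sales = [(1, 100), (6, 150), (11, 200), (11, 100), (12, 450)]

totals = monthly_totals(sample_sales)
share = holiday_share(totals)
print(totals)
print(share)
```

In this invented data set the holiday months account for three quarters of annual revenue. A retailer seeing a figure of that kind in its warehouse would have a clear signal to plan stock and staffing around the peak.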
Data warehousing remains an imperative driver of growth in multiple industries. The origin of this computing resource dates back to the 1980s, when it was created to enable professionals to convert data from operational systems into models essential for making informed decisions. Bill Inmon is regarded as an influential figure who coined the term “data warehousing” and published various seminal works that have had a lasting impact in the 21st century. Data warehouses have also benefited multiple industries, especially since they help in storing, analyzing, and accessing structured data. For example, data warehousing has been used effectively in the healthcare, banking, airline, and retail sectors to make informed decisions and manage employee data. In essence, the concept of data warehousing is still important for society irrespective of its limitations. Therefore, IT professionals should continue research and development initiatives to discover and exploit opportunities for improving data warehousing architecture.
Bhatia, P. (2019). Data mining and data warehousing: Principles and practical techniques. Cambridge University Press.
Chandra, P., & Gupta, M. K. (2018). Comprehensive survey on data warehousing research. International Journal of Information Technology, 10(2), 217–224.
Costa, C., & Santos, M. Y. (2018). Evaluating several design patterns and trends in big data warehousing systems. International Conference on Advanced Information Systems Engineering, 459–473.
Hooper, L. (2018). Becoming a warehouse of things: The audio world is changing, and collection development methods must change, too. Music Reference Services Quarterly, 21(3), 111–121.
Mehmood, E., & Anees, T. (2019). Performance analysis of not only SQL semi-stream join using MongoDB for real-time data warehousing. IEEE Access, 7, 134215–134225.
Moscoso-Zea, O., Paredes-Gualtor, J., & Luján-Mora, S. (2018). A holistic view of data warehousing in education. IEEE Access, 6, 64659–64673.