My First Look Into DW-BI
Data Warehousing and Business Intelligence (DW-BI)
I really do suppose it’s time for me to talk about this topic now, I’ve had these tabs pulled up on Chrome for close to a month now and I kept telling myself, “I’ll write about it today… no today… nah tomorrow” And now a good month later, I’m finally talking about DW-BI. Data Warehousing and Business Intelligence. The more I read into it, the more I realized just how confused I was, the more I realized it was such a huge industry within the database field itself and why it is always data warehousing AND business intelligence. Sure, they’re still independent and individual entities, but together they come in perfect harmony, just like “Ebony and Ivory”!
*Cue terrible laughter and comments on how you expected a different GIF or reference… I’ll take pity laughter, it’s still appreciated*
Anyways, DW-BI become such an interesting topic to me that I spent an unfortunate and forever undisclosed amount of money on getting some actual PHYSICAL books and resources to read, mainly based off of the Kimball method which is the leading practice and go-to resource regarding data warehousing’s best practices. Now let’s do some definition-work! Business Intelligence refers to a set of methods and techniques used by organizations to help with tactical and strategic decision making, based off of analytics, trends, and statistical values conducted using data warehousing and database methods. It emphasizes the use of technologies that focus on counts, statistics, and business metrics to help improve an organization’s performance and standards!
Data Warehousing is a monster in itself I suppose. I guess the easiest way to describe it consolidating data from a variety of sources and tables meant to help support strategic and tactical decision making. If you read my first post about Database Normalization you should be familiar with relational databases and how they are normalized to prevent any redundancies and irrelevant columns in a table, as well as ensuring that they are all easily connected and “related” through foreign keys. However, in large situations and organizations having multiple tables parsed and queried can increase the processing time and resources thus making it cumbersome for quick searches, as a result data warehousing essentially “denormalizes” these tables into larger tables where there CAN be redundant data but still relies on one value per row and column, but with the redundancies more things can be pulled up and data can be found in a shorter amount of time! In a sense, if you’re trying to select all of the rows of data in a table regarding hospitals that use Epic, it’s easier to get a more coherent picture and view of those hospitals in a shorter span of time since it isn’t querying and cross referencing various tables to look for rows of data with the same foreign key.
DW-BI has been around for decades now, but it didn’t really grow and gain the attention until maybe the 90’s and early 2000’s? Before then, information was operated in large mainframe-based applications that were really designed just for operational, data entry purposes. Programmers and developers thus had a need to address the constant paper reports and worked to write report programs that could be generated based on queried data and information from databases. After that, relational databases became a huge industry where data could be stored and organized in these tables instead of the vast Excel spreadsheets and even hand written plots as well. Although it was great and easy to read on an end-user side, there was often complex logic required to join and related each of the tables, often done by taking courses or learning the intricacies of database design and understanding normalization like the back of your hand. When it came time to generate reports, it required multiple queries that soon became inefficient and a burden.
Overall, data warehouses are subject-oriented, integrated, time-variant, and non-volatile. This makes them focused on a business concept (such as sales) as opposed to a business process (issuing confirmation emails) and contains all the relevant information on the concept gathered from multiple processing systems.
Companies started to invest large sums of money in order to support up-to-date, accurate and integrated information about key metrics and indicators of their organization. As free trade boomed, globalization and computerization really increased with the technological changes that the world was facing. The Internet was finally pushed out into the public and companies rushed to try and make their way into the eCommerce marketplace and eBusiness industries as well to reduce staff needs and provide a greater range and variety of services. With multiple tables and now an increasing company and web presence, the databases were becoming expensive, poorly integrated and as a result inconsistent across systems, users, and customers. As a result, certain elements were pushed to help establish data warehouses today.
- Single Version – Consistent and Updated. Meant to reduce the fragmentation and inconsistency of data in multiple databases across the organization did not allow accurate strategic and tactical decision making.
- Flexibility To Changing Information Needs – With the changing Internet and growing trends, it is important for organizations to be able to adapt to many different trends as needed. As a result, it is easier to slice and dice data than trying to push toothpaste back into the tube. By doing so, it can offer good customer service by retaining simple and easy to access data to their needs… as needed hehe. This point is greatly improved by business intelligence and its objective of better understanding customers, to make the supply/distribution chain more efficient, and to identify and address business problems and opportunities quickly.
- Although organizations may not be using connected technologies and application systems, it is still easy to manage because all of the information and data can end up in the back end databases! All the matters is the correct integration of data into the appropriate and meaningful fields. It allows the flexibility for integration, however the preferred practice is too sync data through complete integration.
- Easy Reporting – Don’t have to say much, by consolidating and combining what would have been several tables together, it’s easier to generate reports because queries do not have to be so complex and long in order to pull the appropriate information. It simplifies the reporting environment.