EDW Basics
The Enterprise Data Warehouse, or EDW, is a large, centralized database that combines data from multiple sources. It serves two main purposes: (1) it acts as a consolidated repository for Davidson's data to enable better reporting and analytics across the college, and (2) it helps ensure accountability and transparency for all of our institutional data assets.
Reporting & Analytics
Over the years, Davidson has developed or acquired numerous systems to supplement Banner. These systems include Slate (admissions), Blackbaud (advancement/college relations), Etrieve (document management), and EMS (space management). These systems all have their own databases, some of which reside on different servers and use different technology from Banner. The EDW is designed to unify these diffuse data sources on one server, in one database, under a common structure.
Person data is a good example. Personal details are stored in several of these systems: Blackbaud stores alumni and donors, Slate stores applicants, and Banner stores students and employees. Each database stores these details differently, and the systems aren't designed to talk to each other, so tracking a person's data across them can be difficult and time-consuming. The EDW helps us solve this problem. We create a single table to store all personal data, and we populate that table using data pipelines that extract the personal info from each system, transform the data so that all of the fields are mapped and formatted correctly, and load the fields into the EDW. Once the data is loaded, you can see all of a given person's data from across multiple systems in one location.
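As a rough illustration of the extract, transform, and load steps described above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the sample rows, the field names, the person table layout, and the choice to match people by email address are all hypothetical and do not reflect the actual Banner or Slate schemas or the real EDW pipelines. An in-memory SQLite database stands in for the EDW.

```python
import sqlite3

# Hypothetical extracts. Real source systems use their own field names and formats.
banner_rows = [
    {"id": "B001", "first_name": "Ada", "last_name": "Lovelace",
     "email": "ALovelace@Example.edu"},
]
slate_rows = [
    {"ref": "S-77", "name": "Ada Lovelace", "email": " alovelace@example.edu "},
]


def normalize_email(raw: str) -> str:
    """Format emails the same way regardless of which system they came from."""
    return raw.strip().lower()


def transform_banner(row: dict) -> tuple:
    """Map a (hypothetical) Banner row onto the common person structure."""
    full_name = f"{row['first_name']} {row['last_name']}"
    return ("Banner", row["id"], full_name, normalize_email(row["email"]))


def transform_slate(row: dict) -> tuple:
    """Map a (hypothetical) Slate row onto the common person structure."""
    return ("Slate", row["ref"], row["name"].strip(), normalize_email(row["email"]))


def load(conn: sqlite3.Connection, records: list) -> None:
    """Load the transformed records into a single person table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS person ("
        "source_system TEXT, source_id TEXT, full_name TEXT, email TEXT)"
    )
    conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?)", records)
    conn.commit()


def build_person_table() -> sqlite3.Connection:
    """Run extract, transform, and load against an in-memory stand-in for the EDW."""
    conn = sqlite3.connect(":memory:")
    records = [transform_banner(r) for r in banner_rows]
    records += [transform_slate(r) for r in slate_rows]
    load(conn, records)
    return conn


def find_person(conn: sqlite3.Connection, email: str) -> list:
    """Return every source record for one person, matched here by email."""
    cur = conn.execute(
        "SELECT source_system, source_id, full_name FROM person "
        "WHERE email = ? ORDER BY source_system",
        (normalize_email(email),),
    )
    return cur.fetchall()
```

Calling find_person(build_person_table(), "alovelace@example.edu") returns one row per source system for the same person, which is the "all in one location" view the person table is meant to provide.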
The EDW copies data from other systems. It does not write data back to these systems. Furthermore, users cannot write data to the EDW -- they can only query data from it using reporting tools like Power BI and Tableau. If a particular data point is incorrect, the user needs to identify which source system that data point originated from, then fix the data at the source system. The next time the EDW refreshes its data from the source system, it will bring in the correct data.
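The one-way flow described above can be sketched in a few lines of Python. This is a simplified model, not the actual refresh process: it assumes each refresh takes a complete snapshot of the source, and the record ID and email values are invented for the example.

```python
import copy

# Hypothetical source-system record containing a typo in the email address.
source_system = {"B001": {"email": "alovelce@example.edu"}}


def refresh(source: dict) -> dict:
    """Return a fresh copy of the source data. The source itself is never modified."""
    return copy.deepcopy(source)


# Initial load: the EDW copy picks up the typo from the source.
edw_copy = refresh(source_system)

# The correction is made at the source system, not in the EDW.
source_system["B001"]["email"] = "alovelace@example.edu"

# Until the next refresh, the EDW copy still holds the old value.
value_before_refresh = edw_copy["B001"]["email"]

# The next refresh brings in the corrected value.
edw_copy = refresh(source_system)
value_after_refresh = edw_copy["B001"]["email"]
```

In this sketch, value_before_refresh still contains the typo, while value_after_refresh holds the corrected address.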
Data Governance
Davidson's data governance framework is the set of policies and procedures that we use to determine where data resides, what it means, who owns it, who has access to it, and related questions. It's an integral part of any enterprise's technology plan.