Data Governance is the practice of ensuring that collected and stored data are subject to quality and security standards. Implementing a robust data governance architecture will support our efforts to protect customer data and will be an investment in protecting future lines of business as well.
An effective data governance architecture must cover systems, processes, accountability, employee practices, and external practices so that lapses in quality and security are minimized.
Using the DGI Data Governance Framework from The Data Governance Institute (citation), we will define the People & Organizational Bodies, the Rules & Rules of Engagement, and the Processes for data governance at AirBnB, following the framework's 10 components.
Framework Elements Of AirBnB’s Plan
Rules and Rules of Engagement
1. Mission and Vision
We are clear about what we want to accomplish and why we are implementing this plan: customer security is our priority, and we must implement a data governance plan that demonstrates, both internally and externally, our commitment to our customers. Losing customer trust means losing customers, so it is imperative for AirBnB to take these steps to renew our customers' faith in our service.
2. Goals, Metrics, Success Measures, and Funding
The area of data governance that we will focus on is data security, with the emphasis on anomaly detection. We have established key business assumptions, key business requirements, and acceptance criteria for these cybersecurity measures. A business case has been provided to AirBnB executive leadership that illustrates the negative financial impact of not implementing these measures to mitigate security risks.
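As an illustration of the anomaly detection emphasis, the following is a minimal sketch, not AirBnB's actual detection method. It assumes a hypothetical list of daily access counts for one account and flags days that deviate strongly from the median, using a modified z-score based on the median absolute deviation. The function name, the sample data, and the 3.5 cutoff are all assumptions chosen for the example.

```python
from statistics import median

def flag_anomalies(daily_counts, cutoff=3.5):
    """Return the indices of counts whose modified z-score exceeds the cutoff.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single extreme value does not hide itself by inflating the
    spread estimate.
    """
    if not daily_counts:
        return []
    center = median(daily_counts)
    mad = median(abs(c - center) for c in daily_counts)
    if mad == 0:
        # All values are (nearly) identical; nothing can be ranked as unusual.
        return []
    return [
        i for i, c in enumerate(daily_counts)
        if 0.6745 * abs(c - center) / mad > cutoff
    ]

# Hypothetical daily access counts; the last day shows a sudden spike.
counts = [100, 102, 98, 101, 99, 100, 103, 97, 100, 500]
suspicious_days = flag_anomalies(counts)
```

A median-based score is used here because, with short histories, a plain mean and standard deviation can be dominated by the very spike they are meant to detect.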
3. Data Rules and Definitions
Additional rules and standards will be put in place regarding Access Management, Data Usage, Anomaly Detection and Data Access. We will also clearly map out the data stakeholders (internal and external) that are represented and impacted by this new governance process, and ensure that any new procedures and standards are effectively communicated.
4. Decision Rights
We will develop a RACI matrix (Responsible, Accountable, Consulted, Informed) covering stakeholders and employees to define data governance responsibilities across AirBnB and with external parties.
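A RACI matrix can be represented as a simple lookup table and checked automatically. The sketch below is one possible representation, under stated assumptions: the activity names and party names are hypothetical, each party holds a single letter per activity, and the checks enforced (exactly one Accountable party and at least one Responsible party per activity) are a common convention rather than a rule taken from this plan.

```python
RACI_ROLES = {"R", "A", "C", "I"}

def validate_raci(matrix):
    """Return a list of human-readable problems found in a RACI matrix.

    `matrix` maps each activity to a dict of {party: role letter}.
    """
    problems = []
    for activity, assignments in matrix.items():
        unknown = sorted(set(assignments.values()) - RACI_ROLES)
        if unknown:
            problems.append(f"{activity}: unknown role(s) {unknown}")
        accountable = [p for p, role in assignments.items() if role == "A"]
        responsible = [p for p, role in assignments.items() if role == "R"]
        if len(accountable) != 1:
            problems.append(
                f"{activity}: expected exactly one Accountable party, "
                f"found {len(accountable)}"
            )
        if not responsible:
            problems.append(f"{activity}: no Responsible party assigned")
    return problems

# Hypothetical activities and parties, for illustration only.
raci = {
    "Approve data access policy": {
        "Data Governance Office": "A",
        "Data Steward": "R",
        "IT Security": "C",
        "Business Unit Lead": "I",
    },
    "Review anomaly alerts": {
        "IT Security": "R",
        "Data Steward": "C",
        "Business Unit Lead": "I",
    },
}

issues = validate_raci(raci)
```

In this example the second activity has no Accountable party, so the check reports one problem for it.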
5. Accountabilities
We must cultivate a culture of accountability at AirBnB that does not begin and end with IT but extends throughout the data flow. Once this data governance process is implemented, we will monitor processes and use dashboards and reports to track our progress against the plan.
6. Controls
Our data infrastructure and data rules are being modified with an emphasis on data security. Using secure systems, such as Microsoft SQL Server as our database engine and [TBD] as our data lake configuration, together with data rules focused on security, will help prevent and uncover security issues.
People and Organizational Bodies
7. Data Stakeholders
Our data stakeholder map will show who and what in the AirBnB ecosystem generates data, uses data, and makes decisions about data. This will be useful as we implement processes throughout the data flow stream, so that we can understand who is impacted and how.
8. A Data Governance Office (DGO)
A Data Governance Office (DGO) will be established in this new governance model to maximize the effectiveness and efficiency of our data governance implementation. The DGO will be responsible for managing and developing the data governance roadmap elements, advocating on behalf of stakeholders, bridging gaps between business units at AirBnB, and otherwise serving as a Center of Excellence for the enterprise.
9. Data Stewards
A critical part of our data governance process is ensuring that it is viewed as a shared responsibility across the enterprise, rather than just the responsibility of the DGO, so Data Stewards will be distributed across the business units.
Processes
10. Data Governance Processes
The DGO will be responsible for establishing the above framework and overseeing its execution at AirBnB. The combination of buy-in at the executive level and detailed data stakeholder mapping by the DGO will support a successful implementation and long-term success.
Comparison of Risk/Benefits of Four Database Engines
1. Microsoft SQL Server is a database engine that runs both in the cloud and on local servers; when properly configured, it can operate across both at the same time. It is available on both Linux and Windows platforms. Its licensing cost is high, however, and may strain the company's budget. AirBnB management can use this engine to track changes made to data over time. The engine also offers dynamic data masking, which ensures that only authorized people can see the company's sensitive information. Once it is installed, AirBnB clients will be able to access the firm's services on their mobile devices. It is also fast and stable. Nonetheless, this database system carries some risks: it can consume significant resources, and integrating services can be challenging when importing files.
2. PostgreSQL is another database engine that AirBnB can use for data protection. It is common in web applications and can manage both structured and unstructured data. The version considered here, PostgreSQL 9.5, supports large volumes of data and an increased number of concurrent users. The company will be able to increase the number of users of its database while keeping track of every user, so that the activities of any illegitimate users are promptly detected. With this system, AirBnB would benefit from improved security, including support for DBMS_SESSION and expanded password profiles, and from additional interfaces. PostgreSQL itself is open source and free to license, but support and administration costs may still be hard to absorb if the firm has a limited budget. It also carries risks such as configuration complexity. AirBnB users may also experience slow access to business information, particularly during large bulk operations.
3. AirBnB can also use MongoDB as a database system for securing its information. MongoDB stores both structured and unstructured data and is a versatile system that connects databases to applications through MongoDB database drivers. It is inexpensive. If AirBnB management decides to use this database, they will enjoy benefits such as speed and ease of use. Data stored in this system, whether structured or unstructured, can also be traced easily. The system carries risks, however: its default settings are not secured, and because it does not use SQL, it may not serve the company where a SQL query language is critical.
4. MariaDB is a free, open source DBMS and one of the fastest-growing open source databases available. When applied to data security management, it offers AirBnB fast and stable processing, which improves the firm's performance. The database also has a progress bar that lets users follow the progress of a query. However, using this engine exposes AirBnB to the risk that future updates and versions are not guaranteed. The cost of establishing this system is relatively low compared to Oracle or MySQL.
How the chosen engine will support the business needs
Microsoft SQL Server will be the ideal database engine for this company because AirBnB is working to develop a system that meets security standards: the engine offers dynamic data masking, is user-friendly, and is fast enough to meet clients' needs.
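To make the idea behind dynamic data masking concrete, the following sketch simulates its effect in application code. It is not the SQL Server feature or its API; in SQL Server the masking is applied by the engine itself. The field names, role names, and masking formats are assumptions chosen for illustration.

```python
def mask_email(email):
    """Keep the first character and the domain; hide the rest."""
    local, _, domain = email.partition("@")
    if not domain:
        return "****"
    return local[:1] + "***@" + domain

def mask_all_but_last_four(value):
    """Replace every character except the last four with '*'."""
    text = str(value)
    return "*" * max(len(text) - 4, 0) + text[-4:]

# Hypothetical mapping of sensitive fields to masking functions.
MASKS = {"email": mask_email, "card_number": mask_all_but_last_four}

def read_record(record, role, unmasked_roles=frozenset({"security_admin"})):
    """Return the record as the given role is allowed to see it."""
    if role in unmasked_roles:
        return dict(record)
    return {
        field: MASKS[field](value) if field in MASKS else value
        for field, value in record.items()
    }

guest = {
    "name": "Ada",
    "email": "ada@example.com",
    "card_number": "4111111111111111",
}
support_view = read_record(guest, role="support")
admin_view = read_record(guest, role="security_admin")
```

The stored data is never altered; only the view returned to an unauthorized role is masked, which is the property that makes this approach attractive for protecting customer information.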
Risk/Benefits of Data Lake configurations
1. The first option is delivered through mobile and web applications developed by Adobe Systems. It is inexpensive: a monthly subscription costs less than $9.99, with no annual commitment. It is simple and user-friendly.
2. As a data lake configuration component, MapReduce is a programming model for processing and generating big data using parallel, distributed algorithms on a cluster. It works by marshalling the distributed servers, running the various tasks in parallel, and managing all communications and data transfers. It is relatively expensive but performs more complex functions than Spark.
3. The Google Cloud Platform is another data lake configuration; it runs on the same infrastructure that Google uses for its end-user products. No upfront cost is required. However, using this platform poses significant threats to the security of the firm's data. If AirBnB management chooses this system, they are likely to enjoy benefits such as better pricing and improved performance.
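The MapReduce model described in item 2 can be illustrated on a single machine. The sketch below is a simplified stand-in, not Hadoop's API: threads play the role of the distributed servers, and the login events are hypothetical. It counts logins per user with separate map, shuffle, and reduce phases.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def map_phase(chunk):
    """Map: emit a (user, 1) pair for every login event in one chunk."""
    return [(event["user"], 1) for event in chunk]

def shuffle(mapped_chunks):
    """Shuffle: group all emitted values by key across every chunk."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the grouped values for each key into a total."""
    return {key: sum(values) for key, values in groups.items()}

def count_logins(chunks):
    """Run the map phase in parallel, then shuffle and reduce the results."""
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_phase, chunks))
    return reduce_phase(shuffle(mapped))

# Hypothetical login events, already split into chunks as a cluster would do.
chunks = [
    [{"user": "alice"}, {"user": "bob"}],
    [{"user": "alice"}, {"user": "carol"}, {"user": "alice"}],
]
totals = count_logins(chunks)
```

On a real cluster the chunks would live on different machines and the framework would handle scheduling, data transfer, and recovery from failed workers.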
How the chosen Configuration will support the business needs
Google Cloud Platform is the best data lake configuration for AirBnB. The company is likely to benefit from the openness, flexibility, and low cost of this platform. In addition, it is user-friendly and is likely to meet the needs of all of the company's clients across the globe.
Hadoop is another data lake configuration component. It is a collection of open source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation. It is inexpensive and suitable for organizations that cannot build out a full IT infrastructure. If AirBnB uses this component, it is likely to face big data security risks, such as difficulty implementing enterprise security requirements like role-based authentication, since Hadoop relies entirely on Kerberos. The firm may, however, benefit from the speed and flexibility of this component as well as its resilience to failure.
Modern data governance comprises four main technologies. Data ingestion tools support the process of bringing data into the analytics ecosystem. Data cataloguing manages the inventory of data sets; AirBnB will need this tool to connect its clients with the information they may need about the company. Data preparation is used for improving, enriching, formatting, and blending data to make it ready for analysis and reporting. Reporting and analysis is the last tool of data governance; companies use it to explore, model, and visualize data to establish trends, patterns, and insights, making the information easier for consumers to interpret.
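As a small illustration of data cataloguing, the sketch below models an inventory of data sets that can be searched by tag. It is a hypothetical in-memory structure, not a specific catalog product; the class names, data set names, owners, and tags are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    """Metadata describing one data set in the inventory."""
    name: str
    owner: str
    description: str
    tags: set = field(default_factory=set)

class DataCatalog:
    """A minimal in-memory inventory of data sets."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        """Add a data set; refuse duplicates so names stay unambiguous."""
        if entry.name in self._entries:
            raise ValueError(f"data set already registered: {entry.name}")
        self._entries[entry.name] = entry

    def search(self, tag):
        """Return the names of all data sets carrying the tag, sorted."""
        return sorted(e.name for e in self._entries.values() if tag in e.tags)

# Hypothetical data sets, for illustration only.
catalog = DataCatalog()
catalog.register(CatalogEntry("guests", "Data Steward A", "Guest profiles", {"pii"}))
catalog.register(CatalogEntry("bookings", "Data Steward B", "Reservations", {"pii", "finance"}))
catalog.register(CatalogEntry("listings", "Data Steward C", "Property listings", {"public"}))
pii_sets = catalog.search("pii")
```

Tagging data sets that hold personal information makes it easier to see which parts of the inventory the security rules must cover.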