You are a database administrator managing sales transaction data by region stored in a BigQuery table. You need to ensure that each sales representative can only see the transactions in their region. What should you do?
Answer: B
Creating a row-level access policy in BigQuery ensures that each sales representative can see only the transactions relevant to their region. Row-level access policies allow you to define fine-grained access control by filtering rows based on specific conditions, such as matching the sales representative's region. This approach enforces security while providing tailored data access, aligning with the principle of least privilege.
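As a minimal sketch, such a policy can be defined with BigQuery's CREATE ROW ACCESS POLICY DDL, run here through the Python client. The project, table, column, and group names are hypothetical and assume the transactions table has a region column:

from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical names: members of the grantee group will only see rows
# whose region column equals 'US_EAST'.
ddl = """
CREATE ROW ACCESS POLICY us_east_reps
ON `my-project.sales.transactions`
GRANT TO ('group:us-east-reps@example.com')
FILTER USING (region = 'US_EAST')
"""
client.query(ddl).result()  # wait for the DDL job to finish

You would create one policy per region and grant it to that region's group; the filter expression can also compare a column to SESSION_USER() so a single policy covers all representatives.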
Your organization stores highly personal data in BigQuery and needs to comply with strict data privacy regulations. You need to ensure that sensitive data values are rendered unreadable whenever an employee leaves the organization. What should you do?
Answer: C
Using customer-managed encryption keys (CMEK) allows you to encrypt highly sensitive data in BigQuery with keys that your organization manages in Cloud KMS. When an employee leaves the organization, you can render the data unreadable by disabling or destroying the relevant key versions, or by revoking access to them. Without the key, BigQuery cannot decrypt the data, which satisfies strict data privacy regulations and gives your organization strong control over data access and security.
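For illustration, a table can be created with a CMEK using the Python client; the project, dataset, schema, and Cloud KMS key names below are hypothetical:

from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical key path. Destroying or disabling this key's versions in
# Cloud KMS later makes the table's contents undecryptable.
kms_key = "projects/my-project/locations/us/keyRings/bq-keys/cryptoKeys/pii-key"

table = bigquery.Table("my-project.hr.employee_pii")
table.schema = [
    bigquery.SchemaField("employee_id", "STRING"),
    bigquery.SchemaField("ssn", "STRING"),
]
table.encryption_configuration = bigquery.EncryptionConfiguration(
    kms_key_name=kms_key
)
client.create_table(table)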
You are a data analyst working with sensitive customer data in BigQuery. You need to ensure that only authorized personnel within your organization can query this data, while following the principle of least privilege. What should you do?
Answer: D
Using IAM roles to enable access control in BigQuery is the best approach to ensure that only authorized personnel can query the sensitive customer data. IAM allows you to define granular permissions at the project, dataset, or table level, ensuring that users have only the access they need in accordance with the principle of least privilege. For example, you can assign roles like roles/bigquery.dataViewer to allow read-only access or roles/bigquery.dataEditor for more advanced permissions. This approach provides centralized and manageable access control, which is critical for protecting sensitive data.
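As a sketch, read-only access can be granted at the dataset level with the Python client; the dataset and user email are hypothetical, and the dataset-level READER entry corresponds to roles/bigquery.dataViewer:

from google.cloud import bigquery

client = bigquery.Client()

dataset = client.get_dataset("my-project.sales")  # hypothetical dataset

# Append a read-only grant for a single analyst; READER at the dataset
# level maps to roles/bigquery.dataViewer.
entries = list(dataset.access_entries)
entries.append(bigquery.AccessEntry(
    role="READER",
    entity_type="userByEmail",
    entity_id="analyst@example.com",
))
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])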
Your company uses Looker as its primary business intelligence platform. You want to use LookML to visualize the profit margin for each of your company's products in your Looker Explores and dashboards. You need to implement a solution quickly and efficiently. What should you do?
Answer: B
Defining a new measure in LookML to calculate the profit margin using the existing revenue and cost fields is the most efficient and straightforward solution. This approach allows you to dynamically compute the profit margin directly within your Looker Explores and dashboards without needing to pre-calculate or create additional tables. The measure can be defined using LookML syntax, such as:
measure: profit_margin {
  type: number
  # Assumes revenue and cost are existing measures; NULLIF guards
  # against division by zero when revenue is 0.
  sql: (${revenue} - ${cost}) / NULLIF(${revenue}, 0) ;;
  value_format: "0.0%"
}
This method is quick to implement and integrates seamlessly into your existing Looker model, enabling accurate visualization of profit margins across your products.
Your retail company collects customer data from various sources: structured online transaction records, unstructured customer feedback text files, and real-time social media activity. You are designing a data pipeline to extract this data. Which Google Cloud storage system(s) should you select for further analysis and ML model training?
Answer: B
Online transactions: Storing the transactional data in BigQuery is ideal because BigQuery is a serverless data warehouse optimized for querying and analyzing structured data at scale. It supports SQL queries and is suitable for structured transactional data.
Customer feedback: Storing customer feedback in Cloud Storage is appropriate as it allows you to store unstructured text files reliably and at a low cost. Cloud Storage also integrates well with data processing and ML tools for further analysis.
Social media activity: Storing real-time social media activity in BigQuery is optimal because BigQuery supports streaming inserts, enabling real-time ingestion and analysis of data. This allows immediate analysis and integration into dashboards or ML pipelines.
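As an illustration of the streaming path mentioned above, rows can be pushed into BigQuery with the Python client's streaming API; the table and payload below are hypothetical:

from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical table and row; streamed rows become queryable within seconds.
errors = client.insert_rows_json(
    "my-project.social.activity",
    [{"user_id": "u123", "event": "mention", "ts": "2024-01-01T00:00:00Z"}],
)
if errors:
    raise RuntimeError(f"Streaming insert failed: {errors}")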
You are constructing a data pipeline to process sensitive customer data stored in a Cloud Storage bucket. You need to ensure that this data remains accessible, even in the event of a single-zone outage. What should you do?
Answer: C
Storing the data in a multi-region bucket ensures high availability and durability, even in the event of a single-zone outage. Multi-region buckets store objects redundantly across at least two geographically separated regions within a large area such as the US or EU, providing resilience against zone-level (and even region-level) failures and ensuring that the data remains accessible. This approach is particularly suitable for sensitive customer data that must remain available without interruption.
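For reference, a multi-region bucket is simply a bucket created with a multi-region location code; the bucket name below is hypothetical:

from google.cloud import storage

client = storage.Client()

# "US" is a multi-region location: objects are stored redundantly in at
# least two geographically separated regions.
bucket = client.create_bucket("acme-customer-data", location="US")
print(bucket.location, bucket.location_type)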
Your team is building several data pipelines that contain a collection of complex tasks and dependencies that you want to execute on a schedule, in a specific order. The tasks and dependencies consist of files in Cloud Storage, Apache Spark jobs, and data in BigQuery. You need to design a system that can schedule and automate these data processing tasks using a fully managed approach. What should you do?
Answer: C
Using Cloud Composer to create Directed Acyclic Graphs (DAGs) is the best solution because it is a fully managed, scalable workflow orchestration service based on Apache Airflow. Cloud Composer allows you to define complex task dependencies and schedules while integrating seamlessly with Google Cloud services such as Cloud Storage, BigQuery, and Dataproc for Apache Spark jobs. This approach minimizes operational overhead, supports scheduling and automation, and provides an efficient and fully managed way to orchestrate your data pipelines.
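As a sketch of what such a DAG could look like (operators come from the Google provider package for Airflow; all project, bucket, cluster, and table names are hypothetical):

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import (
    DataprocSubmitJobOperator,
)
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import (
    GCSToBigQueryOperator,
)

with DAG(
    "sales_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Run a Spark transform on an existing Dataproc cluster.
    transform = DataprocSubmitJobOperator(
        task_id="transform_with_spark",
        project_id="my-project",
        region="us-central1",
        job={
            "placement": {"cluster_name": "etl-cluster"},
            "spark_job": {
                "main_class": "com.example.Transform",
                "jar_file_uris": ["gs://my-bucket/jobs/transform.jar"],
            },
        },
    )

    # Load the transformed files from Cloud Storage into BigQuery.
    load = GCSToBigQueryOperator(
        task_id="load_to_bigquery",
        bucket="my-bucket",
        source_objects=["output/part-*.parquet"],
        destination_project_dataset_table="my-project.sales.daily",
        source_format="PARQUET",
        write_disposition="WRITE_TRUNCATE",
    )

    transform >> load  # enforce ordering: Spark transform before the BigQuery load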