Snowflake DSA-C02 SnowPro Advanced: Data Scientist Certification Exam Practice Test

Page: 1 / 14
Total 65 questions
Question 1

Mark the incorrect statement regarding the usage of Snowflake Streams & Tasks.



Answer : D

All statements are correct except the one claiming that a standard stream tracks row inserts only.

A standard (i.e. delta) stream tracks all DML changes to the source object, including inserts, updates, and deletes (including table truncates).
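As an illustrative sketch (the table and stream names are hypothetical, and the Snowpark connection parameters are placeholders), a standard stream records each DML change along with metadata columns such as METADATA$ACTION:

from snowflake.snowpark import Session

# Placeholder credentials; replace with real account details.
session = Session.builder.configs({"account": "<account>", "user": "<user>", "password": "<password>"}).create()

session.sql("CREATE OR REPLACE TABLE orders (id INT, amount INT)").collect()
session.sql("CREATE OR REPLACE STREAM orders_stream ON TABLE orders").collect()

# Inserts, updates, and deletes against the source table are all captured.
session.sql("INSERT INTO orders VALUES (1, 100)").collect()
session.sql("UPDATE orders SET amount = 150 WHERE id = 1").collect()

# Each captured change carries METADATA$ACTION and METADATA$ISUPDATE columns.
session.table("orders_stream").show()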


Question 2

Consider a data frame df with 10 rows and index ['r1', 'r2', 'r3', 'row4', 'row5', 'row6', 'r7', 'r8', 'r9', 'row10']. What does the aggregate method shown in the code below do?

g = df.groupby(df.index.str.len())

g.aggregate({'A':len, 'B':np.sum})



Answer : C

Computes the length (row count) of column A and the sum of column B values for each group.
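A minimal runnable sketch of the scenario (the index labels come from the question; the values in columns A and B are illustrative placeholders):

import numpy as np
import pandas as pd

# 10 rows carrying the index labels from the question.
df = pd.DataFrame(
    {'A': range(10), 'B': range(10)},
    index=['r1', 'r2', 'r3', 'row4', 'row5', 'row6', 'r7', 'r8', 'r9', 'row10']
)

# Group rows by the character length of the index label: 2 ('r1', ...),
# 4 ('row4', ...) and 5 ('row10').
g = df.groupby(df.index.str.len())

# len counts each group's rows via column A; np.sum totals column B per group.
print(g.aggregate({'A': len, 'B': np.sum}))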


Question 3

Which Python method can be used by a Data Scientist to remove duplicates?



Answer : D

The drop_duplicates() method removes duplicate rows.

DataFrame.drop_duplicates(subset=None, keep='first', inplace=False, ignore_index=False)

Remove duplicate rows from the DataFrame:

import pandas as pd

data = {
    'name': ['Peter', 'Mary', 'John', 'Mary'],
    'age': [50, 40, 30, 40],
    'qualified': [True, False, False, False]
}

df = pd.DataFrame(data)

# The second 'Mary' row (age 40, qualified False) duplicates an earlier row,
# so drop_duplicates() returns a DataFrame without it.
newdf = df.drop_duplicates()


Question 4

In which of the following ways can a Data Scientist query, process, and transform data using Snowpark Python? [Select 2]



Answer : A, C

Query and process data with a DataFrame object. Refer to Working with DataFrames in Snowpark Python.

Convert custom lambdas and functions to user-defined functions (UDFs) that you can call to process data.

Write a user-defined tabular function (UDTF) that processes data and returns data in a set of rows with one or more columns.

Write a stored procedure that you can call to process data, or automate with a task to build a data pipeline.
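A minimal sketch of the DataFrame route (the connection parameters and the SALES table are placeholders):

from snowflake.snowpark import Session
from snowflake.snowpark.functions import col

# Placeholder credentials; replace with real account details.
session = Session.builder.configs({"account": "<account>", "user": "<user>", "password": "<password>"}).create()

# Build the query lazily in Python; Snowflake executes it on show()/collect().
df = session.table("SALES").filter(col("AMOUNT") > 100).select("REGION", "AMOUNT")
df.show()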


Question 5

Mark the incorrect statement regarding Python UDFs.



Answer : D

A scalar function (UDF) returns one output row for each input row. The returned row consists of a single column/value.
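A sketch of that scalar behavior using Snowpark Python (the function name, table name, and connection parameters are hypothetical):

from snowflake.snowpark import Session
from snowflake.snowpark.functions import call_udf, col, udf
from snowflake.snowpark.types import IntegerType

# Placeholder credentials; replace with real account details.
session = Session.builder.configs({"account": "<account>", "user": "<user>", "password": "<password>"}).create()

# Register a scalar UDF: one output value per input row.
@udf(name="add_one", input_types=[IntegerType()], return_type=IntegerType(), replace=True)
def add_one(x: int) -> int:
    return x + 1

# Every row of the assumed SALES table yields exactly one output row.
session.table("SALES").select(call_udf("add_one", col("AMOUNT"))).show()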


Question 6

Which of the following Snowflake parameters can be used to automatically suspend tasks running data science pipelines after a specified number of failed runs?



Answer : C

Automatically Suspend Tasks After Failed Runs

Optionally suspend tasks automatically after a specified number of consecutive runs that either fail or time out. This feature can reduce costs by suspending tasks that consume Snowflake credits but fail to run to completion. Failed task runs include runs in which the SQL code in the task body either produces a user error or times out. Task runs that are skipped, canceled, or that fail due to a system error are considered indeterminate and are not included in the count of failed task runs.

Set the SUSPEND_TASK_AFTER_NUM_FAILURES = num parameter on a standalone task or the root task in a DAG. When the parameter is set to a value greater than 0, the following behavior applies to runs of the standalone task or DAG:

Standalone tasks are automatically suspended after the specified number of consecutive task runs either fail or time out.

The root task is automatically suspended after the run of any single task in a DAG fails or times out the specified number of times in consecutive runs.

The parameter can be set when creating a task (using CREATE TASK) or later (using ALTER TASK). The setting applies to tasks that rely on either Snowflake-managed compute resources (i.e. serverless compute model) or user-managed compute resources (i.e. a virtual warehouse).

The SUSPEND_TASK_AFTER_NUM_FAILURES parameter can also be set at the account, database, or schema level. The setting applies to all standalone or root tasks contained in the modified object. Note that explicitly setting the parameter at a lower (i.e. more granular) level overrides the parameter value set at a higher level.
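For example (the task name is hypothetical; the statement can equally be run in a worksheet, here issued through a Snowpark session):

from snowflake.snowpark import Session

# Placeholder credentials; replace with real account details.
session = Session.builder.configs({"account": "<account>", "user": "<user>", "password": "<password>"}).create()

# Suspend the task after 3 consecutive runs that fail or time out.
session.sql("ALTER TASK my_pipeline_task SET SUSPEND_TASK_AFTER_NUM_FAILURES = 3").collect()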


Question 7

Which command manually triggers a single run of a scheduled task (either a standalone task or the root task in a DAG) independent of the schedule defined for the task?



Answer : C

The EXECUTE TASK command manually triggers a single run of a scheduled task (either a standalone task or the root task in a DAG) independent of the schedule defined for the task. A successful run of a root task triggers a cascading run of child tasks in the DAG as their precedent task completes, as though the root task had run on its defined schedule.

This SQL command is useful for testing new or modified standalone tasks and DAGs before you enable them to execute SQL code in production.

Call this SQL command directly in scripts or in stored procedures. In addition, this command supports integrating tasks in external data pipelines. Any third-party services that can authenticate into your Snowflake account and authorize SQL actions can execute the EXECUTE TASK command to run tasks.
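For example (the task name is hypothetical; any client that can authenticate and authorize SQL actions could issue the same statement):

from snowflake.snowpark import Session

# Placeholder credentials; replace with real account details.
session = Session.builder.configs({"account": "<account>", "user": "<user>", "password": "<password>"}).create()

# Trigger one run of the task now, independent of its schedule; for a DAG
# root task, child tasks then run as their predecessors complete.
session.sql("EXECUTE TASK my_pipeline_task").collect()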

