Snowflake DSA-C02 SnowPro Advanced: Data Scientist Certification Exam Practice Test

Page: 1 / 14
Total 65 questions
Question 1

Mark the correct steps for saving the contents of a DataFrame to a Snowflake table when moving data from Spark to Snowflake.



Answer : C

Moving Data from Spark to Snowflake

The steps for saving the contents of a DataFrame to a Snowflake table are similar to writing from Snowflake to Spark:

1. Use the write() method of the DataFrame to construct a DataFrameWriter.

2. Specify SNOWFLAKE_SOURCE_NAME using the format() method.

3. Specify the connector options using either the option() or options() method.

4. Use the dbtable option to specify the table to which data is written.

5. Use the mode() method to specify the save mode for the content.

Examples

df.write
    .format(SNOWFLAKE_SOURCE_NAME)
    .options(sfOptions)
    .option("dbtable", "t2")
    .mode(SaveMode.Overwrite)
    .save()


Question 2

Which command manually triggers a single run of a scheduled task (either a standalone task or the root task in a DAG) independent of the schedule defined for the task?



Answer : C

The EXECUTE TASK command manually triggers a single run of a scheduled task (either a standalone task or the root task in a DAG) independent of the schedule defined for the task. A successful run of a root task triggers a cascading run of child tasks in the DAG as their precedent task completes, as though the root task had run on its defined schedule.

This SQL command is useful for testing new or modified standalone tasks and DAGs before you enable them to execute SQL code in production.

You can call this SQL command directly in scripts or in stored procedures. In addition, this command supports integrating tasks in external data pipelines. Any third-party services that can authenticate into your Snowflake account and authorize SQL actions can execute the EXECUTE TASK command to run tasks.
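For example, a new or modified root task could be validated manually before it is enabled; the task name below is illustrative:

EXECUTE TASK my_root_task;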


Question 3

A Data Scientist uses streams in ELT (extract, load, transform) processes where new data inserted into a staging table is tracked by a stream. A set of SQL statements transforms and inserts the stream contents into a set of production tables. The raw data arrives in JSON format, but for analysis it must be transformed into relational columns in the production tables. Which of the following data transformation SQL functions can be used to achieve this?



Answer : B

To learn more about the LATERAL FLATTEN SQL construct, please refer to:

https://docs.snowflake.com/en/sql-reference/constructs/join-lateral#example-of-using-lateral-with-flatten
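As a minimal sketch of the pattern described in the question (assuming the staging stream exposes a VARIANT column named raw; all object and column names here are illustrative), LATERAL FLATTEN turns a nested JSON array into relational rows and columns:

INSERT INTO production_items (customer_id, item_name)
SELECT
    s.raw:customer.id::NUMBER,
    f.value:name::STRING
FROM staging_stream s,
    LATERAL FLATTEN(input => s.raw:items) f;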


Question 4

Mark the incorrect understandings of a Data Scientist about Streams.



Answer : B, D

Streams on views support both local views and views shared using Snowflake Secure Data Sharing, including secure views. Currently, streams cannot track changes in materialized views.

A stream itself does not contain any table data. A stream only stores an offset for the source object and returns CDC records by leveraging the versioning history for the source object. When the first stream for a table is created, several hidden columns are added to the source table and begin storing change tracking metadata. These columns consume a small amount of storage. The CDC records returned when querying a stream rely on a combination of the offset stored in the stream and the change tracking metadata stored in the table. Note that for streams on views, change tracking must be enabled explicitly for the view and underlying tables to add the hidden columns to these tables.

Streams support repeatable read isolation. In repeatable read mode, multiple SQL statements within a transaction see the same set of records in a stream. This differs from the read committed mode supported for tables, in which statements see any changes made by previous statements executed within the same transaction, even though those changes are not yet committed.

The delta records returned by streams in a transaction cover the range from the current position of the stream to the transaction start time. The stream position advances to the transaction start time if the transaction commits; otherwise it stays at the same position.
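As a brief illustration (the table and stream names are assumptions, not part of the question), a stream is created on a source table, and its offset only advances when the stream is consumed by a DML statement inside a transaction that commits:

CREATE OR REPLACE STREAM orders_stream ON TABLE orders;

BEGIN;
-- consuming the stream in a DML statement advances its offset on commit
INSERT INTO orders_processed
    SELECT id, amount FROM orders_stream WHERE METADATA$ACTION = 'INSERT';
COMMIT;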


Question 5

Which of the following additional metadata columns does a Stream contain that can be used for creating efficient data science pipelines and help in transforming only the new/modified data?



Answer : A, C, E

A stream stores an offset for the source object and not any actual table columns or data. When queried, a stream accesses and returns the historic data in the same shape as the source object (i.e. the same column names and ordering) with the following additional columns:

METADATA$ACTION

Indicates the DML operation (INSERT, DELETE) recorded.

METADATA$ISUPDATE

Indicates whether the operation was part of an UPDATE statement. Updates to rows in the source object are represented as a pair of DELETE and INSERT records in the stream with the metadata column METADATA$ISUPDATE set to TRUE.

Note that streams record the differences between two offsets. If a row is added and then updated in the current offset, the delta change is a new row. The METADATA$ISUPDATE row records a FALSE value.

METADATA$ROW_ID

Specifies the unique and immutable ID for the row, which can be used to track changes to specific rows over time.
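As a sketch (the stream name is illustrative), these metadata columns let a pipeline select only net-new rows, skipping the DELETE/INSERT pairs that represent updates:

SELECT *
FROM staging_stream
WHERE METADATA$ACTION = 'INSERT'
  AND METADATA$ISUPDATE = FALSE;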


Question 6

As a Data Scientist looking to use a Reader account, which of the following are correct considerations about Reader Accounts for third-party access?



Answer : D

Data sharing is only supported between Snowflake accounts. As a data provider, you might want to share data with a consumer who does not already have a Snowflake account or is not ready to become a licensed Snowflake customer.

To facilitate sharing data with these consumers, you can create reader accounts. Reader accounts (formerly known as "read-only accounts") provide a quick, easy, and cost-effective way to share data without requiring the consumer to become a Snowflake customer.

Each reader account belongs to the provider account that created it. As a provider, you use shares to share databases with reader accounts; however, a reader account can only consume data from the provider account that created it.

So, data sharing is possible between Snowflake and non-Snowflake accounts via a Reader Account.
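For instance, a provider can provision a reader account and then share databases with it; the account name and credentials below are placeholders:

CREATE MANAGED ACCOUNT reader_acct1
    ADMIN_NAME = 'reader_admin',
    ADMIN_PASSWORD = 'ChangeMe123!',
    TYPE = READER;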


Question 7

Which one is an incorrect understanding about Providers of a Direct Share?



Answer : D

If you want to provide a share to many accounts, you might want to use a listing or a data exchange.

