Data pipelines form the foundation of how data is prepared for analytical applications, from timely reporting of key metrics to training machine learning models that automate metric prediction.
Snowpark lets users write rich, expressive data pipelines on Snowflake with 100% pushdown, fully leveraging the platform's elasticity. Recent enhancements have further streamlined the workflow for building these pipelines within Snowflake.
This webinar walks through the following topics to present a complete cycle of working with Snowpark: building data pipelines within Snowflake Notebooks, integrating with a Git-based repository, and incorporating a CI/CD process to test and promote Notebook-based pipelines to higher environments:
- Data representation options: Snowpark DataFrames / pandas on Snowflake (a minimal sketch follows this list)
- Snowflake Notebooks for development and code management through Git integration
- Managing CI/CD releases to higher environments for Snowflake Notebooks using SnowCLI
- Executing the pipeline as a Snowflake Notebook using Notebook orchestration (see the task sketch below)
- Continuous data quality monitoring through Data Metric Functions (see the DMF sketch below)
- Pipeline observability using Traces & Spans (see the telemetry sketch below)
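As a rough illustration of the first bullet, the sketch below contrasts the two representation options. Table, column, and connection names are hypothetical, and the connection parameters are assumed to be defined elsewhere.

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, sum as sum_

# Assumed: connection_parameters is a dict of account/user/warehouse settings.
session = Session.builder.configs(connection_parameters).create()

# Option 1: Snowpark DataFrames, lazily evaluated and pushed down as SQL.
orders = session.table("RAW.PUBLIC.ORDERS")
daily_revenue = (
    orders.filter(col("STATUS") == "COMPLETE")
          .group_by("ORDER_DATE")
          .agg(sum_("AMOUNT").alias("REVENUE"))
)
daily_revenue.write.save_as_table("ANALYTICS.PUBLIC.DAILY_REVENUE", mode="overwrite")

# Option 2: pandas on Snowflake, the pandas API executed inside Snowflake.
import modin.pandas as pd
import snowflake.snowpark.modin.plugin  # noqa: F401 (activates the plugin)

df = pd.read_snowflake("RAW.PUBLIC.ORDERS")
revenue = df[df["STATUS"] == "COMPLETE"].groupby("ORDER_DATE")["AMOUNT"].sum()
```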
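One possible way to schedule a Notebook-based pipeline, per the orchestration bullet, is to wrap EXECUTE NOTEBOOK in a task; all object names here are hypothetical.

```python
# Reuses the `session` from the sketch above.
session.sql("""
    CREATE OR REPLACE TASK ANALYTICS.PUBLIC.DAILY_PIPELINE_TASK
      WAREHOUSE = TRANSFORM_WH
      SCHEDULE = 'USING CRON 0 6 * * * UTC'
    AS
      EXECUTE NOTEBOOK ANALYTICS.PUBLIC."daily_pipeline"()
""").collect()

# Tasks are created suspended; resume the task to start the schedule.
session.sql("ALTER TASK ANALYTICS.PUBLIC.DAILY_PIPELINE_TASK RESUME").collect()
```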
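For the Data Metric Functions bullet, here is a minimal sketch of attaching a system DMF to a pipeline output table; the table and column names are hypothetical.

```python
# A measurement schedule must be set on the table before any DMF is attached.
session.sql("""
    ALTER TABLE ANALYTICS.PUBLIC.DAILY_REVENUE
      SET DATA_METRIC_SCHEDULE = '60 MINUTE'
""").collect()

# Attach a system DMF that counts NULLs in the REVENUE column.
session.sql("""
    ALTER TABLE ANALYTICS.PUBLIC.DAILY_REVENUE
      ADD DATA METRIC FUNCTION SNOWFLAKE.CORE.NULL_COUNT ON (REVENUE)
""").collect()

# Measurements can then be reviewed in
# SNOWFLAKE.LOCAL.DATA_QUALITY_MONITORING_RESULTS.
```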
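And for the observability bullet, a sketch of emitting custom span attributes and events with the snowflake-telemetry-python package. It assumes tracing is enabled (for example, TRACE_LEVEL = ALWAYS) and an event table is configured; the attribute and event names are invented for illustration.

```python
from snowflake import telemetry

def run_step(session):
    # Tag the current span so this pipeline step is easy to find in traces.
    telemetry.set_span_attribute("pipeline.step", "daily_revenue")
    rows = session.table("ANALYTICS.PUBLIC.DAILY_REVENUE").count()
    # Record a span event carrying the step's output row count.
    telemetry.add_event("step_finished", {"row_count": rows})
```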
Speakers
Chinmayee Lakkad
Senior Data Cloud Architect
Snowflake