How to deploy Snowflake stored procedures with Python
This post walks through an example of deploying a Snowflake stored procedure using the Snowpark Python library.
Most of the examples online (including in the official documentation) assume you’re doing this via a SnowSQL session. This guide assumes you want to have a deployment script to make the process automatable via GitHub Actions or your CI/CD tool of choice.
Setup#
We’ll be using the snowflake-snowpark-python
library to deploy
our procedure. At the time of writing (December 2022) this only
works with Python 3.8. If you’re using
conda
you can create a virtual
environment using Python 3.8 with the Snowpark library
installed as follows:
conda create --name my_env snowflake-snowpark-python python=3.8
Creating a single-file Python function#
First we need a Python function that will become our stored
procedure. I’m using a simple add_one()
function as an
example, saved in a file called main.py
.
# main.py
from snowflake.snowpark import Session
def add_one(session: Session, x: int) -> int:
"""Add one to an integer"""
return x + 1
Some things to note here:
-
We’re using type hints. This makes life easier later on as Snowpark will use these to generate the appropriate input and return types for the function. More on this later.
-
The first argument to the function must be a
snowflake.snowpark.Session
object, even if we don’t do anything with it in the function itself. When we call the generated stored procedure theSession
will be automatically supplied by the Snowflake runtime, i.e. our SQL call will look like:CALL ADD_ONE(1)
If the first argument to your function isn’t a
Session
object, Snowpark will throw an error when you try and deploy.
Writing a deployment script#
The Snowpark API provides a
register_from_file()
function that we can use to deploy our procedure. Here’s a super
simple deployment script.
For information on connecting to Snowflake see the Snowpark docs.
# deploy.py
from snowflake.snowpark import Session
# Connection parameters shouldn't be hard-coded into the script
# like this. Use environment variables, GitHub Actions secrets,
# etc.
SF_CONNECTION_PARAMS = {
"account": "my.snowflake.account",
"user": "me@email.com",
"password": "password123",
"warehouse": "my_warehouse",
"role": "my_role",
"database": "my_database",
"schema": "my_schema"
}
# Create a session that we will use to deploy our proc
session = Session.builder.configs(SF_CONNECTION_PARAMS).create()
# Create a Snowflake stage to hold the uploaded Python source
# files
session.sql(
"CREATE STAGE IF NOT EXISTS PYTHON_SOURCE_CODE"
).collect()
# Deploy the procedure
session.sproc.register_from_file(
name = "add_one", # Name of the procedure in Snowflake
stage_location = "python_source_code",
file_path = "main.py",
func_name = "add_one", # Within `main.py`
is_permanent = True,
replace = True,
packages = ["snowflake-snowpark-python"]
)
Running python deploy.py
will deploy the stored procedure.
Notes#
name
is the name of the deployed procedure in Snowflake. This doesn’t have to be the same as the name of the function.is_permanent
should beTrue
. If left asFalse
the procedure will only exist for the duration of thesession
used to deploy it, i.e. it won’t be useable once the deployment script has run.replace = True
means that any pre-existing procedure with this name will be replaced. If this isFalse
then trying to deploy a procedure whose name already exists will cause an error.packages
must containsnowflake-snowpark-python
even if you’re only using Python standard library packages.- If we hadn’t included type hints in our source code we’d also
need to supply
input_types
andreturn_type
arguments todeploy_from_file()
.
Using multiple Python files#
If your Python code is split over multiple files you need to tell Snowpark which files need to be included in the deployment package. For example, if I have a custom module containing some application code:
# my_module.py
def greater_than_ten(x: int) -> bool:
"""Is a number greater than ten?"""
if x > 10:
return True
return False
and I want to import this module into my main procedure code:
# main.py
from snowflake.snowpark import Session
import my_module
def sometimes_add_one(session: Session: x: int) -> int:
"""Add one to integers that are greater than ten"""
if my_module.greater_than_ten(x):
return x
return x + 1
the deployment command needs to refer to my_module.py
to
ensure it’s uploaded.
# deploy.py
# ... see above
session.sproc.register_from_file(
name = "add_one", # Name of the procedure in Snowflake
stage_location = "python_source_code",
file_path = "main.py",
func_name = "add_one", # Within `main.py`
is_permanent = True,
replace = True,
packages = ["snowflake-snowpark-python"],
imports = ["my_module.py"] # This becomes `import my_module`
)
This example demonstrated how to use a Python script to deploy a single Snowflake stored procedure. You could easily expand this to deploy multiple procedures, which may be more suitable for use in a CI/CD environment.