There are some people out there who will like to connect to Snowflake via AWS Lambda to push data. This post is limited to connecting to Snowflake using AWS Lambda. The upcoming posts (if ever given my posting history) will detail how to send some actual data.
As usual we will discuss why first. There are cases where data has to be loaded on a regular basis to Snowflake where something like Snowpipe is well suited. However if the data comes in occasionally and it is important that it should be sent to Snowflake as soon as possible then AWS Lambdas are a good fit.
It is rather easy once you “know the steps”. I am going to use the Snowflake Python connector for this demo. The Snowflake libraries and its dependencies have to be packaged into a Lambda layer. This is needed as the Python runtime which AWS Lambda uses will naturally not have the Snowflake libraries. So we will package the needed libraries in form of a layer and then upload it to AWS. Our lambda function then refers to that layer and gets all the libraries which are needed.
I will be using an EC2 Linux instance to create the package.
Pro Tip: Even if you have a Python environment readily available on your windows machine it is best to use an EC2 Linux environment to create the package. Most of the errors you will encounter will be due to missing/mismatched libraries. Using EC2 Linux machine (Amazon Linux 2 AMI) is about as close to the Lambda runtime you will ever get in terms of environment.
Here is the page from Snowflake team with instrutions to install Python Connector.
Important thing here is the version of the Snowflake connector and Python you will be using. The Snowflake page has the link which I am putting here for your convenience. At time of writing the latest version is 2.7.12.
Note down the Python version (in my case it is 3.8.5) and the version of the Snowflake connector.
So now lets get on with it.
What we will do it to create a directory and then install the needed packages there. And then create a zip file out of it.
Here are the steps:
sudo yum install -y libffi-devel openssl-devel mkdir lambdaLayerSnowflakePython cd lambdaLayerSnowflakePython mkdir python sudo amazon-linux-extras install python3.8 curl -O https://bootstrap.pypa.io/get-pip.py python3.8 get-pip.py –user python3.8 -m pip install -r https://raw.githubusercontent.com/snowflakedb/snowflake-connector-python/v2.7.12/tested_requirements/requirements_38.reqs -t python/ python3.8 -m pip install snowflake-connector-python==2.7.12 -t python/ zip -r snowflake_2.7.12_python_3.8.zip python aws lambda publish-layer-version --layer-name snowflake_2_7_12_python_3_8_5 --zip-file fileb://snowflake_2.7.12_python_3.8.zip --compatible-runtimes python3.8 --region us-west-2
1. We have to install libffi-devel openssl-devel as listed in the requirements.
2. When you publish the layer, region will change as per your AWS region.
3. When you publish the package programmatically make sure that you have right permissions. This page goes into the details of setting up the IAM roles. Otherwise you can download the package via sftp/ftp and then upload it manually using the AWS Lambda UI.
Here is a small lambda code inspired from the Snowflake documentation page to test the connectivity of the Lambda.
import snowflake.connector SNOW_ACCOUNT = 'STARKINTERPRISES' SNOW_USER = 'TONY' SNOW_PASS = 'PEPPER' def lambda_handler(event, context): ctx = snowflake.connector.connect( user=SNOW_USER, password=SNOW_PASS, account=SNOW_ACCOUNT ) cs = ctx.cursor() try: cs.execute("SELECT current_version()") one_row = cs.fetchone() print("Snowflake Version :" + one_row) finally: cs.close() ctx.close()
1. Choose Python runtime 3.8
2. Increase the default timeout of 3 sec to 30 sec in Lambda configuration page. It takes sometime for Snowflake to return the version number.
3. Do not forget to add the layer to Lambda function.