Data Loading Using Snowflake Internal & External Stages

Sunil Mannem

Snowflake’s internal storage is designed for high scalability, flexibility, and durability, and is a key factor in the platform’s ability to provide high-performance, cost-effective cloud data warehousing services.

Internal Stages store data internally and are categorized into three types: User Stage, Table Stage, and Internal Named Stage.

User Stage is a personal storage area automatically allocated to each user; no other user can access it.
Table Stage is a storage area tied to a table object, useful when files only need to be loaded into that one table, even when several users are staging files for it.
Internal Named Stage is a storage location within a Snowflake database/schema, offering more versatility: files staged there can be loaded into different tables, and multiple users can access the same stage.
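In SQL, the three stage types are referenced with different prefixes. A quick sketch, where mytable and my_stage are hypothetical names:

list @~;          -- user stage of the current user
list @%mytable;   -- table stage of the table MYTABLE
list @my_stage;   -- named stage MY_STAGE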

In this blog, we’ll discuss the process of loading data through internal stages in Snowflake.

How to Set Up and Use Internal Snowflake Stages?

If you’re planning to upload files to a Snowflake stage, you first need to verify your connection to Snowflake. Here are the steps to set up the connection:
Step 1: Open the command prompt on your local machine and check whether SnowSQL is installed or not.
Step 2: Once you have confirmed that SnowSQL is installed, run the following command: snowsql -a <account_name> -u <user_name>
Replace <account_name> and <user_name> with the appropriate values for your Snowflake account and user. After running the command, you will be prompted to enter your Snowflake password.
If everything is set up correctly, you should be able to log in to Snowflake through SnowSQL and proceed with uploading files to your Snowflake stage.
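Once logged in, it helps to set the session context so that the commands in the rest of this post resolve against the intended warehouse, database, and schema. A minimal sketch, using the names that appear in the SnowSQL prompts below (COMPUTE_WH, MY_DB, PUBLIC):

use warehouse COMPUTE_WH;  -- warehouse that will run the LIST and COPY statements
use database MY_DB;
use schema PUBLIC;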

Uploading a File to the Table Stage

Step 1: To upload files to the Table Stage area, use the SnowSQL command below.
sunilraj#COMPUTE_WH@MY_DB.PUBLIC>put file://D:\Snowflake\sample_file.csv @%test;
Step 2: The file will be uploaded to the Table Stage, which you may verify using the LIST command.
sunilraj#COMPUTE_WH@MY_DB.PUBLIC>list @%test;
Note: The table stage always has the same name as the table itself.
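The PUT above assumes the table already exists, since a table stage is created automatically along with its table. A minimal sketch creating a hypothetical test table; the column names and types are assumptions, chosen to match the four columns used in the COPY examples later in this post:

create or replace table test (
  id integer,
  name varchar,
  city varchar,
  salary number(10,2)
);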

Uploading a File to the User Stage

Step 1: To upload files to the User Stage area, use the SnowSQL command below.
sunilraj#COMPUTE_WH@MY_DB.PUBLIC>put file://D:\Snowflake\sample_file.csv @~/staged;
Step 2: The file will be uploaded to the User Stage, which you may verify using the LIST command.
sunilraj#COMPUTE_WH@MY_DB.PUBLIC>list @~/staged;
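PUT also accepts a few options that are useful in practice. A short sketch showing two standard ones:

put file://D:\Snowflake\sample_file.csv @~/staged
  auto_compress = true  -- gzip-compress the file during upload (the default)
  overwrite = true;     -- replace a previously staged file with the same name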

Uploading a File to a Named Stage

Step 1: To upload files to an Internal Named Stage, use the SnowSQL command below.
sunilraj#COMPUTE_WH@MY_DB.PUBLIC>put file://D:\Snowflake\sample_file.csv @my_csv_stage;
Step 2: The file will be saved to the Internal Named Stage, which you can verify with the LIST command.
sunilraj#COMPUTE_WH@MY_DB.PUBLIC>LIST @my_csv_stage;
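Unlike user and table stages, a named stage does not exist until you create it, so the PUT above assumes my_csv_stage was created beforehand. A minimal sketch; the skip_header option is an assumption that the sample file has a header row:

create or replace stage my_csv_stage
  file_format = (type = csv field_delimiter = ',' skip_header = 1);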

How to Use the Copy Command with User, Table & Named Snowflake Stages?

The COPY INTO command can read from any of the three stage types: User, Table, or Named. Depending on where your files are staged, use one of the following commands.

User Stage Command:

COPY INTO test FROM @~/staged;

Table Stage Command:

COPY INTO test FROM @my_db.public.%test;

Named Stage Command:

COPY INTO test FROM @my_db.public.my_csv_stage;
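In practice, COPY INTO is usually combined with a file format and a few load options. A short sketch against the named stage; the pattern value and the inline format options are assumptions for illustration:

copy into test from @my_db.public.my_csv_stage
  file_format = (type = csv field_delimiter = ',' skip_header = 1)
  pattern = '.*sample_file.*'  -- load only files whose names match this regex
  on_error = continue;         -- skip bad rows instead of aborting the load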

Seamless Data Migration with Snowflake External Stages 

External stages in Snowflake provide a simple and efficient way to load data from external storage systems such as Amazon S3, Azure Blob Storage, or Google Cloud Storage into Snowflake. With external stages, you can create a connection to the external storage system, specify the authentication credentials and connection parameters, and use the COPY INTO command to load data into Snowflake. This eliminates the need for using any third-party tools or services to transfer data between the external storage system and Snowflake. 

By using external stages, you can benefit from the scalability, performance, and security of Snowflake, while still being able to access and utilize data stored in external storage systems. Overall, external stages provide a powerful and convenient feature inside Snowflake for loading data from external storage systems without relying on third-party tools. 

Data Loading from AWS to Snowflake 

Step 1:
Create a storage integration object to establish the connection between AWS and Snowflake. Provide the IAM role ARN and the S3 location from which you want to load the data.
create or replace storage integration s3_int
type = external_stage
storage_provider = s3
enabled = true
storage_aws_role_arn = 'arn:aws:iam::<aws_account_id>:role/<role_name>'
storage_allowed_locations = ('s3://<bucket>/<path>/');

Note: Only account administrators (users with the ACCOUNTADMIN role) or a role with the global CREATE INTEGRATION privilege can execute this SQL command.
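After creating the integration, Snowflake provisions an IAM user on its side, and that user's ARN must be added to the trust policy of the AWS role referenced above. Retrieve the values with:

desc integration s3_int;
-- copy the STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID values
-- into the trust relationship of the IAM role in the AWS console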
Step 2:
Create a file format matching the type of file to be loaded.
create or replace file format my_csv_format type = csv
field_delimiter = ','
field_optionally_enclosed_by = '"'
null_if = ('NULL', 'null')
empty_field_as_null = true;
Step 3:
Create an External Stage, providing the S3 storage location and the storage integration.
create or replace stage my_db.public.aws_stage
URL = 's3://sunils3bucket2'
STORAGE_INTEGRATION = s3_int
file_format = my_csv_format;
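A quick way to verify that the integration and stage are wired up correctly is to list the files Snowflake can see in the bucket:

list @my_db.public.aws_stage;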
Step 4:
Once all the setup is done, you can load the data using the COPY command.
copy into my_db.public.employee from
(select t.$1, t.$2, t.$3, t.$4 from @my_db.public.aws_stage t)
file_format = (format_name = 'my_csv_format');
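Before loading for real, COPY can also do a dry run with VALIDATION_MODE, which reports parsing problems without writing any rows. Note that VALIDATION_MODE does not support transformation queries, so the sketch below reads straight from the stage:

copy into my_db.public.employee from @my_db.public.aws_stage
validation_mode = return_errors;  -- report errors, load nothing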

Data Loading from AZURE to Snowflake

Step 1:
Create a storage integration object to establish the connection between Azure and Snowflake. Provide the tenant ID and the container and path locations where your data lives.
CREATE STORAGE INTEGRATION azure_int
TYPE = EXTERNAL_STAGE
STORAGE_PROVIDER = 'AZURE'
ENABLED = TRUE
AZURE_TENANT_ID = 'a123b4c5-1234-123a-a12b-1a23b45678c9'
STORAGE_ALLOWED_LOCATIONS = ('azure://myaccount.blob.core.windows.net/mycontainer1/mypath1/', 'azure://myaccount.blob.core.windows.net/mycontainer2/mypath2/')
STORAGE_BLOCKED_LOCATIONS = ('azure://myaccount.blob.core.windows.net/mycontainer1/mypath1/sensitivedata/', 'azure://myaccount.blob.core.windows.net/mycontainer2/mypath2/sensitivedata/');
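As with AWS, the integration must be granted access on the Azure side. DESC returns the consent URL to open and the multi-tenant app to authorize:

DESC STORAGE INTEGRATION azure_int;
-- open AZURE_CONSENT_URL in a browser and accept, then grant the
-- AZURE_MULTI_TENANT_APP_NAME app a storage role on the container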
Step 2:
Create a file format matching the type of file to be loaded.
create or replace file format my_csv_format type = csv
field_delimiter = ','
field_optionally_enclosed_by = '"'
null_if = ('NULL', 'null')
empty_field_as_null = true;
Step 3:
Create an External Stage, providing the Azure Blob URL and the storage integration.
CREATE STAGE my_azure_stage
STORAGE_INTEGRATION = azure_int
URL = 'azure://myaccount.blob.core.windows.net/mycontainer/load/files/'
FILE_FORMAT = my_csv_format;
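Again, LIST is a quick sanity check that the stage can reach the container:

LIST @my_azure_stage;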
Step 4:
Once all the setup is done, you can load the data using the COPY command.
copy into my_db.public.employee from
(select t.$1, t.$2, t.$3, t.$4 from @my_azure_stage t)
file_format = (format_name = 'my_csv_format');
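After the load completes, the COPY_HISTORY table function shows what was actually ingested. A short sketch; the one-hour window is an arbitrary choice for illustration:

select file_name, row_count, status
from table(information_schema.copy_history(
  table_name => 'EMPLOYEE',
  start_time => dateadd(hours, -1, current_timestamp())));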

Conclusion:

Snowflake’s internal and external stages offer a powerful and flexible solution for seamless data ingestion and migration. Users can easily and securely load data from a variety of sources into Snowflake, including cloud storage platforms like AWS S3 or Azure Blob Storage. This allows for greater flexibility in data migration and can help organizations avoid the time-consuming process of manually transferring data between systems. 

In conclusion, Snowflake stages are an essential feature of its data warehousing platform, allowing for easy and efficient data migration from cloud providers. Its innovative architecture and features, such as built-in security and compliance, make it an excellent choice for organizations looking to store, manage, and analyze vast amounts of data in the cloud. 

Sunil Mannem

Data Engineer

Boolean Data Systems


Sunil works as a Data Engineer at Boolean Data Systems and has built many end-to-end data engineering solutions. He is a Matillion Associate certified engineer with proven skills and expertise in Snowflake, Matillion, and Python, to name a few.

About Boolean Data Systems

Boolean Data Systems is a Snowflake Select Services partner that implements solutions on cloud platforms. We help enterprises make better business decisions with data and solve real-world business analytics and data problems.
