Building data applications is crucial for getting the most value possible out of your data, but DIY approaches can be time-consuming and complex. This post will explain how Fivetran’s data movement platform helps accelerate building data applications in the Snowflake Data Cloud with Streamlit.

I'll set up a Workday HCM to Snowflake connector to move several hundred tables that include candidate, compensation, job, organization and employee information over to Snowflake so that my Workday dataset can be centralized with other critical data, including SAP and Salesforce, to enable the wide range of data workloads that Snowflake supports. I’ll then show you how to create a simple Streamlit talent management application.

It’s worth pointing out that most operational systems and ERP platforms like Workday have their own operational analytics capabilities. For Workday, it has services like Workday Prism Analytics and People Analytics.

But what if you want to combine Workday with other data sources to improve your customer experience? For example, you might combine demographic data from Salesforce, SAP ERP purchasing behavior and Workday employee interactions to identify segments or clusters with similar characteristics and needs.

That's where Fivetran's automated data platform comes in. It allows you to centralize all of your data, modernize your data infrastructure, achieve greater data self-service and democratization, and build differentiating data solutions in the Snowflake Data Cloud with Streamlit.

Below is a high-level architecture for my Snowflake data workload today, which involves building and creating new data solutions. I’ll move a Workday HCM data set to the Snowflake Data Cloud and then quickly build a simple Streamlit talent management application that leverages the new Workday dataset.

End-to-end high-level architecture and data flow: Fivetran, Snowflake and Streamlit

What you'll see today from Fivetran, Snowflake and Streamlit is pure SaaS, fully automated and fully managed. I’m also using Streamlit as a native app within Snowflake. For data movement on the front end, Fivetran is exceptionally fast to set up and configure in just a few minutes.

As a quick note, you can check out an end-to-end video of this solution with Fivetran, Snowflake and Streamlit on YouTube below.

So let's get Workday HCM data flowing into Snowflake and build a Streamlit talent management application.

Adding a new source connector with Fivetran

I have one Snowflake destination in this Fivetran account, but I could have multiple Snowflake destinations depending on my business requirements. Also, I have my choice of cloud providers and regions to select from for each Snowflake destination.

Setting up Snowflake as a Fivetran destination is quick and easy, and I can choose my cloud region

Multiple data sources already flow into the Snowflake Data Cloud, including SAP, SQL Server and Oracle operational databases. I also have Salesforce, GA4, Kafka and S3, and I have Workday HCM connectors.

These are the current sources that Fivetran moving into the Snowflake Data Cloud for my account.

Fivetran offers connectors to over 500 sources, including databases, applications, file systems and event systems, and that number is growing daily.

The talent management application I’ll build in Steamlit will use a different dataset from Workday HCM. I could add schemas and tables to my current Workday connector, but I’d like to show the full end-to-end setup, so I’ll simply set up a new one.

Fivetran supports Workday HCM, but Workday RaaS, Strategic Sourcing and Financial Management can also be set up as sources.

Fivetran supports over 500 data sources to automate data movement into the Snowflake Data Cloud

The Fivetran engineering, product and technical documentation teams do a fantastic job laying out the steps to get data flowing quickly into the Snowflake Data Cloud. I’ve linked it here as well if you’d like to take a look outside of the Fivetran UI: Workday HCM Setup Guide.

Workday HCM Setup Guide in Fivetran docs

Additionally, for each source and destination, documentation includes details on version support, configuration support, feature support, any limitations, a sync overview, schema information, type transformations and much more. Here is the Workday HCM source detail page.

Workday HCM Source Detail in Fivetran docs

Importantly, in the Fivetran UI, any source setup pages are framed on the right by the setup guide in the gray navigation on the right side. It’s the fast path to ensuring you understand the options to connect quickly to any source with Fivetran.

I can name the destination schema anything that I choose. Also, schemas and tables do not have to be created ahead of time in Snowflake. I’m going to use the following:

accelerate_building_data_apps_snowflake_streamlit_fivetran

From there, Fivetran must know how to authenticate to the Workday integration system API, so I’ll provide the username, the Workday tenant, the password and my Workday host identifier.

Fivetran needs those details to authenticate into Workday and will run connection tests to the Workday API, validate the authentication details and ensure that Fivetran can access the Workday API.

Managing source changes and schema drift plus security

Not only will Fivetran move the Workday data set I just selected and do the initial historical sync, but when there are any changes to those Workday tables or columns, CDC is automatically set up for me, 100% no code, so that any incremental changes are captured as well at whatever schedule I prefer.

From a security standpoint, all data in motion is encrypted. Any data at rest that sits ephemerally in the Fivetran service is also encrypted.

Once Fivetran has moved the data and completed its checks, no data is stored in the Fivetran service. Fivetran simply maintains a cursor at the sync point for the next incremental sync to capture changes.

Starting the initial sync from Workday HCM to the Snowflake Data Cloud

That's it, I'm ready to start syncing my Workday HCM dataset to Snowflake. Fivetran provides a highly compelling degree of standardization and predictability for any source to the Snowflake Data Cloud that any data team can appreciate.

That’s it - I’m ready to start syncing my Workday HCM data to the Snowflake Data Cloud

Highlighting Fivetran transformations

Fivetran provides seamless integration with dbt Core and dbt Cloud, including QuickStart data models. These allow you to automatically produce analytics-ready tables using pre-built data models and transform your data with no code, no additional dbt project and no additional third-party tools required.

Transformations include integrated scheduling, and they automatically trigger model runs after completing Fivetran connector syncs in Snowflake.

You can check out the wide range of connectors and packages that support quick-start data models or transformations for dbt here.

Of note, there is a Workday Fivetran Quickstart Transformation that models Workday HCM data based on the Fivetran connector and provides the following ready to use modeled data:

workday__employee_overview – Each record represents an employee with enriched personal information and the positions they hold. This helps measure employee demographic and geographical distribution, overall retention and turnover, and compensation analysis of their employees.
workday__job_overview – Each record represents a job with enriched details on job profiles and job families. This allows users to understand recruitment patterns and details within a job and job groupings.
workday__organization_overview – Each record represents organization, organization roles and positions and workers tied to these organizations. This allows end users to slice organizational data at any grain to better analyze organizational structures.
workday__position_overview – Each record represents a position with enriched data on positions. This allows end users to understand position availabilities, vacancies and cost to optimize hiring efforts.

Choose from Quickstarts, your own dbt Core project, or dbt Cloud for transformations with Fivetran

A quick review of the Fivetran top navigation options

I’ll return to the Workday connector and the Schema tab across the top navigation.

This Workday dataset has 200+ tables covering a range of objects in Workday HCM. My focus for the Streamlit app I will build is the Candidate Experience table.

Fivetran allows me to select all tables and columns in a schema and selectively block what I want to

All initial syncs and the first 14 days of incrementals are free. On the Setup page, I can set the incremental sync frequency. The default for the Workday connector is every six hours, but I can choose from one minute up to 24 hours depending on my data product or downstream data application requirements.

Fivetran SaaS connectors support incremental syncs from every 1 minute up to every 24 hours

Fivetran ERDs are gold

Let’s quickly review the Fivetran Docs. One part of Fivetran documentation I sometimes take for granted is the Entity Relationship Diagrams (ERDs). Fivetran provides these ERDs for most application connectors.

Here you're going to find detailed information and the blueprint for how each entity in Workday HCM, in this case, relates to the others.

Fivetran ERDs are detailed and give you the blueprint for each connector

The Workday connector is now on the list with the others, and I now have access to all those data sources in Snowflake.

The new Workday HCM connector, plus all other connectors moving data (including changes) into the Snowflake Data Cloud

Fivetran status page for the Workday HCM connector (initial sync is complete)

Remember what I named this connector earlier before I move into Snowflake Snowsight to review the new dataset.

accelerate_building_data_apps_snowflake_streamlit_fivetran

The initial sync for Workday HCM is complete.

Snowflake Snowsight

I opened Snowflake Snowsight and selected the Snowflake database set up for this Fivetran destination. I immediately saw the other connectors, the new Workday schema and the tables Fivetran moved over. Fivetran provides a faithful one-to-one representation of the Workday source data to Snowflake, which is “data app ready.”

The data is organized, understandable and ready to be worked with, enriched, transformed and used in a Streamlit application.

The new schema and tables for Workday HCM in Snowsight that Fivetran moved to the Snowflake Data Cloud.

Building a simple talent management data application with Snowflake and Streamlit

Now that Workday HCM is flowing into Snowflake, it’s time to build a Streamlit data app. If you aren't familiar with Streamlit, it's an open-source Python library that makes it easy to create and share custom data apps for machine learning and data science. It’s also available as a native app in Snowflake.

A Streamlit app is a secureable object that adheres to the Snowflake access control framework. Streamlit and Snowflake require a virtual warehouse to run a Streamlit app and perform SQL queries. I’ll walk you through step-by-step how I built my simple Streamlit app.

First, you see, I've got a supply chain efficiency app. The new app will be a simple talent management app that uses the new Workday dataset that I just moved to Snowflake.

Streamlit is available within Snowsight as a native app.

I can assign my database and schema to this Streamlit app. Remember, we're operating inside of Snowflake here. Streamlit gives me some “hello world” starter code when the app is first generated.

Creating a new Streamlit app in Snowflake starts with naming and selecting the app location.

I won’t use much of that today, but it can be helpful to get started. I'm going to start from scratch here. I want to create the foundation for building an interactive Streamlit app that can seamlessly query and display data from Snowflake.

I want to import the Streamlit library and assign the Streamlit alias. This is going to allow me to work with Snowflake connections via Snowpark. This will also store connection details for Snowflake database interactions.


# Streamlit foundation
import streamlit as st
from snowflake.snowpark.context import get_active_session

session = get_active_session()

Also, I will run the application, and take a look at it as I go interactively throughout.

Next, I want to give this application a title and a short description. I’ll add a couple of rocket emojis to the title and just give a simple description of how to use the app.

Adding a title, description and some emojis in Streamlit.


# App name and description
st.title(":rocket: Talent Mgmt App :rocket:")
st.write(
    """Check out this interactive Streamlit talent management app!
    **Built with Streamlit and Snowflake,** empowering better talent management. Just enter a school name, field of study, or both.
    """
)

So let's take a look at it for the first time. I’ll run my Streamlit app and see what I’ve got. So far, so good.

Next, I'd like to have the search on the left side, separated from the results. So I’ll create a min-max section for my search filters in a sidebar, keeping it separate from the results.

I’ll add a search heading and also a couple of search fields. I want to be able to search for the school name and the field of study for this particular application.


# Min-max for search
with st.sidebar:
    st.subheader("Search")
    school_name = st.text_input("Enter school name (e.g. univ):").lower()
    field_of_study_code = st.text_input("Enter field of study (e.g. bus):").upper()

I’ll take another quick peek. So far, so good. I’ve got my title and description, and now my sidebar with the search boxes is ready to go.

Let's go back and add the next section of code for attribution and a way to contact me if somebody would like to. Also, I'd like to add the Streamlit logo.

Adding some “extras” to the Streamlit app.


# Add attribution and contact in the min-max    
    # Horizontal line
    st.markdown("---")
# Create columns for logo placement
    col1, col2, col3 = st.columns([3, 2, 8])
    with col1:
        st.markdown("Made in")
    with col2:
        st.image("https://drive.google.com/uc?export=view&id=1CUBiJcxnqJqS0GwJIWF5NDwmjXf0hzGF", width=40)
    with col3:
        st.markdown("by [@kellykohlleffel](https://www.linkedin.com/in/kellykohlleffel/)")  # Custom link format

All right. There is the attribution and I’ve got my name in there for contact if needed.

It's not perfect, but it's close enough.

All right, let's add our next section of code here. This will be the foundation for my simple talent management app.

It's a dynamically constructed SQL query within Streamlit based on user input. It needs to have flexible and interactive data filtering. The data set is from the candidate education table you saw earlier.

I want to return all columns in that table. I'm also interested in conditionally searching for candidates based on the school they attended, and I want that to be case-insensitive.

I'd also like to be able to search on their field of study and do partial searches with no nulls returned in the search results.

Search (query) logic for the Streamlit app.


# Dynamic SQL query within Streamlit based on user search
if school_name or field_of_study_code:
    sql = f"""
        SELECT *
        FROM CANDIDATE_EDUCATION
        WHERE (LOWER(SCHOOL_NAME) LIKE '%%{school_name}%%' OR school_name IS NULL)
          AND (FIELD_OF_STUDY_CODE LIKE '%%{field_of_study_code}%%' OR field_of_study_code IS NULL)
          AND (SCHOOL_NAME IS NOT NULL AND FIELD_OF_STUDY_CODE IS NOT NULL)  -- Eliminate nulls
    """

Next, I want to instruct the application to execute the SQL query, retrieve the results and store them in a Python data frame object. Since our results will be tabular, that makes the most sense. This will convert the data retrieved from Snowflake by Streamlit into a Pandas data frame.


# Query execution, results retrieval, and results stored in a DataFrame object for a tabular view        data = session.sql(sql).to_pandas()

All right, the last thing is to let the user know if the school or the field of study they enter into the search is outside the data set. If they get a hit on a search, the results are displayed using Streamlit's st.dataframe function to render those query results in the main content area of the app.

Specifying search results display behavior.


# Search and results display behavior    
    if data.empty:
        st.write("No results found for the entered criteria.")
    else:
        st.dataframe(data, width=1000)  # Display dataframe in the main content area

So that's it. I mentioned it was a pretty simple application. Moment of truth now. I’ll run the app and see what I’ve got. Visually, it looks pretty good. I like the way that Streamlit has rendered this.

I’ll do a search now, and, I can search on a partial school name. In this case, “minn”.

I see the University of Minnesota and Minnesota State University. Also, a field of study can be partial. So for finance (fin), for instance.

I could search on business or geology or whatever field of study I wanted for candidates here. So all of that is looking good.

I’ll try another one. Texas A&M is my alma mater. Nope. There is nothing from Texas A&M in that data set, at least no candidates with finance degrees.

One more search. How about Rider University? Got a hit, but not with a finance degree. The candidate from Rider studied business administration.

I’ll just search on the field of study I’m interested in, BUS for business, and see what that gives me.

That search resulted in several school names. We're not getting any nulls in those school names or in that field of study code. I’m pretty happy with this application overall and Streamlit's functionality based on the Workday dataset.

The app also has a min/max on the sidebar and another on the results set. Streamlit makes it extremely easy to configure these applications and adjust them to your desired user experience.

Talent management may be a stretch, but you get the idea.

Streamlit min/max in the left search area and with the search results.

Snowflake Partner Connect with Fivetran makes it easy to connect

If you don't have a Fivetran account right now, you can set up a Fivetran trial account quickly and easily right from Snowflake Partner Connect.

In fact, Snowflake automatically creates the required objects for you, including the Snowflake database, the warehouse, the system user and the system roles that Fivetran requires to connect to Snowflake. It’s very easy, and I encourage you to check it out.

Snowflake Partner Connect is the easy button to connect Fivetran to the Snowflake Data Cloud.

Get started now

Snowflake, Streamlit and Fivetran together. You get high-quality, usable, trusted data into Snowflake, which can immediately be used to build any Streamlit application you can imagine.

Fivetran ensures that data movement to Snowflake is standardized and predictable across any data source. Each fully automated and fully managed pipeline provides reliability, scalability, predictability and security. If you'd like to discuss your requirements for moving Workday HCM or over 500 other data sources simply and reliably into Snowflake, Fivetran offers a 14-day free trial.

It would be great to hear from you on any connectors, data workloads and industry use cases you’d like to see profiled next. Take care!

[CTA_MODULE]

Data insights

Build a data app with Streamlit, Fivetran and Snowflake

April 29, 2024

Kelly Kohlleffel

Senior Global Director, Partner Sales Engineering

Fivetran

Anchor Link

Kelly Kohlleffel

Senior Global Director, Partner Sales Engineering

Topics

Accelerate building data apps in Snowflake and Streamit with Fivetran.

As a quick note, you can check out an end-to-end video of this solution with Fivetran, Snowflake and Streamlit on YouTube below.

So let's get Workday HCM data flowing into Snowflake and build a Streamlit talent management application.

Adding a new source connector with Fivetran

Fivetran offers connectors to over 500 sources, including databases, applications, file systems and event systems, and that number is growing daily.

Fivetran supports Workday HCM, but Workday RaaS, Strategic Sourcing and Financial Management can also be set up as sources.

I can name the destination schema anything that I choose. Also, schemas and tables do not have to be created ahead of time in Snowflake. I’m going to use the following:

accelerate_building_data_apps_snowflake_streamlit_fivetran

From there, Fivetran must know how to authenticate to the Workday integration system API, so I’ll provide the username, the Workday tenant, the password and my Workday host identifier.

Fivetran needs those details to authenticate into Workday and will run connection tests to the Workday API, validate the authentication details and ensure that Fivetran can access the Workday API.

Managing source changes and schema drift plus security

From a security standpoint, all data in motion is encrypted. Any data at rest that sits ephemerally in the Fivetran service is also encrypted.

Starting the initial sync from Workday HCM to the Snowflake Data Cloud

Highlighting Fivetran transformations

Transformations include integrated scheduling, and they automatically trigger model runs after completing Fivetran connector syncs in Snowflake.

You can check out the wide range of connectors and packages that support quick-start data models or transformations for dbt here.

Of note, there is a Workday Fivetran Quickstart Transformation that models Workday HCM data based on the Fivetran connector and provides the following ready to use modeled data:

workday__employee_overview – Each record represents an employee with enriched personal information and the positions they hold. This helps measure employee demographic and geographical distribution, overall retention and turnover, and compensation analysis of their employees.
workday__job_overview – Each record represents a job with enriched details on job profiles and job families. This allows users to understand recruitment patterns and details within a job and job groupings.
workday__organization_overview – Each record represents organization, organization roles and positions and workers tied to these organizations. This allows end users to slice organizational data at any grain to better analyze organizational structures.
workday__position_overview – Each record represents a position with enriched data on positions. This allows end users to understand position availabilities, vacancies and cost to optimize hiring efforts.

A quick review of the Fivetran top navigation options

I’ll return to the Workday connector and the Schema tab across the top navigation.

This Workday dataset has 200+ tables covering a range of objects in Workday HCM. My focus for the Streamlit app I will build is the Candidate Experience table.

Fivetran ERDs are gold

Here you're going to find detailed information and the blueprint for how each entity in Workday HCM, in this case, relates to the others.

The Workday connector is now on the list with the others, and I now have access to all those data sources in Snowflake.

Fivetran status page for the Workday HCM connector (initial sync is complete)

Remember what I named this connector earlier before I move into Snowflake Snowsight to review the new dataset.

accelerate_building_data_apps_snowflake_streamlit_fivetran

Snowflake Snowsight

The data is organized, understandable and ready to be worked with, enriched, transformed and used in a Streamlit application.

Building a simple talent management data application with Snowflake and Streamlit

First, you see, I've got a supply chain efficiency app. The new app will be a simple talent management app that uses the new Workday dataset that I just moved to Snowflake.

I can assign my database and schema to this Streamlit app. Remember, we're operating inside of Snowflake here. Streamlit gives me some “hello world” starter code when the app is first generated.


# Streamlit foundation
import streamlit as st
from snowflake.snowpark.context import get_active_session

session = get_active_session()

Also, I will run the application, and take a look at it as I go interactively throughout.

Next, I want to give this application a title and a short description. I’ll add a couple of rocket emojis to the title and just give a simple description of how to use the app.


# App name and description
st.title(":rocket: Talent Mgmt App :rocket:")
st.write(
    """Check out this interactive Streamlit talent management app!
    **Built with Streamlit and Snowflake,** empowering better talent management. Just enter a school name, field of study, or both.
    """
)

So let's take a look at it for the first time. I’ll run my Streamlit app and see what I’ve got. So far, so good.

Next, I'd like to have the search on the left side, separated from the results. So I’ll create a min-max section for my search filters in a sidebar, keeping it separate from the results.

I’ll add a search heading and also a couple of search fields. I want to be able to search for the school name and the field of study for this particular application.


# Min-max for search
with st.sidebar:
    st.subheader("Search")
    school_name = st.text_input("Enter school name (e.g. univ):").lower()
    field_of_study_code = st.text_input("Enter field of study (e.g. bus):").upper()

I’ll take another quick peek. So far, so good. I’ve got my title and description, and now my sidebar with the search boxes is ready to go.

Let's go back and add the next section of code for attribution and a way to contact me if somebody would like to. Also, I'd like to add the Streamlit logo.


# Add attribution and contact in the min-max    
    # Horizontal line
    st.markdown("---")
# Create columns for logo placement
    col1, col2, col3 = st.columns([3, 2, 8])
    with col1:
        st.markdown("Made in")
    with col2:
        st.image("https://drive.google.com/uc?export=view&id=1CUBiJcxnqJqS0GwJIWF5NDwmjXf0hzGF", width=40)
    with col3:
        st.markdown("by [@kellykohlleffel](https://www.linkedin.com/in/kellykohlleffel/)")  # Custom link format

All right. There is the attribution and I’ve got my name in there for contact if needed.

It's not perfect, but it's close enough.

All right, let's add our next section of code here. This will be the foundation for my simple talent management app.

I want to return all columns in that table. I'm also interested in conditionally searching for candidates based on the school they attended, and I want that to be case-insensitive.

I'd also like to be able to search on their field of study and do partial searches with no nulls returned in the search results.


# Dynamic SQL query within Streamlit based on user search
if school_name or field_of_study_code:
    sql = f"""
        SELECT *
        FROM CANDIDATE_EDUCATION
        WHERE (LOWER(SCHOOL_NAME) LIKE '%%{school_name}%%' OR school_name IS NULL)
          AND (FIELD_OF_STUDY_CODE LIKE '%%{field_of_study_code}%%' OR field_of_study_code IS NULL)
          AND (SCHOOL_NAME IS NOT NULL AND FIELD_OF_STUDY_CODE IS NOT NULL)  -- Eliminate nulls
    """


# Query execution, results retrieval, and results stored in a DataFrame object for a tabular view        data = session.sql(sql).to_pandas()


# Search and results display behavior    
    if data.empty:
        st.write("No results found for the entered criteria.")
    else:
        st.dataframe(data, width=1000)  # Display dataframe in the main content area

I’ll do a search now, and, I can search on a partial school name. In this case, “minn”.

I see the University of Minnesota and Minnesota State University. Also, a field of study can be partial. So for finance (fin), for instance.

I could search on business or geology or whatever field of study I wanted for candidates here. So all of that is looking good.

I’ll try another one. Texas A&M is my alma mater. Nope. There is nothing from Texas A&M in that data set, at least no candidates with finance degrees.

One more search. How about Rider University? Got a hit, but not with a finance degree. The candidate from Rider studied business administration.

I’ll just search on the field of study I’m interested in, BUS for business, and see what that gives me.

The app also has a min/max on the sidebar and another on the results set. Streamlit makes it extremely easy to configure these applications and adjust them to your desired user experience.

Talent management may be a stretch, but you get the idea.

Snowflake Partner Connect with Fivetran makes it easy to connect

If you don't have a Fivetran account right now, you can set up a Fivetran trial account quickly and easily right from Snowflake Partner Connect.

Get started now

Snowflake, Streamlit and Fivetran together. You get high-quality, usable, trusted data into Snowflake, which can immediately be used to build any Streamlit application you can imagine.

It would be great to hear from you on any connectors, data workloads and industry use cases you’d like to see profiled next. Take care!

[CTA_MODULE]

Experience for yourself how Fivetran can accelerate your data app development.

Try Fivetran