This is a hands-on guide to running Qlik Sense in the cloud with Amazon Redshift with Control Tower setup. To provide you with some context, we provide some background information on Amazon Web Services, Control Tower, Redshift, and Qlik. The main part is a step-by-step guide to getting you started.
Amazon Web Services is a collection of web services that collectively make up a cloud computing platform. Compared to buying and building a physical server farm, the three key benefits of Amazon’s cloud platform are:
Ease of use – a platform can be constructed in hours, unlike a physical server which may take days
Flexibility – capacity can be grown or shrunk on demand
Cost matching – the cost of a platform can be easily matched to the benefits gained.
Under the AWS banner, Amazon offers a number of webs services, including:
DynamoDB – NoSQL database
EC2 – cloud-based servers running software
RDS – relational database service
Redshift – data warehouse as a service
S3 – scalable cloud storage
EMR – elastic map reduce(Hadoop as Service)
AWS Control Tower offers the easiest way to set up and govern a new, secure, multi-account AWS environment. It establishes a landing zone that is based on best-practices blueprints, and enables governance using guardrails you can choose from a pre-packaged list. The landing zone is a well-architected, multi-account baseline that follows AWS best practices. Guardrails implement governance rules for security, compliance, and operations.
Quickly setup and configure a new AWS environment
Automate ongoing policy management
View policy-level summaries of your AWS environment
From the Amazon website: “Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools. It is optimized for datasets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, a tenth the cost of most traditional data warehousing solutions.”
Here are some of the benefits of using a Redshift as opposed to physical hardware:
Data Warehouse as a Service (DaaS) - no physical hardware needed and a pay-as-you-go model.
Fast, effective and low-cost data warehouse – columnar database built for analytical workloads.
Easy to use – one click deployment, easy to back up, easy to manage
Scalability – allows resizing and clustering
Fully managed – hardware and software upgrades are all managed by AWS.
Since 2011, Qlik and AWS have been partners, providing cloud-based business intelligence using cloud-based data. Qlik works well as a web service and we have a large number of cloud-based customers and partners. Some of our partners base entire business lines around Qlik in the cloud, including CONTEXT, IPC Global, IFR Monitoring, Rosslyn Analytics, and SynerTrade.
Amazon Redshift was release in 2013, adding the flexibility of a massively scalable cloud-based data warehouse to Qlik’s data analysis capabilities
The diagram below gives an overview of how Qlik works with Amazon’s web services.
Amazon released Redshift in 2013, adding the flexibility of a massively scalable cloud-based database to Qlik’s data analysis capabilities.
Why use Qlik and AWS Redshift together?
Redshift is certified for Qlik 11 SR5.
Qlik Server is certified on AWS EC2 servers.
Redshift is one of Qlik’s preferred Big Data Platforms for Direct Discovery (in-database processing)
Qlik 11 SR5 has been tested by extracting 100 million rows into Qlik’s associative in –memory data store in the cloud
Using Qlik’s Direct Discovery platform with data sourced from Redshift, Qlik 11 SR5 has been tested with 1 billion rows of data
Qlik has shown consistent performance in running inside AWS Environment.
Many System Integrator Partners are experienced in deploying Qlik and AWS.
We recommend the currently certified Redshift/Qlik platform, namely Qlik Server and Publisher running inside an Amazon EC2 instance as shown in the figure below.
We do not recommend Qlik running on premises with Redshift. This is because of bandwidth variability issues that can degrade performance. We recommend using AWS Direct Connect when using Qlik running on premises with Redshift. AWS Direct Connect provides dedicated bandwidth that removes variability and ensures a good end user experience.
Qlik accesses data through the Redshift leader node via ODBC data connectors (see the figure below).
Due to distribution of data happening inside AWS Platform; Qlik recommend users follow Redshift best practices and place organize data that will be optimal for query performance.
This lab requires an account with Administrator Privileges and Control Tower setup.
Record your AWS SSO
AWSReservedSSO_AWSServiceCatalogEndUserAccess role name. In the next steps we will grant access for this role to launch Qlik products from service catalog.
a. Login to your Control Tower
Management with Administrator role
b. Navigate to IAM console
c. Select Roles
d. Search for role with name started with
e. Click on the Role and record the Role ARN, you will need it for the next step.
Deploy the Service Catalog porfolio into Control Tower.
a. Login to your Control Tower
Management with Administrator role
b. Navigate to CloudFormation console
c. Choose the Create stack button
d. Enter the template URL: https://qlik-aws-public.s3.amazonaws.com/sc-portfolio-mpdemoQlik.json
e. Enter the Stack name
f. For parameter
LinkedRole1 enter the Role ARN from previous step
g. Leave the remaining parameters as default
h. Click Next twice
i. Enable checkbox for I acknowledge that AWS CloudFormation might create IAM resources with custom names. and I acknowledge that AWS CloudFormation might require the following capability: CAPABILITY_AUTO_EXPAND
j. Click Create Stack
Login to Control Tower Management as Service Catalog End User
a. The stack you deployed on the previous step will create Qlik service catalog porfolio that is shared to AWS SSO group
AWSServiceCatalogEndUserAccess. To access this service catalog you need to re-login via AWS SSO using the right role.
b. From your AWS SSO URL, and login to your Management account using AWSServiceCatalogEndUserAccess role.
Deploying your Service Catalog Product into Control Tower.
a. Login to your Service Catalog and You should see something similar to this.
b. Create a Qlik Instance and Amazon Redshift Cluster from AWS Service Catalog. Select the product Qlik Sense Enterprise Instance with Redshift.
c. Click Launch Product to Launch the Demo Qlik Sense and Redshift Instance to your Control Tower Instance.
d. Enter Name as Identifier.
e. Set Primary Availability Zone to correct availability zone.
f. Set Inbound IPv4 CIDR to your external IP with /32 using www.whatismyip.com
g. Set Master Password to
h. Click Next.
i. Review and click Next to deploy.
j. Once completed, AWS Service Catalog will show the following:
Please note the Value will open up CloudFormation window for you to track the progress
k. If you go back to Service Catalog and Look at Provisioned product list. You see the status as Available and all needed information for this product will be listed.
Connecting to Qlik Sense Server and Conncting Redshift to Clik
Due to issues with certificates we will not be configuring Qlik Sense be accessible from other browser. For purposes of this lab we will log into the server and make a connection to AWS Redshift.
a. Open RDP Tool Connect to Qlik Sense Server by using user:
qlik and password
Note: For this workshop all of Qlik Sense Servers were imaged with above credentials and loaded partner licenses
Note: please click Continue if you get challenged by certificates
b. Once connected open Qlik Sense Hub.
c. Qlik Sense will launch Firefox and auto authenticate with saved credentials. Once in the hub double click and open “AWS Redshift” Icon inside Qlik Sense The Firefox browser has credentials qlik/ Welc0me$ko2020 pre-loaded.
d. Now We will connect Redshift to Qlik Sense Application. First Create ODBC connection to Redshift. Find ODBC Data Sources (64-bit) in Windows
e. Add Redshift ODBC Driver to User/System DSN
f. Add Credentials from Step 1k
g. For Redshift Database “qlik_aws_sko” is the database name
h. Validate Connection
i. Now back to hub Click on “Add data from files and other sources”. Refer to Step 2c.
j. Click on ODBC tile from Qlik Sense Connectors and select the ODBC Connection created. Select created DSN and Enter credentials (qlik/ Welc0me$ko2020).
k. At this point you have your Qlik Sense Server connected Redshift Instance.
(Optional) Loading Sample Data to Redshift and Visualizing in Qlik
a. For Final Part of the Work we will attempt to load a few sample rows from NYC Taxi and Limousine Commission Data. The original data can be found here https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page. For purpose of this lab the Data has been loaded to a public bucket folder s3://qlik-aws-public/taxi_sample
b. Login to Redshift and Open Query Editor.
c. Run the Following Code. Create Trips Table.
CREATE TABLE trips ( vendorid VARCHAR(10) ENCODE LZO, tpep_pickup_datetime TIMESTAMP NOT NULL, tpep_dropoff_datetime TIMESTAMP NOT NULL ENCODE LZO, passenger_count DECIMAL(6,0) , trip_distance DECIMAL(6,3) DEFAULT '0.0' ENCODE MOSTLY8, PULocationID VARCHAR(2) , DOLocationID VARCHAR(2) , rate_code_id SMALLINT NOT NULL ENCODE LZO, store_and_fwd_flag VARCHAR(5), payment_type VARCHAR(3), fare_amount DECIMAL(6,2) DEFAULT '0.0' ENCODE MOSTLY8, extra DECIMAL(6,2) DEFAULT '0.0' ENCODE MOSTLY8, mta_tax DECIMAL(6,2) DEFAULT '0.0' ENCODE MOSTLY8, tip_amount DECIMAL(6,2) DEFAULT '0.0' ENCODE MOSTLY8, tolls_amount DECIMAL(6,2) DEFAULT '0.0' ENCODE MOSTLY8, improvement_surcharge DECIMAL(6,2) DEFAULT '0.0' ENCODE MOSTLY8, total_amount DECIMAL(6,2) DEFAULT '0.0' ENCODE MOSTLY8, congestion_surcharge DECIMAL(6,2) DEFAULT '0.0' ENCODE MOSTLY8 ) sortkey(tpep_dropoff_datetime)
d. Copy Files into Trips Tables.
copy trips from 's3://qlik-aws-public/taxi_sample/xaa' credentials 'aws_access_key_id=MY_ACCESS ;aws_secret_access_key=MYSECRET delimiter ',' region 'us-east-1' IGNOREHEADER as 1; copy trips from 's3://qlik-aws-public/taxi_sample/xab' credentials credentials 'aws_access_key_id=MY_ACCESS ;aws_secret_access_key=MYSECRET delimiter ',' region 'us-east-1'
e. Validate Trips Data in Redshift Query window. At this point you should have 200K lines in Redshift and you can validate.
f. Use Qlik Sense to Connect data. Click on created ODBC Connection inside Redshift.
g. If you set the owner to public and check the checkmark next to trips you should see your data from Redshift.
h. Press the green “Add data” Button and you should see Qlik connecting to Redshift and caching and aggregating data (Approximately 24 seconds).
i. At this Point Qlik Indexing and cognitive engine has inspected the data and metadata. Click on “Generate Insight” for Qlik to give suggestion on insights found on data.
j. Qlik will Display suggested insights and you can press “Add to Sheet” to create you first Qlik Dashboard.
k. Click on My new sheet and Explore by Clicking on different charts and fields interacting with Redshift.
In This Tutorial you have seen Service Catalog, Qlik, Redshift. Please reach out to your Qlik AWS Partner Team for more information.
Copyright 2020, Amazon Web Services, All Rights Reserved.