OK so creating an extract of your billion row dataset is now faster in Hyper, but without the ability to choose a normalised extract yet in the interface, applying security to your data can be cumbersome.
In order to enforce security many people use an inner join from the security table on the left to the data table on the right, and in the process explode their data. They could use a live connection of course so this doesn’t happen but many people want to use the Tableau data engine.
If you have even a few hundred users in your security Table your data could grow to proportions you don’t want to think about if those users overlap records
TL;DR – Take your secured data and add a connection from Salesforce before hitting extract. This will force Tableau to make a normalised database, making extraction faster and smaller.
So what can we do? Well we could play with Tableau 18.3 beta and do this with a tick box to choose multi table extract…. But what could we do today? Actually what could we have done since Tableau 10.0 was released?
Well both good old TDE and Hyper have actually supported multiple tables since day one. However the interface in Tableau just hasn’t been made yet to allow this. So what can we do?
Get a free Salesforce developer account of course! No really.
You see when you connect to Salesforce in Tableau it knows that denormalising from their huge array of tables would be a bad thing, so they don’t do it.
So how does this help me with all the data in my corporate Oracle/SQL/SAP database?
Follow the below and we will create an extract that is normalised instead.
1. Create your data source exactly as you want and make your security table joins as needed
2. Now press ‘Add‘ and choose Salesforce
3. HEED THIS WARNING
Not viewing data in the data grid and live connections are no big deal in this situation but be warned if you are using extract filters to limit the data or need to take incremental updates. This is no longer possible so either fall back on something else like Custom SQL to limit data or define a view first.
4. Log into Salesforce. If you already have an account from your company, that’s great, use that. If not take 2 minutes to get a free developer account here: https://developer.salesforce.com/signup
5. You will now see lots of tables from Salesforce on the left. You do not want to use anything from here at all. Its all for show
Just click on a sheet and choose where you want to save your extract as normal.
Lets be clear. None of your data is being shared with a cloud vendor and you are getting nothing from a cloud datasource. This is purely to force Tableau in the background to make a multi table extract.
6. To check that this has worked you can find out in two ways. Firstly if this took hours before it probably just took minutes instead as the database didn’t create an explosive query. Secondly if you drag the .Hyper/TDE file into Tableau as a new data source you should see multiple tables listed in your single hyper/tde file.
So with this new knowledge go forth and don’t multiple your data!