Tuesday, October 15, 2019

Deeper into the Tableau Repository Part 5: Cracking the Workgroup Database Wide Open!


This is Part 5 of a series where we dig into the Tableau repository.


WARNING 

This is out of the box, non-standard, and not supported. Proceed at your own risk. Make sure you have a backup. Don't try this at home (more likely work). 

When Tableau 2019.2 came out it has an annoying banner that shows up. I have been wanting a MOTD (Message Of The Day) for a long time so I can post upcoming events, upgrades, etc to all users without spamming people. 



    Like many things this is a case where we get a feature, that isn't quite fleshed out and have no administrative controls (looking at you Ask Data and Extensions) and then in a next release Tableau allows us to control and customize them. Some of the issues I have with the banner:
    • Once it is hidden, it is hidden forever. You can't get it back. 
    • If you have an account on 2 sites you have to hide it for each site.
    • I can't control if it is shown.
    • I can't control what is shown. 
    Based on the user's license role the banner will change. The issue here is we have a Core-Based Server license and 40k users with publishing rights. Since they can publish they get the links to download Tableau Desktop and Prep Builder. This banner means I potentially have thousands of users who are prompted to download software which:
    1. They aren't licensed for
    2. Isn't the version we are using
    Most users will get this even though they don't have a license for Tableau Desktop, and we have zero Prep Builder licenses.

    While I was searching for a solution I came upon this article By Jonathan MacDonald over at the Information Lab that suggests enabling write access to the Workgroup database and just writing to the preferences table so it doesn't show up for users. It doesn't address what to do with new users, so you would actually have to run a scrip to find any new users and turn the banner off for them. 

    Now that I can write to the database what else is possible?

    I have so far avoided the REST API that can probably handle some of this. Some things I have been experimenting with are:

    Canceling Jobs


    Have a long-running extract that you want to cancel? well, there is a table for canceling them called BACKGROUND_JOBS_CANCEL that is empty. Insert a row in there with a background_job_id and it will stop running! I would guess there is a REST API command to populate the table and kill the job too.

    Turn off AskData


    You can do this for each data source, or for the entire server. Turning it off requires multiple clicks per datasource and we have hundreds.  I would really like the default to be off but still, allow datasource owners to enable it. AskData requires some prep to work well. The first thing that comes up when clicking on a datasource in Tableau Server is that it triggers the indexing of the datasource. We have had some performance issues with indexing large datasources. If people actually use AskData that is great but watching the logs most of the time a user will trigger the indexing and then do something else. Rarely do we have a user actually use AskData.

    To implement this I had to get the right permissions on the table by running:
    GRANT ALL ON SEQUENCE public.datasources_id_seq TO readonly;  
    GRANT ALL ON TABLE public.datasources TO readonly; 
    GRANT ALL ON SEQUENCE public.datasource_events_id_seq TO readonly;

    And then this SQL will turn off AskData for every datasource:
    UPDATE datasources 
    SET nlp_setting = 'disabled' 
    where parent_workbook_id is null

    Reschedule Jobs


     With the newer REST API, you can do this programmatically, but for a long time, it wasn't possible. I have been working to set thresholds to demote extracts that aren't used often. Here is an example: A workbook extract that updates every day, but nobody has looked at it in 20 days. Let's move it to a weekly schedule.

    Wednesday, October 9, 2019

    Deeper into the Tableau Repository Part 4: Letting users know about their Tableau Content

    This is Part 4 of a series where we dig into the Tableau repository.

    Our Tableau Server is a shared resource. I understand it is really easy to create a dashboard, generate an extract, and publish it out there. The really hard part is getting people to use a dashboard. This leads to lots of orphaned content.

    We recently have been having issues with a pretty substantial backlog of extract jobs in the morning. Our main EMR(Electronic Medical Record) database doesn't finish updating until 6am and then we have about 300 jobs that get added to the queue between 6am and 10am. With only 4 backgrounders to execute these jobs (plus any system tasks and subscriptions), there can be a bit of a backlog. We were averaging hours of lag time during the morning. I ended up adding a backgrounder process and there is a configuration setting to sort jobs (with the same priority) based on runtime.

    To better let users know what is going on I created a dashboard to allow them to see some stats. I invited all our users to subscribe themselves and I added some of the worst offenders. It contains sections that show on a monthly basis:

    • What content is being used
    • What extracts have failed or suspended
    • What content has not been accessed in the last 6 months (including how big it is)
    • How long do my extracts take compared to other users
    • How much time did my extracts take last month
    • How long did my extracts wait before executing
    • What extracts are being refreshed too frequently based on use

    The last one is a big deal. I found someone who had an hourly extract that hasn't been used in 6 months. I set some thresholds for hourly, daily, weekly, and monthly schedules. My next steps are to automatically demote extracts that are exceeding these thresholds. More on that to come!