Our Tableau Server is a shared resource. It is really easy to create a dashboard, generate an extract, and publish it out there. The hard part is getting people to actually use the dashboard, and that leads to lots of orphaned content.
We have recently been having issues with a pretty substantial backlog of extract jobs in the morning. Our main EMR (Electronic Medical Record) database doesn't finish updating until 6am, and then about 300 jobs get added to the queue between 6am and 10am. With only 4 backgrounders to execute these jobs (plus any system tasks and subscriptions), the queue backs up. We were averaging hours of lag time during the morning. I ended up adding a backgrounder process and using a configuration setting that sorts jobs with the same priority based on runtime.
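To get a feel for why both changes help, here is a minimal simulation sketch. It is not how Tableau schedules jobs internally. It assumes every job is already queued at time 0 (in reality ours trickle in between 6am and 10am), that sorting by runtime means shorter jobs go first, and that the runtimes are made-up numbers. The function name `average_wait` is my own.

```python
import heapq


def average_wait(runtimes, workers, shortest_first=False):
    """Average time a job waits before starting.

    Simplifying assumption: all jobs are queued at time 0 and each
    worker (backgrounder) runs one job at a time.
    """
    queue = sorted(runtimes) if shortest_first else list(runtimes)
    # Each heap entry is the time at which one worker becomes free.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    total_wait = 0.0
    for runtime in queue:
        start = heapq.heappop(free_at)  # earliest available worker
        total_wait += start             # the job waited until 'start'
        heapq.heappush(free_at, start + runtime)
    return total_wait / len(queue)


if __name__ == "__main__":
    # Hypothetical runtimes in minutes: one long extract ahead of short ones.
    jobs = [60, 5, 5, 5, 5, 5]
    print("1 worker, queue order:   ", average_wait(jobs, 1))
    print("1 worker, shortest first:", average_wait(jobs, 1, shortest_first=True))
    print("4 workers:", average_wait([10] * 8, 4))
    print("5 workers:", average_wait([10] * 8, 5))
```

In this toy model, one long extract at the front of the queue makes every short job behind it wait, which is why ordering by runtime can cut the average lag even without adding capacity.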
To let users know what is going on, I created a dashboard where they can see some stats. I invited all our users to subscribe themselves, and I subscribed some of the worst offenders myself. It contains sections that show, on a monthly basis:
- What content is being used
- What extracts have failed or been suspended
- What content has not been accessed in the last 6 months (including how big it is)
- How long do my extracts take compared to other users' extracts
- How much time did my extracts take last month
- How long did my extracts wait before executing
- What extracts are being refreshed too frequently based on use
The last one is a big deal. I found someone who had an hourly extract that hadn't been used in 6 months. I set some thresholds for hourly, daily, weekly, and monthly schedules. My next step is to automatically demote extracts that exceed these thresholds. More on that to come!
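As a rough illustration of the threshold idea, here is a small sketch of one way the check could work. Everything in it is an assumption for the example: the threshold values in `MAX_IDLE_DAYS`, the `Extract` fields, and the `suggested_schedule` helper are hypothetical, and it does not touch Tableau Server at all.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds: the longest a workbook can go unviewed (in days)
# and still justify each refresh schedule.
MAX_IDLE_DAYS = {"hourly": 7, "daily": 30, "weekly": 90, "monthly": 180}

# Ordered from most to least frequent; a demotion moves to the right.
SCHEDULES = ["hourly", "daily", "weekly", "monthly"]


@dataclass
class Extract:
    name: str
    schedule: str
    days_since_last_view: int


def suggested_schedule(extract: Extract) -> Optional[str]:
    """Return the most frequent schedule the extract's usage still justifies.

    Only the current schedule and less frequent ones are considered, so an
    extract is never promoted. None means it is idle past every threshold.
    """
    start = SCHEDULES.index(extract.schedule)
    for schedule in SCHEDULES[start:]:
        if extract.days_since_last_view <= MAX_IDLE_DAYS[schedule]:
            return schedule
    return None


if __name__ == "__main__":
    for extract in [
        Extract("census", "hourly", 3),
        Extract("old_report", "hourly", 45),
        Extract("abandoned", "hourly", 200),
    ]:
        print(extract.name, extract.schedule, "->", suggested_schedule(extract))
```

With these made-up thresholds, an hourly extract last viewed 45 days ago would be suggested for a weekly schedule, and one idle for 200 days would fall past every threshold and be flagged for review.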