Gain 3 Benefits with Delta Sharing

May 26, 2021 05:00 PM (PT)

Following Matei Zaharia’s keynote presentation, join this session for the nitty-gritty details. Tableau is joining forces with Databricks and the Delta Lake open source community to announce Delta Sharing and the new open Delta Sharing protocol for secure data sharing. For Tableau customers, Delta Sharing simplifies and enriches data, while supporting the development of a data culture. Join this session to see a live demo of Tableau on Delta Sharing. Tableau customers can choose between two workflows for connection. The first workflow is called “Direct Connect,” which leverages a Tableau Web Data Connector (WDC). The second workflow involves using a hybrid approach for querying live on the Delta Sharing protocol and using the Tableau Hyper in-memory data engine for fast data ingestion and analytical query processing.


In this session watch:
Razi Sharir, VP of Product Management, Data, Tableau
Blair Hutchinson, Product Manager, Tableau



Razi Sharir: Hello everyone. My name is Razi Sharir. I’m the VP of Product Management for Data at Tableau. I am joined here by Blair Hutchinson, the product manager on the technology partners arm for Tableau. And I’m here to talk with you about what you can gain from a [vanity] point of view by using Delta Sharing, and you don’t have to wait for the annual presentation to see what Delta Sharing actually means. Blair, do you want to step in and introduce yourself?

Blair Hutchinson: Yes, thanks. As Razi said, I work on our technology partner team, which means I work closely with our strategic partners like Databricks. I’m really excited to be here today as a launch partner for Delta Sharing, as this addresses a really important need for our customers around data access.

Razi Sharir: All right. So let’s dive right in. So if you read the article by Gartner, they’re claiming that by 2023, organizations that promote data sharing will outperform their peers on most business value metrics. And if you read the article carefully, this comes as no surprise. It’s actually the other side of the house, where the traditional “don’t share data unless” mindset has outlived its original purpose. So, I guess what I’m saying here is that sharing is the center of gravity for data moving forward, and this is also the center of gravity for this discussion. And let me walk you through the agenda and what we’re going to cover today around this sharing concept. So, it all starts with data. I’m guessing that many of you have heard the new cliché that data is the new oil.
And quite honestly, when I talk to many customers and many of our partners, this is actually a reality. So, we will talk about what it all starts with and where the data starts the whole process of transformation and digital transformation, which is based on Delta Sharing. And then we’ll go into what you can do if you get more data. Actually, that gets more exciting. It’s not only about what it starts with, but rather where you can go with it. And we’ll wrap it up with Delta Sharing and Tableau coming together, giving you a hint of what we’re going to reveal later on in this presentation. So stay tuned for the next few slides, where I share with you what Delta Sharing is all about. So guys, for those of you who have known Tableau, and I’m guessing many of you have known Tableau, the way we like to paint ourselves is that we’ve been on a mission.
We help people see and understand data. And this means more to us, and to everybody else, than ever, because data gives people a superpower, and people that tap into that superpower can do incredible things. So, it all revolves around data, but the data alone is mandatory, not sufficient. It’s about the people that see it and understand it and what they make out of it. So with that idea of seeing and understanding data, let’s move on to the next step and see how it translates into what you can gain from sharing it. So let’s see some facts and some stats that I think will guide us through this discussion. So data people are a secret weapon in creating success. Studies show us that organizations that better harness their data drive better outcomes.
So they get a better, more complete picture. They make better decisions, faster, and more importantly than that, they empower everyone. So if you look at some of the numbers, let’s pick on them one by one. 23x more likely to add customers. That’s impressive, and I guess you agree with me on that. Almost one and a half times more likely to grow revenue by more than 10%, that’s super impressive. And last but not least, 9x more likely to retain customers. So putting all of these metrics together gives you a sense of the merit, of why you should actually follow a data-driven culture, if you will, and move toward an organization that performs better based on its data visualization and data-driven culture. So we’ve all seen the metrics and they look impressive, but for every con there is a pro, and for every pro there is a con.
So let me share with you a few more details that may show a slightly different picture. The truth is that organizations, for the most part, are unfortunately falling short. Recent surveys show that less than 10% of companies actually [distinguish themselves] as being more data-driven. And if you look beyond that, it means that most of us are falling behind. That’s a fairly straightforward claim. That was a competitive problem before: before the pandemic, before the worldwide economic downturn, before the change in how we work and in what our customers expect from us. Today, becoming a data-driven culture, a data-driven company, and a data-driven organization is not just a matter of being able to deliver a competitive edge, it’s a matter of survival. So it’s not something that you can just overlook. It’s something that you have to address.
And the earlier the better. At Tableau, we’ve been consumed with understanding what makes a difference for those who are actually finding success. And it’s much deeper than deploying technology. We go beyond the technology. It’s more fundamentally about how you do things rather than what you’re actually doing.
That’s why we call it building a data culture, which is the center of everything that we do: driving people to see and understand data, and driving the uptick of organizational change in data discovery, data utilization, and data transformation. It all centers around data and what you do with it in the context of culture and adoption. As we said in the beginning, it all starts with the data. So we need to deploy the platform and we need to allow people to get access to the data. And last but not least, once they are done with it and they’re happy with the results, they want to collaborate and share it with their peers and supervisors, and with others who are actually consuming the dashboards, outcomes, findings, and insights, and to interact with each other.
And we’ve worked with many organizations of every size and every industry, technically and practically across the globe. I mean, the slide in front of you shows you so many names. It gives you a sense of the diversity of the organizations and customers that we work with. So that, I’m guessing, comes as no surprise. Our customers are leaders in the industry. They use Tableau to change the way people interact with data. And in fact, what we do at Tableau is help them turn insights into action, cut down on the analysis time, and change behaviors to help everyone be more data-driven across the business. It’s all about the data and what you do with it. So I walked you through the offering, our customers, and the customer value of using Tableau. But I think it gets even better when Tableau partners with its strategic partners, in this case Databricks.
And we’re super excited and glad to share that we are a strategic partner of Databricks. As you can see here in the picture, we have Francois, who’s our CPO, shaking hands with Michael Hoff, who runs the partner organization. So it’s not only about Tableau, it’s about Tableau plus Databricks bringing in one plus one greater than two: significantly bigger, larger, and more [aggressive] value to the addressable market, which I [inaudible]. So let me tell you a little bit more about that, as we move into the actual cooperation between the two companies. The strategic partnership goes across all layers, from the products to the engineering to the sales and field, all across the board. And when it comes to the actual implementation, it basically boils down to how we connect the ends and how we make these two offerings come together, and show what I already alluded to as one plus one greater than two. And we already said upfront, it all starts with the data.
So let’s dive right into the data. In 2019, we introduced the first connector from Tableau into Databricks, and we showed a 12x improvement in the initial connection speed. There was about a 30% improvement in SQL generation. There was also a simplified connection experience and improved error handling. So we kind of hit the road with something that was already ahead of the curve and very promising for us, as things progressed over time.
In 2020, we took it one step further and we reduced the latency from a connectivity point of view. We were able to demonstrate significantly higher throughput, and we even did some more fine-tuning on the actual connectivity and the connection between the two ends. We updated the wire protocol. We did a bunch of stuff [inaudible] in February. And we’ll be releasing, in the next quarter, even faster and lower-latency connectivity across the two technologies, or the two stacks, I should say. In 2021, this year, we are continuing to co-develop this joint technology and joint go-to-market agenda, and continuing to make improvements based on SSO and performance from the ODBC driver, and to take advantage of SQL Analytics.
So as you can see, that integration is as deep as it gets, and we mean business. We want to make sure that the Databricks engine and the Tableau engine together are taken to the max. And we can definitely demonstrate how these things come together and provide so much more value to our customers, which is a perfect segue to the next slide. And let me show you some joint customers. There we go, so the names are right in front of you. And let me touch on two of these particular names, Flipp and CVS Health. Flipp is a retail technology company that works with the largest retailers and brands, and it enables everyone to access and analyze their entire data. So the business analysts and the sales teams can show their partners and customers what they are processing and how they’re processing it.
It’s the data science team that can build powerful, predictive analytics. So that’s a killer for them. And last but not least, the engineering team can now create product features that rely on the joint analytics offering. So that’s a really exciting use case that we take pride in at Flipp, and we’re really looking forward to the next steps they’re going to take. CVS Health, we all know CVS, a common drug store that you all know. It has about 80 million customers passing through the pharmacies every day. It’s a giant, and what are they doing? They’re using Tableau and Databricks to provide meaningful interactions that put customers on the path to much better health. In other words, what does it mean? It’s a collaborative environment. The teams are able to work together and provide much better service to their customers, not our customers.
So it’s about the data engineers building faster data pipelines, which means they can provide faster, more accurate, and more relevant answers. It’s about the data scientists training machine learning models, and doing so very efficiently, so they continuously learn and improve the experience that they’re pushing to their customers. And last but not least, there are the very typical and traditional data analysts, visualizing the financial and operational metrics so that they continue dashboarding, monitoring, and improving on their own KPIs using the two platforms. So that’s a very typical, classical case of a retailer using these two powerful engines from Tableau and Databricks together, pushing their own business to the next step.

Blair Hutchinson: So as Razi just shared, we have a number of customers that are taking advantage of Tableau and Databricks today. We have a proven history with all of those customers. And a common theme is that all of them are thinking about what they can do with more data. They want more data in order to make more precise and accurate decisions with the data that they already have. And the industry has really been looking for a solution to this data access problem that we face currently. So what can we do to solve that? That’s what we’re really excited to talk about today, which is Delta Sharing. So let’s talk about Delta Sharing at a high level and why Tableau is so excited to be a launch partner for this project. Delta Sharing is the industry’s first open protocol for secure data sharing. And it makes it simple to share data with other organizations, regardless of which computing platform they’re using.
And this is so exciting because Tableau and Databricks have been on a mission to democratize data. We hear from our customers, our partners, and our suppliers that they need what Delta Sharing is able to provide. And it’s such a critical link, I think, in the whole data ecosystem that our customers are living in. So we’re tremendously excited about the opportunity that this provides everyone that can take advantage of this new protocol. I should also mention that it’s an open source project that’s a part of the Delta Lake project, and any platform that supports Delta Sharing will be able to easily and securely share data with other platforms via the open protocol. So on the right-hand side, you have your data providers and data sharers. They make their data accessible via a Delta Sharing server. You notice on the right-hand side in big orange letters, it says, “Data does not move.”
And that’s the case. It gets updated and refreshed by the data providers and then simply gets passed along via the Delta Sharing server. You can see on the bottom there that examples of where these tables could live include GCP, Azure, and AWS. And then you have your consumers that are able to see the metadata that’s provided via the Delta Sharing server in tools like Tableau. So now let’s look at what that looks like in action. How does someone go from getting access to a Delta Sharing table to being able to visualize and explore that dataset in Tableau? There are two workflows that we see Tableau users using to take advantage of the Delta Sharing protocol. The first is what I’m calling direct connect. You’re within the Tableau UI and you connect directly to that dataset, how you normally would in Tableau.
The second is a little bit different and I’ll save that until after the demo. So let’s move left to right in this flow here. The data sharer shares that dataset on the Delta Sharing server. Try saying that 10 times fast. Then the Tableau user connects to the Delta Sharing server via a web data connector, which is what WDC stands for. So the data sharer makes their data available on the sharing server, and then sends a URL or an activation link to the Tableau user. That’s a one-time-use activation link that they can share. The Tableau user downloads those credentials from the activation link and pastes them in the web data connector, which you’ll see here in a second. There’s an underlying handshake between the user and the sharing server that happens via the secure Delta protocol.
And then the Tableau web data connector has the signed URL, which gives them access to that dataset so that they can drag and drop and explore that dataset however they’d like. So, without further ado, let’s look at this interaction. Let me quickly transition over to my desktop. So I’m going to start with the activation link. This was sent to me by one of the data sharers. And what I’ll do is download that credential file, which saves a JSON file locally to my computer. And as I’m hovering over that, you can see I can’t download the credentials again. So next I’ll jump over to Tableau Desktop, and I’ll put in my URL to use the web data connector. And I’ll add those credentials that I just saved locally on my computer. So once I’ve done that, you can see that I’ve got access to that dataset.
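The credential file from the demo can be pictured with a short sketch. It assumes the profile format described by the open Delta Sharing protocol (a JSON object with `shareCredentialsVersion`, `endpoint`, and `bearerToken`) and the protocol’s list-shares call; the helper names, the example endpoint, and the token are hypothetical, and this is not the code behind the Tableau web data connector. The request is only built, never sent.

```python
import json
import urllib.request

# Field names assumed from the Delta Sharing profile file format.
REQUIRED_FIELDS = ("shareCredentialsVersion", "endpoint", "bearerToken")


def parse_profile(text):
    """Parse the JSON credential (profile) text and check its required fields."""
    profile = json.loads(text)
    missing = [name for name in REQUIRED_FIELDS if name not in profile]
    if missing:
        raise ValueError(f"profile is missing fields: {missing}")
    return profile


def list_shares_request(profile):
    """Build (but do not send) the GET request that lists the shares
    visible to the holder of this profile."""
    url = profile["endpoint"].rstrip("/") + "/shares"
    headers = {"Authorization": f"Bearer {profile['bearerToken']}"}
    return urllib.request.Request(url, headers=headers, method="GET")


# Hypothetical profile contents; a real file comes from the activation link.
EXAMPLE_PROFILE = """{
  "shareCredentialsVersion": 1,
  "endpoint": "https://sharing.example.com/delta-sharing/",
  "bearerToken": "example-token"
}"""

if __name__ == "__main__":
    request = list_shares_request(parse_profile(EXAMPLE_PROFILE))
    print(request.get_method(), request.full_url)
```

The bearer token is the only secret in the file, which is one reason a one-time-use activation link makes sense: once the file has been downloaded, whoever holds it can read the shared tables.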
And now I can drag in the tables that I’m looking to visualize, in this case just the vaccines table. So when I update this, this is now pulling in the data that the data sharer or data provider has given me access to. And in this case, this is some global health information that someone has shared with me, which I’m looking to use to explore some COVID responses based on the number of vaccines that have been doled out for each country. At the top right you see that this is an extract, but anytime that I refresh this connection, I’ll be able to pull in brand new data that the data sharer has loaded to that particular dataset. What I’m interested in is looking at the stringency index across time. If you’re not familiar with that particular measure, it’s a composite measure based on nine response indicators, including school closures, workplace closures, and travel bans. And it scales from a value of zero to 100.
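To make the idea of a composite measure on a 0 to 100 scale concrete, here is a simplified sketch. It assumes each of the nine indicators is a score with a known maximum, rescales each one to 0 to 100, and takes an equal-weight average. Only the first three indicator names come from the talk; the remaining names, all maxima, and all scores are made up for illustration, and the published methodology for the real index may differ in detail.

```python
def composite_index(scores, maxima):
    """Equal-weight average of indicator scores, each rescaled to 0-100."""
    if set(scores) != set(maxima):
        raise ValueError("scores and maxima must cover the same indicators")
    rescaled = [100 * scores[name] / maxima[name] for name in scores]
    return sum(rescaled) / len(rescaled)


# Hypothetical maximum score per indicator (nine indicators in total).
MAXIMA = {
    "school_closures": 3,
    "workplace_closures": 3,
    "travel_bans": 4,
    "indicator_4": 2,
    "indicator_5": 2,
    "indicator_6": 2,
    "indicator_7": 2,
    "indicator_8": 2,
    "indicator_9": 2,
}

# Hypothetical scores for one country on one day.
EXAMPLE_SCORES = {
    "school_closures": 3,
    "workplace_closures": 2,
    "travel_bans": 4,
    "indicator_4": 1,
    "indicator_5": 2,
    "indicator_6": 0,
    "indicator_7": 1,
    "indicator_8": 2,
    "indicator_9": 1,
}

if __name__ == "__main__":
    print(round(composite_index(EXAMPLE_SCORES, MAXIMA), 1))
```

A country with every indicator at its maximum scores 100 and one with no measures in place scores 0.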
So the higher the value you have, the more strict you are. So now we’re looking at all countries over time, and it was highest around April 12th. What I’m going to do is drag over just a subset of European countries. We’ll look at the UK, France, and Spain here, and I’ll just drag that onto color, so that we can see each of those broken out, and see that everyone had a pretty high response in March and April of 2020. And then there were some varying degrees of stringency applied across each country. So I’ve taken the liberty of creating a dashboard that includes that stringency index view that we just created. And what I’ll do here is expand the date range, so we can see a little bit more clearly how things trended over time. What I’ve called out here is that the UK had a really high stringency index compared to the rest, and then they have that really strong drop-off.
So this is really just an example of creating a dashboard that’s using data that’s provided by Delta Sharing, and you can see just how easy it is to connect to that data and start getting meaningful insights from it. All right, so let’s transition back into slides. I mentioned at the beginning that there were two ways that Tableau customers can take advantage of Delta Sharing and Tableau. The first was that direct [inaudible], which you just saw in action a second ago. And the second is using Delta Sharing plus Tableau Hyper. If you aren’t already aware, Hyper is our in-memory data engine for fast data ingestion and analytical query processing. So it just makes your dashboards and your data exploration go really, really quickly. And we see our customers using a hybrid approach of querying things live and using Hyper. Hyper comes with an API, which allows you to create Hyper files and then publish those to Tableau Server or Tableau Online, to make those available for people to access.
And so we imagine that customers can take advantage of Delta Sharing and Hyper just as they would to transform data in any other process. So for example, that could be using a Databricks notebook to bring in multiple different datasets. You’re using the Delta Sharing server to really augment data that you’re already collecting. And then you bring that information together, put it into a Hyper file, and then make that available and publish it to Tableau so anyone can end up using it. So those are the two different ways that we think customers can take advantage of Delta Sharing and Tableau. And we’re really excited for you all to go try this out and let us know how it goes. So in summary, we’ve boiled it down into three major benefits for Tableau users when using Delta Sharing. The first one is that it simplifies data access.
So Tableau users are really easily able to connect to and explore data that’s made available by providers. And most importantly, it’s done securely, because whenever we’re talking about data, it needs to be secure. You saw in the demo that we showed how easy it is to connect and start exploring your datasets. The second is that it allows for data enrichment, and your data then becomes so much more useful if you’re able to take data that you’re already looking at and augment it with data that’s being made available to you: financial data, corporate data, consumer data, geographical data, and the list really goes on. I think that time will tell what this project really enables. So I’m really excited to see what that ends up looking like. And the third, which really is made possible by what the first two allow you to do, is creating that data culture within an organization.
And this is what Tableau and Databricks are on a mission to create: behaviors and beliefs of people who value, practice, and encourage the use of data to improve decision-making throughout the organization. So the calls to action that we have listed out are the following. The first is to feel free to connect with us. We’d love to connect one-on-one with you, to go into a little bit of a deeper dive into the use case that you think this would bring about. And the second is that I’m going to make the demo and the reference architecture for the web data connector, as well as some sample scripts to get started in using the Hyper API plus the Delta Sharing protocol, available on our community pages. So download and comment on that information. I will be an active viewer of that, and I’m really curious to see what the conversation starts [inaudible] up like. So thank you so much for joining the session. Razi, I’ll pass it back over to you.
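The hybrid workflow described in this part of the session (pull shared data, combine it with data you already collect, then write a Hyper file and publish it) can be sketched in outline. This is a minimal stand-in using only the Python standard library, not one of the sample scripts mentioned in the talk. It shows just the enrichment step, as a left join on a shared key; the column names and rows are hypothetical, and the final step of writing a Hyper file with the Hyper API is left as a comment rather than guessed at.

```python
def enrich(local_rows, shared_rows, key):
    """Left-join local rows with shared rows on `key`.

    Every local row is kept. Columns from the matching shared row are added,
    and local values win if both sides define the same column. If several
    shared rows have the same key, the last one is used.
    """
    shared_by_key = {row[key]: row for row in shared_rows}
    enriched = []
    for row in local_rows:
        match = shared_by_key.get(row[key], {})
        enriched.append({**match, **row})
    return enriched


# Hypothetical data: rows you already collect, plus rows received via a share.
LOCAL_ROWS = [
    {"country": "France", "orders": 120},
    {"country": "Spain", "orders": 95},
    {"country": "Norway", "orders": 40},
]
SHARED_ROWS = [
    {"country": "France", "stringency_index": 70.0},
    {"country": "Spain", "stringency_index": 65.0},
]

if __name__ == "__main__":
    for row in enrich(LOCAL_ROWS, SHARED_ROWS, key="country"):
        print(row)
    # Next step (not shown): write the enriched rows to a Hyper file with the
    # Hyper API and publish it to Tableau Server or Tableau Online.
```

A left join keeps every row of the data you already own, so a missing match in the shared dataset shows up as an absent column rather than a dropped record.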

Razi Sharir: So guys, thank you so much for staying through the entire session. I want to reiterate and encourage you to go to the Tableau community site and see how this whole thing may fit your own goals, your own data needs, and your own data sharing journey as you go forward in your own organizations. I really, really encourage and welcome you to come and be part of the community, and to contribute, read, and learn. And with this opportunity in mind, I would like to thank you again, and I hope we get to see you shortly at another data summit or another session. All the best, and stay safe and healthy. Thank you.

Razi Sharir

Razi Sharir joined Tableau as the VP of Product Management in 2020. Prior to Tableau, he was the VP of PM at Informatica, leading Data Management on-prem and cloud. Prior to Informatica, Razi was th...

Blair Hutchinson

Blair is a Product Manager at Tableau and works with strategic technology partners, like Databricks, to help delight customers looking to get the most out of platforms. Blair’s been at Tableau for a...