Tuesday, July 19, 2022

Tableau, Snowflake, and AI are changing the way data is used: creating nimble dashboards from 150 million sales records. Comment: Zoom offers a service for reserving an interpreter, which governments around the world could use to discuss domestic and foreign policy issues based on such big data, or which governments and trading companies could use for business negotiations. A Discord chat with participants at home and abroad is also a good idea.



BI online report: Using Tableau and Snowflake to change the way we use data - Creating nimble dashboards from 150 million sales records

On October 28, 2020, ZEAL Corporation hosted "[Webinar] Changing the Common Sense of Data Utilization with Tableau and Snowflake: Creating a Lightweight Dashboard from 150 Million Sales Records." This is a report on the contents of the seminar.


Introduction

In order to analyze and utilize data freely, analysis tools that are easy for users to use are necessary. However, that alone is not enough: the data itself must be prepared in a place free from various constraints. This seminar introduces two products that remove those constraints: "Tableau," a BI tool that allows anyone to easily analyze any data, and "Snowflake," a cloud-based data warehouse (DWH) service. ZEAL, which has a proven track record of implementing both for its customers, will introduce the benefits of this combination.


Table of Contents

Part 1: Demonstration of the challenges of data utilization to date and future data utilization scenarios

Part 2: True data utilization realized by the combination of BI (Tableau)/AI (Einstein Analytics)

Part 3: Reducing data latency to zero - "Data Cloud," an infrastructure that does not stop the flow of analysis

Summary


Part 1: Demonstration of the challenges of data utilization to date and future data utilization scenarios

<Lecturer> Mika Kamei, Business Development Department, ZEAL Co., Ltd.


Past data utilization and issues

When we surveyed our customers about their data utilization issues, many responded that they "do not know how to proceed with data integration because the data is dispersed" or "would like to start utilizing data but do not know where to begin."



We also often hear that there is no one to consult regarding data analysis and utilization, or that multiple sets of data yield different results that are difficult to interpret.



We believe that the combination of Tableau and Snowflake is the solution to these problems.

Tableau is a visual analysis platform that allows anyone to perform very powerful analysis with intuitive operations. One of Tableau's strengths is its extensive community: even if you have trouble using the product and no one nearby to ask, you can easily find other users to help you.



Snowflake is a next-generation, fully managed, cloud-native data warehouse.

It supports multiple cloud services and separates storage from compute resources. While traditional data warehouses suffer overall performance degradation when workloads are concentrated, Snowflake dynamically allocates the necessary resources as load increases, so users can work without worrying about performance.
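As a concrete illustration of that separation of storage and compute, here is a minimal sketch using the snowflake-connector-python package: a warehouse is scaled up around one heavy query and back down again, without touching the stored data. The credentials, the ANALYTICS_WH warehouse, and the sales table are placeholders, not anything from the seminar.

```python
import snowflake.connector  # pip install snowflake-connector-python

# Placeholder credentials; replace with your own account details.
conn = snowflake.connector.connect(
    user="YOUR_USER", password="YOUR_PASSWORD", account="YOUR_ACCOUNT",
    warehouse="ANALYTICS_WH", database="SALES_DB", schema="PUBLIC",
)
cur = conn.cursor()

# Because compute is separate from storage, the warehouse can be resized
# at any time without moving the data. ANALYTICS_WH is hypothetical.
cur.execute("ALTER WAREHOUSE ANALYTICS_WH SET WAREHOUSE_SIZE = 'XLARGE'")
cur.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
print(cur.fetchall())
cur.execute("ALTER WAREHOUSE ANALYTICS_WH SET WAREHOUSE_SIZE = 'XSMALL'")
conn.close()
```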



Snowflake's warehouse size selection screen


As shown by the red line in the figure below, Snowflake can auto-scale, automatically increasing or decreasing the number of clusters in use according to warehouse size and query concurrency. This removes the conventional need to keep resources running in advance, and allows analysis and data utilization with fresh data in a cost-optimized environment.
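The auto-scaling behavior just described is configured when the warehouse is defined. Below is a minimal sketch with hypothetical names and placeholder credentials; note that multi-cluster (auto-scale) warehouses are an Enterprise-edition feature.

```python
import snowflake.connector

conn = snowflake.connector.connect(
    user="YOUR_USER", password="YOUR_PASSWORD", account="YOUR_ACCOUNT")

# An auto-scaling, auto-suspending warehouse: clusters are added under
# concurrent load, and billing stops while the warehouse is idle.
conn.cursor().execute("""
    CREATE WAREHOUSE IF NOT EXISTS ANALYTICS_WH
      WAREHOUSE_SIZE    = 'MEDIUM'
      MIN_CLUSTER_COUNT = 1         -- shrink back to one cluster when quiet
      MAX_CLUSTER_COUNT = 4         -- scale out automatically under load
      SCALING_POLICY    = 'STANDARD'
      AUTO_SUSPEND      = 60        -- suspend after 60 idle seconds
      AUTO_RESUME       = TRUE      -- resume transparently on the next query
""")
conn.close()
```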



Collecting data in Snowflake provides high-performance access, so there is no need to "search for dispersed data," one of the challenges of data utilization discussed at the beginning of this article, nor to create physical data marts just to improve performance. Tableau makes it easy to analyze, visualize, and share dashboards, so you can find the answers you need quickly. If you are wondering where to start, why not start small with these two solutions?


A free trial of Snowflake is available here. You can try it now for free for 30 days.

Please sign up using the form linked below.



Part 2: True Data Utilization Achieved by Combining BI (Tableau)/AI (Einstein Analytics)

<Lecturer> Kei Kuroi, Partner Account Manager, Tableau Software, LLC


At the three-day Tableau Conference that began on October 7, it was announced that "Einstein Analytics" would be rebranded as "Tableau CRM." I will introduce that announcement and what it newly implies for the combination of Tableau and Snowflake.


1. Why is it so difficult to utilize data across organizations?

There is a well-known report by the Ministry of Economy, Trade and Industry (METI) on "the 2025 cliff," which explains why data utilization is not progressing: existing systems are built per business unit and do not allow company-wide, cross-organizational data utilization.

In addition, data utilization does not work just by visualizing data. Tableau proposes changing the data utilization process as follows.

In the conventional data utilization process, data is extracted from mission-critical systems to create documents, but from the point of extraction, the data loses its freshness. This makes it difficult to make decisions based on the latest and correct information.



With Tableau, by connecting directly to data sources such as mission-critical systems, the cycle of looking at the data and linking it to actions becomes very fast. We believe that we can overcome organizational barriers by changing the process of data utilization as shown in the figure below.



2. How to gain insight from business systems

Last year, Tableau became part of Salesforce. Salesforce has a vision called "Customer 360," a framework that supports customers' digital transformation: covering all customer issues with a variety of technologies, viewing each customer centrally, and making optimal proposals.


The figure below lists five typical internal systems. From left to right, they are: an accounting system, a forecasting system, a production management system, a system for managing inquiry information, and a customer management system for managing sales activity information and other data.

Usually, each of these internal systems has its own mechanism for visualizing data. This lets us understand the data within each system, but not combine it with data from other systems. For this reason, data is exported to CSV files, which are then compiled in spreadsheets or visualized in Tableau and referred to by management.

However, for the reasons mentioned above, the exported files are not fresh, and even if real-time information is available in some departments (Sales Cloud, in the figure), the combined data seen by management is not up-to-date.


As a solution to this, we recommend aggregating data in Snowflake and analyzing it in Tableau.



For example, suppose your accounting system shows that accounts receivable are increasing and not being collected, and you need to find out why. Looking at Sales Cloud, which manages sales activities, you see that several customers are behind on their payments. To investigate further, you look at Service Cloud, which manages inquiry information, and find that there is a problem with a particular product, and that in some cases the product was not inspected. Since this suggests a quality problem, you check the production management system and identify the part and lot that caused it. Finally, checking the forecast-versus-actual figures in the production management system, you can hypothesize that the quality problem arose under pressure to meet planned values.
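Once all of these systems land in Snowflake, a drill-down like the one above can start from a single query instead of four separate logins. A hedged sketch, with entirely hypothetical schema, table, and column names standing in for the accounting, Sales Cloud, and Service Cloud data:

```python
import snowflake.connector

conn = snowflake.connector.connect(
    user="YOUR_USER", password="YOUR_PASSWORD", account="YOUR_ACCOUNT",
    warehouse="ANALYTICS_WH", database="INTEGRATED_DB")

# Join receivables (accounting) with payment status (Sales Cloud) and
# open product complaints (Service Cloud) in one pass.
rows = conn.cursor().execute("""
    SELECT r.customer_id,
           r.outstanding_amount,
           p.days_overdue,
           COUNT(c.case_id) AS open_cases
    FROM accounting.receivables  r
    JOIN salescloud.payments     p ON p.customer_id = r.customer_id
    LEFT JOIN servicecloud.cases c ON c.customer_id = r.customer_id
                                  AND c.status = 'OPEN'
    GROUP BY r.customer_id, r.outstanding_amount, p.days_overdue
    ORDER BY r.outstanding_amount DESC
""").fetchall()
conn.close()
```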


In this way, by using Tableau and Snowflake to connect the various business systems, it is possible to obtain insights from a variety of data.



3. Data value chain created with Tableau and Snowflake

The figure below shows the new Tableau lineup.

Einstein Analytics" (Einstein Analytics) is a Tableau CRM.

Some of you may have been using this product for some time; rest assured that only the branding changes, while the features and pricing remain the same.



Each of Tableau's products is explained below.

Tableau Prep Builder: Connects to various data sources and cleans (processes and formats) data.

Tableau Desktop: Connects to data and provides insight through visual analysis.

Tableau Server / Online: Shares insights with your team, reports to managers, and disseminates them to other departments.

Tableau Catalog: Catalogs the data assets under Tableau's management, making them easier to search, manage, and track over time.

Tableau Prep Conductor: Automates Tableau Prep Builder procedures (flows).



Einstein Analytics and Tableau are separate products, but we plan to integrate them in the future. First, we will start by integrating and linking the AI/ML and data layers.



A roadmap has been announced to eventually integrate all layers.



The following three functions give a concrete picture of the AI/ML and data-layer integration.


Dashboard Extensions: Access real-time forecast results from dashboards

Analytics Calculations: Embed real-time forecasts into Viz

Tableau Prep: Add forecast values to data sets

Each of these is explained below.


Dashboard Extensions: Access real-time forecast results from the dashboard


In the figure below, the left side is the Tableau dashboard and the right side is Einstein.

The 31% on the right indicates the probability that the shipment will be delayed; below it are the factors causing the delay and the factors that could bring the shipment forward. Because the results of Einstein's analysis of the various forecast data are displayed this way, you can move on to the next action quickly.



Analytics Calculations: Embed real-time forecasts into Viz


The scatterplot in the lower right of the figure below shows a Tableau CRM prediction called from a calculation formula in a Tableau worksheet. Tableau already has mechanisms for calling external functions written in Python or R, and this is implemented the same way.
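That existing mechanism is TabPy, Tableau's Python server. As a rough illustration of the calling pattern rather than of Tableau CRM itself, the following deploys a hypothetical scoring function to TabPy so a worksheet calculation can invoke it:

```python
# pip install tabpy ; start the server with the `tabpy` command.
from tabpy.tabpy_tools.client import Client

client = Client("http://localhost:9004/")

def delay_probability(amounts):
    # Hypothetical stand-in for a real trained model: score each mark.
    return [min(a / 100000.0, 1.0) for a in amounts]

# Once deployed, a Tableau calculated field can call it, e.g.:
#   SCRIPT_REAL("return tabpy.query('delay_probability', _arg1)['response']",
#               SUM([Amount]))
client.deploy("delay_probability", delay_probability,
              "Returns a delay probability per mark", override=True)
```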



Tableau Prep: Adding Forecast Numbers to the Dataset


A function to pass data to a forecast model and receive the results is being added to the Tableau Prep flow. Since you can select a forecast model created in advance, you can prepare data that incorporates the model's results without needing to know its internal details.



The more you can predict, the easier it becomes to respond to change. Next, here is how to build this.



Connect Desktop and Prep to Snowflake to clean the data, then pass it to Tableau CRM to create the forecast data. Upload the data, now including predictions, to Tableau Online, and you have a data set of historical and future data. Analysts can analyze it in Tableau Desktop, people in the field can connect to Tableau Online, and in some cases Salesforce can be embedded in a customer portal so that portal users can view that online data indirectly. The reality behind the data set is Snowflake, which collects data from SAP, Anaplan, and others. This way, you can view the data used across your various operations horizontally.
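As a rough sketch of the tail end of that pipeline, assuming the forecast column already exists in Snowflake: pull the historical-plus-predicted rows into pandas, write a Hyper extract, and publish it to Tableau Online. The libraries (snowflake-connector-python, pantab, tableauserverclient) are real, but every credential, URL, and identifier below is a placeholder.

```python
import snowflake.connector        # pip install "snowflake-connector-python[pandas]"
import pantab                     # pip install pantab
import tableauserverclient as TSC # pip install tableauserverclient

# 1. Read historical + predicted rows from Snowflake into a DataFrame.
conn = snowflake.connector.connect(
    user="YOUR_USER", password="YOUR_PASSWORD", account="YOUR_ACCOUNT",
    warehouse="ANALYTICS_WH", database="SALES_DB", schema="PUBLIC")
df = conn.cursor().execute(
    "SELECT order_date, region, amount, predicted_amount FROM sales_with_forecast"
).fetch_pandas_all()
conn.close()

# 2. Write the DataFrame to a Tableau Hyper extract.
pantab.frame_to_hyper(df, "sales_with_forecast.hyper", table="Extract")

# 3. Publish the extract to Tableau Online.
auth = TSC.PersonalAccessTokenAuth("TOKEN_NAME", "TOKEN_SECRET", site_id="YOUR_SITE")
server = TSC.Server("https://10ax.online.tableau.com", use_server_version=True)
with server.auth.sign_in(auth):
    item = TSC.DatasourceItem(project_id="YOUR_PROJECT_ID")
    server.datasources.publish(item, "sales_with_forecast.hyper",
                               mode=TSC.Server.PublishMode.Overwrite)
```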

Anyone with Tableau Creator, Tableau CRM, and Snowflake can get started.


Part 3: Data Cloud, an infrastructure that does not stop the flow of analysis

<Lecturer> KT, Senior Sales Engineer, Snowflake K.K.


The third part of this session covers Snowflake: what it can do and what it aims to achieve in this market.

Tableau was introduced in the previous session, and I myself was a Tableau employee until May 2020, so before talking about Snowflake I will introduce the "visual analysis cycle," Tableau's illustration of the steps by which people understand data. The diagram below is a simple representation of what happens when people use data.



There are various tasks, such as using data to improve sales and profits or to reduce costs. To solve them, we think about what data to acquire and what kind of graph should represent it, and finally we share the result with someone to encourage action. None of these steps should take long or feel slow: time spent waiting interrupts the flow of thought. It is therefore very important to perform each step quickly, without waiting.


One day we find ourselves in a world that has suddenly changed drastically.

Due to the spread of the novel coronavirus, a state of emergency was declared in Japan in April 2020, forcing many people to stay home. Lockdown measures were taken around the world, and people everywhere were greatly restricted from going outside. But staying at home, you cannot see what is going on outside or sense how the infection is spreading. So what should we do? To answer this question, people tried to use data to grasp what could not be seen, and volunteers published many coronavirus-related reports on Tableau Public. People around the world made these efforts to see whether something could be done about the situation. Behind these easy-to-understand visualizations, however, lay steady, persistent work that could not be seen from the surface.



Information on the number of coronavirus infections was compiled while adjusting the data daily, because the method of compilation was subject to sudden change. Compiling data on infected persons is not inherently the job of public health departments. Had the data been recorded automatically, how different things would have been; unfortunately, a significant amount of it was compiled and transmitted manually. This strained the departments' original work of testing: the very act of collecting data interfered with the work that needed to be done.


In short, we were far from being able to use data to understand our world. Moving data around, or going to get it, is easy. The major challenge we recognized was that the data we wanted to retrieve often did not exist in the first place.


Our Mission: Enable every organization to be data-driven

Simply by living, we generate data every day.

We browse websites, log into systems, "like" posts on social networks such as Facebook, and answer surveys, all without being aware that we are generating data. In many cases we are not even aware of how that data is used: data we did not know we generated is stored by a very few companies that know data well, and before we know it, it is being used without our knowledge. We must be aware that we own the data we generate. The world is now at a major crossroads. Either it becomes data-driven, with organizations and individuals managing and utilizing their own data and ethically increasing business value while respecting the will and dignity of the data's owners, or data is exploited by a very limited number of organizations skilled in handling it, and we enter a divided society in which only those few benefit from data.

The possibilities of data should belong to every organization. We want to create a place where everyone can use data, not a world where data serves only a very few. The reason this has not happened is that using data is still too difficult. That is why we created Snowflake: to make every organization data-driven by making it easy for anyone to work with data.


A Data Cloud for Every Organization

We provide a data cloud that every organization can use: a place where data stays clean with minimal maintenance, can grow to any volume, and can be used by any number of people in any situation. If your data can be gathered in one place in the data cloud and maintenance kept to a minimum, data management becomes much easier. We are aiming for a world where people can collaborate without worrying about data management.


To begin with, why can't we use data when we want to use it? There are three challenges to this.


Three major challenges

1. Origin of data

  Data comes from different places, in different forms and quantities, and processing it quickly requires different capabilities and ingenuity for each.

2. Siloing

  Data easily becomes siloed because it is generated in different places to begin with. Moreover, even after data has been integrated once, it may be copied out into data marts for performance reasons, creating new silos all over again.

3. Hurdles to data sharing

  Sharing data with another organization has meant duplicating it and sending it over. When data that was once protected is sent to another location, security settings for data location and transmission methods must be re-examined. Sending large data sets, or keeping frequently updated data in sync, involves enormous cost.


It was thought that huge hardware resources were all that was needed to flexibly process data scattered across locations, and therefore that the cloud, which can provide even huge resources instantly, would solve the problems above. But it did not work out, because existing clouds came in only two types:

Application clouds

  For example, Salesforce.

Infrastructure clouds

  For example, AWS or Azure.


After all, it is difficult to view data with the current cloud environments alone. Viewing the data generated by applications requires complex steps to integrate and compute it, and simply having a place in the cloud to process data is not an efficient use of cloud resources. We want to make this as easy and simple as possible. So that data can be used by everyone no matter where it resides, we aim to have the Data Cloud sit as a layer between the application clouds and the infrastructure clouds. To achieve this, we believe a multi-cloud, multi-region architecture is essential, so that data can be connected wherever it is stored.



Architecture Supporting Data Clouds


Storage is consolidated in one location, and compute resources can be individually attached to it. Any number of compute resources can be created to access the stored data. Even so, Snowflake's architecture ensures that no matter how many compute resources access the centrally integrated data simultaneously, there is no contention at all.
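A small sketch of what this means in practice: two teams querying the same table at the same moment, each on its own warehouse, without contending for resources. All names are hypothetical.

```python
import snowflake.connector

def run(warehouse: str, sql: str):
    # Each team attaches its own compute (warehouse) to the same storage.
    conn = snowflake.connector.connect(
        user="YOUR_USER", password="YOUR_PASSWORD", account="YOUR_ACCOUNT",
        warehouse=warehouse, database="SALES_DB", schema="PUBLIC")
    try:
        return conn.cursor().execute(sql).fetchall()
    finally:
        conn.close()

# The BI team's dashboards and the data-science team's scans read the very
# same table concurrently without slowing each other down.
bi_rows = run("BI_WH", "SELECT region, SUM(amount) FROM sales GROUP BY region")
ds_rows = run("DS_WH", "SELECT * FROM sales SAMPLE (1)")  # 1% row sample
```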



You can start with a small amount of data and grow as the data grows over time. Increasing the warehouse size shortens processing time, and cost is runtime multiplied by warehouse size. For example, if a process that takes 16 minutes at size XS finishes in 1 minute at size XL, the cost is the same. If you can get 16 times the speed for the same money, you should choose the faster option.
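To see why the cost comes out the same, note that each warehouse size step doubles the credit rate (XS = 1 credit/hour up through XL = 16 credits/hour in Snowflake's sizing). A few lines of arithmetic:

```python
# cost (credits) = credit rate of the size x hours the warehouse runs
CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8, "XL": 16}

def job_cost(size: str, minutes: float) -> float:
    return CREDITS_PER_HOUR[size] * minutes / 60

print(job_cost("XS", 16))  # 16-minute job on XS   -> ~0.27 credits
print(job_cost("XL", 1))   # same job, 1 min on XL -> ~0.27 credits, 16x faster
```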


Snowflake is secure by default, but you can also release part of your data to the outside world, or lend certain data to a specific party. The data stays in one place, copied nowhere, and another user accesses it using their own warehouse. This is the data sharing feature. There are also data marketplaces built on this capability, where data providers can set up shop with their data. Real-time information on the novel coronavirus, for example, is made available free of charge on the data marketplace and is used by research institutions, the medical and pharmaceutical industries, and governments around the world. This is a revolutionary way of sharing data, possible only because any number of warehouses can access a single copy of it.
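Under the hood, sharing amounts to a handful of SQL statements: the provider grants a share access to live objects and adds the consumer's account, and no bytes are copied. A minimal sketch with hypothetical names:

```python
import snowflake.connector

conn = snowflake.connector.connect(
    user="YOUR_USER", password="YOUR_PASSWORD", account="YOUR_ACCOUNT")
cur = conn.cursor()

# Expose one live table to a partner account; the consumer queries it
# with their own warehouse and always sees the current data.
for stmt in (
    "CREATE SHARE IF NOT EXISTS covid_share",
    "GRANT USAGE ON DATABASE covid_db TO SHARE covid_share",
    "GRANT USAGE ON SCHEMA covid_db.public TO SHARE covid_share",
    "GRANT SELECT ON TABLE covid_db.public.case_counts TO SHARE covid_share",
    "ALTER SHARE covid_share ADD ACCOUNTS = PARTNER_ACCOUNT",
):
    cur.execute(stmt)
conn.close()
```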


Because of these features and vision, we now have over 3,000 customers using Snowflake as their data infrastructure.


We hope to work with you to create an environment in which each of you can be as creative as you want to be, and to help all of your organizations become data-driven.


Conclusion

We often hear the term "self-service BI," and I believe that Tableau is the product that has led the way in self-service BI. Also, the best place to prepare data is probably Snowflake, which is cloud-based and transcends vendor boundaries. This is the best combination of products needed to freely utilize data. Start small with a combination of both products for data-driven management. If you are interested, please contact us.

