WEBINAR AIRED: FEB 28, 2018

4 Key Considerations For Your Big Data Analytics Strategy

Presenters:

John L. Myers
Managing Research Director for Data and Analytics
Enterprise Management Associates

Steve Wooledge
VP Marketing
Arcadia Data

More and more companies are turning to big data analytics to fulfill their need for information. But some of their tools are simply not sophisticated enough to scale effectively.

To get the most out of big data, the tools need to be both flexible enough to accommodate a whole range of users and powerful enough to support exploration, machine learning, and operational analytics. Most importantly, users should be able to stream events from their devices and interact with the data in real time.

Watch this webinar with industry experts from Enterprise Management Associates and Arcadia Data to learn:

  • The state of big data analytics today
  • How data-driven organizations push analytics to the front line
  • Why exploration/discovery requires detailed data
  • The necessity of integrating streaming data in real time
  • Real-world examples of implementation

Transcript

[00:00:00.000] Paul Revere (moderator): Hello, good morning, good afternoon, and good evening to everybody. Welcome to our webinar, "4 Key Considerations for Your Big Data Analytics Strategy," which is co-presented by Arcadia Data and EMA. My name is Paul Revere, and I'm your moderator today. More and more companies are turning to big data analytics to fulfill their need for information, but some of their tools are simply not sophisticated enough to scale effectively. So today we have Steve Wooledge, VP of Marketing at Arcadia Data, and John Myers, Managing Research Director at EMA, to provide us with the four key considerations that we should all think about. John and Steve, thanks for joining us today.

[00:00:46.299] I just want to let you know we have a pretty good audience here. It's international in nature; we've got pretty much all the continents covered, including Australia, and the audience includes data architects, project managers, and even a marketing director, so I think we're going to have a pretty good conversation. These are the four sections that we'll be going through: how big data analytics have evolved, discovery, a look at real time, and where self-service is important. We'll get to some real-world examples, and of course, as I said, we'll do questions and answers. So, consideration number one: big data analytics have evolved. John, can you start us off with that?

[00:01:41.200] John Myers: Glad to be presenting with Steve. I think one of the things that is really driving the evolution of big data analytics is something that we at EMA call data-driven cultures; data-driven strategies are all part of being a data-driven organization. In the past, we had organizational structures that were based on looking at data as backward-looking, or, if you will, as the exhaust of their operations: what was in the transaction log, what was the revenue for the month. We were always kind of looking backwards. But we're seeing a lot of organizations starting to say: how do we take the data that is already inherent in our organization (you can see that data icon on the left-hand side), how do we analyze it, how do we discover insight and explore around in it, and how do we use it to help us become a better organization, whether that be doing cross-sell and upsell to our customers or improving business models, things of that nature? Really, at this point, these organizations are asking: how do we take the data that we already have, or these new data sources that we see in the environment, and how do we leverage that?

[00:02:55.500] Here at EMA we have a concept called the hybrid data ecosystem, and as part of that we looked at some of the applications that organizations are really using as part of their big data environment. They're looking across a wide swath of different types of things. Sometimes we think of big data as being only the data science or exploratory environment, but it has really evolved since those early days of Hadoop. We now see organizations doing everything from that exploration type of activity to operational analytics: using it to address fraud, whether from a credit card transaction or a telecommunications event type of process. How do we look at it in terms of operational processing; how do we make big data part of the core of what we're doing for our business? They're also doing analytics on managing their operations; they're looking at customer experience, marketing, and sales; they're getting new insights into their customers and their products (which products are selling well, which customers are interacting with their applications, etcetera); and they're asking how to manage the cost and risk associated with all of that.

[00:04:11.000] In those early days of big data analytics, we had the prototype data scientist. The gentleman on screen is actually the guy who was CTO for the Obama 2012 and 2008 campaigns, and he's a great genius who gets at a lot of great information, but he was a very technical resource, someone who really got into the numbers and the data and what was going on. We're seeing an evolution of the data consumer associated with big data: that individual is now more of a senior or advanced business analyst, someone who looks at the business reasons for what they're doing with the data and less at the technical aspects. A lot of these folks look at things differently. Whereas our traditional data scientist was someone who could drop down to the command line and do a lot of their own analytics in a script or something of that nature, these citizen data scientists, these advanced business analysts, look at things a lot differently, and the tools that support them need to be a lot different as well. Steve, I know that you guys over at Arcadia look at this challenge the same way that we're seeing at EMA.

[00:05:34.000] Steve Wooledge: Yeah. What's interesting to me is, I've been in the big data market for about ten years, and ten years ago we would talk about how data has changed: instead of rows and columns, it's complex and multi-structured, there's more need for real-time information on top of that complexity, and there are lots of different data sources. So we talked about data variety and velocity, the three big V's, if you will. And it was systems like Apache Hadoop that really gave people a lot more flexibility to deal with these multi-structured data sets. To your point, it was the data scientists, these really technical people, who would go in there with Python, Pig, and the different coding environments that grew up around these systems to get information out of them. What has not changed, until recently, is BI tools. As you said, SQL is the lingua franca for analysts who want to go after data, but the BI tools we have today rely in many cases on extracts and aggregates pulled out of those clusters, and if you try to query the big data system directly, a lot of times it hasn't built in really good workload management and such. So the question has been: why haven't BI tools evolved to keep up and enable these business analysts to do what they want? Part of it is the architecture: they were not built to scale the way distributed, scale-out MPP systems work; they run in a single-node environment. That's fine if you've got a small number of users going after big data, and it's fine if you've got lots of users on a small data set, but it doesn't handle the combination of the two. And the tools that were built for data warehouses weren't meant to interpret JSON structures or other semi-structured data that carries its metadata and schema embedded in the data format itself. So you've got to flatten out the data and put it into a relational format that these BI tools can access, which just slows things down.

[00:07:43.699] The other thing I think is really interesting: the promise of Hadoop and the data lake was "load and go," schema-on-read instead of schema-on-write, meaning we don't have to transform the data and get it normalized before somebody can access it. In the data warehousing world, I've worked with customers who would say that adding a new dimension to the data warehouse took literally six to twelve months of process to get set up, at roughly a million dollars of cost, at a large healthcare company I worked with previously. The agility that people need in a big data world can't operate that way. What we see is that people who have either a data warehouse or a data lake, if they're using a traditional BI tool, have to go through this process: you land the data and secure it, you build some kind of semantic layer and models to put business terms on that data, then you move that into a BI server, or you create optimizations within the data warehouse to speed up performance, and then you provide that security layer and performance modeling to the end user so they can do ad hoc discovery and analysis like we're talking about. But those are fixed sets of questions that IT set up in the schema for an end user to access. If the data an end user wants isn't all there, they go back to IT and say, "I'd like to add this dimension," and that's when the cost and the time come in. So by the second generation and iteration of all the different questions somebody wants to explore, it becomes a very arduous process, because you have to keep going back and forth to IT.

[00:09:25.100] What we're really trying to do is make sure you don't have these issues, this repetitive nature, a long process, and a lot of cost. We want to shorten that within a data lake, and these new technologies that are native, that run inside the lake, provide a lot more flexibility. You can do the ad hoc discovery up front, because they can interpret complex data on the fly, and you don't have to do physical modeling until you decide you want to push those insights out to a broader audience in production. Also, there's only one security model: native BI tools simply inherit the security defined in the data platform. You're not moving data out to a separate BI server or into a data warehouse, so all the data you need is in one system. That dramatically speeds up the iteration for discovery, but it also lets you deploy out to a large number of users in production mode. That's really what's changing, I think, from a process perspective, to enable those end users, who are much more business-oriented, to your point, John.
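
The "load and go" idea is easy to demonstrate outside of any particular product. Below is a minimal schema-on-read sketch in PySpark (the path and field names are invented for the example): the nested JSON structure is inferred at read time and queried directly, with no up-front flattening and no separate BI server.

    # Minimal schema-on-read sketch: Spark infers the nested JSON
    # structure at read time; no ETL or up-front modeling required.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("schema-on-read").getOrCreate()

    # Hypothetical clickstream events with nested fields, e.g.
    # {"user": {"id": "u1", "region": "EMEA"}, "event": "view", "ts": "..."}
    events = spark.read.json("/data/lake/clickstream/")

    events.printSchema()  # the schema was inferred, not declared

    # Query the nested structure directly with dot notation;
    # nothing was flattened or copied out of the lake.
    (events
        .where(F.col("event") == "view")
        .groupBy("user.region")
        .count()
        .show())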

[00:10:26.299] John Myers: Oh, I agree, and I think one of the keys to this is being able to discover at that detail level. A lot of data-driven organizations are able to go through that because, a lot of times, you don't know what you're trying to look at; you're trying to make that determination. When we're presented with information, it often comes in one big mass, and it's very hard to tease out the key components that go along with a particular aspect. Because people are naturally a very visual, pattern-seeking group, many of them use visualization to bring that information out and take a look at it. Now, in the past we've had situations where people dumped data from, say, the EDW environment, and went with a "desktop cowboy" type of approach, looking at and exploring data only on their desktop. But when you do that, you've got to get the data out of the lake, as you pointed out, and after a while it becomes isolated in a particular environment. You really don't have the smoothness that you would like: you may find an insight, but you may not have the underlying supporting data that really makes it work. So we're seeing organizations trying hard to stay away from that desktop cowboy approach to exploration.

[00:12:00.899] Now, I've talked about how exploration is great for data-driven organizations, in terms of what they're doing with exploration and analytics and how they're looking at their operations. Another area where we're seeing a lot of value is the growing field of machine learning and artificial intelligence. A lot of people think this kind of machine learning just happens, but most of the time you've got to get at that data so you can look at it, do some exploration around it, and understand the components that go with it. After you've done those things, you can really enable the machine learning process and the artificial intelligence techniques that allow you to scale this out. Being able to use discovery and exploration at that detail level is really where organizations are finding the value, and if they can get through that process, it really makes things fly.

[00:13:05.100] Steve Wooledge: Exactly. There's a lot of excitement around machine learning and artificial intelligence, but it falls flat if you can't make it usable to end users. There is a lot of automation that will happen, but in the case of cybersecurity, we've worked with companies that need to enable security analysts to be notified of a potential threat using machine learning. In this screenshot we're showing a demo application we built where machine learning essentially ranks the potential threats of different endpoints or users in a network, looking at the NetFlow data and all the log files through the system. This is an entire view of all the different systems within an organization, where you can do visualization and exploration: "All right, I see there is potentially an issue here with this endpoint." Then, looking over on the right at the network graph: what are the other machines that endpoint is connected to, who are the users associated with it that might be involved in whatever has happened, and then drill into as much detail as necessary, right within one system. With siloed analytics, where you have to switch from one application to another to do this type of threat hunting, it takes a lot longer to address the problem, and we've all seen the news, where banks and retailers and so on are getting hacked. So the more visibility we can provide across all the big data to the security analysts, the better; they may not be PhD data scientists, but they can understand network performance and NetFlow and all those things that are really, really important. That's the kind of thing that becomes possible when you've got access to all the data on a system that allows end users to self-serve.
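
As a rough illustration of the kind of ranking just described (a generic sketch, not the actual demo application), the snippet below scores NetFlow-style records with an off-the-shelf anomaly detector and surfaces the most suspicious endpoints first; the feature names and data are invented.

    # Hedged sketch: rank endpoints by anomaly score so a security
    # analyst can triage the most suspicious ones first.
    # Feature names and data are invented for illustration.
    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.default_rng(42)
    # Columns: bytes_out, connections_per_min, distinct_ports
    normal = rng.normal(loc=[500, 10, 3], scale=[100, 2, 1], size=(500, 3))
    odd = rng.normal(loc=[5000, 80, 40], scale=[500, 10, 5], size=(5, 3))
    flows = np.vstack([normal, odd])

    model = IsolationForest(random_state=0).fit(flows)
    scores = model.score_samples(flows)  # lower = more anomalous

    # Surface the five most anomalous endpoints for the analyst.
    for idx in np.argsort(scores)[:5]:
        print(f"endpoint {idx}: score={scores[idx]:.3f}")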

[00:14:49.200] So the thing I would say is that there are different types of architectures out there, and we really need to be at a scale to support, you know, petabytes of information for these scenarios. One example is when you've got a data warehouse and a BI tool; there's nothing wrong with that. You can still pull data out and put it into the BI server, but it's going to be aggregates again, so I can't drill to detail as quickly as I need to, and in many cases you might have to call up the data scientist to figure out what's happening down at the log-file level. Not to mention, if you're shipping data around, you have to do different types of modeling within the BI server, which is different from the data lake, and you've got to keep those in sync, as well as the security, so that can be a hurdle. There has also been a new generation of, essentially, big data BI middleware that lets you do cubing on the data lake cluster. What happens is, it's kind of like the traditional cubes we saw 15 or 20 years ago, where you have to model the cube in advance: you figure out which dimensions somebody's going to want to analyze and you put that on an edge node in the cluster. But that's typically refreshed only once every 24 hours, so it's not a real-time view; and again, if there's some detail someone wants, it's not going to be within that cube, so I've got to go down to another tool to get at the detail. Importantly, a lot of the data locality information about how to speed up queries is stored down at the data file and file system level, so if you're only passing SQL queries back and forth between the BI tool and these edge nodes, you're losing a lot of the understanding of how that data is stored, the filters, and the semantic information that's out there.

[00:16:32.299] So, native BI technology, for us, means a fully distributed BI server that runs directly in the data lake, whether that's HDFS or an object store in a cloud environment, with the data locality to understand where data is placed so you can optimize query performance, and with the semantic model pushed as far down into the distributed cluster as possible to really speed up performance, with intelligence about how to do that. What also happens then is, yes, there are aggregates and logical or physical views of the data that get created to speed things up and allow you to have hundreds of security analysts monitoring the network, say, but it's done in an iterative way: it's based on the actual usage, which tables people are accessing and which queries are running, and machine learning essentially recommends back to the administrator the way to structure that data so the queries can be sped up over time. The best part is there's no data movement, and security is maintained: you don't have to worry about people taking data out of the cluster onto a separate BI server or onto their desktop, which gets into data governance issues. This allows IT to keep it really clean and allows the end users to access, analyze, and share information. We call this "lossless," in that you're not losing any of the high-fidelity, high-definition insight, because it's all done natively within the data itself.
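
The mechanics behind that kind of usage-driven acceleration can be approximated by hand with plain Spark SQL: materialize an aggregate that matches the grain analysts actually query, and point dashboard queries at it. A sketch follows (table and column names are hypothetical, and the products described here automate the recommendation and routing steps that are manual below):

    # Hand-rolled sketch of an "analytical view": materialize an
    # aggregate that matches observed query patterns, so dashboard
    # queries hit the small summary instead of the raw events.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("analytical-view").getOrCreate()

    # Tiny stand-in for the raw fact table (normally huge).
    spark.createDataFrame(
        [("2018-02-01 10:00:00", "host-1", 512),
         ("2018-02-01 11:30:00", "host-2", 2048)],
        ["ts", "endpoint", "bytes"],
    ).createOrReplaceTempView("raw_events")

    # Suppose query logs show analysts overwhelmingly group events
    # by (day, endpoint). Materialize exactly that grain.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS events_by_day_endpoint AS
        SELECT to_date(ts) AS day,
               endpoint,
               COUNT(*)   AS event_count,
               SUM(bytes) AS total_bytes
        FROM raw_events
        GROUP BY to_date(ts), endpoint
    """)

    # Dashboards at day/endpoint grain read the summary table;
    # drill-downs below that grain still go to raw_events.
    spark.sql("""
        SELECT endpoint, SUM(total_bytes) AS bytes_total
        FROM events_by_day_endpoint
        GROUP BY endpoint
        ORDER BY bytes_total DESC
        LIMIT 20
    """).show()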

[00:17:58.599] John Myers: And Steve, I think one of the things you raise here with the native-BI-within-a-data-lake architecture example goes beyond data governance, beyond management of the data: we're now starting to see privacy and regulatory compliance become a whole lot more important to organizations, particularly with, I guess, the first boogeyman out there, GDPR, and its implementation time frame later this year. Being able to minimize the hops and the replication of the data that you're talking about (you spoke quite nicely about the high-definition analytics that go along with it) also gives you that sense of understanding, inventory, and management of the data that allows you to answer questions about HIPAA compliance, PCI compliance, GDPR inquiries, things of that nature, whereas in these other architectures it gets a little dicey when we have different pieces in different places.

[00:19:06.200] Steve Wooledge: Yeah, that's a great point, and it's only going to get more and more stringent; that need for data stewards and data governance is just going to grow.

[00:19:15.400] Paul Revere: Those are really insightful and important points. On to consideration number three: real time is a real thing. John, can you start with that?

[00:19:30.400] John Myers: So one of the things we're seeing at EMA (I think we're going backwards with the slides; I shouldn't be trusted with the controls) is the growing area around streaming data, streaming data integration points, and things of that nature. We've done multiple end-user studies: in one we saw that 82% of organizations are looking at streaming technologies and streaming applications, and in another we saw 72%. So there are a lot of organizations getting into how they can look at the real-time data that comes from their organization, again, those new data sets that data-driven organizations are focused on. Some of them are the traditional types of IoT instances: connected-vehicle data, data coming off of wellheads out in the field, how we better manage the renewable-energy contributors to our electrical grids, etcetera. But we're also seeing it in real-time applications: how are we looking at ordering, fulfillment, and payments to better understand how these things are going? This gives these data-driven organizations the ability to ask: can we start to project the types of sales we're going to see based on the people who are browsing and the types of products they're browsing? What are some of the components that go along with that? Are we seeing order, fulfillment, and payment all within a particular time frame? If so, what are the products that make that happen? If we see lags, which products or customers look like that as well?

[00:21:31.099] And we're taking that same type of mentality down to mobile or online path analysis: understanding not just the transaction level (the shopping cart, things of that nature, on the back side, which is very backward-looking), but what path our customers take to get to either a purchase or an abandonment. What are some of the things they're doing? Are we seeing customers who use us as a showroom to look at things they might buy at another location? How are they using our mobile applications? In these areas we're seeing organizations take on that real-time streaming type of approach, and a lot of those things are not available if we take the approach Steve was talking about, where we have to stop and replicate or move data to particular places, because when we look at the new business models and the new product development that go into the business scenarios associated with streaming in real time, we don't have time to make those particular stops.

[00:22:46.500] In terms of process and operational productivity, you'll often find organizations looking at how to improve the performance of the manufacturing floor, or of warehousing and inventory processes. While there is some wiggle room in there, if we don't have a product, how do we get that information to our customers in a time frame that lets them make the most of it, whether we can offer them a different product or set their expectation that what they'll be getting will come later on? And then, if we can bring that into our supply chain management, we can ask how we understand where all the inputs and outputs of what we're trying to do sit within our supply chain, and manage those things across purposes. Again, some of these can have a little bit of wiggle in them, but for the most part, the business analysts and the business models looking at these new types of things need to be able to rely on that real-time, scalable type of activity, so that they can identify a new business model as it's taking place.

[00:24:00.599] So imagine, if you will, we had a major sporting event, and somebody wore just the right shirt, or the right type of glasses, and everybody goes, "Hey, I want that shirt." If you're a business analyst, you can ask, "Can we support that? Can we set that up?" Yes, we can identify it, and now they can execute on it, whereas before, when we had friction in there, or that lower fidelity of the analytics and exploration process, we would run into issues. Now, that's not to say that this stuff going on with streaming is a perfect panacea or a silver bullet, if you will; there are actually some obstacles to what organizations are trying to do. One is the quality and reliability of the data, which goes back to those multiple-hop types of approaches to BI and big data analytics: do we have the right data, and is it where we need it to be? Can we get that real-time connectivity to those data sources, so we can feel confident that if we build something out, we'll get the right information to the right people? I think the only thing worse than not making a recommendation at the end of a sale, for a cross-sell/upsell type of thing, if I'm an online retailer trying to create a great customer experience, is to make the wrong one: to not understand what a customer is doing right then and there, or what they have been doing over time, and how we can help them. You also see that a lot in customer care: how do we understand that a customer is having an issue with one part of our organization when they get on the line with customer care? And then the other issue is the inability to ingest and store that data as we move forward, because the streaming data starts to build up. I think everybody imagines that all of our connected fleet vehicles will automatically be enabled with streaming data. It's going to build over time, but once we hit some of those critical links, we're going to have a whole lot more data coming into these systems. Are organizations really able to handle that? Steve, I know you guys have seen some great ways to handle it in terms of how the Arcadia platform works.

[00:26:29.099] Steve Wooledge: Yeah, to your point on connected vehicles: you want to be able to see what's happening in real time within your fleet, and if there's a driver stalled out somewhere, or an issue with a machine, be alerted to that in real time so the analyst can see what's happening, but then also look down into the correlation: was there a part failing based on driving behavior over time, that sort of thing. What we see is a lambda architecture, where you have a path for streaming data, which you need to be able to visualize, and you also land and store that data over time so you have the historical context. If you can provide both the real-time and the historical on one interface for an analyst, it just helps reduce the friction you talked about. Kafka is certainly one of the things we're seeing more and more customers adopt for the streaming piece, and there are also Spark Streaming and Flink and all these other streaming technologies out there. It's a pretty fascinating world; we live in a real-time, on-demand world, and between Twitter, Snapchat, and everything else, people expect to be able to sense, respond, and interact with data in real time. The enterprise needs to be the same way; it just happens to involve much larger data volumes. So that's what we need to be able to deliver to our clients, for sure.

[00:27:40.000] And part of that (sorry, wrong slide), the other part of that, is complex data, as we mentioned earlier, and you had the visual of it: it's hard for humans to interpret the stuff you see on the left. This would be a JSON file with nested structures, and traditionally you'd have to flatten this output into tables and then serve it up through a data-warehouse-based BI tool. A native approach lets you interpret it, build the structure on the fly, and visualize it right off the data as it lands in the cluster. So it might be coming in on the stream and you're picking that up, but it's also readable essentially instantaneously as it's written to disk in the system. It's a really streamlined ability to look at that information as well.
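
A minimal sketch of that dual path, using Spark Structured Streaming with Kafka, one of the streaming technologies named above (the broker, topic, schema, and paths are all hypothetical): the same parsed stream feeds both a live, queryable view and the long-term store that provides historical context.

    # Sketch: one parsed Kafka stream feeds both a live view and the
    # historical store. Requires the spark-sql-kafka package; broker,
    # topic, schema, and paths are invented for the example.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StringType, DoubleType

    spark = SparkSession.builder.appName("fleet-stream").getOrCreate()

    schema = (StructType()
              .add("vehicle_id", StringType())
              .add("engine_temp", DoubleType())
              .add("ts", StringType()))

    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "broker:9092")
           .option("subscribe", "fleet-telemetry")
           .load())

    # Interpret the JSON payload on the fly; no pre-flattening step.
    telemetry = raw.select(
        F.from_json(F.col("value").cast("string"), schema).alias("m")
    ).select("m.*")

    # Real-time path: an in-memory table a dashboard could poll.
    live = (telemetry.writeStream.format("memory")
            .queryName("fleet_live").outputMode("append").start())

    # Historical path: append the same events to the data lake.
    archive = (telemetry.writeStream.format("parquet")
               .option("path", "/data/lake/fleet/")
               .option("checkpointLocation", "/tmp/fleet-ckpt")
               .outputMode("append").start())

    # Both views come from one stream, one schema, one security model.
    spark.sql("SELECT * FROM fleet_live WHERE engine_temp > 110").show()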

[00:28:28.700] John Myers: And Steve, I think one of the things you raised is a good one. You show the structure on the left-hand side there, and a lot of times we're seeing, with streaming data sources, that they are still in the process of defining what the structure is going to be. Some organizations will say, "Hey, let's flatten our JSON so it starts to look like a table." That's one approach, and it can work, but if you have flexibility and variability in that JSON, you're going to run into issues where either you have data that doesn't match up with the flattened table structure, or you start to lose data elements. I think that's one of the things about these early stages of IoT, and particularly the environment around mobile and application development: a lot of those teams, I wouldn't say they're flying by the seat of their pants, but they don't have as robust a configuration and change management process as we would probably like to see, and they're changing their JSON, which is exactly what it's designed for, on the fly. You need to be able to handle that, as opposed to facing a choice of "do we let it in and lose some of our fidelity, or do we keep it out?" Having that flexibility really enables organizations to do what they're trying to do.
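
One concrete way to see that trade-off (a toy sketch, not any particular product's behavior): Spark's JSON reader unions the fields it encounters across records, so events written before and after a producer adds a field coexist, with nulls for the missing values rather than dropped data.

    # Toy example of schema drift: the second batch of events adds
    # a "firmware" field the first batch never had. Reading both
    # together keeps every record; older ones get null for the new
    # field instead of being rejected or silently truncated.
    import json, os, tempfile
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("schema-drift").getOrCreate()

    d = tempfile.mkdtemp()
    with open(os.path.join(d, "batch1.json"), "w") as f:
        f.write(json.dumps({"device": "sensor-1", "temp": 21.5}) + "\n")
    with open(os.path.join(d, "batch2.json"), "w") as f:
        f.write(json.dumps({"device": "sensor-2", "temp": 22.1,
                            "firmware": "2.0.1"}) + "\n")

    spark.read.json(d).show()
    # +--------+--------+----+
    # |  device|firmware|temp|
    # +--------+--------+----+
    # |sensor-1|    null|21.5|
    # |sensor-2|   2.0.1|22.1|
    # +--------+--------+----+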

[00:30:04.700] Paul Revere: That takes us to self-service. John, do you want to start off with that?

[00:30:11.400] John Myers: You want to know about big data consumers? They come in all shapes and sizes. We've got the traditional data scientists and our BI analysts, the folks who might be a little more focused on the technology side, but we're starting to see people who want to get in and interact with this data in data-driven organizations who are external users, like customers or partners. We're looking at line-of-business executives who want to start taking a look at this data as it's made available, and our operations teams, our frontline employees, really want to start getting their fingers into the data. Unfortunately, not all of these folks are looking for the same type of experience. What they are all looking for, though, is a self-service type of experience, going back to Steve's point about removing the friction of how things work. They always want it now: I've never met a business analyst or business executive who's comfortable waiting for the data they want to look at, and in a data-driven organization those requirements continue to turn up. So how can we get people the information they need, in a format they can use quickly, without having to bring them all into these big environments?

[00:31:33.500] The best example I like to use, and I don't know if you folks like it or not, is the U-Scan self-checkout that you see popping up at big-box retailers. The home improvement store I go to now uses them, my grocery store uses them, and I hear a rumor we're soon to have an entire store that is nothing but self-checkouts. How do we enable that, so that if I've got a small question or a small task, I can go out and take care of it myself? As I pointed out, our data scientists and our advanced data analysts are going to like the exploration experience, because they generally know what they want to do and they're comfortable with the technology. But as we get out to more of the frontline employees, at the edge of our data interactions, our partners and our customers aren't going to want that exploration experience. They're going to want to know, "This is the order I have; can I see where it's going?" If it's a partner: how is their product selling through us, and how is it being managed? For our operational teams, like frontline employees, a great example is customer care. A lot of those folks don't necessarily want to, or have the time in their job to, go exploring around in data. What they're really looking for is to have the information about a customer pop up: are they having a good or bad experience with our organization; are they a good customer, a neutral customer, or a bad customer (sometimes I describe it as gold, silver, bronze, and lead); and how can I help them with the next step? That's much more of an application, if you will, than an exploratory dashboard where you have to go configure things. What we're really looking for is an environment that lets us merge these two things together and meet the needs of each of these data consumers.

[00:33:39.700] Steve Wooledge: Yeah, and to do that you need to have the performance, the scale, and the speed to support these types of applications. The cybersecurity example I gave earlier was one where you have information from your security systems, you can click on those things, and we rank the information coming in and surface the machine-learning results to humans, who feed back into the system. Connected cars are another example, but it's having that distributed system that gives us the scale and performance to enable it. The other thing is just huge numbers of users. We've got customers supporting over 1,200 users, and we've got one that wants to scale up to something like a hundred thousand users and build a real customer-facing, analytics-as-a-service application. Neustar is one example: they built marketing attribution and other types of analytical services that they provide back as a service to people. Within the system, we've got to figure out a way to support that high level of user concurrency, and if you choose not to go configure multiple environments, you probably want a multi-tenant environment.

[00:34:54.500] So what we've got is patented technology around what we call Smart Acceleration, where, again, we monitor the data usage and provide recommendations on analytical views that can give end users up to 800x faster performance as they come in. It's minimal modeling; we're not building in advance. Those views are created based on machine-learning recommendations from the system about what data needs to be aggregated, cached, and stored in a physical way, in a different form, to speed up that type of data-driven application for the business. And this is the best part, from one of our end customers: they're a large telecommunications company that supports webinar types of events, I'll just say, and they have a customer support team that needs to understand and troubleshoot the network if there's an issue with someone's deployment of this company's product. They were using traditional legacy BI tools to access the data they store, and they were looking at different SQL-on-Hadoop options like Hive and Impala and Spark, but those did not give them the ability to do granular analysis with complex queries for 30 concurrent users, which was their requirement for the system. With a distributed BI server, our native BI product, handling and providing those analytical views, the queries came back in a reasonable amount of time, whereas once you got 15 or 30 people on the system, a lot of these other technologies just could not bring the results back at all; the queries would never complete. So this is a view of how that performance starts to look.
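
Concurrency requirements like that are straightforward to smoke-test from the client side before a rollout. A rough harness follows, with the query function as a stand-in for whatever SQL interface the deployment actually exposes:

    # Rough client-side concurrency smoke test: fire N analyst-style
    # queries in parallel and look at the latency spread.
    # run_query() is a stand-in; point it at your real SQL endpoint.
    import time
    from concurrent.futures import ThreadPoolExecutor

    CONCURRENT_USERS = 30

    def run_query(user_id: int) -> float:
        start = time.perf_counter()
        # Stand-in for e.g. cursor.execute("SELECT ... GROUP BY ...")
        time.sleep(0.1)  # simulated query latency
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=CONCURRENT_USERS) as pool:
        latencies = sorted(pool.map(run_query, range(CONCURRENT_USERS)))

    print(f"p50={latencies[len(latencies) // 2]:.3f}s "
          f"max={latencies[-1]:.3f}s")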

[00:36:38.400] Paul Revere: It's pretty amazing how quickly that goes. Can we move on to some real-world examples?

[00:36:42.699] Steve Wooledge: Sure.

[00:36:48.599] Steve Wooledge: I've got a couple more, so I'll just say: I was talking about that telecommunications company, and at Arcadia we also work with a lot of smaller, digital-first startup types of companies. Neustar is one example; for their market share, they required building that self-service marketing attribution type of application, software-as-a-service, for their end users, who are large CPG companies and online retailers. On the security side, Kaiser Permanente is a customer, and back to John's point around data governance: they didn't want to have to pull data out of the cluster and worry about PII being downloaded to somebody's desktop by a legacy BI tool, so being able to keep it all in one place is really, really important for them.

[00:37:29.599] If you double-click on some of these different types of marketing applications people are looking at: you talked about the path to purchase, and that is sort of the golden path of what people are trying to understand, which is how consumers interact with the brand and whether or not they're going to purchase something. Do we have the right audience for our marketing? Are we scoring and understanding those consumers? Are we measuring the impact, which stores are doing well from a sales distribution perspective? It's about letting the campaign manager, the product manager, whoever it is, see which campaigns are running and how people are interacting with them, and do multivariate testing, those types of things. For the digital dollars we're spending, what's the value of that in reaching our audience? You might have really good sales in, say, the state of New York, but then you double-click down into the ZIP code, or, as I've been hearing, people are looking all the way down at the parcel level, so by household: being able to track your coupons, digital ads, and different signals of intent to buy at the parcel level, not to mention ZIP code, etcetera. Are we wasting our dollars marketing to a certain part of the state or the county? These are questions that end users can now ask for themselves once they've got all the data in one place. So this is really interesting in terms of the types of analytics that can happen.

[00:38:58.099] Then, when you look at it from an operations perspective, there's a really interesting example where a very large manufacturing company used to bring in consultants who would pull data from a bunch of different places across the supply chain: the trucks, the products, which warehouses, the pallets being shipped out, how they're stacking those pallets with different products, whether there's waste in what's shipping to the various locations. That was a six-to-eight-month project costing somewhere in the neighborhood of $100,000, repeated every six or eight months, to look for efficiencies. But with all the granular data in one place, they've now integrated financial data together with physical flow data, not just for, let's say, a truckload of a product, but down to the pallet level, and now that RFID is in use, it can go down to the individual package level, to optimize the pallet splits and what gets shipped where. The Sankey diagram you see on the screen here lets people do, essentially, what-if analysis: look at all the different paths the products take through different distribution stop-off points and see whether there's a better way to do it. And it's highly interactive. This type of analysis done in traditional SQL is multi-pass SQL; it's really complex and takes a lot of time. So that was more advanced analytics at a very large scale of data, with a lot of granularity, where you can extract a lot more value and efficiency. It's been really transformative, as this person said, for their business.
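
The data behind a Sankey view like that is just a table of weighted stage-to-stage flows, which is exactly the aggregation that gets painful as multi-pass SQL over raw records. A toy pandas version, with invented shipment legs:

    # Toy version of the data prep behind a Sankey diagram: collapse
    # individual shipment legs into weighted source->target flows.
    # The records are invented for illustration.
    import pandas as pd

    legs = pd.DataFrame({
        "pallet": ["p1", "p1", "p2", "p2", "p3", "p3"],
        "source": ["plant-A", "hub-1", "plant-A", "hub-2", "plant-B", "hub-1"],
        "target": ["hub-1", "store-9", "hub-2", "store-9", "hub-1", "store-4"],
        "units":  [100, 100, 80, 80, 50, 50],
    })

    # Each (source, target) edge's width is the total units moved.
    flows = (legs.groupby(["source", "target"], as_index=False)["units"]
                 .sum()
                 .sort_values("units", ascending=False))
    print(flows)
    # Feed `flows` to any Sankey renderer; a what-if run just
    # re-aggregates after editing the legs, with no multi-pass SQL.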

[00:40:26.800] Paul Revere: Just amazing, those diagrams.

[00:40:32.699]
one of the things I think a lot of organization

[00:40:35.099]
Great Value out of it is it dispels

[00:40:37.800]
some of their preconceived

[00:40:39.800]
notions about what their processes

[00:40:42.000]
look like they may have thought about a

[00:40:44.099]
golden path goes this is the way it's

[00:40:46.199]
supposed to work

[00:40:47.099]
but all the sudden they start to look at a diagram

[00:40:49.800]
like this one and in real time to go

[00:40:51.800]
our orders coming

[00:40:53.900]
in to the Fulfillment component

[00:40:56.400]
without actually being quoted or

[00:40:59.000]
estimated at the beginning and

[00:41:01.300]
then they go we'll wait a minute are we missing data

[00:41:03.500]
or is there something wrong with our process and

[00:41:05.900]
in some instances they find real

[00:41:08.300]
organizational change that they can make it

[00:41:11.000]
because they have this access to that

[00:41:13.099]
level of granularity without

[00:41:15.099]
having to, you know, burn

[00:41:17.599]
cycles with IT they're really

[00:41:19.800]
able to get great insight out of this

[00:41:21.800]
type of analysis are there other ones that you

[00:41:23.800]
can talk about yeah

[00:41:27.500]
great example

[00:41:30.199]
and this is a little bit technical but I think for

[00:41:32.300]
the architects on the line, people

[00:41:34.300]
that have invested in various Hadoop

[00:41:36.300]
platforms, one thing

[00:41:38.400]
that I would point out which is really interesting is

[00:41:40.400]
native BI tools that

[00:41:42.800]
we can install with existing

[00:41:44.800]
systems like Cloudera Manager and Ambari

[00:41:47.000]
we inherit security and role-based access

[00:41:49.099]
controls from projects like Apache

[00:41:51.400]
Sentry and Ranger and those types of things so from an

[00:41:53.599]
administration perspective you

[00:41:55.800]
know you have a lot more confidence that

[00:41:57.900]
you can deploy this to a

[00:41:59.900]
large number of users and you're not worrying about keeping

[00:42:02.199]
security in sync between

[00:42:04.699]
what the BI department does and

[00:42:07.000]
your tools or your platforms

[00:42:09.199]
it's all one system so this has been a huge time

[00:42:12.099]
to value and just lower

[00:42:14.300]
administration considerations.
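
To illustrate the inherited role-based access idea in miniature, here is a hedged sketch of column-level filtering driven by a role-to-columns policy. This is not Arcadia's, Sentry's, or Ranger's actual API; the roles, columns, and policy table are hypothetical.

```python
# Illustrative sketch of role-based column filtering, in the spirit of the
# policies tools like Apache Sentry or Ranger enforce. This is NOT their API;
# the roles, columns, and policy table here are hypothetical.
from typing import Dict, List

POLICY: Dict[str, List[str]] = {
    "analyst":   ["order_id", "region", "revenue"],
    "marketing": ["order_id", "region"],          # no revenue access
}

def visible_columns(role: str, requested: List[str]) -> List[str]:
    """Return only the columns the role's policy allows."""
    allowed = set(POLICY.get(role, []))
    return [c for c in requested if c in allowed]

print(visible_columns("marketing", ["order_id", "region", "revenue"]))
# -> ['order_id', 'region']  -- revenue is filtered out for marketing
```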

[00:42:17.800]
People that have deployed data lakes can reach a

[00:42:19.800]
much broader audience now to get

[00:42:21.800]
the value that has been promised by big data and I think

[00:42:23.900]
you know big data and data lakes

[00:42:25.900]
you can say it's in the trough of disillusionment but

[00:42:28.400]
I think it's because we haven't had good ways

[00:42:30.500]
to give that access to

[00:42:32.500]
end users to build applications on

[00:42:35.400]
these platforms or really make it a big

[00:42:37.699]
data application platform really

[00:42:39.699]
this is kind of the last mile being able to deliver

[00:42:41.699]
this to the business people who understand

[00:42:44.000]
the business and how it needs to happen and to

[00:42:46.099]
build customer-facing applications it's just a much

[00:42:48.400]
easier

[00:42:51.699]
way to do that. Agreed,

[00:42:53.900]
you know and I think you know whenever

[00:42:55.900]
EMA does end-user research

[00:42:58.000]
around big data streaming

[00:43:00.400]
things of that nature security

[00:43:03.500]
questions are always like voting in

[00:43:05.500]
Chicago people vote early

[00:43:07.500]
vote often I can almost see them like double-clicking

[00:43:10.000]
on the yes, security is really important

[00:43:12.000]
to me answer I wish I could

[00:43:14.000]
get an intensity button for how

[00:43:16.199]
quickly and how often they're clicking on

[00:43:18.199]
that but you know you raise great

[00:43:20.199]
issues here and as we

[00:43:22.199]
look at privacy and security and

[00:43:24.599]
moving into different components around

[00:43:27.000]
how do we manage that security

[00:43:29.099]
it's going to be important to be able to answer

[00:43:31.099]
these questions because if you can't

[00:43:33.199]
your CSO might just fold

[00:43:35.599]
their arms and go no you can't do that or

[00:43:38.099]
you run great risk of violating

[00:43:40.699]
the trust of your partners customers

[00:43:42.699]
and suppliers so you know

[00:43:44.900]
I think those are all great points and ones

[00:43:47.099]
that I think are fantastic so

[00:43:49.400]
as we kind of look at what the steps

[00:43:51.699]
are kind of wrapping things

[00:43:53.800]
up in the opinion of

[00:43:55.900]
EMA data-driven organizations

[00:43:58.199]
are capitalizing on big data analytics to

[00:44:00.199]
fundamentally change their

[00:44:02.300]
business model. Organizations

[00:44:05.800]
that focus on rear-looking data

[00:44:08.800]
are going to be quickly left behind I think we're

[00:44:10.900]
already seeing that particularly

[00:44:12.900]
in the online and brick-and-mortar retail

[00:44:15.500]
space I think we're just going to continue

[00:44:17.599]
to see that and these approaches

[00:44:19.699]
require new and different methods

[00:44:21.900]
than our traditional if

[00:44:24.199]
you will

[00:44:25.800]
collect, refine,

[00:44:27.900]
present model

[00:44:29.900]
of all the different components and

[00:44:31.900]
we have to remove a lot

[00:44:34.000]
of the friction out of what

[00:44:36.199]
those traditional approaches have been

[00:44:38.300]
keep their best, if you will, patterns

[00:44:40.900]
you know for quality data

[00:44:43.000]
access etcetera but how do

[00:44:45.000]
we speed them up and develop new

[00:44:47.099]
best practices and again

[00:44:49.199]
and of those new things I

[00:44:51.199]
think we're going to see, detail and speed

[00:44:53.599]
of access are really going to be key

[00:44:55.900]
when we look at the future of big data analytics

[00:44:57.900]
suites you know those data-driven

[00:45:00.099]
organizations, they won't wait

[00:45:02.199]
the business stakeholders will not sit

[00:45:04.300]
still long enough to allow

[00:45:06.500]
their architectures

[00:45:08.500]
to catch up they will either

[00:45:10.500]
move along with

[00:45:12.500]
the group or those guys will find ways to

[00:45:14.599]
find alternatives to make that happen so

[00:45:17.000]
that's EMA on

[00:45:19.699]
where big data analytics are going

[00:45:21.699]
in the near future Steve

[00:45:27.500]
from your perspective

[00:45:29.800]
yeah I've been in this industry gosh

[00:45:32.000]
18 years now and I think the

[00:45:34.099]
same way we saw data warehouses created

[00:45:36.500]
as sort of a separate thing that's

[00:45:38.599]
designed for analytics and

[00:45:40.800]
back in the 90s I think with

[00:45:42.800]
the data lake and big data systems in the

[00:45:44.800]
scale-out world where you know

[00:45:47.099]
the laws of physics mean

[00:45:49.199]
hardware just cannot keep up with the growth of data

[00:45:51.300]
and it's only getting worse with IoT so everything

[00:45:54.099]
needs to go to scale-out distributed systems I

[00:45:56.599]
think enterprises need to look at having

[00:45:59.300]
two separate BI platforms one standard

[00:46:01.300]
for your data warehouse which is designed perfectly for

[00:46:03.400]
what that's good at but for the data

[00:46:05.400]
lakes and scale-out cloud types of environments you

[00:46:07.699]
really need to look at a scale-out BI solution

[00:46:09.900]
that can enable you to create

[00:46:12.400]
these data applications and let

[00:46:14.400]
thousands of users really get the value of these

[00:46:16.400]
investments you've made in collecting all this

[00:46:18.400]
information so that's our perspective on what's

[00:46:21.300]
happening in the market and it's pretty exciting times

[00:46:24.199]
yeah I think Steve was working

[00:46:27.099]
off of what John was speaking about and this concept

[00:46:29.500]
they just introduced makes

[00:46:31.599]
a lot of sense hey we've

[00:46:33.599]
got a bunch of questions and

[00:46:35.599]
I think we have some time to go through them just

[00:46:38.000]
so everybody online knows you can put

[00:46:40.400]
questions through the

[00:46:42.400]
panel on the right and also through

[00:46:44.599]
the Twitter

[00:46:46.699]
feed but while we

[00:46:48.699]
go through these questions I just want to give you some

[00:46:50.699]
resources here and just

[00:46:53.099]
note on the right-hand side we have a new EMA

[00:46:55.099]
white paper

[00:46:57.199]
that's coming out and you'll be able to link to it from

[00:46:59.199]
here and we'll send it out via email

[00:47:01.599]
but one question we have

[00:47:03.599]
so far is how

[00:47:06.300]
do companies make the change

[00:47:08.300]
from traditional to data-driven

[00:47:11.500]
well I know one of the keys

[00:47:13.500]
is to make it cultural if

[00:47:16.300]
your CEO is sitting there with his arms

[00:47:18.500]
folded going you know I don't need a

[00:47:21.500]
machine to tell me what I know

[00:47:23.599]
about my customers I know them either

[00:47:25.900]
they don't have a very wide customer

[00:47:27.900]
base or soon they

[00:47:30.099]
won't have a very wide customer

[00:47:32.099]
base because there are people out there

[00:47:34.099]
that are making these changes so one there's a cultural

[00:47:36.400]
component that goes along with that and

[00:47:38.400]
two do I have the systems that allow you

[00:47:40.500]
to take the data that you are collecting and

[00:47:43.599]
make use of it so as we've talked about today

[00:47:45.900]
being able to have an

[00:47:48.000]
underlying architecture that is flexible

[00:47:50.099]
and if

[00:47:52.099]
you will nimble enough to meet those

[00:47:54.300]
changing requirements because I

[00:47:56.500]
can guarantee you one thing data-driven organizations

[00:47:58.900]
don't like the concept of please write all

[00:48:01.000]
your requirements down in a requirements document

[00:48:03.000]
for me and in 6 to 12 months we'll

[00:48:05.099]
be able to execute on them what

[00:48:07.099]
they want to see is an environment where they can

[00:48:09.300]
stick their fingers in the data kind of

[00:48:11.400]
like at a water park, take a look at it, associate

[00:48:13.599]
some other things with it and really

[00:48:15.699]
find those new insights and then be

[00:48:17.800]
able to apply those insights so

[00:48:19.800]
that's you know how you go

[00:48:21.800]
from traditional to data-driven

[00:48:23.800]
one is cultural one

[00:48:26.000]
is technical and when the rubber hits

[00:48:28.199]
the road you're off like

[00:48:30.199]
a shot yeah thank you

[00:48:32.300]
and then this one this next one

[00:48:34.400]
really I think starts speaking to the idea of self

[00:48:36.500]
service but really getting to the capabilities

[00:48:39.000]
of analytics here are

[00:48:41.000]
our business

[00:48:43.099]
analysts technical enough and

[00:48:45.500]
in quotes technical enough to

[00:48:47.699]
drive value from big data analytics platforms

[00:48:50.599]
you know that that

[00:48:52.699]
is a good question I've

[00:48:54.800]
never seen a business analyst go please

[00:48:57.000]
send me the customer data from Hadoop they

[00:48:59.300]
always ask questions like may I see

[00:49:01.300]
the customer data may I see the product data

[00:49:03.400]
and the more that we can encapsulate

[00:49:05.900]
that for them and then there was a slide

[00:49:08.000]
that Steve had up where he had an example of

[00:49:10.000]
a JSON format

[00:49:11.400]
which was very confusing

[00:49:13.800]
and how if you know the data

[00:49:15.800]
really well you can start

[00:49:18.000]
to get in there but if we can take

[00:49:20.199]
that JSON format and encapsulate

[00:49:22.900]
that complexity without losing Steve's

[00:49:26.099]
concept of the fidelity then

[00:49:29.199]
they can start to look at it oh let

[00:49:31.199]
me look at the number of customer events that we have

[00:49:33.300]
let's look at the different products

[00:49:35.300]
that are in these data sets and

[00:49:37.400]
at that point they're

[00:49:39.400]
very adept at being able to

[00:49:41.400]
answer those questions but simply

[00:49:43.500]
dropping them in front of JSON or

[00:49:45.500]
some other type of text-based

[00:49:47.500]
multi-structured

[00:49:49.599]
environment that's going to not

[00:49:51.699]
necessarily confuse them but

[00:49:54.099]
it will throw up a barrier to them

[00:49:56.199]
really jumping in and if we

[00:49:58.300]
can remove that friction that removes

[00:50:00.500]
that barrier and now you've got

[00:50:02.599]
business analysts who are going this

[00:50:05.000]
is how I can take a look at this data

[00:50:07.000]
and I can just imagine that just

[00:50:09.300]
streamlines the process.
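
As a small illustration of encapsulating JSON complexity for analysts, here is a sketch using pandas' json_normalize to flatten nested customer events into familiar columns without losing the underlying detail; the event structure shown is hypothetical.

```python
# Minimal sketch: flatten nested JSON events into an analyst-friendly table
# without throwing away the detail. The event structure here is hypothetical.
import pandas as pd

events = [
    {"customer": {"id": 1, "zip": "60601"}, "event": "view",     "product": "A"},
    {"customer": {"id": 1, "zip": "60601"}, "event": "purchase", "product": "A"},
    {"customer": {"id": 2, "zip": "94105"}, "event": "view",     "product": "B"},
]

flat = pd.json_normalize(events)      # nested keys become dotted column names
print(sorted(flat.columns))           # ['customer.id', 'customer.zip', 'event', 'product']
print(flat.groupby("event").size())   # count customer events by type, no JSON spelunking
```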

[00:50:11.400]
hey Steve I'm going to give you a two-

[00:50:13.400]
parter I think how

[00:50:16.099]
is AI and ML leveraged

[00:50:18.099]
in big data analytics and how

[00:50:20.900]
can BI tools leverage it

[00:50:24.800]
yeah we talked about it a little bit in the webinar

[00:50:26.900]
I think to John's point on are

[00:50:30.199]
business analysts technical enough to handle

[00:50:33.000]
big data analytics I think what's

[00:50:35.099]
cool about machine learning and AI is

[00:50:37.300]
it's kind of like you know power

[00:50:39.900]
brakes or parking assist right we

[00:50:41.900]
want to provide intelligence to where

[00:50:44.099]
casual users to enable them to

[00:50:46.800]
do more things so to me it

[00:50:49.400]
is not just running models and automating

[00:50:51.699]
the world where robots take over but

[00:50:53.800]
it's also assisting humans to make decisions

[00:50:56.199]
more quickly so in a cybersecurity

[00:50:58.400]
example you're monitoring all

[00:51:00.500]
the stuff that's happening in the network and bubbling things

[00:51:02.500]
up for the security

[00:51:04.500]
analyst to look at

[00:51:06.500]
something that's potentially threatening but then do

[00:51:08.699]
some analysis even of things

[00:51:11.400]
that you know machines aren't quite

[00:51:13.599]
there yet on so that's our

[00:51:15.900]
perspective on how AI can help

[00:51:18.000]
analysts
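
As a toy example of that "bubble it up for the analyst" pattern, here is a sketch that flags statistically unusual network traffic for human review; the hosts, byte counts, and threshold are hypothetical, and a real system would use far richer models than a z-score.

```python
# Toy sketch of machine-assisted triage: score per-host traffic and surface
# only the outliers for a human security analyst. All data here is hypothetical.
import statistics

bytes_per_host = {f"10.0.0.{i}": 1_000 + 50 * i for i in range(10)}
bytes_per_host["10.0.0.99"] = 48_000   # an anomalously chatty host

values = list(bytes_per_host.values())
mean, stdev = statistics.mean(values), statistics.stdev(values)

# Bubble up hosts more than two standard deviations above the mean.
flagged = {h: v for h, v in bytes_per_host.items() if (v - mean) / stdev > 2}
print(flagged)  # the analyst reviews one host, not all eleven
```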

[00:51:20.400]
right and a lot of people are talking obviously about

[00:51:22.699]
the cloud right and how does cloud

[00:51:24.900]
factor in with big data analytics and

[00:51:28.199]
specifically can Arcadia Data support

[00:51:30.300]
that

[00:51:33.400]
the second part is we have a

[00:51:35.800]
number of customers that have deployed systems in

[00:51:37.800]
the cloud you know not

[00:51:41.400]
many people are cloud-native let's say unless

[00:51:43.400]
they built their business on say Amazon

[00:51:45.800]
S3 to start and we have companies like that

[00:51:49.199]
Neustar is one example or software-as-a-service companies

[00:51:51.699]
as you'd expect but most large

[00:51:53.900]
enterprises are hybrid environments where they'll

[00:51:55.900]
have some data in the cloud some data

[00:51:59.400]
on-prem so we can work across

[00:52:01.500]
both those environments and so

[00:52:03.500]
certainly more and more people are moving to the cloud and

[00:52:05.500]
they're just working out do we do a lift-and-shift and move

[00:52:07.800]
our environment there or re-architect things

[00:52:10.000]
it's primarily a cost-based

[00:52:12.099]
thing that people look at initially

[00:52:14.199]
they don't want to have to manage a data center but that's not

[00:52:16.199]
always the right fit for certain

[00:52:18.400]
industries and everything

[00:52:20.800]
else but I think over time more and more people will

[00:52:22.800]
be moving so we've architected

[00:52:24.800]
our software to support that for sure
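
As a small sketch of what "working across both environments" can look like at the code level, the snippet below reads the same table from S3 or from on-prem HDFS behind one function; the bucket, namenode, and paths are hypothetical, and the s3:// and hdfs:// URLs assume the s3fs and pyarrow dependencies are configured.

```python
# Hypothetical sketch: one analytic code path over hybrid storage. The s3://
# URL needs the s3fs package; the hdfs:// URL needs a configured pyarrow
# HDFS client. Bucket, namenode, and table paths are made up for illustration.
import pandas as pd

SOURCES = {
    "cloud":   "s3://example-bucket/warehouse/orders.parquet",
    "on_prem": "hdfs://namenode:8020/warehouse/orders.parquet",
}

def load_orders(where: str) -> pd.DataFrame:
    """Same analytics regardless of where the data physically lives."""
    return pd.read_parquet(SOURCES[where])

# orders = load_orders("cloud")    # lift-and-shift today...
# orders = load_orders("on_prem")  # ...without rewriting the BI layer
```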

[00:52:27.300]
excellent excellent I think

[00:52:30.300]
we have time for one more so

[00:52:32.300]
we had a lot of questions so I'm going to try to get through this one

[00:52:35.400]
John

[00:52:37.800]
can you give specific examples

[00:52:40.300]
of how front-line employees can

[00:52:42.900]
use self-service apps in data-driven

[00:52:45.000]
organizations

[00:52:47.000]
you know I got a great example

[00:52:49.099]
of a case study recently

[00:52:51.300]
around an organization that's

[00:52:53.300]
not a big group

[00:52:55.300]
but a small organization

[00:52:57.500]
that ships out you

[00:53:00.699]
know candy, snacks, et cetera

[00:53:02.900]
things that you might find in the

[00:53:05.000]
break room of your average company and

[00:53:07.099]
what they did was they

[00:53:09.699]
set up information about how

[00:53:11.900]
first shift was doing against 2nd shift

[00:53:14.199]
and how they were meeting their SLAs

[00:53:16.300]
and things of that nature now you

[00:53:18.699]
could do that in an exploratory dashboard

[00:53:20.699]
but because they made an app and they just

[00:53:22.699]
presented it to their teams those guys

[00:53:24.900]
were now informed and said hey you know second

[00:53:27.599]
shift hit their numbers

[00:53:29.800]
we're going to go and try to hit our numbers

[00:53:31.800]
and not in a competitive cutthroat

[00:53:34.199]
type of thing but that friendly if

[00:53:36.500]
you will competition that you see a lot of times

[00:53:38.800]
between shifts or between offices

[00:53:41.000]
and things of that nature and it

[00:53:43.300]
was very simple it was about how are we meeting

[00:53:45.400]
our customers' needs how

[00:53:47.699]
are we meeting our SLAs which are all part

[00:53:50.000]
of the way the organization's working but

[00:53:52.000]
presenting it in an app that

[00:53:54.000]
was clean and clear and those

[00:53:56.199]
teams got more productive because they

[00:53:58.900]
knew what the levers were and what was

[00:54:00.900]
happening and you know people were

[00:54:03.000]
presented with information it's I

[00:54:05.599]
want to call it really really old

[00:54:07.699]
school but you know what gets

[00:54:09.800]
monitored gets managed when you present it

[00:54:11.800]
to your teams they want to get

[00:54:13.800]
better and if they know that something they do

[00:54:15.800]
helps the organization
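
A tiny sketch of the shift-versus-SLA comparison that app surfaced, with made-up shipment numbers and a made-up SLA target:

```python
# Minimal sketch: compare shifts against an on-time SLA, the kind of simple,
# clean metric the app presented. Shift data and the SLA target are made up.
import pandas as pd

orders = pd.DataFrame({
    "shift":   ["first", "first", "second", "second"],
    "shipped": [480, 510, 530, 525],
    "on_time": [455, 470, 520, 510],
})

SLA = 0.95  # target: 95% of orders shipped on time

by_shift = orders.groupby("shift")[["shipped", "on_time"]].sum()
by_shift["on_time_rate"] = by_shift["on_time"] / by_shift["shipped"]
by_shift["meets_sla"] = by_shift["on_time_rate"] >= SLA

print(by_shift)  # second shift hit their numbers; first shift can see the gap
```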

[00:54:18.199]
okay well we're coming up to the top

[00:54:20.400]
of the hour and you know I thought this

[00:54:22.400]
was a great webinar thank you John and

[00:54:24.400]
Steve for giving us your insight

[00:54:26.900]
and you know needless to say

[00:54:29.000]
as more people are turning to big data

[00:54:31.000]
analytics as a

[00:54:33.000]
standard now you

[00:54:35.099]
know really understanding the ecosystem

[00:54:37.699]
and what kind of tools you need to handle it

[00:54:39.800]
becomes much more

[00:54:41.900]
important so I want to thank

[00:54:43.900]
everybody for joining us we

[00:54:45.900]
couldn't get to all the questions but thank you very much I

[00:54:48.500]
will get back to you and you

[00:54:50.800]
know keep your eye out for this

[00:54:53.199]
new info that

[00:54:55.500]
EMA is putting out and

[00:54:57.500]
it's going to be really interesting so

[00:54:59.699]
thank you everybody and

[00:55:01.900]
good evening