4 Key Considerations For Your Big Data Analytics Strategy



John L. Myers
Managing Research Director for Data and Analytics
Enterprise Management Associates


Steve Wooledge
VP Marketing
Arcadia Data

More and more companies are turning to big data analytics to fulfill their need for information. But some of their tools are simply not sophisticated enough to scale effectively.

To get the most out of big data, the tools need to be both flexible enough to accommodate a whole range of users and powerful enough to support exploration, machine learning and operational analytics. Most importantly, the users should be able to stream events from their devices and interact with the data in real time.

Watch this webinar with industry experts from Enterprise Management Associates and Arcadia Data to learn:

  • The state of big data analytics today
  • How data-driven organizations push analytics to the front line
  • Why exploration/discovery requires detailed data
  • The necessity of integrating streaming data in real time
  • Real world examples of implementation


So hello, good morning, good afternoon and good evening to everybody. Welcome to our webinar on key considerations for your big data analytics strategy, which is co-presented by Arcadia Data and EMA. My name is Paul Revere, and I'm your moderator today. More and more companies are turning to big data analytics to fulfill their need for information, but some of their tools are simply not sophisticated enough to scale effectively. So today we have Steve Wooledge, VP of Marketing at Arcadia Data, and John Myers, Managing Research Director at EMA, who will provide us with the four key considerations that we should all think about. So John, Steve, thanks for joining us today.

I just want to let you know we have a pretty good audience here. It's international in nature; we've got pretty much all the continents, including Australia, and the audience includes data architects, project managers and even a marketing director, so I think we're going to have a pretty good conversation here.

These are the four key sections that we will be going through: how big data analytics has evolved, discovery, looking at real time, and why self-service is important. We will get some real-world examples, and of course, as I said, we will do the questions and answers.

So, consideration number one: big data analytics has evolved. John, can you start us off on that, presenting along with Steve?

You know, I think one of the things that is really driving the evolution of big data analytics is something that we at EMA call data-driven cultures; data-driven strategies are all part of being a data-driven organization. In the past, we had organizational structures that were based on looking at data as kind of a backward-looking thing, or, if you will, the exhaust of the operation. You know, what was in the transaction log was the revenue for the month, and we were always kind of looking backwards. But we're seeing a lot of organizations that are starting to say: how do we take the data that is inherent in our organization already, and if you look at that data icon on the left-hand side, how do we analyze it, how do we discover inside it, explore around in it, and how do we use it to help us become a better organization? Whether that be doing cross-sell and upsell to our customers, or improving business models, things of that nature. Really, at this point, these organizations are saying: how do we take the data that we already have, or these new data sources that we see in the environment, and how do we leverage that?

Here at EMA we have a concept called the hybrid data ecosystem, and as part of that we looked at what applications organizations are really using as part of their big data environment, and they're really looking across a wide swath of different types of things. Sometimes we think of big data as being kind of the data science or exploratory environment only, but it's really evolved since those early days of Hadoop. We now see organizations doing everything from that exploration type of activity to operational analytics, trying to use it to address fraud, whether that be from a credit card transaction or a telecommunications event type of process. How do we look at it in terms of operational processing? How do we make big data part of the core of what we're doing for our business? They're also doing analytics in terms of managing their operations: they're looking at customer experience, they're looking at marketing and sales, they're getting new insights into their customers and their products, which products are selling well, which customers are interacting with their applications, et cetera, and how do we manage the cost and risk associated with that?

In those early days of big data analytics, you know, we had kind of the prototype data scientist. The gentleman on screen was actually the CTO for the Obama campaigns in 2012 and 2008, and he's a great genius who pulled out a lot of great information, but he was a very technical resource, someone who really got into the numbers and the data and what was going on. We're seeing an evolution of the data consumer associated with big data, and we're seeing that individual being more of a senior or advanced business analyst, someone who looks at the business reasons for what they're doing with the data and less at the technical aspects. A lot of these folks are looking at things differently. Whereas we might have had our traditional data scientist as someone who could drop down to the command line and do a lot of their own analytics in terms of a script or something of that nature, these citizen data scientists, these advanced business analysts, are looking at things a lot differently, and the tools that support them need to be a lot different in the way that they take a look at things. And Steve, I know that you guys over at Arcadia look at this challenge the same way that we're seeing at EMA.

Yeah. What's interesting to me is I've been in the big data market for about 10 years, and 10 years ago we would talk about how data has changed: instead of rows and columns, it's complex and multi-structured, there's more need for real-time information, and on top of that complexity there are lots of different data sources. So we talked about data volume, variety and velocity, the three big V's, if you will. And it was systems like Apache Hadoop which really gave a lot more flexibility for people to deal with these multi-structured data sets. To your point, it was the data scientists, these really technical people, that would go in there with the different types of coding environments, Pig and so on, behind these different things to be able to get information out of that.

Really, what has not changed until recently is BI tools. As you said, SQL is the lingua franca for analysts that want to go after data, but the BI tools that we have today are really relying on extracts and aggregates pulled out of those Hadoop clusters in many cases, or if you're trying to query the Hadoop-based systems directly, a lot of times they haven't built in really good workload management and such. So the question has been: why haven't BI tools really evolved to keep up and enable these business analysts to do what they want? Part of it is the architecture: they're not built to scale the same way distributed, scale-out MPP systems work. They run in a single-node environment, and that's fine if you've got a small number of users going after big data, and it's fine if you've got lots of users on a small data set, but it doesn't handle the combination of those two. And tools that were built for data warehouses weren't meant to interpret JSON structures or other semi-structured things that have metadata and schema embedded in the data format itself. So then you've got to flatten out the data and put it into a relational format that these BI tools can access, which just slows things down.

Then the other thing I think is really interesting is the promise of Hadoop and the data lake: it was load and go, schema on read instead of schema on write, meaning we don't have to transform the data and get it normalized before somebody can access it. In the data warehousing world, I've worked with customers that literally would say that if they wanted to add a new dimension to the data warehouse, it would be 6 to 12 months' worth of time to go through all the process to get it set up, and roughly a million dollars of cost at a large healthcare company that I worked with previously. The agility that people need in a big data world means you can't operate in that way.

So what we see is that people who have either a data warehouse or a data lake, if they're using a traditional BI tool, have to go through this process where you land the data, you secure it, you build some kind of a semantic layer or model to be able to put business terms on that data, then you need to move that into a BI server, or you could create optimizations within the data warehouse to speed up the performance, and then provide that security layer, performance and modeling to the end user so that they can do ad hoc discovery and analysis like we're talking about, with built sets of questions that IT set up in the schema for an end user to access. If all the data wasn't there, that end user has to go back to IT and say, 'I'd like to add this dimension,' and that's where the cost and time come in. So when you get to the second generation and iteration of all the different questions somebody will want to ask to do discovery, it becomes a very arduous process; you have to keep going back and forth to IT.

What we're really trying to do is make sure that you don't have these issues: the repetitive nature, the long process, and a lot of cost. We want to shorten that. Within a data lake, these new technologies that are native, that run inside the lake, provide a lot more flexibility. You can do the ad hoc discovery up front because they can interpret complex data on the fly; you don't have to do physical modeling until you decide you want to push those insights out to a broader audience in production. Also, there's only one security model: native BI tools will just inherit the security based in the data platform. You're not moving data out to a separate BI server or into a data warehouse, so all the data that you need is in one system. That speeds up dramatically the iteration for discovery, but then also the ability to deploy it out to a large number of users in a production mode. So that's really what's changing, I think, from a process perspective, to enable those end users, which are much more business-oriented, to your point, John.
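The schema-on-read flow described here can be sketched in a few lines of Python. This is a minimal illustration, not Arcadia Data's implementation; the record layouts and the `read_field` helper are invented for the example:

```python
import json

# Schema-on-read sketch: each raw record keeps its native nested shape,
# and structure is interpreted only at query time, not at load time.
raw_records = [
    '{"user": {"id": 1, "plan": "gold"}, "event": "login"}',
    '{"user": {"id": 2}, "event": "purchase", "amount": 42.5}',  # shape drifted
]

def read_field(record: dict, path: str, default=None):
    """Walk a dotted path ('user.plan') through nested JSON at read time."""
    node = record
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default  # tolerate missing fields instead of failing a fixed schema
        node = node[key]
    return node

records = [json.loads(r) for r in raw_records]
plans = [read_field(r, "user.plan", default="unknown") for r in records]
print(plans)  # ['gold', 'unknown']
```

No upfront transform step is needed: the second record's extra `amount` field and missing `plan` field are both handled at query time, which is the agility the schema-on-write pipeline gives up.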

Oh, I agree, and I think that one of the keys to this is being able to discover at that detail level. A lot of data-driven organizations are able to go through that because a lot of times you don't know what you're trying to look at; you're trying to make that consideration, and a lot of times, when we're presented with information, it comes in a big mass, and it's very hard to tease out the key components that go along with a particular aspect. Visualization is how a lot of people approach it, because people are naturally a very visual, pattern-seeking group, to be able to bring that information out and take a look at it.

Now, in the past we've had situations where people have dumped data from, you know, whether it be Hadoop or the EDW environment, and gone with a 'desktop cowboy' type of approach, where they're looking at and exploring data only on their desktop. But when you do that, you've got to get the data out, as you pointed out, to look at it; after a while it becomes isolated in a particular environment, and you really don't have the freshness that you would like. You may find an insight, but you may not have the underlying supporting data that really makes it work. So we're looking at organizations that are really trying to stay away from this desktop cowboy approach to their exploration.

Now, I've talked about how exploration is great for data-driven organizations in terms of what they're doing with analytics and how they're looking at their operations. Another area where we're seeing a lot of value is the growing area of machine learning and artificial intelligence. A lot of people think this starts with the machine learning, but a lot of times you've got to get that data so that you can look at it, do some exploration around it, and understand the components that go with it. Then, after you've done those things, you can really enable that machine learning process and the artificial intelligence techniques that allow you to scale this out. Being able to use discovery and exploration at that detail level is really where organizations are finding the value of doing this, and if they can go through that process, it really makes things fly.

Exactly. And there's a lot of excitement around machine learning and artificial intelligence, but you have to make it usable to end users. Now, there is a lot of automation that will happen, but in the case of cybersecurity, we've worked with companies that need to enable security analysts to be notified if there is a potential threat, using machine learning. In this screenshot we're showing a demo application we built where machine learning is essentially ranking the potential threats of different endpoints or users in a network, looking at the NetFlow data and all the log files through the system. This is an entire view of all the different systems within an organization. You're able to do visual exploration and say, all right, I see there is an issue here, potentially, with this endpoint; looking over on the right at the network graph, what are the other machines that that endpoint is connected to, and who are the users associated with it that might be involved in this scam or whatever has happened; and then be able to drill into as much detail as necessary, right within one analytics system. If you have to switch from one application to another to try to do these types of threat hunting within cybersecurity, it takes a lot longer to address the problem. And we've all seen the news, where banks and retailers, et cetera, are getting hacked. So the more visibility we can provide across all the big data to the security analysts, who may not be trained data scientists but who can understand network performance and packet sniffing, the better; all those things are really, really important. Those are the kinds of things that are really possible when you've got access to all the data on a system that allows end users to self-serve.
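The ranking idea behind that demo can be sketched simply. The NetFlow-style records and the z-score heuristic below are invented for illustration; a real deployment would use far richer features and models:

```python
from statistics import mean, pstdev

# Hypothetical (host, bytes transferred) flow summaries, invented for
# illustration -- not the actual demo application's data model.
flows = [
    ("10.0.0.1", 1_200), ("10.0.0.2", 1_150), ("10.0.0.3", 1_300),
    ("10.0.0.4", 1_250), ("10.0.0.5", 98_000),  # one anomalously chatty host
]

def rank_by_anomaly(records):
    """Rank hosts by how far their traffic deviates from the fleet average."""
    volumes = [b for _, b in records]
    mu, sigma = mean(volumes), pstdev(volumes)
    scored = [(host, abs(b - mu) / sigma) for host, b in records]
    return sorted(scored, key=lambda hs: hs[1], reverse=True)

ranking = rank_by_anomaly(flows)
print(ranking[0][0])  # the outlier host surfaces at the top of the analyst's list
```

The point of the sketch is the workflow, not the statistics: the model pre-ranks endpoints so a security analyst starts from the most suspicious host and drills into detail from there, rather than scanning raw logs.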

So the thing I would say is that there are different types of architectures out there, and we really need to be able to scale to support, you know, petabytes of information for these scenarios. One example is when you've got a data lake and a BI tool like those we know today. There's nothing wrong with that; you can still pull data out and put it into the BI server, but it's going to, again, be aggregated for the most part, so you can't drill to detail as quickly as you need to, and in many cases you might have to call up the data scientist to go figure out what's happening down at the log file level. Not to mention, if you're shipping data around, you have to do different types of modeling within the BI server, which is different from the data lake, and you've got to keep those in sync, as well as the security. So that can be a hurdle.

There's been a new generation of big data BI middleware, essentially, that allows you to do cubing on the data lake cluster. What happens is, it's kind of like the traditional cubes that we saw 15 to 20 years ago, where you have to model the cube in advance: you have to figure out what dimensions somebody's going to want to analyze, and you put that on an edge node in the cluster, but that's only refreshed once every 24 hours, typically, so it's not a real-time view. And again, if there's some detail a user wants that's not within that cube, they've got to go down to another tool to get into the detail. What's also important is that a lot of the data locality information, the information on how to speed up queries, is stored down at the file system level, so if you're only passing SQL queries back and forth from the BI tool to these edge nodes, you're just losing a lot of the understanding of how that data is stored, the filters, and the semantic information that's out there.

So, again, native BI technology is built to be a fully distributed BI server that runs directly in the data lake, whether that's HDFS or an object store in a cloud environment. The data locality understanding of where data is placed means you can optimize query performance and push that semantic model, that semantic layer, as far down into the distributed cluster as possible to really speed up the performance, with intelligence about how to do that. What also happens then is, yes, there are aggregates and logical views, or physical views of the data, that are created, which will speed things up and allow you to have hundreds of security analysts monitoring the network, let's say. But it's done in an iterative way: it's based on actual usage, what tables people are accessing and what queries are running, and then machine learning is essentially recommending, back to the administrator, the way to structure that data so the queries can be sped up over time. The best part is there's no data movement; security is built in; and you don't have to worry about people taking data out of the cluster onto a separate BI server or onto their desktop, which gets into data governance issues. So this allows IT to keep it really clean, and allows the end users to access, analyze and share information. We call this 'lossless' in that you're not losing any of the high-fidelity, high-definition insight, because it's all done natively within the data lake itself.
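The usage-based acceleration described above can be sketched as a toy: watch which groupings analysts actually run, and materialize a rollup only once a pattern is hot. The event table, threshold, and function names are all invented for illustration:

```python
from collections import Counter

# Toy detail table; a real system would hold this in the data lake.
events = [
    {"region": "EU", "amount": 10}, {"region": "EU", "amount": 30},
    {"region": "US", "amount": 25}, {"region": "US", "amount": 15},
]
query_log = Counter()   # which grouping columns analysts actually use
aggregates = {}         # materialized rollups, built only when warranted

def group_total(column):
    query_log[column] += 1
    if query_log[column] >= 3 and column not in aggregates:
        # Hot query pattern: materialize the rollup once, reuse it afterwards.
        rollup = {}
        for row in events:
            rollup[row[column]] = rollup.get(row[column], 0) + row["amount"]
        aggregates[column] = rollup
    if column in aggregates:
        return aggregates[column]  # served from the precomputed aggregate
    # Cold pattern: answer with a full scan of the detail records.
    return {r[column]: sum(x["amount"] for x in events if x[column] == r[column])
            for r in events}

for _ in range(3):
    totals = group_total("region")
print(totals)  # {'EU': 40, 'US': 40}
```

The detail rows are never moved or discarded, so a drill-down can still bypass the rollup, which is the 'lossless' property described above.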

And Steve, I think one of the things you raise here with the native BI within a data lake architecture example is not just data governance, if you will, from a management-of-the-data perspective. We're now starting to see privacy and regulatory compliance become a whole lot more important to organizations, particularly with, I guess, the first boogeyman out there: GDPR, and its implementation time frame later this year. Being able to minimize the hops and the replication of the data that you're talking about, and you spoke quite nicely about the high-definition analytics that goes along with it, you also have that sense of understanding and management of that data that allows you to answer questions, if you will, about PII compliance, PCI compliance, GDPR inquiries, things of that nature, where in these other architectures it gets a little dicey when we have different pieces in different places.

Yeah, that's a great point, and it's only going to get more and more stringent. That need for data stewards and data governance is just going to grow.

Those are really insightful and important points. Now, consideration number three: real time is a real thing. John, can you start with that?

Sure. What we're seeing at EMA is the increased use of the growing area around streaming data: streaming data integration points and things of that nature. We've done multiple end-user studies; in one we saw that 82% of organizations are looking at streaming technologies and streaming applications, and in another one we saw 72%. So, you know, there are a lot of organizations that are getting into: how can we look at the real-time data that comes from our organization, again, those new data sets that data-driven organizations are focused on? Now, some of them are the traditional types of IoT instances: you know, connected vehicle data, data coming off of wellheads out in an environment, how do we better manage our renewable energy contributors to our electrical grids, et cetera. But we're also seeing it in terms of real-time applications: how are we looking at ordering, fulfillment and payments to be able to better understand how these things are going? This gives us, as data-driven organizations, the ability to say: can we start to project the types of sales that we are going to see based on the people who are browsing and the types of products that they're browsing? What are some of the components that go along with that? Are we seeing order, fulfillment and payment all within a particular time frame? If so, what are the products that make that happen? If we see lags, what are the products or customers that look like that as well?

And take that same type of mentality down to the mobile or online path analysis: being able to understand not just the transaction level, which is, you know, the shopping cart, things of that nature on the back side, which would be very backward-looking, but what path are our customers taking to get to either a purchase or an abandonment? What are some of the things that they're doing? Are we seeing customers that are using us as a showroom to look at things that they might buy at another location? How are they using our mobile applications, and things of that nature? In these areas we're seeing organizations take on that real-time streaming type of approach. And a lot of those things are not available if we take the approach that Steve was talking about, where we have to stop and replicate or move data into particular places, because when we look at the new business models and the new product development that go into the business scenarios associated with streaming in real time, we don't have time to make those particular stops.

In terms of process and operational productivity, you oftentimes will find organizations looking at: how do we improve the performance of our manufacturing floor, or how do we improve the performance of our warehousing and inventory processes? And while there is some wiggle room in there, if we don't have a product, how do we get that information to our customers in a time frame that allows them to make the most of it, whether we can offer them a different product or set their expectation that the product they'll be getting will come later on? And then, if we can start to bring that into our supply chain management, to be able to say: how do we understand where all the inputs and outputs of what we're trying to do are sitting within our supply chain, and be able to manage those things across the process? Now, again, some of these can have a little bit of wiggle room in them, but for the most part, the business analysts and the business models that are looking at these new types of things need to be able to rely on real-time, scalable types of activities so that they can identify a new business model as it's taking place. Imagine, if you will, we had a major sporting event and somebody wore just the right shirt, and everybody goes, 'Hey, I want that shirt,' or the right type of glasses. Now, if you're a business analyst, you can go: 'Hey, can we support that? Can we set that up? Yes, we can; we can identify it,' and now they can execute on that. Whereas before, when we had friction in there, or we had that lower fidelity of the analytics or exploration process, we would run into issues that went along with that.

Now, that's not to say that this stuff going on with streaming is a perfect panacea or a silver bullet, if you will. We actually still have some obstacles to what organizations are trying to do. One is the quality and the reliability of the data; that goes back to the multiple hops of the older types of approaches for BI and big data analytics. You know, do we have the right data, and is it where we need it to be? Can we get that real-time connectivity to those data sources, so that we can feel confident that if we build something out, we'll be able to get the right information to the right people? I think the only thing worse than not making a recommendation at the end of a sale, for a cross-sell/up-sell type of thing, if I'm an online retailer trying to create that great customer experience, is to make the wrong one: to not understand what a customer is doing right then and there, or what they have been doing over time, and say, 'Hey, how can we help these folks?' You also see that a lot in terms of customer care: how do we understand if a customer is having an issue with one part of our organization, and now they're on with customer care, how do we get that context? And then the other issue is the inability to ingest and store that data as we're moving forward, because the streaming data starts to build up. You know, I think everybody kind of imagines that all of our connected fleet vehicles will be automatically enabled with streaming data. It's going to build over time, but once we get to some of those critical masses, we're going to have a whole lot more data coming into these systems; are the organizations really able to handle that? Steve, I know that you guys have seen some great ways to handle that in terms of how the Arcadia platform works.

Yeah, I see your point on connected vehicles, right? You want to be able to see what's happening in real time within your fleet, and if there's a driver that stalled out somewhere, or there's an issue with a machine, be alerted to that in real time so the analyst can see what's happening, but then also go look down into the correlation: was there a part that's failing based on driving behavior over time, and that sort of thing. So what we see is a lambda-style architecture, where you have sort of a path for streaming data that you need to be able to visualize, but you also want to land and store that data over time so you have the historical context. If you can provide both the real-time and the historical on one interface for an analyst, it just helps reduce that friction like you talked about. Kafka is certainly one of the things that we're seeing more and more customers adopt for the streaming piece, and there are also Spark Streaming and Flink and all these other streaming technologies out there. It's just a pretty fascinating world. We're living in a real-time, on-demand world; between Twitter, Snapchat and everything else, people expect to be able to sense, respond and interact with data in real time, and the enterprise needs to be the same way. It just happens to be a lot larger data volume, so that's what we need to be able to deliver to our clients, for sure.
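The streaming path of that lambda-style architecture boils down to bucketing events into time windows as they arrive. Here is a minimal tumbling-window sketch in plain Python; the event feed and 60-second window are hypothetical stand-ins for what a Kafka/Spark Streaming/Flink pipeline would provide:

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # hypothetical window size for the sketch

def tumbling_counts(events):
    """events: iterable of (epoch_seconds, vehicle_id) -> counts per window."""
    windows = defaultdict(int)
    for ts, vehicle in events:
        window_start = ts - (ts % WINDOW_SECONDS)  # bucket event into its window
        windows[(window_start, vehicle)] += 1
    return dict(windows)

# Invented connected-vehicle telemetry feed.
feed = [(0, "truck-1"), (10, "truck-1"), (61, "truck-1"), (65, "truck-2")]
counts = tumbling_counts(feed)
print(counts)  # {(0, 'truck-1'): 2, (60, 'truck-1'): 1, (60, 'truck-2'): 1}
```

The same per-window results feed a live dashboard, while the raw events also land in the lake for the historical side of the architecture.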

And part of that, sorry, next slide, the other part of that is complex data, as we mentioned earlier. You have the visual of it; it's hard for humans to interpret the stuff you see on the left. This would be a JSON file where you've got nested structures, and traditionally you'd have to flatten this out into some tables and then serve it up through a data-warehouse-based BI tool. A native approach allows you to interpret that, build the structure on the fly, and visualize it straight off the data as it's landing in the cluster. So data might be coming in on the stream, and you're picking that up, but it's also being visualized essentially instantaneously as it's written to disk in the system. So it's just a really streamlined ability to look at that information as well.
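The traditional 'flatten it into tables' step mentioned above looks roughly like this. The recursive helper and the sample record are invented for illustration:

```python
def flatten(record, prefix=""):
    """Recursively flatten nested JSON into dotted column names,
    a toy version of the flatten-then-load step described above."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat

event = {"vehicle": {"id": "truck-1", "engine": {"temp_c": 92}}, "speed": 54}
row = flatten(event)
print(row)  # {'vehicle.id': 'truck-1', 'vehicle.engine.temp_c': 92, 'speed': 54}
```

It works, but the output table's columns are frozen by whatever shape the JSON had at flattening time, which is exactly the brittleness the native, on-the-fly approach avoids.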

And Steve, I think one of the things that you raised is a good one. You show the structure on the left-hand side there, and a lot of times with streaming data sources, they are still in the process of defining what the structure is going to be. So some organizations will say, 'Hey, let's flatten our JSON so that it starts to look like a table.' Now, that's one approach that can work, but if you have flexibility and variability in that JSON, you're going to run into issues where either you have data that doesn't match up with the flattened table structure, or you're going to start to lose data elements that go along with it. And I think that's one of the things that, particularly in these early stages of IoT, and particularly in the environment around mobile and application development, a lot of those guys, I wouldn't say they're flying by the seat of their pants, but they don't have as robust a configuration and change management process as we probably would like to see in those environments, and they're changing their JSON, which is what it's designed to do, on the fly. You need to be able to handle that, as opposed to making a choice of: do we let it in and lose some of our fidelity, or do we keep it out? Having that flexibility really enables organizations to do what they're trying to do when they're working with it.
to do with that they're trying to do that

you're working with i that takes us to

self service John

you want to start off with I want to know

about Big

Data consumers are coming in all shapes and sizes

We've got the traditional data scientists; we've got our BI analysts, those folks that might be a little bit more focused on the technology side. But we're starting to see people in data-driven organizations who want to get in and interact with this data coming from, say, external users like customers or partners. We're looking at our line-of-business executives, who say they want to get in and start taking a look at this data as it's being made available; our operations teams and frontline employees really want to start getting their fingers into the data. But unfortunately, you know, not all of these folks are looking for the same type of experience. What they are looking for is a self-service type of experience, going back to Steve's point about removing the friction of how things work. They always want it now: I've never met a business analyst or a business executive who's comfortable with waiting for the data that they want to look at. In a data-driven organization those requirements continue to churn up, but how can we get people the information that they need, in a format that they can consume quickly, without having to pull them all into these big environments?

The best example I like to use, and I don't know if you folks like to use it or not, is the self-checkout scanner that you see popping up in big-box retailers. You know, the home improvement store that I go to now uses them, my grocery store uses them, and I hear a rumor we're soon to have an entire store that is nothing but self-checkouts. How do we enable that, so that if I've got a small question or a small task that I need to do, people are able to go out and get that work done? As I pointed out, our data scientists and our advanced data analysts are going to like that exploration experience, because they kind of know what they want to do; they're comfortable with the technology and things of that nature. But as we get out to more of a frontline employee, or the edge of our data interactions, our partners and our customers aren't going to want that exploration experience. They're going to want to know: 'Hey, this is the order that I have; can I understand where it's going?' If that's a partner: how are the products they're selling through us being managed? For our operational teams, like our frontline employees, a great example is customer care. A lot of those folks don't necessarily want to, or have the time in their job to, go exploring around in data. What they're really looking for is to have the information about a customer pop up: are they having a good or bad experience with our organization? Are they a good customer, a neutral customer, a bad customer? Sometimes I describe it as gold, silver, bronze and lead. And how can I help them with the next step? That's much more of an application, if you will, than an exploratory dashboard where you have to start to configure and do things like that. What we're really looking for is an environment that allows us to merge these two things together and really meet the needs of what each of these data consumers is looking for.

Yeah, and to do that you need to have the performance, the scale, and the speed to support these types of applications. The cybersecurity example I gave earlier was one where you have information flowing in from your security systems, you can click on those things, and the system ranks the information coming in and surfaces what the machine learning models find to the humans, who feed back into it. Connected cars is another example. It's having that distributed system that gives us the scale and performance to enable that.

The other thing is just huge numbers of users. We've got customers that are supporting over 1,200 users, and we've got one that wants to scale up to something like a hundred thousand users and really build a customer-facing, analytics-as-a-service application. Neustar is one example: they built marketing attribution and other types of analytical services that they provide back as a service to people. So within the system we've got to figure out a way to support that high level of user concurrency, and if you choose not to configure multiple environments, you probably want a multi-tenant environment. What we've got is patented technology around what we call Smart Acceleration, where again we are monitoring the data usage and providing recommendations on analytical views that can help support hundreds of users with up to 800x faster performance as they come in. It's minimal modeling; we're not building those views in advance. They're being created based on machine learning recommendations from the system about what data needs to be aggregated, cached, and stored physically in a different way to speed up that type of data-driven application for the business.
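The idea behind usage-based acceleration can be sketched in a few lines: watch which column combinations queries actually group by, then recommend precomputing aggregates for the hottest ones. This is only an illustrative sketch of the general technique, not Arcadia Data's implementation; the class and method names are invented for the example.

```python
from collections import Counter

class AccelerationAdvisor:
    """Toy sketch of usage-based view recommendation: log the
    grouping columns each query uses, then suggest pre-aggregating
    the most frequent combinations."""

    def __init__(self):
        self.usage = Counter()

    def log_query(self, group_by_cols):
        # Record the combination of columns a query grouped by.
        self.usage[tuple(sorted(group_by_cols))] += 1

    def recommend(self, top_n=1):
        # Recommend analytical views for the hottest combinations.
        return [cols for cols, _ in self.usage.most_common(top_n)]

advisor = AccelerationAdvisor()
advisor.log_query(["region", "product"])
advisor.log_query(["region", "product"])
advisor.log_query(["customer"])
print(advisor.recommend())  # [('product', 'region')]
```

A real system would also weigh cardinality and storage cost before materializing a view, but the core loop (observe usage, rank, precompute) is the same.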

This is a test result from one of our end customers, a large telecommunications company that supports, I'll just say, webinar types of events. They have a customer support team that needs to understand and troubleshoot the network if there's an issue happening with someone's deployment of the company's product. They were using traditional legacy BI tools and wanted to access the data they store. They looked at different SQL-on-Hadoop options like Hive and Impala and Spark, but those gave them the ability to do granular analysis without the complex queries, and not for 30 concurrent users, which was their requirement for the system. With a distributed BI server, a native BI product, handling and providing those analytical views, queries came back in a reasonable amount of time, whereas once you got to 15 or 30 people on the system, these other technologies just could not bring the results back at all; the queries would not complete. So this is a view of how that performance starts to look.

It's pretty amazing. I didn't realize that's how quickly that goes. Can we go to some real-world examples?

Sure.

I've got a couple more. I was just talking about this telecommunications company; at Arcadia we also work with a lot of smaller, startup, digital-first types of companies. Neustar is one example: they acquired MarketShare, and they're building that self-service marketing attribution type of application, software as a service, for their end users. We also have large CPG companies and online retailers. On security, Kaiser Permanente is a customer; going back to John's point around data governance, one of their issues was that they didn't want to have to pull data out of the cluster and worry about PII information being downloaded to somebody's desktop from the legacy BI tools. Being able to keep it all in one place is really, really important for them.

If you double-click on some of these different types of marketing applications that people are looking at: you talked about the path to purchase, and that is sort of the golden path of what people are trying to understand. How are consumers interacting with the brand, and are they going to purchase something or not? Do we have the right audience for our marketing? Are we scoring and understanding consumers? Are we measuring the impact? Which stores are doing well from a sales distribution perspective? So it's being able to allow the campaign manager, the product manager, whoever it is, to see which campaigns are running, how people are interacting with them, and to do multivariate testing and those types of things. For the digital dollars we're spending, what's the value of that in reaching our audience?

And then you might have really good sales in, say, the state of New York, but when you double-click down into the ZIP code, or even the parcel level (I've been hearing about people looking down at the household level), you can track your coupons, digital ads, and different things at that granularity: intent to buy and things like that, not to mention ZIP code, et cetera. Are we wasting our dollars marketing to a certain part of the state or the county? These are questions that end users can now ask for themselves once they've got all the data in one place.
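Path-to-purchase analysis usually comes down to an attribution model over each consumer's sequence of channel touches. Here is a minimal, generic sketch of two common models, last-touch and linear attribution; this is illustrative only, not Neustar's or Arcadia Data's actual logic, and the channel names are made up.

```python
def last_touch(touches, revenue):
    """Give all credit to the final channel before purchase."""
    return {touches[-1]: revenue}

def linear(touches, revenue):
    """Split credit evenly across every touch on the path."""
    credit = {}
    share = revenue / len(touches)
    for channel in touches:
        credit[channel] = credit.get(channel, 0) + share
    return credit

path = ["display_ad", "email", "search", "email"]
print(last_touch(path, 100.0))  # {'email': 100.0}
print(linear(path, 100.0))      # {'display_ad': 25.0, 'email': 50.0, 'search': 25.0}
```

The interesting part in a big data context is not the arithmetic but running it across billions of event-level paths, which is why the granular data has to stay queryable in one place.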

This is really interesting in terms of the types of analytics that can happen. And then, when you look at it from an operations perspective, here is a really interesting example: a very large manufacturing company used to bring in consultants who would pull data from a bunch of different places across the supply chain, you know, the trucks, the products, which warehouses, which pallets are being shipped out, how they're stacking those pallets of different products, whether there's waste in what's shipping to the various locations. That was a six-to-eight-month project, and it would cost somewhere in the neighborhood of $100,000, repeated every six or eight months, to look for efficiencies. But with all the granular data in one place, now they've integrated financial data together with physical flow data, not just for, let's say, a truckload of a product but down to the pallet level, and now with RFID in use it can be down to the individual package level, to optimize the pallet splits and what gets shipped where.

The Sankey diagram you see on the screen here allows people to do essentially what-if analysis: to look at all the different paths the product is taking through different distribution stop-off points and ask whether there's a better way to do it. It's highly interactive. This type of analysis done in traditional SQL is multi-pass SQL; it's really complex and takes a lot of time. So that's more advanced analytics at a very large scale of data, with a lot of granularity, where you can extract a lot more value and efficiencies. They said it's been really transformative for their business.
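A Sankey diagram is driven by source-to-target flow totals, so the granular shipment records described above have to be rolled up into link weights. A minimal sketch of that aggregation step, with invented field names and independent of any particular charting library:

```python
from collections import defaultdict

def sankey_links(shipments):
    """Aggregate pallet-level shipment records into
    (source, target) -> total units, the link weights
    that a Sankey chart consumes."""
    links = defaultdict(int)
    for s in shipments:
        links[(s["from"], s["to"])] += s["units"]
    return dict(links)

records = [
    {"from": "Plant A", "to": "Warehouse 1", "units": 40},
    {"from": "Plant A", "to": "Warehouse 1", "units": 10},
    {"from": "Warehouse 1", "to": "Store 7", "units": 30},
]
print(sankey_links(records))
# {('Plant A', 'Warehouse 1'): 50, ('Warehouse 1', 'Store 7'): 30}
```

What-if analysis then amounts to re-running this roll-up with a filter or a rerouted edge and comparing the totals, which is why interactivity matters more than any single chart.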

Sankey diagrams are just amazing. One of the things I think a lot of organizations get great value out of is that they dispel some of the preconceived notions about what their processes look like. They may have thought about a golden path: this is the way it's supposed to work. But all of a sudden they start to look at a diagram like this one, in real time, and go, "Are orders coming in to the fulfillment component without actually being quoted or estimated at the beginning?" And then they go, "Well, wait a minute, are we missing data, or is there something wrong with our process?" In some instances they find real organizational changes that they can make, because they have access to that level of granularity without having to, you know, burn cycles on it. They're really able to get great insight out of this type of analysis. Any other ones you can talk about?

Yeah.

Here's another great example. This is a little bit technical, but I think for the architects on the line, people that have invested in various Hadoop platforms, one thing that I would point out, which is really interesting, is that native BI tools can install with existing systems like Cloudera Manager and Ambari, and we inherit security and role-based access controls from projects like Apache Sentry and Ranger. So from an administration perspective, you have a lot more confidence that you can deploy this to a large number of users, and you're not worrying about keeping security in sync between what the BI departments do in different tools and what your platforms do; it's all one system. This has been a huge time-to-value and lower-administration consideration for people deploying data to a much broader audience now, to get the value that was promised from big data. I think you can say big data and data lakes are in the trough of disillusionment, but I think it's because we haven't had good ways to give that access to business users, to build applications on these platforms and really make them big data application platforms. This is kind of the last mile: being able to hand this to the business people who understand the business and how it needs to happen, and to build customer-facing applications in a much easier way.

Agreed.
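The "one security system" point can be pictured abstractly: both the data platform and the BI layer consult a single policy store (the role Sentry or Ranger plays for Hadoop), so grants never have to be duplicated per tool. The classes below are a hypothetical sketch of that pattern, not the actual Sentry or Ranger API.

```python
class PolicyStore:
    """Single source of truth for role-based access,
    standing in for what Apache Sentry or Ranger provides."""

    def __init__(self):
        self.grants = {}  # role -> set of readable tables

    def grant(self, role, table):
        self.grants.setdefault(role, set()).add(table)

    def can_read(self, role, table):
        return table in self.grants.get(role, set())

# Both the platform and the BI tool ask the same store,
# so access stays in sync with no per-tool duplication.
store = PolicyStore()
store.grant("analyst", "sales")
print(store.can_read("analyst", "sales"))  # True
print(store.can_read("analyst", "pii"))    # False
```

The design choice this illustrates is simply that authorization lives beside the data, not inside each client tool.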

And, you know, whenever EMA does end-user research around big data, streaming, things of that nature, security questions are always like voting in Chicago: people vote early and vote often. I can almost see them double-clicking on "yes, security is really important to me." I wish I could get an intensity button for how quickly and how often they're clicking on that. You raise great issues here. As we look at privacy and security, and move into different components around how we manage that security, it's going to be important to be able to answer these questions, because if you can't, your CSO might just fold their arms and say, "No, you can't do that," or you run a great risk of violating the trust of your partners, customers, and suppliers.

I think those are all great points, and fantastic ones. So as we look at what the next steps are, kind of wrapping things up: John, in the opinion of EMA?

Data-driven organizations

are capitalizing on big data analytics to fundamentally change their business models. Organizations that focus on rearward-looking data are going to be quickly left behind. I think we're already seeing that, particularly in the online and brick-and-mortar retail space, and I think we're just going to continue to see it. These approaches require new and different methods than our traditional, if you will, collect-refine-present model with all its different components. We have to remove a lot of the friction from those traditional approaches: keep the best patterns, if you will, for quality, data access, et cetera, but speed them up and develop new best practices. Among those new things, I think detail and speed of access are really going to be key when we look at the future of big data analytics suites. In those data-driven organizations, the business stakeholders will not sit still long enough to allow their architectures to catch up. They will either move along with the group, or they will find ways to define alternatives to make that happen.


And Steve, where are big data analytics going in the near future, from your perspective?

Yeah, I've been in this industry, gosh, 18 years now, and I think the same way we saw data warehouses created back in the 90s as sort of a separate thing designed for analytics, with the data lake and big data systems we're now in the scale-out world. The laws of physics mean hardware just cannot keep up with the growth of data, and it's only getting worse with IoT, so everything needs to go to scale-out distributed systems. I think enterprises need to look at having two separate BI platforms: one standard for your data warehouse, which is designed perfectly for what that's good at, and then, for the data lakes and scale-out cloud types of environments, a scale-out BI solution that can enable you to create these data applications and serve both classes of users, to really get the value of the investments you've made in collecting all this information. So that's our perspective on what's happening in the market, and it's pretty exciting times.

Yeah. I think Steve was working off of what John was speaking about, and the concept they just introduced makes a lot of sense. Hey, we've got a bunch of questions, and I think we have some time to go through them. Just so everybody online knows, you can put questions through the panel on the right and also through the Twitter feeds. But before we go through these questions, I just want to give you some resources here: on the right-hand side we have a new EMA white paper that's coming out, and you'll be able to link to it from here, and we will send it out in your email. One question we have so far is: how do companies make the change from traditional to data-driven?

Well, I know one of the keys is to make it cultural. If your CEO is sitting there with his arms folded, going, "You know, I don't need a machine to tell me what I know about my customers," then either they don't have a very wide customer base, or soon they won't have a very wide customer base, because there are people out there that are making these changes. So one, there's a cultural component that goes along with that, and two, do you have the systems that allow you to take the data that you are collecting and make use of it? As we've talked about today, it's about having an underlying architecture that is flexible and, if you will, nimble enough to meet those changing requirements, because I can guarantee you one thing: data-driven organizations don't like the concept of "please write all your requirements down in a requirements document for me, and in 6 to 12 months we'll be able to execute on them." What they want to see is an environment where they can stick their fingers in the data, splash around a bit, take a look at it, associate some other things with it, really find those new insights, and then be able to apply those insights. So how you go from traditional to data-driven: one part is cultural, one is technical, and when the rubber hits the road, you're off like a shot.

Yeah, thank you.

And this next one, I think, really starts speaking to the idea of self-service, and to the capabilities of analytics: are our business analysts, in quotes, "technical enough" to drive value from big data analytics platforms?

You know, that is a good question. I've never seen a business analyst go, "Please send me the customer data from Hadoop." They always ask questions like, "May I see the customer data? May I see the product data?" So the more we can encapsulate for them, the better. There was a slide that Steve had up with an example of a JSON format, which was very confusing, and if you know the data really well you can start to get in there. But if we can take that JSON format and encapsulate that complexity, without losing what Steve called the fidelity, then they can start to say, "Let me look at the number of customer events that we have; let's look at the different products that are in these data sets." At that point they're very adept at being able to answer those questions. But simply dropping them in front of JSON or some other type of text-based, multi-structured environment won't necessarily confuse them, but it will throw up a barrier to them really jumping in. If we can remove that friction, that removes the barrier, and now you've got business analysts going, "This is how I can take a look at this data," and I can just imagine that it streamlines the process.
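Encapsulating JSON complexity without losing fidelity usually means flattening nested event records into the flat columns an analyst expects. A minimal sketch of that idea (the event fields here are invented for illustration):

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into dotted column names so a
    business analyst sees flat fields, not raw JSON."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat

event = {"customer": {"id": 42, "tier": "gold"}, "event": "purchase"}
print(flatten(event))
# {'customer.id': 42, 'customer.tier': 'gold', 'event': 'purchase'}
```

Fidelity is preserved because every leaf value survives under a predictable name; only the nesting, the part that throws up the barrier, is hidden.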

Hey Steve, I'm going to give you a tough one, I think: how are AI and ML leveraged in big data analytics, and how can BI tools leverage them?

Yeah, we talked about that a little bit in the webinar. To John's point on whether business analysts are technical enough to handle big data analytics, what's cool about machine learning and AI is that it's kind of like, you know, power brakes or parking assist. We want to provide intelligence so that casual users are enabled to do more things. So to me it is not just running models and automating the world where robots take over; it's also assisting humans to make decisions more quickly. In the cybersecurity example, you're monitoring all the stuff that's happening in the network and bubbling things up for the security analyst to look at something that's potentially threatening, but then they do some deeper analysis of the things that, you know, machines aren't quite there yet on. That's our perspective on how AI can help.


Right. And a lot of people are talking about the cloud, obviously. How does cloud factor in with big data analytics, and specifically, can Arcadia Data support that?

On the second part, yes: we have a number of customers that have deployed systems in the cloud. Not many people are cloud-native, let's say, unless they built their business on, say, Amazon S3 to start, and we have companies like that; Neustar is one example, or software-as-a-service companies, as you'd expect. But most large enterprises are hybrid environments: they'll have some data in the cloud and some data on-prem, so we can work across both those environments. There are certainly more and more people moving to the cloud, and then it's just them working out whether they do a lift-and-shift of their environment or re-architect things. It's primarily a cost-based thing that people look at initially; they don't want to have to manage a data center. But that's not always the right fit for certain industries and everything else. I think over time more and more people will be moving, so we've architected our software to support that, for sure.

Excellent, excellent. I think we have time for one more; we had a lot of questions, so I'm going to try to get through this. John, can you give a specific example of how front-line employees can use self-service apps in a data-driven way?


I saw a great example in a case study recently, around an organization that's not a big group but a small organization that ships out, you know, candy, snacks, et cetera, the things that you might find in the break room of your average company. What they did was set up information about how first shift was doing against second shift, and how they were meeting their SLAs and things of that nature. Now, you could do that in an exploratory dashboard, but because they made an app and just presented it to their teams, those folks were now informed: "Hey, second shift hit their numbers; we're going to go and try to hit our numbers." Not in a competitive, cutthroat type of thing, but that friendly competition, if you will, that you see a lot of times between shifts or between offices and things of that nature. And it was very simple: it was about how they were meeting their customers' needs and how they were meeting their SLAs, which are all part of the way the organization works. But by presenting it in an app that was clean and clear, those teams got more productive, because they knew what the levers were and what was happening. People were presented with information, and, I don't want to call it really, really old school, but you know, what gets monitored gets managed. When you present it to your teams, they want to get better, especially when they know that it helps the organization.

Okay, we're coming up to the top of the hour, and I thought this was a great webinar. Thank you, John and Steve, for giving us your insight. I don't have to say that, as more and more people are turning to big data analytics as a standard now, really understanding the ecosystem and what kind of tools you need to handle it becomes much, much more important. So I want to thank everybody for joining us. We couldn't get to all the questions, but thank you very much; we will get back to you. And keep your eye out for this new information that EMA is putting out; it's going to be really interesting. So thank you, everybody, and good evening.