You know the insights are in there



Presenter: Tony Baer
Principal Analyst | OVUM
Presenter: Laurent-Olivier Liote
Research Analyst | OVUM
Presenter: Steve Wooledge
VP Marketing | Arcadia Data


Mysteries of the Data Lake Revealed: Five Secrets to Unlocking Enterprise Value

As organizations modernize their data and analytics platforms, the data lake concept has gained momentum as a shared enterprise resource for supporting insights across multiple lines of business. The use of Apache Hadoop, Spark, and cloud-based data lakes is maturing and must now support direct business end-user access and analyses. A proliferation of SQL-on-Hadoop engines has helped bridge the gap between business users and big data, but it’s not enough.

Join this complimentary webinar with industry experts from Ovum Research and Arcadia Data who will discuss how leading companies are realizing value and success with analytics from data lakes by:

  • Delivering on key requirements to drive value from data lakes and more fully complement EDWs and data marts.
  • Reaching a broader user base with intuitive, code-free UIs for visual analytics on the data lake.
  • Operationalizing machine learning and advanced analytics to complement and extend traditional BI.
  • Securing and governing data lakes with unified security across the data and analytics platforms.
  • Scaling to support hundreds and thousands of users in multi-tenant data lakes with ultra-fast performance.


Welcome, everyone. This is Steve Wooledge from Arcadia Data. It's the top of the hour, and we're going to get started in just a minute. The way the system works, people have to log in through the browser right at the top of the hour, so we'll give it one more minute and then get started. Thanks for joining.

All right, looks like we've got a quorum here. Today's session is how to scale BI and analytics with Hadoop-based platforms. I'll be your host. This is Steve Wooledge; I work at Arcadia Data as VP of marketing, and I'm really excited for this topic because I've really seen a shift. Data lake architecture was primarily around data storage, and I think the movement now is around how we get analytics and value out of the system, how we scale out BI and analytics platforms. So that's what we're talking about today, and I'm really delighted to have two great presenters with us.

Our special guest is Boris Evelson. He's the vice president and principal analyst serving application development and delivery professionals, and he's a leading expert in business intelligence. He has worked at Forrester for a number of years, and I've worked with him over a lot of those years. I really like working with Boris because he was also a practitioner early on in his career, deploying data warehouses at places like Citibank and being a strategic advisor at JPMorgan Chase. So he really understands the technology in addition to the application.

And then we've got Priyank Patel, who's the co-founder and chief product officer here at Arcadia Data. He has been instrumental in the development of our products and has a rich history in analytic databases at places like Aster Data and Teradata.

I'd like to do a little level set with the audience. Today we're going to start out with Boris talking about systems of insight and the shift he's seeing to next-generation BI platforms that scale natively with Hadoop and other scale-out architectures. Then Priyank will get a little bit more into Arcadia and how we tackle that problem, and we'll have some time at the end.

In addition, we'd like to run a little poll now, just to level set with the audience on where you are in your big data journey. Hopefully you can see a voting tab that will pop up here, and we'll try to show the results as they're filled in. Go ahead, and I'll read the question out loud as you're looking at it. We're asking where you are with your big data deployment phase. Maybe you're gathering knowledge, just thinking about these new scale-out data platforms. Maybe you're developing a strategy, defining the architecture, selecting tools. Maybe you're piloting and have got a system up and running. Or maybe you're in the deployment phase, actually using your environment and using analytics to give end users access to the system. If you can select any of those choices, that would be great. I'll just give it a couple of seconds here for people to respond.

By the way, while you're doing that: we'll take questions throughout the session today. There should be some tabs on your right-hand side where you can enter questions as we go.

OK, so you should see the results at the bottom of the screen; they're streaming in in real time. It looks like 50% of people are gathering knowledge, and we've got some people that are deployed. It's shifting here as I talk. We can share these results back with everybody; we'll probably do a blog wrap-up at the end. Thanks for taking the time to fill that out.

All right, so we'll move on then, and I'll turn things over to Boris Evelson to talk about what's happening in the market.

Thanks very much for the introduction and for the opportunity to present. Yes, indeed, you are correct: I've been in this market for over 30 years, I think about 35 years at this point, the last 10 years at Forrester. And I'm really, really happy to see what's happening in the market, because finally, after many, many years of trying to instill the discipline of running the business by the numbers, of being a data-driven company, this is finally beginning to become a reality.

There are some great data points supporting what I just said. This one is from Morningstar, and it predicts what will happen in the next few years to insights-driven companies. We now like the term insights-driven as opposed to data-driven, because all this data is just bits and bytes. You can't look at bits and bytes and make decisions. So BI, and the systems of insight that I will talk about in a second, are really all about transforming that raw data, bits and bytes, into meaningful, insightful, actionable information. Those companies out there that are going to be insights-driven are going to grow eight to nine times faster than their peers. Eight to nine times faster. So that's a huge chunk of change, not something that's easily overlooked.

Some other very interesting numbers from Forrester. Basically, what's been happening over the last few years is that BI, slowly but surely, has been moving from the back office into the front office. It is morphing from being just about compliance or transparency, a nice-to-have type of application, into a corporate asset that everyone is using to compete on. As you can see on this data point, improving customer satisfaction and getting competitive advantage are at the top of everyone's agendas. And those 59% of you who tell us that BI is a top priority are also reaping some very tangible benefits.

More than 45% of you tell us that you are today getting double-digit returns, tangible returns, on your BI. And when we correlate your position in the industry as an industry leader as opposed to an industry laggard, where we basically define a leader as anyone who is growing faster than the industry average, the average within your industry segment: if you're growing faster, you also happen to be investing more, more than a third of your overall IT budget, into BI. So there is not only tangible ROI on individual BI investments but also a correlation with overall growth.

That road is not without challenges, and I'm sure all of you on the phone are aware of what those challenges are. If I am pressed to talk about one single challenge, and there are multiple, that one single challenge is a disconnect between business and IT. Unfortunately, I see that all over the place today. And we in IT are definitely, unfortunately, to blame for this, because we are professionals who are still really stuck in the technology-for-the-sake-of-technology conundrum. We overly emphasize a streamlined data architecture, centralization of the BI organization, a single BI platform, basically trying to get to that nirvana, a single version of the truth.

It's not that these are not important priorities. Absolutely, by all means, these are very, very important goals and objectives, and they should be priorities, but not the top priority. Unfortunately, we often forget that business users really just want to get their jobs done. So if it's easier to get the job done with Excel, faster and better than with your enterprise BI platform, then you know what, you didn't do something right. That's a very important lesson learned, and I believe more and more IT and BI shops are beginning to embrace it.

As a result of this, as a result of the complexity of architecture that we bring in to achieve that single version of the truth, we see some pretty bleak results. On average, you tell us that you leverage less than 50% of your structured and less than 25% of your unstructured data for decision-making. These are self-reported numbers, and we know that you are not aware of all of the data that's out there, especially external data, especially unstructured data. So in more detailed studies, whenever we go in and analyze the environment at a client, we see even bleaker results: less than 20% of your structured data and less than 10% of your unstructured data are being used for insights and decisions.

I'm sure this slide is near and dear to your heart. No matter how long you have been investing in BI and rolling out BI applications, good old Excel is still the platform that provides instant gratification. About two-thirds of you still tell us that more than 50% of your BI content is sitting in spreadsheets and other homegrown shadow IT applications.

So enough about the bad news; now let's start talking about the good news. What are some of the solutions out there that we think are going to help us get closer to that nirvana of a highly efficient and effective BI environment in enterprises? We at Forrester believe that the answer lies in these three areas. We'll leave the topic of artificial intelligence for a later time; it definitely deserves a special, separate discussion. Today we'll concentrate on the first two topics, which are extremely important. Between business agility and big data capabilities, I think that is about 90% of the success. AI is just going to be the icing on the cake.

Let's dive into these two topics, business agility and big data, and talk about why they are so important. The reason that business agility is so important is that today we are in the age of the customer. What Forrester means by that is that we are way past the age of information, and we are in the age of the customer. Customers rule. So I don't really care how well oiled your internal enterprise processes are, such as finance, HR, supply chain, risk management, ERP, CRM, and how well they run. If those processes are not allowing you to follow, embrace, and address your customer behavior, you're going to fall behind. We call that the enterprise in the age of the customer, and such enterprises are outside-in driven as opposed to inside-out driven. Basically, there's nothing more important than following your customers and addressing their needs, regardless of what your internal business processes are all about.

Once we realized that, a few years ago, we went out and conducted some research where we decided to quantify what business agility is all about. We came up with the ten dimensions that you see on the right of this slide, ten dimensions of business agility, and they are really common-sense dimensions. Obviously, if your channels are integrated, you are more agile and responsive. If, in the middle of this slide, your infrastructure is elastic and can shrink and grow, and you can provision and deprovision resources based on customer demand, obviously you're going to be more responsive. As you can see, BI is front and center, right smack in the middle of all these capabilities, and some other capabilities here are directly dependent on BI as well: market responsiveness, knowledge dissemination.

What we did is we plotted the business agility capabilities of about 350 public companies, putting them into two dimensions: how well aware they are of these business agility capabilities, and how well they can execute on these capabilities. I'm sure you are already guessing what I'm going to show you on the next slide. Yes, your guess is correct. Higher performers, those companies that grow faster than industry averages, that grow faster than their peers, are all over what we call the formidable category, right in the upper right quadrant, meaning these companies are aware of the business agility capabilities and they can execute well. Lower performers, below the industry average, are all over the left side, falling down both with awareness and with executing. So definitely, again, some quantitative proof of a correlation between overall business success and business agility.
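As a rough illustration of the two-dimensional plot described above, the sketch below places companies into quadrants by an awareness score and an execution score. The 0-to-1 scale, the 0.5 midpoint, the company names, and every label other than "formidable" are assumptions made for this example; the talk names only the upper-right category.

```python
from dataclasses import dataclass


@dataclass
class Company:
    name: str
    awareness: float  # hypothetical normalized score, 0.0 to 1.0
    execution: float  # hypothetical normalized score, 0.0 to 1.0


def quadrant(company: Company, midpoint: float = 0.5) -> str:
    """Return a quadrant label from the two agility dimensions."""
    aware = company.awareness >= midpoint
    executes = company.execution >= midpoint
    if aware and executes:
        return "formidable"  # the only category name given in the talk
    if aware:
        return "aware, weak execution"  # descriptive label, not from the talk
    if executes:
        return "executes, low awareness"  # descriptive label, not from the talk
    return "low awareness, weak execution"  # descriptive label, not from the talk


# Made-up companies, purely for illustration.
companies = [
    Company("A", awareness=0.8, execution=0.9),
    Company("B", awareness=0.3, execution=0.2),
    Company("C", awareness=0.7, execution=0.4),
]
labels = {c.name: quadrant(c) for c in companies}
```

Comparing the share of faster-growing companies in each quadrant would then be a simple group-by over these labels.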

The next question is: what is it that we, BI professionals, data management and information management professionals, can do to support these business agility capabilities?


Just to whet your appetite, Forrester recommends practicing this four-part agile BI framework. The first part is all about agile software development, where we emphasize rapid prototypes and rapid proofs of concept, moved into a quasi-production environment where business users can start using the systems within hours, as opposed to weeks and months, when it's already too late.

The second component is agile organizational structures, where we realize that neither end of the extreme in organizational structures works. Organizational silos are obviously not good, because we don't share resources and there is no single version of the truth; everybody's getting different answers to the same question. But we forget to give the silos some credit. Silos are much more agile, much more responsive than the centralized organization, because if I own my own development shop, if I own my own BI team, I don't need to share those resources with anyone. Obviously, I call the shots, and I'm much more responsive to the market.

Then we move over to the other side of the extreme, where sometimes, erroneously, organizations overly centralize their BI support. Yes, they rationalize resources, but they have now become highly bureaucratic organizations. They spend endless hours in steering committees and prioritization meetings, and as a result they move slower. And when they move slower, guess what business end users start doing? They go back to Excel and homegrown applications. So we recommend some kind of a middle ground. We've got a lot of deep research behind that, and we welcome everyone taking a look at it.

Obviously we need to practice agile BI processes, so, as I said, rapid prototyping as opposed to a long waterfall development cycle is one of the examples. And last but not least, as Arcadia Data will talk about later in the presentation, you have to have an agile BI platform. If you have an older-generation BI platform, one with the older-generation architecture where it takes a long time to change anything and a long time to develop anything, that's not going to help in an environment where business agility is a top priority. So that's business agility, and how agile BI plays an important role in that environment.

Let's move to the second part, big data, and let's talk about what role big data plays in this business agility equation. We tend to stay away from the multiple definitions of big data. It's good to start the discussion about big data using terms like volume, velocity, variety, and variability, but that's not really actionable. Once we have this discussion on the Vs with clients, they say, all right, well, what do I do with that now? How does that translate into some kind of follow-up items? So we like to frame the discussion along these four lines rather than creating a definition of big data. And by the way, between two or three analysts and two or three vendors, when you get five or six people in the room, you're going to get 20 different opinions on what big data is all about. So rather than defining it, let's just understand some of the use cases.

We have this four-part discussion with our clients. The very first part is about people and processes. All technology implementations have three components: technology, people, and process. If your challenges are in the people and process part, if you are challenged with business and IT alignment, as I talked about earlier, if you are challenged with data and BI governance, if your challenges are data quality, that has nothing to do with technology. So please don't equate these challenges with anything that big data can help you solve and address.

The second part of the discussion is all about upgrades. If your system is running slowly, if you do not have the right ..., maybe you are still using the old types of databases. Maybe you are still using an IT-centric BI platform where business users are not truly empowered to self-author a majority of their own BI content. Maybe you just need to scale up or scale out; you need to add some CPUs, you need to add some nodes, some servers. Maybe it is purely a technology upgrade situation, which has nothing to do with big data.

Then we do indeed dive into a different discussion, where big data does indeed have some use, and we'll do a deep dive into that on the next slide. And once we understand that your requirements and use cases are indeed a real big data application, we talk to you about the implications. When you upgrade to big data, when you upgrade to Hadoop-based information management, data governance, the single version of the truth, and data quality become completely different issues. They still need to be addressed, but they are addressed very differently, for reasons that we'll talk about in a second.

Here's what's really behind that third point on the previous slide. There are really four areas of requirements that necessitate investments in big-data-based BI technologies. Number one, your business requirements call for linear scalability. We all know that even best-in-class database-based BI platforms do not scale linearly. Databases that are based on massively parallel, distributed, scale-out technology do, but if your BI platform is not based on the same massively parallel and distributed technology, then obviously you're limiting yourself in linear scalability.

The second part of this is all about that business agility, the very important point that I spent the last 15 minutes talking about. Traditional, earlier-generation BI platforms are based strictly on a schema-on-write type of architecture, and hopefully you all know what that is. It means that data and metadata are very tightly bound: you first create your data model, and then you populate the data model with data. Therefore your data and your data model are tightly bound, and the only way to change the data model is to rebuild the whole database from scratch. Dropping columns, changing indexes, changing primary and foreign keys, et cetera, is a very difficult process in a schema-on-write type of database; it's doable, but very resource-intensive. In a schema-on-read database, your data and your metadata are separate, and therefore you and I can have two different views into exactly the same data set. That, by the way, is why a single version of the truth requires a different type of handling. But a schema-on-read architecture is definitely infinitely more agile.

The last couple of reasons are about bottlenecks in moving data between clients and servers, between your front-end and back-end applications. In a traditional, earlier-generation BI architecture, no matter how scalable you make it, you are still bound by all of the traffic: all the SQL queries that are generated in your front end go to your back-end database, and the database generates the answers to your queries and sends the results back to your BI application. That traffic sometimes causes bottlenecks, especially if you are sending the data across your firewalls. With a Hadoop-based architecture, we can basically keep data and applications together, right inside our clusters, as opposed to having to move the data in and out of our clusters. That's really what's behind the three or four high-level use cases where big data, specifically Hadoop- and Spark-based architectures, play a key role.
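To make the schema-on-write versus schema-on-read contrast described above more concrete, here is a minimal sketch of the schema-on-read side. It is not how any particular Hadoop engine works internally. The raw records, the field names, and the two "views" (`finance_view`, `marketing_view`) are invented for illustration. The point is only that the data is stored once, untouched, and each reader applies its own schema when it reads.

```python
import json

# Raw events stored once, as-is. No table schema is imposed at write time.
RAW_EVENTS = [
    '{"user": "ann", "amount": "19.99", "country": "US", "channel": "web"}',
    '{"user": "bob", "amount": "5.00", "country": "DE"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time.

    `schema` maps a field name to (converter, default). Fields missing from
    a record get the default; fields not in the schema are ignored.
    """
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        rows.append({
            field: convert(record[field]) if field in record else default
            for field, (convert, default) in schema.items()
        })
    return rows


# Two hypothetical consumers with different views of the same data.
finance_view = {"user": (str, None), "amount": (float, 0.0)}
marketing_view = {
    "user": (str, None),
    "country": (str, "unknown"),
    "channel": (str, "unknown"),
}

finance_rows = read_with_schema(RAW_EVENTS, finance_view)
marketing_rows = read_with_schema(RAW_EVENTS, marketing_view)
```

Adding a third view, or changing one, touches only the schema dictionary and never the stored records, which is the agility argument. It also shows why two readers can legitimately see different shapes of the same data, so a single version of the truth has to be handled through shared view definitions rather than through one physical model.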

We at Forrester do a lot of research into open source technologies like Hadoop and Spark, and as you can see, the market is really embracing this type of scale-out parallel processing technology. About 800 million is being spent on Hadoop this year, and about 48% of you are telling us that you have already built Hadoop-based data lakes. I'm looking at the results of the poll that Steve just ran, and I do see indeed that just about 50-plus percent of people are telling us that they are piloting or have already deployed some kind of a data lake, Hadoop-based environment. So I think the numbers definitely align here.

When you look at your Hadoop- and Spark-based BI architecture, it carries certain implications; it's not the same as just looking at your database versus BI technical architecture. Hadoop and Spark are very complex projects, and there is a huge, complex ecosystem around them. So as you're looking at these platforms, you definitely need to understand what the data management layers are and what kind of file systems they are based on; what the app management and the cluster management are, and what managers they run on; what kind of data processing it is, whether that's MapReduce, Spark, or some other type of architecture; and, last but not least, what kind of queries and data processing the platform supports: is it based on SQL, is it based on streaming, or is it some other technology?

What we did a few months ago is we ran an evaluation process for several top vendors in this space. There's a comprehensive list that you can see on the right side of the slide. We evaluated these vendors by about 15 criteria. On the technology side, we looked at the data preparation criteria. We looked at large enterprise features such as security, collaboration, and administration. Obviously, end-user self-service is a very, very important item. Data visualization: you obviously can't have BI and analytics without data visualization.

SQL and OLAP capabilities are important, especially when you're porting your existing legacy applications to these new native Hadoop types of architecture. NoSQL, or data discovery and exploration types of operations, are also important, as are advanced and predictive analytics capabilities. The Hadoop and Spark architecture, what I showed on the previous slide, is what we evaluated in this particular line item. Then integration of multiple components, making sure that data preparation, data visualization, and self-service capabilities are tightly integrated. We looked at the deployment options, on premises, cloud, et cetera. And obviously we took into account customer satisfaction.

We released two versions of this evaluation. In one, as you can see in the weights column, we gave more weight to the user experience, the user interface. On the next slide is exactly the same model, with exactly the same scores, but where we gave data preparation a higher weight, because a lot of you out there don't really have complex data visualizations or complex analysis, but your data is all over the place and you spend 80% of your time just massaging the data before you can analyze it.
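The idea of publishing the same scores under two different weightings can be sketched in a few lines. Everything in the example below is hypothetical: the vendor names, the three criteria (the evaluation in the talk used about 15), the scores, and the weights. It only shows how an identical score sheet can produce a different ranking once the weights shift from user experience to data preparation.

```python
# Hypothetical score sheet on a 1-to-5 scale; not the actual evaluation data.
SCORES = {
    "Vendor A": {"user_experience": 5, "data_preparation": 2, "security": 4},
    "Vendor B": {"user_experience": 3, "data_preparation": 5, "security": 4},
}

# Two weightings over the same criteria; each set of weights sums to 1.0.
UX_WEIGHTS = {"user_experience": 0.6, "data_preparation": 0.2, "security": 0.2}
PREP_WEIGHTS = {"user_experience": 0.2, "data_preparation": 0.6, "security": 0.2}


def weighted_total(scores, weights):
    """Sum of score times weight over every weighted criterion."""
    return sum(scores[criterion] * weight for criterion, weight in weights.items())


def rank(all_scores, weights):
    """Vendor names ordered from highest to lowest weighted total."""
    return sorted(
        all_scores,
        key=lambda vendor: weighted_total(all_scores[vendor], weights),
        reverse=True,
    )


ux_ranking = rank(SCORES, UX_WEIGHTS)
prep_ranking = rank(SCORES, PREP_WEIGHTS)
```

With these made-up numbers the two rankings come out in opposite order, which is why it helps to pick the weighting that matches where your own time actually goes.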

So that's how we see the role of big data and the role of a native Hadoop, distributed, parallel architecture. This is how it all comes together. Agile BI addresses end-user self-service, more of the business agility. Big data addresses more of the data availability, and it also addresses agility with that schema-on-read. On top is the new term that we use to describe this convergence of agile BI and big data: we now call it systems of insight. Why did we choose the term "systems of"? Because that is a very familiar term. Most of you already use the term systems of record and the term systems of engagement, so systems of insight was just the natural way to name that next generation.

And last but not least, some closing remarks on the best practices. Make sure that your BI applications are not standalone. Make sure that they are tightly integrated and embedded into your operational applications and processes. That's the only way your BI apps will become contextual and actionable. And definitely, and I have to admit that none of us do a good job of this, practice continuous learning and improvement, because obviously no one gets it right from the start. So establish processes and best practices so that whatever results you get, whatever outcomes you get from the insights, you make sure you document them, you categorize them as positive, negative, et cetera, and then in the next iteration you learn and improve.

To close off, here are some high-level differences between older, earlier-generation BI and today. Before, we really just concentrated on technology for the sake of technology, on the single version of the truth for the sake of the single version of the truth. Today we clearly realize it's not about that; it is really all about winning, serving, and retaining your customers. We typically see that this is much better executed when line-of-business executives, such as CMOs and the VP of sales, own the BI and systems of insight environment, and where CIOs and their organizations are really just supporting that environment. That's why we at Forrester use the term business technology as opposed to information technology, because it's not technology for the sake of information technology; it's technology for the sake of business results.

As you saw from my closing slides, take a look at these last best practices. Even if your BI environment is well architected, well oiled, and well deployed, if it's siloed, if it's not embedded into your other operational applications, it's not pervasive, it's not contextual, and it's not actionable. With that in mind, let me turn this over to my colleagues.

I think Priyank, you are next.

Yeah, this is Steve; I'm just going to jump in here and do another quick poll. Thank you, Boris, that was very insightful. So, viewers, if you're not in full-screen mode, underneath the viewing pane you'll see another tab pop up for just one more poll. The question is how you plan to give users access to analyze the data. Option A is development tools such as Spark and MapReduce. Option B is SQL engines such as Hive, Impala, Drill, and Spark SQL. C is BI tools. D is Hadoop-native distributed BI platforms, which Boris has spoken about. And then you could specify in the comment section if there's some other way you're thinking about giving access to users of your data lake or Hadoop-based platform.

We'll just take a minute to tally the results from that. Again, it should be a tab next to where it says "Ask a question"; there's one for voting. You should be able to put your votes in there, and it should update in real time. We've got quite a few people on, so I'm looking for the results to stream in here. It looks like distributed BI platforms has the lead, but traditional BI tools is coming up, and we've got SQL engines in there in the race as well.

All right, well, I'll just give it another couple of seconds here; those are still coming in. OK, I think let's just leave that open, since there's still a number of you on here. I don't want to delay the webcast anymore, so let's go ahead and turn things over to Priyank.

All right, thank you, Steve, and thank you, everyone, for attending. I will actually jump right in and pick up from where Boris ended, with this notion of systems of insight, and also the inherent push and pull that he touched upon between being agile and having some sort of a centralized architecture with a reduction in the number of silos. Not complete elimination, but silos, especially when they go to the extreme of Excel spreadsheets on every business user's desktop, do not necessarily help. In connection with that, a lot of you in the audience will have seen architectures like this.

It starts out from data warehouses or large-scale databases, where data lands from a whole bunch of sources. From here the data is moved and summarized into various project-specific data marts. From there it moves into BI servers and cubing engines or in-memory engines that summarize the data from the SQL-based data marts into yet another tier, where it's cached in memory for actual interactive usage by the end tools. Eventually the end user, your business partner, your business analyst, or the end business user, actually starts getting access to the data. What you notice through this pipeline is loss of fidelity, because you lose access to the granularity of the data in the lower systems. And of course, things like real-time data and streaming are not in the realm of an architecture like this.
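The loss of fidelity along a warehouse, data mart, and cube pipeline like the one described above can be shown with a toy example. The rows, column names, and tier names below are made up; the sketch only demonstrates that each summarization step drops a column, so questions that need that column can no longer be answered from the downstream tier.

```python
from collections import defaultdict

# Hypothetical detail rows as they might sit in the lowest tier.
DETAIL_ROWS = [
    {"day": "2017-01-01", "store": "north", "product": "tea", "sales": 10},
    {"day": "2017-01-01", "store": "north", "product": "coffee", "sales": 30},
    {"day": "2017-01-01", "store": "south", "product": "tea", "sales": 5},
]


def summarize(rows, keys):
    """Group rows by `keys` and sum sales; all other columns are dropped."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row["sales"]
    return [dict(zip(keys, group), sales=total) for group, total in totals.items()]


def can_answer(rows, column):
    """True if every row in this tier still carries the given column."""
    return all(column in row for row in rows)


data_mart = summarize(DETAIL_ROWS, ["day", "store"])  # product is gone
cube = summarize(data_mart, ["day"])                  # store is gone too
```

The totals stay correct at every tier, but "which product drove sales in the north store" can only be answered from `DETAIL_ROWS`. Each tier is also another copy of the data that has to be secured and kept in sync.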

An interesting side effect of this is a high security concern: now you have copies of the same information across multiple systems, and multiple systems each controlling how security is applied to them. As well, the picture here, as most of you are probably thinking to yourselves, is extremely simplistic, even for the simplest users of architectures like these. The reality is that you will find this kind of architecture replicated hundreds of times, and if you add the fact that there are various shadow systems that you are not even aware of, that don't fall in the purview of managed IT systems, that number could potentially go even higher.


So having some silos is good, but not when it leads to so many copies of the data and a completely unmanageable architecture, especially if you're trying to build a scalable system that enables and generates actionable insights, as Boris was referring to. In the cloud or on premises, you start out with a data lake, and you start moving and managing the large amounts of data across these different sources into it.

But what you realize as you start doing this is that the consumption problem, the problem of consuming data out of the lake, still remains. You still take some data out of the lake and move it into traditional systems, and only after four or five tiers do your analysts get access to it. What that does, especially for systems that are stuck in pilots and have been doing long POCs for a while, is undercut the promise of the data lake. It started out with an important goal of being able to combine data across completely different sources, but dumping data in like that doesn't help how you are deriving insight out of the system, because you're still stuck with the same consumption side on the business intelligence and analytics architectures.

So how does Arcadia address this with its technology? We provide an architecture that lives right next to where the data lives, on individual nodes or on elastic tiers, with the processing done directly on those tiers and without requiring further copies of the data. We created one that enables you to bring your business users, their use cases, and your analysts directly onto the lake, and to cut short the time it takes to move data to different systems and to summarize data before the end users get access to it.

Let me blow it out a little bit and double-click on how this works from an architecture perspective. You treat the underlying Hadoop system, or the underlying execution engine, as an operating system. You leverage the distributed execution that is available in the cluster; you leverage the data storage, which could be in the file system or in object stores; and you leverage the metadata, the security permissions, and the policies that are defined there. What you add on top of the SQL engines that are available in these systems is a data-native BI compute engine. This is an in-memory engine that is distributed: it's MPP, it runs on every node, and it leverages the memory of each of these nodes to provide the slicing and dicing that is required to enable end users to go from exploration and analysis to dashboards, reports, and eventually immersive, actionable applications that are available at scale to large numbers of users.

This is the architecture for your environment when you are thinking about enabling your business partners, your business users, and your analysts on data that is high in scale as well as fast moving, as is commonly associated with systems of insight.

So why does it really matter to have an architecture like this when you want to scale? What does a data-native BI architecture actually buy you? It is about what Boris called out and described: you have to take the agility of the individualized systems that end users are using and still bring it to large-scale data. Business agility is the key there.

You can enable users to not start out by applying models to the data, which would require them to essentially do schema on write to create data marts, cubes, or extracts. You can still maintain that schema-on-read capability and enable users to start with a simple exploratory visual interface, and at the same time have an architecture at the bottom that continuously models the data for performance: it monitors and models the usage patterns of the system to enable fast, high-concurrency access on the same data. That's the first benefit: you bring agility back to this large-scale data.

The other benefit is about insight. When you derive insights, you want to drive them into actions. You want to give your business users actionable applications that are embedded into something they use in their regular work. It starts out, absolutely, from the traditional production-quality dashboards, but it moves very quickly into custom analytical applications. Such an application doesn't make a distinction about when the data was generated, whether it was generated and delivered in real time or made available through a batch ingestion. It incorporates both of these streams right into the application. That allows you to move beyond basic charting and basic analysis, and to bring analysis around micro-segmentation, time series, and event analytics right to the end user, who can actually act upon the insight and drive the business.

Let's focus first on the agility side. What does agility enable in an environment like this, when you are running directly on your Hadoop cluster or on a cloud cluster? The idea is that you don't have to depend, from an IT perspective, on the creation and definition of cubes. You don't have to do schema on write, which is essentially what the definition of cubes or extracts is. You should be able to do schema on read and start out exploring the data directly, at the scale of hundreds of millions of records, and to enable that in a manner that your hundreds of business users are also going to get access to it. This should not be locked to a small set of advanced data scientists who are capable of using the system.

How do you do that? That is actually where Arcadia's Smart Acceleration comes into the picture. We eliminate dependence on cubes by a three-step process. If you look on the left, the system, Arcadia, sitting on the Hadoop cluster, monitors the queries that are being fired off, from an exploratory perspective or a consumption perspective, from the UI. By monitoring these queries, you build intelligence: the right amount of information to be able to recommend the right forms of the raw data, which should sit back in the same system. What happens with the recommendation engine is that it automatically goes and creates these right forms of the data, which we call analytical views, backed by HDFS or by S3 storage, and in memory.

So now, when the queries come back again, or the application gets consumed by a large number of users, the results are short-circuited by accessing these faster data forms. The in-memory slicing and dicing that you generally associate with a copy of the data outside your system, in a BI server, is brought directly to the data, sitting next to the raw data. This is really where you eliminate your dependence on multiple copies of the data and enable the exact same kind of experience on a large dataset, and that opens up the kind of use cases that you can enable for your business analysts, sorry, for your end business partners, to go against.
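
The three steps described above (monitor queries, recommend pre-computed forms, answer later queries from them) can be sketched roughly as follows. This is a toy illustration only, not Arcadia's implementation: the `Accelerator` class, the sample rows, and the repeat-count threshold are all assumptions made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical raw fact rows: (region, product, amount).
COLUMNS = ("region", "product", "amount")
RAW_ROWS = [
    ("east", "soap", 10.0),
    ("east", "soap", 5.0),
    ("west", "soap", 7.0),
    ("west", "razor", 3.0),
]


def aggregate(rows, group_by):
    """Sum `amount` per value of the `group_by` column by scanning every row."""
    idx = COLUMNS.index(group_by)
    totals = defaultdict(float)
    for row in rows:
        totals[row[idx]] += row[2]
    return dict(totals)


class Accelerator:
    """Toy stand-in for a query monitor plus recommendation engine."""

    def __init__(self, rows, threshold=2):
        self.rows = rows
        self.threshold = threshold   # assumed rule: recommend after N repeats
        self.query_log = Counter()   # step 1: monitored queries
        self.views = {}              # step 3: pre-computed "analytical views"

    def recommend(self):
        """Step 2: suggest groupings that were queried often and have no view yet."""
        return [g for g, n in self.query_log.items()
                if n >= self.threshold and g not in self.views]

    def build_views(self):
        """Step 3: materialize the recommended groupings next to the raw rows."""
        for group_by in self.recommend():
            self.views[group_by] = aggregate(self.rows, group_by)

    def query(self, group_by):
        """Answer from a view when one exists, otherwise scan the raw rows."""
        self.query_log[group_by] += 1
        if group_by in self.views:
            return self.views[group_by], "view"
        return aggregate(self.rows, group_by), "raw"


acc = Accelerator(RAW_ROWS)
acc.query("region")            # scans raw rows
acc.query("region")            # scans raw rows again; now a repeat pattern
acc.build_views()              # the engine materializes the hot grouping
print(acc.query("region"))     # served from the pre-computed view
```

A real system would also have to refresh or invalidate a view when the raw data changes; the sketch leaves that out.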

Access to all data is an important aspect of agility. You don't want to limit your users to thinking about questions like these: is the data inside my data warehouse or another relational system? Is the data accessible to a SQL engine that is native to the Hadoop system, such as Drill or Impala or others? Is it available to Spark because it is being processed in real time? Or is the data available to a search-based interface, for example sitting in a Solr or Elastic index, or in S3, or in a NoSQL system like Mongo or HBase? The idea is that you want to cut short the multiple steps that data needs to take before it can be blended and combined for business insight, and connect directly to the source of the data that you need. That is what drives agility in an architecture like this.

The second part is about applications that drive action for your end users as well, where you combine data from real-time as well as historical systems into the same interface.

How many times have you used the Alt-Tab key combination on your machine to go between tools: tool one, which does your historical charting; tool two, which gives you a little bit of access to your Hadoop data; and then tool number three, which gives you real-time access to the stream? Moving between these tools is something that stands in the way of the insight the end user is trying to get. You don't have to do that, because with a data-native architecture you get access to all the data, real-time or historical, side by side.

Use cases around connected vehicles, where you are working with data in real time, and cybersecurity, which I'll talk about later as well, show where the data lake is going. It starts out initially as a management platform to just store the raw data, but it is becoming more and more like a data stream that flows through it in real time, and some of that data needs to be accessed and leveraged in real time for it to be actionable. Otherwise it doesn't have much value as it becomes older.

The next reality is that the data lake does not exist in isolation. There is data that is going to stay in your traditional databases, and you should be able to blend the data. You should not be required to move all the data out of those systems of record, or those traditional SQL databases, before you can enable users to combine it with the new streams of data being added to the data lake. Being able to do that cross-connection data blend is an important aspect of enabling an application experience for the end users.
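
As a rough illustration of the blending idea, the sketch below joins a few new events (standing in for data arriving in the lake) with reference data that stays in a relational system of record, fetching only the rows it needs rather than copying the whole table. Everything here is an assumption made for the example: the in-memory SQLite database, the `customers` table, and the event fields are hypothetical and do not describe any particular product.

```python
import sqlite3


def make_system_of_record():
    """Hypothetical relational system of record (in-memory SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, segment TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "gold"), (2, "silver"), (3, "gold")],
    )
    return conn


# Hypothetical new events landing in the data lake.
LAKE_EVENTS = [
    {"customer_id": 1, "clicks": 4},
    {"customer_id": 2, "clicks": 1},
    {"customer_id": 1, "clicks": 2},
]


def blend(conn, events):
    """Total clicks per customer segment, leaving the customer table in place."""
    if not events:
        return {}
    # Ask the system of record only for the customers that appear in the events.
    ids = sorted({e["customer_id"] for e in events})
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, segment FROM customers WHERE id IN ({placeholders})", ids
    ).fetchall()
    segment_of = dict(rows)

    totals = {}
    for event in events:
        segment = segment_of.get(event["customer_id"], "unknown")
        totals[segment] = totals.get(segment, 0) + event["clicks"]
    return totals


print(blend(make_system_of_record(), LAKE_EVENTS))
```

The design point is only that the join happens at query time against both sources; how a production engine pushes work down to each source is outside the scope of this sketch.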

All this comes together when you are combining these different views into the data, real-time and historical, and building what we call workflow-driven applications. These applications are not a single dashboard that the user is using just to get a single view into the data. They give you multiple views into the data, one real-time and historical, something else coming from another source, to give you an experience that is much more immersive, and that shifts the consumption tier beyond what can be done merely with a basic dashboard.

This slide tries to touch upon just a small set of what our enterprise customers are actually doing in this space.

The examples range from companies like Procter & Gamble, where they are enabling hundreds of brand managers to understand campaign intelligence and brand intelligence, using micro-segmentation and A/B testing to understand how the brands are performing globally. Or, for example, Neustar, where they are building and embedding this kind of application experience into a SaaS front end that's available to different customers for marketing attribution. Or you look at HP Enterprise, where predictive maintenance, understanding customer usage, and failure detection for the servers and devices out in the field are the driving factors behind using a data-native platform and the new architecture at that scale of data.

Go to eBay, and they talk about using the same platform to drive cybersecurity analysis, with use cases of responding to insider threats or doing forensic analysis, which happens when you combine data from real-time as well as historical sources. Moving over to Royal Bank of Canada, you are talking about using the trade volume data that they have to capture, combined with electronic communications, to reduce the regulatory and compliance fines that organizations their size are constantly faced with.

And in healthcare, at Kaiser, the core use cases happen to be around controlling readmission risk for patients. For the 10+ million members who are insured by an organization like Kaiser Permanente, they are utilizing the longitudinal data for these patients to understand and control readmission risk, providing a better healthcare experience for the end users.

The examples go beyond this slide, but these are just some representative examples, across industries, of what a platform for systems of insight on the data lake can enable if you apply it to your business.

Moving on, this is an interesting quote that summarizes, or captures, what an organization like Procter & Gamble actually thinks. Having a distributed, on-cluster architecture is really what sets it apart from something that requires you to move data out of the system. It makes a lot of sense from an architectural perspective and, as we saw, it makes a significant impact, whether you are innovating at the front end of your business, trying to reduce risk for your business, or improving satisfaction for your customers.

In summary, I want to end with this slide, which captures the three key benefits. When you think about next-gen BI and how you scale BI as these larger data lake systems collect more and more data, you do not want to fit the same model that has essentially taken away agility by building multiple copies and cubes. You want an application experience that allows you to take action, not just to explore data for the sake of exploring, and you want to do that while having a simplified architecture that eliminates a lot of the complexity, with the execution sitting next to where the data sits.

With that, I'll transition back over to Steve to take questions.

Got it. Thanks for that overview, Priyank. So we've got some time here for questions to come in, and as I'm doing that, we've also got some links here to some of Boris's research: in addition to the Forrester Wave that he spoke about, some other research talking about scaling business intelligence platforms.

So the first question, Boris, is going to you. In your research, and a little bit today, you talk about the difference between clustered and highly parallel, distributed BI. Can you spend a little more time talking about what that difference is?

Yeah, I think there has definitely been some confusion over the last couple of years, because whenever we talk about Hadoop clusters, all of these today are highly parallel and distributed clusters. But in the old days, and a few people still think of it this way, a cluster was basically load balancing across a number of nodes, where each node may be running the full set of instructions, basically the entire program, and a load balancer just distributes the tasks.

What we talked about, by contrast, is a natively architected massively parallel processing architecture. The software is natively designed to be highly parallelized, where each node performs only its own task. As opposed to running through the whole program, it performs only a small part of the task, and there are specialized nodes that do the aggregation over the top. So basically it comes down to multiple variations of map-and-reduce type software. At the end of the day, that kind of parallel, highly distributed architecture has a much higher theoretical limit on how many nodes you can scale out to, versus the older-generation clusters.
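
The distinction drawn in this answer can be sketched in a few lines. The example is illustrative only: threads stand in for cluster nodes, the "program" is a sum of squares chosen for simplicity, and the function names are hypothetical rather than taken from any product.

```python
from concurrent.futures import ThreadPoolExecutor

DATA = list(range(1, 101))


def whole_program(data):
    """The entire job: sum of squares over the full dataset."""
    return sum(x * x for x in data)


def load_balanced(requests, n_nodes):
    """Older-style cluster: each request is routed round-robin to one node,
    and that node runs the whole program on its own."""
    assignments = [i % n_nodes for i in range(len(requests))]
    results = [whole_program(request) for request in requests]
    return results, assignments


def partial_task(partition):
    """In the MPP style, a node computes only its own slice of the job."""
    return sum(x * x for x in partition)


def mpp(data, n_nodes):
    """Scatter one job across nodes, then aggregate the partial results."""
    partitions = [data[i::n_nodes] for i in range(n_nodes)]
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(partial_task, partitions))
    return sum(partials)  # the aggregation step "over the top"


print(load_balanced([DATA, DATA, DATA], n_nodes=2))
print(mpp(DATA, n_nodes=4))
```

In the load-balanced version, adding nodes lets the cluster serve more requests but does not make any single request faster. In the scatter-and-aggregate version, a single job is divided, which is why the answer above associates it with a higher scale-out limit.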

Got it, makes sense, thank you. Cool. We've got a few questions here, and if anyone else has them, please enter them. If we don't get to them, we can follow up with you directly as well. Priyank, I'll start with one for you: what are some of the use cases? You explained some, but talk a little bit more about needing to get access to all the detailed data. Are there other examples where that's really important?

Yes, I can actually go into a few details that make it very clear. Cybersecurity, where you are detecting insider threats, responding to incidents in real time, and performing forensic analysis on your network, your users, and your endpoints, is one where merely summarizing the number of incidents or attacks that occurred in the network will not get to the root of the problem. You have to get to the individual behavior analysis of what they are doing. Having access to the data in real time really matters for these use cases.

Another example where granularity matters is a certain vehicle or truck that is sending IoT beacons back to the infrastructure. You have to be able to look at the behavior of that individual truck to be able to say what is going on with that particular vehicle. That's another example, on the IoT side.

And another is where you try to catch trading violations in real time. When you combine this with granular communication information around the time a trade is made, such as what other information that trader had access to, being able to combine the actual trade records with that other granular information is what enables you to catch a much broader set of violations or potentially fraudulent behavior. That is an example where, again, granularity really is required if you are to get any meaningful insight out of it.

Very good example, thank you. A minute or two left here, and a couple of other questions; maybe these are quick. Can Arcadia run on cloud systems? You talked a lot about Hadoop. Double-click on other systems and how the product works there.

Yes, it does run in the public clouds, across all the available public clouds, as well as in a hybrid environment. In the clouds we are using the data-native aspect of our technology and reading data directly from the object stores, the raw storage for the data. So if you think of an Amazon environment, that would mean being able to keep the data in S3 and still being able to do the slicing and dicing from a BI and visual analysis perspective. That's how we enable that experience in the cloud. The MPP BI technology is built so that it works the same wherever you are running it.


Very good, okay. One question left. Anything, Boris, that you'd like to add? I know we talked a little bit about some of the use cases for detailed data. Any other things you've thought about that you'd like to share with the audience? Then we'll do the last question.


Beyond cybersecurity, I think of examples like that in financial services, where looking at end-of-day balances or any kind of aggregation will definitely not help you with regard to fraud. If you transfer a million dollars, the balance at the end of the day may net out to zero, but it's really the detailed transactions that you need to be analyzing.

Another example that we run into all the time is when you look at a 360-degree view of a customer. A customer service interaction record may have a box checked that says this is a satisfied customer, because the last time you interacted with that customer, the call center found the customer satisfied. But then, when you bring in social media, over the last 24 hours that customer may have put something on Facebook expressing frustration with your particular product. So again, if you net both out, is that customer's sentiment neutral, because there was one negative and one positive? That doesn't really tell you the full picture. You really need to understand the root cause behind that negative sentiment from social media.

Examples like that, in customer sentiment and in financial services, are my thoughts on that.
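
The point about netting can be made concrete with a small sketch. The account names, amounts, and threshold below are invented for illustration; they are not data from the webinar.

```python
from collections import defaultdict

# Hypothetical transactions for one day: (account, signed amount).
TRANSACTIONS = [
    ("acct-1", 1_000_000),    # large transfer in
    ("acct-1", -1_000_000),   # large transfer out
    ("acct-2", 120),
    ("acct-2", -20),
]


def end_of_day_net(transactions):
    """Aggregated view: net change per account at the end of the day."""
    net = defaultdict(int)
    for account, amount in transactions:
        net[account] += amount
    return dict(net)


def large_movements(transactions, threshold):
    """Detail view: individual transactions at or above the threshold."""
    return [(account, amount) for account, amount in transactions
            if abs(amount) >= threshold]


print(end_of_day_net(TRANSACTIONS))            # acct-1 nets to zero
print(large_movements(TRANSACTIONS, 500_000))  # both large transfers are visible
```

In the aggregated view, acct-1 looks less active than acct-2. Only the transaction-level view shows the two million-dollar movements, which is the same reason one positive and one negative customer signal should not simply be netted to neutral.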

Cool. We're actually just a minute from the top of the hour, so there are a couple of questions we'll follow up on directly with those folks. Thank you very much, Boris, for joining us today as our guest; we really appreciate your time. Thank you, Priyank, for the overview, and thank you to everyone in the audience for joining today. Hopefully you got some value and education from today, and we'll follow up with some more of this research. Thank you very much, and everyone have a great day.