Unlocking the Power of Your Data Lake

Webinar Aired: May 31, 2018

You have a data lake; now it's time to unlock its power. The webinar "Unlocking the Power of the Data Lake" explains how.

As Hadoop adoption in the enterprise continues to grow, so does commitment to the data lake strategy. Two-thirds of Database Trends and Applications readers are either implementing data lake projects this year or researching and evaluating solutions. Data security, governance, integration, and analytics have all been identified as critical success factors for data lake deployments.

To educate this growing audience about the enabling technologies and best practices for unlocking the power of the data lake, Database Trends and Applications hosted a special roundtable webinar.

Transcript

[00:00:00.599]
Welcome to today's roundtable webcast, brought to you by HVR, Arcadia Data, and Percona. I'm Stephen Faig, Director of Database Trends and Applications and Unisphere Research, and I will be your host for today's broadcast. Our presentation today is titled "Unlocking the Power of the Data Lake." Before we begin, I want to explain how you can be a part of this broadcast. There will be a question-and-answer session; if you have a question during the presentation, just type it into the question box provided and click on the submit button.

[00:00:36.799]
One lucky attendee today will win a $100 American Express gift card. The winner will be announced at the end of the event, so stay tuned to see if it's you.

[00:00:47.200]
Now let me introduce our speakers for today. We have Mark Van de Wiel, Chief Technology Officer at HVR; Dale Kim, who leads products and solutions at Arcadia Data; and Rick Golba, Product Marketing Manager at Percona.

[00:01:11.000]
Thank you very much, Stephen, and hello, everybody. I'd like to start with data integration, because we work with a lot of customers and we see, of course, that data lakes deal with a variety of data volumes,

[00:01:47.299]
because I think those are, if you like, table-stakes challenges; everybody will need to address those, and

[00:01:55.000]
will be addressing goes beyond

[00:01:59.400]
the capability volumes

[00:02:03.299]
challenges

[00:02:11.000]
typically used as we just

[00:02:13.500]
got a date tonight with our clients

[00:02:15.900]
and and an opportunity diver

[00:02:23.400]
steps of data I

[00:02:35.699]
I think a defining criterion of a data lake is that data needs to be entering the data lake with minimal latencies,

[00:02:50.699]
because that's what we see as well as one of the top challenges many of you face today.

[00:03:06.099]
I give it 30 days I think

[00:03:15.400]
I might support mobile UK this weekend

[00:03:17.400]
um, across marketing, machine operations, governance; it's all over.

[00:03:30.400]
Data lakes support many users and many use cases, and lakes are cross-departmental; they are across everything. There are departmental

[00:03:52.599]
use cases that may in fact be using a scale-out database like Greenplum, or a cloud-based equivalent, or using a file system as a destination, and

[00:04:20.699]
I think that's even the more dominant approach; it gives more flexibility when we are talking about large, diverse sets of data, after all. So from that perspective,

[00:04:42.699]
we would argue even more so, as it's available as a service: pay-as-you-go, scalable, and relatively easy to do an evaluation with. On the source side, what does it look like?

[00:05:17.300]
We see a variety of sources, right, feeding your data lake. You may see a subset of these, but if we look across the board, across the many use cases that we support, we see lots of use cases that are bringing in a variety of data along the lines of what is shown in this picture. So we see XML files and data feeds frequently coming from external sources.

[00:05:53.399]
They can be IoT data; they can be manufacturing, industrial kinds of data. It could be sensors from your coffee machine; it could be sensors that are built into a modern automobile. There is streaming data coming in from a variety of sources, call it geographical, location-based data, coming in as multiple data points all feeding into the data lake.

[00:06:32.399]
More traditional sources, but in some cases it is some of the most important data organizations are trying to analyze, because some of their core processing, some of their operations, run on traditional databases and applications, including supply chain management.
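To make that variety concrete, here is a minimal sketch of landing feeds from different systems into a raw zone of a data lake, partitioned by source and arrival date. The paths and source names are illustrative assumptions, not anything prescribed in the webinar.

```python
# Illustrative sketch: land raw feeds (CSV, JSON, XML, sensor data) into a
# date-partitioned "raw zone". Paths and source names are assumptions.
import shutil
from datetime import datetime, timezone
from pathlib import Path

RAW_ZONE = Path("/data/lake/raw")  # could equally be an HDFS or object-store mount

def land_file(source_system: str, incoming_file: Path) -> Path:
    """Copy an incoming file into raw/<source>/<yyyy-mm-dd>/ unchanged (raw form)."""
    today = datetime.now(timezone.utc).date().isoformat()
    target_dir = RAW_ZONE / source_system / today
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / incoming_file.name
    shutil.copy2(incoming_file, target)  # keep the data exactly as it arrived
    return target

# Example usage with a few of the source types mentioned in the talk:
# land_file("erp_orders", Path("/incoming/orders.csv"))
# land_file("coffee_machine_sensors", Path("/incoming/telemetry.json"))
# land_file("partner_feed", Path("/incoming/price_list.xml"))
```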

[00:07:04.800]
If we look across those different data sources and we look at what kind of data feeds into the data lake, the data that comes in from the various systems is usually fairly structured data coming out of various applications,

[00:07:29.300]
based on existing data points. That is not to say nothing is happening with Twitter feeds; you name it, all new data points, everything is out there.

[00:07:47.800]
If you think about some of those traditional sources, whether it is the ERP systems or the traditional databases that feed the lake: relational database technologies update or delete rows, and for those kinds of changes we're going to have to be able to manage that, we're going to have to be able to deal with that, in order to keep the data lake accurate as data comes in from the other sources as well.

[00:08:42.700]
Let's talk about some of the key data integration challenges in the data lake. One is security. Of course, with the data lake containing a consolidation of sources,

[00:09:07.200]
in some cases it's a gold mine for somebody. So at least in order to guarantee trust in the data,

[00:09:32.799]
let me dive into some of the details that

[00:09:43.600]
I mentioned. The data lake needs to resolve some of those. Traditionally you run a batch process that would somehow retrieve the information. You want to do continuous capture: you don't miss any changes, and you don't have to rely on a schedule of what happened to your data. You want to get those changes, and you want to get the transactional changes as well.
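As a rough illustration of that continuous, change-based capture (a sketch with assumed event shapes, not HVR's actual interface), change records carrying an operation type can be applied in order so that no insert, update, or delete is missed:

```python
# Sketch: apply captured change events to a keyed target so it mirrors the source.
# The event format (op, key, row) is an assumption for illustration only.
from typing import Any, Dict, Iterable

def apply_changes(target: Dict[Any, dict], events: Iterable[dict]) -> None:
    """Apply insert/update/delete events in the order they were captured."""
    for ev in events:
        if ev["op"] in ("insert", "update"):
            target[ev["key"]] = ev["row"]
        elif ev["op"] == "delete":
            target.pop(ev["key"], None)

orders: Dict[int, dict] = {}
apply_changes(orders, [
    {"op": "insert", "key": 1, "row": {"status": "NEW"}},
    {"op": "update", "key": 1, "row": {"status": "SHIPPED"}},
    {"op": "delete", "key": 1},
])
print(orders)  # {} -- the delete was not missed
```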

[00:10:38.200]
Something that's related to the trust in the data, but it is very important: I think we underestimate some of the value that we get from the traditional applications. But if you think about it,

[00:10:55.799]
as an example, we know that if we had all the order lines but we didn't have a matching order record, that would be a problem; in a traditional application the database will ensure the transactionality of your data. And you're not going to keep going back to the source, because those systems, those traditional systems, are often very heavily loaded.

[00:11:47.700]
Then security, right: we're talking about building a data lake, we're creating a gold mine for data, where often we're bringing the data across a wide area network. And in many cases, I think in at least 4 out of 5 cases we see, data lakes are being built in the cloud. Well,

[00:12:12.700]
we had better encrypt the data as it moves across the wire, right? We're going to use SSL encryption to make sure that data is not exposed when it goes across the wire. But we also want to make sure it

[00:12:31.500]
needs to be encrypted at rest as well. You want to take advantage of the Key Management Service that the providers offer and encrypt the data when it's there, and also as it arrives there. On top of that, you also want to ensure that there is very strong authentication in the system so that it's not easy to break into it; using certificates and using keys is a way to ensure that kind of security and avoid a breach.
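As one concrete way to apply those recommendations, assuming an AWS-style object store (the bucket, object key, and KMS alias below are placeholders, not details from the webinar), a file can be uploaded over TLS and encrypted at rest with a key management service:

```python
# Sketch: TLS in transit plus server-side encryption with a KMS-managed key.
# Bucket name, object key, and KMS key alias are placeholders.
import boto3

s3 = boto3.client("s3")  # boto3 uses the HTTPS endpoint, so data is encrypted in transit

with open("orders.csv", "rb") as f:
    s3.put_object(
        Bucket="example-data-lake-raw",          # placeholder bucket
        Key="erp_orders/2018-05-31/orders.csv",  # placeholder object key
        Body=f,
        ServerSideEncryption="aws:kms",          # encrypt at rest
        SSEKMSKeyId="alias/example-data-lake",   # placeholder KMS key alias
    )
```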

[00:13:18.899]
The success of your data lake may or may not just happen; you need to make sure, as you're building a community, that whomever is going to rely on it is going to trust the data.

[00:14:06.500]
When data comes out of a traditional application onto a file system, that data looks very different; it's no longer like how it came out of Oracle, it's just files on my file system. So having a data compare solution can give that peace of mind, can give that confidence, that the data is in fact correct, that we can trust it, and that at times we can run our analytics or our data discovery against it.

[00:14:54.600]
To recap: continuous capture, speed, the use of encryption and certificates, and finally ensuring the end user is going to trust the data. So with that, Stephen, I am going to hand it back over to you.

[00:15:22.899]
Thank you very much, Mark. I would like to introduce our next speaker today, Dale Kim, who leads products and solutions at Arcadia Data.

[00:16:33.399]
information agility, information sharing, data enrichment, being able to get a complete picture of your business data, using

[00:16:46.799]
the right tool for the job. Data warehouses emerged as the type of stack people look to for certain large-scale analytics. So the story is not a single standard but a class of technology standards that help you address your specific needs: the right tool for the job and the technology. So what is a data lake, and how do we like to define it?

[00:18:13.000]
remote computer. It was created by

[00:18:15.400]
and

[00:18:18.000]
the environment of a

[00:18:22.700]
wide

[00:18:25.500]
variety of real-world situations: we have real-time requirements, larger volumes of data, and many sources, including those that sit alongside the data, so you get a lot of improvements

[00:18:49.700]
take a note from the bi

[00:18:51.700]
tool can .300

[00:19:05.500]
right well

[00:19:09.599]
performs well with only a small set of data, or vice versa, where you have large-volume data but only a handful of users, and a lot of work that's required on the back end, and so a lot of data

[00:19:53.799]
And finally, the truth about the agility claim: they are really agile in name only, and they're not about self-service across the board and all the time; self-service is interpreted as the ability to

[00:20:31.099]
one that

[00:20:33.099]
you might consider so is

[00:20:37.599]
Dad

[00:20:43.900]
awake and scratch out of the PIN

[00:20:46.000]
to your dedicated server to the four I

[00:20:48.000]
got towards the end users and

[00:20:55.500]
data

[00:21:00.700]
governance I

[00:21:03.799]
can approach is about big

[00:21:06.000]
data for the big data by architecture

[00:21:08.500]
never

[00:21:18.500]
work with this

[00:21:30.799]
is an ongoing and time-consuming process.

[00:21:45.400]
security

[00:21:58.000]
framework pictures

[00:22:08.900]
from different angles

[00:22:21.500]
I can be a bit funky

[00:22:27.000]
you will have to

[00:22:30.000]
go to smash

[00:22:32.000]
where and then physically model performance

[00:22:34.599]
purposes what's

[00:22:40.200]
the data warehouse

[00:22:52.099]
tell if we looking for the Nativity I

[00:22:54.200]
forgot Electric in

[00:22:56.200]
the platform from

[00:23:08.500]
your system

[00:23:46.799]
If your experience is with data warehouses, then you might assume that that process is someone else's problem, but modeling

[00:24:24.200]
your truck is because

[00:24:47.000]
what are the 12th

[00:24:49.500]
and then

[00:25:12.099]
explore

[00:25:14.599]
and discover all

[00:25:17.599]
the information you need after a

[00:25:26.200]
year and a million-dollar

[00:25:36.500]
contractor to

[00:25:39.700]
the got an error code for

[00:25:41.799]
the discovering plant bugs that are

[00:25:43.900]
next to each other by three

[00:25:48.700]
Heroes uc.com

[00:25:53.900]
optimization

[00:26:05.500]
steps Bonita

[00:26:20.700]
be our coach allowed to the scale at both of you current

[00:26:23.000]
volume level and

[00:26:27.200]
Compass satisfied on

[00:26:29.900]
your data Discovery activity Santa Margarita

[00:26:32.500]
Lake in space on YouTube

[00:26:37.000]
Let me give a quick example: the diagram represents a native BI architecture, accompanied with data blending capabilities, where users can access both modern and traditional analytical data directly, and the ability

[00:27:23.900]
to store modeling

[00:27:46.599]
remodeling is

[00:27:50.400]
created

[00:28:07.500]
when

[00:28:22.400]
the winner deployed over many users

[00:28:34.000]
lighten the overload in the system

[00:28:37.000]
then

[00:28:40.500]
we

[00:28:49.099]
work like a bad person, and then acceleration

[00:28:52.599]
engine finally

[00:28:57.299]
a whitepaper about data lakes in the field and what are some recommendations to be successful with them. There's also a video on data warehouse optimization to give you some ideas of what can be offloaded from a data warehouse onto a big data platform, and you're welcome to download the free software available.
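As a small illustration of that kind of offload (an assumed example using DuckDB rather than any vendor's product; the file path is a placeholder), a reporting-style aggregate can be run directly against files in the lake instead of the data warehouse:

```python
# Sketch: run a BI-style aggregate directly on data lake files.
# DuckDB and the path are illustrative choices, not from the webinar.
import duckdb

result = duckdb.sql("""
    SELECT region, SUM(amount) AS total_sales
    FROM read_parquet('/data/lake/curated/sales/*.parquet')
    GROUP BY region
    ORDER BY total_sales DESC
""").fetchall()

for region, total in result:
    print(region, total)
```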

[00:29:32.000]
Thank you very much, Dale. At this time I'd like to introduce our final speaker today, Rick Golba, Product Marketing Manager at Percona.

[00:29:41.299]
Thank you, and thanks, Dale and Mark, as well. As a company we often get called into an organization to help out with really seeing value out of its data, and the first thing I want to do

[00:30:09.799]
is draw a comparison between a data warehouse and the concept of a data lake, trying

[00:30:30.299]
to drive from Mesquite

[00:30:41.299]
cuz you got that very

[00:30:55.400]
protective and

[00:31:05.500]
harder to figure out how to get

[00:31:07.599]
on

[00:31:17.599]
read and

[00:31:22.400]
when we go to actually access that data, we impose some of this organization on that data and start

[00:31:46.500]
out with the best intentions: you want to be able to locate any specific piece of information and to be getting the right, the most current, the most accurate information out of it. I

[00:32:22.400]
have in here that might be more current

[00:32:24.700]
more accurate. So the data lake is much more open-ended in terms of the way you're putting the data in than what you would find in your traditional data warehouse, where obviously you know where each record is going to go. With the data lake there is still a necessity for having some form of organization and structure.

[00:33:10.200]
The data lake has the necessity for some sort of organization; normally

[00:33:33.200]
want to have this don't

[00:33:43.799]
want to have a case where your data lake is not usable. You will be looking at relational and non-relational data. Normally you would go to something like Teradata or Oracle, something based on the structure that already exists, but now I have all of this non-relational data, Internet of Things data, and we can have all of that information coming in and being made accessible. And that's the next piece: being made accessible. You may have non-relational data coming in; it still needs to be able to be located and found and put in places where it is going to be accessible.
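To make "being made accessible" slightly more concrete, here is a toy sketch (an assumption, not a product feature mentioned in the webinar) of recording minimal metadata for each landed file so that non-relational data can still be located later:

```python
# Toy catalog: record where each landed file lives plus basic metadata so it
# can be found later. A real deployment would use a metastore or catalog service.
import json
from pathlib import Path

CATALOG = Path("/data/lake/catalog.jsonl")  # placeholder location

def register(path: str, source: str, fmt: str, tags: list) -> None:
    """Append one catalog entry per landed file."""
    entry = {"path": path, "source": source, "format": fmt, "tags": tags}
    with CATALOG.open("a") as f:
        f.write(json.dumps(entry) + "\n")

# register("/data/lake/raw/sensors/2018-05-31/telemetry.json",
#          source="iot", fmt="json", tags=["telemetry"])
```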

[00:34:50.099]
Coming from the analytical side, and again with what Dale talked about with the BI tools, we're looking at a variety of different ways that we could go about accessing that data, and retrieval is very, very much a key piece of it. I

[00:35:11.199]
want

[00:35:14.199]
to work at opening it out to some one of them

[00:35:16.300]
and

[00:35:26.199]
this is the idea of the bi-fold evolving

[00:35:28.300]
because when

[00:35:32.000]
we understand a

[00:35:34.000]
lot of how dirty

[00:35:38.800]
I told they're going to allow me

[00:35:41.199]
access even

[00:36:02.599]
need to keep it shut

[00:36:16.400]
down because we could be in danger of failure you

[00:36:26.099]
want to take the action and want to cut that puts

[00:36:28.199]
down right away from

[00:36:33.599]
two months ago 4 months ago when

[00:36:38.400]
they look to just consolidate that out and keep that data in some way that is usable and accessible, but not in the

[00:36:58.400]
video

[00:37:10.800]
of the scalability in the world of the cloud. So having the ability to scale up as needed to accommodate what's coming into your data lake is really key. The other piece that's really nice is that the cloud providers normally have good ways to access that data, and obviously you have lower-cost storage in the cloud, and then in a lot of cases

[00:37:47.199]
is

[00:37:49.699]
the doesn't

[00:38:00.000]
work, just get rid of it, and you don't have the real risk, the real involvement, of going out, having a piece of hardware configured for you, and having all of that additional work done. So we're really seeing the cloud providers coming to the forefront with getting involved with cloud

[00:38:24.099]
data lakes. The one other piece here is

[00:38:29.099]
the cost of moving from a lake, and

[00:38:33.900]
again I can

[00:38:36.300]
say

[00:38:42.400]
is

[00:38:53.500]
if you have the data in a totally unusable manner, that

[00:39:18.199]
baggage is Campbell really well

[00:39:29.099]
calpipe

[00:39:31.199]
item of

[00:39:42.199]
the

[00:39:45.099]
lake on

[00:39:53.199]
the other hand, we get to this point where we're just kind of mixing all the luggage together; now we're moving into the data swamp. Imagine at Christmastime they dumped the luggage out into the terminal building and you had to go through and you have to find your luggage. Now that's not organized or easily accessible, and that's what we're looking to avoid. One of the big pieces of getting your data lake adopted is trust: users may feel like they can't trust it. There are a couple of reasons that users may feel like they can't trust the data:

[00:40:40.000]
manipulation

[00:40:48.699]
and

[00:40:54.199]
other

[00:40:56.300]
information

[00:41:11.400]
different

[00:41:13.500]
ways to

[00:41:19.199]
make it available to a lot of your users but also highly trusted. And this comes back to the security piece, and it comes back to making certain you are looking at all of the change data, looking at all of this information, so that you and your users have the sense that the data is really, truly the current and most up-to-date information.

[00:41:55.099]
Are you stuck? Is it going to be easy? Probably not. What you need to do: you might need to put some new structures in place, and for a while

[00:42:24.400]
I'm

[00:42:36.800]
going to call it an

[00:42:48.900]
actual but

[00:42:53.400]
it is a gradual process; you don't fall overnight into the swampland. Obviously that's the time to look around, get some help, and get out of it, so that you can avoid the bad situation. If it's

[00:43:13.099]
kind of already happened and the

[00:43:15.199]
situation

[00:43:24.300]
something that

[00:43:26.300]
is is really not in a great and

[00:43:28.400]
sounds take me

[00:43:36.800]
to all of the dork in

[00:43:38.800]
the success of the talks

[00:43:49.500]
about some of these things out there.

[00:44:01.099]
Thank you very much, Rick. At this point we're going to move into questions from our viewers today, and the first question is one that I think all three of you can weigh in on. Why don't we start with Mark: will data lakes be replacing Hadoop in the future?

[00:44:17.400]
Well, that's a great question, Stephen. Hadoop is often chosen as the technology that hosts the data lake, but I think the data lake is a way of managing data, whereas Hadoop is a technology, a technology that is very efficient. So is one going to replace the other? Is the data lake going to replace Hadoop? I'd say no; they give you different things.

Understood.

[00:45:08.000]
Dale, would you like to weigh in?

[00:45:19.199]
In fact, you know, over the years we will see other technologies jumping in for that data lake infrastructure, object stores and so on, that all together are creating this, the data lake.

[00:45:41.900]
Understood. And Rick, your thoughts?

I see it as one way of accessing data through the data lake, but I don't think it's going to be a full replacement.

[00:46:00.199]
Understood. Our next question is for Mark. Mark, do you support data masking and tokenization?

[00:46:07.800]
Great question. Of course, when we talk about the data lake and think back about what is going to be a defining criterion of the data lake, we say we store data in its raw form.

[00:46:31.900]
I see the picture it's not something

[00:46:33.900]
something to

[00:46:36.199]
use the

[00:46:43.800]
ritual basis we done sometime tonight

[00:46:45.800]
Jason the

[00:46:55.900]
context of a day to Lake

[00:47:07.699]
transportation

[00:47:09.500]
a common question where we see there is the need for security and for ensuring there is no unauthorized data access.

Understood. Thanks, Mark.

[00:47:29.000]
Yeah, so if you're using object stores as that data lake environment, then you'll have large files stored in those stores in a format like Parquet or ORC, which are really good for analytics; they're columnar, and so, you know, Parquet and ORC are ideal ways of storing your data lake data for analytics.
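As a small, generic illustration of those columnar formats (not tied to any of the vendors on the call; the file path is a placeholder), a tabular extract can be written to and read back from Parquet with pandas and pyarrow:

```python
# Sketch: store a tabular extract in a columnar format (Parquet) for analytics.
# Requires pandas with pyarrow; the file path is a placeholder.
import pandas as pd

df = pd.DataFrame({
    "order_id": [1, 2, 3],
    "region":   ["EMEA", "AMER", "APAC"],
    "amount":   [120.0, 75.5, 210.25],
})
df.to_parquet("/data/lake/curated/orders/orders.parquet", index=False)

# Analytics can then read only the columns it needs:
sales = pd.read_parquet("/data/lake/curated/orders/orders.parquet",
                        columns=["region", "amount"])
print(sales.groupby("region")["amount"].sum())
```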

[00:48:03.800]
Understood. Rick, our next question is for you: is the cloud the best place for your data lake?

[00:48:11.400]
I think the cloud is an excellent spot for a data lake because, as I mentioned earlier, there is that ability to expand as needed, and a lot of the cloud providers are really working to make their data lakes much more accessible and available to users.

Understood,

[00:48:48.300]
thanks, Rick. Mark, we're going back to you: how would you maintain, in practice, transactional consistency in a data lake?

Yeah,

[00:48:57.000]
that's a great question, Stephen, and that quite frankly goes back to what the data looks like and how the data kind of behaves in our world. We want to make sure that we replicate that data, or that kind of behavior, into the data lake. When the technology can see the transactional boundaries on the source, and that's where I go back to change data capture on, say, supply chain management or CRM kinds of sources, then by utilizing that and orchestrating the publication we can still maintain the transactional consistency, whether it's every few minutes or every few seconds. That's how we can help maintain it.
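A rough sketch of that idea (with assumed event fields, not HVR's actual mechanism): captured changes can be grouped by their source transaction and applied as whole transactions, in commit order, so the lake copy never shows a half-applied transaction:

```python
# Sketch: apply captured changes one source transaction at a time, in commit
# order (approximated here by transaction id). Event fields are assumptions.
from collections import defaultdict

def apply_transactions(target: dict, events: list) -> None:
    by_txn = defaultdict(list)
    for ev in events:                 # group change records by source transaction
        by_txn[ev["txn_id"]].append(ev)
    for txn_id in sorted(by_txn):     # apply each transaction as a unit
        for ev in by_txn[txn_id]:
            if ev["op"] == "delete":
                target.pop(ev["key"], None)
            else:
                target[ev["key"]] = ev["row"]

state: dict = {}
apply_transactions(state, [
    {"txn_id": 2, "op": "insert", "key": "B", "row": {"qty": 5}},
    {"txn_id": 1, "op": "insert", "key": "A", "row": {"qty": 1}},
    {"txn_id": 1, "op": "update", "key": "A", "row": {"qty": 2}},
])
print(state)  # {'A': {'qty': 2}, 'B': {'qty': 5}}
```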

[00:50:39.000]
Understood, thanks Mark. Our next question is for Dale. Dale, should there be a separate data lake for business intelligence analytics versus a machine-learning data lake?

Really,

[00:50:51.599]
I mean, different types of analytics, your BI tools, and so many different user groups

[00:51:38.900]
Understood. Thanks, Dale. Rick, our next question is for you: are there practical size limitations that you should consider for your data lake?

[00:51:48.199]
That is another one of the challenges: if you don't limit what goes into the lake, that is definitely something that can become a problem, because that can lead to people,

[00:52:09.599]
again if I go back to my friends

[00:52:32.900]
and

[00:52:43.199]
that's different than an unrestrained passion

[00:52:45.500]
so alone but

[00:52:48.599]
let alone keeping the indexing and structure of the data so that it is always accessible in massive, massive volumes of data.

[00:53:15.800]
Understood, thanks Rick. Back to you, Mark. Mark, with the technology choice of a file system like HDFS, how would you compare data in the data lake?

[00:53:27.199]
That is a great question; we get asked that question often. The answer is that we are going to represent the data through a technology like Hive, which happens to be available on the file systems that we commonly use for data lakes; that includes HDFS. Essentially we flag data differences between source and destination against the data lake. As I mentioned, that requires compute power, but because it's Hive, because it runs in parallel, we can still run the comparison even though the destination is a file system. Even though we keep changes on the file system, we can still do that comparison with the current state of the source system. Great question.
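A simplified sketch of that kind of source-versus-destination comparison (illustrative only, not HVR's compare feature): compute a per-key checksum on both sides and flag the keys that differ:

```python
# Sketch: flag data differences between a source extract and the data lake copy
# by comparing per-key row checksums. Row shapes are illustrative.
import hashlib
import json

def checksums(rows, key):
    out = {}
    for row in rows:
        payload = json.dumps(row, sort_keys=True).encode()
        out[row[key]] = hashlib.sha256(payload).hexdigest()
    return out

source = [{"id": 1, "amt": 10}, {"id": 2, "amt": 20}]
lake   = [{"id": 1, "amt": 10}, {"id": 2, "amt": 25}]

src, dst = checksums(source, "id"), checksums(lake, "id")
differing = {k for k in src.keys() | dst.keys() if src.get(k) != dst.get(k)}
print("keys that differ:", differing)  # keys that differ: {2}
```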

[00:54:52.800]
Understood, thanks Mark. Dale, this one is for you: what's the learning curve for adding a new BI tool to the stack?

[00:55:01.599]
there

[00:55:14.300]
are existing. You

[00:55:26.199]
if you wanted to play getaway can get

[00:55:28.199]
a lot of value for you expect much out of

[00:55:30.199]
way cuz then you know people

[00:55:37.800]
but overall

[00:56:03.000]
Understood. What are some of the common data lake use cases that you see amongst clients?

This is another area where we see clients taking data from multiple sources and looking to compare data from different sources together,

[00:56:29.699]
and this is where the data lake allows for the flexibility of accepting the data from different sources but still bringing it all together into one place where I can really analyze it, look at it, and compare one source against the other.

Is the data

[00:56:51.099]
lake kind of a replacement for big data?

[00:57:00.699]
And I think big data is just another component of the data lake; there's certainly going to be a need for that. The concept of big data has allowed us to start accepting more kinds of data into one common place.

[00:57:41.099]
Understood, Rick. That is all the time we have for your questions, but as I stated earlier, all questions will be answered via email. I'd like to thank our speakers today: Mark Van de Wiel, Chief Technology Officer at HVR; Dale Kim of Arcadia Data; and Rick Golba of Percona. If you would like to review this presentation or send it to a colleague, please use the same URL that you used for today's live event; it will be archived, and you'll receive an email tomorrow when the archive is posted. If you would like a PDF of the presentation, you can click on the resource icon on the console. Now, as we stated earlier, just for participating in today's event, someone will win a $100 American Express gift card; the winner will be notified via email so you can claim your prize. Thank you, everyone, for joining us today, and we hope to see you again soon. This concludes today's webcast.