Complying with financial regulations doesn't need to be so difficult.
How Nordea Uses Data to Comply with Financial Regulations
Complying with industry regulations is a deep-seated function for financial institutions, but the ever-changing nature and nuanced complexities of these regulations can be a strain on resources. Today’s institutions are turning to big data to dramatically reduce their response time to regulations and free up resources to focus on revenue drivers.
In this webinar, Executive Vice President Alasdair Anderson from Nordea Bank AB, a Global Systemically Important Bank (GSIB), will explain:
- How the combined solutions from Cloudera, Trifacta, and Arcadia Data have allowed Nordea to comply more quickly with regular and ad hoc compliance reporting requirements.
- The benefits that self-service has had on its business, and how the use of these technologies has spurred innovation within the organization.
- How to leverage these technologies via a short demonstration, with a focus on identifying and resolving financial data quality issues.
Moderator: Paige Schaefer
Panelist: Alasdair Anderson
Executive Vice President | Nordea Bank AB
Panelist: Steve Totman
Financial Services Industry Lead | Cloudera
Panelist: Shant Hovsepian
Founder & CTO | Arcadia Data
Paige Schaefer: All right. Hello and welcome to today's webinar. How Nordea uses data to comply with financial regulations. I'm Paige, your moderator from Trifacta.
So today we're going to hear from two speakers from Nordea Bank, one of the largest [00:00:30] banks in Europe with over ten million customers in the Nordic region. First we have Alasdair Anderson, who has over 20 years of big data experience, and leads a 400 plus team as the head of data engineering at Nordea. He has been focused on building the next generation of data platforms at Nordea in order for it to become a market leader in regulatory compliance.
Additionally, we have Paul Lin who has deep expertise in the banking world and is helping execute Nordea's vision as a product engineering team lead. We're excited to hear from them both today.
[00:01:00] Our technology sponsors today are Cloudera and Arcadia Data in addition to Trifacta. With us today we have Steve and Shant. Steve is the financial services lead at Cloudera and Shant is the co-founder and CTO of Arcadia Data. And of course, myself.
So we're just going to start with a high-level overview of what these technologies do before Alasdair and Paul jump in with how they're being used at Nordea. Steve, I'll let you explain a little bit about Cloudera.
Steven Totman: [00:01:30] Awesome, thank you very much, Paige. So, Steve Totman, financial services lead at Cloudera. I apologize for the British-American accent; I'm U.K.-based originally and I moved to the U.S. So at Cloudera we believe that data can make what is impossible today possible tomorrow. And to that end, the Cloudera EDH, the enterprise data hub powered by Hadoop, is a big data platform that enables machine learning and data analytics. It can be deployed wherever your data resides, on premises or in the [00:02:00] cloud. So the world's leading organizations are using our fast, easy and secure platform to transform vast amounts of complex data into clear and actionable insights. And that drives faster business decisions and lower costs.
Paige Schaefer: Awesome, thanks Steve. So a little bit about Trifacta, we're the industry-leading data wrangling solution. We make exploring and preparing data a self-service process for any user by leveraging the latest innovations in data visualization, human computer interaction [00:02:30] and machine learning. We support desktop, cloud and big data deployments. Shant do you want to go ahead?
Shant Hovsepian: Yeah. Hi everybody. Thank you so much for joining. I'm Shant Hovsepian, CTO and co-founder of Arcadia Data. We do native visual analytics for big data. This in turn helps companies find insights within their data. Our software runs within modern data platforms such as Cloudera's enterprise data hub as well as the cloud, and it's built for the scale, flexibility, performance and security users [00:03:00] need to glean meaningful and real-time business insights in this modern era of big data. Thank you.
Paige Schaefer: Awesome. All right. So here's our agenda for today. We'll start with Steve Totman, who will walk us through some of the factors and resulting use cases that are shaping data in the financial services industry. Then we'll hand it over to Alasdair, who will help us understand a little more about regulatory reporting at Nordea. Then we'll have Paul give us a demo of how his team is leveraging Trifacta and Arcadia [00:03:30] Data on top of the Cloudera platform. And finally Alasdair will wrap up with some of the benefits Nordea is already seeing. And then we'll head into Q and A.
So with that I will hand it over to you Steve.
Steven Totman: Awesome. Thank you very much. So the next slide please. So at this point Cloudera has over 200 financial services customers. It's our largest vertical and our fastest growing, including Nordea, [00:04:00] who we're very grateful to have as a customer. And there's a couple of key things across all of them that have kind of driven this big data adoption. So the first is that data is a natural consequence of life today. Everything from cars to even people who have smartphones and watches are producing vast amounts of data. So that's really driving this massive growth at the panel [inaudible 00:04:23]. The second is that, especially in financial services, there's a huge focus on the customer [00:04:30] journey, as digital adoption sort of sweeps across the industry. And it's especially important in financial services because the margins are so thin. And it's increasingly easy for customers to switch, and so customer experience is a key differentiator. Actually one of our other customers not too far from the Nordea team talks about sort of taking banking back to the 70s. They don't mean in terms of the technology, they mean in terms of the customer experience. You went into a bank in the 70s. [00:05:00] They knew your name, they knew you wanted a mortgage. They knew your family. You know, and they had a very personalized experience. They're using data to do it today.
The third is sort of fraud and cyber. I don't need to remind everyone just how common fraud and cyber attacks are today. And the bad guys are increasingly sophisticated. And they have a larger and larger impact. It's just a cost of doing business. And then finally, regulations. And actually the next slide speaks to this even better. So next slide please.
[00:05:30] So there's this old saying, "It's impossible in life to be sure of anything but death and taxes." In financial services we can add to that: the only things which are sure are death, taxes and more regulations changing more frequently. So if you look globally, there's a ... you know, the slide is just a very small subset of all the regulations that are facing financial services institutions. And then you've got deltas between sort of European regulations, U.S., [00:06:00] Asia-Pacific. So, you know, in the U.S., Dodd-Frank and CCAR are very dominant; in Europe, obviously, the Basel initiatives, etc. You know, this is ... you know, there are newer regulations coming through. So across the board these regulations are getting more intense. They're changing more frequently. And the larger you get as an organization, the more markets you go into, the more of these regulations you're subject to. So it's incredibly complicated. And you [00:06:30] know, it's not just the amount of regulations, it's also the frequency of change that you have to comply with as well. Next slide please.
So what we're seeing is those core themes are driving specific use cases, you know, especially around usage of the Cloudera enterprise data hub. So those 200 plus financial services customers are doing things like focusing on customer journey. They're focusing on financial crimes or AML type of [00:07:00] security. Cyber will group into this as well; there's a whole separate area around that. Cloudera has been working with Intel on a project called Apache Spot, which is worth investigating in cyber security.
On the [inaudible 00:07:14] services initiative side, there's a huge focus on how you can much better understand which products work well for you. And then the subject of today's webinar is risk and compliance. So there's all these use cases around risk and compliance, but what's really interesting is [00:07:30] I'm seeing customers, they see risk and compliance and [inaudible 00:07:36], things they have to do. But what's really interesting is the same data set you use at the cost center for risk and compliance can actually be used for customer journey. And I've actually literally seen customers take billion dollar cost centers and turn them into small profit centers. So there's a massive opportunity there. Next slide please.
[00:08:00] So across these data sets, we see a real blend. So firstly, many of these institutions have done an okay job with structured data, and I'm gonna let the guys from Nordea comment on their path to that piece, but what we're also seeing is an increased focus on not just structured data now, but then blending in unstructured data. And really the value comes from not just having [00:08:30] a really good view of structured data, analyzing all of it, and getting all the value from it, but also blending it with the unstructured data and then being able to apply machine learning and analytics on top, which is why the [inaudible 00:08:44] can be so relevant.
So really across all the organization, you may be doing an okay job with structured data, most companies are not, but it's actually the value of being able to bring both together. So structured and unstructured into a single platform. And that's why [inaudible 00:08:59] is so special: it [00:09:00] allows you to blend both structured and unstructured together, and it allows you to store it at a completely different cost point. It has a schema-on-read framework, schema on read, which allows you to be much more flexible.
And with that I'm gonna pass over to the Nordea team, so, Alasdair.
Alasdair A.: Thanks, thanks very much, Steve. And good afternoon to everyone in Europe and good morning to the folks in the U.S. So just before we step in: the Nordea [00:09:30] brand maybe didn't reach the shores of every country, so Nordea is the largest financial services organization in the Nordic region, so the Nordics, that's the Scandinavian countries, so Denmark, Sweden, Norway, Finland.
We are a global systemically important bank, so we're one of the 27, 28, I forget the number, of G-SIB banks who are monitored very, very closely by the regulators. [00:10:00] Systemically important became far more of an issue after 2008. The banks in this category came under much greater regulatory oversight, and that continues to be one of the key challenges, but also one of our responsibilities to our customers and to the countries where we're domiciled. So this is really what we're gonna talk about today: the environment in which we're regulated and how we are adapting [00:10:30] to the challenges that that brings. And next slide, please.
Okay, so the challenges, they're really quite simple, although really tough. We are seeing an increase in both the level of regulatory oversight and the level of regulatory change in the requirements that are made of us, and in the Nordic region we have [00:11:00] four regulators including the ECB, to whom we are under an obligation to report our financial standing and risk standing, and this is not uncommon in the G-SIB banks. G-SIB banks almost always, I can't really think of a ... Maybe South American and Chinese banks, but almost always act over multiple jurisdictions and have multiple regulators. So we're seeing a large increase in the regulatory demand.
[00:11:30] The amount of data that we have to produce, the accuracy of that information as well, so there's a general explosion in growth. Not really due to more data, but more questions being asked of that information. So really our challenge from a regulatory standpoint is how can we answer these questions quickly and accurately enough? At the same time, I would probably put this in the same category as most of the financial services organizations, [00:12:00] and that doesn't just include banks, it includes [inaudible 00:12:03] organizations, so on and so forth. Steve mentioned earlier that the margins are being challenged, and the profitability of the bank and therefore the cost base is often a source of focus, so can we reduce the operational cost as well as the investment cost of producing data solutions, data platforms, regulatory reports, at [00:12:30] the same time as this explosion in demand?
And like all mature companies, and Nordea is over 100 years old and an amalgamation of many financial services organizations, we are coming with a level of technical debt through legacy systems, and we don't have just one platform that does one job. We have to bring data together from many, many platforms. In our case that's a number that reaches over a thousand. So next slide, please.
[00:13:00] So we started this effort probably about 18 months ago, and the ambition that we had was really to, as Steve mentioned earlier, take a cost center and turn it into [inaudible 00:13:17]. Maybe not a profit center, but certainly somewhere we could see data as a strategic asset. So we set ourselves four sort of strategic pillars that we wanted [00:13:30] to build our next generation platform to.
So the first one is cost, so the cheaper box that you see. We wanted to do a complete flip in the cost of storage and processing of data. As [inaudible 00:13:47], almost all of our processing was on proprietary technologies, not open source technologies, and we were using proprietary hardware, not commodity hardware. We wanted to flip that equation and reduce our ... [00:14:00] Significant percentage. I know there's market indicators out there of, you can get somewhere between like 5 and 99% cost efficiencies. That isn't really achievable in the highly regulated and proprietary environment that we have. But we really target something north of 70%.
We also wanted to increase the velocity of our change, so we wanted to deliver, for lack of a better word, in an agile fashion. [00:14:30] We wanted to rapidly reduce our time to market, and move to a higher velocity of insight as well. [inaudible 00:14:39] with real-time data sources, we've always had real-time data sources in the banking world, whether it be trading sources or customer contact systems. But we wanted to have real-time insights coming out of the systems, and our enterprise legacy systems are oriented around batch, and we wanted to move to [00:15:00] processing.
We also wanted to automate a lot of the development process and operational process that today is done by humans, so we wanted to reduce our overall people footprint and the work that we did, knowing full well that we had to do more with less, therefore we had to automate stuff that's done by humans today. But we wanted not to just replace like for like. I think, if we changed technical platform and landed, after many [00:15:30] years of investment and effort, on a platform that did the same as the current legacy platform, then we really would have failed. We wanted to do something that was much better. We wanted to build a platform that guaranteed the quality of the information it produced, and worked on our capability to continually improve the data quality and make sure the accuracy was higher, and therefore that reconciliation became something [00:16:00] that was then built into the data quality framework.
We wanted to have full traceability of the information, and Steve mentioned the Basel regulations, so BCBS 239: the ability to trace the full provenance of your information, so detailed requirements, is something that we really wanted to do from cradle to grave. We talk about our track and trace features through our platforms.
And then the last part is a very hot topic for us. We need to do and find [00:16:30] more with the data that we have, given the explosion in analytical potential with other data sets, whether it be for the downside risk, in terms of managing exposures or managing financial position or potential financial crime issues. We also wanted to leverage that analytical capability for the good of the bank and the customer. So the digital experience, the customer journey. We wanted to build capabilities on our data sets that leverage [00:17:00] those analytic capabilities both to protect the bank and also to do a better job for our customers.
The last one is something that we see must be built into any data platform: security. If we can guarantee the security of the information, so that the right people have access to the right data to do their job but they don't have too much access, and we also have an enhanced audit capability, then [00:17:30] we will be in a far more comfortable position with the platform that we're building.
There's an inherent risk in building big data platforms that you're putting all your eggs into one basket and creating, I think the cyber term would be, a honey pot. So we wanted to make sure that the platform is really secure, that we have an ability to raise the security levels for very highly restricted data sets. So financial investigations would be a highly restricted [00:18:00] data set for the [inaudible 00:18:01]. And we wanted to build not just a sort of rules engine that monitored the platform; we were starting to move toward a predictive monitoring capability where we were looking for behavioral patterns of access on the platform.
And again, Steve mentioned Apache Spot, and that's a really interesting project that does that sort of stuff out of the box, but privacy by design was a big part of our success criteria. So that's really what our [00:18:30] ambition was, and thanks to the sponsors here, [inaudible 00:18:37], Paul's gonna give a demonstration of the capability that we have brought to market within Nordea.
Paul Lynn: Hi, everyone. I'd just like to introduce myself, my name's Paul. I run the product engineering team here at Nordea, and I'll be using one aspect of the SEPA regulation to demo how we're [00:19:00] utilizing Trifacta and Arcadia today. Could you go to the next slide please, Paige? Thank you.
So Trifacta and Arcadia, they're integral to SEPA, and we also use them for delivering other regulations, including Basel or [inaudible 00:19:18] issue, and in fact we're using Trifacta and Arcadia in every project our team is running right now. But I'll give you a brief introduction to the SEPA reg. So it's a European regulation [00:19:30] that requires payment information to be sent in a specified XML form of ISO 20022. And Nordea's goal is to be the first Nordic bank that offers a full XML flow within a bank.
The [inaudible 00:19:47] SEPA payment instructions are held in our payment and accounting infrastructure, and as Alasdair mentioned, Nordea's grown by mergers and acquisitions over the years, and our IT infrastructure very much reflects this. Most banks would have [00:20:00] a single payment system. In Nordea, we have seven payment systems in Denmark alone. And in all, 17 different systems supply information, each with a separate support team, each with a different system type, mixed formats, mixed times. So I want to give you an idea of the complexity of the legacy environment.
Our end goal for delivering this requirement is, well, firstly, to meet the regulation of course, to support client retention [00:20:30] and attract clients as well, and to create a consistent data set which could be used for other purposes in the future.
On the right-hand side of this slide, for key results: when we talked to an experienced IT PM, they estimated this work, optimistically, at taking 15 business days. And this estimate was very much based on a traditional [inaudible 00:20:57]. Like talking with 17 different heads, getting [00:21:00] on 17 systems that could work, and getting 17 data extracts. And in this traditional model, most likely the data extracts wouldn't be the full data set, and we'd have to be making assumptions about the data, and there would be a very strong risk of incorrect assumptions not being identified until late in the project cycle, which would cause issues for the project.
I'm going to demonstrate now how we determined this requirement in under [00:21:30] one day using Trifacta. Paige, could I get control of the slide, please?
Paige Schaefer: Yep, absolutely.
Paul Lynn: So now I'm going to demo Trifacta. We've really realized big productivity gains. We've got the ability to discover and wrangle data at scale using this platform, and it really delivers a solution to our problem of wrangling data at scale.
[00:22:00] So now I'm showing the Trifacta screen, and I'll walk through an overview of the screen before doing anything. So as you can see, the data is very clearly presented. And we're getting some metrics straight out of the box. So to take a column at random, we've got a horizontal green bar, which is giving a metric of valid values, and if I scroll to the right, you can actually see two examples for ... Some columns have mismatched values, [inaudible 00:22:29] values, and missing values.
It [00:22:30] also means we're getting a breakdown of the distribution of values within each column, so in this column we're seeing the percentage of P, W, E values. So it provides lots of good information. If there's interest, you can even drill in and get more detail on the column, some statistics, the graphical [00:23:00] distribution of the information as well.
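[Illustrative sketch, not part of the spoken demo: the per-column quality bar and value distribution described above can be approximated in a few lines of Python. This is not Trifacta's implementation; the function name `profile_column`, the validity rule, and the sample values are all assumptions made up for illustration.]

```python
from collections import Counter

def profile_column(values, is_valid):
    """Classify each cell as valid, mismatched, or missing, and count valid values."""
    counts = Counter()
    distribution = Counter()
    for value in values:
        if value is None or str(value).strip() == "":
            counts["missing"] += 1
        elif is_valid(value):
            counts["valid"] += 1
            distribution[value] += 1
        else:
            counts["mismatched"] += 1
    total = len(values)
    # Share of each category, comparable to the coloured bar above a column.
    shares = {kind: counts[kind] / total for kind in ("valid", "mismatched", "missing")} if total else {}
    return shares, distribution

# Hypothetical sample column holding single-letter codes such as P, W, E.
sample = ["P", "W", "E", "P", "", "P", "42", None]
shares, distribution = profile_column(sample, lambda v: v in {"P", "W", "E"})
print(shares)
print(distribution.most_common())
```

[The validity rule is passed in as a function so the same profile can be reused for columns with different expected types.]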
So the data set we're using for this webinar has got around 110 thousand rows. And we're taking a sample, just over 31,000 rows, for this analysis. And the system we take it from contains cross-currency transactions. And now I'm going to do four of the transformations we did in the actual project itself. And in doing this, with someone from the business sitting beside me, [00:23:30] I'm able to work very interactively, very [inaudible 00:23:35] to get to the right result.
So the first transformation is in the column [inaudible 00:23:41], it's Danish for account number. And the first thing that stands out for me, and you can probably see it now, is: why is there a .5? That's simply incorrect. So we have to remove that value from the column. So by highlighting the number, firstly, you can see what [00:24:00] the action would be, but I'd like to direct your attention to the bottom of the screen, to the suggestions box.
So there's a number of different suggestions. We can extract, which we're doing now, we could replace, count occurrences, a lot of different options. And even within the same box, we can extract on different criteria. So say for example, I just wanted to extract a CM account number. So in this instance, [00:24:30] we just want to remove the incorrect .5 value. So I add that to the recipe, and then that's corrected.
And then just for housekeeping, I'll drop the original column. Additionally, it's in Danish, so I have to rename it, so I'll rename the column, and once again this is exceptionally straightforward. And [00:25:00] each time I click to add to the recipe, it's adding a line to the wrangle script that's held here. So each action is an individual step, very easy to follow. And you need no technical knowledge to do that.
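[Illustrative sketch, not part of the spoken demo: the extract, drop and rename steps just described, written as plain Python. The column names `raw_account` and `account_number` are hypothetical stand-ins (the Danish column name is inaudible in the recording), and the pattern assumes account numbers are plain digit strings with an erroneous decimal suffix such as ".5".]

```python
import re

def clean_account_number(raw):
    """Return the leading run of digits, dropping any decimal suffix such as '.5'."""
    match = re.match(r"\s*(\d+)", raw or "")
    return match.group(1) if match else None

def apply_account_step(row, source="raw_account", target="account_number"):
    """Extract the cleaned value, drop the original column, and store it under the new name."""
    cleaned = dict(row)  # copy so the input row is left untouched
    cleaned[target] = clean_account_number(cleaned.pop(source))
    return cleaned

# Hypothetical rows; real account numbers are not shown in the transcript.
rows = [{"raw_account": "1234567890.5"}, {"raw_account": "9876543210"}]
cleaned_rows = [apply_account_step(row) for row in rows]
print(cleaned_rows)
```

[Keeping the account number as a string rather than converting it to a number avoids losing leading zeros.]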
The next transformation is actually on currency code. So we needed to show the three-character ISO currency code, [00:25:30] for example, USD. But right now, these are represented as numeric values. We created a lookup table, and then applied that. And after those transformations run, you can very much see that there's a lot more value in the data already. So in the old column, it's just telling me one, two, four, nine. Here, I'm actually seeing [00:26:00] recognizable business values. I understand DKK, EUR, Swedish krona, etc.
And once again, I'll drop the original column and rename. There you go. So in instances where we have a wide volume of data with small significant differences, as is shown, what's really stopping us from making [00:26:30] a consistent data set is the way data's structured, and we're resolving that right now. And additionally, we're looking at quite a substantial slice of the whole data set, and we're really reducing that risk that I raised earlier about making incorrect assumptions about the data.
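[Illustrative sketch, not part of the spoken demo: a lookup-table mapping from numeric currency codes to three-letter ISO codes. The source system's internal numeric codes are not given in the recording, so this example substitutes the ISO 4217 numeric codes as a stand-in; the column names are hypothetical.]

```python
# Stand-in lookup table using ISO 4217 numeric codes; the bank's internal
# codes are not known and would replace the keys here.
CURRENCY_LOOKUP = {
    "208": "DKK",
    "752": "SEK",
    "578": "NOK",
    "978": "EUR",
    "840": "USD",
}

def map_currency(row, lookup, source="currency_num", target="currency_code"):
    """Replace the numeric code column with its three-letter equivalent."""
    mapped = dict(row)
    code = str(mapped.pop(source)).strip()
    mapped[target] = lookup.get(code)  # None marks an unmapped code for review
    return mapped

rows = [{"currency_num": 208}, {"currency_num": "978"}, {"currency_num": "999"}]
mapped_rows = [map_currency(row, CURRENCY_LOOKUP) for row in rows]
print(mapped_rows)
```

[Leaving unmapped codes as None, rather than guessing, keeps gaps in the lookup table visible in later data quality checks.]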
The final transformation I'd like to show is converting a value to a string. So the column here is actually an indicator of a debit or credit, a positive or negative value. [00:27:00] So I'd like to replace, so once again I look at the suggestion box for replace, and I can see what the preview would be. And this isn't an out-of-the-box transformation that works, so I'll do modify. And it's simply where we want to represent the debit; I'll add that to the recipe.
And I'll do the same transformation [00:27:30] for the credit. Once again, selecting the replace. And adding the value that is needed. And then finally, I'll change the name of this column. So I [00:28:00] hope I've given an idea of the real acceleration in speed that Trifacta offers us instead of having to get basically an Excel based extract from a system that's typically out of date as soon as it's extracted. Having to go over that Excel extract, obviously in Excel you can find it where there's blank values, but in a data set of 110,000 rows, sometimes that can be quite laborious. [00:28:30] Trifacta really speeds up our analysis.
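[Illustrative sketch, not part of the spoken demo: replacing a sign indicator with a readable debit or credit label. The recording does not say how the source system encodes the indicator, so the "-" and "+" encoding, the labels, and the column names are assumptions.]

```python
# Assumed encoding: "-" marks a debit and "+" marks a credit.
INDICATOR_LABELS = {"-": "DEBIT", "+": "CREDIT"}

def label_indicator(row, source="sign", target="debit_credit"):
    """Swap the raw sign for a label and rename the column in one step."""
    labelled = dict(row)
    raw = str(labelled.pop(source)).strip()
    # Unrecognised values are passed through unchanged so they stay visible.
    labelled[target] = INDICATOR_LABELS.get(raw, raw)
    return labelled

rows = [{"sign": "-"}, {"sign": "+"}, {"sign": "?"}]
labelled_rows = [label_indicator(row) for row in rows]
print(labelled_rows)
```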
So I've finally done the wrangling steps I wanted to demonstrate, and we've now got our [inaudible 00:28:43] wrangle script. And we use this for two purposes today. First, it's the golden source of all the transformations we did, so in future months when we want to find out why we did what we did or what transformation was needed, we refer to this centralized resource. [00:29:00] And second, we provide this information to our team who are creating a [inaudible 00:29:07] to inform them exactly what needs to be done on the ingest.
So the final step is to generate results, and this applies the wrangle script to the data set. So right now we're doing it on a third of the data set. When we run this process, it'll apply the transformations to the entire data set and make [00:29:30] it ready for use by Arcadia. But just to round off my intro to Trifacta, the advantages my team has seen: it really gives us the ability to wrangle data at scale, with clear visibility of the data. The business SMEs are really keen compared to the previous environment. They're actually quite delighted by the clarity that Trifacta offers them.
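[Illustrative sketch, not part of the spoken demo: the idea of a recipe as an ordered list of steps that is previewed on a sample and then run unchanged over the full data set. This is not Trifacta's wrangle language; the step functions, column names, and data are hypothetical.]

```python
import random

def strip_whitespace(row):
    """Trim surrounding whitespace from every string value."""
    return {key: value.strip() if isinstance(value, str) else value
            for key, value in row.items()}

def rename_amount(row):
    """Rename the hypothetical 'amt' column to 'amount'."""
    renamed = dict(row)
    renamed["amount"] = renamed.pop("amt")
    return renamed

# The recipe is just the ordered list of steps, so it doubles as documentation.
RECIPE = [strip_whitespace, rename_amount]

def run_recipe(rows, recipe):
    """Apply every step, in order, to every row and return new rows."""
    result = []
    for row in rows:
        for step in recipe:
            row = step(row)
        result.append(row)
    return result

full_data = [{"amt": f" {n}.00 "} for n in range(9)]
sample = random.Random(0).sample(full_data, k=3)  # seeded for repeatability
preview = run_recipe(sample, RECIPE)    # interactive check on roughly a third
final = run_recipe(full_data, RECIPE)   # same recipe over the whole data set
print(len(preview), len(final))
```

[Because each step returns a new row, the source data is left untouched and the recipe can be rerun at any time.]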
And it's [00:30:00] also a very easily repeatable wrangle process. So as Alasdair referred to, we've got in the hundreds of systems that need to be addressed. And doing this allows us to get patterns so we're very efficient in the ingest. It's also allowed IT to step away from owning the data wrangling process, and we can now deliver data transformation by content.
[inaudible 00:30:26] Arcadia, so in the real world, our generate results [00:30:30] run has applied the transformations and we've got a new transformed data set. And now this data set is sitting on our [inaudible 00:30:41], ready for use by Arcadia or any other applications. So firstly, an introduction to Arcadia.
We use Arcadia for business intelligence and data visualization at scale. It pretty much avoids and resolves the legacy [00:31:00] issue we had where data sets used for data visualization were being held locally on individual machines, because we simply didn't have a tool that allowed us to do what Arcadia does. We can create a dashboard for visualization that's pretty much appropriate for the business area that needs to see the data, so we could present the profit on currency transactions to an FX trading team to give them visibility of [00:31:30] client activity per currency.
Or an ops team that wants to have an idea of potential transaction fits. So on this dashboard, we've got three tabs that I'll go through. Overview, transactions, and data quality. But firstly, I'll walk through overview.
So in the system this data set's taken from, the number of debit and credit transactions should be roughly equal each business day. There will be a minor delta but not [00:32:00] a major delta. We can see that, in this particular system, there seems to be a major delta. So we can actually see the detailed data, using Arcadia first to generate that lead; there's a clue there's something wrong. And now we're actually looking at the transformed data set that we worked on, so we can see the currency codes that were transformed, we can see the debit or credit indicator.
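[Illustrative sketch, not part of the spoken demo: the daily debit versus credit check described above, flagging business days where the counts differ by more than a tolerance. The 5% tolerance, the labels, and the dates are assumptions chosen only for illustration.]

```python
from collections import Counter, defaultdict

def daily_imbalance(transactions, tolerance=0.05):
    """Return {day: count delta} for days where debits and credits diverge.

    `transactions` is an iterable of (day, indicator) pairs, where the
    indicator is "DEBIT" or "CREDIT". A day is flagged when the absolute
    difference exceeds `tolerance` as a share of that day's transactions.
    """
    counts = defaultdict(Counter)
    for day, indicator in transactions:
        counts[day][indicator] += 1
    flagged = {}
    for day, per_day in counts.items():
        total = per_day["DEBIT"] + per_day["CREDIT"]
        delta = abs(per_day["DEBIT"] - per_day["CREDIT"])
        if total and delta / total > tolerance:
            flagged[day] = delta
    return flagged

# Hypothetical data: the first day balances, the second does not.
transactions = (
    [("2017-03-01", "DEBIT")] * 10 + [("2017-03-01", "CREDIT")] * 10
    + [("2017-03-02", "DEBIT")] * 15 + [("2017-03-02", "CREDIT")] * 5
)
print(daily_imbalance(transactions))
```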
Now I'm looking [00:32:30] at the actual data set, ready to quickly investigate any potential issues. Then we could look at the systems upstream that actually feed into it, so we've got five systems, and we've also got five different client types, so for example we could click on a client type. Once again, we could drill into a specific [00:33:00] theme. Let's say we want to find out what currency transactions client type E is using. Let's do that. And we see it's naturally Danish [inaudible 00:33:15] transactions that are by far the most popular, but we're also seeing activity in Euros, [inaudible 00:33:25].
And this makes the data set [00:33:30] much more accessible; it really allows us to derive value from our latent data. We're exposing it in the appropriate way to different audiences, and once again, we're really getting positive feedback. We can also present the data in a graphical way, so firstly the transactions by currency, and I think you can see immediately, there's a potential data quality issue. [00:34:00] All currency transactions should have a [inaudible 00:34:02], so it's something we should be investigating.
Or equally we could be using this for sales purposes, so looking at the top 25 accounts, these are the customers we should be taking out to dinner more, because they're doing lots of business with us. Equally this could be reversed to see our bottom 25 accounts. Why aren't these customers doing business with us? What could we do to improve our level of service?
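[Illustrative sketch, not part of the spoken demo: ranking accounts by transaction count to obtain a top-N and bottom-N list, as in the top 25 and bottom 25 views described above. The account identifiers are made up, and the tie-breaking rule is an arbitrary choice for this example.]

```python
from collections import Counter

def rank_accounts(account_ids, n=25):
    """Return (top_n, bottom_n) lists of (account, transaction count) pairs.

    `account_ids` holds one entry per transaction. Ties are ordered by
    account id so the result is deterministic.
    """
    counts = Counter(account_ids)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    top = ranked[:n]
    bottom = ranked[-n:][::-1]  # least active account first
    return top, bottom

# Hypothetical transaction log: one account id per transaction.
log = ["A"] * 3 + ["B"] * 2 + ["C"]
top, bottom = rank_accounts(log, n=2)
print(top)
print(bottom)
```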
Moving on to [00:34:30] the transactions, Arcadia is dynamically pulling the data from the underlying tables, and that's why you may see a very minuscule delay when we're populating tables. So we see a transaction count by currency. Once again, now, very easy to interpret, and it gives us clues. Like why are we seeing such a trough here or a peak here? [00:35:00] Was there a macroeconomic event that was triggering this, for example? Or once again, is there anything we can do to improve or enhance our level of service?
We can also identify another DQ issue, like what's this outlier doing here? Once again, that links back to the previously identified issue of the transactions [inaudible 00:35:20]. We can additionally slice the data at a more refined level by looking at different [00:35:30] product types. These product types have been masked for the demo; they're different products that we offer at our firm.
So this allows pretty much a drill to the underlying information. Chord diagrams [00:36:00] are another good way of exposing the information. So we can slice by currency, the system name, and client type, seeing connections and getting clues as to links between different sets of data that may not have been visible just by looking at the sketches. Arcadia is configurable, [00:36:30] so we can slice and dice very much on the fly. Right now these chord diagrams are primarily sorted by currency. It's a simple matter of sorting by system or client type as well.
The final tab that I'd like to show is data quality. So in this, we're comparing the raw data set, pre-Trifacta transformation, with the transformed data set, the one [00:37:00] we've just done. And I'll just sort the data. So we'll look at one, so it goes debit, credit, the transformed field, and here we go. Trouble key one. We can do just a quick transformation and analysis, so we've got 55,212 credits [00:37:30] and transformed. And there's actually a discrepancy of 20 here, once again giving us a clue that something is worth investigating.
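[Illustrative sketch, not part of the spoken demo: reconciling category counts between a raw column and its transformed counterpart, which is one way a discrepancy such as the 20 rows mentioned above could surface. The encoding, labels, and sample data are assumptions; the figures from the demo are not reproduced here.]

```python
from collections import Counter

def reconcile(raw_values, transformed_values, mapping):
    """Return {label: raw count minus transformed count} where the counts differ.

    `mapping` translates raw values to the labels used after transformation;
    raw values with no mapping are counted under "UNMAPPED".
    """
    expected = Counter(mapping.get(value, "UNMAPPED") for value in raw_values)
    actual = Counter(transformed_values)
    labels = set(expected) | set(actual)
    return {label: expected[label] - actual[label]
            for label in labels if expected[label] != actual[label]}

# Hypothetical data: one credit was wrongly labelled as a debit downstream.
mapping = {"+": "CREDIT", "-": "DEBIT"}
raw = ["+", "+", "-"]
transformed = ["CREDIT", "DEBIT", "DEBIT"]
print(reconcile(raw, transformed, mapping))
```

[An empty result means the two data sets agree on every category count; any non-empty result is a lead to investigate, not proof of an error.]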
So the self-service aspect of Arcadia and Trifacta means we get productive quickly. We aren't restricted by waiting for a big-bang batch delivery anymore. Once we've got a data set, we can immediately start to work on [00:38:00] it. It really means we're not blocked by any traditional business and office persistence, and we can really get on with the quick delivery of business. Thank you. I'll just hand control back to you now, Paige.
Paige Schaefer: Thanks, Paul.
Alisdair A.: So for the remainder of the presentation, I think it's me next. [00:38:30] So as wonderful a colleague as Paul is, and as brilliant a demo as that was, there is a horrible truth here that this stuff is sort of dull as dishwater. We can't dress this up at all, but there is a hard commercial reality here that we spend an awful lot of money on these kinds of tasks, [00:39:00] so what has been the business impact and the business payback of rolling out not only Trifacta and Arcadia, but the other technologies as well? Don't wanna call that a platform. So we've been able to ... Paige, there's no slide showing, if you're able to put them up.
We've been able to realize some hard numbers on the back of the ambitions [00:39:30] and the targets that we set, and I can't remember them off the top of my head. Paige, if you could put the slide up.
Paige Schaefer: Yeah, if Paul could just pass the presentation back to me.
Paul Lynn: It's stopped showing, Paige.
Alisdair A.: Right, I can get my present up.
Paul Lynn: Apologies, it's there now.
Paige Schaefer: Great, thank [00:40:00] you so much.
Alisdair A.: Okay, so as I said, we were measuring this by the numbers in terms of cost and number of people, and as Paige mentioned at the intro, we have circa 400 people in IT who are actively engaged in these tasks. I see [00:40:30] one question in the Q&A: what is wrangling? Well, wrangling is really ETL. Okay, so taking raw data and refining it using transformations, in this case for business quality and also for business homogeneity and standard definitions.
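As a rough illustration of wrangling in this sense (raw data refined by transformations for quality and standard definitions), here is a minimal Python sketch. The field names, the `ALIASES` table, and the `wrangle` helper are all assumptions made for the example; the real rules would be defined by the business users in their own tooling.

```python
def wrangle(record, currency_aliases):
    """Return a cleaned copy of one raw transaction record, or None if unusable."""
    # Quality rule: the amount must parse as a number once thousands separators are removed.
    amount_text = str(record.get("amount", "")).replace(",", "").strip()
    try:
        amount = float(amount_text)
    except ValueError:
        return None
    # Standard-definition rule: map free-text currency names onto one agreed code.
    currency = str(record.get("currency", "")).strip().upper()
    currency = currency_aliases.get(currency, currency)
    return {"amount": amount, "currency": currency}

# Hypothetical alias table and raw rows, for illustration only.
ALIASES = {"EURO": "EUR", "KRONA": "SEK"}
raw_rows = [
    {"amount": "1,250.50", "currency": " euro "},
    {"amount": "n/a", "currency": "SEK"},
    {"amount": "300", "currency": "sek"},
]
clean_rows = [row for row in (wrangle(r, ALIASES) for r in raw_rows) if row is not None]
print(clean_rows)
```

Rows that fail a quality rule are dropped here for brevity; in practice they would more likely be routed to a review queue so the discrepancy stays visible.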
What are the rules that we have to apply on data in order to do that? And we have 400 people in IT that do that on the legacy platform alone, and we have 180 just working in change. So [00:41:00] that, combined with the business colleagues that are engaged in this, means that we have a huge rationalization opportunity in saving effort in this space. So the comparative results we've seen when we look at [inaudible 00:41:12], so this is traditional ETL platform to data lake platform using the tools that we demonstrated in the data quality environment: we've been able to shift from managing [00:41:30] the structure, which is what ETL folks and the model folks are sort of obsessed with, to managing the content, which is all the business users really care about: is the actual content of our information consistent, high quality, trusted? So that's been a big change for us.
The fact that in the market space, we've been able to reduce the change time from weeks to days and from months to weeks, and at the same time we've been able to use the scale-out [00:42:00] architectures available within Cloudera to take data processing into a couple of hours, has been a significant impact for us. And the accredited processing we've pointed to there is really taking all the assets on the bank's balance sheet and examining them against all the liabilities. This is a Basel III standard, and we're able to take it to 120 minutes, using quite a small cluster and the Spark technology, down from two days on traditionally engineered [00:42:30] proprietary technology.
We've also been able to realize the agile goals that we set ourselves, moving from large monolithic issues to a microservices architecture, a single [inaudible 00:42:43], and deploy components, even deploy them in a path and roll things in and out without it being a massive wrench for us.
But the biggest kick in the bum is the OpEx piece that I spoke to earlier on. We've managed to put together an architecture, [00:43:00] and I should be quite open here that we're not replacing all relational technology, far from it. We just want to reduce that relational premium technology, to use it where it makes sense in our data marts rather than using it as something like an ETL engine and generating a massive data lake. So with that combined hybrid architecture, using both relational technologies, we're still able to realize the 73% decrease in our OpEx, and that's something that we'll be looking to realize [00:43:30] over the next two years as we move away from one way of doing things into more of a best-of-breed architecture.
With that, and with a little bit of time constraint, I'll just pause on that point, and I think we move to Q&A, is that right, Paige?
Paige Schaefer: That is correct.
All right, so if anyone has any questions, feel free to enter them into the chat. [00:44:00] All right, we have a couple questions to get started. So first is one that seems like a question for you, Alisdair, what are your plans for the next 12 to 18 months?
Alisdair A.: Well, there's a couple of strands to that. In 12 to 18 months we're basically looking to make the platform the default resting place for all our information in the organization required for reporting and analytics. So that [00:44:30] will mean we will be plumbing in the platform to our core banking platform, for example. At the same time, de-leveraging from the legacy platform and moving the transaction loads off of it and onto the new platform. The other thing, in terms of an incremental or an additional capability, our main activity will be in the analytics space. We will be working with both our colleagues in digital and also [00:45:00] our financial compliance program to build a scale-out architecture that allows us to do analytics on all our data sets.
And we've already seen great success in that space and the fraud space, so we'll be looking to make those analytical components available for generic use. So what represents a sort of entity resolution capability for financial crime can be equally useful for [00:45:30] the digital banking guys to understand our client, bringing in web-log information and all the structures that my associate Steve discussed at the start of the presentation.
Paige Schaefer: Great. It looks like we have another question about Cloud. So what is Nordea's Cloud strategy?
Alisdair A.: Well, we're just starting in that space. I think, along with a lot of [00:46:00] financial services, certainly in Europe (the U.S. is maybe a little bit different), we are looking to move to Cloud, I'd say within the next two years. We are in discussions with all the main vendors, and we would see the end state of this platform, for example, almost certainly being Cloud-based. But in order to do that, we have to be confident that we are adequately in control of our information, [00:46:30] so we can point to personally identifiable information and make sure either we've protected that such that we can take it to the Cloud sufficiently, or that maybe the bank's risk appetite is that we're not going to put that kind of data in the cloud.
That's not to say it's good or bad, that's just about an individual institution's risk appetite, or acceptance of essentially other people's computers, let's be honest here. So over the next two years, we will be [00:47:00] starting to move toward a more Cloud-based infrastructure across the bank, and we'd see that this platform will be a major beneficiary of that, so maintaining all this infrastructure on premise will not really allow us, I don't believe, to get to our end state of all the bank's data in one place.
Paige Schaefer: Awesome, thanks. And it looks like we have another question. The question is, "By getting data to the end users [00:47:30] faster and more interactively, are you seeing more value generated by having subject matter experts drive the process?"
Paul Lynn: Hi, Paige, it's a good question. The short answer is, absolutely. So by providing our tooling to the end user, we really are enabling and freeing up the SME to display their knowledge and experience. Nobody knows the data better than the SME. By abstracting data [00:48:00] teams away from this process, we're just providing the supporting infrastructure as a means for doing things more efficiently. We're not making the same data assumptions. It's hard to have an assumption when the data's right in front of you in the three.
We're doing things more quickly, 'cause we're getting the data sets from a number of different systems keyed up on Trifacta ready for use, so I have to say absolutely. We really are [00:48:30] seeing a step change in our project velocity.
Paige Schaefer: Great, thanks. And actually, Steve and Shant, if you wanna jump in and answer that question as well just from what you've seen in your experience.
Steven Totman: I mean, from a Cloudera perspective, 80% of all workloads are running in the Cloud today, so there's already been a pretty dramatic evolution there. Purely, for me, the Cloud deployment as well, they have a [inaudible 00:49:00] deploying [00:49:00] across Google this year, and even partners like CenturyLink. But even what's interesting for us, just really two weeks to get a startup, we launched a Cloudera [inaudible], which was a fully hosted self-serve model as well. So yeah, it is really interesting to see how many customers are moving that way. Initially, I think they started using it more as a way of proving out the technology, but now it's more and more production. I actually sat on a panel about two months [00:49:30] ago with the CTO of Microsoft and the CTO of Amazon.
And the estimate from them was that by 2018, 80% or more of deployments will be in the Cloud. It's moving very quickly.
Shant Hovsepian: Yeah, and Shant here too. The point about enabling SMEs with faster, more direct access to data is definitely a huge trend we've seen that really enables success. Over the last [00:50:00] five or ten years, we've been seeing things like shadow IT and data silos and the constant struggle between the business team and sort of the back office, especially in financial services, about governance and data access, each one trying to do whatever it takes, either not responding to IT support tickets or making copies of data in their own places to get their own productivity.
We're finally entering an era where the example Paul gave beautifully illustrates how, if you get an SME in the room together with the technical folks, instead of always being at odds with each other [00:50:30] they can really work together and get results and solve a problem really quickly. It's a tremendous cost savings and a tremendous realization of the value of the different data sources all the organizations have tucked away in a bunch of places.
Paige Schaefer: Great, thanks, Shant. And one last final question for Alisdair and Paul, based on the use case that you discussed today and the technologies that you're using, what have you learned that you think other people could benefit from?
Alisdair A.: [00:51:00] Well, the real value for us is the fact that our data is in one place and we don't have to worry about having to buy extra [inaudible 00:51:13]. We know we can scale out to fairly substantial volumes, and that follows a linear cost model; actually it tails up at the upper end. So the real advantage, the lesson learned, is that when we're selecting our vendors, [00:51:30] we need them all to play nicely with each other. The real value of the continuity that Paul showed was that the metadata in Trifacta is actually produced by another tool, the tool called Waterline, and it's inherited by Trifacta, and that metadata is available within the wrangle script, and that's played back in through the managed navigator and available to Arcadia.
So the fact that all these [00:52:00] components are having a single conversation under the umbrella of a secured platform, which is what Cloudera provides, with integration with all our [inaudible 00:52:12] and identification systems, really is the big lesson for us. If we have to take data out of the platform in order to do a job, and classical ETL is a great example of that, we start to lose visibility of the transformations that happen. We give ourselves a data lineage challenge.
[00:52:30] We also give ourselves a data security challenge. So our biggest lesson learned from what we've done is the value of adherence to the APIs within the platform, and really the true value of open-source-based tools and the collaboration that that enables.
Paige Schaefer: Great, thank you. All right, so that does it for our Webinar today. If you're interested in learning more about the technologies [00:53:00] that you've heard about, we have a few resources listed below, so feel free to check these out. But thanks again for joining, and hope to see you again at one of our other Webinars.