

RegTech: Leveraging Alternative Data for Compliance

Leading banks and asset managers are leveraging internal and external alternative data sources to improve compliance and regulatory oversight.

Join Richard Johnson of Greenwich Associates and Paul Lashmet of Arcadia Data for an informative panel discussion with industry experts from Nordea and RBC Capital Markets discussing how to:

  • Rapidly access and analyse numerous different data sources
  • Develop RegTech solutions while managing total cost of ownership
  • Optimize internal data management through Big Data capabilities
  • Enhance compliance processes using alternative data
  • Meet heightened regulatory expectations around timeliness and quality of data

Moderator: Richard Johnson

Senior Analyst, Market Structure and Technology | Greenwich Associates

Panelist: Alasdair Anderson

Executive Vice President | Nordea Bank

Panelist: Shailesh Ambike

Associate Director - Institutional Advisory & Surveillance | RBC Capital Markets

Panelist: Paul Lashmet

Practice Lead and Advisor - Financial Services | Arcadia Data


Richard Johnson:

Hi everyone. We're ready to get started. Thank you all for joining, and welcome to the Greenwich Associates webinar, sponsored by Arcadia Data, on RegTech: Leveraging Alternative Data for Compliance.


I'm Richard Johnson. I work in the Market Structure and Technology Group here at Greenwich, and I will be your moderator today.


Before I introduce our panel, let me cover a few housekeeping items.


Questions are welcome throughout the webinar. Please type them into the question box, and we will try to answer them throughout the webinar as time allows. If you have additional questions after the webinar, we'll provide contact information at the end of today's session.


Also, if you hear an echo, please check your audio and make sure that telephone is selected.


Now, for those of you not familiar with Greenwich Associates, we're the leading provider of global market intelligence and advisory services to the financial services industry. For over 40 years, we've been helping our customers understand their competitors' positioning, refine product strategy and development, and increase sales.


We have unique insight, based on over 60,000 interviews conducted per year with participants, asking them about their usage of different brokers, venues, and products. We help our clients reach their target audience, better understand the factors that drive their buying decisions, and identify new market opportunities.


So let me now introduce our speakers today.


Joining us will be Paul Lashmet, who is the Practice Lead and Advisor for Financial Services at Arcadia Data. Arcadia Data is a visual analytics platform that allows business users to make sense of high volume and varied data, in a timely, secure, and collaborative way.


Also on the line is Alasdair Anderson. He's Executive Vice President of Data Engineering at Nordea Bank, where he is responsible for a big data strategy that supports all core bank functions. Alasdair speaks frequently throughout Europe on the topics of data management, analytics, and innovation, and his perspective includes revenue generation and how to satisfy ever-growing regulatory and compliance demands.


Shailesh Ambike is Associate Director at RBC Capital Markets, and has most recently focused on big data analytics as it relates to equity, market trading, and compliance. His focus areas include, but are not limited to, best execution, electronic trading control, foreign trading, and high-risk securities through DMA.


So let's get started on the topic today. We're gonna be discussing something that's been getting a lot of attention recently, and that is alternative data. A recent Greenwich Associates research paper looked at the increasing use of alternative data in alpha modeling, and found that 80% of [inaudible 00:02:54] traders and portfolio managers wanted greater access to alternative data.


Today we're gonna be looking at the topic not from the alpha generation angle, but rather from the regulatory compliance angle. Compliance and RegTech is another extremely hot topic in the finance space, with banks and asset managers investing heavily in headcount and technology to ensure they're doing everything they can to comply with current and upcoming regulations.


We're gonna be looking at how to develop RegTech solutions while managing the total cost of ownership, how to optimize your internal data management, enhancing the compliance process using alternative data, and meeting heightened regulatory expectations around timeliness and quality of the data.


So just to kind of level set everyone on what we're talking about here when we say alternative data, we're defining it as data often generated by novel data analysis techniques. Alternative data are unique data sets that, by themselves, or in conjunction with traditional market data, could provide additional insights and competitive advantage.


There are many different types of alternative data, such as web traffic, social media data, satellite imagery, internal company data, private company data, and so forth. And we're gonna be touching on a few of these during today's webinar, as they relate to compliance.


So, Paul, let's get started with you. Do you kind of agree with that definition with respect to alternative data, and what are some of the challenges that a firm may face when trying to integrate such data into their processes?


Paul Lashmet:


Thank you Richard. I think we'll have a great discussion here, of an interesting topic. I think the definition you talked about pretty much covers it. But I want to point out its purpose. The purpose of alternative data is to provide additional insight. So, in compliance, this would be data that will help connect the dots leading to a suspect transaction, or help to mitigate false positives.


With regard to the types of alternative data, there are a bunch listed here, some very novel, like sensor data, and location intelligence. But I would add data organizations already have, like chat, email, and voice, and you can consider it alternative if you have it, but you apply it in a unique way.


To your second question: what are the challenges? Number one is getting enough data, making sure it is good quality, and that it is diverse enough to provide actionable insight. So the next slide that we have is an example of diversity. This is an Arcadia Data application. Arcadia Data provides visual analytics that can blend all types of data.


Here we take live data to show real time alerts, and place those alerts into another context to generate actionable insight. So in this case, location intelligence. But it also allows you to look at the historical significance. So the use of alternative data is exploratory in nature, and a regulator could ask for this insight at any time. So to satisfy this requirement, you'd need a visual analytics platform that is able to show real time and historic data in one application, analyze all types of data in place without moving it, and allow business users to do this in a self-service fashion.


Richard Johnson:

Here is a question coming in from the audience. Can you explain what you mean by false positives?


Paul Lashmet:

So in one case, let's say you're getting a lot of alerts. With trade surveillance, there are algorithms in the background generating all these alerts. What you can do is pull in all the underlying data, use Arcadia Data to look at it, and get more insight into it, so you can start to ask: okay, we got this alert, but does it really mean anything? So that's what I mean by being able to drill down into the data and get the insight out of it.


Richard Johnson:

Okay. Shailesh, moving on to you. Can you speak more specifically about the types of alternative data sources that you're looking at, at RBC, that would be relevant in a compliance context?


Shailesh Ambike:


Sure. So just a little ... Just to step back a bit. We're a sell side integrated broker dealer owned by a bank with global operations, integrated with several jurisdictions, several regulatory requirements globally. So to manage that critical mass and to conduct compliance and surveillance in an effective way, it's becoming more important for us to be able to integrate our transactional data with some of the more non-traditional types of alternative data. That would include any type of news or blogs, or internal electronic communications, such as emails, social media, chat.


The idea here is that we need to better surveil our transactional ... Our transactions that are conducted either by internal traders or by clients, with any type of data or messages such as social media, in order to better surveil and uphold topics such as inside information, information barriers, anti-competition.


Our regulator is requiring now ... They're placing more scrutiny on this type of topic. And so it does require us to use that type of data, and integrate that with our more standard, traditional, transactional data. And we're seeing this a lot in certain topics such as e-communication, so electronic communication surveillance, as well as surveillance of our global direct electronic access client base, which actually has access to local and foreign marketplaces throughout the world.


Richard Johnson:


Okay, thanks Shailesh. I'm turning it over to Alasdair now. Can you tell us a bit about what you're doing over at Nordea?


Alasdair Anderson:


Sure, happy to, Richard, thanks. And so just for people who are maybe not familiar with the brand, Nordea is the largest financial services organization in the Nordic region. So we're covering Denmark, Sweden, Finland, Norway. And we're an organization with a balance sheet of around 800 billion euros.


And I'm actually quite new to Nordea. I joined the company just over a year ago. And it was with a mandate to, really, evolve the core data platforms of the bank, and tackle a number of challenges that we were seeing in our current delivery, as well as build something that would protect the bank from what we're seeing as future demand, specifically from the regulatory space.


So we've been working on attacking a number of key issues within the bank, which is, first and foremost, the total cost of ownership of our data estate. And we were using, and we still are using today, and will continue to use, a lot of closed-source, proprietary technology. And the challenge we were finding with that is that the expectation of both our customers and our regulators is that we store, analyze, hold, and go to work with a lot more data. And to be able to do that, we really had to attack the total cost of ownership. So we've been working on leveraging open-source technologies and big data technologies to radically reduce that total cost of ownership.


And the second thing we've been focused on is really attacking the sort of speed to insight, or the time of delivery of data products. Using a highly structured relational database-style technology, you're sort of constricted by the relational model, and it takes you quite some time to get your data product to market. And we see that both in the regulatory space, but also we see it in the sort of digital analytics and insight space. And we wanted to, again, leverage more of the NoSQL-type solutions, to allow us to really change our time to market metrics.


And the last thing we were looking to do is really leverage the new wave of technology and ways of working that people are beginning to bring to a lot of [inaudible 00:11:56] and attack both sides of the balance sheet, both for upside benefit, and look after downside risk.


So the advanced analytics capabilities, the ability to use AI and machine learning techniques, and the ability to use advanced visualization tools, is something that we've been adding into the mix. In the reg space, this is, you know, to better understand our business as it flows in both real time and as a [inaudible 00:12:26] analysis. And be able to identify potential risks to the bank, or systemic risks to the financial market.


Richard Johnson:


Okay, that's interesting. I'm definitely hearing a lot more about AI and machine learning being used in the compliance ... in RegTech and compliance. Mainly around analyzing real time market data, so looking for trading abnormalities, and so forth. So it's interesting to hear you talk about leveraging that technology for other kind of internal types of data sources.


And Shailesh, I want to come back to you. Can you kind of tell us a bit more about some of your specific experiences in leveraging alternative data and compliance?


Shailesh Ambike:

Sure. So one example would be looking at our high-risk trading. So this is trading in sub-dollar, or what's known as penny stocks. They're usually companies that are capitalized, and have been implicated in certain instances of pump and dump. There's a lot of regulatory scrutiny on this, from both a market conduct and an AML standpoint. And so what we're trying to do is integrate our transactional data with our databases, to understand, and to get a little bit more intimate knowledge of our client trade flow. So that, you know, from a [inaudible 00:14:00] sort of 80/20, where should we focus our risk, or our time, when we're reviewing this type of transactional data? Certain clients trade in index stocks, some of the [inaudible 00:14:12] issues.


But there are others who, just based on the nature of what they're looking for, are transacting in sub-dollar penny stocks. And so what that requires of us, from a compliance standpoint, is to better integrate our transactional data, and review that against news. Sort of some of the more non-traditional types of news and news sources. So that we could generate those hits, if you will, and then flag that trading, and flag those clients, and then do more of a due diligence on that, and get ahead of that, so that we could review that either from a market conduct standpoint, or even from an anti-money laundering standpoint. The rules are becoming more stringent on that. There are predicate offenses that we need to be ahead of, and so, from our standpoint, just looking at the market conduct alone won't give us that type of intelligence.


We do need to integrate non-traditional data in that line of business and that trade flow, to get ahead of any type of, you know, reputational risk, or any type of market conduct issue we have with someone that's trading through our infrastructure.
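
The matching described here, sub-dollar trades reviewed against news on the same name, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Trade` and `NewsItem` records, the 24-hour window, the one-dollar cut-off, and the sample data are hypothetical and are not RBC's actual surveillance logic.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Trade:
    client: str
    symbol: str
    price: float
    time: datetime


@dataclass
class NewsItem:
    symbol: str
    source: str
    time: datetime


def flag_penny_trades(trades, news, window=timedelta(hours=24), max_price=1.0):
    """Pair each sub-dollar trade with news on the same symbol inside the window."""
    hits = []
    for trade in trades:
        if trade.price >= max_price:
            continue  # not a sub-dollar name, so out of scope for this check
        related = [
            item for item in news
            if item.symbol == trade.symbol and abs(item.time - trade.time) <= window
        ]
        if related:
            hits.append((trade, related))
    return hits


# Hypothetical sample data, for illustration only.
trades = [
    Trade("client_a", "ABCD", 0.42, datetime(2018, 3, 1, 10, 0)),
    Trade("client_b", "INDX", 55.10, datetime(2018, 3, 1, 10, 5)),
    Trade("client_c", "WXYZ", 0.08, datetime(2018, 3, 1, 11, 0)),
]
news = [
    NewsItem("ABCD", "promotional blog", datetime(2018, 3, 1, 8, 30)),
    NewsItem("INDX", "newswire", datetime(2018, 3, 1, 9, 0)),
]

flagged = flag_penny_trades(trades, news)
for trade, items in flagged:
    print(trade.client, trade.symbol, [item.source for item in items])
```

In this sketch a hit is only a prompt for due diligence on the client and the trade; it says nothing by itself about whether the trading was improper.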


Richard Johnson:

So it sounds like what you're doing is you're looking at the trades coming in from clients and internal desks, and you're tying that up to things like email communication and chat communication. So you get a fuller picture of what was happening when trades come in. Is that right?


Shailesh Ambike:

That's right. As well as news. So there's more non-traditional news that doesn't come across the more traditional infrastructure, you know, the Reuters of the world. But there are news sources that we need to get ahead of, and integrate with that trade flow to better surveil that penny stock ... Those trades in those penny names.


Richard Johnson:

I think the real time aspect here is a key differentiator of using this type of technology. Previously, a few years ago, and perhaps even still now, I think a desk manager would get a monthly email review folder to look at. And of course it's very hard, you know, finding the context, when you're looking at data that's up to a month old. So I think that seems ... To me, that sounds like one of the key advantages here.


I want to come back to you, quickly, Alasdair. We had a question come in from the audience on the deep learning ... On the AI topic.


Are you using any semantic technologies to parse the alternative data?


Alasdair Anderson:

Not for the alternative data. It's actually a very good question. We are part of the Enterprise Data Management Council, who are developing a standardized set of semantic ontologies for all trading banks. It's a capital markets focus. But just, we were ... We're really hoping that that steps into the sort of risk management space, and helps us with the BCBS 239 compliance aspect.


To answer the question specifically, we're not actually looking at things like deep learning in the alternative data space. So our use of AI is various sort of permutations. So we're looking at regular automation robotics within the operation space. So that could be processing of things like KYC processes, and just automating things that were done by the back office.


We're also, within the data management space, we're looking to automate the tagging of data that comes through. So we're not using deep learning in this space, but we are developing business [inaudible 00:18:11] and ontologies that will be put into a supervised learning model, so the machine will come up with an actual suggestion of various tags. They could be semantic tags, whether it be a business tag, or a risk tag, or a cyber security tag.


In Europe, we're coming under the auspices of the EU's GDPR regulation, that's the General Data Protection Regulation. So we have to actively scan for all PII, and protect that data, as well as enforce things like the right to be forgotten. So we're using a supervised learning model there, to develop a profile of the information we're searching for.
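
As a rough illustration of what scanning for PII involves, here is a minimal sketch. It swaps in fixed regular expressions for the supervised learning model mentioned above, so it is a much simpler technique than the one Nordea describes. The pattern names, the patterns themselves, and the sample text are all hypothetical.

```python
import re

# Hypothetical patterns, for illustration only. A production scanner would need
# far broader coverage than two regular expressions.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "iban_like": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{10,30}\b"),
}


def scan_for_pii(text):
    """Return every match per pattern label, omitting labels with no matches."""
    found = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[label] = matches
    return found


def redact(text):
    """Replace each match with a placeholder naming the kind of PII removed."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()} REDACTED]", text)
    return text


sample = "Contact jane.doe@example.com about account SE3550000000054910000003."
print(scan_for_pii(sample))
print(redact(sample))
```

Fixed patterns only find the formats they were written for, which is one reason a learned profile of the information being searched for can be attractive.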


There is some experimentation going on in the deep learning space, to do with anti-money laundering. And it's to do with the challenge of how do we tune the models and know that the outcome of the model changes won't have a negative impact, so too many false positives, and maybe we miss an actual AML case. And the challenge we've got there is if we put in a model that has many false positives, we're obliged, legally obliged, to review all of those false positives. So we're starting to experiment with deep learning in that space, but it's very early days. So it's not something I can claim that we've licked right now.
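
The tuning trade-off described here can be made concrete with a small worked example. This is a sketch under stated assumptions: the scores, the labels, and the thresholds are invented, and `confusion_at` is a hypothetical helper, not part of any AML product.

```python
def confusion_at(threshold, scored_cases):
    """Count true positives, false positives, and missed cases at one alert threshold."""
    tp = fp = fn = 0
    for score, is_real in scored_cases:
        alerted = score >= threshold
        if alerted and is_real:
            tp += 1
        elif alerted:
            fp += 1  # alert raised on a benign case: still has to be reviewed
        elif is_real:
            fn += 1  # real case that raised no alert: a miss
    return tp, fp, fn


# Hypothetical (model score, is a real AML case) pairs, for illustration only.
cases = [
    (0.95, True), (0.80, False), (0.75, True), (0.60, False),
    (0.55, False), (0.40, True), (0.30, False), (0.10, False),
]

for threshold in (0.3, 0.5, 0.7):
    tp, fp, fn = confusion_at(threshold, cases)
    print(f"threshold={threshold}: caught={tp} false_positives={fp} missed={fn}")
```

On this toy data, the lowest threshold misses nothing but produces four false positives to review, while the higher thresholds cut the review load and miss one real case. That is the tension a model change has to be validated against.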


Richard Johnson:


It's a fascinating topic, and I've no doubt we'll be doing a whole webinar on that topic in due course.


So Paul, we got a [inaudible 00:19:41] insight into what Alasdair and Shailesh are doing with respect to using their alternative data. Is that typical of the kind of use case that you see clients of Arcadia Data using it for?


Paul Lashmet:

Yes, it is. And I think two of the points Alasdair brought up are quality of data, like how useful it is, not just if it's correct or not, and using innovative analytics to run models. But you need to validate those models for the regulators.


So, you know, there's two things ... Let's expand on quality of data. There's something on the next slide showing what I call dynamic data quality. So in this set of screens, we're evaluating counterparty data, as you would do for KYC. But also checking to see if there are any material changes that would affect any part of the trade flow. So how does counterparty data then affect transaction data?


So one could imagine a machine learning routine that identifies data quality risks in the background, allows you to see these patterns, and then allows you to go to those details. So this would satisfy some of the heightened regulatory expectations. For example, like with trade reconstruction, and things like that.


And then the other point is model validation. So, verifying your models. So, you know, at one level, looking across all your desks across the organization, you can see how they're responding to a series of models. That could be valuation models, or capitalization models, or even stress tests. And one could visualize the trends and patterns, but in the end, you need to be able to get to the granular details. And this proves modelability, which is that you're using real data.


So if you think about it, in this sort of scenario, and I think when Alasdair was talking about potential use of the deep learning, you could have your best of breed analytics platforms generating huge amounts of data. Here you blend them together on one visual analytics front end, that can summarize billions of rows of data, and allow you to drill down to specific events at any time.


Richard Johnson:

This one just came in. Isn't a black box the wrong paradigm?


Paul Lashmet:

I use that only because, when I have discussions with clients and prospects, we talk about what they use for big data: do you use Hadoop? And a lot of times, from the business point of view, they're not really thinking about the technology. One example I got from a compliance director was that big data to him was a really large, tall server with people around it that make it sing and dance. So he knows that there's a lot of value in it, but doesn't know the intricacies of it. So that's why I just put it as a black box in this scenario. But, you know, it's not. If you have a front end that gives you full visibility, that black box becomes very transparent. And I think that's the goal that we're trying to get to.


Richard Johnson:

Okay. Thank you. So I think one of the things about alternative data is that it's alternative. And that can make it harder to access and harder to use. In a recent study, we asked a number of asset managers and hedge funds, what were their obstacles to using alternative data.


High fees was the top answer. And many other firm-level constraints topped the list of obstacles, such as internal procurement procedures limiting the adoption of new data sources. Also, difficulty working with [inaudible 00:23:25]. Not customized, or not standardized to what people are used to was also ranked as a key obstacle.


I'm wondering if you guys can kind of talk about some of the challenges that may need to be overcome, and how people looking to implement alternative data for compliance ... How they can solve some of these obstacles. Maybe start with you on that, Shailesh?


Shailesh Ambike:

Sure. I think the key thing is having the right set of people: the subject matter experts on the business side, as well as the technology folks. So, I mean, it does require collaborative efforts to make this work and succeed. But that would probably be the key thing, the key initial thing. And then, I think, have kind of a roadmap, and look at the low-hanging fruit. Go for the easy wins. There's just a lot of data to consume, and you don't want to be lost in the myriad of data and trying to map everything. Try to have some type of simple goal that you first want to achieve, and then achieve it with the data that you have, as a functional tool, when you're trying to integrate non-traditional data with the more traditional data.


Richard Johnson:

Thank you. And Alasdair, what advice would you give to a bank, or financial services company, looking to implement an alternative data solution?


Alasdair Anderson:

So, I mean, the chart really resonates with me. The high fees to buy some different data set, would really challenge any business model. It would make it quite prospective, and would make it potentially high-risk. The advice I would give is just based on the experience we've had.


So we started off with public data sets, which have also got a very good cost to them, which is that they're free. So an example we had is that we brought in seismology data to overlay that with a real estate portfolio we had. And then build a client view of the potential exposure, should there be a seismic event in that region.


But I think the cost of the external data is missing the real point. The value is the data that your bank has, or your insurance company has, is the real value. And all you're doing is augmenting that data to give it additional insight. So in that case, we augmented it with seismology data, and we had that risk view.


We then took that use case and expanded it, to look ... We added payments information. And looked at, you know, the sort of supply chain views that we had developed as well, to make an analysis of, should there be a seismic event in this region, what would be the exposure to the overall supply chain, and therefore the credit lines that those individual customers had?


So it's a good example where we used alternative data, which was external and free, and it took us in a different direction than we originally proposed, which started with a real estate analysis.
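
The overlay described above, a free external data set joined to a portfolio the bank already holds, can be sketched in a few lines. All of it is hypothetical: the region names, the hazard scores, the 0.5 cut-off, and the loan records are invented for illustration and do not reflect Nordea's data or method.

```python
from collections import defaultdict

# Hypothetical public hazard scores per region (0 = low, 1 = high).
seismic_hazard = {"region_north": 0.1, "region_coast": 0.7, "region_south": 0.4}

# Hypothetical internal real estate portfolio.
portfolio = [
    {"client": "client_a", "region": "region_coast", "loan_value": 2_000_000},
    {"client": "client_a", "region": "region_north", "loan_value": 500_000},
    {"client": "client_b", "region": "region_coast", "loan_value": 750_000},
    {"client": "client_c", "region": "region_south", "loan_value": 1_200_000},
]


def exposure_by_client(portfolio, hazard, min_hazard=0.5):
    """Sum loan value per client for properties in regions at or above min_hazard."""
    totals = defaultdict(int)
    for loan in portfolio:
        if hazard.get(loan["region"], 0.0) >= min_hazard:
            totals[loan["client"]] += loan["loan_value"]
    return dict(totals)


exposure = exposure_by_client(portfolio, seismic_hazard)
print(exposure)
```

The internal portfolio carries the value here; the external scores only decide which slice of it to look at. Extending the join to payments or supply chain data would follow the same shape.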


So my advice to people on the call is, you know, there's a lot of information out there that you don't have to pay for. It really gets your business thinking about what the potentials of the technology are. And it will allow you to build a much more robust business investment case, should you wish to expand the use of the platform or the data technology.


Richard Johnson:

Well that's interesting. I didn't think I'd hear about asset managers using seismological data for their portfolio. Very interesting.


I've got another question coming in here. Have you had a compliance audit from a regulator yet? What was their feedback?


I'm guessing ... What do regulators say about alternative data?


Shailesh Ambike:

I think from a broker dealer standpoint, any sort of tool that we could use to better surveil for market conduct, or potential manipulative and/or deceptive trading, or any type of anti-competitive collusion, or anything in that regard, from a regulatory standpoint, I think our regulators applaud that. They see that we're taking proactive steps and measures, and that we're embracing the technology. And we're embracing the fact a lot of our trade flow now is outside of the traditional human being entering an order, and it's algorithmic, it's based on machines. And so to that extent, when we can have a surveillance system that can monitor that type of trading, and from a global standpoint, they'll definitely support that, based on their audits.


Richard Johnson:

Any comments on that, Alasdair, with respect to the regulators?


Alasdair Anderson:

Well, I can't speak to a specific audit. However, we have had very favorable feedback in the [inaudible 00:28:44] that we've developed. You know, they buy into the fact that we've reduced TCO and therefore we can use a lot more of our own information, as well as other sources of information.


They're also quite positive, and very positive, actually, about our intent to reduce time to market and the latency for data production. And really, one of our primary use cases that we tested in Nordea was on liquidity risk. And liquidity risk, you know, is about [inaudible 00:29:12] regulation that's coming in. And it's something that was probably measured at the end of the month. That's now moving to a daily measure, with an expectation that at some point that will move to intra-day. And the ability to develop real time architecture alongside your heavy-duty, batch-driven analytics architecture gets a very positive response from the regulators.


Richard Johnson:

Thank you. Another question here.


What type of skills do you need in-house in order to analyze all the data?


Maybe Paul, you want to start with that one?


Paul Lashmet:

Yeah. I just want to run off what we were talking about earlier about it being a collaborative process.


So in terms of skills, you're gonna need those data scientists, say, on the right hand side. But you need the subject matter experts that are gonna know about data, how to apply that data, and see if there's anything wrong with it. They all need to have accessibility to that data.


So just to say that, looking at organizational changes, risk and financial functions may start to blend a little bit. And so each group has to have access to the data to make their own insight available.


Richard Johnson:

Thank you.


Shailesh, how big of a team do you have over at RBC working on this? Is it a lot of people? Or can you do it on a fairly lean basis?


Shailesh Ambike:


Well, I mean, we have a team of about 20. But that's sort of broken down by specific lines of business. So from an equity standpoint, we have a team of four, including myself. And so we do both advisory and surveillance. We support the business, but we also do monitor all the transactions. And I think the idea here is that we need to equip our compliance officers with better tools, as opposed to additional headcount. I mean, headcount is important. But where you need it, and where resources prevail. But I think it's empowering those officers with the right tools, which is ... It's becoming increasingly required.


Richard Johnson:

Thank you. We're gonna ... We got about a couple minutes left, so if anyone has any more questions, please, please put them on the chat.


Another question I have here is ... And maybe we'll start with you, Alasdair.


How do you integrate these new sources of alternative data with existing compliance reporting?


Alasdair Anderson:

Well, it's a gradual process. So we have a standard data lifecycle. And we use the same lifecycle for internal data, whether it be ... You know, wherever it may be sourced from. So we land data without transforming or touching it, and that's one of the advantages of the big data platform: you don't have to pre-define a schema or structure.


And then we run a profiling algorithm over it. The supervised [inaudible 00:32:20] that we spoke about. And that goes to our data operations team for them to then tag and start to put semantic meaning onto the data structures.


And that ultimately forms ... That can go two ways. One, it can stay within that sort of self-service user environment, and just be used for ad hoc analytics. But if it's to be used for something in the regulatory space, it then has to go through a defined IT interface, the change lifecycle, and as part of any regular agile development. But the first step in involving the data is to take it through that profiling process.
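
The first steps of that lifecycle, landing data untouched and then profiling it before anyone tags it, can be sketched as follows. This is an illustration under assumptions: the `land` and `profile` helpers and the sample records are hypothetical, and the profiling here is a plain per-field summary rather than the supervised model described earlier.

```python
import json
from collections import defaultdict

# Hypothetical raw feed, landed exactly as received (one JSON document per line).
raw_lines = [
    '{"customer_id": "C001", "email": "a@example.com", "balance": 1200.5}',
    '{"customer_id": "C002", "email": null, "balance": 80}',
    '{"customer_id": "C003", "balance": "n/a", "country": "SE"}',
]


def land(lines):
    """Parse each record as-is, without enforcing a schema or fixing values."""
    return [json.loads(line) for line in lines]


def profile(records):
    """Summarize each field: how often it appears, how often it is null, and its types."""
    stats = defaultdict(lambda: {"present": 0, "nulls": 0, "types": set()})
    for record in records:
        for field, value in record.items():
            entry = stats[field]
            entry["present"] += 1
            if value is None:
                entry["nulls"] += 1
            else:
                entry["types"].add(type(value).__name__)
    return dict(stats)


report = profile(land(raw_lines))
for field, entry in sorted(report.items()):
    print(field, entry["present"], entry["nulls"], sorted(entry["types"]))
```

A summary like this is what a data operations team might read before tagging: a field with mixed types or many nulls is a candidate for a data quality tag before it goes anywhere near regulatory reporting.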


Richard Johnson:

Okay. And this will be our last question.


Just briefly, Alasdair and Shailesh, what has been the response like within your organization to these efforts?


And I assume the response means senior management that we're talking about here.


Shailesh, you want to start?


Shailesh Ambike:

Sure, yeah. It's positive. It's, you know, whatever we could do internally and utilize our infrastructure in a more efficient and a more intelligent way, all the better. We have the tools, or we have the data here. It's just a matter of integrating it in an intelligent manner for us to utilize, from a regulatory standpoint.


And as I mentioned before, getting ahead of those issues sooner rather than later. And I think that's really key with any type of tool, is your ability to get ahead of potential issues, and to navigate through that to see, you know, what are the humans involved, what's the technology involved, and what's the regulatory impact, and getting ahead of that to mitigate it. So it's been-


Richard Johnson:


Sorry, and Alasdair, has there been positive feedback at Nordea also?


Alasdair Anderson:


I mean, it's positive. I would say that there has been a level of impatience as well. They want us to go faster. And so the note of caution I would pass out to the audience is maybe think about under-promising. Take it easy, because there's a lot of new technologies here, and there's maybe a lot of elevated expectations that have to be managed. And when you've got executive sponsors, you have to be careful what you promise.


Richard Johnson:


Under-promise and over-deliver is always good advice, I think.


So, it appears there's no more questions at this time.


I'd like to thank Paul, Alasdair, and Shailesh, as well as everyone on the line for your presentation ... For your participation. If you'd like to reach out with any additional questions about this, please feel free to contact me using the information on the screen.


And we would also appreciate it if you could take a moment to complete the brief survey that will appear shortly after you end this session, to let us know what you liked about the webinar, and what you would like to see next time.


Have a fantastic day, everyone.