As firms prepare for new regulation, capturing and storing transaction data will no longer be enough. Regulators will also want to see granular data on quotes, accurately timestamped. This will present substantial technological hurdles for firms, in terms of both scale and complexity. What kind of cost-effective solutions can firms put in place, and what are the benefits to be gained by taking a more strategic approach to data capture?
William Garner, Charles Russell Speechlys
David Grocott, Financial Technology Advisors
Stephen Taylor, Stream Financial
Hirander Misra, GMEX
Clive Posselt, Instrumentix
Emiliano Rodriguez & Dan Joe Barry, Napatech
Hosted by The Realization Group
Mike O’Hara: Hello, and welcome to Financial Markets Insights. One of the main aspects of the new Markets in Financial Instruments Directive, MiFID II, due to go live in January 2018, is the need for much more granular data capture than has been required previously. This is presenting firms with some serious challenges around exactly what data they need to capture, how to capture it and how to make meaningful use of it in context, particularly when faced with any kind of compliance investigation. First of all, from a regulatory perspective, what has actually changed?
William Garner: In terms of what has changed, the key detail sits in both MiFID II, the directive, and MiFIR and other related measures. In big-picture terms, we’re looking at Article 16 of MiFID II, which sets the overall data storage requirements on firms, and that flows down through to MiFIR, in particular Article 25. The language there is very similar to what’s already in MiFID I, but the crucial change is that it no longer links just to transactions; it links to orders and transactions.
In terms of what you need to keep, there are also changes, and it is now mandatory that all information relating to regulated services or activities is covered. For example, in relation to orders, it will be telephone recordings, it’ll be emails, it’ll be instant messages. What a firm has to do is ensure that all of that is captured and stored in an accessible way, both so that the firm itself can access it and so that it can give full access to national competent authorities when required.
The key and the difficult thing will be linking, say, voice messages to emails, to instant messages so that you are linking all of those means of communication through to particular orders or to particular transactions.
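The linking problem described above can be sketched in a few lines. This is purely illustrative: the record shapes, the `order_id` field and the five-minute matching window are assumptions, not part of any regulatory specification. The idea is that a communication record is attached to an order either by an explicit order reference or by falling within a time window around the order.

```python
from datetime import datetime, timedelta

# Hypothetical record shapes: each communication record carries a channel
# ("voice", "email", "im"), a timestamp and, where known, an order_id.
comms = [
    {"channel": "voice", "ts": datetime(2018, 1, 3, 9, 0, 5), "order_id": None},
    {"channel": "email", "ts": datetime(2018, 1, 3, 9, 0, 40), "order_id": "ORD-1"},
    {"channel": "im",    "ts": datetime(2018, 1, 3, 11, 0, 0), "order_id": None},
]
orders = [{"order_id": "ORD-1", "ts": datetime(2018, 1, 3, 9, 1, 0)}]

def link_comms_to_order(order, records, window=timedelta(minutes=5)):
    """Attach records that either reference the order explicitly or
    fall inside a time window around the order's timestamp."""
    linked = []
    for r in records:
        if r["order_id"] == order["order_id"]:
            linked.append(r)
        elif r["order_id"] is None and abs(r["ts"] - order["ts"]) <= window:
            linked.append(r)
    return linked

linked = link_comms_to_order(orders[0], comms)
# The voice call (55 s before the order) and the email are linked;
# the instant message two hours later is not.
```

In practice the hard part is producing the timestamps and identifiers this sketch takes for granted, which is exactly the data-capture challenge discussed below.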
Mike O’Hara: This linking together of the data that makes up orders and trades is where the main challenge seems to lie, particularly when the data is coming from multiple sources.
Clive Posselt: I think the fundamental difficulty is that any trading infrastructure has multiple applications running across the environment, and each point captures data in a different format, in a different way, using a different protocol. The challenge is then understanding, in a holistic manner, exactly what’s going on across that environment at any one time, and also, historically, being able to pull that data back into one place to really do something meaningful with it.
Stephen Taylor: In the simple case, you have a lot of data from the source that’s being stripped off the wire, lots of trades, and you need to be able to find them. The first thing you need to be able to do is retrieve that data with good performance, because there is a lot of data and a lot of history. Copying it into a centralised database can be very complex for all sorts of reasons. The main one is who owns that data. If you’re going across multiple jurisdictions, then you might not have the permission to look at that data. The person who should make that decision is the owner of that data. They shouldn’t just copy it to somebody else and hope that they make the right decisions for them. That’s very important.
Using a federated model, where you can query one data source and another while leaving the data under its original ownership in its original location, is by far the best way of doing things.
Mike O’Hara: This idea of using a federated model that goes to the source of the data certainly seems to make sense. This is why, increasingly, firms are going right down to the network packet level, capturing data there at its most granular source and then making it available for harvesting via APIs.
David Grocott: Obviously that data is the real, genuine live data, isn’t it, that’s come in and out of the bank? You’re not going to get a cleaner source of the data than that. That is actually what was sent and what was received, so from a pure data point of view, you’ve got everything. Then it’s how you choose to filter it and analyse it and clean it up so that you can reconcile it with the rest of your systems. You’ve obviously got the application level data, which is the view of that particular application, but it’s obviously nice to be able to reconcile that against a holistic view of the whole firm.
It also, potentially, gives risk and compliance a view that they didn’t have before. They’re able to pull it together. They probably will need some federated queries or other things to actually pull the data together when they’ve retrieved it from the network level, but it’s a lot easier to do that than it would be to write hooks into all the applications, and it’s probably a lot quicker as well.
Emiliano Rodriguez: The key thing is to have the right data and all the data. After that, data is data. Everybody is going to look at the same data. If everybody can get at that data from different places, be it via a REST API or whatever, then everybody can look at the same data at the same time, in different places. That way you can save a lot of money, and the return on investment is better because you don’t need to change the analysis. You just need to deploy more packet-capture points on your network, but just the packet-capture part. The analysis can be centralised in one place.
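The "capture at many points, analyse in one place" idea can be illustrated with a small sketch. The capture points and record shapes below are assumptions for illustration: each point yields records already ordered by timestamp, and the central analysis merges them into one chronological stream without moving the capture function itself.

```python
import heapq

# Illustrative records from two hypothetical capture points, each a
# (timestamp_seconds, capture_point, event) tuple, pre-sorted by time.
point_a = [(1.000001, "A", "order"), (1.000005, "A", "fill")]
point_b = [(1.000002, "B", "quote"), (1.000004, "B", "quote")]

# heapq.merge lazily combines sorted streams; tuples compare by their
# first element, i.e. the timestamp, giving one firm-wide timeline.
merged = list(heapq.merge(point_a, point_b))
timeline = [event for _ts, _pt, event in merged]
# timeline -> ['order', 'quote', 'quote', 'fill']
```

This is why the timestamp accuracy mentioned at the outset matters: the merged ordering is only as trustworthy as the clock synchronisation across the capture points.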
Mike O’Hara: How can this type of data capture approach best be implemented? What sort of architecture is required, not just at the network level, but also higher up the trading stack?
Daniel Joseph Barry: We envisage a three-layer approach: very reliable capture at the bottom, which can service many different needs; best-in-class software above it, which can be interchanged, is totally independent and can scale independently of the other layers; and a layer that can then go in and get data from all of those different sources, pull it together and perhaps combine it with other sources, such as logs or other databases, so it makes more sense at a higher level. That’s the concept that we’re looking at.
One of the things that we succeeded in doing is actually making some very affordable packet-capture solutions, which are just focused on packet capture, so the ability to get all the information off the network without losing any of that information, even at very high speeds. Then either providing it in real time or actually recording it to disk, storing it, basically, with full integrity so that you can look at it later.
Now, when you have that capability at different places in your network it starts opening up some possibilities. We’ve built some very nice software on top, which provides an API which is really easy to integrate with and it can support multiple applications at the same time. That means that you can have different applications which are dedicated to a specific task that can go in and access the data at the same time and actually do what they need to do with it.
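As a concrete illustration of the bottom capture layer, recorded packet data is commonly written in the classic libpcap file format: a 24-byte global header followed by per-packet records, each with a 16-byte header carrying a second/microsecond timestamp and the captured length. The sketch below parses that format for the little-endian case; it is a minimal reader under those assumptions, not any particular vendor's API.

```python
import struct

PCAP_MAGIC_LE = 0xA1B2C3D4  # classic pcap, little-endian, microsecond stamps

def parse_pcap(buf):
    """Parse an in-memory pcap capture into (timestamp, orig_len, payload) tuples."""
    magic, _vmaj, _vmin, _tz, _sig, _snaplen, _linktype = struct.unpack_from(
        "<IHHiIII", buf, 0)
    assert magic == PCAP_MAGIC_LE, "unsupported byte order / format"
    packets, offset = [], 24
    while offset + 16 <= len(buf):
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack_from(
            "<IIII", buf, offset)
        offset += 16
        payload = buf[offset:offset + incl_len]
        offset += incl_len
        packets.append((ts_sec + ts_usec / 1e6, orig_len, payload))
    return packets

# Build a one-packet capture in memory to exercise the parser.
header = struct.pack("<IHHiIII", PCAP_MAGIC_LE, 2, 4, 0, 0, 65535, 1)
record = struct.pack("<IIII", 1514764800, 123456, 4, 4) + b"\xde\xad\xbe\xef"
pkts = parse_pcap(header + record)
# One packet, timestamped to the microsecond (~1514764800.123456 s).
```

Once captures like this exist at several points in the network, higher layers can expose them through a common API, as described above, rather than each application parsing raw files itself.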
Mike O’Hara: This tiered best-of-breed approach certainly seems to be gaining traction, for a number of reasons. What are some of the key considerations for firms looking to adopt this type of architecture and what are some of the longer-term benefits that it can offer?
Stephen Taylor: Clearly, if you’re going to start looking at data that comes off the wire, that’s one very, very specialist piece of work. The storage of that data is another piece of very specialist work. The ability to be able to query across multiple systems with high performance is another specialist piece of work. There isn’t really one person in the industry that can do all of that together.
Clive Posselt: There are people who do this day in, day out and supply those types of products. Increasingly, banks, given their resources, don’t want to be software companies, whereas in the past they typically did a lot of this in-house. Where you’re able to leverage a third-party software solution or hybrid solution, and don’t have to employ people to write code and build similar types of products, your whole solution becomes scalable.
Daniel Joseph Barry: This particular solution that we’re looking at for the financial industry is actually quite a good stepping stone, because it gives you an opportunity to work within the existing paradigm if you want to: you can certainly take elements of this and recreate a more integrated solution for a very specific need. It gives you the freedom to add onto that too, even with cloud instances, because the captured data can be provided to a cloud instance if you want to do that.
By just aggregating these things and giving the flexibility at least to be able to start experimenting with some software that doesn’t have to be an appliance, but just be software running somewhere, you’re taking the first steps towards preparing for moving things to the cloud. It’s a good stepping stone in that strategy to help you move towards the cloud in a very controlled manner.
Mike O’Hara: The message is, as well as helping to satisfy immediate regulatory and compliance needs, taking a more strategic approach to data capture can deliver some serious long-term benefits by helping firms to move away from more traditional, inflexible, siloed approaches. Thanks for watching. Goodbye.