Azure for Executives

Real-time data-driven decisions and responses with Jamshed Patel and Nick Leimer

Episode Summary

In this episode, we talk about making data-driven decisions using DataVisor with Jamshed Patel, Vice President of Solution Engineering. Nick Leimer, Principal Industry Lead in Microsoft Azure for Insurance, explains how fraud solutions in the insurance industry are moving toward real-time analysis, alerting, and remediation.

Episode Notes

While we’re seeing early adopters embrace technologies like machine learning and artificial intelligence to help drive their businesses forward with autonomous systems, the uptake is much slower than one might expect.

The number of companies willing to turn over their operations to a real-time system, rather than a report-analyze-act model, is surprisingly small.

Some organizations have taken the approach of capturing all kinds of data in hopes they will glean insights after capturing anything and everything they can get their hands on.

Jamshed explains how using this technique has panned out for companies trying to find insights from their data estates or data lakes and shares interesting data sources he looks at to make determinations about threat vectors and real-time remediation.

Episode Links

Episode Transcript

Microsoft Azure Synapse Analytics

DataVisor.com

DataVisor on LinkedIn and Twitter

Azure Marketplace

AppSource

Guests

Jamshed Patel is Vice President of Solution Engineering at DataVisor. DataVisor is a Microsoft partner and makes powerful and comprehensive fraud and risk solutions for various industries.

Follow him on LinkedIn.

Nick Leimer is Principal Industry Lead in Microsoft Azure for Insurance.

Follow him on LinkedIn or Twitter.

Hosts

Paul Maher is General Manager of the Marketplace Onboarding, Enablement, and Growth team at Microsoft. Follow him on LinkedIn and Twitter.

David Starr is a Principal Azure Solutions Architect in the Marketplace Onboarding, Enablement, and Growth team at Microsoft. Follow him on LinkedIn and Twitter.

Episode Transcription

DAVID STARR: Welcome to the Azure for Industry Podcast. We're your hosts, David Starr and Paul Maher. In this podcast, you hear from thought leaders across various industries, discussing technology trends and innovation, sharing how Azure is helping transform business. You'll also hear directly from Microsoft thought leaders on how our products and services are meeting industries' continually evolving needs.

Hello, listeners. This is your host, David. Sorry, I don't have my co-host, Paul, with us today. Unfortunately, he's busy elsewhere, but that's okay. We're going to move right ahead. And today we're going to be talking with Jamshed Patel, who is Vice President of Solution Engineering at DataVisor. DataVisor is a Microsoft partner, and they make really powerful and comprehensive fraud and risk solutions for various industries. We're kind of looking at it from the insurance point of view today. And we aren't talking as much about insurance as we are real-time data-driven decision-making, which is applicable across multiple industries in any case. Jamshed, welcome to the show.

JAMSHED PATEL: Thank you, David. I appreciate you inviting me to the show, and I'm looking forward to it.

DAVID: Absolutely, so am I. Nick Leimer is Principal Industry Lead in Microsoft Azure for Insurance. Welcome, Nick.

NICK LEIMER: Thanks for having me here, David.

DAVID: Nick and I get to work together every day with multiple partners, and it's a really good group that we get to work with together. So, Nick, it's good to have you on the show finally. It's been quite a while.

NICK: Yeah, it's great to be back.

DAVID: All right. Well, I think it would be really great if we could just start off, Jamshed, by understanding a little bit more about what you and your team at DataVisor do, so if you could share that with our listeners, I think that would be a great place to start.

JAMSHED: Sure, David. And I can give you a really simple answer, which is that DataVisor's mission is to stop fraud before it happens. If you think about most fraud vendors today, they focus on detecting what I would call known patterns of fraud, and that's based on historical data. And what DataVisor has done is it has transformed this approach by delivering the world's most advanced AI solutions. What that enables for our customers is that it enables us to stop both the known as well as the unknown fraud patterns before they actually impact our customers.

DAVID: So deep integration with AI, that's going to be interesting as we have our conversation here. I'm really intrigued by that. That's good stuff. So in talking about data-driven decision-making, it was interesting when we set up the show, when you and I initially contacted each other, we had very different impressions or ideas about what data-driven decision-making means. We've moved past the notion of static analysis, if you will, of data and systems that are monitoring that data and now more into real-time systems, and so you educated me on that. And when you consider data-driven decision-making, what does that mean to you?

JAMSHED: Yeah, David. So if I asked you to picture what fraud looks like, you might imagine an individual that's walking into a bank to cash a bad check as an example, or you might picture a hacker that's operating alone in a dark room to purchase goods using a stolen credit card, as an example. And the traditional way to deal with this type of fraud was to flag these types of suspicious transactions (and oftentimes this was done after a user complains about it) and then review the transaction manually and take action to correct that fraud. That world has now changed and we have sophisticated criminal groups, some of whom may even be state-sponsored who conduct large scale attacks that can steal very significant amounts of goods or money in a very short period of time. And this can not only cause significant financial loss, but it can also severely damage a company's reputation. And the sheer scale of this kind of fraud is why it shows up so often in the front pages of our news media.

So in order to counter this type of high-end, high-scale fraud, companies typically utilize rules-based systems where they hire data and fraud analysts to write rules to filter incoming transactions. Some companies even use a machine learning approach, most typically a supervised machine learning approach, to augment the detection rules and so on. However, both of these systems have a problem, which is that they are manually updated, which means that they are slow and they are expensive. And fraud rings are typically evolving faster than these companies can keep up. For example, criminals will use a certain pattern to perpetrate the fraud, and while the company develops countermeasures, they will develop new patterns and therefore stay one step ahead of the companies. So the solution to this is to use this incoming stream of data and counter these new attacks in real-time. And that is why DataVisor was created and was based on a foundation of what is known as unsupervised machine learning, which is able to detect and neutralize these new known and unknown fraud patterns before they are even able to complete the transaction. So we're effectively transforming the field of fraud detection by being more proactive and stopping this new emerging fraud before it can actually damage a company.

DAVID: For those who may not be familiar listeners to the show, I wonder if you could talk just a little bit about the difference between supervised and unsupervised training.

JAMSHED: Yeah, that's a great question. So supervised machine learning is founded on the notion that you build these models, and you train these models on data sets with known fraud patterns. So you may have labeled the data as good transactions or bad transactions, and the model is trained based on this knowledge of knowing which transactions are good and which transactions are bad. However, unsupervised machine learning derives the intelligence from the data itself. So you don't need to have any known historical labels on the data. You don't need to know which transactions were good, which transactions were bad. Unsupervised machine learning that DataVisor offers is able to look at the data and mine the intelligence from the data itself because it uses very sophisticated technology that allows it to cluster in very high dimensional space and to be able to detect these very subtle patterns based on not just the current event but based on all of the events that have occurred in the past and that are occurring at that point in time. And so unsupervised machine learning is really what you would call the state of the art, and it is what you might call a self-learning system. So it learns these patterns from the data itself; it doesn't need to be taught these patterns.
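The distinction Jamshed draws can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DataVisor's technology: the supervised half is a toy nearest-centroid classifier trained on labeled transactions, and the unsupervised half is a crude stand-in for clustering that simply groups unlabeled events sharing a device and amount. All function names, field names, and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Toy labeled data: ((amount, transactions_in_last_hour), label). Values are made up.
LABELED = [
    ((20.0, 1), "good"),
    ((35.0, 2), "good"),
    ((900.0, 9), "bad"),
    ((1100.0, 11), "bad"),
]


def train_centroids(rows):
    """Supervised step: average the feature vectors of each labeled class."""
    by_label = defaultdict(list)
    for features, label in rows:
        by_label[label].append(features)
    return {
        label: tuple(mean(column) for column in zip(*vectors))
        for label, vectors in by_label.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is nearest (squared Euclidean distance)."""
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: distance(centroids[label]))


def cluster_unlabeled(events, min_size=3):
    """Unsupervised step (simplified): group events by (device, amount), no labels.

    Groups with at least min_size accounts are returned as suspicious clusters.
    """
    groups = defaultdict(list)
    for account, device, amount in events:
        groups[(device, amount)].append(account)
    return {key: accounts for key, accounts in groups.items() if len(accounts) >= min_size}


centroids = train_centroids(LABELED)
label = classify(centroids, (950.0, 8))

# Unlabeled events: (account, device, amount). Nobody has marked these good or bad.
EVENTS = [
    ("a1", "dev-9", 1500),
    ("a2", "dev-9", 1500),
    ("a3", "dev-9", 1500),
    ("a4", "dev-2", 42),
]
suspicious = cluster_unlabeled(EVENTS)
```

The supervised function can only recognize patterns that resemble its labeled examples, while the grouping function surfaces the three accounts behaving identically without ever being told what fraud looks like.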

DAVID: That's fascinating. Nick, I wonder are you seeing in insurance solutions that the industry is moving more to real-time analysis, alerting, and even remediation? That's the really interesting one, taking action based on data in real-time.

NICK: Yeah, I think I've seen more and more of that with companies. And I think with COVID-19, anytime you have a major stressor on businesses and employees, there's historically been an upsurge in fraud. You think about a company that's having hard times that has a quote-unquote 'fire' or someone knows there's going to be a layoff at the plant, and they find a way to have a workers' comp claim. Historically, that has been sort of a pattern that has been repeated over and over, so that's something that people are aware of. But having this additional level of information coming in and sort of organized fraudsters instead of just sort of the one-off onesie-twosie kind of things that happen, and they have these sort of spikes because of impacts on changes in economic conditions. So there's that piece that's kind of a known area of fraud that occurs. And now with so much other information that comes in from social media, from IoT sensors, et cetera, you can interpret did this accident or this event actually happen from other sensors that were around at the time of the event? We have partners that are capturing this information at every car accident. What other cameras were available? What other sources happened at the time of that accident? So you can really determine who's at fault from the information surrounding that vehicle at the time of the accident. So lots of our partners are pulling in information, and we provide lots of additional information both on companies and on geospatial information, telemetrics, et cetera. So there are a lot of new inputs that we're providing as a platform company and then our partners are pulling those together to answer new questions.

DAVID: There are a lot of companies out there that are looking at these real-time systems versus a report-analyze and then, "I've got my report, now I'll go act on it." And so the number of people pulling in systems that do remediation in real-time is pretty small in comparison to the report-analyze-act thing. And I wonder if you're seeing the adoption, Jamshed, of more of these real-time remediation engines like Nick said he's seen. In particular, are companies relinquishing control in favor of faster outcomes that they leave to an AI?

JAMSHED: That's a great question, David. So this is, as you can imagine, a big decision for companies, and they're definitely trying to evolve in this direction. When you think about our systems, when we look at an incoming stream of transactions, we assign a risk score to that transaction. So if we detect something is fraudulent, we will assign a score, and typically, if the score is high, let's say on a scale from 0 to 1, if we assign a score of more than 0.7, we recommend that the company auto-decision based on that score. And that's a pretty big decision because you're potentially impacting a user that is if you deny a good user a transaction, that can make for a bad customer experience. On the other hand, if you don't do an auto-decisioning there and you wait till after the fact, and if it turns out to be fraudulent activity, it could be bad for the company as well as bad for the users that are being cheated out there. But typically, what we see is that as companies deploy our technology and get more comfortable with the precision and accuracy of the results that we can provide, they start to be more comfortable with auto-decisioning and with taking automatic decisions based on the output from our systems. And the reality, David, is that companies recognize that they have to move at the speed of business. So with the move to online commerce which as you know has been accelerating especially recently with all the COVID-related lockdowns and so on, online commerce is exploding. And the rate of fraud in online commerce is also increasing pretty dramatically. And so the ability for companies whether you're an e-commerce company or you're a financial services company or an insurance company, the ability to stop this fraud in real-time where you don't have the time to make a manual judgment or decision is going to be key for that company to be able to prevent fraud from occurring.
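The threshold rule Jamshed describes could look roughly like the Python sketch below. The 0.7 cut-off on a 0 to 1 scale comes from the conversation; the function name, the outcome labels, and what happens below the threshold are assumptions made for illustration.

```python
AUTO_DECISION_THRESHOLD = 0.7  # figure quoted in the conversation


def route_transaction(risk_score, threshold=AUTO_DECISION_THRESHOLD):
    """Decide what to do with a transaction given its risk score in [0, 1].

    Scores above the threshold are auto-decisioned; everything else is left
    to the normal flow (an assumption, the episode does not specify this).
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score > threshold:
        return "auto_decline"
    return "allow_or_review"


decisions = [route_transaction(score) for score in (0.2, 0.7, 0.85)]
```

Raising the threshold denies fewer good users but lets more fraud through, which is the trade-off between customer experience and loss that Jamshed outlines.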

DAVID: So I'm curious, based on that really interesting story, do you have an example of decisions that are being made in real-time maybe with your fraud product? Tell us a little bit more about the story behind that. What types of fraud are you detecting, for example?

JAMSHED: So we detect fraud in many different scenarios. So one of the areas that we really excel in is we provide full account life cycle protection, which means that all the way from a user applying for an account to them being provisioned into the system, to them actually running transactions, to the post-transaction follow-through. And in the case of financial systems, for instance, you might have money laundering type transactions that we detect. In the case of e-commerce systems, we may detect bot attacks that are trying to purchase a new product. So a typical scenario in an e-commerce company would be where a company releases a new product into the market and there's a bunch of bots that go ahead and buy off those new products. Or it could be a new set of tickets that come on sale, and these bots go in and buy off these new tickets and then they sell those at a higher price in the secondary market. And obviously, that's not good because it shuts the real fans of the product out of being able to bid on it because they're essentially bidding against these automated bots that can run these transactions at massive scale and massive speed. And so our systems will detect these types of large-scale bot transactions and prevent those from occurring so that actual users, the users that the company wants to develop and grow into its loyal fan base, so to speak, are actually able to compete for those initial tickets or for those initial products at launch time without having to pay exorbitant amounts in the secondary markets where they're sold.

DAVID: And Nick, I'm curious, from a Microsoft point of view, what sort of technologies do we have that companies are building on maybe to enable these types of solutions?

NICK: Well, there's a lot of tools that we have that help monitor the safety and resilience of our environment. So our partners are using our platform to build all the different solutions, and we give them tools to monitor behavior risks from other cyber attacks and things that are happening to their infrastructure. So we have tools like Azure Monitor, which helps provide sort of an end to end view of everything they have running on Azure. And then also tools like Azure Stream Analytics, which to the point that was brought up earlier, it's really are you getting a big spike in activities or something that's happening that's not normal? And you can track that in real-time. And one of the things that we're seeing a lot of companies are using just from a risk score perspective is a secure score which is part of the Azure Security Center; it's part of Office 365. It's great for analyzing what a company is doing from a security perspective. It actually tracks whether people are updating their passwords on a regular basis, the Office environment for the entire company: Are they having the latest versions? Are they behind on some of the security updates? Have they not updated the password? How many passwords are dual-factor authenticated? And those sorts of things. So you can get a score on that at the company level about the activity for the entire company in Office 365, and companies are seeing that as a real benefit to understanding the risk of doing business with a specific company or where they have vulnerabilities. So it's really, how do we assign a risk score for the entire environment? How do we monitor it? How do we act proactively? And there are tools that are available on our platform to help support that.

DAVID: Very cool. And now let's take a moment out to listen to this very important message.

[COMMERCIAL:]

Did you know the Microsoft commercial marketplace allows you to find and purchase leading Microsoft certified solutions from Microsoft partners? The Microsoft commercial marketplace includes Microsoft AppSource and Azure Marketplace. Each storefront serves unique customer requirements and different target audiences, so publishers can ensure solutions are available to the right customers. For applications that integrate with Microsoft 365 products, visit appsource.microsoft.com. Get solutions tailored to your industry that work with the products you already use. For B2B Azure-based solutions, visit azuremarketplace.microsoft.com. Here you can discover, try and deploy the cloud software solutions you want.

DAVID: So it seems like some organizations have taken the approach of dumping all their data into a data lake. And one of the most popular episodes we've ever had is data, data everywhere. [Laughs] And folks are bringing together their various data sources into a single store from which they can use products like Azure Synapse Analytics, for example, to glean insights into the data that they've got in their data lake and see things across multiple datasets that they would otherwise not have seen. So, Jamshed, I'm particularly curious how this technique of analysis is panning out for companies who are trying to get those insights from their data lakes.

JAMSHED: David, so this area has evolved quite a bit in the last few years. What we've noticed is that data that has no intelligence can still be useful. And the reason is, again, if you think about the early days of machine learning which was primarily what I've described as supervised machine learning, the general approach was just to throw a lot of data at this problem and hope that it solves itself, and it didn't really work. And what happened was that with the advent of unsupervised machine learning that was introduced and has been really taken to a production level by DataVisor, we are able to take these seemingly unrelated sources of data and able to identify patterns of user behavior not just with the data but by also combining it with third-party data that could provide additional context.

And so if you think about an internal company system, we help large companies manage risk, and they may look to detect employee fraud. And typically when they look to do that, they look at systems in a very siloed manner. But what we try to look at is across the company, we look at the travel and expense system, we look at the gifts and entertainment system, we look at systems that can track tender processes, and bribery, and corruption and kind of a very broad financial set of statements that can allow us to draw correlations between all of these different data sources and derive intelligence from that. And we also have a feature platform that allows us to build intelligent features based on these very large disparate sources of data. And those features can then be used to power our machine learning engines as well as our rules engines to create better decisions.

We also are able to use technologies like visualization, something that we call Knowledge Graph that can allow you to view this data that might seemingly be unrelated. But when you look at it in the context of this really dynamic graph, you can start to see the relationships between those patterns of data and start to draw conclusions from that. So we've definitely seen this area evolve, and we are finding that data can have intelligence where you don't expect it, especially when there are multiple different sources of data that you have access to.

DAVID: Nick, I think you had a question for Jamshed too.

NICK: So of the data that you're bringing in from sources both internal and external to a company, what do you find is sort of the most interesting data source that kind of gives you unexpected results? You talked about sort of non-decision making things. But of all the data you bring in, what were sort of the most interesting fields of data that you were able to pull in?

JAMSHED: That's a great question, Nick. So there's actually a lot of them. I wouldn't say that there are two or three top data sources. It depends really on the use case that we are trying to solve, whether we're trying to solve an internal risk management problem for a company or a transaction fraud in the financial services space or e-commerce fraud, or account application fraud, and so on. But what I will say is that the data that we utilize includes things like edge data. So for instance, if a user is accessing a particular system through an app, we have a very powerful SDK that we can embed in the application code that allows us to detect and verify that the user is using the same system even if they tried to use emulators or try to otherwise mask their trail. So we can look at these different transactions that seem to come from different systems and say, "Oh, these are actually being generated by the same emulator code." Or we can look at it and if the device is rooted, we can still identify with very high precision that it's actually being used out of this particular device. So that's one type of data.

The other type of data is user behavioral data. So we look at the transactions, but we also look at all of the metadata that surrounds this transaction, and that allows us to look at user behavior and allows us to look at, for example, is the user accessing the system more in the last 7 days than they have accessed it over the last 365 days? As an example. Or we could look at correlations between users. When you look at a particular user's interaction, it might look perfectly normal but then we compare it to all of the other users in the system, and we start to realize that there is a pattern of access. It could be timing-related patterns or based on velocity-related features, or it could be patterns that are related to the amounts that are deposited. In banks, for example, when you have money laundering type operations, typically you have these mule accounts, and you might see transactions that are all transferring a fixed amount like $1,500 at regular intervals into a third-party account. So when you start to look at it across these different systems, you start to detect these subtle behavioral patterns, you detect transactional patterns, and you detect access patterns that allow you to put all of this information together to determine an overall risk score for a particular transaction or a particular user or a particular entity in the system. And in the case of insurance, for example, it could be not just the end-user, but it could also be an agent. It could be agent initiated fraud there. And so when you look at all of these different sources of data, including transactional data, including access data, we take all of that into account including network traffic as an example and make decisions based on that.
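Two of the behavioral signals mentioned here can be sketched in Python: comparing a user's access rate over the last 7 days with the last 365 days, and spotting repeated transfers of the same amount to the same account. This is a minimal sketch under assumptions. The function names, the repeat threshold, and the sample data are hypothetical, and the second function only counts repeats without checking that they arrive at regular intervals.

```python
from collections import Counter
from datetime import date, timedelta


def access_velocity_ratio(access_dates, today):
    """Daily access rate over the last 7 days divided by the rate over the last 365 days.

    A ratio well above 1 means the user is far more active than usual.
    """
    recent = sum(1 for d in access_dates if 0 <= (today - d).days < 7)
    yearly = sum(1 for d in access_dates if 0 <= (today - d).days < 365)
    if yearly == 0:
        return 0.0
    return (recent / 7) / (yearly / 365)


def repeated_fixed_transfers(transfers, min_repeats=3):
    """Return {(destination, amount): count} for pairs seen at least min_repeats times."""
    counts = Counter((destination, amount) for destination, amount in transfers)
    return {key: n for key, n in counts.items() if n >= min_repeats}


today = date(2021, 1, 31)
# Five accesses in the last five days plus one access 200 days ago.
accesses = [today - timedelta(days=i) for i in range(5)] + [today - timedelta(days=200)]
ratio = access_velocity_ratio(accesses, today)

# (destination account, amount) pairs; the 1,500 figure echoes the example above.
transfers = [("acct-X", 1500), ("acct-X", 1500), ("acct-X", 1500), ("acct-Y", 80)]
flagged = repeated_fixed_transfers(transfers)
```

Features like these would be inputs to a risk score rather than verdicts on their own, since a single user's burst of activity can be perfectly legitimate.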

DAVID: That's just incredible. So I do have one last question for you, and that is you've done a great job of helping us understand what unsupervised learning is. How does that distinguish from a rules engine? Why isn't a rules engine simple enough to drive some of this decision-making?

JAMSHED: So a rules engine is definitely helpful, and we actually have a rules engine as part of our comprehensive fraud and risk management stack. But the problem with the rules engine is when you have new forms of attack, it takes a while for the systems to detect them, and then you usually have to manually update these rules and then deploy them into production. And this entire process could take weeks or even months to update the systems. So there's a long period of exposure where the rules are not up to date enough to be able to detect these new patterns. So what our unsupervised machine learning does is when it detects a new pattern, it automatically generates rules to update the rules engine as well. And then we combine the result of our unsupervised machine learning with the results of our rules engine and supervised machine learning, and we use an ensemble model to determine the final risk score. And what we see is that our precision goes up dramatically when we are able to combine the results from these different models, and that's what is unique about our product and our approach is that we not only use standard rules engine and supervised machine learning approaches, but we also use unsupervised machine learning in order to be able to combine these risk scores and generate an even more accurate picture of what is fraudulent and what is not. And that's why we achieve the kind of amazing results that we're able to deliver to our clients.
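The idea of combining several models into one final risk score can be illustrated with a weighted average in Python. The episode does not say how DataVisor's ensemble model works, so the weighting scheme, the weights, and the names below are purely assumptions for illustration.

```python
def ensemble_score(scores, weights):
    """Combine per-model risk scores (each in [0, 1]) into one weighted average.

    scores and weights are dicts keyed by model name; weights are hypothetical.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Made-up outputs from the three kinds of model mentioned in the conversation.
model_scores = {"unsupervised": 0.9, "supervised": 0.6, "rules": 0.3}
model_weights = {"unsupervised": 2.0, "supervised": 1.0, "rules": 1.0}
final_score = ensemble_score(model_scores, model_weights)
```

A weighted average is the simplest possible ensemble; a production system might instead train a separate model on the individual scores.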

DAVID: Very cool. Thanks for that explanation. It makes a lot of sense. And the idea that you're generating rules for your rules engine on the fly is just a fascinating conversation that we could dig into and probably never get out of, but it's time to close out. And accordingly, I want to mention that we'll include social links for both Jamshed and Nick, so you can follow them online and links to DataVisor, so DataVisor's social and also a link to their marketplace offers and listing so that you can see what DataVisor solutions are available to you through the Azure Marketplace. And then finally, we'll link to some of the Microsoft resources that were mentioned throughout the show. And with that, Jamshed, I want to thank you so much for being on the show. This was a fascinating conversation.

JAMSHED: Thank you, David, and thank you, Nick. I enjoyed the conversation.

DAVID: And thank you, Nick, for joining us today. It's good to have you back.

NICK: Yeah, it's my pleasure.

DAVID: Thank you for joining us for this episode of the Azure for Industry Podcast, the show that explores how industry experts are transforming our world with Azure. For show topic recommendations or other feedback, reach out to us at industrypodcast@microsoft.com.