Azure for Executives

A Rapidly Changing World of Data with Todd Singleton.

Episode Summary

The way companies manage data is rapidly changing. New patterns and technologies focused on data management are emerging to help. Join Todd Singleton, General Manager of Solutions in the Azure Data Group at Microsoft, to hear how Microsoft is bringing solutions to help with the explosion of data.

Episode Notes

In this episode we discuss how to gain insights that can fundamentally change your business. Further, we talk about the value of data lakes. Finally, a very topical subject: we touch on the use of data in the pursuit of social justice.

Show Links

Episode Transcript

Show Guest 

Todd is the General Manager of Solutions in the Azure Data Group at Microsoft.

He has twenty years of entrepreneurial leadership experience in high growth, technology-driven markets. He is adept in understanding the pains of enterprise customers, designing solutions to deliver impactful business outcomes, and operationalizing the delivery of those solutions at scale.

Follow Todd on LinkedIn

Your Hosts

Paul Maher is General Manager of the Industry Experiences Team at Microsoft. Follow him on LinkedIn and Twitter.

David Starr is a Principal Azure Solutions Architect in the Industry Experiences Team at Microsoft. Follow him on LinkedIn and Twitter.

 

Episode Transcription

David Starr:

Welcome to the Azure for Industry Podcast. We're your hosts, David Starr and Paul Maher. In this podcast, you hear from thought leaders across various industries discussing technology trends and innovation, sharing how Azure is helping transform business. You'll also hear directly from Microsoft thought leaders on how our products and services are meeting industry's continually evolving needs.

 

David Starr:

So in this episode, we're talking about a subject near and dear to the hearts of many technology leaders, and that is, what is the current state and future of data management? Data is ubiquitously available now. How do we store it? How do we keep it secure? How do we derive actionable insights from it? These are all very important questions as we look at the data that we're storing.

 

David Starr:

Now data management has become an incredibly important topic in modern IT solutions, and here to talk with us about that is Todd Singleton. Todd is General Manager of Data Solutions in the Azure Data Group. Todd, welcome to the show.

 

Todd Singleton:

Thank you, David. Really glad to be here.

 

David Starr:

We're glad to have you. So just to get started, I wonder if you could tell us just a little bit about your team and what it is that you do?

 

Todd Singleton:

Right. So we're a newer team. We're called the Data Solutions Group, and we sit across the Azure data portfolio, we cut right across. Within the Azure data portfolio, we have about 43 different products and services, from our operational databases to our analytic solutions and our emerging governance portfolio.

 

David Starr:

When you talk about your emerging governance portfolio, could you drill down on that just a little bit and tell us about it?

 

Todd Singleton:

Yeah. We're talking about data governance. As you look across the data estate for any enterprise, the ability to govern the data is increasingly an issue because you have a tremendous amount of data sprawl. So about a year ago, we acquired a company called BlueTalon, and they have an enterprise data catalog. So that's the leading product within our data governance portfolio, and we're continuing to build around that such that organizations can properly govern their data end-to-end, if you will.

 

David Starr:

And that's a big chain.

 

Todd Singleton:

That's a huge chain, right, it really is, but we're well positioned to address that here in Microsoft, given the footprint we typically have in IT organizations.

 

David Starr:

Now I understand that your team focuses in three large primary areas: the end-to-end solutions for large and strategic enterprise, data estates, we've talked about that a little bit already, but ecosystems of system integrators and independent software vendors. Can you talk a little bit about that?

 

Todd Singleton:

Yeah, sure. Great question. So if you look at my team, we have a broad charter around the idea of data solutions. Now that's a very broad charter, and we certainly cut across the portfolio of products. But the reason we've set this team up, increasingly, we see Microsoft, who has a phenomenal approach to IT-centric solutions, that's a core competency for us, but more and more, our customers are asking for business outcome-driven solutions.

 

Todd Singleton:

So we're starting to set ourselves up, Microsoft all up, around industry priority scenarios. We even launched the industry cloud recently, the healthcare cloud. So that's all very exciting. What that means is, we have more narrow, but deeper value propositions to drive business outcomes.

 

David Starr:

Sort of outcomes over technology, if you will.

 

Todd Singleton:

Yeah. What's the actual business outcome as opposed to just the IT outcome, right? So if I'm migrating a database from point A to point B, that's more of an IT solution, but if I really need to reduce... for example, I'm a financial services enterprise, and I need to reduce the number of false positives when I'm looking across my data for potential fraud, that's more of a business outcome. So as we start to address different personas outside the IT stack but into the business units, we have more narrow yet deeper value propositions that we need to drive.

 

Todd Singleton:

So as a product and engineering group, but cutting across our portfolio, we really want to make sure that our differentiated value is surfaced in those business outcome-driven solutions. So for us to do that, we really need to understand the supply chain as it relates to who's actually creating those solutions and what are the key ingredients to those solutions. Now if you look at that, that includes our products and services, but it also includes an ecosystem of co-creators, like ISVs and system integrators. It also includes raw material in the form of an ecosystem of data providers.

 

Todd Singleton:

So my team is really looking at the end-to-end supply chain necessary to drive business outcome-driven solutions or industry priority solutions at scale, particularly those that are data-centric.

 

Paul Maher:

Yeah, no, that's awesome. Thank you, Todd. Let's switch gears a little bit, and super exciting to hear the remit of the team. I love the fact that you're kind of thinking about business scenarios as well as technology, helping drive innovation and solve those problems.

 

Paul Maher:

Let's talk a little bit about data, data and more data. I'd love to take you back a little bit and then we'll fast forward to today and how we think about the opportunities. So from your perspective, Todd, how is data management different today than it was 10 years ago? I'd love to hear your thoughts on landing some of the problems as we tease forward and talk about how we're driving new innovation and opportunity.

 

Todd Singleton:

Well, you ask a great question there. When I think about the need for management, I always go back to dimensions of scale. So 10 years ago, if you think about the number of environments I needed to manage just to manage my data, maybe I was just [inaudible 00:06:12], but now today, I'm in the cloud, I'm multi-cloud, and perhaps on the Edge. So just there I've scaled a number of environments in which I need to manage and deploy my data. So that dimension of scale creates complexity.

 

Todd Singleton:

The second dimension of scale is really the type of data. So data has taken on several different dimensions in the past 10 years. It needs to move faster. There's streaming data I have to think about. I need to think about operational data. I need to think about analytics. I need to deliver my analytics and those insights quicker than I've ever had to before. I need to actually run analytics against my operational data. That poses challenges.

 

Todd Singleton:

So my point being is that things have just gotten increasingly more complex. Not to mention even the fact that I have to... compliance. Compliance has got way more complex. I was talking to a customer not too long ago and I asked him, I said, "10 years ago"... and this customer is an [inaudible 00:07:07]... "10 years ago, how many audits did you have [inaudible 00:07:13]?" He gave me a number. I forget the exact number. But he said, "Now it's five times that amount." So it's just a completely different scale factor in terms of compliance and audits and the cost around that.

 

Todd Singleton:

So things are way more complicated, and so companies really need a portfolio that really can take that into consideration, can scale end-to-end, and cover more scenarios than ever before.

 

David Starr:

Organizations are collecting more and more data all the time and having to evolve their data management strategies to keep up with that practice. How are they meeting the current needs and how are they getting future ready, if you will?

 

Todd Singleton:

A number of companies are having a hard time keeping up, but in terms of those that are getting future ready, I think migrating to the cloud is something that's become a must. The cloud really offers the scalability, the elasticity, the pay-as-you-go model. Some of the basic tenets of the cloud are really important for companies that are trying to manage their data estate and get more value from that data estate.

 

David Starr:

The next part of that question that I have is, many companies struggle with what to keep, what data do I want to keep around, because there are storage costs, of course. Then there's the question of, what data do I need to derive insights? So what do you see companies doing with regard to storing everything versus strategic data collection and storage?

 

Todd Singleton:

Yeah, that's a great question as well. Step one in that is figuring out, what do I have? It could be surprising as you talk to a number of customers, they're really having a hard time just answering that question: What do I really have? Then, obviously, where's it stored? Is this the best place for it to be stored? Is it cost-efficient? And that's where something like a data catalog is really important.

 

Todd Singleton:

Then increasingly I need to consider my storage options based on the nature of the data itself, from a privacy standpoint, from a security standpoint. It's really going to affect how I think about where to store that data and how to store that data. So there are a lot of questions that go into answering that question, but the very first question is, what do I have? I think a number of companies are still in that phase.

 

Paul Maher:

Awesome. So just playing that back, just thinking aloud from what you've said, Todd, which is great, so we're seeing over the last 10 years just really the complexities of data have increased. Obviously there is ever-increasing demand to derive meaningful insights from that data. Obviously with the advent of cloud as well, data's becoming much more accessible and a little bit more of a commodity, and so I think there are opportunities to really open up and unlock the data in new and interesting ways. But, as you said, we need to be also super cognizant of being very thoughtful of the data in terms of thinking about privacy concerns and compliance and so on.

 

Paul Maher:

So with that, and as we thought about this data explosion, this opportunity for driving new insights and new innovation with data, let's think a little bit about and talk a little bit about the practical opportunity when we think about the Azure products that are at the forefront of data management. So let's think about how people can get up and running, and it's obviously not just databases or storage but also data management technologies and techniques.

 

Paul Maher:

So for the listeners out there, I think they can totally empathize and I'm sure are tuning in listening to all of this opportunity and all of the challenges. How can we help? How do they make it real? How do they get up and going?

 

Todd Singleton:

One thing I'm particularly excited about, and many of us have heard stories of companies getting great insights out of their data, and that continues to happen, more and more that's an imperative versus a nice-to-have, right? Companies really need to take their data and make it a tangible asset and tie that asset to real returns on investment. Particularly as I talk to different chief data officers, that's more and more a part of their mandate. It's no longer just about protecting their data or just about compliance, but it's how do I turn this incredible enterprise asset into real tangible value?

 

Todd Singleton:

So obviously analytics, AI, machine learning are a key part of that. I'm really excited about the overall architecture and what we're doing around Synapse. Synapse is an end-to-end solution, if you will, for analytics. It was announced in November of 2019. It's in public preview now, and we continue to make a tremendous amount of progress. The demand for it was tremendous. Part of the reason for that was the product team spent a lot of time really trying to understand the problem: What are some of the problems, some of the felt pains that enterprises feel as they extract value out of their data, as they try to extract intelligence and insights out of their data?

 

Todd Singleton:

One of the key things, which is not so glamorous, is the fact that... the data wrangling issue. Wrangling all the data together, putting it in the right state, cleaning it, making sure it's quality, this, that and the third, takes a long time. It's very expensive, typically, for an enterprise. In terms of the tools necessary during that data wrangling phase, it's a mishmash, a hodgepodge of pulling together different systems.

 

Todd Singleton:

One of the core notions behind Synapse was the idea of let's bring all those tools together in one integrated solution so the end-to-end experience of pointing at the sources of data, to extracting value out of that data is one continuous experience within one product, within one security context. So that's really the power of Synapse. So that's one product, to answer your question, that is focused on helping companies derive value from their data asset.

 

David Starr:

I'd like to check and see if I heard the whole picture of Synapse properly, and that would be that you're able to point at some different, disparate data sources, maybe run that through some ETL or transformation to get it into a shape that you'd like, storing it maybe in a data lake or some analogous data store, and then finally maybe making a more structured data warehouse out of that for analysis purposes. Is that what I'm hearing?

 

Todd Singleton:

That is what you're hearing. So we have products like Azure Data Factory, which really helps with creating those pipelines from various data sources into the lake, for example, and performing transformations and the like. Synapse is certainly a big bet on the idea of an enterprise data lake, where that data lake is a limitless scale. Synapse is not forcing you to put your data in a proprietary format. Synapse is not trying to be a gatekeeper to your data, if you will. It's really more about how do we take those really popular execution environments like [SQL 00:14:36], like Spark, how do we take that to your data lake?

 

Todd Singleton:

So it's a big bet on a data lake as the center, that's your data, but then bringing all those value added services like Azure Data Factory, like those execution environments, to the data. So, yes, you heard that correctly.
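[Editor's sketch, not part of the recording.] The flow David describes and Todd confirms (point at disparate sources, transform the data, land it in a lake in an open format, then bring a SQL engine to it) can be illustrated with a small, self-contained Python example. It is only a sketch under stated assumptions: a temporary directory stands in for the lake, JSON Lines stands in for an open file format, and an in-memory SQLite database stands in for the execution environment. None of this is the Synapse or Azure Data Factory API, and every name and sample record below is hypothetical.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical raw records from two disparate sources with different shapes.
CRM_ROWS = [{"Customer": " Acme ", "Amount": "120.50"},
            {"Customer": "Globex", "Amount": "80"}]
WEB_ROWS = [{"cust": "acme", "total_cents": 3000}]


def ingest():
    """Transform both sources into one shape: customer name and amount."""
    for row in CRM_ROWS:
        yield {"customer": row["Customer"].strip().lower(),
               "amount": float(row["Amount"])}
    for row in WEB_ROWS:
        yield {"customer": row["cust"].strip().lower(),
               "amount": row["total_cents"] / 100}


def land_in_lake(lake_dir: Path) -> Path:
    """Write the cleaned records to the lake as JSON Lines (an open format)."""
    path = lake_dir / "sales.jsonl"
    with path.open("w", encoding="utf-8") as fh:
        for record in ingest():
            fh.write(json.dumps(record) + "\n")
    return path


def query_lake(path: Path):
    """Load the lake file into a SQL engine and aggregate per customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    with path.open(encoding="utf-8") as fh:
        rows = [(r["customer"], r["amount"]) for r in map(json.loads, fh)]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT customer, SUM(amount) FROM sales "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
    conn.close()
    return result


def run_pipeline():
    """Run source -> transform -> lake -> SQL query end to end."""
    with tempfile.TemporaryDirectory() as tmp:
        return query_lake(land_in_lake(Path(tmp)))


if __name__ == "__main__":
    print(run_pipeline())
```

The point of the sketch is the shape of the design: the lake file is the system of record and stays in a format any engine can read, while the query engine is a replaceable service brought to that data.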

 

Paul Maher:

What are the types of trends, the hot topics, the challenges they're facing? I'd love to hear that, and I'm sure our audience would as well. And what's your perspective of how to start thinking about solving some of those top of mind challenges?

 

Todd Singleton:

There are several challenges. I'll point out a few that are particularly interesting to me. My team, when we think about solutions, we think about data as the raw material. So I think what's really exciting is to see companies who know that to derive insights from their own first-party data, they often have to enrich that data with third-party data, but where can they find that data? What's the value of that data? How much do they pay for that data? Is it open and free data? How do they bring that together such that they can derive greater insights needed to make business decisions?

 

Paul Maher:

Imagine I have an estate of data across different data stores. What are your thoughts on either rationalization and migration strategies to move the data around per se, versus using a strategy of trying to access the data where it's at? So it's thinking about data wrangling, manipulation and migration, and then dealing with it in one location, and the cost of that, versus the opposite which is, can you traverse different data stores and manipulate the data that way? What are your thoughts on that?

 

Todd Singleton:

Yeah. I hear where you're coming from. We hear that from customers a lot: "I really want to move my data less. I'm constantly moving my data around to support different use cases, to support different vendor systems," et cetera and so forth. That's why we're making this big bet on the idea of the enterprise data lake, and in many ways the extended data lake, where it's logical. We're not necessarily talking about moving all the data into one place, but the fact that you can access the data lake and reach data where it resides. So there are a number of efforts in-house to address that. I can talk about some of them.

 

Todd Singleton:

One, for example, is the idea... it's something that's really, really exciting. You have the operational data stores versus the analytics data stores. So this has been a huge problem for a number of companies: Perhaps I need to run analytics against my operational data store. So I have two options, typically and historically. I can move the data, like we just talked about, to my analytics store, so there will be some latency. There's obviously some costs associated with moving that data. Then I can run analytics across that data, no problem.

 

Todd Singleton:

Or, if I don't want to incur that cost and that latency because perhaps I need real time or near real time insights from the operational data store, I can run my analytics right on that operational data store. Now the problem with that is, typically, I will incur a performance hit, and some companies can't afford that performance hit to their operational store.

 

Todd Singleton:

So one thing that we've done, which is absolutely mind-blowing to me, is we have the first ever HTAP, Hybrid Transactional Analytical Processing engine, meaning we've solved that problem. You can run analytics against an operational store without incurring that performance hit. So you get near real time analytics against that operational database. Right now that's been implemented. It was announced in May at Microsoft Build. It's in private preview, I believe. It's called the Synapse Cosmos DB Link. Cosmos DB is that operational store, Synapse is that analytics processing engine.

 

Todd Singleton:

So really, really exciting, and moving forward you can expect us to support additional operational stores like SQL, like [Postgres 00:28:38], like MySQL. So, to me, that is an absolutely phenomenal step in the right direction as it relates to not having to always move and copy that data. So that's just one example.
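[Editor's sketch, not part of the recording.] The trade-off Todd lays out (copy the data and accept latency, or query the operational store and accept a performance hit) is often addressed by keeping a separate analytics-side copy that is fed incrementally from the operational store's writes. The Python sketch below illustrates that general pattern only; it is not a description of how the Synapse and Cosmos DB integration is implemented, and the class and method names are hypothetical.

```python
from collections import defaultdict


class OperationalStore:
    """Key-based store serving transactional reads and writes."""

    def __init__(self):
        self.rows = {}          # key -> latest document
        self.change_feed = []   # append-only log of every write

    def upsert(self, key, doc):
        self.rows[key] = doc
        self.change_feed.append((key, dict(doc)))

    def get(self, key):
        return self.rows[key]


class AnalyticalStore:
    """Analytics-side copy kept up to date from the change feed."""

    def __init__(self):
        self.latest = {}   # key -> latest document seen so far
        self.offset = 0    # how much of the feed has been consumed

    def sync(self, operational):
        """Apply only the changes that arrived since the last sync."""
        for key, doc in operational.change_feed[self.offset:]:
            self.latest[key] = doc
        self.offset = len(operational.change_feed)

    def total_by(self, group_field, value_field):
        """Aggregate over the copy; the operational rows are never scanned."""
        totals = defaultdict(float)
        for doc in self.latest.values():
            totals[doc[group_field]] += doc[value_field]
        return dict(totals)


def demo():
    ops = OperationalStore()
    ops.upsert("o1", {"region": "east", "amount": 10.0})
    ops.upsert("o2", {"region": "west", "amount": 5.0})
    ops.upsert("o1", {"region": "east", "amount": 12.0})  # later update wins
    analytics = AnalyticalStore()
    analytics.sync(ops)
    return analytics.total_by("region", "amount")


if __name__ == "__main__":
    print(demo())
```

Because the aggregate runs against the synced copy, transactional reads and writes on `OperationalStore` are not slowed by analytical scans, and how fresh the insights are depends only on how often `sync` runs.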

 

Paul Maher:

Yeah, that makes a lot of sense. I'm just going to say out loud again what you've just said, which makes a lot of sense, is that logical data lake. So it's not necessarily a data lake that has [inaudible 00:29:02] data that's been moved. It could be traversing across into [inaudible 00:29:06] data where it resides, which is fantastic. The HTAP announcement, that's being able to minimize or mitigate that overhead, is fantastic.

 

Paul Maher:

Now you mentioned Cosmos DB there. I just want to put a plug in there. So Cosmos DB, our geo hyperscale solution. Could you talk a little bit more about that for listeners who maybe aren't familiar with Cosmos DB?

 

Todd Singleton:

So the simple way to think about Cosmos DB is that that is our NoSQL store, massively scalable. It's also multi-model. So it can act as a graph. There's a MongoDB API. There's [inaudible 00:29:46].

 

David Starr:

I have an observation about Cosmos as a developer that I've really enjoyed, and that is that no matter what stack I'm on and no matter what database I'm used to working with, there's probably a similar way to work with Cosmos. So I can come to it as a SQL database. I can approach it as a graph database. I can approach it as a Mongo database. But I'm getting access to the same data regardless.

 

Todd Singleton:

Yeah. The multi-model aspect of Cosmos DB is certainly powerful and we're seeing... I believe the MongoDB API is a fast-growing way of consuming Cosmos DB. [Cassandra 00:30:25] is something we're really excited about. Cassandra has a huge on-premises install base. So as companies think about migrating to the cloud and if they have a huge... if they're big Cassandra users, they can just quickly adopt Cosmos DB in the cloud. So it's a really powerful engine, and highly performant, highly scalable, and multi-model, as you suggested. So it's one of our more impressive offerings.

 

David Starr:

Now let's take a moment out to listen to this very important message.

 

Speaker 4:

Did you know the Microsoft commercial marketplace allows you to find and purchase leading Microsoft certified solutions from Microsoft partners? The Microsoft commercial marketplace includes Microsoft AppSource and Azure Marketplace. Each storefront serves unique customer requirements and different target audiences so publishers can ensure solutions are available to the right customers.

 

Speaker 4:

For applications that integrate with Microsoft 365 products, visit appsource.microsoft.com. Get solutions tailored to your industry that work with the products you already use.

 

Speaker 4:

For B2B Azure-based solutions, visit azuremarketplace.microsoft.com. Here you can discover, try and deploy the cloud software solutions you want.

 

David Starr:

As data management has evolved and matured over time, what Azure services specifically maybe that we haven't mentioned do we see customers, and particularly partners, increasingly leveraging in their solutions? I'm thinking about things like... I know we have streaming analytics, and we touched on that just a little bit, but can you tell me how that visualizes for me? Does that come in through Power BI, for example? Can I see, say, near time analytics on my fleet of vehicles out there that are sending data from my OT devices, something like that?

 

Todd Singleton:

Yeah, that's a really good question. We have a big portfolio with over 40 products and services. They're in different stages of maturity, if you will. SQL DB is 25 years old, incredibly mature, has a huge [inaudible 00:32:32]. We have a number of emerging services like Azure streaming, data streaming, like Azure Data Share, Data Factory, and those products sometimes don't make the headlines as much but they're being consumed in an increasing fashion. A number of customers are making use of the streaming product, for example, for IoT-type scenarios, and that's really exciting.

 

Todd Singleton:

I mentioned Azure Data Share earlier, but we're seeing a tremendous need for companies to have a strategy and plan for sharing data, both top-in data as well as sharing data in place. So that's an exciting one as well. Then Azure Data Factory just continues to grow, as a way of really creating CI/CD pipelines and transforming data into the correct state from one place to another.

 

Todd Singleton:

So it's a very rich portfolio, to the point that you're making, and it's been really interesting to see customers consume a wider set of products and services from our group.

 

David Starr:

Can you share maybe some specific instances where Microsoft data technologies have helped make a real difference for a client partner, or maybe even across an entire industry?

 

Todd Singleton:

You know, there's so many. I'll share about a few that have really been exciting over the past week even. One I'll quickly share about, and then I'll elaborate on another.

 

Todd Singleton:

I guess one customer, a retail customer out of southeast Asia, it was just really interesting to see how, in this COVID environment, they were leveraging data that we were providing, obviously their own data, and it completely transformed how they think about... they were able to extract insights from the data by using our data warehouse, in combination with HDInsight. They were able to find insights that told them, in this type of environment, they needed to do something, which was an annual sale that they run every year. Instead of running it every year, they're now running it every month. So just the fact that it really helped them with their decisioning process was something that we were particularly proud of.

 

David Starr:

And it drove a fundamental change to their business.

 

Todd Singleton:

A fundamental change to their business. Insights directly from data, and those insights found by leveraging our tools. So that's something we're really proud of.

 

Todd Singleton:

That's a commercial example, but I live in northern California. I'm from the east coast, from Maryland. What's going on in the United States in terms of... obviously COVID is a really challenging time for us, but also, if you look at what's going on around social justice, I feel completely excited about working here in Microsoft. The work that we're doing in Microsoft all up, as it relates to equal justice, in the Equal Justice Initiative. [inaudible 00:35:30] put out a letter about a month ago saying, "Hey, we're really trying to help with equal justice for all," which is baked into some of our founding documents as a country.

 

Todd Singleton:

So some of the work I see that's happening is around data because when it comes to equal justice for all, the role of data is paramount. Paramount. Let's look at equal pay, for example. This is a well-known problem. But let's say we have a gender inequity in pay. If we look at a single instance of this, it's hard to derive that this is really a problem, but if we zoom out and aggregate the data around pay, it is clearly a problem, and you can't argue against it.

 

Todd Singleton:

Well the same thing occurs when we look at racism here in the United States. If we look at a micro instance of bigotry, or what have you, it is what it is, it's a bad situation, a bad instance. But to really illustrate systemic racism, you have to zoom out and aggregate data, and then the counter arguments quickly go away.

 

Todd Singleton:

So the work that's being done around the Justice Reform Initiative, JRI, we're leveraging data to illustrate these inequities in the system, like these disparate arrest rates, or the sentencing inequity that occurs across different demographics. When you're able to zoom out and aggregate the data, there's a huge problem, and it's more difficult to argue against that. So once we get agreement about the problem, then it's a lot easier to forge solutions.

 

Todd Singleton:

So we're working with a number of partners who focus in this area, and we're saying, "Hey, we have this incredible asset, in terms of our products, analytics product, database product, you name it." We also have expertise and a lot of really great people that want to raise their hand and volunteer and help because most of these organizations are underfunded and have a hard time finding resources.

 

Todd Singleton:

So working with some of these groups, leveraging our products, leveraging our expertise, to just bring out the data, not trying to be partisan in any way, let's surface the data and see what the data has to say about inequities here in the United States. So that's probably the most exciting work I see happening right now, at least for me personally. So I appreciate you asking that.

 

David Starr:

It is for me too, I have to admit. I'm really proud to be working for Microsoft, a company that's making such an effort in these areas. It's made me feel good about my job and what we do. So I really join you in feeling good about what we're doing as a company.

 

Paul Maher:

Thanks both, and great discussion of course. Todd, we've come to an end, but this should be the beginning of the conversation for our listeners. So what we like to do is get a close on things that are top of mind for you, and a call to action so we can continue the conversation.

 

Paul Maher:

So for our listeners out there, of course we'll share some links and so on when we publish the podcast, and so you'll see that on our site, but what's top of mind for you, Todd, for our listeners? If they were to continue the conversation and go there more, any call outs that are top of mind for you?

 

Todd Singleton:

What's top of mind for me is really trying to, as I stated earlier in the call, how do we really help facilitate the development and the go to market for data-centric solutions? I think the brave new world is one where those business outcomes are key for enterprises. They want to know: Can you move my KPIs? Can you affect my OKRs, my key results?

 

Todd Singleton:

So making sure that our portfolio, which is phenomenal, the differentiated value proposition can bubble up through those business outcomes is top of mind for me, and that's a wide charter, but that's what we're constantly thinking about: How do we work with our system integrators, how do we work with the ISVs, how do we work with the data providers so they can leverage these amazing tools that we have and drive very clear business outcomes for our customer?

 

Todd Singleton:

So in terms of what are some of those outcomes that we're prioritizing, that takes us right back to those industry priority solutions that we publish right there on our website. That gives you an idea of the priorities as it relates to these outcomes across various industries, be it financial services, healthcare, retail, manufacturing, government. There are many, and a lot of great work's being done across the board.

 

Paul Maher:

Awesome. I agree. The industry priority [inaudible 00:40:39] solutions are fantastic. They provide really Microsoft's vision of our perspective on the industry, and also the opportunity that we have from a technology and a partnering point of view. So thanks so much for sharing that, Todd.

 

Todd Singleton:

Absolutely. I really appreciate you having me on the show. It's been a good time here. We've had a good conversation. I'd love to come back as my team continues to progress, and I can share an update as we work towards any number of the projects that we're engaged with, FaceTime FaceTime David. We're going to get there.

 

David Starr:

[inaudible 00:41:13]

 

Todd Singleton:

So I appreciate it. Thank you very much.

 

David Starr:

I want to also thank you for being on the show. It's been an absolute pleasure, and I look forward to next time.

 

Todd Singleton:

Great. So do I.

 

Paul Maher:

Thanks, everyone. Bye, bye.

 

David Starr:

Thank you for joining us for this episode of the Azure for Industry Podcast, the show that explores how industry experts are transforming our world with Azure. For show topic recommendations or other feedback, reach out to us at industrypodcast@microsoft.com.