Podcast, Part III: Bridging the Gap Between the Art and Science of Data Analytics
In the final episode of this three-part series, Jacey Heuer returns to continue the conversation on Data Science and how it needs both a scientific and artful approach.
Science is iterative testing: results change over time as variables change. For data science, what’s true today could change dramatically or incrementally tomorrow based on one variable. The art of it is accepting that there will be exponential opportunities to discover more, learn more, and communicate more to find value and purpose in data.
This final episode with Jacey Heuer provides insights into how individuals can seek opportunities in this field and how organizations can purposefully mature data science and advanced analytics.
Read the Transcript
0:00:58.1 ME: In our third session with Jacey Heuer, he helps us bridge the gap between the art and science of data analytics. We discuss what is required of people and organizations to explore, adopt, implement, and evolve today’s data science practices for themselves and their organizations.
0:01:18.2 Jacey Heuer: And so I really look at this as, again, bringing it back to science and art. Science gets you to the insight, the art then is how you tell that story and paint that picture to create comfort with some of that uncertainty that you’re now revealing in your data.
0:01:35.2 ME: So as it relates to individuals and organizations and the adoption of a more formal data behavior, through your experience and your perspective, the study, the work that you’ve done, how do we make this a normal, common daily conversation for people and companies instead of this emerging knowledge area that some people are studying?
0:02:05.9 JH: You’re right, the passion is a key component of this, right? I think passion across anything you’re engaged in is important to be able to find, and finding your passion is a true driver and motivator. Mine is learning, happens to be with data science, and those kind of come together well for me. Just going a bit deeper into my personality with this too, is data science, as much as there’s science involved in it, there’s a lot of art involved in it. Personally for me, my background, I have an art background as well, in my past. When you think about left brain, right brain, creativity, logical, all that kind of stuff, it’s usually more binary, more definitive, and for whatever reason, I have some bit of a crossover in that. I can find enjoyment in both sides of that and it works well for me with data science, but what I think about from the standpoint of trying to wrap your brain around, what does this mean, how do I gain comfort in sort of the mindset that it takes to deal with and feel okay with ambiguity, uncertainty, right?
0:03:11.9 JH: I think so much and so often in business, which rightly so, it’s, I want to know definitively, 100% accuracy, what’s gonna happen in the future and so on. That’s a fair mindset, and I think there’s a lot of good leaders and people who realize that’s not possible, and you make your own decision too. Given the information I have at hand, what’s the best decision I can make, and you go with that. Data science is really taking that human decision process, which you’re already dealing with, uncertainty, whether you’re aware of it or not, and just putting more support to quantify some of that unknown through data. And that does require a new mindset of, the information I’m taking in may become more broad because I’m getting more data supporting the breadth of my decision-making, but then that also then becomes the realization and vulnerability of really seeing the uncertainty in the decisions I’m making, distilling those in my mind, that uncertainty in a way that I may not be aware of, but now because that data’s present, I’m aware of that uncertainty and becoming more potentially concerned with that uncertainty.
0:04:23.9 JH: And that’s where the side of the data scientist becomes vital and important, it’s storytelling. And so how do you tell that story and manage the uncertainty that you’re now highlighting to a leadership or an individual that they might not have been aware of before? At least consciously aware of that is maybe the better way to state that. And so I really look at this as, again, bringing it back to science and art. Science gets you to the insight, the art then is how you tell that story and paint that picture to create comfort with some of that uncertainty that you’re now revealing in your data.
0:04:56.1 ME: That’s similar to just about any career, I imagine, but I know explicitly in the technology side of things where there can be absolutely fabulous software developers who have not yet discovered that they have to also be able to communicate the goal and the journey and the value and manage that message and I wonder if that’s not a learned behavior for any human, but the fact that you’ve articulated the relationship between art and science all as the same collective responsibility, that’s really powerful.
0:05:37.1 JH: Science inherently is journeying into the unknown. Science is meant to constantly test and retest and so on. That’s what good science is, but there’s rigidity, there’s a tool belt that can be applied to that testing. It’s a known set of tools, generally. The art side, the learning there comes through experience, comes through vulnerability, comes through the willingness to test out, does this… From a data science perspective, does this plot with the dots on it mean more than the plot with the lines on it, does the bar chart mean more than the pie chart and so on and so forth, and how do I combine those together to get that message across, and at the same time, beyond the visual, it’s… Your written and verbal communication as well becomes essential ’cause you’re the one creating the confidence in this new idea that you’re bringing to the business.
0:06:37.7 JH: You’re bringing across… A good example I have would be the concept of distribution density plots, so it’s a very statistical term, basically all it is, you think about a normal distribution bell curve, it’s putting some statistics to that bell curve, just for example. How do you convey what that means to someone that has no statistics background? When you say the word density plot, their eyes glaze over. Being able to distill that down to elementary terms, do it in a way that gets your point across and drives the decision that, I think requires just stepping into the arena, finding and seeking out bits of that opportunity to challenge an idea, challenge a mindset with some data-driven visual, some data-driven insight and put it out there and see what happens. Again, science versus art, science, I think you can practice, you can get through history of defined techniques. Art is more, what works, I just have to try it.
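To make the density plot idea concrete, here is a minimal Python sketch of a bell curve described in plain terms. The delivery-time scenario and its numbers (a mean of 10 days, a standard deviation of 2 days) are assumptions made up for illustration, not figures from the conversation. It uses `NormalDist` from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Hypothetical example: delivery times averaging 10 days, give or take 2 days.
delivery = NormalDist(mu=10.0, sigma=2.0)


def share_between(dist, low, high):
    """Fraction of outcomes expected between low and high (area under the curve)."""
    return dist.cdf(high) - dist.cdf(low)


within_one_sd = share_between(delivery, 8.0, 12.0)
within_two_sd = share_between(delivery, 6.0, 14.0)

# A crude text "density plot": taller bars mean more likely delivery times.
for day in range(4, 17, 2):
    bar = "#" * round(delivery.pdf(day) * 200)
    print(f"{day:>2} days | {bar}")

print(f"About {within_one_sd:.0%} of deliveries take 8 to 12 days")
print(f"About {within_two_sd:.0%} of deliveries take 6 to 14 days")
```

The last two lines are the kind of statement a non-statistical audience can act on: the curve itself stays in the background, and the message becomes how often an outcome lands in a given range.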
0:07:47.4 ME: So I will amplify that to walk into my next question. Your statement was just, “I have to try it.” And part of my curiosity from your perspective is, let’s talk about someone in an organization who’s just now discovering the whole field of data on purpose. Doing data on purpose. So we’re not talking about just your historical typical, “Let’s create a 2D plot in Excel and call it a day.” We’re talking about trying to understand multiple dimensions of many seemingly unrelated things that when put together may actually reveal something that would never have occurred to our minds, we wouldn’t have seen because we weren’t looking for those types of things. For someone that’s just now figuring things out saying, “Hey, I really think that this might be a thing, I want to look into this.” We’re assuming that they’re starting in kindergarten, they’re starting with near zero. Where would they go? How should people get involved, get their feet wet, jump in? What do you see? What do you know? What would you recommend?
0:08:57.0 JH: Luckily, especially within the last decade or so, the learning options online, the open free learning options online have accelerated vastly. Like with a lot of things, a Google search for data science is a good starting point. There’s a number of open free coding academies. Coursera’s a great one, Udacity, things like that, not to market for anything individual, but it’s starting there as just this data science road map. What do I need to learn? What are the foundation skills to kind of build on? And getting a sense of what the scope looks like, I think starting with just that Google search can help define what are some of these terms and areas of this space that pop up and begin to emerge, things like statistics and programming, R and Python and SQL and kind of this whole space, just starting there with that cloud of what’s out there, to me, is always a good way to begin any project. What is my space that I’m living in? Really then what’s probably been most useful to me, it comes down to learning some of the core concepts and technologies, and then seeking out opportunities to practice and apply those, even if you’re stumbling your way through practicing, applying those, start trying to force those into whatever you’re working on right now, and it may not be the solution for your project at hand, but can I take a sliver of it and make it work from a data science lens to build up my skill set?
0:10:34.7 JH: To really give a maybe more concrete answer to things to focus on, I think it’s… Traditional statistics is a great place to start, and again, there’s a number of resources that are great for that, just through a Google search, statistics being what is the difference between mean and mode and what’s your range, min and max, how do I define a distribution? Things like that. Starting there, then moving into probability, probability is a big concept in data science, machine learning, so getting your mind around that space. You don’t have to be an expert in it, but at least becoming familiar with terms of probability. Bayesian inference is another area that’s out there that goes hand-in-hand with probability as well. Those three areas, traditional statistics, probability and then Bayesian inference, which has a lot of probability in it, are three sort of core foundational areas of this space to be involved in. And then it’s moving into the technology side, so now you’ve learned and got a grasp on some of these statistical ideas, pick up R, Python. I’m an R guy.
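As a small illustration of those three foundations, the sketch below computes the descriptive statistics mentioned (mean, mode, min, max, range) and applies Bayes' rule once. The daily sales figures, the churn scenario, and the function name `bayes_posterior` are all hypothetical, chosen only to show the ideas in Python's standard library.

```python
from statistics import mean, median, mode

# Hypothetical sample: units sold per day over two weeks.
units = [12, 15, 12, 18, 20, 12, 15, 22, 17, 15, 12, 19, 25, 14]

summary = {
    "mean": mean(units),      # the average day
    "median": median(units),  # the middle day once sorted
    "mode": mode(units),      # the most common value
    "min": min(units),
    "max": max(units),
    "range": max(units) - min(units),
}


def bayes_posterior(prior, likelihood, false_alarm_rate):
    """P(hypothesis | evidence) for a yes/no hypothesis, via Bayes' rule."""
    evidence = likelihood * prior + false_alarm_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 10% of customers churn; a warning flag fires for
# 80% of churners and for 20% of customers who stay.
posterior = bayes_posterior(prior=0.10, likelihood=0.80, false_alarm_rate=0.20)

print(summary)
print(f"Chance a flagged customer churns: {posterior:.1%}")
```

The Bayesian step shows why probability matters in practice: even with a flag that catches most churners, a flagged customer is still more likely to stay than to leave, because churners are rare to begin with.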
0:11:46.3 JH: Python tends to dominate. Depending on your source, Python might be a little bit in front of R, it could go back and forth. Either one, the mindset I have is become an expert in one, but be familiar across both of them. ‘Cause you need to be able to operate on both sides, and either one of them, you can be working in R and you can leverage Python, you can be in Python, you can leverage R and go back and forth. There’s a lot of capability in the libraries and packages that are out there. And then as you develop the skill set of your technology, some of the base statistics, now start venturing into your machine learning, your AI. And depending on your source and your mindset, all of this really comes back around to developing the skill set to be an expert line fitter is what it comes down to. I say that kind of tongue-in-cheek, but really, anything you’re doing from a modeling perspective, it’s you’re taking your data set, which may be X number of columns wide, you can re-imagine that as being X dimensions in space, you have one dimension, two dimensions, three-dimensional space, which is what we all live in. You can plot three dimensions on a plot relatively easily, but as you go up into higher dimensions, you can’t really plot that.
0:13:06.4 JH: That’s where a lot of the mathematics come into play then, it’s how do you navigate a multi-dimensional space of data and be able to, out of that, to kind of, your thoughts earlier, Matthew, you distill meaning from something that in this multi-dimensional space, you can’t visualize and there’s no simple way to get your mind around it. That’s where machine learning and AI and stuff comes into play then. Those tools are effectively finding the pattern in that multi-dimensional space that lets you either split it up or pinpoint a data point and so on. So that’s kind of the foundational skill set I think I would focus on, thinking about it. And then from that, there’s subsets and offshoots, you get into TensorFlow and PyTorch and all these other things into the cloud, all that, but that’s the core of where you’d really start when you’re talking about “What do I need to get into and start learning to go down this path?”
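The "expert line fitter" idea can be sketched in a few lines of Python. The example below is an illustration under stated assumptions, not a method described in the episode: it invents a four-column data set with a known hidden pattern (`TRUE_WEIGHTS` and `TRUE_BIAS` are hypothetical), then recovers that pattern with plain gradient descent on mean squared error. Four dimensions cannot be plotted, but the fit still finds the line.

```python
import random

random.seed(0)

# Hypothetical "real" pattern hidden in a data set with four columns.
TRUE_WEIGHTS = [3.0, -2.0, 0.5, 1.5]
TRUE_BIAS = 4.0


def make_rows(n):
    """Generate n rows of (features, target) with a little random noise."""
    rows = []
    for _ in range(n):
        x = [random.uniform(-1, 1) for _ in TRUE_WEIGHTS]
        y = TRUE_BIAS + sum(w * xi for w, xi in zip(TRUE_WEIGHTS, x))
        rows.append((x, y + random.gauss(0, 0.05)))
    return rows


def fit_line(rows, learning_rate=0.1, epochs=500):
    """Fit y = bias + weights . x by gradient descent on mean squared error."""
    dims = len(rows[0][0])
    weights, bias = [0.0] * dims, 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dims, 0.0
        for x, y in rows:
            error = bias + sum(w * xi for w, xi in zip(weights, x)) - y
            grad_b += 2 * error / n
            for j, xi in enumerate(x):
                grad_w[j] += 2 * error * xi / n
        # Step downhill: move each parameter against its gradient.
        bias -= learning_rate * grad_b
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
    return weights, bias


rows = make_rows(200)
weights, bias = fit_line(rows)
print("Recovered weights:", [round(w, 2) for w in weights])
print("Recovered bias:", round(bias, 2))
```

Libraries such as TensorFlow and PyTorch automate this same loop at much larger scale and with far more flexible shapes than a straight line, which is one way to read the "line fitter" remark.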
0:14:01.9 ME: So you led with, “Look for opportunities,” and then after that, I believe you said, “You need to go learn some fundamental elements of statistics.” And there were three different areas you were focusing upon. Then, “Go learn about some of the technology.” Then after that, you were talking about how you can start to take the statistics plus the technology and start discovering, seeking or otherwise applying that. So you’re starting to become operational at that point. So the first two steps are really classes of preparation, if you will, classes of data, prepare steps, but you start to become operational after you have those two classes of things under your belt in terms of familiarity, experiential pursuit that type of thing. So really three big steps. What you just communicated is a time-based journey of course, but I think one of the most valuable things you may have said there is, ultimately you have to seek the opportunities, or this was just an academic exercise of reading about this, then reading about this and then tomorrow there’s new subjects.
0:15:11.2 JH: Very true, and really, the reason for that is, the space is so broad. I don’t think it’s unique to data science and this discipline, but there’s so many methods, so much research out there, problems are… There’s no standard, typically no standard problem. And so it’s really that process of, “I have a problem, now what are some methods that I can maybe force on that problem?” I tell you, I think the power… And again, I think this is common across many skills and disciplines, but it’s as you add breadth to your knowledge base, really a lot of the power you bring to your role as a… Your emerging role as a data scientist is not necessarily the expertise you have in a particular method or approach, but it’s the knowledge base you contain of what are alternatives to solving this problem. So now I have instead of one tool that I try to force onto this problem, I’ve got a selection of 10 tools that I can explore that space. I may not be an expert in all 10, but at least I know I can try 10 of those and find the one that seems promising and then really dig into that and become a deeper expert to solve that particular problem. That’s where, again as you step further into this career, your breadth of knowledge becomes greater and a lot of that skill set and value comes from, “I’m not a one-trick pony.” For lack of a better term, “I can pull from this tool set and find a better answer, the best answer.”
0:16:47.6 ME: Well, that is consistent with what you said earlier, which is, you’d like to be an expert in at least one, but functional and useful in both or all. To some extent, I can be an expert and a generalist, and that will take me further down the road than, “I have a hammer.”
0:17:05.0 JH: A lot of that, I think is just tied to the availability of information in this space. So I have the tools at my disposal to go and learn, and again, going back to some of the prior comments, having the passion to learn, being driven by some learning, identifying when you have that knowledge gap and then going, seeking out and learning that new tool set that previously you may have just been, kind of aware of, but now I know I might need it to answer the questions, so let’s go dig into that. Capitalizing on that motivation and building that knowledge from there, I think is essential as well.
0:17:44.7 ME: If I’m an individual, regardless of where I am on my career path, I’m new in my career, or I’ve been around for a while, or I’m in the later third of my journey, whatever it is, is really irrelevant. And if I’m an individual and I’m in a company and they’re not asking me, they’re not talking about any type of analytics, they’re not talking about BI, they’re not talking about any of this stuff. And I’m interested in doing this stuff, it’s probably on me to figure out, “Okay, where is my company? Where are they wanting to go? What problems do they want to solve? And how can I apply these things I’m exploring to proactively propose and find and encourage opportunities?” And that might actually be a wonderful journey, it could be a wonderfully educational journey, or it could be a tough journey in the event that you stand alone with that appetite to learn like that.
0:18:37.0 JH: That’s the reality. Whether you’re in a role that isn’t defined traditionally as a data scientist or data analyst, and you’re trying to spark your journey into that, and the organization hasn’t adopted it yet, or you’re in a role where you’re a data scientist in a larger data science team and the organization is fully invested in it. I think for many organizations, there’s still an education gap of what really is advanced analytics and data science and what are the questions that we need to leverage them to solve for us? How do we ask that question? When do we bring them in?
0:19:15.5 JH: I think that’s a universal continuous thing, and to solve that, it requires, again, the term vulnerability: the vulnerability and the willingness to push the idea forward as you continue to gain your knowledge, continue to gain insight and learnings, and bring those up to those in the organization who are the decision-makers, the project owners, whatever it might be, as, “Here’s a new way of thinking about this.” Likely, they may have heard of it, but probably haven’t heard what ML or AI actually means. I wouldn’t say imposing, but putting that perspective out there, making them aware of it becomes as much of your role as anything. If you want to bring that… Develop that skill set, and bring that impact to your organization, you really need to drive that thinking and drive the mindset shift that it requires to incorporate advanced analytics and data science into an organization.
0:20:11.3 ME: So if I’m a C-Suite leader, and I have all kinds of amazing responsibilities that go with my role in the organization, just like your role in the organization, and I’m feeling the pressure to make my numbers, and manage my market, and address the current economic situation, all of the things. And you’re the aspiring data person, and you come to me and say, “Hey, Matthew, I’ve been looking at this stuff. I’ve been studying some things. I have a couple of thoughts.” How would you approach me? What would you say to me? Not that I’m belligerent and stubborn and cranky, but rather I’m just on the move, and I’m looking for concrete chunks, if you will.
0:20:48.2 JH: It’s a great, great thought exercise and an important one. What’s been powerful for me, it’s showcasing… As you call it, showcasing the art of the possible, but doing it in comparison to current state. So being able to… Whatever your question is, just for the example here, showing, here’s the report, the current process, the current output, what it looks like now, and I’m delivering that to you, so I’m maintaining my relationship with you. I’m not falling short or anything like that, but I’m taking some of these new learnings, and it takes a time commitment, but passion should drive that, to now, let’s layer in a slice or two of something new on the side of that. Maybe I’m forecasting for next quarter for you. And traditionally, it’s just been… What happened last year, we’re gonna add some percentage to that year over year, and something very simple.
0:21:43.8 JH: And now I’m gonna go in and take its current state and enhance it by putting some confidence intervals on it, and giving better scenario analysis around, if you do X, we see Y. And start to tell that story of what’s the next level. And it may not be perfect, but you’re at least creating awareness of the capability that you’re developing, and bringing to the organization. And hopefully, through that beginning to create excitement around “Hey, I’m the leader, the executive. I could see the improvement here, let’s dig into that further.” And you start to get the wheel spinning and that progress rolling from that.
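A minimal sketch of that side-by-side comparison might look like the following. All figures are hypothetical, and the function name `forecast_with_range` is invented for illustration. The range is a rough one that assumes past growth rates are approximately normally distributed; with only a handful of years, a t-distribution or a proper forecasting model would be the more careful choice.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical year-over-year growth rates for the same quarter in past years.
past_growth = [0.04, 0.07, 0.02, 0.06, 0.05, 0.03]
last_year_sales = 1_000_000  # hypothetical figure


def forecast_with_range(base, growth_rates, confidence=0.95):
    """Return (point forecast, low, high) from historical growth rates."""
    avg = mean(growth_rates)
    spread = stdev(growth_rates)
    # z-score for the requested two-sided confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    point = base * (1 + avg)
    low = base * (1 + avg - z * spread)
    high = base * (1 + avg + z * spread)
    return point, low, high


# Current state: last year plus a flat percentage.
simple_forecast = last_year_sales * 1.05

# Enhanced state: the same kind of forecast, with a range around it.
point, low, high = forecast_with_range(last_year_sales, past_growth)

print(f"Current report:  {simple_forecast:,.0f}")
print(f"Enhanced report: {point:,.0f} (likely range {low:,.0f} to {high:,.0f})")
```

The point estimate barely changes; what the enhanced version adds is an explicit statement of how uncertain the number is, which is the part of the story a leader did not have before.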
0:22:20.6 ME: That’s very tangible. Here’s what we’re currently doing, here’s what we’re using it for, and what it seems to mean to us. Here’s what we could be doing, and here’s how it may actually add additional dimension or insight or view or value. That’s really good, that’s very concrete.
0:22:37.3 JH: It’s powerful, and I’ll say, what can be scary in that, fearful in that is, you have to put yourself out there again. I go back to this just because I’m not the stereotype, IT mindsets or data science mind… Personality, and things like that. But again, it’s not waiting for the business direction sometimes, but just taking a chance and stating, “I think if we did this, this could be the improvement.” And at least starting that conversation. It’s that awareness, that seed of awareness that becomes powerful, and it might not be right, but at least you’re creating visibility to a capability that either exists in your skillset or it can exist, and now starting that conversation.
0:23:24.6 ME: Well, let’s shift it a little bit then. So these companies that are starting to realize, “Hey, we need to be a little more aggressive, a little more assertive about what data, how data, when data. How can we get to where we really want to go, and how do we make this data thing work for us?” But if I’m a company, and I’m looking for people, where am I going to find people? If I don’t have people saying, “I’ve been thinking about this, I want to do this, and I’m starting brand new.” Where am I going to find these folks? Are there data conventions? And you guys are all hanging out like, “Pass the tea. Let’s talk about this.”
0:24:03.5 JH: Candidly, I don’t know if I have a proper answer for that or a great answer for that, other than I think in the space… Data science as much as… We’ve talked about the hard skills of data science, the art of data science, I think the other piece in there to be aware of, it’s the subject matter expertise for that organization that becomes essential. You could think of a diagram of this with those three elements in it. That subject matter knowledge becomes essential to really developing impact out of advanced analytics and data science for the organization. I think often for an organization to define success in this, it’s finding individuals that are again, driven by learning, have curiosity, and motivated to learn, preferably in this space, but having in place mechanisms that allow them to ramp up the business knowledge that they bring, that organizational knowledge. What product are you manufacturing, and what are the nuances of manufacturing that product? How does the sales team sell that product? That business knowledge and the nuances of that are key to success in data science.
0:25:24.0 JH: Using myself as an example, when I’ve joined an organization, I’ve tried to focus the first few months on just strictly relationship building. Finding that conduit into who are the people that represent the space in the organization, that can become my source of… My vessel of knowledge that I can tap into. Because when I’m working with data and trying to build a model, there’s endless questions around “Do I pull in column A or column B? Do I combine them? Do I create something entirely new? Does this mean anything?” Because what I think is meaningful in the data may be statistically significant, all this kind of stuff, but when it actually goes out to the field, you get feedback and that expert knowledge: “Well, we actually don’t operate like that, so your insight is meaningless.” If I can get that knowledge, or at least a representation of that, that’s where a lot of power exists, that my underlying skill sets, technical knowledge, storytelling abilities, all that stuff can come together, and leverage that subject matter knowledge. So I don’t know if I answered your question well Matthew, or not, but I think organizations developing pipelines or… Pipelines isn’t the best word… Environments that are conducive to that transfer of knowledge between the subject matter expert, and the…
0:26:38.3 JH: Data scientist, the advanced analytics, and those using the data. That knowledge sharing, I think is where a lot of that power resides.
0:26:47.2 ME: So that’s a way they can discover the value and use and help grow and foster a culture that grows people, but you didn’t yet tell me if there are conventions where there are data scientists like you all sitting in smoking jackets, having tea, discussing the latest algorithms over breakfast.
0:27:07.4 JH: So those do exist depending on your space and need and so on, right?
0:27:13.6 ME: Right.
0:27:14.9 JH: The term data science is just over a decade old in formality. If I’m remembering correctly, I think it’s credited with originating at LinkedIn as kind of where it started formally, and don’t quote me on that. A lot of the build-up and hype to this sort of where we are now with data science… Let me rephrase, not build-up and hype, but growth in this discipline and the rate of growth in this discipline over time started with the technology companies latching onto researchers that were presenting on neural nets, artificial intelligence, machine learning at their dedicated conferences. So one of the conferences that has been around for decades is called NIPS, N-I-P-S, it’s now NeurIPS is the new term given to it, but it’s all… What was up until a decade ago a conference attended by maybe a couple of hundred researchers off in kind of the corner, to now it’s annually attended by thousands of people that come to this. That’s where a lot of the original poaching occurred, these researchers brought from academia into practical application data science going forward now. That’s an extreme example.
0:28:37.8 JH: I think there’s many different organizations out there. I think TDWI is one, IIA, the International Institute for Analytics, and so on. There’s all these different organizations that, again, to your point, Matthew, it’s maybe not sitting around in smoking jackets and so on, but gatherings of analytics and analytic mindsets that bring a lot of talent together and a lot of skill sets together that can be sources of experienced skill sets, experienced individuals in these resources. And then to give credit to the universities. Again, over the last decade or so, more universities are offering more programs related to business analytics, data analytics and so on. That pipeline is filling up, becoming more robust, becoming more refined as well, and there are quality new grads beginning to come out of universities as more learnings are applied there.
0:29:35.8 ME: It’s a normal, normal problem. So educational institutions are themselves businesses or else they cease to exist. It’s not a free world here, so these folks have the responsibility and the desire and the goal to enable and equip and educate and all of the types of things. A reality though is the gap between learning these concepts to… Even illustrated by your earlier point, go learn about statistical things, whether it’s statistics in and of themselves, probability, Bayes, and all of those things. Then learning the tools that are the Python and anything else that makes sense, and then figuring out how to operationalize that and then starting to get into splinters. That’s a journey that has to be lived. Journeys aren’t ordinarily lived in college or university. Journeys can be enabled. The fact that universities are offering more and more data education is outstanding.
0:30:28.9 ME: But it’s fun to see how this is evolving. It’s fun to see where it’s going. To your point, 10 years, thereabouts, plus or minus, plenty of places to go on the web, many conventions to go to, seeing how it’s evolved from a small subset of researchers to a more populated thousands and thousands of people who are interested now. What a wonderful evolution of an idea that we’re getting to watch unfold right now. And then as far as what does it mean? Heck, that’s part of the whole challenge. What is it? When is it? What does it mean? How do we make use of it? This has been a phenomenal conversation with you, good sir. Thank you very much for taking the time to teach us about so very many aspects of the journey of data and your journey with data, and even very much thank you for taking the moment to just give some pointers to people who want to learn how to have a journey like the one you’re having. Thank you.
0:31:28.4 JH: Thank you, Matthew, and I couldn’t agree more with those thoughts that are… Right. It’s a great journey that this whole space and discipline is on, and there’s a lot of runway left in it. And because of the uncertainty, there’s a lot of room for creativity and impact to be had as more people venture out and become skilled in this space, as well. So it’s been a… I’ve enjoyed the conversation and learned more about myself and hopefully been able to share some good thoughts as well along the way, so thank you.