In a nutshell, here is the problem: roughly 200,000 data science graduates become available each year, while almost half a million new data scientist jobs are posted. Do the math: that leaves a gap of nearly 300,000 positions every year. This shortfall has companies poaching top people from one another, trying to fill all those empty chairs. Money is not much of an object when it comes to getting the best people because no matter how much you pay them, they will generate much more. The base salary in this area is ~$120,000.
Did you miss it?
If you have somehow remained oblivious to what has happened in the last few years, here is the news: all that data you have about your customers’ preferences, sales curves, trend plots, innovation, growth, recession, and R&D is better than cash. To put it succinctly, data is the new currency.
Since there are not enough data scientists to go around, there is no real choice. You are going to have to hook up with a firm that provides data analysis. These forward-looking companies have secured a stable of the most brilliant minds in the data analysis field. Combined with Artificial Intelligence (AI) analysis, they can sort and sift data at a phenomenal rate to reveal all the things that are too subtle for the human eye.
As a graphic example, data scientists can program AIs to examine x-rays, CAT scans, MRIs, PET scans, and any other form of medical imaging you can imagine. In this form, they are known as Expert Systems. The AIs are many times more accurate than human eyes at finding anomalous correlations, primarily because if a couple of subtle oddities are more than a few inches apart, the human brain fails to make the connection. The AI, on the other hand, is relentless: it never blinks, sleeps, or gets tired, and it can easily differentiate between a couple of spots that are the same shade of gray on opposite edges of the x-ray.
Who is going to help you mine your data?
I bet you were thinking Google, and you are correct that they are experts in data analysis, because that is exactly what search engines do. By the same logic, it only makes sense to include Yahoo as well. And for those people who don’t like “oo”s in their search engine names, you can add Bing or Ask.
So there are four companies, right off the top, that are really good at sorting data—they have to be—their whole business revolves around sorting information. They are pretty good candidates to consider for data sorting. Are they the best?
Surprisingly, a really strong contender for data sorting is Facebook—and by extension—LinkedIn as well. Both of these companies have a strong presence in artificial intelligence and data science. Don’t tell me you have not wondered how Facebook places all those ads which interest you on your home page, or how LinkedIn keeps presenting you with people to hire or job opportunities which make sense. For that matter, how does YouTube know which video you want to watch next?
All that information comes from data science but, believe it or not, most of those companies are newcomers to the field. Other organizations using data science include the familiar Tumblr, Instagram, Twitter, and so on. So who led the charge in the days of yore?
Another organization, Wildfire, provides consulting services to help you work out your whole Data Strategy. They’re no strangers to Predictive Analytics, Machine Learning, Deep Learning, Artificial Intelligence, and manipulating Big Data to generate enhanced correlations that give your data true business value.
Wildfire even possesses wide-ranging experience with the Google Cloud, Amazon Web Services, and MS Azure platforms. This versatility, combined with highly capable Open Source tools, makes it much easier to cope with disparate data, whether video, audio, web-scraped, or your own databases. With OCR (Optical Character Recognition), even your paper documents can be included in the analysis. Learn more about Wildfire here.
As for who led the charge: good old International Business Machines (IBM) started by wondering if they could build a computer (Deep Blue) that could beat a chess master. After thousands of hours and many dollars, they did it in 1997, when Deep Blue defeated world champion Garry Kasparov.
Around 2004, IBM decided to take on the much more complicated task of building a sophisticated supercomputer that could understand the puns, double entendres, and subtleties of the clues on the TV game show Jeopardy!, and could do so against the world’s best human player, Ken Jennings. The result was a 2011 contest between Jennings and the Watson computer, and a victory for machine over man.
IBM provides some free basic instantiations of Watson for you to use on the internet. They are programmed differently, depending on need, or you can buy your own copy and program it however you wish. In fact, some of the best things to come out of the advances in data science are publicly shared data, open source programming, shared technology, and cooperation in the AI and data science community.
Chip manufacturers such as ARM, Intel, AMD, Qualcomm, and many more have jumped on board. They’re cooperating to build compatibility platforms between systems instead of fighting about who will dominate. It is a really pleasant change of pace. It also means that people who learn on one system will quite likely be able to use other systems with relative ease.
SAP and Oracle are also integrating data science components into their software packages, partly to integrate with AI packages that already exist, and partly to make data science more accessible to their customers.
Actual Data Handling
You have almost certainly heard of SaaS (Software as a Service), PaaS (Platform as a Service), and IaaS (Infrastructure as a Service). Well, get ready for DSaaS (Data Science as a Service). Some companies and organizations process data on your behalf, or act as intermediaries to get the job done.
That means that you don’t have to pay “an arm and a leg” to hire your own data scientists. DSaaS will allow you to get your Big Data sorted, manipulated, massaged, and analyzed without a single additional hire. Outsourcing will be the “new normal” until we finally have enough data scientists to go around. Worried about who gets to look at your data? Don’t let that worry you.
These companies have integrity agreements guaranteeing that your information remains your own and is analyzed in context. You upload it to either a cloud installation or a specific Big Data platform, and then the company’s data scientists can get to work on it.
Your data is not shared. However, sometimes they will offer lower costs if it is anonymized and shared in aggregate form, because that helps cultivate more accurate results for people with related questions. You, in turn, benefit from more widely sourced data, too. You’ll have to work out what you or your organization is most comfortable with.
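If you are wondering what "anonymized and shared in aggregate form" might look like in practice, here is a minimal sketch in Python. It is only an illustration, not how any particular provider does it: the records, the field names, the `aggregate_by_region` helper, and the `min_group_size` cutoff are all made up for the example.

```python
from collections import defaultdict

# Hypothetical customer records; every field name here is an assumption.
records = [
    {"name": "Ann Lee", "email": "ann@example.com", "region": "West", "spend": 120.0},
    {"name": "Bo Cruz", "email": "bo@example.com", "region": "West", "spend": 80.0},
    {"name": "Cy Dunn", "email": "cy@example.com", "region": "East", "spend": 200.0},
]


def aggregate_by_region(rows, min_group_size=2):
    """Drop personal fields and return per-region summaries.

    Groups smaller than min_group_size are left out so that a single
    customer cannot be picked out of the shared summary.
    """
    groups = defaultdict(list)
    for row in rows:
        # Only the region and the spend are kept; name and email are never copied.
        groups[row["region"]].append(row["spend"])
    return {
        region: {"customers": len(spends), "avg_spend": sum(spends) / len(spends)}
        for region, spends in groups.items()
        if len(spends) >= min_group_size
    }


summary = aggregate_by_region(records)
print(summary)  # {'West': {'customers': 2, 'avg_spend': 100.0}}
```

Notice that the East region disappears from the summary because it contains only one customer. Real anonymization usually takes more care than this, so treat it as a starting point for the conversation with your provider.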
In any case, they can analyze any data, say for marketing campaigns, online advertising effectiveness, deep-diving into your customer records to see which offerings are most likely to generate a response on an individualized basis, or determining why a competitor’s product sells better than your own. Between them, AIs and data scientists can figure out just about anything you want to know.
Ultimately it boils down to what we began with. The simple fact of the matter is that there aren’t enough data scientists. That situation is not going to be alleviated anytime soon by the looks of things. There hasn’t been a lot of promotion of the idea that we need more data scientists. It takes time to convince high school graduates that they want to get into data science in the first place. Then it can take a few years for them to work their way through college, further delaying their arrival on the business scene.
It’s a situation much like when humans first started to live in small communities. Not every family froze its own meat, baked its own bread, ground its own grain, or sawed its own wood. There were community ice houses, grain mills, sawmills, and bakeries. It was simply impractical for everyone to have their own because there weren’t enough resources.
That is what is happening right now. One day we’ll all have our very own data scientists, but until they become less scarce, we’re going to have to learn to share. And if the truth were told (or admitted), the economy of scale means we’re all saving money in the meantime.
So let’s not covet our competitors’ data scientists. We can upgrade the skills of people we already have on hand, and since they’re already integrated into our company culture, that (once again) saves us money. Meanwhile, it’s time for everyone to find a DSaaS provider because, as we all now know, Data IS Money!