16 Useful Pieces of Advice for Aspiring Data Scientists

Why is data science sexy? It has something to do with the many new applications and entire new industries that come into being from the judicious use of copious amounts of data. Examples include speech recognition, object recognition in computer vision, robots and self-driving cars, bioinformatics, neuroscience, the discovery of exoplanets and an understanding of the origins of the universe, and the assembling of inexpensive but winning baseball teams. In each of these instances, the data scientist is central to the whole enterprise. He or she must combine knowledge of the application area with statistical expertise and implement it all using the latest in computer science ideas.

“What advice would you give to someone starting out in data science?”

1 — Chris Wiggins, Chief Data Scientist at The New York Times and Associate Professor of Applied Mathematics at Columbia
“Creativity and caring. You have to really like something to be willing to think about it hard for a long time. Also, some level of skepticism. So that’s one thing I like about PhD students — five years is enough time for you to have a discovery, and then for you to realize all of the things that you did wrong along the way. It’s great for you intellectually to go back and forth from thinking “cold fusion” to realizing, “Oh, I actually screwed this up entirely,” and thus making a series of mistakes and fixing them. I do think that the process of going through a PhD is useful for giving you that skepticism about what looks like a sure thing, particularly in research. I think that’s useful because, otherwise, you could too easily go down a wrong path just because your first encounter with it looked so promising.
And although it’s a boring answer, the truth is you need to actually have technical depth. Data science is not yet a field, so there are no credentials in it yet. It’s very easy to get a Wikipedia-level understanding of, say, machine learning. For actually doing it, though, you really need to know what the right tool is for the right job, and you need to have a good understanding of all the limitations of each tool. There’s no shortcut for that sort of experience. You have to make many mistakes. You have to find yourself shoehorning a classification problem into a clustering problem, or a clustering problem into a hypothesis testing problem.
Once you find yourself trying something out, confident that it’s the right thing, then finally realizing you were totally dead wrong, and experiencing that many times over — that’s really a level of experience that unfortunately there’s not a shortcut for. You just have to do it and keep making mistakes at it, which is another thing I like about people who have been working in the field for several years. It takes a long time to become an expert in something. It takes years of mistakes. This has been true for centuries. There’s a quote from the famous physicist Niels Bohr, who posits that the way you become an expert in a field is to make every mistake possible in that field.”

2 — Caitlin Smallwood, Vice President of Science and Algorithms at Netflix
“I would say to always bite the bullet with regard to understanding the basics of the data first before you do anything else, even though it’s not sexy and not as fun. In other words, put effort into understanding how the data is captured, understand exactly how each data field is defined, and understand when data is missing. If the data is missing, does that mean something in and of itself? Is it missing only in certain situations? These little, teeny nuanced data gotchas will really get you. They really will.
You can use the most sophisticated algorithm under the sun, but it’s the same old junk-in–junk-out thing. You cannot turn a blind eye to the raw data, no matter how excited you are to get to the fun part of the modeling. Dot your i’s, cross your t’s, and check everything you can about the underlying data before you go down the path of developing a model.
Another thing I’ve learned over time is that a mix of algorithms is almost always better than one single algorithm in the context of a system, because different techniques exploit different aspects of the patterns in the data, especially in complex large data sets. So while you can take one particular algorithm and iterate and iterate to make it better, I have almost always seen that a combination of algorithms tends to do better than just one algorithm.”
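
To make the ensemble point concrete, here is a minimal sketch (an editorial illustration, not part of Smallwood’s remarks) that compares three individual classifiers with a soft-voting combination in scikit-learn; the dataset and model choices are assumptions made purely for demonstration.

    # Minimal sketch: compare individual classifiers to a soft-voting ensemble.
    # The dataset and the choice of models are illustrative only.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB

    X, y = load_breast_cancer(return_X_y=True)

    # Three individually reasonable models that exploit different structure in the data.
    models = [
        ("logreg", LogisticRegression(max_iter=5000)),
        ("forest", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("nb", GaussianNB()),
    ]

    for name, model in models:
        print(name, cross_val_score(model, X, y, cv=5).mean())

    # The ensemble averages the predicted class probabilities of all three models.
    ensemble = VotingClassifier(estimators=models, voting="soft")
    print("ensemble", cross_val_score(ensemble, X, y, cv=5).mean())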

3 — Yann LeCun, Director of AI Research at Facebook and Professor of Data Science/Computer Science/Neuroscience at NYU
“I always give the same advice, as I get asked this question often. My take on it is that if you’re an undergrad, study a specialty where you can take as many math and physics courses as you can. And it has to be the right courses, unfortunately. What I’m going to say is going to sound paradoxical, but majors in engineering or physics are probably more appropriate than say math, computer science, or economics. Of course, you need to learn to program, so you need to take a large number of classes in computer science to learn the mechanics of how to program. Then, later, do a graduate program in data science. Take undergrad machine learning, AI, or computer vision courses, because you need to get exposed to those techniques. Then, after that, take all the math and physics courses you can take. Especially the continuous applied mathematics courses like optimization, because they prepare you for what’s really challenging.
It depends where you want to go because there are a lot of different jobs in the context of data science or AI. People should really think about what they want to do and then study those subjects. Right now the hot topic is deep learning, and what that means is learning and understanding classic work on neural nets, learning about optimization, learning about linear algebra, and similar topics. This helps you learn the underlying mathematical techniques and general concepts we confront every day.”

4 — Erin Shellman, Data Science Manager at Zymergen, Ex-Data Scientist at Nordstrom Data Lab and AWS S3
“For the person still deciding what to study, I would say STEM fields are no-brainers, and in particular the ‘TEM ones. Studying a STEM subject will give you tools to test and understand the world. That’s how I see math, statistics, and machine learning. I’m not super interested in math per se, I’m interested in using math to describe things. These are tool sets after all, so even if you’re not stoked on math or statistics, it’s still super worth it to invest in them and think about how to apply them to the things you’re really passionate about.
For the person who’s trying to transition like I did, I would say, for one, it’s hard. Be aware that it’s difficult to change industries and you are going to have to work hard at it. That’s not unique to data science — that’s life. Not having any connections in the field is tough but you can work on it through meet-ups and coffee dates with generous people. My number-one rule in life is “follow up.” If you talk to somebody who has something you want, follow up.
Postings for data scientists can be pretty intimidating because most of them read like a data science glossary. The truth is that the technology changes so quickly that no one possesses experience of everything liable to be written on a posting. When you look at that, it can be overwhelming, and you might feel like, “This isn’t for me. I don’t have any of these skills and I have nothing to contribute.” I would encourage against that mindset as long as you’re okay with change and learning new things all the time.
Ultimately, what companies want is a person who can rigorously define problems and design paths to a solution. They also want people who are good at learning. I think those are the core skills.”

5 — Daniel Tunkelang, Chief Search Evangelist at Twiggle, Ex-Head of Search Quality at LinkedIn
“To someone coming from math or the physical sciences, I’d suggest investing in learning software skills — especially Hadoop and R, which are the most widely used tools. Someone coming from software engineering should take a class in machine learning and work on a project with real data, lots of which is available for free. As many people have said, the best way to become a data scientist is to do data science. The data is out there and the science isn’t that hard to learn, especially for someone trained in math, science, or engineering.
Read “The Unreasonable Effectiveness of Data” — a classic essay by Google researchers Alon Halevy, Peter Norvig, and Fernando Pereira. The essay is usually summarized as “more data beats better algorithms.” It is worth reading the whole essay, as it gives a survey of recent successes in using web-scale data to improve speech recognition and machine translation. Then for good measure, listen to what Monica Rogati has to say about how better data beats more data. Understand and internalize these two insights, and you’re well on your way to becoming a data scientist.”

6 — John Foreman, Vice President of Product Management and Ex-Chief Data Scientist at MailChimp
“I find it tough to find and hire the right people. It’s actually a really hard thing to do, because when we think about the university system as it is, whether undergrad or grad school, you focus in on only one thing. You specialize. But data scientists are kind of like the new Renaissance folks, because data science is inherently multidisciplinary.
This is what leads to the big joke of how a data scientist is someone who knows more stats than a computer programmer and can program better than a statistician. What is this joke saying? It’s saying that a data scientist is someone who knows a little bit about two things. But I’d say they know about more than just two things. They also have to know how to communicate. They also need to know more than just basic statistics; they’ve got to know probability, combinatorics, calculus, etc. Some visualization chops wouldn’t hurt. They also need to know how to push around data, use databases, and maybe even a little OR. There are a lot of things they need to know. And so it becomes really hard to find these people because they have to have touched a lot of disciplines and they have to be able to speak about their experience intelligently. It’s a tall order for any applicant.
It takes a long time to hire somebody, which is why I think people keep talking about how there is not enough talent out there for data science right now. I think that’s true to a degree. I think that some of the degree programs that are starting up are going to help. But even still, coming out of those degree programs, for MailChimp we would look at how you articulate and communicate to us how you’ve used the data science chops across many disciplines that this particular program taught you. That’s something that’s going to weed out so many people. I wish more programs would focus on the communication and collaboration aspect of being a data scientist in the workplace.”

7 — Roger Ehrenberg, Managing Partner of IA Ventures
“I think the areas with the biggest opportunities also have the most challenges. Healthcare data obviously has some of the biggest issues with PII and privacy concerns. Added to that, you’ve also got sclerotic bureaucracies, fossilized infrastructures, and data silos that make it very hard to solve hard problems requiring integration across multiple data sets. It will happen, and I think a lot of the technologies we’ve talked about here are directly relevant to making health care better, more affordable, and more distributed. I see this representing a generational opportunity.
Another huge area in its early days is risk management — whether it’s in finance, trading, or insurance. It’s a really hard problem when you’re talking about incorporating new data sets into risk assessment — especially when applying these technologies to an industry like insurance, which, like health care, has lots of privacy issues and data trapped within large bureaucracies. At the same time, these old fossilized companies are just now starting to open up and figure out how to best interact with the startup community in order to leverage new technologies. This is another area that I find incredibly exciting.
The third area I’m passionate about is reshaping manufacturing and making it more efficient. There has been a trend towards manufacturing moving back onshore. A stronger manufacturing sector could be a bridge to recreating a vibrant middle class in the US. I think technology can help hasten this beneficial trend.”

8 — Claudia Perlich, Chief Scientist at Dstillery
“I think, ultimately, learning how to do data science is like learning to ski. You have to do it. You can only listen to so many videos and watch it happen. At the end of the day, you have to get on your damn skis and go down that hill. You will crash a few times on the way and that is fine. That is the learning experience you need. I actually much prefer to ask interviewees about things that did not go well rather than what did work, because that tells me what they learned in the process.
Whenever people come to me and ask, “What should I do?” I say, “Yeah, sure, take online courses on machine learning techniques. There is no doubt that this is useful. You clearly have to be able to program, at least somewhat. You do not have to be a Java programmer, but you must get something done somehow. I do not care how.”
Ultimately, whether it is volunteering at DataKind to spend your time at NGOs to help them, or going to the Kaggle website and participating in some of their data mining competitions — just get your hands and feet wet. Especially on Kaggle, read the discussion forums to see what other people tell you about the problem, because that is where you learn what people do, what worked for them, and what did not work for them. So anything that gets you actually involved in doing something with data, even if you are not being paid for it, is a great thing.
Remember, you have to ski down that hill. There is no way around it. You cannot learn any other way. So volunteer your time, get your hands dirty in any which way you can think, and if you have a chance to do internships — perfect. Otherwise, there are many opportunities where you can just get started. So just do it.”

9 — Jonathan Lenaghan, Chief Scientist and Senior Vice President of Product Development at PlaceIQ
“First and foremost, it is very important to be self-critical: always question your assumptions and be paranoid about your outputs. That is the easy part. In terms of skills that people should have if they really want to succeed in the data science field, it is essential to have good software engineering skills. So even though we may hire people who come in with very little programming experience, we work very hard to instill in them very quickly the importance of engineering, engineering practices, and a lot of good agile programming practices. This is helpful to them and us, as these can all be applied almost one-to-one to data science right now.
If you look at dev ops right now, they have things such as continuous integration, continuous build, automated testing, and test harnesses — all of which map very well from the dev ops world to the data ops (a phrase I stole from RedMonk) world. I think this is a very powerful notion. It is important to have testing frameworks for all of your data, so that if you make a code change, you can go back and test all of your data. Having an engineering mindset is essential to moving with high velocity in the data science world. Reading Code Complete and The Pragmatic Programmer is going to get you much further than reading machine learning books — although you do, of course, have to read the machine learning books, too.”
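
As a rough illustration of that data-testing idea (an editorial sketch, not PlaceIQ’s actual tooling), an automated check over a raw table might look like the following; the column names and rules are hypothetical.

    # Hypothetical "data ops" check: assert basic invariants of the raw data so that
    # a code or pipeline change that silently corrupts it fails a test immediately.
    import pandas as pd

    def check_events(df: pd.DataFrame) -> None:
        assert not df["user_id"].isna().any(), "user_id should never be missing"
        assert (df["revenue"] >= 0).all(), "revenue should be non-negative"
        assert df["event_time"].is_monotonic_increasing, "events should be time-ordered"
        assert df["country"].isin({"US", "GB", "DE"}).all(), "unexpected country code"

    if __name__ == "__main__":
        events = pd.DataFrame({
            "user_id": [1, 2, 3],
            "revenue": [0.0, 9.99, 4.50],
            "event_time": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"]),
            "country": ["US", "GB", "DE"],
        })
        check_events(events)  # would run in CI after every code change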

10 — Anna Smith, Senior Data Engineer at Spotify, Ex-Analytics Engineer at Rent the Runway
“If someone is just starting out in data science, the most important thing to understand is that it’s okay to ask people questions. I also think humility is very important. You’ve got to make sure that you’re not tied up in what you’re doing. You can always make changes and start over. Being able to scrap code, I think, is really hard when you’re starting out, but the most important thing is to just do something.
Even if you don’t have a job in data science, you can still explore data sets in your downtime and can come up with questions to ask the data. In my personal time, I’ve played around with Reddit data. I asked myself, “What can I explore about Reddit with the tools that I have or don’t have?” This is great because once you’ve started, you can see how other people have approached the same problem. Just use your gut and start reading other people’s articles and be like, “I can use this technique in my approach.” Start out very slowly and move slowly. I tried reading a lot when I started, but I think that’s not as helpful until you’ve actually played around with code and with data to understand how it actually works, how it moves. When people present it in books, it’s all nice and pretty. In real life, it’s really not.
I think trying a lot of different things is also very important. I don’t think I’d ever thought that I would be here. I also have no idea where I’ll be in five years. But maybe that’s how I learn, by doing a bit of everything across many different disciplines to try to understand what fits me best.”

11 — Andre Karpistsenko, Data Science Lead at Taxify, Co-Founder and Research Lead at PlanetOS
“Though somewhat generic advice, I believe you should trust yourself and follow your passion. I think it’s easy to get distracted by the news in the media and the expectations presented by the media and choose a direction that you didn’t want to go. So when it comes to data science, you should look at it as a starting point for your career. Having this background will be beneficial in anything you do. Having an ability to create software and the ability to work with statistics will enable you to make smarter decisions in any field you choose. For example, we can read about how an athlete’s performance is improved through data, like someone becoming the gold medalist in the long jump because they optimized and practiced the angle at which they should jump. This is all led by a data-driven approach to sports.
If I were to go into more specific technical advice, then it depends on the ambitions of the person who is receiving the advice. If the person wants to create new methods and tools, then that advice would be very different. You need to persist and keep going in your direction, and you will succeed. But if your intent is to be diverse and flexible in many situations, then you want to have a big toolbox of different methods.
I think the best advice given to me was given by a Stanford professor whose course I attended a while ago. He recommended having a T-shaped profile of competence but with a small second competence next to the core competence, so that you have an alternative route in life if you need it or want it. In addition to the vertical stem of single-field expertise, he recommended that you have the horizontal bar of background broad enough so that you can work with many different people in many different situations. So while you are at a university, building a T shape with another small competence in it is probably the best thing to do.
Maybe the most important thing is to surround yourself with people greater than you are and to learn from them. That’s the best advice. If you’re in a university, that’s the best environment to see how diverse the capabilities of people are. If you manage to work with the best people, then you will succeed at anything.”

12 — Amy Heineike, Vice President of Technology at PrimerAI, Ex-Director of Mathematics at Quid
“I think perhaps they would need to start by looking at themselves and figuring out what it is they really care about. What is it they want to do? Right now, data science is a bit of a hot topic, and so I think there are a lot of people who think that if they can have the “data science” label, then magic, happiness, and money will come to them. So I really suggest figuring out what bits of data science you actually care about. That is the first question you should ask yourself. And then you want to figure out how to get good at that. You also want to start thinking about what kinds of jobs are out there that really play to what you are interested in.
One strategy is to go really deep into one part of what you need to know. We have people on our team who have done PhDs in natural language processing or who got PhDs in physics, where they’ve used a lot of different analytical methods. So you can go really deep into an area and then find people for whom that kind of problem is important or similar problems that you can use the same kind of thinking to solve. So that’s one approach.
Another approach is to just try stuff out. There are a lot of data sets out there. If you’re in one job and you’re trying to change jobs, think about whether there’s data you could use in your current role that you could go and get and crunch in interesting ways. Find an excuse to try something out and see if that’s really what you want to do. Or just from home there’s open data you can pull. Just poke around and see what you can find and then start playing with that. I think that’s a great way to start. There are a lot of different roles going under the name “data science” right now, and there are also a lot of roles that are probably what you would think of as data science but don’t have a label yet because people aren’t necessarily using it. Think about what it is that you really want.”

13 — Victor Hu, Head of Data Science at QBE Insurance, Ex-Chief Data Scientist at Next Big Sound
“First is that you definitely have to tell a story. At the end of the day, what you are doing is really digging into the fundamentals of how a system or an organization or an industry works. But for it to be useful and understandable to people, you have to tell a story.
Being able to write about what you do and being able to speak about your work is very critical. Also worth understanding is that you should maybe worry less about what algorithm you are using. More data or better data beats a better algorithm, so if you can set up a way for you to analyze and get a lot of good, clean, useful data — great!”

14 — Kira Radinsky, Chief Scientist and Director of Data Science at eBay, Ex-CTO and Co-Founder of SalesPredict
“Find a problem you’re excited about. For me, every time I start something new, it’s really boring to just study without having a problem I’m trying to solve. Start reading material and, as soon as you can, start working with it on your problem. You’ll start to see problems as you go. This will lead you to other learning resources, whether they are books, papers, or people. So spend time with the problem and people, and you’ll be fine.
Understand the basics really deeply. Understand some basic data structures and computer science. Understand the basis of the tools you use and understand the math behind them, not just how to use them. Understand the inputs and the outputs and what is actually going on inside, because otherwise you won’t know when to apply it. Also, it depends on the problem you’re tackling. There are many different tools for so many different problems. You’ve got to know what each tool can do and you’ve got to know the problem that you’re doing really well to know which tools and techniques to apply.”

15 — Eric Jonas, Postdoc at UC Berkeley EECS, Ex-Chief Predictive Scientist at Salesforce
“They should understand probability theory forwards and backwards. I’m at the point now where everything else I learn, I then map back into probability theory. It’s great because it provides this amazing, deep, rich basis set along which I can project everything else out there. There’s a book by E. T. Jaynes called Probability Theory: The Logic of Science, and it’s our bible. We really buy it in some sense. The reason I like the probabilistic generative approach is that you have these two orthogonal axes — the modeling axis and the inference axis — which basically translate into: how do I express my problem, and how do I compute the probability of my hypothesis given the data? The nice thing I like from this Bayesian perspective is that you can engineer along each of these axes independently. Of course, they’re not perfectly independent, but they can be close enough to independent that you can treat them that way.
When I look at things like deep learning or any kind of LASSO-based linear regression systems, which is so much of what counts as machine learning these days, they’re engineering along either one axis or the other. They’ve kind of collapsed that down. Using these LASSO-based techniques as an engineer, it becomes very hard for me to think about: “If I change this parameter slightly, what does that really mean?” Linear regression as a model has a very clear linear additive Gaussian model baked into it. Well, what if I want things to look different? Suddenly all of these regularized least squares things fall apart. The inference technology just doesn’t even accept that as a thing you’d want to do.”
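
For readers who have not met this class of models, here is a minimal sketch (an editorial illustration, not Jonas’s code) of the LASSO-regularized linear regression he refers to; the synthetic data is generated with exactly the linear, additive Gaussian structure the quote says is baked into the model.

    # Minimal sketch of LASSO-based linear regression on synthetic data.
    # The data-generating process is linear with additive Gaussian noise,
    # i.e. the assumption described above as "baked into" this model class.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    true_coef = np.array([3.0, -2.0, 0, 0, 0, 0, 0, 0, 0, 0])  # sparse signal
    y = X @ true_coef + rng.normal(scale=0.5, size=200)         # linear + Gaussian noise

    model = Lasso(alpha=0.1).fit(X, y)
    print(model.coef_)  # the L1 penalty shrinks most coefficients toward zero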

16 — Jake Porway, Founder and Executive Director of DataKind
“I think a strong statistical background is a prerequisite, because you need to know what you’re doing, and understand the guts of the model you build. Additionally, my statistics program also taught a lot about ethics, which is something that we think a lot about at DataKind. You always want to think about how your work is going to be applied. You can give anybody an algorithm. You can give someone a model that uses stop-and-frisk data to predict where the police are going to make arrests, but why, and to what end? It’s really like building any new technology. You’ve got to think about the risks as well as the benefits and really weigh that because you are responsible for what you create.
No matter where you come from, as long as you understand the tools that you’re using to draw conclusions, that is the best thing you can do. We are all scientists now, and I’m not just talking about designing products. We are all drawing conclusions about the world we live in. That’s what statistics is — collecting data to prove a hypothesis or to create a model of the way the world works. If you just trust the results of that model blindly, that’s dangerous, because the model is your interpretation of the world: the result will be just as flawed as the understanding that produced it.
In short, learn statistics and be thoughtful.”
Data is being generated at an exponential rate, and people who can understand that data and extract value from it are needed now more than ever. The hard-earned lessons about data and models shared by these thoughtful practitioners, and the evident joy they take in the work, should prove tremendously useful if you aspire to join the next generation of data scientists.

Source: Medium

Big Data & Higher Education: How Are They Connected?

Over the past few years, big data has become a significant part of our lives. Its influence is continually rising. Big data plays a role in nearly everything we do, from our navigation systems, to our Netflix recommendations, to our healthcare systems. At this point, you could probably pick any aspect of your life and find a way that big data has influenced it.

For example, think about higher education. Big data is playing a much bigger role in our choices of colleges and our experiences while there than we probably even realize. Universities are using big data in all sorts of interesting ways that benefit both their profit margins and the students attending the school.

Although there are certainly some substantial costs associated with the use of big data in the classroom, many would argue the benefits far outweigh them. Here are just a few of the ways big data is regularly being incorporated to help students become more successful in their learning environments.

Accounting for Learning Styles

Psychologists are learning more every day about the important differences in personality and how each of us sees and interacts with the world around us. Small changes in how we react and respond to things can make a huge difference in our learning styles and ultimately in how much information and positivity we get out of our college experiences. Because of this, many people take well-known personality tests rather seriously.

To that end, big data can change the way personality tests are used. Nowadays, this personality test information, along with other social indicators and performance measures, can be used to identify learning environments that are going to improve success for individual students. Educators can use this information to customize educational plans or identify where some students will do better online versus in a traditional classroom setting.

Students can also use this information to determine which types of education and career paths are right for them. For instance, students who are considering a doctoral program can assess the requirements and compare those rigorous requirements to their personalities. With the help of big data, personality tests, and past academic achievements, information can be provided that can help students assess their likelihood of completing the program.

Targeting Specifics

Beyond helping both teachers and students understand the best environment to learn in, big data can also shape which students are targeted for specific programs. For example, some university athletic programs are using big data trends to predict which students are likely to be stars on the court. Recruiting star players can give colleges a boost in enrollment the following year — apparently everybody wants to go to a school with a good sports team.

Likewise, colleges can use big data to promote their online programs. They can use this statistical information to determine where some students could be recruited online rather than in a traditional campus setting. With advances in technology, students are able to connect more effectively than ever from the comfort of their own homes. They can use new AI tools to improve their educational experience.

Colleges are also using big data to target prospective on-campus students. Once they know a student is potentially interested in attending, they can delve into their test scores, high school academic history, and other information to determine whether that student is likely to succeed. They can also use algorithms to better understand how much information they should send students without overloading them.

Increasing Retention

Finally, universities are capitalizing on big data to help them with one of their biggest problems — student retention. Student enrollments have been declining for some time, and actual graduation rates are not that great for many colleges. In order to improve the likelihood of students being successful and more tuition dollars being paid, colleges are working overtime to solve this issue.

By using predictive analytics, universities can track the progress of students within their given major and career goals and assess the likelihood of them being successful versus dropping out. If students are flagged early on in their educational career, advisors and professors can reach out and attempt to correct the course before it becomes a major problem. Sometimes this involves providing customized learning plans and other times a suggested change of major is in order.
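
As a purely hypothetical sketch of what such predictive analytics can look like in code (the features, data, and model choice below are invented for illustration, not drawn from any university system), a simple dropout-risk model might be:

    # Hypothetical dropout-risk model; the column names and numbers are invented.
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    students = pd.DataFrame({
        "gpa":            [3.4, 2.1, 3.8, 2.5, 1.9, 3.0, 2.8, 3.6],
        "credits_earned": [30, 12, 45, 20, 9, 33, 24, 40],
        "missed_classes": [2, 14, 0, 9, 20, 3, 7, 1],
        "dropped_out":    [0, 1, 0, 1, 1, 0, 0, 0],
    })

    X = students[["gpa", "credits_earned", "missed_classes"]]
    y = students["dropped_out"]
    model = LogisticRegression().fit(X, y)

    # Estimated dropout risk for a new student; a high value would trigger advisor outreach.
    new_student = pd.DataFrame({"gpa": [2.3], "credits_earned": [15], "missed_classes": [11]})
    print(model.predict_proba(new_student)[:, 1])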

Big data can also help universities identify non-academic factors that make students want to drop out. Some of these can include things like being homesick, missing a pet, or not feeling engaged in the culture of the campus. From there, administrators can work towards developing programs that work to improve identified problems on campus and make the entire environment better for everybody.

There are many ways that higher education is incorporating big data to make the educational experience better for students. Big data is helping institutions respond to learning styles, target enrollment more effectively, and increase retention rates. There are plenty of other ways big data is being used in education. What are some of them?

Source: InsideBIGDATA

The Greatest Threat of AI Is Not What You Think

Stop worrying about AI becoming your overlord. The real threat is much more obvious and interesting.

We’ve all heard the prophetic apocalyptic predictions for AI’s future. Elon Musk has said that it’s our “biggest existential threat” and has likened it to “summoning the demon.” Other great minds are similarly vocal about their fears. The late Stephen Hawking said that AI could wipe out the human race. Author James Barrat wrote a book whose title, Our Final Invention, has become a mantra of the anti-AI movement.

While we are clearly well advised to move forward with eyes wide open as we develop generalized AI, there’s an even greater near-term danger: imbuing AI with biases that perpetuate old attitudes and social norms.

A recent UNESCO report, I’d Blush If I Could, developed with the government of Germany and EQUALS (an organization encouraging the development of women’s skills in technology), posed a simple question: “Why do most voice assistants have female names, and why do they have submissive personalities?”

The title of the UNESCO report was the standard answer given by the default female voice of Apple’s digital assistant, Siri, when responding to derogatory gender-specific statements. According to UNESCO, the reason digital assistants have these biases is that there are “stark gender-imbalances in skills, education and the technology sector.”

The concern here is that we are building technology that does more than simply perform calculations and execute a series of pre-scripted commands. The applications, digital assistants, and AI bots that we are currently using carry with them the social norms that they have learned through the data they are fed and the interactions they encounter.

Sometimes that’s a relatively mild form of bias that has nothing more to do with AI than how it’s marketed.

Consider, for example, the names of today’s digital assistants, which are all female. Although Alexa’s name refers to the library of Alexandria, it could just as easily have been Alex. Siri is a Norse (Scandinavian) female name that translates as “a beautiful woman who leads you to victory.” Microsoft’s Cortana is named after an AI character from the game Halo, which appears as a naked female hologram.

Marketers who decide what resonates best with users most often claim that a female voice is much more engaging and marketable. They’ll also claim that the personalities of their assistants are meant to come across as intelligent and funny.

An Amazon spokesperson quoted in a Quartz article said “Alexa’s personality exudes characteristics that you’d see in a strong female colleague, family member, or friend–she is highly intelligent, funny, well-read, empowering, supportive, and kind.”

However, looking at the responses that Alexa gave when tested by the article’s author, all of the digital assistants came back with responses that could, at best, be characterized as either apologetic or apathetic. I can tell you unequivocally that it’s not how I’d want my daughter to respond to the same comments.

While it’s easy to understand why a digital assistant that puts up a fight, becomes indignant, or calls someone out for harassing it (her?) might be a harder sell, it’s worth asking the obvious question: “What sort of social norms are we trying to perpetuate or create in the increasing interactions we and our children have with digital assistants?”

The author Jim Rohn once wrote that “You’re the average of the five people you spend the most time with.” In the case of digital assistants, we could just as easily say that they are the average of the handful of developers who built them, or of the content those developers used to train the AI.

Based on that it would seem that AI bias is inevitable, since it will only be as unbiased as the social context within which it learns.

However, this is where it gets interesting.

What we seem to be missing in all of this conversation about bias is that social context is not just about gender bias. There are numerous global differences, some nuanced and some pronounced, that shape what’s culturally acceptable as we go from one part of the world to another. We may not agree with these differing attitudes towards how we treat each other, but we clearly put up with all but the most egregious of them under the doctrine of national sovereignty.

The bigger question, at least in my mind, is, “Should digital assistants be created with one set of values and norms that are exported from Silicon Valley and expected to be used around the world, or should they be fine-tuned to localized behaviors, even the ones we consider aberrant?”

Navigating that question forces us through a minefield of controversy, and it darn well should.

Maybe it’s just the eternal optimist in me but the way I see it, AI raises an entirely new set of ethical conundrums that will up our game as humans. We will have to face the fact that we are training a new species that embodies who we are and what we value, and then holds it up to us like a mirror into ourselves and our beliefs. In many ways it is an opportunity for us to ask questions that push us towards what may be the final frontier of globalization.

That may not seem to be an apocalyptic threat; it’s certainly not an uber-intelligent AI overlord that sets out to eradicate the human race as our last invention.

But it may well put us on an even more fascinating and challenging path; one that helps us evolve into better humans.

Source: Innovation Excellence

A Peek into the Future of Higher Education – Can Artificial Intelligence Drive Remote Learning?


Image Source: Statista
In its current form, Higher Education suffers from an exclusivity complex in many developed countries. While it’s possible for anyone to attend university in the UK, the £9,250-per-year cost of tuition fees has left students feeling alienated. Elsewhere, universities can feel wholly inaccessible for young adults in nations with a weaker transport infrastructure.
As part of this article, I spoke to the head of one of the very few universities aiming to utilize AI to its fullest potential in education.

Chatbots and customized courses
Further developments in Artificial Intelligence may soon change Higher Education forever. Already, AI is beginning to make its presence felt on campus, with universities like Staffordshire introducing Beacon, a chatbot designed to act as a 24-hour digital assistant for students. However, Deakin University in Australia has recently set a new standard, using IBM’s Watson AI technology to pre-empt over 1,600 student questions in real time on topics including student life, admissions, local directions, financial aid, and much more.
While chatbots don’t sound like they’re the kind of technology to swoop in and make university more inclusive for more disadvantaged students, the money institutions will save utilising AI to deal with queries instead of piling workloads on tutors could play a role in making HE more affordable.

Chatbots point to an exponentially larger role to be played by AI. When University 20.35 was developed, the emphasis was firmly on utilising Artificial Intelligence to provide bespoke Higher Education programs to thousands of remote learners. Speaking about this, Dmitry Peskov, head of the university explained: “When we started dealing with this challenge, we saw that educational programs in traditional universities and the teaching methods applied therein didn’t correspond to the needs of either private companies or the state. Everything is changing very quickly, new specialisations are appearing, and the requirements for traditional ones are constantly expanding. We realised that we need a flexible, digital data-driven educational platform where everything would be personalised as much as possible through the use of AI.”
Although it seems highly ambitious, the notion of optimising the personal experience of each student through advanced technology isn’t necessarily new. Writing for EdTech, Dave Doucette acknowledged that delivering a ‘highly individualised experience’ would be every university’s top priority if funding was unlimited.
AI has the potential to bridge the gap between students and their course material – offering personalised tutoring as well as video captioning as a means of making course content more accessible.

Mass remote learning leveraged by AI
Today we’re used to favoring courses that have lower student-to-tutor ratios, because they’ll be able to offer the best level of personal support, right? “University 20.35 is not a university in the traditional sense. We don’t have classrooms, permanent staff, lecturers, rectors and deans – and we don’t teach students based on programs that are available at other universities. It is a digital platform driven by artificial intelligence – the Pushkin AI. In fact, we are an experimental training ground, where advanced EdTech and techniques are being developed that will be commonplace not just tomorrow, but ‘the day after tomorrow’ – in the year 2035. Hence the name of the University,” explains Peskov.
What does this mean for the HE classrooms and lecture halls of the future? If AI delivers on the promise it’s continually showing, learning will transition into the realm of the remote. Students will be able to study at home, complete assignments that a combination of Artificial Intelligence and machine learning has determined is suitable for them, before submitting their work for the technology to automatically assess.

Image Source: Statista

So does this mean that we’ll no longer be looking out for in-house courses with a student-to-tutor ratio of under 25 in the future? Peskov believes so: “We initially wanted to build a scalable digital platform through the use of AI. Therefore, a potentially unlimited number of participants will be able to get enrolled in the future at the university. But, practically, we plan that by 2020 up to 100,000 people will be connected to our system.”

The Higher Education revolution
University life is ever-changing. The disruptive power of the internet enabled students to learn in just about any location – whether it was through the use of library computers, on laptops at home, or on their mobile phones while traveling to take a morning exam.
Artificial Intelligence enables a logical evolution to take place here, where entire courses can take place from home, with comfort and suitable scalability. Does University 20.35 represent the start of a much wider movement? Dmitry thinks so: “This is a revolution.

The technology groups that are underlying this revolution can be applied in completely different fields: from how we teach children at the preschool age to the mass retraining of older generations. They can be used to fundamentally change the models of universities. They can be used for online courses, for working professions.
We want to make sure that people in the new digital economy are in demand and can easily adapt to any changes and requirements. And this can only be done with the help of AI and a complete revision of existing educational models.”

To Sum Up
Educational institutions have existed in a relatively familiar form for centuries now. As they bid to adapt to the new millennium’s rapid advancements in computing and technology, today’s universities can still be found guilty of commanding levels of tuition fees that fly in the face of inclusivity.

The development of Artificial Intelligence has now offered the world a chance to revolutionize the way students access Higher Education – with affordable home learning and bespoke course content to suit each pupil’s needs. The notion of a ‘revolution’ may cause pre-existing institutions to baulk – but if it’s a revolution that brings greater inclusivity, then it’s a revolution worth doing.

Source: Hackernoon

Artificial Intelligence: Salaries Heading Skyward

While the average salary for a Software Engineer is around $100,000 to $150,000, to make the big bucks you want to be an AI or Machine Learning Specialist/Scientist/Engineer.


Artificial intelligence salaries benefit from the perfect recipe for a sweet paycheck: a hot field and high demand for scarce talent. It’s the ever-reliable law of supply and demand, and right now, anything artificial intelligence-related is in very high demand.

According to Indeed.com, the average IT salary — the keyword is “artificial intelligence engineer” — in the San Francisco area ranges from approximately $134,135 per year for “software engineer” to $169,930 per year for “machine learning engineer.”

However, it can go much higher if you have the credentials firms need. One tenured professor was offered triple his $180,000 salary to join Google, which he declined for a different teaching position.

However, the record, so far, was set in April when the Japanese firm Start Today, which operates the fashion-shopping site Zozotown, posted new job offerings for seven “genius” AI tech experts, offering annual salaries of as much as 100 million yen, or just under US $1 million.

Key Sectors for AI Salaries

Scoring a top AI salary means working in the “right” sector. While plentiful, AI jobs are mainly in just a few sectors — namely tech — and confined to just a few big and expensive cities. Glassdoor, another popular job search site, notes that 67% of all AI jobs listed on its site are located in the Bay Area, Seattle, Los Angeles, and New York City.

It also listed Facebook, NVIDIA, Adobe, Microsoft, Uber, and Accenture as the best AI companies to work for in 2018, together accounting for almost 19% of open AI positions. The average annual base pay for an AI job listed on Glassdoor is $111,118 per year.

Glassdoor also found that financial services, consulting, and government agencies are actively hiring AI engineering and data science professionals. These include top firms like Capital One, Fidelity, Goldman Sachs, Booz Allen Hamilton, EY, and McKinsey & Company, as well as NASA’s Jet Propulsion Laboratory, the U.S. Army, and the Federal Reserve Bank.

However, expect the number of jobs and fields to expand considerably in the near future. A recent report from Gartner said that AI will kill off 1.8 million jobs, mostly menial labor, but that the field will create 2.3 million new jobs by 2020. That projection is reinforced by a recent Capgemini report, which found that 83% of companies using AI say they are adding jobs because of AI.

Best Jobs for AI Salaries

The term “AI” is rather broad and covers a number of disciplines and tasks, including natural language generation and comprehension, speech recognition, chatbots, machine learning, decision management, deep learning, biometrics, and text analysis and processing. Given the level of specialization each requires, not many professionals can master more than one discipline.

In short, finding the best AI salary calls for actively nurturing the right career path.

While the average pay for an AI programmer is around $100,000 to $150,000, depending on the region of the country, all of these are in the developer/coder realm. To make the big money you want to be an AI engineer. According to Paysa, yet another job search site, an artificial intelligence engineer earns an average of $171,715, ranging from $124,542 at the 25th percentile to $201,853 at the 75th percentile, with top earners making more than $257,530.

Why so high? Because many AI specialists come from non-programming backgrounds. The IEEE notes that people with Ph.D.s in sciences like biology and physics are returning to school to learn AI and apply it to their fields. They need to straddle the technical side, knowing a multitude of languages and hardware architectures, while also understanding the data involved. That combination makes such engineers rare and thus expensive.

Why Are AI Salaries So High?

The fact is, AI is not a discipline you can teach yourself as many developers do. A survey by Stack Overflow found 86.7% of developers were, in fact, self-taught. However, that is for languages like Java, Python, and PHP, not the esoteric art of artificial intelligence.

It requires advanced degrees in computer science, often a Ph.D. In a report, Paysa found that 35 percent of AI positions require a Ph.D. and 26 percent require a master’s degree. Why? Because AI is a rapidly growing field, and when you study at the Ph.D. level and participate in academic projects, those projects tend to be innovative if not bleeding edge, which gives the student the experience they need for the work environment.

Moreover, the work requires fluency with multiple languages and tools, including C++, the STL, Perl, Perforce, and APIs like OpenGL and PhysX. In addition, because the AI is doing important calculations, a background in physics or some kind of life science is necessary.

Therefore, to be an effective and in-demand AI developer you need a lot of skills, not just one or two. Indeed lists the top 10 skills you need to know for AI:

1) Machine learning

2) Python

3) R language

4) Data science

5) Hadoop

6) Big Data

7) Java

8) Data mining

9) Spark

10) SAS

As you can see, that is a wide range of skills, and none of them can be learned overnight. According to The New York Times, there are fewer than 10,000 qualified AI specialists in the world. Element AI, a Montreal company that consults on machine learning systems, published a report earlier this year estimating that 22,000 Ph.D.-level computer scientists in the world are capable of building AI systems. Either way, that is too few for the demand reported by Machine Learning News.

Competing Employers Drive Salaries Higher

With so few AI specialists available, tech companies are raiding academia. At the University of Washington, six of 20 artificial intelligence professors are now on leave or partial leave and working for outside companies. In the process, they are limiting the number of professors who can teach the technology, causing a vicious cycle.

U.S. News & World Report lists the top 20 schools for AI education. The top five are:

1) Carnegie Mellon University, Pittsburgh, PA

2) Massachusetts Institute of Technology, Cambridge, MA

3) Stanford University, Stanford, CA

4) University of California — Berkeley, Berkeley, CA

5) University of Washington, Seattle, WA

With academia being raided for talent, alternatives are popping up. Google, which is hiring any AI developer it can get its hands on, offers a course on deep learning and machine-learning tools via its Google Cloud Platform Website, and Facebook, also deep in AI, hosts a series of videos on the fundamentals of AI such as algorithms. If you want to take courses online, there is Coursera and Udacity.

Basic computer technology and math backgrounds are the backbone of most artificial intelligence programs. Linear algebra is as necessary as a programming language, since machine learning performs analysis on data within matrices, and linear algebra is all about operations on matrices. According to Computer Science Degree Hub, coursework for AI involves the study of advanced math, Bayesian networks and graphical modeling (including neural nets), physics, engineering and robotics, computer science, and cognitive science theory.
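
To make the linear algebra point concrete, here is a small illustrative sketch (my own, not from the article) of ordinary least-squares regression written directly as matrix operations; the data is synthetic.

    # Ordinary least squares as matrix algebra: solve (X^T X) beta = X^T y.
    import numpy as np

    rng = np.random.default_rng(42)
    X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])  # intercept + 2 features
    true_beta = np.array([1.0, 2.0, -3.0])
    y = X @ true_beta + rng.normal(scale=0.1, size=100)

    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # the normal equations
    print(beta_hat)  # approximately [1.0, 2.0, -3.0]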

Some things cannot be taught. Working with artificial intelligence does not mean you get to offload the work onto the computer. It requires an analytical thought process, foresight about technological innovations, and the technical skills to design, maintain, and repair technology, software programs, and algorithms. Therefore, it is easy to see why skilled people are so rare — which will drive AI salaries only higher.

Source: Medium

Can artificial intelligence help society as much as it helps business?

The answer is yes—but only if leaders start embracing technological social responsibility (TSR) as a new business imperative for the AI era.


In 1953, US senators grilled General Motors CEO Charles “Engine Charlie” Wilson about his large GM shareholdings: Would they cloud his decision making if he became the US secretary of defense and the interests of General Motors and the United States diverged? Wilson said that he would always put US interests first but that he could not imagine such a divergence taking place, because, “for years I thought what was good for our country was good for General Motors, and vice versa.” Although Wilson was confirmed, his remarks raised eyebrows due to widespread skepticism about the alignment of corporate and societal interests.

The skepticism of the 1950s looks quaint when compared with today’s concerns about whether business leaders will harness the power of artificial intelligence (AI) and workplace automation to pad their own pockets and those of shareholders—not to mention hurting society by causing unemployment, infringing upon privacy, creating safety and security risks, or worse. But is it possible that what is good for society can also be good for business—and vice versa?

Innovation and skill building

To answer this question, we need a balanced perspective that’s informed by history. Technology has long had positive effects on well-being beyond GDP—for example, increasing leisure or improving health and longevity—but it can also have a negative impact, especially in the short term, if adoption heightens stress, inequality, or risk aversion because of fears about job security. A relatively new strand of welfare economics has sought to calculate the value of both the upside and the downside of technology adoption. This is not just a theoretical exercise. What if workers in the automation era fear the future so much that this changes their behavior as consumers and crimps spending? What if stress levels rise to such an extent as workers interface with new technologies that labor productivity suffers?

Building and expanding on existing theories of welfare economics, we simulated how technology adoption today could play out across the economy. The key finding is that two dimensions will be decisive—and in both cases, business has a central role to play (Exhibit 1). The first dimension is the extent to which firms adopt technologies with a view to accelerating innovation-led growth, compared with a narrower focus on labor substitution and cost reduction. The second is the extent to which technology adoption is accompanied by measures to actively manage the labor transitions that will accompany it—in particular, raising skill levels and ensuring a more fluid labor market.


Both of these dimensions are in sync with our previous bottom-line-focused work on AI and automation adoption. In our research, digital leaders who reap the biggest benefits from technology adoption tend to be those who focus on new products or new markets and, as a result, are more likely to increase or stabilize their workforce than reduce it. At the same time, human capital is an essential element of their strategies, since having the talent able to implement and drive digital transformation is a prerequisite for successful execution. No wonder a growing number of companies, from Walmart to German software company SAP, are emphasizing in-house training programs to equip members of their workforce with the skills they will need for a more automated work environment. And both Amazon and Facebook have raised the minimum wage for their workers as a way to attract, retain, and reward talent.

TSR: Technological social responsibility

Given the potential for a win–win across business and society from a socially careful and innovation-driven adoption strategy, we believe the time has come for business leaders across sectors to embed a new imperative in their corporate strategy. We call this imperative technological social responsibility (TSR). It amounts to a conscious alignment between short- and medium-term business goals and longer-term societal ones.

Some of this may sound familiar. Like its cousin, corporate social responsibility, TSR embodies the lofty goal of enlightened self-interest. Yet the self-interest in this case goes beyond regulatory acceptance, consumer perception, or corporate image. By aligning business and societal interests along the twin axes of innovation focus and active transition management, we find that technology adoption can potentially increase productivity and economic growth in a powerful and measurable way.

In economic terms, innovation and transition management could, in a best-case scenario, double the potential growth in welfare—the sum of GDP and additional components of well-being, such as health, leisure, and equality—compared with an average scenario. The welfare growth to 2030 that emerges from this scenario could be even higher than the GDP and welfare gains we have seen in recent years from computers and early automation.

However, other scenarios that pay less heed to innovating or to managing disruptive transitions from tech adoption could slow income growth, increase inequality and unemployment risk, and lead to fewer improvements in leisure, health, and longevity. And that, in turn, would reduce the benefits to business.

At the company level, a workforce that is healthier, happier, better trained, and less stressed will also be more productive, more adaptable, and better able to drive the technology adoption and innovation surge that will boost revenue and earnings. At the broader level, a society whose overall welfare is improving, and improving faster than GDP, is a more resilient society, better able to handle sometimes painful transitions. In this spirit, New Zealand recently announced that it will shift its economic policy focus from GDP to broader societal well-being.

Leadership imperatives

For business leaders, three priorities will be essential. First, they will need to understand and be convinced of the argument that proactive management of technology transitions is not only in the interest of society at large but also in the more narrowly focused financial interest of companies themselves. Our research is just a starting point, and more work will be needed, including work to show how and where individual sectors and companies can benefit from adopting a proactive strategy. Work is already underway at international bodies such as the Organisation for Economic Co-operation and Development to measure welfare effects across countries.

Second, digital reinvention plans will need to have, at their core, a thoughtful and proactive workforce-management strategy. Talent is a key differentiating factor, and there is much talk about the need for training, retraining, and nurturing individuals with the skills needed to implement and operate updated business processes and equipment. But so far, “reskilling” remains an afterthought in many companies. That is shortsighted; our work on digital transformation continues to emphasize the importance of having the right people in the right places as machines increasingly complement humans in the workforce. From that perspective alone, active management of training and workforce mobility will be an essential task for boards in the future.

Third, CEOs must embrace new, farsighted partnerships for social good. The successful adoption of AI and other advanced technologies will require cooperation from multiple stakeholders, especially business leaders and the public sector. One example involves education and skills: business leaders can help inform education providers with a clearer sense of the skills that will be needed in the workplace of the future, even as they look to raise the specific skills of their own workforce. IBM, for one, is partnering with vocational schools to shape curricula and build a pipeline of future “new collar” workers—individuals with job profiles at the nexus of professional and trade work, combining technical skills with a higher educational background. AT&T has partnered with more than 30 universities and multiple online education platforms to enable employees to earn the credentials needed for new digital roles.

Other critical public-sector actions include supporting R&D and innovation; creating markets for public goods, such as healthcare, so that there is a business incentive to serve these markets; and collaborating with businesses on reskilling, helping them to match workers with the skills they need and with the digital-era jobs to which they could most easily transition. A more fluid labor market and better job matching will benefit companies and governments, accelerating the search for talent for the former and reducing the potential transition costs for the latter.

There are many aspects to TSR, and we are just starting to map out some of the most important ones. But as an idea and an imperative, the time has come for technological social responsibility to make a forceful entry into the consciousness and strategies of business leaders everywhere.

Source: McKinsey

What is Natural Language Processing and How Does it Benefit a Business?

We use natural language processing every day. It makes it easier for us to interact with computers and software and allows us to perform complex searches and tasks without the help of a programmer, developer or analyst.

What is Natural Language Processing (NLP) Driven Analytics?

Natural language processing (NLP) is an integral part of today’s advanced analytics. If you have ever typed a question into Google’s search box, you have used NLP. When NLP is incorporated into the business intelligence environment, business users can enter a question in plain human language, for example, ‘Which sales team member achieved the best numbers last month?’ or ‘Which of our products sells best in New York?’

The system translates this natural language search into a more traditional analytics query, and returns the most appropriate answer in the most appropriate form, so users can benefit from smart visualization, tables, numbers or natural language descriptions that are easy to understand.
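To make that translation step more concrete, here is a minimal, purely illustrative Python sketch of how a natural-language question might be mapped onto a traditional analytics query. The ‘sales’ table and its columns (product, region, salesperson, amount, month) are hypothetical, and the hand-written patterns stand in for the far richer language models a real NLP-driven BI engine would use.

# A toy, rule-based sketch of the translation step: a natural-language
# question is mapped onto a SQL query. The 'sales' table and its columns
# (product, region, salesperson, amount, month) are hypothetical.
import re
from datetime import date, timedelta

def question_to_sql(question: str) -> str:
    q = question.lower().strip(" ?")

    # "Which of our products sells best in <region>?"
    match = re.search(r"which of our products sells best in (.+)", q)
    if match:
        region = match.group(1).title()
        return ("SELECT product, SUM(amount) AS total_sales FROM sales "
                f"WHERE region = '{region}' GROUP BY product "
                "ORDER BY total_sales DESC LIMIT 1")

    # "Which sales team member achieved the best numbers last month?"
    if "sales team member" in q and "last month" in q:
        last_month = (date.today().replace(day=1) - timedelta(days=1)).strftime("%Y-%m")
        return ("SELECT salesperson, SUM(amount) AS total_sales FROM sales "
                f"WHERE month = '{last_month}' GROUP BY salesperson "
                "ORDER BY total_sales DESC LIMIT 1")

    raise ValueError("Question not understood: " + question)

print(question_to_sql("Which of our products sells best in New York?"))

A production system would go well beyond returning raw SQL, choosing the most appropriate visualization, table, or natural-language summary of the result, as described above.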

How Does NLP-Based Analytics Benefit a Business Organization?

Perhaps the most important benefit of NLP is that it lets a business implement augmented analytics in a self-serve environment with very little training required, which makes it far more likely that users will adopt business intelligence and analytics as tools they use every day.

NLP allows the enterprise to expand the use of business intelligence across the organization by offering business users an intuitive way to ask for and receive crucial data, understand the analytical output, and share it with other users.

NLP opens up an organization’s data repositories and information in a way that is meaningful and easy to understand, so data becomes more accessible and answers more valuable. This improves the accuracy of planning and forecasting and allows for a better overall understanding of business results.

Natural language processing helps business users sort through integrated data sources (internal and external) to answer a question in a way the user can understand, and it provides a foundation for simplifying and speeding the decision process with fact-based, data-driven analysis.

The enterprise can find and use information using natural language queries, rather than complex queries, so business users can achieve results without the assistance of IT or business analysts.

NLP presents results through smart visualization and contextual information delivered in natural language. Because these tools are easy to use and to understand, users are more likely to adopt them and to add value to the organization.

With NLP searches and queries, business users are free to explore data and obtain accurate results, and the organization can achieve rapid ROI and sustain a low total cost of ownership (TCO) with tools as familiar as a Google search.

Users can combine NLP with plug-and-play predictive analysis or assisted predictive modeling, helping the organization achieve data democratization.
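As a rough illustration of that combination, the sketch below (in the same hypothetical setting as the earlier example) takes a monthly sales series that an NLP query might have returned and projects the next month with a simple linear trend. The figures are invented, and real assisted-modeling features would offer far more sophisticated models.

# Illustrative only: pair an NLP query result with a simple trend forecast.
# The monthly_sales figures are invented stand-ins for what a question like
# "What were monthly sales of Product A this year?" might return.
import numpy as np

monthly_sales = [120, 132, 128, 141, 150, 158, 163, 171]

# Fit a straight-line trend and project one month ahead, a minimal stand-in
# for the "assisted predictive modeling" a BI tool might offer with the answer.
months = np.arange(len(monthly_sales))
slope, intercept = np.polyfit(months, monthly_sales, deg=1)
forecast = slope * len(monthly_sales) + intercept

print(f"Projected sales next month: {forecast:.0f}")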

NLP and the advanced data discovery tools it supports can provide important, sophisticated capabilities in a user-friendly environment, suggesting relationships, identifying patterns and trends, and offering insight into previously hidden information so business users can ‘discover’ subtle but crucial problems and opportunities.

NLP is an integral part of today’s advanced analytics. It establishes an easy-to-use, interactive environment where users can create a search query in natural language and, as such, will support user adoption and provide numerous benefits to the enterprise.

Source: dataversity.net