“It was true that I needed to know statistics and how to write code to function effectively in these roles, but that knowledge was a given. It turned out that the differentiating points between a great data scientist and an average one were in the researcher’s ability to deal with that same uncertainty that had driven me from the humanities and into quantitative research in the first place. In other words, the scientific methodologies had all the same epistemological concerns and issues as the humanities — they just tackled those problems with different tools.
My experience has led me to believe that graduate humanities work is in fact one of the most useful backgrounds for an industry data scientist. While there’s often a lot of focus on data scientists being experts in statistics or coding, these tools are simply a means to an end — they’re necessary but insufficient for doing great data science. If you’re a humanities graduate student and are interested in data, I’d feel confident in your ability to succeed in the field based on your less technical skills. Specifically, experience as a graduate researcher in humanities makes you an expert in:
Going deep into topics and teaching yourself anything
Stating research questions and supporting your answers with evidence
Communicating the limitations and assumptions of your approach
In my mind, these broad research skills are more valuable (and rare) than knowledge of the specifics of any particular quantitative methodology.”
“Data scientist is just a sexed up word for statistician.” Nate Silver
Via a link from the indefatigable people at O’Reilly, I came across a Q&A with Judea Pearl, an AI pioneer, who has some measured criticism of the enterprise as it stands:
Hartnett: People are excited about the possibilities for AI. You’re not?
Pearl: As much as I look into what’s being done with deep learning, I see they’re all stuck there on the level of associations. Curve fitting. That sounds like sacrilege, to say that all the impressive achievements of deep learning amount to just fitting a curve to data. From the point of view of the mathematical hierarchy, no matter how skillfully you manipulate the data and what you read into the data when you manipulate it, it’s still a curve-fitting exercise, albeit complex and nontrivial.
Hartnett: The way you talk about curve fitting, it sounds like you’re not very impressed with machine learning.
Pearl: No, I’m very impressed, because we did not expect that so many problems could be solved by pure curve fitting. It turns out they can. But I’m asking about the future—what next? Can you have a robot scientist that would plan an experiment and find new answers to pending scientific questions? That’s the next step. We also want to conduct some communication with a machine that is meaningful, and meaningful means matching our intuition. If you deprive the robot of your intuition about cause and effect, you’re never going to communicate meaningfully. Robots could not say “I should have done better,” as you and I do. And we thus lose an important channel of communication.
So maybe it’s not all curve fitting and optimization problems? That seems plausible, but the already formidable mathematics would likely become nearly intractable.