The Single Most Important Skill for a Data Scientist
Published: 2019-05-11

By Richard Pugh, Commercial Director

I love my job.  Seriously.  I was enjoying it before , but since then, and the data science explosion, everything has kicked into an even higher gear.  Why do I love it?  Because fundamentally, our job as data scientists is to help people make better decisions based on the information we have at hand. That makes them happy, so they ask me to help them with something else, so more challenges for me!

As Mango Solutions continues to grow faster than you can say “what do you mean we need to look at offices again”, I find myself talking to more and more graduates about the skills needed to be a good data scientist, and the pitfalls to avoid (mostly because I’ve stumbled into every pitfall at some point, so I just about know where they are ... well, the ones I know about so far, of course).

When someone suggested writing this blog post, and gave me a title starting with “the single most important …”, my initial reaction was to run away quickly. Because surely, putting “the single most important …” in front of anything leaves you open to having to back down at some point along the line. But in the end I agreed … so here goes …

In my opinion the single most important skill for a data scientist is not:

  • Knowing the difference between a and a
  • Understanding which R package is best to use for a particular task
  • Being able to extract data from twitter and merge it with your relational database
  • Creating a really smart plot that simultaneously communicates a message clearly and looks really sexy

No, in my opinion, the single most important skill for a data scientist is … Empathy.

Why “empathy”?  Because if we’re going to drive decisions with analytics, we need to appreciate the number of different personalities involved, what they are trying to achieve, what constraints they work under etc.

For example, a data scientist may end up interacting with:

  • The business user, who just wants to make more informed decisions, possibly in a very short time frame.
  • The IT contact, who has possibly never heard of the funky analytic technology you’re about to mention, and has to fill in 100 forms just to get a new server commissioned.
  • The marketing person, who wants to make sure you know that the colour of your graph needs to be #333380, not #3D3D99!
  • The internal statistician, who perhaps doesn’t understand this funky gradient boosted regression trees approach of which you speak, but is going to end up supporting this analytic solution.

Being able to interact with these people and take their aims and concerns into account when you’re designing analytic solutions is essential to make sure you create something fit for purpose in a positive way.

Even when you’re not interacting with the team above, empathy is still something that should be at the front of your mind as a data scientist.  For example:

  • When I’m writing some code to extract data, is this a “one off” thing, or had I better write it in a more generic style, parameterise column names etc?
  • Who is going to support the code I’m writing?  Maybe I should steer clear of that “holy crap how clever am I” short line of code that does a million things and replace it with a few well-documented lines of simpler code?
  • How do I best present the insight back to the user?  In a visual style perhaps?  Then let’s make sure they can clearly see the message past the funky interactive embedded scatter/ring/pie(!) graph I’m making.
  • Once they’ve understood the message, what will my business users’ next question be?  Maybe I should anticipate that and make it easy to answer that question too?
  • Having fit a cool model, does the end user really want to see a p-value?  Or do they just want to know what decision to make?
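
To make the first two questions in that list concrete, here is a minimal sketch of what "a more generic style" with parameterised column names might look like. It assumes Python and only the standard library; the function name `extract_columns`, the column names and the sample data are all hypothetical, chosen purely for illustration.

```python
import csv
import io


def extract_columns(csv_text, columns, rename=None):
    """Return the rows of csv_text restricted to the requested columns.

    columns: source column names to keep, in output order.
    rename:  optional mapping from source name to output name.
    """
    rename = rename or {}
    reader = csv.DictReader(io.StringIO(csv_text))
    # Fail early with a readable message, so whoever supports this code
    # does not have to trace a KeyError from deep inside a loop.
    missing = [c for c in columns if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"columns not found in data: {missing}")
    return [{rename.get(c, c): row[c] for c in columns} for row in reader]


# Hypothetical sample data, for illustration only.
SAMPLE = "customer_id,region,spend\n1,North,120\n2,South,80\n"

rows = extract_columns(
    SAMPLE,
    columns=["customer_id", "spend"],
    rename={"spend": "total_spend"},
)
print(rows)
```

Because the column names are arguments rather than hard-coded strings, the same few lines can serve the next extract, and the explicit check is the kind of small courtesy to a future maintainer that the second question is getting at.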

So, that’s it.  In my opinion, the single most important skill for a data scientist is “Empathy”.

… and Fear! The two most important things are Empathy and Fear …
