Virtualization Technology News and Information
VMblog's Expert Interviews: Anaconda Discusses Results from First 'State of Data Science' Survey

Anaconda, Inc., a popular Python data science platform provider with 2.5 million downloads per month, recently announced the results of its 'State of Data Science' survey, revealing key trends in data science and machine learning within the Anaconda community.  To find out more, VMblog spoke with Mathew Lodge, the company's SVP of Product and Marketing. 

VMblog:  So, you recently completed a survey of Anaconda users.  What were the high-level findings?

Mathew Lodge:  Yes, we set about creating our first State of Data Science survey in order to gauge the key trends in data science and machine learning within the Anaconda community. The survey ran from March 22 to April 30, 2018, and resulted in 4,218 responses, all with a 100% survey completion rate. With such a great response, we were eager to review the findings.

At a high level, we found that cloud-native data science is on the rise. The results from the survey show that traditional big data infrastructures, like Hadoop and Spark, are losing traction as the world heads towards cloud-native technologies such as Docker containers, Kubernetes and API-driven applications.

VMblog:  What were you most surprised by?

Lodge:  We were most surprised by the number of software developers using the Anaconda platform. While the largest groups of respondents were students (26%) and data scientists (16%), we were pleased to see that software developers comprised 15% of all respondents. With machine learning becoming as pervasive as it is, early adoption of data science tools like Anaconda by software developers will ease the integration of machine learning into countless applications.

VMblog:  It definitely seems to indicate that Anaconda users are moving from big data towards cloud native. Why do you think that is?

Lodge:  Well, there is definitely a shift taking place from on-premises infrastructures to cloud-native ones. When we look back at the history of Hadoop, it was a solution for the "big data" era of 2005. Now, the amount of data that data scientists are presented with is exponentially bigger, and is only growing. At the same time, container use in production is also growing. Based on the survey, Docker makes a strong showing at 19%, beating out Hadoop/Spark at 15%, followed by Kubernetes at 5.8%. These results suggest that modern cloud-native technologies like Docker and Kubernetes are on the rise.

VMblog:  What does the trend in container usage mean and how is that used by data scientists?

Lodge:  It means that data scientists are embracing cloud-native architectures like containers. If we look at the container market, we see all of the major players - AWS, Microsoft, Google, IBM, Red Hat, and Docker - behind Kubernetes. Containers and Kubernetes make great language-agnostic distributed computing clusters, which is important in today's world of extreme data. It is no longer enough for enterprises to merely manage their data - they must be able to use it. The ability to use machine learning to gain insights from data in real time is what draws data scientists.

VMblog:  Is there anything else you can highlight?

Lodge:  Another interesting point to call out is that Google Cloud Platform's data services outrank those of Amazon Web Services and Microsoft Azure. Even though Google Cloud is the third largest cloud provider, its focus on data services is paying off with the Anaconda community. Ultimately, we aim to do the same thing - focus on providing our community with innovative tools for data science.

We were very pleased that our survey resulted in a deeper understanding of the Anaconda community as a whole. It was great to see so many people participate. We were also pleased to see how strong our position is within the community and we will continue to drive innovation in this part of the market.

##

Published Wednesday, June 27, 2018 2:00 PM by David Marshall