Apache Arrow and the Future of Data Frames

In this talk I will discuss the background and motivation for the Apache Arrow project, which defines a columnar in-memory data standard and provides an expanding set of supporting libraries for a variety of programming languages. I will also look at the relationship between data frame libraries and database systems and explore the ways in which analytics systems are likely to evolve to be more "Arrow-native" over the coming years.

Wes McKinney

Wes McKinney is an open source software developer focusing on analytical computing. He created the Python pandas project and is a co-creator of Apache Arrow, his current focus. He authored two editions of the reference book Python for Data Analysis. Wes is a Member of The Apache Software Foundation and also a PMC member for Apache Parquet. He is the director of Ursa Labs, a not-for-profit development group focused on data science tools for Python and R powered by Apache Arrow, built in partnership with RStudio. Previously, he worked for Two Sigma, Cloudera, and AQR Capital Management, and he was co-founder and CEO of the startup DataPad.