New York: Strata Conference, Day 1
I’m attending the O’Reilly Strata Conference (the well-respected technical book publisher, not the blowhard TV personality). This is the east coast edition, at the Javits Convention Center in New York City.
Today was tutorials day. Out of about a dozen tracks, I picked the PyData track. This was a rewarding choice. The developers of various Python tools and projects explained what their projects do and walked through new features.
The IPython Notebook looks like something I will want to try out now. It lets you interact with Python through a web browser and mix narrative text with Python code. There are also provisions for setting up GUI elements to set parameters. The Notebook can be executed locally or remotely. I can see setting this up to provide a shared platform for collaboration.
I’m still a bit hazy on Blaze, which at a minimum appears to be a super data access layer for Python.
The Bokeh project aims to provide data visualization for Python. It, too, can display in a browser page and support local or remote operation. It sets up interactive graphics, and allows succinct specification of new graph types in Python without having to delve into JavaScript.
After lunch, the discussion turned to scikit-learn (SKL), the machine learning library. The main thing I learned from the session was that all the models in SKL use the same fit/predict/transform API, making it simple to switch from one algorithm to another. SKL also provides a range of model assessment methods that I hadn’t realized were there, but will now need to try out.
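To illustrate the shared API, here is a minimal sketch, assuming a current scikit-learn install (the `sklearn.model_selection` module postdates this talk) and using the bundled iris data set as a stand-in. The two models and the 5-fold setting are my own choices for illustration, not anything shown in the session.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Small built-in data set: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Two unrelated algorithms, chosen only for illustration.
models = [
    LogisticRegression(max_iter=1000),
    DecisionTreeClassifier(random_state=0),
]

results = {}
for model in models:
    # Every estimator exposes the same fit/predict calls.
    model.fit(X, y)
    predictions = model.predict(X)

    # One of the model assessment helpers: 5-fold cross-validated accuracy.
    cv_accuracy = cross_val_score(model, X, y, cv=5).mean()

    results[type(model).__name__] = (predictions.shape, cv_accuracy)

for name, (shape, accuracy) in results.items():
    print(name, shape, round(accuracy, 3))
```

Swapping in another classifier only means changing an entry in `models`; the loop body stays the same.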
Wes McKinney, the creator of the Pandas data handling package, talked about the changes coming in the soon-to-be-released version 0.15.0. The largest change is the new categorical data type, meaning that Pandas will shortly be able to handle the nominal data values so commonly encountered in data sets, which it previously had no dedicated type for.
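As a rough sketch of what a categorical column looks like, assuming a Pandas release that includes the categorical type: the size labels below are made-up illustration data, and the ordering of the categories is my own assumption.

```python
import pandas as pd

# Hypothetical nominal data, invented for this example.
raw = ["small", "large", "medium", "small", "large"]

# Declare the allowed categories and give them an explicit order.
sizes = pd.Series(
    pd.Categorical(raw, categories=["small", "medium", "large"], ordered=True)
)

# Each value is stored as a small integer code into the category list.
codes = sizes.cat.codes.tolist()
categories = list(sizes.cat.categories)

# Counts per category.
counts = sizes.value_counts().to_dict()

print(codes)
print(categories)
print(counts)
```

Because the values are stored as integer codes plus one copy of each label, a column with many repeated strings typically takes less memory than the plain string version.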
The final talk was about design choices in PyParallel and how those allowed it to outperform Python or NodeJS by about a factor of three on benchmarks.
That will do it for now.