Open Source Developer Resources

Continuum Analytics is working on a variety of new open-source Python-based technologies for big data.
Blaze extends NumPy, Python’s extremely popular array library, to handle out-of-core computations on data that exceeds system memory, as well as distributed and streaming datasets.
Blaze extends NumPy's successful model of array-oriented programming to out-of-core and distributed data. Blaze allows analysts and scientists to productively write robust and efficient code, without getting bogged down in the details of how to distribute computation, or worse, how to transport and convert data between databases, formats, proprietary data warehouses, and other silos.
The core of Blaze is a generic N-dimensional array/table object with a very general “data type” and “data shape” descriptor for all kinds of data, but especially semi-structured, sparse, and columnar data. Blaze’s generalized calculation engine can iterate over the distributed array or table and dispatch to low-level kernels, selected via the dynamic data typing mechanism.
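To make "out-of-core" concrete, here is a minimal sketch of the underlying idea using only the Python standard library. It is not Blaze's API: the function name `chunked_mean`, the raw float64 file layout, and the chunk size are all assumptions made for illustration. The point is that only one chunk is ever held in memory, so the same loop works whether the file is small or far larger than RAM.

```python
import os
import tempfile
from array import array


def chunked_mean(path, chunk_items=1024):
    """Mean of a file of raw float64 values, read one chunk at a time.

    Hypothetical helper for illustration; not part of Blaze or NumPy.
    """
    item_size = array("d").itemsize
    remaining = os.path.getsize(path) // item_size
    total, count = 0.0, 0
    with open(path, "rb") as f:
        while remaining > 0:
            n = min(chunk_items, remaining)
            chunk = array("d")
            chunk.fromfile(f, n)  # loads only n values into memory
            total += sum(chunk)
            count += n
            remaining -= n
    return total / count if count else float("nan")


# Build a small demo file; a real out-of-core dataset would not fit in memory.
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    array("d", (float(i) for i in range(10000))).tofile(tmp)
    demo_path = tmp.name

demo_mean = chunked_mean(demo_path, chunk_items=1000)
os.remove(demo_path)
print(demo_mean)  # 4999.5
```

A library like Blaze aims to hide this kind of chunking loop behind an array-style interface, so the analyst writes the computation once and the engine decides how to iterate over the data.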
Bokeh is a Python data visualization library combining the ideas of the Grammar of Graphics and Protovis, with enhancements to support interactive visualization. Its primary output backend is HTML5 Canvas.
There are many excellent plotting packages for Python, but they generally do not optimize for the particular needs of statistical plotting or multidimensional datasets. Additionally, advanced visual customization is typically difficult for non-programmers, and most libraries do not build a reified data processing pipeline that supports rich interactivity like linked brushing. Bokeh addresses these problems at their core by using a declarative data transformation scheme, and is engineered to operate in a client/server model for the modern web.
conda is an innovative package manager tool that allows users to mix-and-match different versions of Python, NumPy, SciPy, and other packages in isolated environments and easily switch between them.
The conda command is the primary interface for managing Anaconda installations. It is well suited to enterprise integration and application deployment challenges. It can query and search the Anaconda package index and the current Anaconda installation, create new Anaconda environments, and install and update packages in existing Anaconda environments.
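The operations listed above map onto a handful of conda subcommands. The sketch below only builds the command lines as argument lists and prints them; it does not run conda. The helper names, the environment name `legacy-stack`, and the pinned versions are hypothetical and chosen purely for illustration.

```python
def conda_create(env_name, *specs):
    """Argument list for creating an isolated environment."""
    return ["conda", "create", "--name", env_name, *specs]


def conda_install(env_name, *specs):
    """Argument list for installing packages into an existing environment."""
    return ["conda", "install", "--name", env_name, *specs]


def conda_update(env_name, *packages):
    """Argument list for updating packages in an existing environment."""
    return ["conda", "update", "--name", env_name, *packages]


def conda_search(package):
    """Argument list for searching the package index."""
    return ["conda", "search", package]


# Hypothetical environment name and version pins, for illustration only.
commands = [
    conda_search("scipy"),
    conda_create("legacy-stack", "python=2.7", "numpy=1.7"),
    conda_install("legacy-stack", "scipy"),
    conda_update("legacy-stack", "numpy"),
]

for argv in commands:
    print(" ".join(argv))
```

Each list can be passed to `subprocess.run` on a machine where conda is installed. Keeping the commands as lists rather than strings avoids shell quoting problems.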
Numba is an open-source, NumPy-aware optimizing compiler for Python sponsored by Continuum Analytics, Inc. It uses the LLVM compiler infrastructure to compile Python bytecode to machine code, especially for use with the NumPy run-time and SciPy modules.
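As a rough illustration of how this looks in practice, the sketch below decorates a plain Python loop with Numba's `jit` decorator. The function `sum_of_squares` is a made-up example, and the fallback decorator is an assumption added so the snippet still runs, uncompiled, where Numba is not installed.

```python
try:
    from numba import jit
except ImportError:
    # Assumption for this sketch: without Numba, use a no-op decorator
    # so the function runs as ordinary interpreted Python.
    def jit(*args, **kwargs):
        def decorate(func):
            return func
        return decorate


@jit(nopython=True)
def sum_of_squares(n):
    """Sum i*i for i in range(n) using an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


result = sum_of_squares(10)
print(result)  # 285
```

With Numba present, the first call triggers compilation of the loop to machine code and later calls reuse the compiled version. The result is the same either way.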
NumFOCUS is a charity that supports and promotes world-class, innovative, open-source scientific software. It helps remove the financial burden of continual development for many projects, including some of the most popular: NumPy, SciPy, IPython, PyTables, pandas, Matplotlib, scikit-learn, and more.