SciPy 2005 Trip Report

The 2005 SciPy Conference [1] (September 22-23, 2005) was held at Caltech. About 50 people attended; roughly half were from universities, and the rest were split evenly between government laboratories and the private sector. There were 12 presentations and about eight shorter "lightning talks". The organizers deliberately limited the number of talks and scheduled long breaks so that attendees would have plenty of time for informal conversations. Links to the presentations for many of the talks are available from the conference schedule page [2].

The conference was a great opportunity to observe the state of the art in scientific computing with Python. The opening paper covered the new array object, scipy_core [3], written by Travis Oliphant of BYU. The new implementation updates the array type to make use of type inheritance through Python's new-style types. (The new type system was introduced in Python 2.2.) The C API and the Python API are close to 100% backward compatible; the exceptions are a few name changes that improve consistency in the naming conventions. A script will be provided to help migrate older code to the new API. The implementation is based on the same internal data structure as the old Numeric array type. The new array is currently in a beta review process, and the final release is expected in November or December. The Numeric, Numarray, and scipy_core implementations of array objects can all be imported and used together in an application. In the short term it may be necessary to use the older array types because they are used by other tools that we will be using for Fieldmarshal; however, I expect these tools will quickly move to scipy_core.

A recurring topic of the conference was the desire to have numerical Python displace MATLAB as the primary tool used to teach students about numerical computation. Eric Hagberg from Los Alamos expressed concern that the current trend in universities is to teach students to use a specific commercial tool instead of teaching them the fundamentals of computer science. This concern was echoed by many others at the conference. Teaching students to use numerical Python was viewed as a way to reduce this problem: because Python makes the calculations much easier to follow, students have a better opportunity to learn how they work.

The second presentation, by John Hunter from the University of Chicago, demonstrated many interesting capabilities of matplotlib [4]. This library was developed to give the numeric packages for Python a plotting library whose API is compatible with MATLAB's plotting commands. The author stated that anyone who knows MATLAB should be able to use matplotlib immediately. The library also provides a more "pythonic" interface to the underlying plotting software. The presentation demonstrated capabilities of the Python interface that are not available in MATLAB.

The IPython [5] command line interface to Python was used in the demonstration of matplotlib. In the hands of an expert these tools were amazing to watch. IPython was the subject of another talk on the second day. Robert Kern, from The Scripps Research Institute, is extending IPython to include a wxPython-based widget that provides the equivalent of the Mathematica interactive notebook. The implementation saves an IPython session to an XML file, which can be shared with others. As in a Mathematica notebook, the executable Python statements in the notebook can be changed, and the calculations from any point forward in the notebook can be recalculated. The demonstration was very impressive for a project that is still in the early alpha stage.
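
The notebook idea (a session stored as XML, with recalculation from any point forward) can be sketched in a few lines of standard-library Python. This is only an illustration of the concept: the `notebook` and `cell` element names and the `save_cells`, `load_cells`, and `run_from` helpers are invented here and are not the file format or API of the IPython notebook project.

```python
import xml.etree.ElementTree as ET


def save_cells(cells):
    """Serialize a list of source strings to an XML string (hypothetical layout)."""
    root = ET.Element("notebook")
    for source in cells:
        ET.SubElement(root, "cell").text = source
    return ET.tostring(root, encoding="unicode")


def load_cells(xml_text):
    """Read the cell sources back from the XML produced by save_cells."""
    return [cell.text or "" for cell in ET.fromstring(xml_text).iter("cell")]


def run_from(cells, start, snapshots):
    """Execute cells[start:] and return the resulting namespace.

    snapshots[i] holds a shallow copy of the namespace as it was just
    before cell i ran, so an edited cell can be re-run without repeating
    the cells above it. Mutable objects are shared between snapshots,
    which is good enough for a sketch.
    """
    if start > len(snapshots):
        raise ValueError("earlier cells have not been run yet")
    namespace = dict(snapshots[start]) if start < len(snapshots) else {}
    del snapshots[start:]
    for source in cells[start:]:
        snapshots.append(dict(namespace))
        exec(source, namespace)
    return namespace


cells = ["x = 2", "y = x * 10", "z = y + 1"]
snapshots = []
first = run_from(cells, 0, snapshots)

# Change the second cell and recalculate from that point forward.
cells[1] = "y = x * 100"
second = run_from(cells, 1, snapshots)

# The edited session survives a round trip through XML.
restored = load_cells(save_cells(cells))
```

After the edit, only the second and third cells are executed again; the first cell's result is taken from the stored snapshot.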

There was also a demonstration by Fernando Perez from the University of Colorado at Boulder of the “chainsaw” branch of IPython during the lightning talks. The new branch extracts an IPython kernel out of the IPython implementation. The kernel can support multiple interfaces, such as the wxPython widget. The demonstration showed the kernel being used to implement a parallel IPython in which code can be automatically distributed to a cluster of servers.

The Envisage tool from Enthought was demonstrated by the owner of the company, Eric Jones. Enthought has just under 20 employees. They are based in Austin, Texas, and their primary client base is the petroleum industry. Most of the Envisage tool is available as open source. Some of the industry-specific applications built on Envisage are proprietary. Enthought was actively encouraging others to download Envisage and integrate it into their projects. Their strategy is to develop a community of developers, just as the Zope Corporation has developed a community around the Zope technology.

Envisage [6] is a framework/toolkit for building large and complex scientific applications. At the core of Envisage is a Python extension called "traits". The key feature of traits is the ability to add type constraints to Python class attributes. For example, an attribute could be constrained to be a floating point number in the range 1.0 to 200.0. A trait can also have an actor associated with an instance of the attribute: if the attribute is modified, any actors registered on it are notified that the trait has changed. The actor capability was demonstrated using the Chaco plotting package, which is built to work with traits and Envisage. As values were changed in a dialog box, the changes were automatically reflected in a graph that was tied to the trait.
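
To make the two ideas concrete (a constrained attribute and change notification), here is a minimal sketch in plain Python. It does not use the Enthought traits package and should not be read as its API; `RangedFloat`, `Observable`, `Sample`, and `temperature` are hypothetical names chosen for illustration, and the "actor" is modeled as an ordinary callback.

```python
class RangedFloat:
    """Descriptor that constrains an attribute to a float in [low, high]
    and notifies registered callbacks when the value changes."""

    def __init__(self, low, high, default=None):
        self.low = low
        self.high = high
        self.default = low if default is None else default

    def __set_name__(self, owner, name):
        self.name = name
        self.storage = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.storage, self.default)

    def __set__(self, obj, value):
        # Reject non-numeric values (bool is excluded on purpose).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{self.name} must be a number, got {value!r}")
        value = float(value)
        if not self.low <= value <= self.high:
            raise ValueError(
                f"{self.name} must be in [{self.low}, {self.high}], got {value}"
            )
        old = self.__get__(obj)
        setattr(obj, self.storage, value)
        if value != old:
            for callback in obj._observers.get(self.name, []):
                callback(self.name, old, value)


class Observable:
    """Base class that stores the callbacks registered per attribute name."""

    def __init__(self):
        self._observers = {}

    def observe(self, name, callback):
        self._observers.setdefault(name, []).append(callback)


class Sample(Observable):
    # Constrained to the range used as the example in the text above.
    temperature = RangedFloat(1.0, 200.0, default=45.0)


sample = Sample()
changes = []
sample.observe("temperature", lambda name, old, new: changes.append((name, old, new)))

sample.temperature = 90          # accepted, observers are notified
try:
    sample.temperature = 500.0   # outside the allowed range
    rejected = False
except ValueError:
    rejected = True
```

A plotting widget in this scheme would simply register a callback that redraws the graph, which is roughly the behavior shown in the Chaco demonstration.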

A demonstration of the TVTK (Traits for VTK) component of Envisage was very impressive. A 3D model of an industrial mixer was displayed in a TVTK window. At the interactive Python prompt of Envisage the blade angle on the mixer was assigned a new value, and the angle changed automatically in the 3D model. Eric then wrote a three-line Python script that iterated over a range of angle values. As the script executed, the mixer blade rotated through the range of angles.

The Envisage tool provides a rich environment for building dialog boxes for data entry of trait values. A default implementation of the table editor for building the dialog boxes provides an entry field for each trait in a class definition. The layout and appearance of the dialog boxes are easily customized. For example, if a trait has been assigned a default value in the Python class definition, it is possible to hide the trait so that it isn't displayed in the dialog box. If a trait is used to hold the results of a calculation, the dialog entry can be set to read-only. When the content of a trait does not meet its constraint, the background color of the entry box changes to flag the error.

Eric also demonstrated a visual programming GUI in which work flows could be drawn on a canvas. This portion of the application is still early in the development process; however, the canvas they created for drawing the work flows could be of immediate use as a basis for the SimpleCAD program.

Another impressive tool for building applications was Vision [7], a 3D model analysis and visualization tool created by Michel Sanner of The Scripps Research Institute. A key design point emphasized by Michel was the separation of the tool from the science it was designed to support. His research is in the area of molecular modeling of drug docking. The example he used to demonstrate the toolkit was a 3D model of the AIDS virus. He demonstrated writing a Python script to automatically cycle through thousands of chemical models to find which chemicals best fit the docking point on the AIDS virus. (A second presentation by Ruth Huey, "AutoDockTools: A Tutorial", provided a more detailed look at the tool.) Michel demonstrated his interface for interactively manipulating the display of a very complex 3D model of the virus. He has created a very impressive tool for examining features of a model by making parts of the model transparent, coloring parts, and changing the lighting model.

Michel also created a visual programming environment in which program blocks that carry out calculations are drawn on the screen and connected to illustrate the data flow through the application. In some ways Vision is much more advanced than Envisage; however, Michel is primarily working alone on the development of Vision, and he has based the GUI portion of the project on Pmw, which is in turn based on Tkinter. While Tkinter is bundled with Python, it is an old GUI toolkit with some non-native look and feel issues. Also, Tkinter is not being actively developed. The wxPython GUI toolkit has a large and active user base, with many people working on its development. It has caught on as the most commonly used GUI toolkit for building new applications. For example, wxPython is used by Envisage and IPython. The Vision software should be a good source of ideas, but these ideas should be ported to the Envisage tool.

Daniel Wheeler from the Material Laboratory at NIST gave a presentation on FiPy [8], a finite volume PDE solver. The manual [9] for FiPy is 125 pages and includes about 60 pages of detailed examples. The examples include calculations of diffusion, convection, and phase field changes. FiPy uses the matplotlib package to create all of the plots normally used in materials research. The tool is also a very good example of how to make the most of operator overloading in Python: the expressions in the source code examples read just like the mathematical expressions that they implement. This aspect of the implementation is very much like Konrad Hinsen's Molecular Modeling Toolkit (MMTK) [10], which was first demonstrated at the San Jose Python conference in 1997.
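
The operator overloading point is easiest to see in a small example. The sketch below is not FiPy code and does not use the FiPy API; the `Field` class and its `laplacian` method are hypothetical, and the scheme is a plain explicit finite difference step rather than FiPy's finite volume method. It only shows how overloaded operators let the update line read like the formula phi_new = phi + dt * D * laplacian(phi).

```python
class Field:
    """Values on a uniform 1D grid; arithmetic operators act elementwise."""

    def __init__(self, values):
        self.values = [float(v) for v in values]

    def _coerce(self, other):
        # Accept another Field of the same length or a plain number.
        if isinstance(other, Field):
            if len(other.values) != len(self.values):
                raise ValueError("fields must have the same length")
            return other.values
        return [float(other)] * len(self.values)

    def __add__(self, other):
        return Field(a + b for a, b in zip(self.values, self._coerce(other)))

    __radd__ = __add__

    def __sub__(self, other):
        return Field(a - b for a, b in zip(self.values, self._coerce(other)))

    def __mul__(self, other):
        return Field(a * b for a, b in zip(self.values, self._coerce(other)))

    __rmul__ = __mul__

    def laplacian(self, dx):
        """Second difference with zero-flux (mirrored) boundaries."""
        v = self.values
        padded = [v[0]] + v + [v[-1]]
        return Field(
            (padded[i - 1] - 2 * padded[i] + padded[i + 1]) / dx ** 2
            for i in range(1, len(v) + 1)
        )


D = 1.0    # diffusion coefficient
dx = 1.0   # grid spacing
dt = 0.25  # time step

phi = Field([0, 0, 1, 0, 0])

# One explicit diffusion step, written the way the equation is written.
phi_new = phi + dt * D * phi.laplacian(dx)
```

With zero-flux boundaries the total amount of phi is conserved, which makes a convenient sanity check on the update.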

There were two other presentations on solver packages that have been integrated into Python. Both suites of software have large development teams writing large numerical libraries in C++. The first presentation, by Bill Spotz, was on PyTrilinos [11], a wrapper around the Trilinos suite from Sandia National Laboratories. Sandia has 31 developers actively working on Trilinos. They have 22 solvers in the toolkit and somewhere around 2000 user licenses. (It is open source, but some type of license is required.) The second presentation, by Matthew Knepley of Argonne National Laboratory, was on FEniCS [12], a joint project between the Toyota Technological Institute at Chicago, the University of Chicago, Chalmers University of Technology, the Royal Institute of Technology, Argonne National Laboratory, and Simula Research Laboratory. The presentation mentioned multigrid, finite element, and finite difference solvers. Matt commented after the presentation that he thought the core architecture of distributed solvers needed to be revisited. The current set of tools does not put enough intelligence into the local code execution. This is an artifact of old software assumptions about memory, processor, and network speed, combined with the use of languages that were not as rich and easy to use as Python.

A talk by Matthew Brett of UC Berkeley on NiPy (NeuroImaging Software in Python) reported on a project that is just getting underway. They are building software in Python for analyzing functional brain images. The software will create a standard analysis pipeline for viewing and analyzing data gathered from an fMRI system. They have released an alpha version of a statistical analysis package written in Python called brainstat.

I had many interesting and informative conversations during the breaks and at dinner. Travis Oliphant talked about his research interests at BYU. His research area is numerical analysis of medical instrumentation data. He is interested in developing models for doing fMRI analysis in which the magnetic fields are not uniform. This might enable the development of cheaper equipment in which the magnetic fields are generated by inexpensive permanent magnets instead of the very expensive equipment required to generate large uniform magnetic fields.

During the introduction several people from JPL, Northrop Grumman, and Raytheon stated that they were using Python for radar-related projects. I talked briefly with Carlos Alcalde from Northrop Grumman. They are using Python to facilitate the testing of a new radar system. There were many other conversations on the internals of Python and on how to improve the install tools for Python extensions. (I need to investigate the eggs tool. Also look at Matt Knepley's www.anl.gov/petsc packaging software.) There was a “birds of a feather” session on this subject following the dinner on Thursday evening. It was very helpful to talk with other developers who are facing similar challenges.

On the return flight I took a closer look at Inkscape [13]. This is a vector graphics drawing tool for editing the W3C Scalable Vector Graphics (SVG) file format. It has been designed to allow plugins and extensions to be written for the tool, and the extensions can be written in Python. By adding extensions that provide special shape drawing tools that can hold parametric data, it may be possible to use Inkscape and SVG to create the simple 2D drawings that SimpleCAD was going to create. The examples for adding Inkscape Effects [14] are encouraging. The API provided by Inkscape makes it very easy to search for shapes and extract the relevant data from the SVG file. SimpleCAD may still be written using the Envisage canvas extension to wxPython. Using SVG as the internal representation format may still make sense, regardless of how the drawings are generated. This would allow the shapes to be easily read into tools such as Inkscape for additional annotation. Also, the next release of Firefox will directly support SVG, so we will also be able to display the drawings directly on web pages.
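
As a rough check that SVG can carry the parametric data SimpleCAD needs, the sketch below reads shapes and their extra attributes using only the Python standard library. It does not use the Inkscape extension API. The `cad` namespace URI, the `material`, `thickness`, and `depth` attributes, and the sample drawing are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
CAD_NS = "http://example.org/simplecad"  # hypothetical namespace for parametric data

DRAWING = f"""<svg xmlns="{SVG_NS}" xmlns:cad="{CAD_NS}" width="200" height="100">
  <rect id="plate" x="10" y="20" width="80" height="40"
        cad:material="steel" cad:thickness="2.5"/>
  <circle id="hole" cx="50" cy="40" r="5" cad:depth="2.5"/>
  <rect id="label-box" x="120" y="20" width="60" height="20"/>
</svg>"""


def parametric_shapes(svg_text):
    """Return the elements that carry attributes in the CAD namespace."""
    prefix = "{" + CAD_NS + "}"
    shapes = []
    for element in ET.fromstring(svg_text).iter():
        # ElementTree stores namespaced names as "{uri}local".
        params = {
            name[len(prefix):]: value
            for name, value in element.attrib.items()
            if name.startswith(prefix)
        }
        if params:
            shapes.append(
                {
                    "id": element.get("id"),
                    "tag": element.tag.split("}", 1)[1],
                    "params": params,
                }
            )
    return shapes


shapes = parametric_shapes(DRAWING)
for shape in shapes:
    print(shape["id"], shape["tag"], shape["params"])
```

Because the extra attributes live in their own namespace, a drawing annotated this way should still be valid SVG for viewers that ignore attributes they do not recognize.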