While surfing about on the Internet the other day, I
came across a treasure trove of Open Source code hiding in plain
sight, sponsored by the United States National Institute of Standards
and Technology. The site, the NIST Digital Library of Mathematical Functions, is best described in
its own documentation:
"In 1964 the National Institute of Standards and
Technology...published the Handbook of Mathematical Functions with
Formulas, Graphs, and Mathematical Tables, edited by Milton
Abramowitz and Irene A. Stegun...The online version, the NIST
Digital Library of Mathematical Functions (DLMF), presents the
same technical information along with extensions and innovative
interactive features consistent with the new medium [of computing
technology]...
"...The technical information provided in the
Handbook and DLMF was prepared by subject experts from around the
world...The validators played a critical role in the
project,...[providing] critical, independent reviews during the
development of each chapter, with attention to accuracy and
appropriateness of subject coverage...All of the mathematical
information contained in the [print version of the] Handbook is also
contained in the DLMF, along with additional features...The DLMF has
been constructed specifically for effective Web usage...The NIST
Handbook has...[the] objective...to provide a reference tool for
researchers and other users in applied mathematics, the physical
sciences, engineering, and elsewhere who encounter special functions
in the course of their everyday work..."
The site includes a link to the Guide to Available Mathematical Software, which
is a cross index of mathematical software in use at NIST. Of
course, not all of the software included in this cross index is Open
Source, but that which is available as Open Source (possibly subject
to application-specific licensing constraints) can easily be
downloaded and used either independently, or incorporated into one's
own code. The Guide also provides a cross-reference between
commercial and Open Source software packages that implement specific
algorithms.
As a quick example of the gems available here, let
us look at a small application called "envelope", a "program for
calculating envelope curves for oscillatory data" developed by
Marjorie McClain of NIST and released into the Public Domain. This
program does something I have often felt would be very helpful in the
early stages of studying noisy, logged time-series data, and something
I have actually done by hand through the time-consuming process of
plotting the raw data and estimating the envelope by eye. None of the
analysis packages (commercial or Open Source) I typically use
provides a simple automated means of accomplishing this. That doesn't
mean they lack the capability; it just means that if they do offer
it, it is buried so deeply in all the bloat that it can't be found.
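To make the idea of an envelope curve concrete, here is a minimal Python sketch of the crudest possible approach: find the local peaks and troughs and join each set with straight lines. This is only an illustration under my own assumptions, not the algorithm used by the NIST program; the function names are hypothetical and there is no smoothing step at all.

```python
def local_extrema(y):
    """Return (peaks, troughs): index lists of local maxima and minima."""
    peaks, troughs = [], []
    for i in range(1, len(y) - 1):
        if y[i] > y[i - 1] and y[i] >= y[i + 1]:
            peaks.append(i)
        elif y[i] < y[i - 1] and y[i] <= y[i + 1]:
            troughs.append(i)
    return peaks, troughs


def interpolate(xs, ys, x):
    """Piecewise-linear interpolation through (xs, ys), held flat outside the range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for k in range(1, len(xs)):
        if x <= xs[k]:
            t = (x - xs[k - 1]) / (xs[k] - xs[k - 1])
            return ys[k - 1] + t * (ys[k] - ys[k - 1])


def envelope(x, y):
    """Return (upper, lower) envelope values evaluated at every x."""
    peaks, troughs = local_extrema(y)
    if not peaks or not troughs:
        raise ValueError("need at least one peak and one trough")
    upper = [interpolate([x[i] for i in peaks], [y[i] for i in peaks], xi) for xi in x]
    lower = [interpolate([x[i] for i in troughs], [y[i] for i in troughs], xi) for xi in x]
    return upper, lower
```

Calling envelope(x, y) on two equal-length lists returns the upper and lower curves as lists of the same length. On noisy data every small wiggle counts as a peak, which is exactly why a real tool needs a smoothing step before it looks for extrema.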
When one downloads the package, one winds up with a
*.shar shell archive file, which is essentially an ASCII text file
that combines all (hopefully) of the necessary components of the
package. One unpacks such a file with the simple command "sh [file
name]". This particular package contains a "manual" and some
test data. But we also run into a couple of issues. The program
uses an obscure screen graphics renderer (Volksgrapher) for its output,
and the program was written in Fortran 77 (which was the
de facto standard back when it was written). The first problem is
easy to overcome: since we don't need yet another plotting routine,
we just comment out the calls to the graphical output in the source
code (we could just as easily replace these calls with calls to
something like gnuplot, but we are still in the evaluation stage).
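Commenting out the plotting calls can be done by hand in a text editor, but for a larger package it could be scripted. The following Python sketch assumes fixed-form Fortran 77 source, where a "C" in column 1 marks a comment and a non-blank, non-zero character in column 6 marks a continuation line. The function name is my own, and the routine names passed to it (such as "VGPLOT" in the note below) are hypothetical placeholders, not names taken from the envelope source.

```python
import re


def comment_out_calls(source, routines):
    """Comment out unlabeled CALL statements to the named routines.

    Only statements that begin with CALL in the statement field
    (column 7 onward) are handled, together with their continuation
    lines. Labeled calls and calls embedded in a logical IF are left
    untouched and would need editing by hand.
    """
    names = "|".join(re.escape(name) for name in routines)
    call = re.compile(r"\s*CALL\s+(?:%s)\b" % names, re.IGNORECASE)
    result = []
    inside_call = False
    for line in source.splitlines():
        is_comment = line[:1] in ("C", "c", "*")
        is_continuation = (not is_comment and len(line) > 5
                           and line[:5].strip() == "" and line[5] not in " 0")
        if is_comment:
            pass  # existing comments are kept and do not end a statement
        elif is_continuation:
            if inside_call:
                line = "C" + line[1:]
        else:
            # A new statement: decide whether it is one of the target calls.
            inside_call = (line[:6].strip() == ""
                           and call.match(line[6:]) is not None)
            if inside_call:
                line = "C" + line[1:]
        result.append(line)
    return "\n".join(result) + "\n"
```

One would read a source file into a string, pass it through comment_out_calls(text, ["VGPLOT"]) with the real routine names substituted, and write the result to a new file so the original stays intact.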
The second problem seems a little more ominous at
first, because the GCC Fortran compiler, gfortran (ubiquitous in the Linux
community), does not like Fortran 77 (I suspect this is a
problem one will run into with many of the packages found in
such repositories). The old g77 compiler is no longer supported, and
I don't want to risk corrupting my current libraries with legacy
versions. But all hope is not lost. There is a program called f2c,
a public-domain Fortran-to-C converter, and there is the fort77
utility, which is an interface to the FORTRAN compiler that accepts
the FORTRAN-77 language. Since both of these are available from the
Ubuntu Software Center for my particular distribution, this seems
safer than trying the older g77 package. We try compiling with
fort77, using the sample command provided in the *.shar file:
f77 -o envelope envelope.f smooth.f efc.f initpt.f env.f pchic.f pchfe.f savenv.f \
    r1mach.f vg.a vgx11.a -Bstatic -native -libmil -lX11 -lm
This, of course, doesn't
work because the vg.a, vgx11.a and libmil libraries are apparently
related to the Volksgrapher package, which we have not built so we do
not have them on our system. Also, we note that the flags "-Bstatic"
and "-native" are not legal in fort77. By eliminating the
plotting routines and the illegal flags, the program compiles the
first time around with the command:
f77 -o envelope envelope.f smooth.f efc.f initpt.f env.f pchic.f pchfe.f \
    savenv.f r1mach.f -lX11 -lm
(NOTE: it is likely that
we could eliminate the link to the X11 library as well).
But, does it work?
The package includes some
test data and sample results we can use to evaluate the program, so
we run the "envelope" command we just created with the "test.dat"
file included with the archive. The output generated by the program
agrees exactly with the sample output files, telling us the program is
running as it was designed and doing what it was designed to do (and it
did it too fast to time!). Using LibreOffice Calc to have a look at
the output:

Looks good, exactly what
we expected, but this looks like pretty clean data compared to what
we are usually working with. How does the program work with some
real-world data?
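Before moving on, a note on the agreement check described above: comparing the generated output against the sample output can be automated rather than done by eye. The Python sketch below assumes both files contain nothing but whitespace-separated numbers, and that Fortran-style "D" exponents may appear. The function names are my own, and the file names in the note below are hypothetical.

```python
def read_numbers(path):
    """Read every whitespace-separated number in a text file into a list."""
    values = []
    with open(path) as handle:
        for line in handle:
            for token in line.split():
                # Fortran may write 1.5D+00 where Python expects 1.5E+00.
                values.append(float(token.replace("D", "E").replace("d", "e")))
    return values


def files_agree(path_a, path_b, tol=0.0):
    """Return True if both files hold the same count of numbers, each within tol."""
    a = read_numbers(path_a)
    b = read_numbers(path_b)
    return len(a) == len(b) and all(abs(p - q) <= tol for p, q in zip(a, b))
```

A call such as files_agree("my_upper.out", "sample_upper.out") returns True only for an exact numerical match. Passing a small tol allows for the last-digit differences one might see between compilers.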
Here is an example of
some data from a project:

As seen in the above illustration, the original
envelope curves (dashed lines) seem to follow the data fairly
closely, but are too close together, missing a number of peaks. We
fix this by shifting the upper curve by 0.5 in the positive y
direction and the lower envelope curve by 0.3 in the negative y
direction. I suspect the reason the curves did not fit the data as
well as they fit the test data has to do with the "initial
smoothing" routine. But, with these preliminary results (which
took all of 15 minutes to develop, including loading the results into
LibreOffice Calc to plot them, studying the results, re-running the
"envelope" program with new tolerances a couple of times and
modifying the plots a couple of times to see the new results, etc.),
we now have some valuable information that we can use for further
analysis.
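The manual shift described above is simple arithmetic, and it can be paired with a quick count of how many data points still fall outside the envelope. This Python sketch uses the 0.5 and 0.3 offsets from the text as defaults; the function names and the list-based data layout are assumptions of mine, not part of the envelope package.

```python
def shift_envelope(upper, lower, up=0.5, down=0.3):
    """Move the upper curve up by `up` and the lower curve down by `down`."""
    return [u + up for u in upper], [v - down for v in lower]


def count_outside(y, upper, lower):
    """Count data points lying above the upper curve or below the lower curve."""
    return sum(1 for yi, u, v in zip(y, upper, lower) if yi > u or yi < v)
```

Running count_outside before and after shift_envelope gives a rough numerical check that the widened curves actually capture the peaks the original curves missed.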
This little exercise has been intended to illustrate
some of the more esoteric advantages of Open Source software. First
of all, we have a public domain tool readily available for solving a
particular issue with minimal investment (in both dollars and time).
We have a tool that does one thing very, very efficiently, and we do
not find ourselves wading through all sorts of bloat in a commercial
package (or even some of the larger Open Source projects), so we get to
our solution much faster. If we are not pleased with the way the
software works, we have the option of modifying it to suit our own
needs (e.g., we dropped the program's original graphic output
interface, which did not require a high level of
programming expertise to accomplish). The little application is
pretty much platform independent, so long as one has the appropriate
compiler.
Plus, it is really nice to see our tax dollars going
for something that is practical...
The NIST site is not the only site where such
practical Open Source software can be found. A couple of others (to
which the NIST site also links):
- netlib: A
repository of freely available software, documents, and databases of
interest to the numerical, scientific computing, and other
communities. The repository is maintained by AT&T Bell
Laboratories, the University of Tennessee and Oak Ridge National
Laboratory, and by colleagues world-wide. Most netlib software
packages have no restrictions on their use.
- Collected Algorithms of the ACM: Software published by the journal ACM
Transactions on Mathematical Software (TOMS).
- Computer Physics Communications Program Library: Software associated with
papers published in the journal Computer Physics Communications.