Well, I spent my Saturday morning programming a C# application for Windows Mobile 5 to periodically record audio.
I was aiming to set up a data logger using the Raspberry Pi board I’ve got, but I’ve run into enough problems that I decided to look at another approach.
The idea is to log acoustic data underwater to capture snapping shrimp snaps. Snapping shrimp are small crustacean predators that use a specialized claw to generate cavitation events that stun or kill their prey. Cavitation events are loud, and snapping shrimp populations are large, so snapping shrimp snaps are a major component of the acoustic background in tropical to semi-tropical waters wherever there is structure. These elements come together to make snapping shrimp an excellent indicator species: the acoustic record serves as an indicator of snapping shrimp population health, and since snapping shrimp are metazoan predators, their condition in turn reflects the health of the ecosystem.
The state of Florida has a Harmful Algal Bloom (HAB) program with monitoring stations scattered around various sites. The stations have power, some have internet, and there’s space for more instruments to be loaded aboard. The other instruments sample on a regular schedule of four times an hour, so I’m looking to sample one minute of acoustic data on each quarter hour so that my data can be correlated with theirs.
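For what it’s worth, the scheduling logic is simple enough to sketch in a few lines. Here it is in Python rather than the C# I’m actually using on the device, with record_one_minute() standing in as a hypothetical placeholder for whatever starts the capture:

# Sketch of quarter-hour-aligned sampling (not the actual C# logger).
import time
from datetime import datetime

def seconds_until_next_quarter_hour():
    now = datetime.now()
    # Seconds elapsed since the most recent quarter hour
    past = (now.minute % 15) * 60 + now.second + now.microsecond / 1e6
    return 15 * 60 - past

def record_one_minute():
    # Placeholder: the real logger starts a 60-second audio capture here.
    print("recording one minute starting at", datetime.now().isoformat())
    time.sleep(60)

while True:
    time.sleep(seconds_until_next_quarter_hour())
    record_one_minute()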
The Raspberry Pi looked pretty promising as a data logging platform. It comes with an SD card slot, has an Ethernet port, and allows expansion via USB. There are three things that are holding me back on that: stable power, USB audio compatibility, and time synchronization. Getting the RasPi to power on from a cold start is a piece of cake. Getting it to reboot with “sudo shutdown -r now” is not reliable. This is likely a power interaction between my power source(s) (I’ve tried three so far) and the USB hub(s) (I’ve tried two so far). The recommended low-cost USB audio interface for Linux is a Behringer UCA202. ALSA on the RasPi, though, doesn’t think it has a capture device. The RasPi doesn’t have an RTC. This isn’t a problem with an Ethernet connection to the internet, but it is a problem if there is no connection when the RasPi boots. I’ve tried setting up GPSD with NTP to fill in when there is no network connection, but with negative results so far.
That brings me to my alternative plan. Back in 2005, Diane and I had to hurriedly design an autonomous acoustic recording system with COTS parts in order to deploy in three weeks for a field season in Wyoming. We settled on using Pocket PC devices with Core Audio’s PDAudio sound cards and A/D devices. I rigged regulated power supplies that ran off motorcycle batteries so that each unit only had to be serviced every couple of days to swap batteries and memory cards.
Since I’ve gotten my Android phone, I haven’t been using my Dell Axim X50v PDA much. I’ve done ad hoc recordings with the X50v before, dropping a hydrophone over a seawall or overpass, and it’s done OK. So I started looking at audio programming for Windows Mobile devices. It was a bit tougher than it strictly needed to be. The declining market share for Windows CE/Pocket PC/Windows Mobile means that application development goes through Visual Studio 2008, not the latest development tools. (I did install SharpDevelop, but dropped that for VS2008.) Audio support for Windows Mobile looks pretty minimal. There’s an interface based on Platform Invoke (P/Invoke) provided by Microsoft, and there’s the OpenNETCF library that wraps the P/Invoke calls. I tried OpenNETCF because it looked simpler to implement, but I ended up with memory leaks. Using P/Invoke directly gave me success this morning, and now I’m just letting the application record one minute of audio every five minutes as a test. The system memory report under Settings is showing stable usage of memory so far. I’m aiming to deploy the system where it will only be serviced every two weeks, so I really need to watch out for long-term problems. So far, though, it looks like I should be able to finish up the power supply issues and the acoustic gear side of things and get it installed by the time the next HAB station gets deployed. And it fits the budget I’ve got, which is pretty close to nothing at all.
I spoke a little too soon. While the periodic recording bit seems stable and I appear to have quashed memory leaks, I think I’ve got a hardware issue. I had listened to a couple of recordings that actually picked up the signal I was providing on line-in. I checked some more, and found that at some point the X50v went back to recording from its built-in microphone. That, of course, does me no good. I have tried two different cables with the same result. Usually, switching with a plug is simply a matter of physical displacement; if the plug is in, the alternative input makes no connection to the system. I’m not sure how it goes with the X50v, but I’ll probably have to disassemble it to find out.
I had the chance to work with my Raspberry Pi some more late last night and this morning. Quite a lot of stuff works, given that the board design is essentially at a state of “ready for the software developers to do their thing”. But some things are not quite there, or behave oddly.
My first boot-up that I talked about was on a bench without networking. That turns out to be significant. When I tried to run my RasPi with the full load of peripherals in the USB hub and also have the wired Ethernet on, I got a lot of “kevent 4 may have been dropped” error messages and no network connection. The canonical answer on the RasPi forum is that this is a power supply issue, where marginal power to the board means there isn’t enough to properly run the Ethernet circuitry. Some respondents have noted that their circumstances don’t fit into that neatly. I suspect that I may be joining them, but I have some more experimentation to do before saying so categorically. My get-it-working solution so far is to run the RasPi off a dedicated power supply and have all the peripherals on a powered hub. This isn’t ideal for something I hope to deploy remotely. I need to figure out really reliable, comes-up-on-power-on every time configurations.
I spent entirely too much time dealing with something that I should have caught early. The RasPi is a UK invention, and its default settings are convenient for people in the UK. I have a firewall here, and I set my RasPi to enable SSHD so I could log in over the net. I logged in from an Ubuntu box and changed the “pi” user password to something approaching a strong password, you know, one with odd case, numbers, and symbols. That’s all to the good, but then I rebooted and ran into the network interface being offline. Fine, I thought, I’ll log in directly. But I couldn’t, because no matter what I did, I could not generate one of the symbols in the new password from the directly-connected keyboard, not even with alt-codes. Stripping the RasPi down to just power and network allowed it to boot and establish the network interface, and I could log in once again from a remote computer. I changed the password to avoid the bad symbol and worked on localization. That involves “dpkg-reconfigure” applied to three different targets: the keyboard, the locale, and the timezone.
I’ve been able to install a batch of additional software. I installed Cmake and libncurses5, then tried building Avida on the RasPi. The Avida build doesn’t get far. tcmalloc apparently is known to have build issues on ARMv6, plus multiple classes got an “out of virtual memory” error. That still holds with the boot switched to the 224MB main memory setting. But python-scipy and python-gps installed without issues. I even installed VLC to check if the final piece of a media center was anywhere close to done. While the VLC and its dependencies went on without complaint, plugging in a USB DVD drive and pointing VLC at it did not go much of anywhere. There was no continuous playback, and if I changed the media pointer, it would display a single frame. I think that the color rendition was off, but I had plugged in a movie that I hadn’t watched yet, so it is just possible that the cinematographer thought a strange palette would be a good thing.
I tried out my USB GPS dongle. I installed “gpsd-clients” and ran cgps, which reported … absolutely nothing. That was disappointing. I plugged the GPS into my Ubuntu box, and cgps happily displayed a fix and chatter from the dongle. I went back to the RasPi, stopped gpsd, then used gpsmon. That displayed a fix and messages from the dongle. So I’m not sure why gpsd on the RasPi is doing things differently than on the Ubuntu box.
For those pulling up “Geany” to do some Python scripting, you’ll need to change the preferences so that the terminal of choice is not “xterm”, but rather “lxterminal” (this is for Debian Squeeze).
That’s it for now. I’m expecting to have to repeat this process whenever a new version of the operating system is released, so a set of notes on what gets done seems in order.
Notes on RasPi
sudo nano /etc/resolv.conf
sudo dpkg-reconfigure keyboard-configuration
sudo dpkg-reconfigure locales
sudo dpkg-reconfigure tzdata
Additional python modules:
sudo apt-get install python-scipy
sudo apt-get install python-gps
WiFi dongle —————————————–
Add to /etc/apt/sources.list:
deb http://ftp.us.debian.org/debian squeeze non-free
sudo aptitude update
sudo aptitude install firmware-atheros
sudo wget http://wireless.kernel.org/download/htc_fw/1.3/htc_9271.fw
sudo wget http://wireless.kernel.org/download/htc_fw/1.3/htc_7010.fw
I checked the UPS tracking number periodically today. My Raspberry Pi was marked as delivered at about 2:30 today.
When I got home, I found the package. I still needed to prepare the SD card, so I brought up the RasPi Wiki instructions for SD card setup and went with the Debian Squeeze distribution to start with. While “dd” was doing its thing, I was preparing other things.
The LCD monitor I want to use needed to have its built-in stand removed. There wasn’t room to attach the HDMI cable to the HDMI to DVI adapter and fit that to the monitor while the stand was on.
I located a USB keyboard and trackball. I also found a USB trackpad.
Back to the Ubuntu box and the SD card. I went through the steps to resize the SD card partition with parted. I got some weird messages from the two steps following parted, but apparently one other step was needed: remove the SD card and reader, then plug it back in. With that done, the SD card looked to be in good shape.
I unpacked the USB hub and plugged in power. I hooked up the USB Y cable to the USB to Micro B cable and the hub. I plugged the keyboard and trackball into the hub.
Then I opened up the RasPi package. The package held a packing list (one RasPi, of course), a “Getting Started” single sheet document, and a plain cardboard box. The RasPi was in an antistatic sleeve in the box. It came out, and I started hooking things up.
The SD card holder gave me pause. There’s a gold-plated bar that the card meets, and it took me a moment with a magnifying glass to make sure that it was intended to move when the card was inserted. It looks to be a switch arrangement to indicate the presence of a card.
Then the HDMI cable went in. The monitor changed from its “no signal” display to a blank black screen.
I hooked up a USB data cable between the RasPi and the USB hub.
Then I plugged in the power. There was about a three-count before the monitor started displaying the initial boot-up screen. Things proceeded nicely from there.
The RasPi all hooked up.
Here’s the USB hub and a couple of USB peripherals of interest, an audio interface and a GPS.
And here is the RasPi system hooked up and driving the monitor, showing the default X Windows desktop.
The RasPi doesn’t like my Logitech USB trackball, but it works fine with a trackpad. I’m having some trouble with the keyboard, but I expect that it is the keyboard’s fault. These accessories are pretty ancient by computer standards.
It’s a bit disappointing that the USB WiFi adapter that I have on hand doesn’t seem to be working with the system. I’ll give it another try before moving on to other stuff. That means I’ll need to put the RasPi setup where I can run a physical Ethernet cable.
Looking at dmesg, both the GPS and the audio interface appear to be recognized OK. That’s about as far as I’ve gotten on that.
The word is that the platform I’d like to deploy on won’t go out for two to four weeks, so I have a little time to organize and develop a RasPi-based data collection system. The first step went nicely enough that I’m hopeful about the rest.
Diane and I are working on a personal project to put together an acoustic sampling system that could yield information about the activity levels of snapping shrimp. Whitlow Au and his group have done this sort of thing out in the Pacific. Of course, they’ve gotten research funding to do it. We’re looking to do this out of our pockets, at least for the first proof-of-concept.
Snapping shrimp are small crustaceans. They stun their prey using an oversized claw. Well, that’s just half the story. Any crustacean with a claw might grab or bonk a prey item using a claw. Snapping shrimp create a cavitation event with a snap of their claw. The resulting burst of acoustic energy is a natural disruptor beam (obligatory SF reference can be checked off now). There’s some cool high-speed video of snapping shrimp doing their thing that got published some years back.
Those cavitation events are loud. Until human shipping noise is added to the picture, the single biggest item in the tropical to semi-tropical littoral marine acoustic environment is energy from snapping shrimp snaps. Part of the challenge for my dissertation work on dolphin clicks was coding a recognizer that would include dolphin clicks but exclude snapping shrimp snaps.
Because their method of prey capture produces a signal that travels significant distances, their activity can be tracked for a particular location just using acoustic recording. Because snapping shrimp are so widely distributed and so abundant anywhere there is structure in the (relatively shallow) marine environment, this can be done just about anywhere of interest: seagrass beds, reefs, mangrove swamps, etc.
We’re thinking of snapping shrimp as an indicator species. The various factors of their life history and acoustic features make them well-suited for this role. A drop in snapping shrimp activity that doesn’t fit the usual diurnal and seasonal patterns would be taken as an indicator of declining ecosystem health.
But to get there, we have to be able to sample those acoustics. This is a job we’re hoping to accomplish with an instrument for which we’ve budgeted $200 in parts. This is pretty much penny-pinching taken to an extreme. Here’s the basic gist of where we’re going.
We’re hoping to base the instrument on the new Raspberry Pi platform. This ARM-based Linux system comes with an SD-card interface plus USB. It doesn’t come with a clock. For places with a network connection, NTP can handle setting the time. For other places, we’re hopeful that a cheap USB GPS dongle will serve to provide both time and location. The RasPi also has no sound input, so a USB sound interface is needed. The RasPi needs a power supply, as do whatever USB devices we want to use, so a powered USB hub seems the best solution. We’ll need a hydrophone. That’s something we can make out of a piezo disk, cabling, and some waterproofing method (epoxy, urethane, or perhaps even Plasti-Dip). And that will need a preamplifier. This is where we might bust our budget.
The RasPi is $35. The GPS with USB is $28. The sound interface is $29. The powered USB hub is $27. A piezo disk is about $0.50, and the Plasti-Dip for it might cost a buck.
Some time back, Diane worked with engineers at the University of Texas at Austin’s Applied Research Lab on a dolphin biosonar project. They set out to make a preamp that would provide flat response from a few kilohertz up to two megahertz. The result was a circuit they called the Universal Dolphin Preamplifier. Depending on the discrete components on the circuit, it could be configured for 0, 20, or 40 dB of gain. Even though our first pass at an instrument would be strictly human audio range, I had hoped to be able to construct one of these preamplifiers for use in the project. That was before I started pricing the integrated circuits used in it. There are three of them, and the prices are $37, $16, and $13. All told, I’m estimating about $86 for the cost of parts for one preamplifier circuit. Instead, I’ll be looking to use a more common — and cheap — audio-range preamplifier for our first instrument to deploy.
There are some other things that would be useful to add that may not make it, like some sort of LCD panel to indicate system status. We may just go with some LEDs.
There are consequences of being cheap. The peak frequency of the broadband transient that is a snapping shrimp click is upwards of 50 kHz. There’s energy at frequencies within the human audio range, so recording at that range will allow detection of snapping shrimp clicks, but not any sort of spectral analysis that would mean anything. That means just getting measures of activity, like number of detectable clicks. Recording a single point likewise doesn’t tell us much about the spatial distribution of the snapping shrimp being recorded. We might group clicks by relative received amplitude as a proxy for distance from the hydrophone. And because we’ll deploy an uncalibrated hydrophone, we won’t be getting absolute amplitudes out of the samples; everything will simply be relative.
Doing this for the maximum amount of information would thus imply use of calibrated hydrophones, multiple hydrophones to allow for acoustic localization, and sampling rates high enough to capture the full frequency range of snapping shrimp clicks. A calibrated hydrophone from a vendor could easily run over $1000 each. A system for recording four simultaneous channels of acoustic data at up to 500 kilosamples per second could be done for about $1000 using the Tern Micro GR4 ADC units and a microcontroller. That complete system could easily run between $6000 and $10000 all told. So for the moment we’ll stick with the limitations of doing science on a shoestring budget.
I’ve been going through biosonar data and while the SciPy specgram method is serviceable, I was interested in a short-time Fourier transform (STFT) implementation. There are a couple of ad hoc routines on Stack Overflow and the like, but I’ve started off with the Google Code PyTFD module. There are others out there as well, at least two projects including an STFT implementation are aimed at extracting time and frequency data from musical recordings. I may have a look at one or both of those at some point.
In any case, installing PyTFD involves downloading the code via Subversion and then running the setup.py script.
Since I spent more time than I think was absolutely necessary getting a couple of examples done with the STFT, let me run through an example in the hopes that helps somebody.
- # Imports
- from __future__ import division
- from pytfd.stft import *
- from pytfd import windows
- import numpy as np
- import numpy.fft as nf
- import matplotlib
- import scipy
- import scipy.signal as spsig
- import pylab
- from pylab import *
- # [...]
- w = windows.rectangular(8)
- Y_stft = stft(clkdata,w)
- extt = [0, Y_stft.shape[1]*1e-6, 0, 5e5]  # assuming the time samples run along axis 1 of the STFT result
OK, so there’s a fair number of things to be imported along the way. The PyTFD imports near the top are specifically for setting up access to PyTFD’s STFT method. The “windows.rectangular(8)” line sets up the window function to use in the STFT. The “stft(clkdata, w)” line actually does the work, returning a multidimensional Numpy array with the STFT result given a Numpy array input and the window.
The “extt” line sets up the extent array to express the size of the X range and the Y range covered by the STFT. The lines that follow (elided above) put the result in a subplot. There are some issues there. The STFT results are essentially a whole series of Fourier transforms, and those have both negative and positive frequencies, and are complex values to boot. So the “abs” function provides a magnitude for each point. A slice yields just the positive frequency range. Then the extent gets set to the range represented by the STFT. The “aspect” parameter is set to “auto” so that the X and Y ranges can be calculated separately by Matplotlib. The “origin” is set to “upper” to put the frequencies in the expected orientation.
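For anyone who wants the elided plotting step spelled out, here is a minimal sketch continuing from the snippet above. The exact slicing depends on how PyTFD orients its output array; the assumption here is frequency along axis 0 and time along axis 1.

# Sketch of the plotting step (continuing from the snippet above).
import numpy as np
import pylab

mag = np.abs(Y_stft)                 # complex STFT values -> magnitudes
half = mag[:mag.shape[0] // 2, :]    # keep just the positive-frequency half
pylab.subplot(2, 1, 2)               # e.g. the lower panel of a two-panel figure
pylab.imshow(half, extent=extt, aspect='auto', origin='upper')
pylab.xlabel('Time (s)')
pylab.ylabel('Frequency (Hz)')
pylab.show()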
Here’s a couple of the outputs:
As noted here before, I’m working through refreshing archived data, mostly from CD-ROM media. I’ve run into a whole batch of CD-ROM disks that are in good physical condition, but which mostly cannot be read. I’m trying some tools that I’ve seen recommended, but would be open to suggestions.
But the whole point of getting the archived data refreshed is to do something with it. And that’s what I will aim to discuss here this time.
Over several years, there were a number of different technologies I was using to collect bioacoustic data. This means that I don’t have one single type of data of interest. I have data that was recorded on audio cassette tape. I have data from a Racal Store V data recorder that was transferred to cassette tape. I have digital data from Keithley-Metrabyte DAS-1800 DAQ, Tucker Davis Technologies DAQ, and a couple of different National Instruments DAQ boards multiplied by at least two different multichannel scenarios. Plus, there’s digital data transferred off of a Racal Storeplex unit via SCSI. There are mixed-endian byte-order issues, among other things.
I have a good software solution for two of these particular data acquisition scenarios. I wrote that between 1999 and 2001 using Borland’s Delphi 5. In all, there’s about 60,000 lines of code for data acquisition, reduction, analysis, and visualization. The original can handle multi-channel recordings taken from a single National Instruments board. A variant works on digitized audio recordings. That includes interactive data reduction with an automated click-picker whose choices can be refined with changes in parameters or by interaction with an oscillogram graph.
That still leaves a lot of data waiting for analysis. During my time at Michigan State University, I got into Python programming. There are a number of nice things about going after the rest of the data with Python. A big one is that Python is free, open-source software. I can have colleagues install it and not have to worry about breaking their budgets, which is a concern when one considers the well-established science and engineering scripting platform, MATLAB. While Python doesn’t yet have all the “toolbox” capability of MATLAB, it has enough to move ahead with. For the scientific programmer, there are the Numpy, Scipy, and Pylab modules (I installed the Python(x,y) package on my Windows laptop, which includes those and more besides.) Numpy extends Python with a fast array and matrix manipulation capability. Scipy includes a variety of analysis tools. Pylab looks to put a wrapper on those two, plus the Matplotlib graphics module and the Ipython interactive shell.
I recently wanted to extract spectral information about dolphin clicks from one of the datasets that I hadn’t previously examined. So I turned to Python to do that. The data was stored as raw binary, 16 bit signed integer samples. Reading that data was simply:
- fd = open(fn, 'rb')
- read_data = np.fromfile(file=fd, dtype=np.int16)
- fd.close()
where “fn” is a filename pulled from the directory of interest. The “np” reference above resolves to “numpy”. The three lines say to get an open file object, fd, by opening a file, fn, for binary read. Then, a Numpy array containing the data is returned by the Numpy static method, fromfile, given the file object and the specification of the data type as signed 16 bit integers. The third line closes the file object. If I had a problem with endian issues, there’s at least a couple of ways to address that in Numpy. (Getting the wrong byte order should be obvious on visualization, but I’ve seen a professor merrily tout a new processing method for dolphin clicks when his slides clearly showed that he had a byte-order problem with his dataset.)
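For reference, two of those Numpy byte-order options look roughly like this; they are alternatives, not something this particular dataset needed:

# Option 1: declare the byte order up front in the dtype
# ('>i2' is big-endian 16-bit signed integer, '<i2' little-endian).
read_data = np.fromfile(file=fd, dtype=np.dtype('>i2'))

# Option 2: read with the native dtype, then fix up the already-read array
# by swapping the bytes and relabeling the array's dtype to match.
read_data = read_data.byteswap().newbyteorder()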
While it is better to handle DC offset problems at the time of data collection, sometimes you just have to deal with it at analysis time. This dataset handed me that problem. This problem is one where a time-varying signal should be centered at zero volts input, but instead centers at some non-zero voltage. Fortunately, it was a fixed offset, so a pretty simple approach worked nicely: find the mean value across the dataset, and subtract that value from each sample.
- shiftdata = read_data + ([-np.average(read_data)])
The use of a Numpy array for the data means that the one line above handles the element-wise addition operation. The Numpy array on the left is now a floating-point array instead of an integer array.
My Delphi program had a click-picking algorithm that took a while to craft. I haven’t ported it yet, so I just went with a very simple approach in Python. That looks at chunks of the data, where the chunksize was selected to be a bit larger than the maximum click width, but a good deal smaller than the interval between clicks. Within each chunk, the maximum value and minimum value are found. If the maximum and minimum are outside a defined noise level, consider it a found feature.
- chunkmin = np.min(cary)
- chunkmax = np.max(cary)
- if (chunkmin < -noiseband) and (chunkmax > noiseband):
- # Found a click! Or a transient, at least.
- chunkmaxloc = cary.argmax()
Using the Numpy routines to find the min, max, and max location is pretty snappy.
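Putting the chunking logic together, the loop looks roughly like this, continuing with the “shiftdata” array from earlier; the chunksize and noiseband values are placeholders to be tuned to the dataset:

# Sketch of the chunk-based transient finder described above.
import numpy as np

chunksize = 512
noiseband = 200.0           # amplitude band treated as "just noise"
clicks = []                 # (sample index of peak, peak value) per found feature

for start in range(0, len(shiftdata) - chunksize, chunksize):
    cary = shiftdata[start:start + chunksize]
    chunkmin = np.min(cary)
    chunkmax = np.max(cary)
    if (chunkmin < -noiseband) and (chunkmax > noiseband):
        # Found a click! Or a transient, at least.
        chunkmaxloc = cary.argmax()
        clicks.append((start + chunkmaxloc, chunkmax))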
Then, for each “click” located, I ran an FFT to get a power spectral density, and plotted that. I just used example code to add this functionality. (For underwater acoustics where pressure is measured, though, the conversion to decibels uses a factor of 20 rather than 10.)
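A minimal sketch of that per-click PSD step, with the factor-of-20 conversion; the sample rate and FFT length here are just example values, not the actual recording settings:

# Sketch: power spectral density of one located click, in relative dB.
import numpy as np

fs = 500000.0                        # example sample rate, Hz
nfft = 256                           # example FFT length

click = shiftdata[start:start + nfft]                   # one located transient
spectrum = np.fft.rfft(click * np.hanning(len(click)), n=nfft)
psd_db = 20.0 * np.log10(np.abs(spectrum) + 1e-12)      # pressure -> dB (factor of 20)
psd_db -= psd_db.max()                                   # relative dB (uncalibrated)
freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)                # frequency axis, Hz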
So, for a quick and dirty script of less than three hundred lines total, I was able to:
* get a directory listing
* match filename features to identify files to analyze (a minimal sketch of these first two steps follows the list)
* remove DC offsets
* save new versions of the data
* scale the data according to field notes
* locate “clicks” in the data
* generate a PSD for each “click”
* collect PSD data
* generate and save oscillogram/PSD plots
* rank “clicks” on spectral features
* copy off plots of the highest-ranked clicks to a directory
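Here is what the first two items might look like, stripped to the bare minimum; the directory path and filename pattern are placeholders, not the real ones:

# Sketch: list a data directory and pick out files to analyze by filename pattern.
import os
import re

datadir = '/path/to/raw/data'             # placeholder path
pattern = re.compile(r'trial\d+\.bin$')   # placeholder filename feature

files_to_analyze = [os.path.join(datadir, fn)
                    for fn in sorted(os.listdir(datadir))
                    if pattern.search(fn)]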
My 2.4GHz dual-core Ubuntu workstation ran this script on 230 megabytes of data, producing over 1,400 graphs, and did it in eight minutes. I’ve just located a calibration sheet on the hydrophone used, so once I’ve digitized that and applied it, I’ll post an example with real dB numbers on the axis.
I don’t know what other people got up to this weekend, but mine has been pretty well filled with computing projects.
I’ve been working with my friend Marc to try to get to the bottom of the Verizon FIOS connection foul-up. We each ran TCPDUMP on our respective machines while making a request that could be fulfilled (a small static HTML page) and one that could not be fulfilled (a dynamic page for webmail). We’ve sent the logs off to a networking guru friend of ours to see if he has any ideas. While I fully expect that this is a problem in Verizon’s gear and processes, we are continuing to test any possibility that a fault in our gear could be an issue.
As I’ve mentioned previously here, I have data stretching back to the mid-1990s on CD-ROM. I’ve made a chunk of progress toward refreshing the archive by copying various of those to hard disk. It takes time, and needs manual attention every five minutes or so to unmount the last disk, load the new disk, mount it, and set up a copy process. Fortunately, most of the disks simply copy without error. I’m using ddrescue to go after the few files that won’t copy cleanly.
I’ve also been going through some of the packed boxes to locate more disks to be refreshed. Along the way, I’ve been reminded that I also have a pile of video and acoustic recordings on tape to digitize as well. I do have a cassette tape deck set up to digitize to my laptop, but I haven’t gotten my desk set up nicely to incorporate the video digitizing machine into a smooth workflow. From left to right, I have a Macbook Pro, a Viewsonic 24″ LED monitor for a second screen for a laptop, a Gateway MT6458 laptop running Win7, an Optiquest 15″ monitor for a desktop machine, plus keyboard and mouse for a desktop. Under the desk itself, I’ve got the video digitizing machine and the workstation/file server box. The video digitizing machine was built as state-of-the-art in 2001. It runs Windows XP, since the digitizing card doesn’t work under anything more recent. It still does a nice job of pulling in analog sources in a DV video stream. The file server is much more recent, being built in 2007. It runs Ubuntu Linux 11.10. There’s 4 terabytes of hard disk storage in that machine, which we use for our project files, personal files, multimedia, photos, and data. We’re coming up to the limits on that, especially after this weekend’s work.
I found a box of pocket notebooks, several of which have notes from our research data collection. But I did find one that has notes from the 1997 Discovery Institute conference on “Naturalism, Theism, and the Scientific Enterprise”. I see from my notes that Michael Ruse classed approaches to “religion v. science” into “conflict”, “accommodation”, and “separation”. I don’t think “accommodation” was used by Ruse in exactly the same way that more recent commentary has gone, but I thought it interesting to see the word there, anyway.
I’m also working on some Python programming and a PHP/MySQL project. Between these things, that pretty well soaks up the time.
It’s been a long time coming, but the paper on evidence for multiple sound sources in the bottlenose dolphin appears in the October 15th issue of the Journal of Experimental Marine Biology and Ecology. I’ve been told that the PDF will be freely available soon, hopefully in the next week or so.
The abstract is:
Indirect evidence for multiple sonar signal generators in odontocetes exists within the published literature. To explore the long-standing controversy over the site of sonar signal generation, direct evidence was collected from three trained bottlenose dolphins (Tursiops truncatus) by simultaneously observing nasal tissue motion, internal nasal cavity pressure, and external acoustic pressure. High-speed video endoscopy revealed tissue motion within both sets of phonic lips, while two hydrophones measured acoustic pressure during biosonar target recognition. Small catheters measured air-pressure changes at various locations within the nasal passages and in the basicranial spaces. Video and acoustic records demonstrate that acoustic pulses can be generated along the phonic fissure by vibrating the phonic labia within each set of phonic lips. The left and right phonic lips are capable of operating independently or simultaneously. Air pressure in both bony nasal passages rose and fell synchronously, even if the activity patterns of the two phonic lips were different. Whistle production and increasing sound pressure levels are generally accompanied by increasing intranarial air pressure. One acoustic “click” occurred coincident with one oscillatory cycle of the phonic labia. Changes in the click repetition rate and cycles of the phonic labia were simultaneous, indicating that these events are coupled. Structural similarity in the nasal apparatus across the Odontoceti suggests that all extant toothed whales generate sonar signals using the phonic lips and similar biomechanical processes.
This was a big undertaking, requiring the coordinated effort of a lot of talented and busy people.
Diane Blackwood designed and implemented our acoustic recording layout and the dolphin stationing device and biteplate, and made sure the amplifying equipment was operational and protected from incident. (Incidents with electronics in proximity to sea water are all too common.) I designed and wrote the software that acted as a multichannel digital data recorder, the data reduction program, and the analysis program. Bill van Bonn was our veterinarian who spent our data recording sessions lying prone on the dock as he placed, checked, and positioned the endoscopes and pressure catheters. Our principal investigator, Ted Cranford, operated the video side of things, including the high-speed video capturing the endoscope views. Sam Ridgway and Don Carder consulted with us, helping us with the use of the pressure catheters (which had previously been used in two prior studies they authored). Monica Chaplin and Jennifer Jeffress were the dolphin trainers on the spot during data recording. Tricia Kamolnick and Mark Todd were trainers who helped get the subjects prepared for our data collection process, and Mark Todd implemented the regular video system. It took between two and three hours each data collection day for us to set up, test, and calibrate all the equipment. Breaking down took somewhat less time, but I would still have to run a custom program to demux the data, produce images visualizing the data for each trial, and then shift the day’s data off the hard disk and on to CD-ROM media.
Update: The Marine Mammal Center has put up the PDF of the paper.
I occasionally check out the Tern Micro website. They are manufacturers of controllers and expansion boards for embedded applications. Their controller boards use IAPx86 class CPUs and are programmed in C. A few years ago, I had checked with them about whether they had components suitable for a field acoustic recorder, and given the short time schedule we had, we decided to go with off-the-shelf components instead for that. Things have changed, though, as I found an expansion board of theirs called the GR4 on their page.
Let me set some context. Some years ago, Whitlow Au and Marc Lammers put together a four-element hydrophone array that allowed them to perform acoustic localization. If I recall correctly, their recording system was based upon a National Instruments DAQ card for CardBus hosted in a laptop computer and was capable of 500 kilosamples per second. When multiplexed across four channels, that’s a max of 125 kilosamples per second per channel. With a multiplexed system, you have to account for time offsets between channels as you analyze the data for time-of-arrival estimates of signals. If there is crosstalk at the high acquisition rates, you might have to drop the total sampling bandwidth to give the multiplexing circuitry time to settle to the next channel’s input level. That at least is how I had to work with an NI PCI-MIO-16-E DAQ card back in 1999. The solution to this problem is simultaneous-sampling, where all the channels of interest get their own sample-and-hold circuitry and the conversion is triggered off the same clock input. Simultaneous-sampling hardware is more expensive, since the main sets of circuits have to be multiplied for the number of channels. Around 2001, a project I was involved with bought a couple of simultaneous-sampling DAQ cards for the PCI interface, at a cost of a couple of thousand dollars each.
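To make the multiplexing bookkeeping concrete, here is a small sketch of per-channel sample timing for a board that scans channels in sequence. The numbers are illustrative, not the specs of any particular card:

# Sketch: per-channel sample times for a multiplexed DAQ.
# With a 500 ks/s aggregate rate scanned across 4 channels, each channel is
# sampled at 125 ks/s, and channel i lags channel 0 by i / 500000 seconds.
import numpy as np

aggregate_rate = 500000.0     # samples per second, across all channels
nchannels = 4
per_channel_rate = aggregate_rate / nchannels
nsamples = 1024               # samples per channel in this example

base_times = np.arange(nsamples) / per_channel_rate
channel_times = [base_times + (i / aggregate_rate) for i in range(nchannels)]
# Time-of-arrival differences between channels have to be corrected by these
# fixed offsets before doing localization; simultaneous-sampling hardware
# makes the offsets zero by design.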
Of course, lugging a full-up desktop system into the marine environment is not a thing to be undertaken lightly. If one could instead reduce the field recording part to something that could be effectively shielded from the elements and work instead off of straight DC battery power, it would be all-around more convenient. The more remote the field work, the more convenient that gets.
So let’s get back to the Tern GR4. This analog-to-digital expansion board is small, just a bit longer and wider than a business card. It can be provisioned with two ADC chips and 4 MB of memory (and that full configuration is what I’m talking about). The base price is $129, but with the additional features added the cost is $259. The GR4 boards are stackable. There are pin headers that form a communication and data bus with a controller card. Each GR4 permits simultaneous-sampling of two input channels. Each GR4 with two ADC chips aboard can record to its own CompactFlash card continuously by switching between ADC chips and FIFO memory, allowing the just-converted data from one FIFO to be streamed to the CF card while the other is collecting newly-converted data. Because the GR4 units are stackable, you can run several together at once. The Tern page shows a stack of four GR4s and a controller card. The maximum sample rate for the GR4 is 500 kilosamples per second. This means that each simultaneously-sampled channel can be recorded at that 500 kilosamples per second rate. It does 16-bit conversion, which gives good dynamic range to the recordings.
So the technical problem of getting to a four-channel field-deployable data recorder capable of capturing most of the acoustic information from a dolphin click has gotten both easier and cheaper with Tern’s GR4. I had a chat with a technical representative at Tern going over what would be needed for this application, and basically got a recommendation for a couple of different controllers that could do the job with the addition of two GR4 units. Tern offers an evaluation package of a controller board plus the interface hardware and software needed for system development at $249. Add-on options are additional cost. For one of the boards, I’d be interested in an LCD 16×2 readout, RTC clock, CompactFlash interface, and switching regulator, which would add another $100 to the $249 evaluation kit price. So for $349 + 259 + 259 = $867, I’d have that part of the data recorder in hand. Of course, I’d still be looking at a variety of additional costs in development, but this makes contemplating the task that much more feasible.
There are some additional concepts that ought to be broached. For two GR4s, one has to provide CF cards for each. It is pushing the hardware to get continuous sampled data out to the CF card on each expansion card. Trying to move the data over the bus to the controller and out to its CF card just isn’t feasible. There is no file system involved on the CF cards; the data is written to absolute sectors. This makes it a bit more interesting pulling that data off for analysis. In development, it will be up to the programmer to track which sectors go with which recording if multiple recording sessions are used. The signal input range for the ADC circuitry is 0-5V, which means that the output of many amplifiers will have to be conditioned to fit in that range. When recording two channels at 500 kilosamples per second, the total data bandwidth is 2 million bytes per second. So each CF card will receive about 7 gigabytes of data per hour of recording operation. A 32 GB card should be good for over four hours of data recording before needing to be swapped out. The Tern rep estimated that my stack of a controller plus two GR4s would pull around 500 mA of power at 5V while recording. The A-86-P controller at least has on-board power regulation so that it handles DC input from 8.5V to 24V and delivers regulated 5V power to its stack. I figure something like a motorcycle 12V battery would likely provide enough juice for a day’s worth of recording. When not actively recording, though, the controller and its stack can go into a sleep mode that draws only a few mA, which saves a lot on battery power.
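The back-of-the-envelope arithmetic behind those numbers works out like this:

# Quick check of the GR4 data rate and CF card endurance quoted above.
bytes_per_sample = 2                    # 16-bit conversion
channels = 2                            # per GR4 board
sample_rate = 500000                    # samples per second per channel
bytes_per_second = bytes_per_sample * channels * sample_rate    # 2,000,000 B/s
gigabytes_per_hour = bytes_per_second * 3600.0 / 1e9            # about 7.2 GB/hour
hours_on_32gb_card = 32.0 / gigabytes_per_hour                  # about 4.4 hours
print(bytes_per_second, gigabytes_per_hour, hours_on_32gb_card)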
I was told by the Tern rep that the GR4 was developed for the needs of a research group doing field work on bat biosonar. It’s no wonder that it caught my eye when I ran across its description.
I’ve been busy recently doing up figures for a paper on dolphin biosonar. One of the figures we ended up turning in earlier this week wasn’t exactly as I wanted it, but deadlines don’t wait. I put a lot of hours into trying to find alternative plotting for it, but just hadn’t found the right approach for an alternative.
Now that we’re done with that paper’s submission, I think I’ve found the approach to use in the future.
Here’s the problem: show the power spectral density (PSD) curves for all the clicks in a biosonar click train. What I was using years ago was my own code plotting a waterfall of PSDs on a bitmap. But I tied things too closely to the specifics of how I generated the PSDs, so for the 256-point FFT window I end up with each PSD’s width as exactly 256 pixels. That’s less than an inch for standard 300 dpi print resolution.
There are examples for “fence” plots in gnuplot and Python’s matplotlib, but I wasn’t able to get stuff that looked much better than up-res’d versions of my originals. Did I mention that I want to assign particular colors to each PSD in the click train?
Yesterday, I was thinking a bit more about the problem, and decided to look into Python’s matplotlib again, this time going from the demo code on using a PolyCollection, that is, a collection of arbitrary polygons. That is looking quite promising. Here is an example of what I’ve got so far going along this approach:
The shapes are nicely done, I like being able to set a transparency value, I can output to a scale and file type I specify, and I can assign a specific color to each PSD in the series. (The colors are randomly set in this demo.) About the only quibble I have with the whole thing is that I’d like to run the “Y” axis in the other direction, so that the earliest clicks are plotted at the back of the plot, and the most recent are in the foreground. It’s easy enough to flip around the list, but I haven’t yet figured out getting the numbering to run the wrong way.
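For anyone wanting the gist of the PolyCollection approach, here is a stripped-down sketch along the lines of the matplotlib demo. The PSD curves here are synthetic stand-ins for the real click-train data, and the color choice is arbitrary:

# Sketch: waterfall of PSD polygons using PolyCollection in a 3D axes.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import PolyCollection
from mpl_toolkits.mplot3d import Axes3D  # noqa: registers the '3d' projection

# Placeholder data: one fake PSD (in dB) per click, over a common kHz axis.
freqs = np.linspace(0, 250, 128)
psd_list = [20 * np.exp(-((freqs - 60 - 3 * i) ** 2) / 800.0) for i in range(24)]

verts = []
for psd in psd_list:
    # Each polygon is the PSD curve closed down to the baseline.
    verts.append([(freqs[0], 0.0)] + list(zip(freqs, psd)) + [(freqs[-1], 0.0)])

colors = plt.cm.jet(np.linspace(0, 1, len(verts)))    # one color per click
poly = PolyCollection(verts, facecolors=colors, alpha=0.7)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.add_collection3d(poly, zs=np.arange(len(verts)), zdir='y')  # click index along Y
ax.set_xlim(freqs[0], freqs[-1])
ax.set_ylim(0, len(verts))    # ax.set_ylim(len(verts), 0) would put early clicks at the back
ax.set_zlim(0, 25)
ax.set_xlabel('Frequency (kHz)')
ax.set_ylabel('Click number')
ax.set_zlabel('dB')
plt.show()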
About the particulars of this click train… the X axis is in kilohertz (kHz). There are 24 clicks in the click train. It is apparent that the click train shows variation in the spectral content and amplitude of clicks, with a ramp-up to high amplitude and high peak frequency, followed by diminishing amplitude toward the end of the click train. For the highest-amplitude clicks, one may notice that there is some energy at the very highest frequency bins. There was anti-aliasing applied in the recording setup, but it evidently was not entirely adequate to the task. The B&K amplifier used has built-in attenuation of -3dB at 200 kHz, IIRC. The B&K hydrophone, an 8103, has roll-off at frequencies that high. So, if anything, the magnitude of energy in the highest frequency bins shown here is underestimated. That the high-frequency energy is correlated with the high peak frequency, high amplitude clicks is an indication that this isn’t a general issue with background noise; this is part and parcel of the dolphin biosonar click output. There’s some research that Diane did with the UT ARL group on such high frequency components in dolphin biosonar that I’d like to revisit sometime soon.
Update: A handy page over at StackOverflow put me on course to flip my Y-axis numbers. I’ve also fixed up assigning colors the way that I want them, so now the result is looking much better to me.
The colors correspond to a classification based on spectral features (all derived from the FFT) first proposed by Houser, Helweg, and Moore in the late 1990s. I don’t process my transform in exactly the same way that they processed theirs, so the resulting classification is not necessarily identical to what they would have found if they processed the same click train. An extended discussion on that should be put off to another post.
Update 2: That was all too optimistic. There is a bug in “matplotlib”. Actually, if you look closely at the figure just above, the red polygon toward the back is plotted over a blue polygon, and it should not be. Depending on the view angle chosen, “matplotlib” gets the render order of polygons wrong. I was able to reproduce this error directly in the example code provided on the “matplotlib” website. Here’s the problem demonstrated:
I’m posting it here especially so that the “matplotlib” people can have a look. For my data and just 24 polygons, I can find angles where about a third of the polygons are rendered out of order. For other angles, everything renders properly. If you happen to like one of the correct-rendering angles, you can use the output. If the angle you want happens to be in the other range of incorrect-rendering, context does not seem to matter; no matter which direction you come to that view, it still renders incorrectly.
I’m working on setting up a citizen scientist project to document where snapping shrimp (family Alpheidae) are active pre- and post-contamination by the oil spill in the Gulf of Mexico. In this post, I just want to introduce the basic concepts and provide an example sound file.
Snapping shrimp comprise a number of species, mostly distributed in tropical to temperate waters. They live in near-shore structured environments, including seagrasses, rocks, and coral reefs. They are predators on small, live prey, and they kill or stun their prey using a snap from a disproportionately large claw. The snap of the claw generates a cavitation event and, by the way, a high-amplitude, broadband transient sound that is also called a snap. The combined noise from the local population of snapping shrimp is a familiar feature not only to bioacoustics researchers, but to anyone who snorkels or SCUBA dives in areas with snapping shrimp.
Because of this noise and the role snapping shrimp play in the marine food web, they are an excellent candidate as an “indicator species”, a species that can be easily monitored and which provides a measure of the health of that part of the marine food web. Better yet, the monitoring and assessment can be done acoustically, by sound recording, to get a measure for a local population.
If I had a chunk of money to throw at this, a sophisticated way to do this would be to make a baseline of calibrated sound recordings and be able to characterize tidal and daily cycle effects on snapping shrimp sound activity, and thus be able to statistically determine a reduction in activity post-contamination. I estimate somewhere around $10K would be needed to set up a portable data collection system from scratch with that kind of capability. Not having that in spare change in my pocket, I’m looking at a somewhat different approach that a lot more people can get into with minimal outlay of funds and just a bit of do-it-yourself drive.
Because snapping shrimp noise is broadband, you can hear it even in plain audio recordings, though the peak frequencies are actually ultrasonic. This means any sort of audio recorder can be used to find out if snapping shrimp are present in a location: cassette tape recorder, digital recorders, and even video cameras. The thing that any of those will need is a microphone input. What to plug in for that recording? A hydrophone would be great, but most people don’t have those lying around. But one can also make a normal microphone water-resistant and use it. It is best to think of such a microphone as disposable, since better sensitivity also corresponds to the water-resistance being more fragile, and saltwater is great at destroying electronics. In another post, I’ll describe making your own hydrophone or water-resistant microphone. If you already have a recorder, the additional cost is under $50 to be able to record underwater sound. I’m not looking for this sort of recording to do as much, simply to say whether a snapping shrimp population is active or not.
Below is an example of a simple recording I made last night that demonstrates the presence of an active population of snapping shrimp at one location and time. I’m still working on what additional information should be noted along with the recording, but I think what I provide here may be sufficient.
Recorder: Olympus WS-320M, ST HQ mode, CONF mic sensitivity
Transducer: Salvaged hydrophone from a sonobuoy
Transducer depth: Approximately 2 feet
Recording made by: Wesley R. Elsberry
Time: 18:51 EDT
Location description: South Sunshine Skyway Bridge on road to south fishing pier, at overpass over water, north side, toward east end.
I’ll be posting more on this topic later.
This came across MARMAM just now.
Subject: [MARMAM] Six Beaked Whales stranded in Azores (URGENT)
From: “marc fernandez”
Date: Sat, June 27, 2009 5:47 am
Dear Colleagues,
I want to report an unusual situation occurred during the last week and a half in São Miguel island, Azores, and ask for help in order to get some clear conclusions. During the last two weeks a total of *6 beaked whales stranded* on this small island, a really unusual fact. Of these 6, *two were dead and 4 stranded alive* and returned to the open sea. From the first two animals (the dead ones) we only can get one identification and it was a Cuvier's Beaked Whale, probably an immature male. The other four animals stranded on a beach and they were returned to the sea immediately by the lifeguards and the coastal guard, for these reason we don't have a lot of information, but from the pictures they send us probably were Sowerby's Beaked Whales, we only know that they stranded alive and probably they were immature animals also, due to the body length (about 3.5 meters). We don't have any notice about military activities in the area, but is really difficult to get this kind of information, for this reason I want to ask you for help to find if there is any military or seismic prospection on the area that could affect these animals.
Thanks for your help. All the best,
Marc Fernandez Morron
Universidade dos Açores
Marc Morron is asking about military exercises because there is a known correlation between use of mid-frequency military sonar and injury to beaked whales. If anyone has any information, please leave a comment.
I saw the NCIS episode “One Shot, One Kill” and noticed a blooper in the show. Maybe that’s not that cool, but this particular blooper requires knowing something about acoustic localization. This is the technology that is being used to let marine mammal researchers place the position of whales who are vocalizing and also lets police departments know about where gunshots have been fired. The basic idea is that one places a bunch of sound transducers in known positions, and one can — with the aid of a bunch of math and computer power — estimate where a sound originated.
In the case of marine mammal researchers, Whitlow Au and the research group at the University of Hawaii have had a four-hydrophone array, where the hydrophones are arranged in a tetrahedral shape, and the whole thing is a bit over a meter across, IIRC. With a sufficiently fast simultaneous-sampling data recorder, they can get a reasonably good bearing and range estimate on a whale or dolphin.
For the police, there have been installations of microphones in several cities. First, a microphone detects a gunshot. Second, another program delivers a localization estimate. Some of these are claimed to be accurate to about 80 feet. Given the reverberant qualities of sound propagation in the urban environment that’s either a testament to amazing skill on the part of the engineers, or amazing BS on the part of the marketers.
So if gunshots can be acoustically localized, what was the problem with NCIS showing use of the technology? It came in the form of having the goth forensics guru character, Abby Sciuto (played by Pauley Perrette), showing a graphic on a computer monitor supposedly giving the result of the localization for a gunshot. The graphic showed a linear array of three microphones and three straight lines running through the estimated shooter’s position and each of the microphones. Nice, simple to grasp, and wrong. First, acoustic localization will give you a half of a hyperboloid as a solution for a time-of-arrival difference between any pair of sound transducers. The estimated location is going to be at places where multiple hyperboloids intersect. Even if one simplifies things to being more-or-less restricted to a 2D solution, there isn’t much call for showing a straight line when graphing an acoustic localization. Second, anyone worth a flip doing acoustic localization for a known sniper situation isn’t going to deploy just three microphones relatively close to each other, and certainly not with them in a linear array. The best situation to have for acoustic localization is to have the sound source within one’s array of microphones. If you have to deploy a small number of microphones, and can’t get a long baseline, staggering them so there is not a straight line through the positions is going to help. With a symmetrical situation like the line of three microphones, one has poorer localization the closer a source is to being on that line. (On the line, there is no localization of a source outside the microphones; time of arrival will tell you on which side of the array the source is, but whether it is eight feet or 800 yards from the outside microphone isn’t going to be determined by time of arrival differences.) Using a triangle when in a 2D situation or a tetrahedron for 3D is going to work out better.
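To make the hyperbola business concrete, here is a toy sketch of time-difference-of-arrival localization done by brute-force grid search. The microphone positions, source location, and speed of sound are made-up numbers, not anything from the show or a real deployment:

# Toy 2D TDOA localization: grid-search for the point whose predicted
# time-of-arrival differences best match the measured ones.
import numpy as np

c = 343.0                                   # speed of sound in air, m/s
mics = np.array([[0.0, 0.0], [300.0, 50.0], [600.0, 0.0]])   # staggered, not quite collinear
source = np.array([250.0, 800.0])           # the "unknown" we pretend to recover

dists = np.linalg.norm(mics - source, axis=1)
tdoas = (dists - dists[0]) / c              # measured arrival-time differences vs. mic 0

xs, ys = np.meshgrid(np.linspace(-200, 800, 401), np.linspace(-200, 1200, 561))
best, best_err = None, np.inf
for x, y in zip(xs.ravel(), ys.ravel()):
    d = np.linalg.norm(mics - np.array([x, y]), axis=1)
    pred = (d - d[0]) / c
    err = np.sum((pred - tdoas) ** 2)       # each TDOA constrains one hyperbola branch
    if err < best_err:
        best, best_err = (x, y), err

print("estimated source position:", best)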
It makes for an interesting question of what the best placement would be if one were planning to do acoustic localization of a sniper and one only had three mics, but knew where the target was, and that the only distant approach came from one side of the building where the target was. Offhand, I’d put one mic on the line normal to the target’s building, either on the building or 50 to 100 yards to the front of the building. The other two would go out to either side about 300 yards and a total of about 1,200 yards from the building. That would help make it more likely that a sniper spot is within the triangle formed by the three mics.
SI also has a field course in Belize that one can sign up for:
Ecology, Behavior & Conservation of Manatees & Dolphins
A Unique Field Course in the Drowned Cayes, Belize
Host: Caryn Self-Sullivan
Type: Education – Class
Start Time: Saturday, May 30, 2009 at 12:00am
End Time: Friday, June 12, 2009 at 5:00pm
Location: Spanish Bay Conservation & Research Center
Street: Drowned Cayes
City/Town: Belize, Belize
Want to be a Marine Mammal Biologist? Want to be a Behavioral Ecologist?
Here’s your chance to join our research team for two intense weeks of total immersion into the world of Animal Behavior, Antillean manatees, and bottlenose dolphins in Belize!
REGISTER EARLY! SAVE $100 WHEN YOU REGISTER BY MARCH 10th!
Become totally immersed into island living, behavioral ecology and marine biology through lectures and learning activities, literature review, debate, projects, and field research. This unique field course combines an overview of the ecology, behavior, and conservation of sirenians and cetaceans with hands-on manatee & dolphin research in the Drowned Cayes, Belize.
Get out of the classroom! You’ll spend 3-4 hours on the water each day learning about the environment as we explore a labyrinth of mangrove islands, seagrass beds, and coral patches searching for elusive manatees and charismatic dolphins. You’ll collect behavioral and environmental data and learn about photo-id techniques; you’ll develop a Fact Sheet or Activity Booklet about a related topic to be published by the Hugh Parkey Foundation for Marine Awareness & Education and/or Sirenian International. Extra-curricular activities include diving or snorkeling at Turneffe Atoll, and exploring an ancient Maya City.
That just sounds cool.
Dr. Self-Sullivan was one of my fellow grad students back when Diane and I were at Texas A&M University. She’s terrific and has years of experience with the marine mammal populations in Belize, so if you have the time and inclination, I’d suggest signing up pronto.
Nature (7228, p. 361) took notice of an article in the Biological Journal of the Linnean Society (96, 82-102, 2009) that demonstrates the use of biosonar in several species of parasitic wasps. These wasps seek out beetle larvae in trees, using hammer-like ends of the antennae to produce sound.
So far as I know, this is the first group of invertebrate species to be shown to use biosonar.
There’s a study in the Journal of Comparative Psychology that is somewhat up my alley.
Quick, Nicola J. and Vincent M. Janik. 2008. Whistle Rates of Wild Bottlenose Dolphins (Tursiops truncatus): Influences of Group Size and Behavior. J. Comp. Psych. 122(3):305-311.
OK, so what are the authors reporting as results? A lot of that is in the abstract:
In large social groups acoustic communication signals are prone to signal masking by conspecific sounds. Bottlenose dolphins (Tursiops truncatus) use highly distinctive signature whistles that counter masking effects. However, they can be found in very large groups where masking by conspecific sounds may become unavoidable. In this study we used passive acoustic localization to investigate how whistle rates of wild bottlenose dolphins change in relation to group size and behavioral context. We found that individual whistle rates decreased when group sizes got larger. Dolphins displayed higher whistle rates in contexts when group members were more dispersed as in socializing and in nonpolarized movement than during coordinated surface travel. Using acoustic localization showed that many whistles were produced by groups nearby and not by our focal group. Thus, previous studies based on single hydrophone recordings may have been overestimating whistle rates. Our results show that although bottlenose dolphins whistle more in social situations they also decrease vocal output in large groups where the potential for signal masking by other dolphin whistles increases.
I was right with them up to the final claim, the one about dolphins decreasing “vocal output” in large groups. Why am I not convinced that their paper delivers on what the abstract promises? Let’s have a look at the methods of the paper.
The distributed array consisted of three HTI–94–SSQ hydrophones and one HTI–96–MIN hydrophone (High Tech, Inc., Gulfport, MS) all with a frequency response of 2 Hz to 30 kHz +/- 1 dB, attached to tensioned 2m pieces of chain with waterproof tape. The four elements were then distributed around the boat in a box array to allow passive acoustic localization. Hydrophones were positioned at 2m depth and were placed between 160cm and 280cm apart. Recordings were made onto a Fostex D824 multitrack digital recorder (Fostex, Tokyo, Japan) during 2003 and an Alesis adat HD24 multitrack digital recorder (Alesis, Cumberland, RI) during 2004 (sampling frequency 48 kHz, 24 bit for the Fostex, 32 bit for the Alesis). Spoken tracks of the two observers, one detailing the surface behavior of the animals in the focal group and one the positions and behavior of nonfocal groups were also recorded on the multitrack recorder.
Anybody spot the trouble yet?
Sure you did.
First off, one simply isn’t going to get higher frequency response out of a system than that of the least capable component. Starting with the hydrophones, 30 kHz is near the top frequency one might be getting. There is the issue of roll-off, but generally there is a pretty steep roll-off at the high end of a hydrophone frequency response curve. I didn’t find an accessible calibration curve for the High Tech hydrophones to find exactly what the roll-off would be. But even that is going to be truncated sharply by the recording gear. The Fostex D824 recorder is said in the methods to have a “sampling frequency [of] 48 kHz”. Taking that to mean a sampling rate of 48 kilosamples per second per channel, which is what the Fostex supports across the six channels stated in the methods, the Nyquist frequency is 24 kHz, and that is the highest frequency the recorder might manage to represent. So just from that, we know that no frequency data over 24 kHz was part of the set analyzed in this study. (We won’t go into the lack of specification of anti-alias filters in the equipment, as it isn’t really relevant to my critique, but any serious acoustic analysis would need to take aliasing into account.)
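Just to spell the arithmetic out, here it is in a few lines of Python. The 48 kHz and 30 kHz figures are the ones quoted from the methods and the hydrophone spec; everything else is generic.

```python
# Nyquist: the highest frequency a sampled recording can represent is half
# the sampling rate, regardless of what the transducer in front of it can do.
fs = 48_000              # samples per second per channel (from the methods)
hydrophone_max = 30_000  # Hz, top of the quoted hydrophone response

nyquist = fs / 2.0
system_max = min(nyquist, hydrophone_max)
print(f"Nyquist frequency: {nyquist / 1000:.0f} kHz")          # 24 kHz
print(f"Effective system ceiling: {system_max / 1000:.0f} kHz")  # 24 kHz
# Any acoustic energy above the Nyquist frequency that reaches the A/D without
# an anti-alias filter doesn't just vanish; it folds back (aliases) into the
# band below 24 kHz and contaminates the data.
```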
Here’s something from the discussion…
In our study, whistle rates during socializing in groups of 6 to 10 animals was 0.53 whistles per minute per dolphin, for nonpolarized movement it was 0.27. Dolphins in Sarasota only whistled half as often in similar group sizes during these behavior patterns (Jones & Sayigh, 2002). In Wilmington, whistle rates during milling were the same as in Sarasota, but for socializing they were [...] Increased whistle rates during socializing may be due to animals communicating information to social associates or using calls to maintain contact. According to our definition animals were socializing when they were within very close proximity, often rubbing body parts and touching (see definition in Table 1). Rates may be dependent on social bonds between the individuals present or may be a consequence of increased arousal due to contact with individuals and not be dependent on social relationships. Cook et al. (2004) showed higher signature whistle rates during socializing and suggest that this may function to maintain contact as other group members get more dispersed while individuals are engaged in socializing.
OK, this gets to the stuff that I just can’t handle. Remember that 24 kHz maximum possible frequency in the data set? (Due to practical considerations, it could be even lower.) Bottlenose dolphins have an upper hearing response over eight times as high as that. Bottlenose dolphins have peak frequencies in clicks over six times as high as that. The sweeping speculations about communication inherent in the abstract and the quote just above are made in complete ignorance of over 7/8ths of the acoustic sensitivity of the subject species, and over 5/6ths of the peak frequencies within the vocal repertoire of the species. We know that click-based sounds are used by dolphins in communication; the only experimental work on obligate acoustic communication between individual dolphins to perform a task revealed that the signals used by the subjects were click-based and not whistles. We know that recording lower frequencies does not necessarily secure any vestige of click-based vocalizations. There is absolutely no consideration in the study here given to any of those issues or the fact that click-based sounds could quite plausibly be used for some of the functions being discussed. In fact, the paper does not even contain the word “click” or the word “pulse”. As a result, the claim that dolphins “decrease vocal output” within this study is something that the authors cannot possibly support; they have no clue whatever what dolphin vocal output over 24 kHz might be. They can, at best, report that dolphins decrease whistle rates with increasing group size, but they need to leave the “vocal output” claim out of it, since there is so much more to bottlenose dolphin vocal output(*) than the puny amount of bandwidth that they actually measured.
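For concreteness, here are those fractions worked out from the multiples stated above. These use the multipliers in the text, not any particular published audiogram.

```python
recorded_ceiling = 24.0  # kHz, the Nyquist ceiling of the recordings

# Using the multiples given in the text (not a specific audiogram):
hearing_ceiling = 8 * recorded_ceiling   # "over eight times as high"
click_peak = 6 * recorded_ceiling        # "over six times as high"

frac_hearing_unmeasured = (hearing_ceiling - recorded_ceiling) / hearing_ceiling
frac_click_unmeasured = (click_peak - recorded_ceiling) / click_peak
print(frac_hearing_unmeasured)  # 0.875  -> 7/8 of the hearing range lies above the recordings
print(frac_click_unmeasured)    # ~0.833 -> 5/6 of the click peak frequency lies above them
```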
Whistles do seem to have some importance in dolphin behavior and sociality. But just because whistles have traditionally been relatively simple to acquire, by comparison to high frequency, broad bandwidth, narrow beam clicks, does not mean that clicks in general or those higher frequencies can be ignored with impunity. Studies like the present one should frame their speculations in a way that takes cognizance of this technological gap between what we humans can readily record and what the dolphins are actually using.
(*) What to call dolphin sound emissions is a semantic issue. Personally, I don’t see the problem with “vocal” that some others do; the roots of the word were non-technical in the Latin, and one need not treat “vocal” as pertaining only to laryngeal sound production. Some offer “phonation” as an alternative, which so far as I can see simply grabs another non-technical Latin root that carries as many issues and as much baggage from prior usage as the first. In my dissertation, I went so far as to sidestep this terminological morass entirely by inventing a neologism for emitted sounds, calling them ensonds. For the purposes of this post, I’m simply accepting “vocal output” as sufficiently clear to move along with.
Today, Nelson Alonso turned up on AtBC and turned in an amazing performance that has to be seen to be believed. Alonso is a long-time second-tier “intelligent design” creationism cheerleader; I’ve had experience in online discussions with him since the late 1990s. Some of his braggadocio touched upon having a long history of online discussion. I had a look back at the archives of the Calvin “evolution” email list, where I had some exchanges with Nelson. And I found one such discussion that had an end-point. It even has to do with “irreducible complexity”. I pointed out that the mammalian middle ear ossicular chain is an IC system providing an impedance-matching function, and that the impedance-matching goes away if you remove any of the parts. Nelson tried to deny that this qualified as IC, at least in part because the fossil record is clear that the system evolved. I’ll quote this last part of the exchange.
Nelson Alonso wrote:
I’m going to put it in one block here before moving on to
responding to Nelson’s post.
MI>People have given examples: The Krebs cycle and the human
MI>inner ear are IC systems (as defined by Behe and asserted by
MI>me) for which means of gradual evolution have been given.
It’s the impedance-matching function of the mammalian *middle*
ear that is proffered as an example. I saw someone today
saying that it is unnecessary to mammalian hearing. This
ignores the fact that every piece is absolutely necessary to
the impedance-matching function. That function goes away
(with about a 30 dB re 1 microbar decrease in sensitivity, or
about 1 / (2^10) the original sensitivity) if any of the parts
are removed. The human blood clotting system, one of Behe’s
examples of IC systems, is not *necessary* to circulation in
much the same way.
WRE>”It’s the impedance-matching function of the mammalian
WRE>*middle* ear that is proffered as an example. I saw
WRE>someone today saying that it is unnecessary to mammalian
WRE>hearing. This ignores the fact that every piece is
WRE>absolutely necessary to the impedance-matching function.
NA>This isn’t true, as I have stated above, one can remove the
NA>entire 3-bone system and I would still hear when pressure
NA>waves hit the oval window.
It is true. The impedance-matching function is lost if any of
the components is removed. As I develop below, there is a
characteristic and significant loss of sensitivity due to the
loss of the impedance-matching function.
My point was not that impedance-matching in the middle ear is
*necessary* to any amount of hearing, but rather that trying
to dismiss the impedance-matching function on the basis that
hearing itself is not completely eliminated is a digression.
One can simulate the loss of sensitivity involved in a gross
manner by donning a good pair of hearing protectors. Trying
to argue that the difference in sensitivity is not a
functional difference seems ludicrous to me.
I suggest that Nelson pick up any good basic text on
audiometry, which will explain about impedance mismatches
going from pressure changes in air to movement of the oval
window.
WRE>That [impedance-matching] function goes away (with about a
WRE>30 dB re 1 microbar decrease in sensitivity, or about
WRE>1 / (2^10) the original sensitivity) if any of the parts
WRE>are removed.
NA>Mere observation can tell us this is false, the one-bone
NA>system of reptiles make them hear quite well.
No, actual experimentation has shown this characteristic loss
of sensitivity in terrestrial mammals to be the case. The
topic of discussion is the function of impedance-matching in
the mammalian middle ear. Normal hearing in another taxon is
not responsive to the point. But Nelson’s digression to
reptilian systems does him no favors. When the middle ear of
lizards is removed, their hearing likewise decreases by 35 to
57 dB in sensitivity, showing the importance of
impedance-matching to acute hearing even outside mammalian
lineages.
Also, Nelson’s digression shoots him in the foot on another
point, which is that such systems help establish the utility
of simpler systems in accomplishing the same function, which
is a point in favor of evolutionary development of the IC
impedance-matching function of the terrestrial mammalian
middle ear.
I’m a co-author on research that looked at hearing sensitivity
in white whales. Part of that paper discusses the loss of
impedance-matching reported by others in terrestrial mammals
placed in hyperbaric chambers. (You don’t have to use surgery
to reduce the efficacy of the middle ear’s
impedance-matching function.)
Sam Ridgway, Donald Carder, Rob Smith, Tricia Kamolnick, and
Wesley Elsberry. 1997. First audiogram for marine mammals in
the open ocean and at depth: Hearing and whistling by two
white whales down to 30 atmospheres. The Journal of the
Acoustical Society of America Volume 101, Issue 5, p. 3136.
WRE>The human blood clotting system, one of Behe’s examples of
WRE>IC systems, is not *necessary* to circulation in much the
WRE>same way.
NA>Why can’t any one anti-IDist be specific?
What, specifically, does Nelson think is vague about the
statement above? Human circulation occurs even if there is a
problem with the human blood clotting system. Terrestrial
mammalian hearing occurs, at reduced sensitivity, if the
impedance-matching function of the middle ear is compromised.
Trying to dismiss the impedance-matching function of the
mammalian middle ear on the grounds that hearing is not
entirely lost if it is interrupted should likewise cause ID
proponents to reject the example of the human blood clotting
system, which if interrupted does not mean that all
circulation ceases.
Here’s some of what I’ve written on the topic before.
By irreducibly complex I mean a single system composed
of several well-matched, interacting parts that contribute to
the basic function, wherein the removal of any one of the
parts causes the system to effectively cease functioning.
[End Quote - MJ Behe, Darwin's Black Box, p.39]
The mammalian middle ear has on one side the tympanum, which
demarcates between middle and outer ear, and on the other the
oval window of the cochlea. In between the two are three
small bones, the malleus, incus, and stapes. These small
bones are articulated in series. What the system of tympanum,
malleus, incus, stapes, and oval window accomplish as a
function is the conversion of high-volume, low pressure
movements of sound in air at the tympanum into low-volume,
high-pressure movements of the oval window and thus the fluid
contents of the cochlea. In tech terms, the system is an
impedance-matching device.
If any component of the system is removed, the
impedance-matching properties of the system go away, and
hearing thresholds are raised by about 30 dB. With this
system in place, though, hearing can be quite sensitive.
This system appears to make a good match for Behe’s definition
of irreducible complexity. One might wonder why Behe doesn’t
use this instead of mousetraps. Well, one reason is that
there is a fossil record showing forms intermediate between
the reptilian ancestral condition and the mammalian anatomy,
and irreducible complexity doesn’t look so spiffy a concept if
one has to say that IC excludes evolutionary explanation,
except for this case that has been documented as having an
evolutionary history.
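As a side note on the numbers in that old exchange, the rough equivalence between a 30 dB loss and a factor of about 1/(2^10) in intensity is just the standard decibel conversion:

```python
loss_db = 30.0
intensity_ratio = 10 ** (loss_db / 10)  # 30 dB corresponds to a factor of 1000 in intensity
print(intensity_ratio, 2 ** 10)         # 1000.0 vs 1024, hence "about 1/(2^10)"
```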
Chris Clarke and the Ornithology Lab at Cornell University have an application for bioacoustics: right whale detection in shipping lanes. Right whales often make contact calls, called “up-calls”, and a series of ten deployed buoys with hydrophones and communications gear can pick up these calls from right whales within five miles of a listening buoy. Onboard processing does a first pass at picking out a “top ten” list of possible right whale calls, and those are uploaded to the Cornell Ornithology Lab for further processing. The system is computer-assisted rather than computer-automated, meaning that the computer processing narrows the things that would require a human decision, but it relies upon humans to make a final determination of whether a right whale call was present. If that is the case, the buoy is marked as having one or more right whales in the vicinity and is tagged as having an “alert” status. That status is reflected on a website that ship captains can check so that, hopefully, they will reduce speed while traversing areas where right whales have been detected. Right whales move slowly, travel near the surface, and ship strikes remain a major source of mortality for the species. By highlighting where right whales are, Clarke hopes that responsible captains will take steps to reduce ship speed and post lookouts.
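I don’t know the specifics of the detector running on the buoys, but the “pick the top ten candidates, let a human make the call” workflow is easy to illustrate. Here is a minimal, hypothetical sketch of that kind of computer-assisted triage: score windows of audio by how much of their energy falls in the low-frequency band where up-calls live (roughly 50 to 250 Hz), and hand the top-scoring windows to a person. The parameters and the scoring rule are made up for illustration, not taken from the Cornell system.

```python
import numpy as np
from scipy import signal

def top_candidates(audio, fs, n_top=10, band=(50.0, 250.0), win_s=2.0):
    """Crude triage: score each window by energy in the up-call band relative
    to broadband energy, and return the n_top highest-scoring start times.
    This stands in for whatever detector actually runs on the buoys."""
    lo, hi = band
    sos = signal.butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    band_limited = signal.sosfilt(sos, audio)

    win = int(win_s * fs)
    scores = []
    for start in range(0, len(audio) - win, win // 2):  # 50% overlap
        seg_band = band_limited[start:start + win]
        seg_all = audio[start:start + win]
        score = np.sum(seg_band ** 2) / (np.sum(seg_all ** 2) + 1e-12)
        scores.append((score, start / fs))

    scores.sort(reverse=True)
    return scores[:n_top]  # (score, start time in seconds) for human review

# Usage: a minute of synthetic noise at 2 kHz sampling, then pick candidates.
fs = 2000
audio = np.random.randn(60 * fs)
for score, t in top_candidates(audio, fs):
    print(f"t = {t:6.1f} s  band-energy ratio = {score:.3f}")
```

A real detector would do much better by matching the shape of the upsweep rather than just band energy, but the division of labor is the same: the computer proposes, the human decides.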
M-Audio has updated its handheld Microtrack solid-state recorder to the Microtrack II. The specs are attractive. It records in stereo to Compact Flash or Microdrive cards, in either WAV or MP3 format. It can record at up to 96 kilosamples/second at 24 bits per sample. It has both 1/4″ and 1/8″ microphone/line inputs, and can provide 48V phantom power to microphones. Interfacing this unit to hydrophones should be a piece of cake. NCSE has one of the M-Audio Microtrack recorders for making high-quality podcasts or audio documents.
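A quick back-of-the-envelope on uncompressed WAV storage at that top spec (stereo, 96 kilosamples/second, 24 bits per sample; this is just the arithmetic, nothing specific to the recorder):

```python
fs = 96_000   # samples per second
bits = 24     # bits per sample
channels = 2  # stereo

bytes_per_second = fs * (bits // 8) * channels   # 576,000 bytes/s
mb_per_minute = bytes_per_second * 60 / 1e6      # ~34.6 MB per minute
gb_per_hour = bytes_per_second * 3600 / 1e9      # ~2.07 GB per hour
print(f"{mb_per_minute:.1f} MB/min, {gb_per_hour:.2f} GB/hr")
```

At lower sampling rates or bit depths the numbers scale down linearly, which matters when sizing a card for a long deployment.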
Back in 2005, Diane and I had to come up with a programmable field recorder on three weeks’ notice. We went with a PDA-based solution using Core Audio’s Compact Flash format audio input card, coupled with Core Audio’s microphone pre-amp/digitizer system. If you need programmability, as for making unattended scheduled acoustic samples, that’s still a good solution. On the other hand, for interactive recording, the M-Audio Microtrack II offers the convenience of a smaller, discrete package, plus you only have to worry about one power supply. The M-Audio unit is about the same size as a standard PDA, though a bit thicker. Core Audio does sell the Microtrack II, and sees it as aimed at a different market segment than their PDAudio system.
As with any pro-quality system, though, it is pricey. The MSRP on the Microtrack II is about $500. Sweetwater is advertising them at about $300. As such, it is out of range of our budget at the moment. That’s not a whole lot more than one might pay for a top-end media player these days, though, and I can always hope for a price drop.