Category Archive: Computation
Computation & Science Austringer on 05 Feb 2010
Refreshing Data Storage
I have data on Compact Disks (CDs) from past projects. The technology was getting toward being affordable around 1996. CD writers dropped under $100 for the first time somewhere around there, and media started selling for less than $5 a disk. The amount of storage space on a CD was comparable to the size of hard disks available at the time, and optical storage seemed far better than tape as a medium. So now I have cases, drawers, and spindles of CDs dating right back to 1996.
No storage medium is perfect, so archived data is a commitment and not just a static collection. Last month, Sam asked me what I would like for my birthday. I said I wanted a disk for backing up data. After having a look at off-the-shelf external hard drives, it seemed that all the models I looked at had warranties of 1 year or shorter. However, if you buy an internal hard disk and a separate USB enclosure, the warranty on the drive can be much, much longer. Sam and I visited the Newegg site and picked out a Western Digital 1.5 terabyte drive and a Rosewill USB enclosure. The drive comes with a 5-year warranty. I can pair this with another 1.5 terabyte disk so that I can copy off my data from the CDs, then copy to the second hard disk.
Back when I was about to move from California to Michigan, I had a chat with a fellow who works for the Internet Archive. That is a project whose modest aim is to store the World Wide Web. All of it. You can browse sites as they were in 1995. Well, with a few caveats. My acquaintance said that the Internet Archive’s data storage was based on consumer-grade IDE drives. You can get them cheap and in quantity, and if you store things on multiple disks, the redundancy will help. That’s because disks fail. With an organization like the Internet Archive, they rack up lots of failures. They have to be swapping out bad drives and attempting to restore content from remaining copies on other drives. And they couldn’t, he said, quite keep up with the failures. Some data does get lost because failures occur before the redundancy can be exploited to restore some sites.
I figure for my purposes, the data I have is a copy of what my colleagues have, and for the hard disk copy, I aim to have two of those. I think that should be sufficiently paranoid. The process or workflow takes about six to seven minutes per CD to create a directory, copy the files, and mark the CD as copied. I’m working on the third page out of 32 pages in a CD case now. This will take some effort, but then I invested years of my life getting that data in the first place.
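As a rough illustration, the three steps per CD (create a directory, copy the files, mark the CD as copied) might be scripted like this in Python. This is a sketch and not the procedure actually used for the archive: the function name `archive_disc`, the per-disc `label`, and the `copied.log` file are all hypothetical choices made for the example.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_disc(source: Path, archive_root: Path, label: str) -> Path:
    """Copy one disc's contents into its own directory and log it as copied.

    `source` is wherever the CD is mounted; `label` is a hypothetical
    per-disc name such as the text written on the disc.
    """
    target = archive_root / label
    # Steps 1 and 2: copytree creates the target directory and copies the
    # files. It raises FileExistsError if the label was already used, which
    # guards against copying two discs into the same directory.
    shutil.copytree(source, target)
    # Step 3: mark the disc as copied by appending a line to a log file
    # (an assumed stand-in for whatever mark goes on the physical disc).
    log = archive_root / "copied.log"
    with log.open("a", encoding="utf-8") as fh:
        fh.write(f"{date.today().isoformat()}\t{label}\n")
    return target
```

Running the same function against the second hard disk as `archive_root` would give the second copy.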
Computation Austringer on 27 Jan 2010
Students and the Apple iPad
Apple announced its iPad tablet computer today. The device seems to be mostly a large-screen iPod Touch. The intriguing aspects of the iPad, at least to me, were that Apple says that for the 3G versions ($130 extra over the WiFi-only versions) these devices will be unlocked, and that Apple has arrangements with textbook publishers for EPUB content. It seems that Apple was able to wring some few concessions from AT&T concerning the unlocking and the two tiers of data plans. While the data plan costs are not cheap, they manage not to be exorbitant.
I saw that some other commentators were perplexed about the time taken in the announcement to show Apple’s iWork applications as they are ported to the iPad. I think, though, that a major market for the iPad might just turn out to be among high school and college students. Consider the points made and that market:
- Light enough to carry around in the backpack (If a student can skip carrying even one textbook and carry an iPad instead, they will be lightening their load.)
- 10 hour battery life, good enough for the school day
- Low cost applications that will be good enough for note-taking and in-class analysis
- Capable of holding and displaying full textbook content in color plus supplemental multimedia
- Cost low enough that it is compatible with current budgets for textbooks
- WiFi for on-campus connectivity and research
The fact that it also does a bunch of multimedia service plus gaming will be seen as a plus, at least by the students if not their parents.
Antievolution & Computation & Law and Politics Austringer on 20 Aug 2009
Dembski and Marks Get One Past the Reviewers
William Dembski and Robert Marks finally managed to turn one of their joint manuscripts into a publication. The paper will appear in IEEE Systems, Man, and Cybernetics. There is a PDF of it available here. I’m in the midst of packing, so I just confirmed that Dembski and Marks carefully preserved the error I informed Dembski of almost 9 years ago and Marks almost 2 years ago.
I mentioned some time ago that I would write a response for publication, and I intend to do that. Right now, though, the trailers are partially loaded and there’s a fair bit more work and the trip to do yet.
One more thing… Dembski wants this paper to count in the pro-ID peer-reviewed category and show up in the DI list and whatnot.
P.S. Our critics will immediately say that this really isn’t a pro-ID article but that it’s about something else (I’ve seen this line now for over a decade once work on ID started encroaching into peer-review territory). Before you believe this, have a look at the article. In it we critique, for instance, Richard Dawkins METHINKS*IT*IS*LIKE*A*WEASEL (p. 1055). Question: When Dawkins introduced this example, was he arguing pro-Darwinism? Yes he was. In critiquing his example and arguing that information is not created by unguided evolutionary processes, we are indeed making an argument that supports ID.
The only way to understand the above is if one accepts the religious antievolution “two model” way of thinking. That goes like this: there are only two alternatives, evolution or {creation | design}. Therefore, evidence against evolution is evidence for {creation | design}. The “two model” argument got well-deserved thrashings in McLean v. Arkansas and Edwards v. Aguillard. It’s nice to see Dembski continuing to stick with just the classic, long-rebutted religious antievolution arguments.
Computation & General & Photography Austringer on 25 Jun 2009
Banner Change
I retired the banner that I put together in my hospital bed in 2004 and have set up a collection of new banners, one of which is picked at random for each page request. The original aspect ratio was just too long at 8.84:1, so I shifted it to 8.84:2.
GIMP provides a selection tool for a fixed aspect ratio, which was just what I needed. Rotate, crop, scale, apply levels, unsharp mask, and I can save off another banner image. I’ll try to add more to the mix as time goes by.
Computation Austringer on 08 Jun 2009
Video Workflow
Over the weekend, I vacuumed out my video editing desktop machine. It’s been a while since I used it, and it had collected its fair share of dust.
The machine is based on an Asus P4B motherboard and 1.8GHz Intel P4 CPU. This was state-of-the-art when I built the system in 2002. It was built around the requirements of the Pinnacle DV 500 DVD video card. This is a very picky piece of hardware. It only works with a limited subset of motherboards and only has drivers for Windows 98/NT/2000/XP. OK, given all those requirements, what does it offer? It does analog video signal digitization as well as Firewire for DV capture. And it will provide real-time preview for various supported video transitions when paired with Adobe Premiere.
When I was putting the system together, it seemed that hard disks over 40GB were particularly prone to failure. I had gone with a 20GB boot drive and a 40GB data drive originally. A couple of years later, the rough patch in hard disk QC seemed to have passed, and I replaced the 40GB with a 120GB drive. And there the system remained from about 2005 to this past weekend.
Given that Diane and I still are slowly working off a substantial load of debt from grad school and my health issues (see earliest posts here from 2004 for the grisly details), there is now no discretionary budget for computer gear. We get what is needed when it is needed. Our purchases since 2003 include a laptop bought as a replacement under insurance, a $120 laptop bought from surplus at Diane’s college to replace her failing Dell, miscellaneous hard disks for our files, and a desktop upgrade almost completely underwritten by a generous donation from one of my readers here. We are still using daily a desktop computer bought used around 2002.
So doing anything to the video machine had to fit two constraints: I still needed to be using WinXP for the OS, and it needed to cost nothing but a bit of my time. I located two hard disks that had been replaced in other systems by higher-capacity drives, a 60GB and a 200GB drive. I used a partition cloning tool to copy over the 20GB boot disk to the 60GB. This would let me install various new software packages, including the Microsoft C# Visual Studio Express development environment, which uses 1.1GB all on its own. I cleared off the 200GB with a new NTFS format to add it as a second data disk.
Why video and why now? My time at MSU is drawing to a close, so we will be moving shortly. I have a stack of video tapes in DV, Hi-8, and VHS formats. I’ve always intended to get these digitized, and between having little working space and not having a good workflow sorted out, it hasn’t happened. It seemed to me that if I got things squared away, I could be doing a basic video digitization and archiving sequence in parallel with other activities. After all, most of the time is going to be tied up in either playing a video source for capture, or rendering captured video to an archival format.
So given the extra space and some sprucing up of the installed system, I’m ready for moving video bits around. My goal is to have the video available for non-linear editing in nearly-pristine condition. Raw DV is too large to be efficient. After asking around a bit, I’ve started with the aim of rendering to a multiplexed MPEG-2 stream as a format that is easy on storage requirements, but loses little of whatever quality there was in a DV source.
This starts with capture, which for my system comes through the Pinnacle DV 500 DVD. I’m starting with the DV tapes, as this promises to be easiest all around. Pinnacle has their DVTools package that does a fine job of capturing from a DV source. I’ve done about four tapes so far with no dropped frames at all. It does, however, continue past the last actual image from the DV source if the tape isn’t completely full.
When the capture is done, I fire up Sony Vegas and put the capture file on the timeline. It doesn’t take long to snip out various unneeded bits, including the extra stuff at the end of the capture file. If it was all related to one event, I’m ready to render that. If the capture file includes multiple events, I save the Vegas project file for the complete thing and then begin rendering each sub-section separately.
Because these are intended for further editing sometime when I have free time, I’ll just be putting these out to DVD as data. I have ImgBurn installed to handle that. Off-hand, I know of at least three tapes in there for which I will also author a video DVD. Those will take a bit of extra effort, but I’m not doing that for most of these.
Of course, suggestions are welcome. Please do remember the constraints I have, so software package suggestions should be for open source or freeware packages. I do have a laptop that dual-boots Vista and Ubuntu, plus a desktop running FreeBSD 7.2, so video processing on those systems could be done if there’s a suitable benefit.
Computation Austringer on 24 May 2009
A Nice LaTeX Cheat Sheet
I ran across this cheat sheet while looking for an answer to setting line spacing of single-spaced within paragraphs and double-spaced between paragraphs in the front matter. While it doesn’t have the answer to that, it does look like a very handy reference for more commonly encountered situations in LaTeX.
If you are wondering what LaTeX is, it is a document-processing and typography system. It is to a word processor what a process camera is to a point-and-shoot consumer camera. It’s big, has a steep learning curve, but delivers results far beyond what can be done with consumer-grade word-processing applications, or at least makes it possible to do those tasks with far less hair-pulling.
Technically, LaTeX is a set of macros originally by Leslie Lamport built on the TeX typography system of Donald Knuth. Documents in LaTeX are actually programs, so the process of building a document in LaTeX is much like software development. While there are commercial versions of LaTeX systems, pretty much everyone I know uses free, open source versions like MiKTeX or TeX Live. There are a number of frontends that help users construct and typeset documents using LaTeX: TeXnicCenter for Windows, TeXShop for Mac, and Kile for the KDE GUI on Linux and FreeBSD.
Why use LaTeX and not either word processing programs like WordPerfect and Word or desktop publishing packages like Ventura and Quark? First, LaTeX has excellent mathematics typesetting capabilities. It is the sole format accepted by many journals that often deal with typesetting equations. If you are writing for such journals, there is no alternative. If you want to publish math-heavy text and not spend oodles of your time trying to figure out what went wrong in an “equation editor” for a consumer word processing program, you want LaTeX. Second, it incorporates a huge amount of typography experience. If you are concerned about making documents that are not just formatted well, but make it easy on the eyes of the reader, LaTeX provides that for you. It is flexible enough that if you think you know better, you can override just about anything, though most of the time that’s not really going to help your readers. Third, LaTeX automates just about everything that makes writing large documents a hard task. Let me explain that by example.
When Diane and I were writing our dissertations, we had a task of putting together several chapters of material where the final document had to conform to a long set of rules used by the Thesis Office at our university, both to assure consistency across dissertations and to allow microfilming archives to use the result. In particular, there are rules about the placement of figures and tables relative to where they are first referenced in the text. In word processors, you place your text and you place your figure or table, and there is no effective control over where the word processor finally decides to put the figure or table. I had several chapters of material, and I tried WordPerfect and Microsoft Word, without success. I tried Microsoft Publisher and Corel Ventura, but also ran into difficulties there. It was around that time that I started looking at LaTeX as an alternative. I found that dissertations in the electrical engineering department were often done in LaTeX and that there was a thesis class (a sort of configuration or environment document setting a style) for LaTeX that the EE department made available. This was like an existence proof; people had actually managed to get the thesis office to accept their manuscripts when done with the thesis class. I asked Jeff Shallit for recommendations on books, and he pointed me to the LaTeX book by Leslie Lamport and the LaTeX Companion book by Goossens, Mittelbach, and Samarin. What I found back in 2002 was that getting acquainted with LaTeX definitely took some effort, but it almost immediately was paying off. My figures and tables weren’t going hither and yon willy-nilly; they were pretty much where they needed to be, or could be tweaked to do so.
Then, some of the other benefits started becoming apparent. While word processors have some mechanisms for generating front matter (table of contents, lists of figures and tables), LaTeX could do this in a very systematic way that basically took the entire load off my back. The other bane of the dissertation writer is references. The Thesis Office wanted all references to be cited in a consistent style, to be formatted in a consistent style, to appear in order, and that every citation in the text would appear in the references, and no reference would appear that was not cited in the text. That last one puts a huge load on someone who is organizing a large set of references for themselves. Let’s say that your committee decides that you should remove some text including a citation that only appears in that text. You have to remember not only to remove the text, but to revamp your bibliography so that the now-uncited reference no longer appears.
LaTeX has a helper program, BibTeX, specifically for handling bibliographic data. Using BibTeX and the natbib style, I was able to address all the concerns of the Thesis Office while keeping things pretty simple for me. BibTeX allows you to set up one or more bibliographic source files containing all the references that you might want to use in your document. Within the document, citing a reference occurs using a “\cite” command. There are variants to allow for various in-text citation formats. The cite command is given a parameter that links to a particular reference in one of your BibTeX files. LaTeX sets up a file used by BibTeX to pull in just the references that are actually used, and BibTeX applies the desired style to produce the typesetting for the references section. The result is that the references section went from something needing a lot of continuing effort to maintain to needing almost no effort to maintain. That sort of assistance is invaluable when what one wants to be doing is writing content and not worrying incessantly about keeping all the effects of changes one makes to the layout in mind.
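The rule that only cited references end up in the references section can be illustrated with a toy in Python. To be clear, this is not how BibTeX is implemented and it does no formatting; it only shows the selection idea. The regular expression is an assumption that covers plain `\cite{...}` and natbib-style variants such as `\citep` and `\citet` with optional bracketed arguments, and the bibliography is modeled as a plain dictionary with made-up keys.

```python
import re

# Matches \cite, \citep, \citet, etc., with optional [..] arguments,
# and captures the comma-separated keys inside the braces.
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")


def cited_keys(tex_source: str) -> list[str]:
    """Return citation keys in order of first appearance, without repeats."""
    keys: list[str] = []
    for group in CITE_RE.findall(tex_source):
        for key in group.split(","):
            key = key.strip()
            if key and key not in keys:
                keys.append(key)
    return keys


def select_references(tex_source: str, bibliography: dict[str, str]) -> dict[str, str]:
    """Keep only the bibliography entries that the text actually cites."""
    keys = cited_keys(tex_source)
    missing = [k for k in keys if k not in bibliography]
    if missing:
        # A citation with no matching entry is an error to fix, not to hide.
        raise KeyError(f"cited but not in bibliography: {missing}")
    return {k: bibliography[k] for k in keys}
```

With a bibliography holding `key_a`, `key_b`, and `key_c`, a text that cites only `key_b` and `key_a` yields just those two entries, in citation order. Deleting the passage that contains the sole citation of `key_a` drops it from the output without any further bookkeeping, which is the property the Thesis Office rule demands.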
Something I didn’t use in writing my dissertation that LaTeX provides is indexing. If you want to produce a large manuscript with an index, this is something that you can do pretty easily in LaTeX. Basically, as you go along in the text, you place an index tag next to the text that you want the index entry to refer to. LaTeX will track the entries and the corresponding page numbers for you. If you re-organize your text, say by swapping chapters 2 and 3 around, you don’t have to re-do all those page number references in an index; LaTeX will handle it for you.
LaTeX provides several basic document classes for you, and you can find extensions online. The basic ones include “letter”, “article”, “book”, and “slides”. That last allows you to generate presentations in LaTeX. Then there are all sorts of styles that one can add on. For example, if you want to write screenplays using the standard formatting rules, the screenplay style can help you. (If, though, you are really intent on screenplay writing, you probably want to look at Celtx. [Addendum: Looking a bit more at the Celtx website, I found this: "TypeSet provides precise automatic formatting of your script to industry and international standards. The Celtx server uses the very powerful LaTex typesetting tool to deliver perfectly formatted scripts."])
There’s a system called LyX that places itself in between full LaTeX and the usual way one uses a word processor. LaTeX is used by LyX as a back-end, and you get a display of text that is a bit closer to the usual WYSIWYG experience, but cast by LyX as “what you see is what you mean”. Unfortunately, LyX documents are not simply standard LaTeX, which to me is a limitation of the system.
Since writing my dissertation, I have relied upon LaTeX for all my serious writing work, save where a collaborator has insisted upon something else. I use LaTeX for writing letters and it is the basis for the six or so pending article manuscripts I have. My curriculum vitae/resume is handled in LaTeX, and I have that set up such that I can generate documents of different lengths and detail, plus tuning the focus of my research statement, all by changing a couple of configuration settings. This means that I have one source text whether I want a CV or a resume, or whether I’m sending the result to someone interested in my biology background, my computer science background, or someone who appreciates my interdisciplinary approach. That also means that I can keep things up to date with changes to just one file, and not about a dozen different ones to handle the most common sets of configuration changes that I use.
If you don’t do equation type-setting, don’t need figures and tables to go where you want, don’t need front matter or bibliographies, and don’t need an index, you’ll probably be perfectly happy using the usual word processor. If you do need any of those things, then you owe it to yourself to check out LaTeX.
Computation Austringer on 09 May 2009
New Server
The email server I use was having some hardware issues. Marc picked up a new box and disk, and Jeff, who has somewhat more spare time at the moment than I do, suggested we go with Ubuntu Server 9.04 for the new install.
So we switched from a FreeBSD 6.3 box to Ubuntu Server today, and on it the new mail system was Postfix/MySQL/Courier. We spent a bit over four hours copying files and preparing the user accounts to use the new system before bringing the Ubuntu server online in place of the old FreeBSD one.
The rest of my day has been spent in fixing up other issues, like switching over the couple of WWW domains that were served from there and setting up email list software.
I’ve been using Majordomo for email lists since the 1990s. Unfortunately, that’s about the time of the last update for that software, too. So I am getting acquainted with Mailman instead.
Hopefully, most of those issues will be sorted before the end of the weekend.
On a somewhat more personal note, the way that I’ve done email since the 1990s has been disrupted. I’ve used the .forward file in my user account to pipe incoming email into a Perl script I wrote. It uses a whitelist file and a pattern file to sort incoming mail and append it to a file named for the day and with an extension according to the recognized class of email. Most of my email reading has been done using emacs from the command line of an ssh session. Now I’m dealing with using SquirrelMail as a primary interface, at least until I can work out what to do about the setup. I’m looking into Fetchmail, which I’m hopeful may allow me to do much the same thing as I did before, where my script only stuck back into my incoming mail box those items matching my whitelist criteria.
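For readers curious what that kind of sorter looks like, here is a minimal sketch of the idea in Python. It is not the Perl script described above. The class names (`white`, `other`), the shape of the whitelist (a set of lowercase addresses), and the pattern list (pairs of regular expression and class label) are all assumptions made for the example.

```python
import email
import re
from datetime import date
from email.utils import parseaddr
from pathlib import Path


def classify(raw_message: str, whitelist: set[str],
             patterns: list[tuple[str, str]]) -> str:
    """Return a class label for one message (labels are hypothetical)."""
    msg = email.message_from_string(raw_message)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    if sender in whitelist:
        return "white"
    # The first pattern that matches anywhere in the message decides the class.
    for regex, label in patterns:
        if re.search(regex, raw_message, re.IGNORECASE | re.MULTILINE):
            return label
    return "other"


def file_message(raw_message: str, mail_dir: Path, whitelist: set[str],
                 patterns: list[tuple[str, str]], today=None) -> Path:
    """Append the message to a file named for the day, with the class as extension."""
    today = today or date.today()
    label = classify(raw_message, whitelist, patterns)
    path = mail_dir / f"{today.isoformat()}.{label}"
    with path.open("a", encoding="utf-8") as fh:
        fh.write(raw_message)
        if not raw_message.endswith("\n"):
            fh.write("\n")
    return path
```

A `.forward` pipe would hand the message to a small wrapper that reads standard input and calls `file_message`. Whether Fetchmail can feed a script in the same way is exactly the open question in the paragraph above, so that part is left out of the sketch.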
Computation & Science Austringer on 05 Apr 2009
Trip to Nashville
There was no sight-seeing, but I went to Nashville, Tennessee from last Sunday to last Thursday. This was to present at a conference, the IEEE Symposium Series on Computational Intelligence. I had a paper in the Artificial Life session that I presented on Tuesday, and it seemed to me that it went well. The Artificial Life session ended Tuesday, though, so after that I attended talks in several of the other tracks at the conference.
On Wednesday, I sat down at lunch just at random, and two attendees sat down beside me. I talked mostly with Bob Abercrombie on my right, but at some point he brought his colleague, Rick Sheldon, into the conversation. As we compared notes on our backgrounds, Rick and I gradually came to realize that we had gone to grad school in computer science together and had worked together just afterwards at General Dynamics Data Systems Division.
If that wasn’t strange enough, I attended several talks in the computer security track on Thursday. One of the attendees seemed more than usually animated and clued in, and I decided to join his table for lunch. I noticed his nametag said he was Daniel Ashlock of Guelph University. The name rang a bell, and I spent most of lunch trying to recall where I knew him from. It turned out that we had both participated extensively in the Usenet talk.origins newsgroup in the early 1990s, and we both have listings on the University of Ediacara faculty roll. We had never met before in person, so we got a chance to discuss this, that, and the other while proceeding to the airport and waiting for our flight times.
Score another couple points for the small world.
Last Thursday, though, bad weather was moving into Nashville. By the time we got to the airport, the temperature was dropping and the sky looked quite dark off to the west. We had gotten through the security checkpoint and had been waiting in the terminal a while when a PA announcement said that a tornado had been spotted in the area, and everyone was supposed to move away from glass windows over to interior walls or into the restrooms. This was the first time I’d been someplace where a “take cover” advisory had been issued for a tornado threat. While the tornado didn’t come visit the airport, the bad weather made hash of the departure schedule there. After a whole series of announced delays, my flight that was supposed to leave at 5:22 PM actually left about 9 PM. Since I had a connecting flight from Detroit to Lansing, that meant that I arrived in Detroit about an hour after my plane had left for Lansing.
I asked the gate agent for assistance as I came off the airplane. I was told that everything would be handled at the station at Gate A43. I was then at Gate A61. So a fifteen minute stroll later I arrived at the station at Gate A43. No one was there. As I stood there trying to figure out what was next, someone did come by, the nightshift agent for Northwest Airlines. Apparently it is Northwest Airlines policy to run their customers through a sequential gauntlet of liars (the gate agent who sent me to an unstaffed location for assistance) and the rude (the wandering agent whose job is apparently to do as little as possible that would actually help passengers). The one piece of useful information I got from the peripatetic and randomly abusive agent was that late-night service was limited to the Northwest Airlines baggage claim office. So I headed there. The staff at the baggage claim area were pleasant enough, but given that “weather” was down as the reason for the missed connecting flight, they only needed to reschedule me on the next available flight … which would be the following afternoon. They could get me a discounted rate at a hotel, but that was it as far as doing anything to assist me. So I checked the car rental places, figuring that if I could get a car rental cheaper than the hotel, I’d still be ahead. Out of about nine places, only six answered the phone at 12:30 AM, and of those, only two had cars to rent and would provide a daily rate quote, and both of those were over $99.
So around 1 AM I called Diane and asked her to book me a seat on the next Michigan Flyer bus, which would be a 6 AM departure. I didn’t see much point in doing the hotel thing for what would be about three hours of sleep. So I got a seat at door 402 at the terminal, which is where the Michigan Flyer would be coming. I settled in to do some programming and passed the time with that and naps. The Detroit Airport, like the Nashville Airport, offers the Boingo WiFi hotspot service, allowing people willing to part with $10 to hook up to the internet while they are in the airport. Since that didn’t include me, I just worked on things that didn’t need online access.
Eventually, 6 AM came around, and so did the Michigan Flyer. I got aboard, and got to wait some more for whatever paperwork the bus driver found necessary to do. We got moving around 6:30 AM, and had our first stop about fifteen minutes later at the other terminal. After another round of paperwork, we got moving again. There was a stop in Ann Arbor, and another in Jackson. We arrived in East Lansing about 9:30 AM. Diane came and picked me up. I’ve been off schedule over the weekend. I do hope I re-sync soon.
Antievolution & Computation Austringer on 15 Mar 2009
Dembski, “Weasel”, and Video-Level Evidence
A post over at Uncommon Descent with a long-running series of comments resulted in a link to a video segment that bears on a stance taken by William Dembski and others that Richard Dawkins’ “weasel” program somehow worked by locking-in correct characters, protecting those from further mutation. The video shows that no such protection was given to correct characters. I’ve sent an email to Dembski and Robert Marks to bring this directly to their attention. I’ll share it with you here.
“David Kellogg” on Uncommon Descent linked to a YouTube video of a 1987 BBC Horizon program on Richard Dawkins’ “The Blind Watchmaker”. It includes video closeups of Dawkins’ “weasel” program in operation. The video also plainly shows that letters that match the target string are not locked or latched, just as I informed you some time ago (2000/10/09 for Dr. Dembski and 2007/10/11 for Dr. Marks). View the following video:
http://www.youtube.com/watch?v=5sUQIpFajsg
The relevant part begins at 6:15 into the video. The camera is close enough to the screen to show the letters in the evolutionary computation clearly, and it plainly shows that there is no latching of any character in any position.
You have continued to present Dawkins’ “weasel” program as incorporating a latching mechanism for correct characters, and have gone so far as to term “weasel” a partitioned search. You concluded in drafts of papers that “weasel”’s performance advantage over blind search was due to it having a partitioned search as its mechanism.
I previously laid out the evidence that the description of “weasel” provided by you was incorrect, without apparent effect. I have a further blog post that plots the performance of the partitioned search as you described it, and an accurately implemented version of the “weasel” program.
dembski-and-marks-are-still-mischaracterizing-dawkins-weasel
While actual “weasel” is slightly less efficient than Dembski-partitioned-search, both are dramatically better than blind search. This is at variance with several of the claims that you have made.
Given that the evidence is now as clear that Dawkins did not use a partitioned search as it always has been that he never described a partitioned search, I would hope that you each and jointly will take steps to remove the inaccurate descriptions and invalidated conclusions that were made on an incorrect premise.
Wesley R. Elsberry
Why pay attention to persistent antievolutionist error over a toy pedagogical example from 23 years ago? Because the antievolutionists don’t seem to be able to understand even the simplest sort of illustration of evolutionary computation, and that implies that understanding the basics of the principles behind “weasel” is also far from them. The incorrect description of “weasel” is propagated in the text of a paper that Dembski has claimed has been accepted for publication somewhere, though the correction was brought to Dembski’s attention over eight years ago, and to his co-author’s attention in 2007. Not only is the description incorrect, but the incorrect elements of the description were the ones that were the subject of analysis and the basis for the erroneous conclusions that they drew. Further, the tenacity with which this error has been clung to has resulted in the incorrect description and conclusion being used by others in the religious antievolution movement, as may be seen in Meyer’s Hopeless Monster. Error this basic whose effects have been so protracted needs to be exposed assiduously.
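The difference between the two procedures discussed in the email above is small enough to show in code. What follows is a sketch under stated assumptions, not Dawkins’ original program: the brood size of 100 and per-character mutation rate of 0.05 are illustrative values chosen here, and the function names are hypothetical. In `weasel`, every character of every copy is open to mutation, including ones that already match. In `partitioned_search`, matching characters are frozen and only mismatched ones are redrawn, which is the latching behavior attributed to “weasel” by Dembski and Marks.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "  # 26 letters plus space


def matches(candidate, target=TARGET):
    """Number of positions where the candidate agrees with the target."""
    return sum(a == b for a, b in zip(candidate, target))


def mutate(parent, rate, rng):
    """Copy the parent; every position, correct or not, may be redrawn."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in parent
    )


def weasel(rng, offspring=100, rate=0.05, max_generations=10_000):
    """Non-latching weasel: breed mutated copies, keep the closest one.

    Returns the number of generations used (parameters are assumptions).
    """
    parent = "".join(rng.choice(ALPHABET) for _ in TARGET)
    generations = 0
    while parent != TARGET and generations < max_generations:
        brood = [mutate(parent, rate, rng) for _ in range(offspring)]
        parent = max(brood, key=matches)
        generations += 1
    return generations


def partitioned_search(rng, max_steps=10_000):
    """Latching search: correct characters are frozen, the rest are redrawn.

    Returns the number of steps used.
    """
    current = [rng.choice(ALPHABET) for _ in TARGET]
    steps = 0
    while "".join(current) != TARGET and steps < max_steps:
        current = [
            c if c == t else rng.choice(ALPHABET)
            for c, t in zip(current, TARGET)
        ]
        steps += 1
    return steps
```

Calling `weasel(random.Random(1))` and `partitioned_search(random.Random(1))` returns a count for each. How the two compare depends on the parameters and on whether cost is counted in generations or in candidate strings evaluated, so the sketch makes no claim on that score. Blind search is left out because it is not practical to run: with 27 symbols and 28 positions there are 27^28 possible strings, on the order of 10^40.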
Antievolution & Computation & Law and Politics Austringer on 23 Feb 2009
Kirk Durston and Misrepresentation of Avida
Kirk Durston wrote in his “Introduction to Intelligent Design”:
Recent computer simulations have failed to generate 32 bits of functional information in 2 x 10^7 trials, unless the distance between selection points is kept to 2, 4, and 8-bit steps.
The 2003 Lenski et al. paper on Avida is cited by Durston as supporting the quoted statement. The 2e7 number comes from the section describing how 50 runs failed to evolve the EQU function if no less complex functions were rewarded, and the 2e7 number refers to the 2.15e7 unique genotypes evaluated in those 50 runs (p.143). But the remaining numbers don’t match up with the paper. 2, 4, 8, and 32 are mentioned in the paper as values of merit awarded to organisms based on the number of NAND operations required for the task completed. Those aren’t measures of “functional information”. Durston also left out the “16” number, which corresponded to the level of merit of two other tasks that were not rewarded in that experiment, and that omission is misleading.
Getting to bits isn’t difficult. I’ll be using a simple approach since the Avidian programs at issue all utilize a set of 26 instructions. Any instruction could be in any position in an Avidian genome, so each instruction in an Avidian genome can be considered to contribute
- log2 (1/26) = 4.70
bits to the Avidian.
If one were trying to express “functional information” of an Avidian in bits, one might assert that five NAND instructions being necessary out of an instruction set with 26 instructions in it would give you 23 bits, or that the minimal number of instructions needed for EQU of 19 gives 89 bits, or that the reported value of 35 instructions that were necessary for EQU in the knockout experiments reported yields 164 bits, or that the 60 instructions in the first Avidian to perform EQU yields 282 bits. None of those match the “32 bits” Durston mentions, and trying to assert the 23 bit figure would require using an idiosyncratic measure. All the other possible assertions yield more bits than Durston’s quote states.
If one is trying to specify how high the bar was that Avida failed to clear in that experiment, the lowest that one might reasonably argue for would be the 89 bits that one may derive from the minimal known program using 19 instructions. That’s a lot more than the 32 bit figure asserted by Durston.
If one is trying to figure out how large an informational difference exists between programs that accomplish the various logic tasks in Avida, Durston’s statement about “distance between selection points of 2, 4, and 8-bit steps” doesn’t seem to correspond to anything there. Any single insertion or deletion changes the information content of an Avidian by 4.7 bits (when one uses the standard instruction set of 26 instructions), not the even powers of two stated by Durston. Further, there is a handy table of the shortest known hand-coded Avidian programs.
| Task | Shortest known program length | Bits | Size of Merit Reward |
| NOT | 6 | 28 | 2 |
| NAND | 5 | 23 | 2 |
| OR_N | 6 | 28 | 4 |
| AND | 9 | 42 | 4 |
| OR | 15 | 70 | 8 |
| AND_N | 10 | 47 | 8 |
| NOR | 19 | 89 | 16 |
| XOR | 15 | 70 | 16 |
| EQU | 19 | 89 | 32 |
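For anyone who wants to check the Bits column, here is a minimal sketch in Python. It assumes the 26-instruction set mentioned above and truncates to whole bits, which is what reproduces the figures in the table; the names `bits_per_instruction`, `bits_for_length`, and `SHORTEST_KNOWN` are illustrative and have nothing to do with Avida's own code.

```python
import math

INSTRUCTION_SET_SIZE = 26  # standard Avida instruction set, per the text

# Shortest known hand-coded program length for each task (from the table).
SHORTEST_KNOWN = {
    "NOT": 6, "NAND": 5, "OR_N": 6, "AND": 9, "OR": 15,
    "AND_N": 10, "NOR": 19, "XOR": 15, "EQU": 19,
}

def bits_per_instruction(set_size=INSTRUCTION_SET_SIZE):
    """Information contributed by one instruction: -log2(1/set_size)."""
    return -math.log2(1.0 / set_size)

def bits_for_length(n_instructions, set_size=INSTRUCTION_SET_SIZE):
    """Whole bits for a program of n instructions (truncated, as in the table)."""
    return math.floor(n_instructions * bits_per_instruction(set_size))

if __name__ == "__main__":
    print(f"bits per instruction: {bits_per_instruction():.2f}")
    for task, length in SHORTEST_KNOWN.items():
        print(f"{task:6s} {length:3d} instructions {bits_for_length(length):4d} bits")
```

The same function gives the 164 and 282 bit figures quoted earlier for the 35-instruction and 60-instruction cases.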
Based on the minimal program lengths, all the logic tasks are substantially more complex than Durston admits. The “32 bit” barrier Durston discusses was not reported in the experimental results that he cites. NOT and NAND tasks, at 28 and 23 bits, evolve exceedingly rapidly in Avida populations whether they are rewarded or not. I just pulled up Avida-ED, turned off all rewards but EQU, and let a run with a maximum population of 3,600 go. AND_N, at minimum a 47 bit task, turned up by update 196 in the very first run I made. That it was unrewarded does not mean that it did not evolve. The only task reported not to have evolved without other tasks being rewarded was EQU, an 89 bit task.
Besides not being based on anything in the cited paper, Durston’s primary claim in that sentence is demonstrably false, as a few runs of Avida or Avida-ED will show: logic tasks of greater complexity than 32 bits do evolve in Avida even if less complex tasks are unrewarded. I tried that directly in Avida-ED by turning off rewards for all logic tasks under 32 bits in complexity (NOT, NAND, and OR_N) and running it. My first run had AND_N and AND evolve by update 800, OR by update 1200, XOR by update 1345, and NOR by update 1549. A second run fixed on a population mostly doing AND and NOR. My third run showed evolution of EQU by update 2400. All the logic tasks rewarded were over 32 bits in complexity in those runs, and none of the less complex tasks were rewarded as “steps”. There isn’t a handy tally of unique genotypes, but with a maximum population of 3,600 the count can’t possibly hit 2e7 until after 5555 updates, anyway.
Expanding on Durston’s erroneous discussion on functional distance rewarded, the differences between minimal length programs for different tasks are in {0, 5, 14, 19, 23, 24, 28, 42, 47, 61, 66} bit distances, not “2, 4, and 8 bits” as Durston mistakenly asserts. The knockout experiment reported in the paper discusses the case where a single point mutation changed an Avidian program that had previously performed the AND task into one that performed EQU instead:
Besides EQU, this genotype performed five of the eight simpler logic functions; AND was lost as a side-effect of the EQU-producing mutation, and NAND had been eliminated by the one-step-prior mutation.
Based on minimal program lengths, the step from AND to EQU is a distance of 47 bits. The Avidian also performed the NOR task both before and after the mutation that permitted it to perform EQU. A transition from performing NOR to performing EQU could be claimed to be a 0 bit distance, given that both have shortest program lengths of 19 instructions, but that was not what was observed in that case. The very source Durston cites as support rebuts his assertions.
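The set of distances quoted above can be recomputed from the Bits column of the table. The following is a small Python sketch under that assumption; the helper names `pairwise_distances` and `distance` are made up for illustration.

```python
from itertools import combinations

# Bits for the shortest known program for each task (Bits column of the table).
TASK_BITS = {
    "NOT": 28, "NAND": 23, "OR_N": 28, "AND": 42, "OR": 70,
    "AND_N": 47, "NOR": 89, "XOR": 70, "EQU": 89,
}

def pairwise_distances(task_bits):
    """Set of absolute bit differences between every pair of distinct tasks."""
    return {abs(a - b) for a, b in combinations(task_bits.values(), 2)}

def distance(task_bits, from_task, to_task):
    """Absolute difference in minimal-program bits between two named tasks."""
    return abs(task_bits[to_task] - task_bits[from_task])

if __name__ == "__main__":
    print(sorted(pairwise_distances(TASK_BITS)))
    print("AND -> EQU:", distance(TASK_BITS, "AND", "EQU"))
    print("NOR -> EQU:", distance(TASK_BITS, "NOR", "EQU"))
```

None of the nonzero distances is 2, 4, or 8, and the AND to EQU step comes out at 47 bits, as stated.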
The implication of Durston’s “unless” phrasing is incorrect as well. The 50 run experiment where only EQU was rewarded did not try out an alternative reward structure to get to EQU. Durston cannot be referring to the outcome of experiments where all nine logic tasks were rewarded because he specifically used the unique genotypes figure from the “only EQU is rewarded” experiment, and not the significantly smaller number of unique genotypes explored in getting to EQU in the main experiment (1.22e7) where all nine logic tasks were rewarded.
So about the only thing Durston managed to get right in that sentence was copying one number from the original paper, where he limited himself to one significant digit. That seems excessively non-functional.
The Lenski et al. paper does a lot more than repudiate Durston’s dolorous-but-derelict assertions, though. It demonstrates via evolutionary computation that complex functions can arise from modification of simpler precursors. Avida removes the usual mainstay of antievolutionist argumentation, that there isn’t enough information about a lineage of interest to demonstrate that only evolutionary processes need be invoked as efficient causes to get to the result. Durston essentially gives us an instance of response #4 from my 1998 essay on objections to evolutionary computation:
Natural selection might be capable of being simulated on computers, and the simulations may demonstrate good capacity for solving some problems in optimization, but the optimization problems are not as complex as those in actual biology.
This objection typically appears once the first three above have been disposed of. Computer simulation, once held to be either a potential indicator of merit or an actual falsifier of natural selection, is then treated as essentially irrelevant to natural selection. It is certainly true that computer simulations are less complex than biological problems, but the claim at issue is not that EC captures all the nuances of biology, but rather that EC gives a demonstration of the adaptive capabilities of natural selection as an algorithm.
Durston’s attempt to misrepresent a single Avida experiment of modest extent and use that misrepresentation to make a proscriptive negative claim about evolutionary processes in biology is risible.
A point to be made, though, is that evolutionary processes don’t have to be good at “poofing” things together all at once; that’s the special creation hypothesis. Many religious antievolutionists get stuck on this, thinking that unless evolutionary processes have the asserted capabilities of omniscient, omnipotent creative deities, they can’t be credited with the history and diversity of life on earth.
Update: Other places Google thinks Kirk Durston’s erroneous conclusions have propagated:
Evolution under fire? — Part 1
Mathematically Defining Functional Information In Biology
Does God Exist? – Part 2 of 3 (about 6:20)
Re: Kirk Durston on information theory
Computation Austringer on 27 Jan 2009
Linux and Marvell Topdog Wireless
I got a Gateway MT6458 laptop computer back in October of 2007. One of the first things I did was to resize the Vista partition, giving me about half the disk to install Linux on for a dual-boot system. I used the Xubuntu version of the Ubuntu Linux distribution.
A fly in the ointment was that Xubuntu did not recognize the built-in PCI-E wireless card, a Marvell Topdog card. My solution to this point was simply to carry an Atheros-based PCMCIA wireless card and plug it in if I was using Linux.
Unfortunately, I seem to have misplaced the PCMCIA card.
So I looked for people who had managed to get the Topdog card working. The main problem I had, it turned out, was trying to be too specific in my search string. Trying to locate “xubuntu gateway mt6458 wireless” didn’t work, but when I tried just “ubuntu marvell topdog” I hit paydirt. That thread has step-by-step instructions (not all in one comment, though) and a link to a working driver archive (on page 2).
To summarize, once you’ve unpacked the driver archive:
sudo ndiswrapper -i NetMW14x.inf
sudo ndiswrapper -a 11ab:2a08 netmw14x
sudo ndiswrapper -m
sudo depmod -a
sudo modprobe ndiswrapper
As another commenter noted, I had to repeat the last two commands, but my Xubuntu now can do wireless via the built-in card.
Antievolution & Computation & Law and Politics Austringer on 01 Jan 2009
A Capsule Summary of the Status of Dembski’s Explanatory Filter
William Dembski’s “explanatory filter” (EF) has been offered as a “rational reconstruction” of how work gets done in various scientific fields. However, it does not actually comport with such work. A direct refutation can be found in Gary Hurd’s chapter in “Why Intelligent Design Fails” from Rutgers University Press.
Taken at another level, Dembski’s EF has fundamental problems. See the paper by Wilkins and me from 2001, available online here. Dembski’s arguments for making a category of “design” a default choice fail to live up to the “rational” part of rational reconstruction. The problem of limited information is not satisfactorily handled by Dembski: on the one hand, he claimed that sufficient knowledge was in hand in 1998 to analyze the examples provided in biology by Michael Behe via his EF and that the results demonstrated “with the weight of science” that “design” was found, yet more recently he has admitted that even his one attempted explication of applying the EF to a bacterial flagellum was flawed by the problem of obtaining accurate probabilities. One wonders where the ensemble of calculations that Dembski implied had already been done in 1998 has disappeared to. Despite the lack of consideration in Dembski’s framework for changes in knowledge sets driving decisions in the EF, Dembski repeated claims of absolute reliability, while inconsistently also claiming that partial function of his EF was only to be expected for a procedure in the natural sciences.
Further, the issue of lack of warrant for extrapolating ordinary design inferences to rarefied design inferences has not been adequately addressed by Dembski. In “The Design Revolution”, Dembski manages only a handwave in response, asking in effect why one should accept at all the framework in which the criticism of his work is made. Yet Dembski has been eager to utilize that inductive framework when he believes that it favors his argument, as in various books and articles where he claims that the successes of various “special sciences” provide support for his “rational reconstruction” via the EF. Applying Dembski’s own words to himself is apropos: “This is known as having your cake and eating it. Polite society frowns on such obvious bad taste.”
It seems obvious that, despite the problems in the logic of the EF, there was something of interest in the concepts that Dembski brought up. Humans do go about distinguishing between and eventually favoring particular explanations for phenomena. So what might be at the basis of interesting cases, and how is it that explanations come to be preferred? Jeff Shallit and I took that up in an appendix to an online essay we wrote back in 2002. Therein we described the universal distribution, an application of algorithmic information theory to the problem of inductive inference, and showed how it could be cast in a way that corresponded to the tool that actually provides a rational reconstruction of work done in the sciences to achieve ordinary design inferences. We called it “Specified Anti-Information” or SAI to, so far as possible, utilize the terminology Dembski had provided.
SAI differs from the EF in many important ways: it is not based on probability assessments, it is simple to apply, and it is based upon solid work in information theory. Perhaps the most important difference, though, is that the inference that application of SAI leads to is not to an overarching notion of “design”, but rather to the inference that a phenomenon is best explained as the result of a simple computational process. SAI is not burdened with the baggage Dembski loads upon his EF of not merely sorting explanatory categories, but also of standing in for an argument that would lead to an inference of an agency at work. SAI cannot, and does not attempt to, distinguish between a computational process crafted by an agent and one where no originating agent is apparent. This contrasts sharply with Dembski’s long-term fascination with a split between “apparent” and “actual” categories of “complex specified information”.
For any phenomenon that might be explained as due to chance or not due to chance, any apparent success of Dembski’s EF can be more parsimoniously explained as a “pre-theoretic” approach to the far more applicable, reliable, and useful rational reconstruction of the SAI.
To summarize, the issues with Dembski’s EF are many and well-documented. Dembski’s EF fails to achieve its claimed status as a “rational reconstruction” of how humans empirically approach the problem of sorting competing explanations for natural phenomena. Better methods exist that serve as descriptions of how humans can “eliminate chance” in preferring alternative explanations for phenomena in the natural sciences.
Computation Austringer on 31 Dec 2008
A Targeted Linux Site — LinuxSlate.com
During my recent trip, I was able to meet up with “Crossbow” from LinuxSlate.com. LinuxSlate is his site to give reviews and commentary on Linux as used for mobile and embedded systems. As stated on the sidebar, the site originated as a means to distribute Linux device drivers for the screens in Fujitsu pen-based PCs. Since then, though, “Crossbow” has been giving capsule and extended reviews of various products that either use Linux out of the box or may have Linux installed, plus commentary on market penetration of Linux in the mobile and embedded systems market. For example, there is an extended review of the Motorola Motozine ZN5 cell phone, and a mini-review from October of Target’s stripped-down version of the Asus Eee 900 portable mini-PC at an impulse-buy price of $299. It’s a site worth checking out.
Computation Austringer on 08 Dec 2008
Keeping Up Appearances
I updated the theme using the WordPress 2.x capable version of “Shaded Grey”. There are some features that aren’t yet working, like the drop-down styling of things in the sidebar. But I’ve gotten back the content of the sidebar items that went missing for a while, and that’s a happy thing.
Computation & Education & Science Austringer on 27 Oct 2008
NCSE’s New Website
The National Center for Science Education has long planned a revision of their web pages. Now, the new version of their website is officially up and running. Check it out.
The previous version is still available, though. There will be a period of confusion until Google spiders both sites, as the old content has been incorporated into the new framework, which presents different URLs for access. The legacy site was coded and contributed by Ira Walter back in 1998. It served NCSE well for several years, but Walter’s time commitments prevented him from doing much in the way of updates. The combination of custom coding and targeting of a specific host setup caused NCSE, and me, a bit of a headache in 2006 just before Thanksgiving when Ira Walter died and shortly thereafter the server hosting the site died. It dropped into my lap to get the site running again on a new server at the same hosting company, whose different underlying software architecture required some basic changes in the way various functions worked. NCSE had been hobbling along with the patched website.
Now, it looks like they have a good basis for carrying their content into the future. The Drupal content management system is a widely-used, actively developed, open source system that is themable and flexible. If a change in presentation is needed, it only requires theme changes.
Antievolution & Computation & Science Austringer on 24 Oct 2008
The Real Weasel in JavaScript
I set up a page taking a first pass at providing a “weasel” program that follows the description Richard Dawkins gave in “The Blind Watchmaker”. It is done using a not well-behaved JavaScript routine, but as this was my first JavaScript coding of any sort, I figure refinement can come later. See it here. New version here. [Now also integrated into the main AntiEvolution.org CMS.]
One of the things that the page aims to debunk is the long-running falsehood that “weasel” (and every other form of evolutionary computation) must get its relative improvement over random search by somehow “locking in” parts of solutions that match a known target. William Dembski is the most persistent of people peddling this error, even though he was handed effective notice of its falsity over eight years ago. To that end, the “weasel” I wrote not only does not “lock” individual characters, but you can verify that it does not lock individual characters against mutation. You can set higher mutation rates and smaller population sizes to find values that cause “weasel” to sometimes step back to having worse performance in one generation than it had in the generation just previous. A simple verification can be had by setting the default target string, a population of 30, and a mutation rate of 8% per character. Run that several times, and most times you will see at least one generation that had a stepback in performance.
Dembski and others make much of the fact that the few lines Dawkins used to illustrate output from his “weasel” don’t show an instance of a stepback in performance. They insist that this means that Dawkins must have used a rule in his program that would not allow a stepback in performance, but that is simply induction gone wild and sloppy thinking. One can verify with my page that for reasonable mutation rates and population sizes, stepbacks are rare. Why is this? To get a stepback, every single string in a new population has to have at least one fewer matching character than was present in its parent from the previous generation. The more strings that are generated in a new population, the smaller the probability that all of them will alter at least one of the already-matched characters from the parent string. So small population size is critical to observing stepbacks. Also, strings changing at least one already-matched character from the parent are less likely with lower mutation rates, so a high mutation rate is also critical to observing stepbacks. Neither of these conditions is didactically similar to the process that Dawkins wanted to make an analogy to, natural selection in biology.
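The same behavior can be explored outside the browser. What follows is a minimal Python sketch of a non-locking “weasel”, not the JavaScript from the page and not Dawkins’ original program. Several things in it are assumptions: the 27-character alphabet of capital letters plus space, the rule that a mutated position is redrawn uniformly from that alphabet (so it may redraw the same character), the defaults for population size and mutation rate (taken from the verification settings suggested above), and all of the function names.

```python
import random
import string

ALPHABET = string.ascii_uppercase + " "   # assumed alphabet: A-Z plus space
TARGET = "METHINKS IT IS LIKE A WEASEL"

def score(candidate, target=TARGET):
    """Number of positions at which candidate matches the target."""
    return sum(c == t for c, t in zip(candidate, target))

def mutate(parent, rate, rng):
    """Copy parent; every position, matched or not, may mutate with probability rate."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else ch for ch in parent)

def run_weasel(pop_size=30, rate=0.08, target=TARGET, max_generations=5000, seed=None):
    """Return the best score of each generation, starting from a random string."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in target)
    history = [score(parent, target)]
    while history[-1] < len(target) and len(history) <= max_generations:
        offspring = [mutate(parent, rate, rng) for _ in range(pop_size)]
        # The parent itself is not carried over, so the best child may score lower.
        parent = max(offspring, key=lambda s: score(s, target))
        history.append(score(parent, target))
    return history

def count_stepbacks(history):
    """Count generations whose best score is lower than the previous generation's."""
    return sum(later < earlier for earlier, later in zip(history, history[1:]))

if __name__ == "__main__":
    for pop_size, rate in [(30, 0.08), (200, 0.04)]:
        history = run_weasel(pop_size=pop_size, rate=rate)
        print(f"population {pop_size}, mutation rate {rate}: "
              f"{len(history) - 1} generations, {count_stepbacks(history)} stepbacks")
```

Nothing in `mutate` consults the target, so no character is protected once it matches; the only place the target enters is in ranking the offspring. Comparing a small population at a high mutation rate against a large population at a low one should show the difference in stepback frequency described above.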
I’ve already noted (and graphed) that there is a difference in performance between Dembski’s invention of an “oracle weasel” and what Dawkins’ description delivers. The “weasel” page of mine helps drive the point home by providing an interactive means for people to explore what a real implementation of what Dawkins described does.
Computation Austringer on 10 Oct 2008
Sender Policy Framework
I’d been assuming that there wasn’t much to be done about cases where spam got through my email filters by dint of having forged my own email address into the “From” field, or to inform others that, no, it wasn’t my server that’s been sending spam with the forged “From” information. I was wrong: there is something to be done, and one of those things is setting up Sender Policy Framework (SPF) information.
SPF is a standard for specifying which hosts may legitimately send email that bears a particular domain name. This is done via Domain Name System (DNS) records that can be queried. If one is managing a DNS server directly, it can be specified in the zone record. If one is using a domain registrar’s DNS interface, then one is going to set a TXT record to do the job. The syntax is pretty simple, and there is an online form to help generate what goes in a TXT record to set up your SPF.
The TXT record contents I just set all my domains to use are as follows:
v=spf1 a mx ptr ip4:71.170.27.36 ip4:71.170.27.37 -all
The “v” field specifies the version of SPF being used. Following that, there is a series of mechanisms that tell which hosts are legitimate senders of email for the domain. First, “a” means that all addresses in DNS A records are legitimate. Similarly, “mx” and “ptr” say that addresses associated with the domain’s MX and PTR records are legitimate. (The Pobox online form linked above deprecates the PTR record use.) There are two servers that specifically may send email, and I’ve included their IPv4 dotted-quad addresses using the “ip4:nnn.nnn.nnn.nnn” format. The final parameter, “-all”, says that no other addresses legitimately may claim to send email on behalf of my domain. So, if you have a domain that never sends email, you could set SPF with the following TXT record:
v=spf1 -all
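To make the structure of those two records concrete, here is a rough Python sketch that splits an SPF TXT record into its terms. It is only an illustration, not a validator: it handles the simple forms used above (a bare mechanism, or a mechanism with a value after a colon, with an optional qualifier in front) and ignores modifiers such as “redirect=”. The function name `parse_spf` is invented for the example.

```python
# Qualifier characters defined by SPF and the result each one signals.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

def parse_spf(record):
    """Split an SPF TXT record into (result, mechanism, value) tuples."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    parsed = []
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier when none is written
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        mechanism, _, value = term.partition(":")
        parsed.append((QUALIFIERS[qualifier], mechanism, value))
    return parsed

if __name__ == "__main__":
    record = "v=spf1 a mx ptr ip4:71.170.27.36 ip4:71.170.27.37 -all"
    for result, mechanism, value in parse_spf(record):
        print(f"{mechanism:4s} {value:15s} -> {result}")
```

Read this way, every term before “-all” names senders that pass, and “-all” fails everything else; the “v=spf1 -all” record has nothing but the final fail.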
Now, this only helps if SMTP agents receiving email bother to check SPF information. One way this can happen is with the SpamAssassin email filtering system, when it is set to make SPF queries. This is the step that I’m working on for my servers. I have one that doesn’t have SpamAssassin installed, so adding SPF_QUERY is simply a configuration checkbox away. The other one has SpamAssassin running, but the configuration did not include SPF_QUERY, so I have to figure out how to get that enabled.
For those using the FreeBSD OS and wanting to set up SpamAssassin, here’s a helpful page.
I ran across this while investigating how to use SpamAssassin in conjunction with my bulletin board on Antievolution.org. Recently, porn and drug spammers have managed to enter comments on the BB, so I’d like to pass comment text to SpamAssassin and get an indication of spam/not spam. I think that I can do this using “spamc -c”, but I still have a batch of work to do to get to that point. If anyone has an alternate approach I should consider, let me know.
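One possible shape for the “spamc -c” idea, sketched in Python purely for illustration (the bulletin board software itself may well be written in something else). The sketch assumes that spamd is running and spamc is on the path, and it relies on the documented behavior of “spamc -c”: it prints “score/threshold” and exits with status 1 when the message is spam. Wrapping the comment in a minimal mail message, the placeholder addresses, and all of the function names are assumptions made for the example.

```python
import subprocess
from email.message import EmailMessage

def wrap_comment(text, author="bb-comment@example.invalid"):
    """Wrap bare comment text in a minimal mail message, since spamc expects mail."""
    msg = EmailMessage()
    msg["From"] = author                       # placeholder address
    msg["To"] = "moderator@example.invalid"    # placeholder address
    msg["Subject"] = "Bulletin board comment"
    msg.set_content(text)
    return msg.as_bytes()

def parse_check_output(stdout_text):
    """Parse the 'score/threshold' line printed by `spamc -c`."""
    score, _, threshold = stdout_text.strip().partition("/")
    return float(score), float(threshold)

def is_spam(text):
    """Return (spam?, score, threshold) for a comment; needs spamd running."""
    result = subprocess.run(["spamc", "-c"], input=wrap_comment(text),
                            capture_output=True, check=False)
    score, threshold = parse_check_output(result.stdout.decode())
    # spamc -c exits with 1 for spam, 0 for non-spam or on a processing failure.
    return result.returncode == 1, score, threshold
```

Because a processing failure also exits with 0 (and reports 0/0), a caller would probably want to treat a 0/0 result as “unknown” rather than as a clean comment.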
Computation & Science Austringer on 12 Sep 2008
Evolutionary Computation and Astronomy
Back in 1998, I was suggesting the use of evolutionary computation to investigate alternative hypotheses concerning data about foraging in a certain species of bat that preyed upon a particular species of insect. Nature has a summary of a paper in Astrophysics Journal where the researchers used a genetic algorithm to look in a space of over a quadrillion possible orbits for a spiral galaxy and a dwarf elliptical galaxy, and come up with an orbital scenario that closely matched the observed characteristics of the pair. This is a somewhat different application of evolutionary computation, essentially to identify better explanations for an observed state, rather than to produce an approximate solution to a problem given a specific set of conditions. I have no doubt that more of this style of application of evolutionary computation will be seen in the future.
Computation Austringer on 29 Aug 2008
Catching Up with the Past: A PDA and Skype
The era of multi-use phones is seeing either personal digital assistants (PDAs) acquiring phone capability or phones adding PDA-like features. Count me in the ranks of the academically underfunded, though; the bleeding-edge gadgetry simply isn’t in my budget. Earlier this summer, though, I was able to score a used PDA cheap off of eBay. I got a Dell Axim x50v PDA. It was cheap because its main battery doesn’t hold much charge and its charger had a bad connection to the PDA. Once the internal back-up battery drains, the PDA won’t launch no matter how well-charged the main battery is. I got a larger-capacity main battery and a three-function cable for it (charger, USB, and VGA out). My main goal was to have a WiFi-capable PDA to check out website accessibility for mobile users. However, I recently discovered a different use: telephone via Skype.
Skype has a Pocket PC version of their voice-over-IP application. I downloaded that to my Windows laptop and ran the install; it loaded on the x50v without incident. I’ve used it for a couple of long calls so far. Diane and I never bothered to get a landline here. We rely on our cell phones instead. So anything that could help us stay within our regular minutes on the cell phone plan, or enable us to drop down to a less costly plan, is all to the good.
I’ve had one bad consequence of running Skype so far: while Skype is active, the PDA does not time-out for power-down. I put the PDA in my vest the other day, and though I had hit the power-off button, it must have gotten pressed again accidentally at some point. When I pulled the PDA out and tried to turn it on, I got nothing. The battery had run down completely. Simply recharging was not enough; the x50v would not load its WiFi drivers afterward. I had to do a hard reset and reinstall applications. If I had been relying on the x50v for the usual PDA connectivity, I would have been pretty put out. So for traveling, I need to turn off and then also remember to set the lock button so it won’t turn on accidentally again.
Otherwise, the unit seems pretty decent for phone communication. A headset for listening helps. The built-in microphone is the only thing going there, though I haven’t checked to see whether it can be paired with a Bluetooth headset/microphone combination. Fortunately, the built-in microphone seems sufficiently sensitive for my purposes.
I’d like to hear from others who are using PDAs or other mobile devices with Skype, and especially about Skype features I haven’t tried yet: SkypeOut for calling standard phones, and setting a phone number via Skype for SkypeIn. I have to be budget-conscious, so I’d like to know how those work out in practice.
Computation & Media Austringer on 20 Aug 2008
Checking China Censorship
WebSitePulse has a China firewall test that gives users a way to check if a particular website is being blocked by the Chinese government. A TV news item from the Olympics noted that multiple blogging athletes had no access to their weblogs from the Olympic Village. In the cases of sponsored athletes, part of the sponsorship deal apparently was having the sponsor mentioned regularly on the weblogs during the Olympics, which would pose difficulties under the circumstances.
I’m not sure of just how thorough the Chinese censorship regime is. While several websites I manage appear to be received just fine in China, I think that if I were going there and wanted to keep updates flowing on the weblog, I would be planning ahead for the eventuality that I might not be able to get direct access while in the country. WordPress has a feature allowing it to post emailed entries, which seems like a good first resort when the blog page itself can’t be reached. One wouldn’t, of course, be able to view or moderate comments in that condition. Second would be to set up a friend to be able to post on my blog for me and email entries to them. Blocking that would require that all my outgoing email was filtered. It’s relatively easy to protect content by use of encrypted attachments, which means the next step for a censor would be either to block all my internet access or to otherwise restrain me.
Like I said, I don’t know how seriously the censorship is taken or how extensive it is, but I think one of the email workarounds would likely work for at least a limited amount of time. Any China internet experts care to weigh in?


