Bruised Fancy

This blog gives me a place to comment on things which strike my fancy, hence the title. Topics may include computer software/hardware, science, space, beer, books/movies/television programs of a geeky nature, or almost anything else. It is not marked as containing adult content but be warned that I occasionally post about beer and sometimes forget to watch my language. I've been writing systems software since the days of core memory, paper tape, and front panel lights/switches.

Friday, July 25, 2008

MacVim

I'm a long time vi/vim user. I don't know that I'd recommend anyone who doesn't already know vi go through the heavy learning curve necessary to become proficient at using vi. There are a number of GUI editors which are easier for new users to learn. However, vim (vi improved) is available on very nearly any computing platform you might use. I also find that vim allows me to accomplish some pretty complex editing operations faster than most of my former coworkers using other editors. Having used vi for over 20 years now, I have yet to find an editor which would make me more productive and believe me I've looked. I'm constantly searching for new programming tools in my spare time.

I've been looking for a decent port of vim for the Mac for a while now. The version pointed to by vim.org always seems to lag behind a version or two. It also has a few deficiencies. It used to have screen draw problems and would leave pixel residue behind after scrolling. It also never handled the "-" command line argument properly. This argument causes vim to read its data from stdin which is very handy for piping output from other commands into vim for easier manipulation.

My search is over. The team at Google Code has created a great port of vim for the Mac they call MacVim. It's fast and features none of the problems I'd experienced with other ports. Thanks guys for a great porting job!

Sunday, July 20, 2008

Data recovery, part two

I've managed to recover a fair number of files from my stepdaughter's failing hard disk. You might recall I used a program called dd_rescue to do a raw copy of the sectors of the failing hard drive to an image file on a larger USB hard disk. That was important because the old hard disk seemed to be getting progressively worse, as hard disks which have experienced a partial crash are wont to do.

There was good news and bad news regarding the copy. The good news was dd_rescue managed to copy about 33 GB from the 60 GB drive before encountering constant errors. The bad news is that left about 27 GB of data which hasn't yet been recovered.

Next, I used a handy (and free) program called PhotoRec to recover photos and a number of other data types from the partial image of her drive. All told, it found about 4000 jpg files large enough to be her pictures. Some of them probably come from a browser cache but a good deal of them are vacation photos which she'd be pretty upset to lose.

So far it's been at least a partial success. I'll post more if there are significant updates in the future.

By the way, let this be a reminder to you to go back up your data. Hard disks sometimes fail with no warning and not all drives fail gracefully enough to allow some data to be recovered.

No-name router problems

I spent part of this morning doing some remote troubleshooting of a problem my in-laws were having with their broadband connection. Their broadband provider supplied a no-name router. Somehow it had decided that my wife's laptop had made too many outbound connections and therefore must have a virus. Once having decided this (and quite erroneously so), this poorly designed router continued making this assertion even when her laptop was no longer plugged into the router. In fact it seemed completely unable to determine which were active computer connections and which had timed out. Its status page listed two computers which didn't match any computers currently connected to the network nor had there been any such computers connected that my in-laws could remember.

You might wonder how I could determine that the complaint about too many outbound connections was erroneous beyond the shadow of a doubt. I simply enlisted the use of the "netstat" command. The netstat command exists in all major OSes (Windows, MacOS, Linux, and BSD). It allows you to determine the state of network connections for the computer on which you execute the command. Using the "-b" option allowed us to see which programs had open connections. As I suspected, only iTunes, Thunderbird, and Firefox had network connections and none of the three applications had an unusual number of them.

And yet this silly router continued complaining about the number of outbound connections from this one computer even when the computer was disconnected from the network and through several power cycles of the router. So I walked my stepdaughter through the procedure to disable this poorly implemented portion of the firewall (the detection of number of outbound connections) because it obviously wasn't working properly.

My advice is to stick with a name brand router (Linksys or Netgear) whenever you're presented with the option. Sadly, since this router was supplied by their ISP, they don't have a choice in the matter. A no-name router may end up costing you more in troubleshooting time spent on poorly implemented features such as this one than you saved by purchasing a cheap device.

Sunday, July 06, 2008

Recovering data from a failing hard drive

I've been trying to recover the data off a failing hard drive for a family member. I've found a few programs which claim to be able to do just that but they always get hung up by the numerous retries the drive keeps doing in the failing areas. Then I came up with the idea of using the dd command to make a copy of the drive image, which I could then manipulate, having gotten the retries out of the way during the initial copy process. I'd used dd pretty heavily during the development of an SD card driver I'd done at my last company. Once the drive image has been copied to a file, the resulting image file can be mounted using the mount command... well it can on Linux and Mac OS X at least. You poor folks running Windows are out of luck.

After looking around on the web, I discovered a great little program called dd_rescue which does intelligent retries if errors are encountered, slowly lowering the block size being requested to find the boundaries of the affected area. I think the standard dd command would try to do retries until the read worked or until the copy was aborted. dd_rescue also allows an offset to be specified when the command is invoked so the copy may be done in several stages. Since it's taken about 4 hours, off and on, to copy the first 33 GB from the failing 60 GB drive, I'm anticipating having to make heavy use of this feature to complete the copy process over the next day or two.
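To make the staged copy idea concrete, here's a minimal sketch using plain dd on a small scratch file rather than dd_rescue on a real drive. The file names are made up for illustration, and dd's skip=/seek= operands stand in for the offset feature; dd_rescue's own options are different, so treat this as an analogy and not as its command line.

```shell
# Stand-in for the failing drive: a small scratch file (hypothetical names).
workdir=$(mktemp -d)
src="$workdir/disk.img"
dst="$workdir/copy.img"
dd if=/dev/urandom of="$src" bs=1024 count=8 2>/dev/null   # 8 blocks of 1 KB

# Stage 1: copy the first 4 blocks.
# conv=noerror keeps going after a read error; sync pads short reads with zeros.
dd if="$src" of="$dst" bs=1024 count=4 conv=noerror,sync 2>/dev/null

# Stage 2: resume at block 4. skip= is the offset into the input,
# seek= is the offset into the output, and notrunc preserves stage 1's data.
dd if="$src" of="$dst" bs=1024 skip=4 seek=4 conv=noerror,sync,notrunc 2>/dev/null

# Check that the two stages together reproduced the original.
cmp "$src" "$dst" && echo "images match"
```

On a real drive you would point if= at the device and pick a much larger block size, but the bookkeeping for resuming a copy works the same way.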

I made a few minor changes to the source to allow me to curtail the retries to speed up the copy. So far it's copied about 32 GB from the failing 60 GB drive. Once the data has been copied then I'll start trying to recover files from it. Wish me luck, I think I'm going to need it!

Monday, June 02, 2008

quick and dirty shell command

Today I was working on some old code at work. I discovered at least one duplicate include file which is a personal pet peeve. It's far too easy to allow multiple include files to get out of sync so you have different versions for different source files.

What I needed was a quick way of finding all the duplicated include files within this project directory (and subdirectories). It turns out stringing together a few Unix/Linux/Mac OS commands with some I/O redirection makes this task pretty easy.

The first thing we need is to be able to locate all the include files. In the C programming language, these files typically end with the ".h" file extension. We can use the find command to give us a list of the files which end with .h.

The next problem to be solved is that the matching files will have not only their filenames but also the directory in which they're located printed out. So we need a way of extracting just the "base" filename. Fortunately there's an easy method of accomplishing this: the basename command, a standard utility you can call from bash or any other shell.
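Here's a quick sketch of what basename does with a single path. The path itself is made up for the example.

```shell
path="./drivers/sd/include/sdcard.h"   # hypothetical path for illustration

name=$(basename "$path")      # strips the leading directories
echo "$name"                  # prints: sdcard.h

stem=$(basename "$path" .h)   # an optional second argument strips a suffix
echo "$stem"                  # prints: sdcard
```

Note that the second argument is treated as a suffix to strip, so basename is safest when handed one path at a time.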

The next logical step in figuring out whether there are duplicate filenames is to sort the matching filenames to make it easier to see matches with the sort command.

Finally we can use the uniq command to show just the filenames which appear more than once. The uniq command has other options. You can choose to show just items which are unique as well.
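A small sketch of the two uniq options mentioned above, run on a made-up list of header names. Keep in mind that uniq only compares adjacent lines, which is why the list gets sorted first.

```shell
# Hypothetical list of header names, sorted so duplicates sit next to each other.
names=$(printf 'util.h\nconfig.h\nutil.h\ntypes.h\n' | sort)

dups=$(printf '%s\n' "$names" | uniq -d)      # only names that repeat
singles=$(printf '%s\n' "$names" | uniq -u)   # only names that appear once

echo "duplicated: $dups"
echo "unique:"
echo "$singles"
```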

If we put all the portions of this command together, we come up with the following command. It's doing a lot of work to save us the trouble of manually sifting through all the filenames ourselves. That's what computers are supposed to do for us, eh?

find . -name "*.h" -print | xargs -n 1 basename | sort | uniq -d
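If you'd like to try this without touching a real project, here's a sketch that builds a throwaway directory tree and then lists where each duplicate lives. The directory and file names are invented for the example, it runs basename once per file via xargs -n 1, and it assumes the file names contain no spaces.

```shell
# Build a scratch project with one duplicated header (hypothetical names).
proj=$(mktemp -d)
mkdir -p "$proj/src" "$proj/lib" "$proj/include"
touch "$proj/src/util.h" "$proj/lib/util.h" "$proj/include/config.h" "$proj/src/main.c"

# Find header names which appear more than once.
dups=$(cd "$proj" && find . -name "*.h" -print | xargs -n 1 basename | sort | uniq -d)
echo "$dups"

# For each duplicated name, show every place it appears.
for name in $dups; do
    (cd "$proj" && find . -name "$name" -print | sort)
done
```

The second step is handy because knowing that util.h is duplicated is only half the job; you still need to know which copies to compare.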

Sunday, June 01, 2008

Palm Centro

I've used a Palm PDA without interruption since the first one was introduced in 1996. I was working for U.S. Robotics which owned Palm at the time and the employee pricing helped me decide to take the plunge into PDA life. After all these years, I've come to rely heavily upon a few key PDA applications (in addition to the standard PDA applications).

I use SplashID to securely store the multitude of passwords I need to remember both at work and at home. Without it, I'd have to resort to using weak passwords in order to stand a chance of remembering them all which compromises security.

I use SplashMoney to record credit card transactions while I'm away from my computer. This ensures I stay within budget and helps guarantee that I recognize any erroneous charges which might pop up.

JFile is invaluable for storing databases I design myself. I use this to keep track of all manner of data such as books I've got and those I'm interested in reading. Before I did this, I occasionally bought multiple copies of a book.

SlovoEd is a portable dictionary which allows me to look up words I don't recognize when reading without a print dictionary handy. My Centro takes up a lot less space on my nightstand than a conventional dictionary.

Adobe Reader for Palm allows me to read PDF documents on my PDA. This is handy for reading books in non-traditional settings. It's nice always having a book handy to read for those occasions when I'm unexpectedly left with extra time to kill.

A couple months ago, the time seemed ideal to upgrade my phone and PDA. My wife's phone was acting up and my stepdaughter wanted a cheap PDA. So it made sense to get a device which fulfilled both those functions for me, freeing up my phone and PDA for them. This also had the added benefit of allowing me to pare down the devices I carried from two to just a single gadget.

The Palm Centro is smaller than I expected but the keyboard is surprisingly usable. The software upgrades work to make the smaller sized device more intuitive to use than older Palm devices. Is it perfect? No, but it does seem a better compromise device than the other affordable multi-use devices I've seen.

If you're interested in an affordable combination mobile phone and PDA device, check out this review from Engadget.

Sunday, May 04, 2008

I/O Redirection

One of the most useful features of Linux, Unix, MacOS, and to a lesser extent Windows (more on that later) is the concept of I/O redirection. In this discussion, we'll restrict ourselves to the pipe form of redirection which is invoked by the vertical bar character "|". This tells the command interpreter to take all output from the first part of the command and send (or pipe) it to the second part. The other characters which invoke I/O redirection are the less than "<" and the greater than ">" characters. Those are primarily used to send output to a file or cause the program to take its input from a file.
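Here's a brief sketch of all three characters at work. The scratch file and its contents are made up for the example.

```shell
tmp=$(mktemp)                            # scratch file for the demonstration

printf 'pear\napple\nfig\n' > "$tmp"     # ">" sends the output to a file
sorted=$(sort < "$tmp")                  # "<" makes sort read its input from the file
first=$(sort < "$tmp" | head -n 1)       # "|" pipes sort's output into head

echo "$sorted"
echo "first: $first"                     # prints: first: apple
```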

If you're a long time computer user, you may want to skip ahead to the examples below. You may have heard the term "I/O redirection" before but what does it really mean? I/O redirection gives you the ability to change where a program's input comes from and/or where its output goes. Normal command line programs have both their input, aka stdin (standard input), and output, aka stdout (standard output), directed to the console (which is just shorthand for saying the input comes from your computer's keyboard and the output goes to the portion of the screen where you're running the program). Note that if one of the portions of the command line produces errors, you may be surprised to find the error messages may not get redirected with the pipe command. This is because many Unix/Linux style programs also make use of a third I/O stream called stderr. Without taking special action, stderr output is almost always directed to the console to bring the error condition to the user's attention.
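The "special action" for stderr can be sketched as follows in a Bourne style shell such as bash. The noisy function is a made-up stand-in for any program which writes to both streams.

```shell
# Hypothetical command which writes one line to stdout and one to stderr.
noisy() {
    echo "normal output"
    echo "error output" >&2
}

# A plain pipe carries only stdout; here stderr is discarded with 2>/dev/null.
only_stdout=$(noisy 2>/dev/null | tr 'a-z' 'A-Z')

# 2>&1 merges stderr into stdout so both streams travel down the pipe.
both=$(noisy 2>&1 | tr 'a-z' 'A-Z')

echo "$only_stdout"
echo "$both"
```

If you leave off both redirections, the error line skips the pipe entirely and lands on your screen untouched by tr.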

The simple example

For this example, let's suppose that you've got a huge tar file (aka tarball), which is really an archive file containing many other files. Now suppose you want to look to see whether it contains a text file but you really don't recall the name of the text file. Perhaps you recall something else about the text file such as it was located in the /projects directory. You could always get a listing of all the files within the tar file and manually search through them but computers were created to relieve users of the need to do such labor intensive tasks. How about we put I/O redirection to work?

To start with, we need to obtain a listing of all the files in our tar file which for purposes of illustration we'll call sample.tar. That can be accomplished with the command below.

tar -tvf sample.tar

That gives you a complete listing of all the files but chances are if it's a big tar file, the names and details of the files scrolled off the screen and perhaps overwhelmed even the scroll back buffer of your terminal window (aka command interpreter or shell). In any case, being lazy computer types we don't feel like searching through this huge amount of data.

The first thing we want to do is to weed out all the non-text files. Hopefully we've been disciplined about our file naming conventions and have added a ".txt" file extension to all our text files. So let's show only the files which end with that file extension. We'll use our old friend "grep" to match just the output lines which contain the string ".txt". Note that we're using the "-i" parameter to specify that we want to match the string ignoring the case of the letters in the string. This may be important if not everyone adding files to the tar file was careful about adding a .txt and not a .TXT file extension. Toy OSes like Windows don't make this distinction but you'll find they don't handle I/O redirection properly either. The Windows shell is very simplistic so even if you've added Linux style utilities like "tar" to its repertoire, you may be disappointed to find that it doesn't do multitasking. A proper OS will handle I/O redirection real time so when you type the command below, you'll see output data appear quickly. Windows creates a temporary file containing all output from the first part of the command which it then sends to the second part of the command once the first is completed. It makes the Windows command line feel much slower than it is and believe me it doesn't need much help. If you've ever manipulated large tar files on both Windows and Linux systems, you'll quickly discover that the Windows command line isn't performance oriented by any stretch of the imagination.

tar -tvf sample.tar | grep -i ".txt"

That gives us a listing of just the files which contain the string ".txt" which hopefully only appears as a file extension.
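If "hopefully" isn't good enough, the pattern can be tightened. grep treats an unescaped "." as "any single character", so the pattern above also matches a name like mytxtfile.dat. The sketch below builds a tiny tarball with made-up file names and compares the loose pattern with one which escapes the dot and anchors the match to the end of the line.

```shell
# Build a small sample.tar in a scratch directory (hypothetical file names).
work=$(mktemp -d)
mkdir -p "$work/projects"
touch "$work/projects/notes.txt" "$work/projects/README.TXT" "$work/projects/mytxtfile.dat"
tar -cf "$work/sample.tar" -C "$work" projects

# Loose: "." matches any character, so mytxtfile.dat is counted too.
loose=$(tar -tvf "$work/sample.tar" | grep -ic ".txt")

# Strict: "\." is a literal dot and "$" pins the match to the end of the line.
strict=$(tar -tvf "$work/sample.tar" | grep -ic '\.txt$')

echo "loose: $loose, strict: $strict"
```

The -c option just makes grep print a count of matching lines in place of the lines themselves, which keeps the comparison short.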

Something we notice about the output which makes life a bit tougher is that the tar shows you the file names and other details in the order they were added to the tar file. It would be nice if we could see the files sorted by the directory names in which they appear. Fortunately there's a simple solution to that desire.

tar -tvf sample.tar | grep -i ".txt" | sort

This command does the trick but it also illustrates some odd behavior. The output doesn't appear piecemeal the way it had been doing previously. If we think about it, the reason becomes obvious. The sort command can't really sort correctly unless all input to be sorted is present. So it must wait until the commands up to that point in the command line are complete before starting to sort the output.
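One caveat worth a quick sketch: with the "-v" option each line of the listing starts with the permissions and ownership, so sort compares those before it ever gets to the path. If ordering by directory is all you're after, dropping the "v" gives bare path names to sort. The tarball below is built from made-up file names.

```shell
# Scratch tarball whose members are deliberately added out of order.
work=$(mktemp -d)
mkdir -p "$work/projects/a" "$work/projects/b"
touch "$work/projects/a/one.txt" "$work/projects/b/two.txt"
tar -cf "$work/sample.tar" -C "$work" projects/b/two.txt projects/a/one.txt

# "tar -tf" prints only the path names, so sort orders them by directory.
by_path=$(tar -tf "$work/sample.tar" | grep -i '\.txt$' | sort)
echo "$by_path"
```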

Since we might be fans of the graphical version of the vim (vi improved) editor, we can add another labor saving twist to our command line. We can send the output of our command to gvim. This has the advantage of being able to search the output using editor commands. Doing this in gvim will also cause the search terms to be highlighted within the text making it much easier to pick out from the surrounding text.

tar -tvf sample.tar | grep -i ".txt" | sort | gvim -

Obviously the commands above were simple examples to make explanation easier. I'll add a few slightly more advanced examples below with a brief explanation of what they do. Once you get the hang of it, you'll find you quickly come to rely upon this powerful feature. Most of the Unix/Linux style command line utilities are written so they can be easily combined to create more powerful command lines similar to or more sophisticated than the ones we've been exploring.

A few more advanced examples

The command below uses the find command to search for all files which end with the ".txt" file extension. It then searches them to see which of them contain the string "project". Note the "-l" parameter causes grep to only output the filenames which match the search criteria. If you omit the "-l" you'll see a list of matching lines from within the files. Also note the use of the xargs command which may seem unfamiliar. It's a method of appending multi-line output from previous commands to form arguments for the command specified after xargs.

find . -name "*.txt" | xargs grep -l project
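One thing to watch for: xargs splits its input on whitespace, so a file name containing a space gets chopped into pieces. Both the GNU and BSD versions of find and xargs offer a way around this, sketched below on a scratch directory with made-up file names.

```shell
# Scratch directory containing a file name with a space in it.
work=$(mktemp -d)
printf 'project plan\n' > "$work/my notes.txt"
printf 'nothing here\n' > "$work/other.txt"

# -print0 ends each name with a NUL byte and xargs -0 splits only on NUL,
# so "my notes.txt" reaches grep as a single argument.
matches=$(cd "$work" && find . -name "*.txt" -print0 | xargs -0 grep -l project)
echo "$matches"      # prints: ./my notes.txt
```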

This command does essentially the same thing but sends the list of matching file names to the vim editor. It issues the command to search for the string "project" so that term will be highlighted in the file and the cursor will be placed on the first occurrence within the first file.

find . -name "*.txt" | xargs grep -l project | xargs vim -c /project

Try coming up with ways to use I/O redirection which make your time at the computer easier. You'll be glad you did.