Monthly Archives: March 2007

Hard Drive Died

Well, the hard drive in my MacBook Pro died yesterday. I started noticing drive read/write/seek errors about a week ago while working in Linux, and yesterday it finally kicked the bucket. I have the AppleCare Protection Plan, so I am sending it in. According to Apple, it should take a week or two to get replaced. Damn.

Update:
I received my MacBook Pro back in 3 days. The service I received from Apple was amazing. I shipped it on 3/22/07, it was fixed and shipped back to me on 3/23/07, and it was delivered via DHL on Saturday 3/24/07 at 9:00am PST. I would recommend the AppleCare Protection Plan for this quality of service alone.

Comment Spam

Within a week of switching to WordPress for my blogging software, I started receiving a lot of comment spam. I found this amazing because I have had a blog for a few years now without any problems. I have had the occasional spam comment, but lately I have been receiving 3-7 of them a day. I know this is very little compared to high-volume sites, but it seems like a lot for a small site like mine. For the most part, the Akismet spam plugin that ships with WordPress does an amazing job. It has let a few slip by, but that is no big deal.

This whole comment spam problem reminded me of a research paper I read a year or so ago. It was called Defending Against an Internet-based Attack on the Physical World. It was about the threat of using APIs such as Google's SOAP API to automate filling out request forms for catalogues and other material on thousands of sites, all addressed to a single victim. This would flood the victim's physical mailbox and make it very hard to manage. Imagine hundreds or thousands of pieces of mail being delivered to your house every day. My point is that I figure spammers are using a similar technique to find WordPress blogs and then spam them automatically.

I decided to see how easy it was. First I went to see if I could sign up for Google's SOAP API, but I found out that they no longer offer this service. Without it, this is going to be a lot harder to get done. Ignoring the whole API problem, I decided to come up with a search string to find comment pages on WordPress blogs. I was amazed at how easy this was. I just went to a blog using the default WordPress theme and looked for keywords that would always be there. After about a second I came up with this search string:

"Leave a Reply" Name Mail Website "proudly powered by WordPress"

Typing this into Google found over 1,000,000 pages! Clicking a few of these verified that they were in fact WordPress comment pages. Now I needed to write a program to automate parsing these links. Without the search API, I was stuck fetching and parsing the result pages myself. After about an hour I came up with this Python script. The script submits the search string above to Google, parses the first 100 results from the page, then submits a search for the next 100, and so on. While testing this script I noticed Google started blocking my searches, which is a good thing. I found a way around this by using different User-Agent strings and adding some timeouts. Because of this, the script defaults to saving only the first 100 links. I have left out the code to fill out the comment forms because I feel that piece of code would do more harm than good.
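The "parse the results from the page" step is the generic part, and it can be sketched with nothing but the standard library. This is not the code from wp-link-finder.py, just a minimal illustration that assumes you already have a page's HTML as a string. The names LinkCollector and extract_links are made up for this example, and it does no fetching at all.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects absolute http(s) links from <a href="..."> tags.

    Hypothetical helper for illustration; not taken from the original script.
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Keep only absolute links; relative ones point back at the same site.
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.links.append(value)


def extract_links(html):
    """Return all absolute links found in an HTML string, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links
```

Feeding it a saved page returns a plain list of URLs, which could then be written to a file one per line.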

Anyways, I think there is a huge problem with comment spam that needs to be fixed. The fact that so many pages can be found in a single search is amazing. Google blocking queries when it detects a bot is definitely a step in the right direction. The fact that I was able to get around this so easily is not.

Files:
http://www.mattweber.org/files/wp-link-finder.py

Python script: rename.py

I like to have my music, movie, and picture files named a certain way. When I download files from the internet, they usually don't follow my naming convention. I found myself manually renaming each file to fit my style. This got old really fast, so I decided to write a program to do it for me.

This program can convert the filename to all lowercase, replace strings in the filename with whatever you want, and trim any number of characters from the front or back of the filename. Here is the usage output:

usage: rename.py [options] file1 ... fileN

options:
  -h, --help            show this help message and exit
  -v, --verbose         Use verbose output
  -l, --lowercase       Convert the filename to lowercase
  -fNUM, --trim-front=NUM
                        Trims NUM of characters from the front of the filename
  -bNUM, --trim-back=NUM
                        Trims NUM of characters from the back of the filename
  -rOLDVAL NEWVAL, --replace=OLDVAL NEWVAL
                        Replaces OLDVAL with NEWVAL in the filename
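
An option layout like the one above could be declared with Python's optparse module roughly as follows. This is only a sketch and not the source of rename.py; the function name build_parser and the exact defaults are assumptions. The interesting part is the replace option, which takes two values and can be given more than once.

```python
from optparse import OptionParser


def build_parser():
    """Build a parser mirroring the usage text (hypothetical reconstruction)."""
    parser = OptionParser(usage="usage: %prog [options] file1 ... fileN")
    parser.add_option("-v", "--verbose", action="store_true", default=False,
                      help="Use verbose output")
    parser.add_option("-l", "--lowercase", action="store_true", default=False,
                      help="Convert the filename to lowercase")
    parser.add_option("-f", "--trim-front", type="int", default=0, metavar="NUM",
                      help="Trims NUM of characters from the front of the filename")
    parser.add_option("-b", "--trim-back", type="int", default=0, metavar="NUM",
                      help="Trims NUM of characters from the back of the filename")
    # nargs=2 with action="append" collects each OLDVAL/NEWVAL pair as a tuple.
    parser.add_option("-r", "--replace", action="append", nargs=2,
                      metavar="OLDVAL NEWVAL",
                      help="Replaces OLDVAL with NEWVAL in the filename")
    return parser
```

Calling build_parser().parse_args() returns the parsed options plus the list of remaining file names. With nargs=2, optparse always consumes the next two arguments, so a value such as "-group" that starts with a dash is still read as a value.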

Here are a few examples of what this program can do.

]$ ls -l
total 0
-rw-r--r--   1 matt  matt  0 Mar  4 14:03 01-BandName_-_SongName-group.mp3
-rw-r--r--   1 matt  matt  0 Mar  4 14:03 02-BandName_-_SongName2-group.mp3
-rw-r--r--   1 matt  matt  0 Mar  4 14:03 03-BandName_-_SongName3-group.mp3
]$ rename.py -f3 -r "_-_" "-" -r "-group" "" *.mp3
]$ ls -l
total 0
-rw-r--r--   1 matt  matt  0 Mar  4 14:03 BandName-SongName.mp3
-rw-r--r--   1 matt  matt  0 Mar  4 14:03 BandName-SongName2.mp3
-rw-r--r--   1 matt  matt  0 Mar  4 14:03 BandName-SongName3.mp3
]$ rename.py --replace="Band" "" -lv *.mp3
BandName-SongName.mp3 -> name-songname.mp3
BandName-SongName2.mp3 -> name-songname2.mp3
BandName-SongName3.mp3 -> name-songname3.mp3
]$ ls -l
total 0
-rw-r--r--   1 matt  matt  0 Mar  4 14:03 name-songname.mp3
-rw-r--r--   1 matt  matt  0 Mar  4 14:03 name-songname2.mp3
-rw-r--r--   1 matt  matt  0 Mar  4 14:03 name-songname3.mp3
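
The renaming logic itself boils down to a few string operations. Here is a minimal sketch of how the new name could be computed, not the actual code from rename.py. The function name new_name, the trim-then-replace-then-lowercase order, and applying the trims to the whole filename (extension included) are all assumptions. The examples above only show that replacement has to happen before lowercasing, otherwise "Band" would never match.

```python
def new_name(filename, trim_front=0, trim_back=0, replacements=(), lowercase=False):
    """Compute a new filename (hypothetical helper, not from rename.py).

    Assumed order: trim, then replace, then lowercase.
    """
    name = filename[trim_front:]
    if trim_back:
        # Guarded because name[:-0] would return an empty string.
        name = name[:-trim_back]
    for old, new in replacements:
        name = name.replace(old, new)
    if lowercase:
        name = name.lower()
    return name


if __name__ == "__main__":
    print(new_name("01-BandName_-_SongName-group.mp3", trim_front=3,
                   replacements=[("_-_", "-"), ("-group", "")]))
```

The actual rename would then be a call to os.rename(old, new) for each file where the name changed.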

Files:
http://www.mattweber.org/files/rename.py

Dual Boot MacBook Pro

Well, I decided to dual boot my new Core2Duo MacBook Pro with Linux and OS X. For the most part I followed the excellent OnMac.net wiki article, but I did add a few extra steps and ran into a few problems along the way.