Converting WordPress to static html

UPDATE: Check out the new post on a better way to do this here: Convert WP to Static HTML Part 2. Or see the page devoted to the script here: Make WordPress Static.

Usually people want to convert their static html pages to some dynamic content management system. I’ve run into the issue of needing to go the other way.

A few professors at GMU love to use WordPress for their classes. It’s a really great way to get more student participation and involve some of those who aren’t so talkative in class.

But these blogs are usually only needed for one semester, and then they just sit there. This is a security risk if they are not kept up to date, and updating many of them is cumbersome (one professor had over 30 blogs!).

Sometimes the content should still be viewable, but the need for a whole CMS-style back-end no longer exists. Sometimes the professor would just like a copy of the pages for their own future research or whatever.

So, I figured out a way to convert a dynamic WordPress site into static html pages.

Here are the basic steps I used:

  1. Change the permalink structure in the WordPress admin section. Alternatively, change it directly in the database by setting option_value in the wp_options row where option_name is ‘permalink_structure’ (see the updates below on which structure works best):
    [code lang="SQL"]
    UPDATE `database`.`prefix_options` SET `option_value` = '/%year%/%monthnum%/%day%/%postname%/' WHERE `prefix_options`.`option_name` = 'permalink_structure' LIMIT 1;
    [/code]

    UPDATE (2.12.08): Reading a post from Christopher Price (who linked to this post) about WP permalinks, I’m thinking using this structure (/archives/%post_id%.html) might afford the best results. I often found a page that displayed the raw HTML instead of being rendered. This just might fix that issue.

    UPDATE (3.11.08): I did some more dynamic-to-static conversions today, and found that the best permalink structure to use is just the post name, with no extra categories and such: /%postname%.html. The benefit is that every page gets a unique, descriptive name for the url (albeit sometimes very long), and there are not as many subdirectory issues.

    UPDATE (7.17.09): This time around, I have found that the following permalink structure seems to work best: /%year%/%monthnum%/%day%/%postname%/. I also cleaned up the SQL statement above.

  2. Add a .htaccess file to /path/to/wp/ if one isn’t already there (where /path/to/wp/ comes from http://somedomain.com/path/to/wp/ ). If a .htaccess file already exists and is set up for permalinks, you can probably leave it as it is.
    [code]
    RewriteEngine On
    RewriteBase /path/to/wp/
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /path/to/wp/index.php [L]
    [/code]
  3. Use wget to copy all of the files as static html files.
    [code lang="bash"]wget --mirror --wait=2 -P blogname-static -nH -np -p -k -E --cut-dirs=3 http://sitename.com/path/to/blog/[/code]
    *** Change --cut-dirs to the appropriate number of directories after the domain name. The trailing slash plays a part too. ***
    UPDATE (03.11.08): I found that the --cut-dirs doesn’t really do anything this time around.
    UPDATE (7.17.09): This time around, I find the following to work best, even the --cut-dirs.

    [code lang="bash"]wget --mirror -P wpsite-static --cut-dirs=3 -nH -p -k -E https://site.com/path/to/wp/[/code]

    This has the bonus of making the directory for you, thus negating the make directory step. Make sure to use two dashes and not an em dash.

  4. Copy the contents of wp-content to save uploaded files, themes, etc. This approach copies a lot of unnecessary php files, which could be potentially dangerous, but it is really easy if you’re just converting to an archive. To remove the security threat, just pick and choose the files you need (see the sketch after this list for one way to strip the php files).
    [code lang="bash"]cp -r /path/to/wp/wp-content/* /path/to/static/wp-content/[/code]
  5. Sometimes the files are created as folders inside the archives folder. Run the following three commands in the archive folder to clean that up. To get rid of the feed file in all of the directories:
    [code]rm -f */feed [/code]
    To delete all of the now empty directories:
    [code]find . -type d -exec rmdir '{}' \;[/code]
    To rename the files ###.1 to ###:
    [code]rename .1 '' `find . -type f -name "*.1"`[/code] That’s two single quotes after the first ‘.1’.
  6. UPDATE (03.11.08): I have found that the old ‘rename’ command [rename .1 '' *.1] only works on the current directory. If you want to do a recursive renaming you have to use the ‘find’ command. The above code has changed to reflect this.
    UPDATE (7.14.09): When the rename with find doesn’t work, it’s probably because the post has comments, so there is a folder with the same name as the post’s filename. In this case, just move the file (with the .1 extension) into the folder of the same name, renaming it to index.html (see the sketch after this list).

  7. Move to the wp folder and make a backup of the database: [code lang="bash"]mysqldump -u [user from wp-config.php] -p --opt databasename > databasename.sql[/code]
    UPDATE (03.11.08): I found I needed to back up just a few tables from a database that contained many copies of wordpress. To do this more easily, I used a little script I wrote earlier to dump tables with a common prefix (see the Tabledump script below). This could also work if you just put in the full names of only the tables you want to back up.
  8. Move one directory above the wp install and make a tar backup of the old wordpress folder: [code lang="bash"]tar -cf wordpress.tar wordpress/[/code]
  9. Rename the old wordpress folder: [code]mv wordpress wordpress-old[/code]
  10. Move the static copy into place: [code]mv static/wordpress/ wordpress/[/code]
  11. Test out the site. If it’s totally broken, just delete the wordpress directory and restore the original from the tar file.
  12. Remove the tar file and wordpress-old directory as needed.
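
As mentioned in steps 4 and 6, here is a minimal sketch of that cleanup, run from inside the static archive. It’s just the idea, not a polished script: /path/to/static is a hypothetical path, and the -delete flag assumes GNU find.

[code lang="bash"]
# Step 4: strip the php files out of the archived wp-content copy
find /path/to/static/wp-content -type f -name '*.php' -delete

# Step 6: for posts with comments, move each ###.1 file into the
# folder of the same name, renamed to index.html
for f in *.1; do
    dir="${f%.1}"        # strip the .1 extension to get the folder name
    if [ -d "$dir" ]; then
        mv "$f" "$dir/index.html"
    fi
done
[/code]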

Tabledump

I had the need once again to dump only certain tables from a database, instead of all 100+ tables. This was a database with about 5-8 wordpress installs, and I wanted to back up all of the tables for only one install. There is a way with mysqldump to do this, by listing out all of the tables you want to dump, so I just wrote a bash script to take care of making the list of tables to dump.

The script prompts for the database name, table prefix, and mysql user/pass. It then pulls the list of tables from the database into an array and keeps only the ones that start with the given prefix. It could be changed to prompt for a file that contains a list or array of table names.

Anyhow, here it is for anyone’s use:

[code lang="Bash"]
#!/bin/bash

#-----------------------------------------#
# Ammon Shepherd                          #
# 09.05.07                                #
# Dump a database with only the tables    #
# containing the prefix given.            #
#-----------------------------------------#

echo "This will dump just the tables with the specified prefix from the specified database."

echo -n "Enter the database name: "
read dbase

echo -n "Enter the table prefix: "
read prefix

echo -n "The mysql user: "
read sqluser
echo -n "The mysql pass: "
read -s sqlpass

# Get the list of all tables in the database
list=( $(mysql -u$sqluser -p$sqlpass $dbase --raw --silent --silent --execute="SHOW TABLES;") )

# Keep only the tables whose names start with the given prefix
for tablename in ${list[@]}
do
    if [[ "$tablename" =~ ^$prefix ]]; then
        tablelist+="$tablename "
    fi
done

# Dump just those tables into one file
mysqldump -u$sqluser -p$sqlpass --opt $dbase $tablelist > $dbase.$prefix.bak.sql

echo

exit 0

[/code]
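
For example, a session might look something like this (the database name, prefix, and mysql user here are hypothetical, with the script saved as tabledump.sh):

[code]
$ ./tabledump.sh
This will dump just the tables with the specified prefix from the specified database.
Enter the database name: blogs
Enter the table prefix: wp7_
The mysql user: backupuser
The mysql pass:

$ ls
blogs.wp7_.bak.sql  tabledump.sh
[/code]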

WordPress updater

What’s with these multiple posts in a day…

Today marks the completion of my Multiple WordPress Updater Script. I’ve already posted a bunch about school stuff, might as well post about work stuff too.

We host over 55 blogs at CHNM. It’s up to me to update them when security patches or new versions come out. Doing them each by hand is a pain. I did a bit of searching but didn’t find anything that would help me update so many sites automatically. So I wrote a bash script that will do it for me. It reads a file that lists all of the wordpress install paths (or prompts you for the path to one), prompts for the version to switch to, and asks for a mysql user/pass that has permissions for all databases.

Then the script creates a copy of the database, makes a copy of the wp-content folder, updates the wordpress install using subversion, fixes some permissions, and saves the subversion output to a file in your home directory (which I’ll probably change to somewhere else).

I run this via sudo as root for easy updating. What I’m really pleased with is that I figured out how to get the script to pull the database name from the wp-config.php file, and grab the owner and group for later fixing of the permissions.
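
Here is a minimal sketch of that core idea (not the actual updater script; see the download link below): the paths and version are hypothetical, $sqluser and $sqlpass are assumed to have been gathered earlier, the stat flags assume GNU coreutils, and the subversion URL is the public WordPress core repository.

[code lang="bash"]
wp=/path/to/wp     # hypothetical install path
version=2.5        # hypothetical version to switch to

# Pull the database name out of wp-config.php, which has a
# line like: define('DB_NAME', 'mydatabase');
dbname=$(grep "DB_NAME" $wp/wp-config.php | cut -d "'" -f 4)

# Grab the owner and group of the install for fixing permissions later
owner=$(stat -c '%U' $wp)
group=$(stat -c '%G' $wp)

# Back up the database and the wp-content folder
mysqldump -u$sqluser -p$sqlpass --opt $dbname > $dbname.bak.sql
cp -r $wp/wp-content $wp/../wp-content-bak

# Update the install via subversion, then put the ownership back
svn switch http://core.svn.wordpress.org/tags/$version/ $wp
chown -R $owner:$group $wp
[/code]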

I hope it can be useful to someone. If you have any comments or suggestions, let me know.

Latest Version: 1.2.3 – 04/29/08
Download file

UPDATE 24.4.08: WordPress Updater has been updated. I also updated this post, took out the code in the post, and put up a link to the file for you to download instead.

History of special effects.

Who doesn’t love a good special effects movie? Of course, when you can’t tell that there are special effects, that’s when you know it’s a good movie.

I stumbled upon this article at AmericanHeritage.com, which describes the beginnings of Industrial Light & Magic, George Lucas’ personal special effects company, makers of all cool films (especially Star Wars). It also describes a sort of paradigm shift in the film industry.

Personally, these types of effects are my favorite: using real things in innovative ways. I think it’s unfortunate, in a way, that so many of the stunts and effects are digital now. I like the good, old-fashioned effects where objects are real, made from real things, like the mother ship in “Close Encounters of the Third Kind” (the movie I haven’t seen, but the ship I have).

[Image: the Close Encounters mother ship]

Anyhow, it was a good article.

And, just as a side note: I always feared losing these web articles, until now. I use Zotero, which allows me to store, sort, tag, and view web pages, books, and all sorts of stuff. I’ll be using it to collect data for my research projects this year. It’s also made by the good guys at the Center for History and New Media, where I work. 🙂 – Shameless plug!

Nazi board games

Another rare double-day post.

I heard this on a PRI show “The World”. From the site: “The World’s Clark Boyd tells us about an auction taking place tomorrow in Britain. Some of the items up for bid are children’s board games made in Nazi Germany.”

The seller has to sell them in Britain because Nazi memorabilia is illegal in Germany.

Here’s a link to the show, complete with mp3 for your listening pleasure.

A different history of computers and Linux

Wow, two posts in a day…

I just skimmed through this interview with Con Kolivas, a major Linux kernel developer who has quit the Linux development world in frustration. What caught my attention was his ‘history’ of computers. His recollection of computing history is truly different from anything I had ever learned or thought of. Basically, he paints the picture that computers could have been extremely different if the hardware had ruled instead of the software. While computers were in their nascent state, the hardware being developed was ever changing, and each computer company tried new and different ideas. Then a software operating system came out that changed all that. By becoming the default OS, it removed the need to create better, different, new hardware. Instead, all of the hardware was built and developed to suit the software.

It makes one think, what would computers be like if hardware ruled? What would they look like, how would they perform, how would they work, if they were not limited to one operating system?

Timeplot and Exhibit

The folks over at MIT’s SIMILE have two new projects that are just mindbogglingly awesome. I used the Timeline Project for my research project a few semesters ago about World War II.

Now they have a new timeline-type tool called Timeplot. This project uses a plot graph to display numerical data along with historical events, sort of a mix between numbers and dates. Analytic history, if you will. I love the simplicity of the look, the ease of use, and the way it merges cold, dead data with live historical events. When I see graphs of data, I have always wondered what historical events were happening at the time. This is an awesome tool that makes that possible.

The other project, Exhibit, is a digital historian’s dream. Do you have lots of spreadsheets of info, perhaps all your dissertation data stored in old JSON files? Wondering how to show that on the web without creating an extensive database solution? Exhibit takes care of it for you. And it’s dynamic! Sorting and searching the data come automatically included. Crazy goodness! Now I just have to think of where to use this too….

The history of a software application

While doing some work today I stumbled upon a bit of history. The author calls it a story, but that’s what history is, right?

Anyhow, it’s a very entertaining and enthralling look at the birth and death (or retirement) of a software program. It provides a lot of insight into the process of making an application and building a software company, and some intriguing behind-the-scenes information about the coming forth of major Apple products like iTunes, iPhoto, and the iPod.

Read “The True Story of Audion,” about the application that could have been iTunes. Link: http://www.panic.com/extras/audionstory/