Converting WordPress to static html


UPDATE: Check out the new post on a better way to do this here: Convert WP to Static HTML Part 2. Or see the page devoted to the script here: Make WordPress Static.

Usually people want to convert their static HTML pages to a dynamic content management system. I’ve run into the need to go the other way.

A few professors at GMU love to use WordPress for their classes. It’s a really great way to get more student participation and involve some of those who aren’t so talkative in class.

But these blogs are usually only needed for one semester, and then they just sit there. That is a security risk if they are not kept up to date, and updating many of them is cumbersome (one professor had over 30 blogs!).

Sometimes the content should still be viewable, but the need for a whole CMS-type back end no longer exists. Sometimes the professor would just like a copy of the pages for their own future research or whatever.

So, I figured out a way to convert a dynamic WordPress site into static html pages.

Here are the basic steps I used:

  1. Change the permalink structure in the WordPress admin section. Alternatively, change it directly in the database: in the wp_options table, set option_value for the row whose option_name is permalink_structure (for example to “/%postname%.html”). The statement below sets the date-based structure from the 7.17.09 update; substitute your own database name and table prefix.
    [code lang="SQL"]
    UPDATE database.prefix_options SET option_value = '/%year%/%monthnum%/%day%/%postname%/' WHERE option_name = 'permalink_structure' LIMIT 1;
    [/code]

    UPDATE (2.12.08): Reading a post from Christopher Price (who linked to this post) about WP permalinks, I’m thinking using this structure (/archives/%post_id%.html) might afford the best results. I often found a page that displayed the raw HTML instead of being rendered. This just might fix that issue.

    UPDATE (3.11.08): I did some more dynamic to static conversions today, and found out the best permalink structure to use is just the post name. No extra categories and such. So the best structure to use would be this (/%postname%.html). The benefit is that every page is unique, with a descriptive name for the URL (albeit sometimes a very long one), and fewer subdirectory issues arise.

    UPDATE (7.17.09): This time around, I have found that the following permalink structure seems to work best (/%year%/%monthnum%/%day%/%postname%/). I also cleaned up the SQL statement.

  2. Add the .htaccess to /path/to/wp/ if not already there (where /path/to/wp/ is from http://somedomain.com/path/to/wp/ ). If there already is a .htaccess file and it is set to have permalinks, then you can probably leave it as it is.
    RewriteEngine On
    RewriteBase /path/to/wp/
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /path/to/wp/index.php [L]
  3. Use wget to copy all of the files as static html files.
    [code lang="bash"]wget --mirror --wait=2 -P blogname-static -nH -np -p -k -E --cut-dirs=3 http://sitename.com/path/to/blog/[/code]
    *** Change --cut-dirs to the number of directories that come after the domain name. The trailing slash plays a part too. ***
    UPDATE (03.11.08): I found that --cut-dirs doesn’t really do anything this time around.
    UPDATE (7.17.09): This time around, I find the wget command above to work best, --cut-dirs included.

    This has the bonus of making the directory for you (that is what -P does), thus negating the make directory step. Make sure to use two dashes and not an em dash.

  4. Copy the contents of wp-content to save uploaded files, themes, etc. This approach copies a lot of unnecessary PHP files, which could be dangerous, but it is really easy if you’re just converting to an archive. To remove the security threat, pick and choose only the files you need.
    [code lang="bash"]cp -r /path/to/wp/wp-content/* /path/to/static/wp-content/[/code]
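  5. Sometimes the files are created with folders in the archives folder. To fix this, run the following three commands in the archive folder. To get rid of the feed file in all of the directories:
    [code]rm -f */feed[/code]
    To delete all of the now-empty directories (rmdir refuses to remove directories that still have files in them, so errors for those can be ignored):
    [code]find . -type d -exec rmdir '{}' \;[/code]
    To rename the files ###.1 to ###:
    [code]rename .1 '' $(find . -type f -name "*.1")[/code] That’s two single quotes after the first ‘.1’. This is the util-linux form of rename (rename from to files); the Perl rename found on Debian-based systems takes an expression instead.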
  6. UPDATE (03.11.08): I have found that the old ‘rename’ command (rename .1 '' *.1) only works on the current directory. If you want to rename recursively, you have to use the ‘find’ command. The code above has been changed to reflect this.
    UPDATE (7.14.09): When the rename with find doesn’t work, it’s probably because the post has comments, so there is a folder with the same name as the post’s filename. In this case, just move the file (with the .1 extension) into the folder of the same name, but change the name of the file to index.html

  7. Move to the wp folder. Make a backup of the database: [code lang="bash"]mysqldump -u [userfromwp-config.php] -p --opt databasename > databasename.sql[/code]
    UPDATE (03.11.08): I found I needed to back up just a few tables from a database that contained many copies of WordPress. To do this more easily, I used a little script I wrote earlier to dump tables with a common prefix. This could also work if you just put in the full names of only the tables you wanted to back up.
  8. Move one directory above the WordPress install. Make a tar backup of the old wordpress folder: [code lang="bash"]tar -cf wordpress.tar wordpress/[/code]
  9. Rename the old wordpress folder: [code]mv wordpress wordpress-old[/code]
  10. Move the static copy into place: [code]mv static/wordpress/ wordpress/[/code]
  11. Test out the site. If it’s totally broken, just delete the wordpress directory and restore the original from the tar file.
  12. Remove the tar file and the wordpress-old directory as needed.
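
As a rough sketch of two of the fiddlier steps above (choosing the number for --cut-dirs in step 3, and cleaning up the “.1” files in step 5), here are two small bash helpers. They are not part of the original process: the function names count_cut_dirs and fix_dot1_files are made up for illustration, and the code assumes bash plus a find that supports -print0 and an mv that supports -n (both GNU and BSD versions do). Try it on a copy of the mirrored files first.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative only.

# Print the number of directories that follow the domain name in a URL,
# i.e. a candidate value for wget's --cut-dirs. Without a trailing slash,
# the last path component is treated as a file name and is not counted.
count_cut_dirs() {
    local url=$1 path dirs segment count=0
    path=${url#*://}            # drop the scheme, if any
    path=${path%%[?#]*}         # drop any query string or fragment
    case $path in
        */*) path=${path#*/} ;; # drop the host name
        *)   path= ;;           # bare host name: nothing after it
    esac
    case $path in
        */*) dirs=${path%/*} ;; # everything before the last slash is directories
        *)   dirs= ;;           # only a file name (or nothing) is left
    esac
    while [ -n "$dirs" ]; do
        segment=${dirs%%/*}
        if [ -n "$segment" ]; then
            count=$((count + 1))
        fi
        case $dirs in
            */*) dirs=${dirs#*/} ;;
            *)   dirs= ;;
        esac
    done
    printf '%s\n' "$count"
}

# Recursively handle the NAME.1 files under a directory (default: current).
# NAME.1 becomes NAME, unless a directory called NAME already exists
# (a post with comments), in which case it becomes NAME/index.html.
# mv -n never overwrites, so a clashing file leaves NAME.1 untouched.
fix_dot1_files() {
    local root=${1:-.} file target
    find "$root" -type f -name '*.1' -print0 |
    while IFS= read -r -d '' file; do
        target=${file%.1}
        if [ -d "$target" ]; then
            mv -n -- "$file" "$target/index.html"
        else
            mv -n -- "$file" "$target"
        fi
    done
}
```

For the example URL in step 3, count_cut_dirs http://sitename.com/path/to/blog/ prints 3, and the same URL without the trailing slash gives 2, which is one way to see how the trailing slash plays a part. Running fix_dot1_files blogname-static covers both the plain rename and the index.html case from the 7.14.09 update, without depending on which version of rename is installed.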

35 thoughts on “Converting WordPress to static html”

  1. Pingback: WordPress, Canocality, Permalinks, and .html… The Saga Continues | Christopher Price .net

  2. Pingback: WordPress, Canocality, and html… The Saga Continues | Christopher Price .net

  3. Pingback: WordPress, Canocality, and html: The Saga Continues | Christopher Price .net

  4. Pingback: WordPress, Canonically, and html: The Saga Continues | Christopher Price .net

  5. wogahnct

    This is great and I followed it with my WordPress installation. One question though, the links all refer to the posts by postnumber (ie 40) not postnumber.html (ie 40.html) so my browser simply displays it as text when you click on the link. I can rename the files to postnumber.html and that’s great, but then all the links in the web pages are broken.

    Have you seen this before?

    Thanks!

    Cheryl

  6. ammon

    Cheryl,

    I have seen it a number of times, but I didn’t really investigate the issue more fully. Usually the page would display properly even without the .html.

    I think all of the comments worked, even though they showed up with the PHP GET information.

    Another option might be to change the .htaccess settings to show names instead of numbers, or even some other way.

    Good luck, and I’m glad it proved helpful.

  7. wogahnct

    Thanks. I changed the permalinks entry to be /archives/%post_id%.html and ran wget with the --html-extension option.

    I did have to use Dreamweaver to edit some of the files, but it worked out great for me!

    Also, one thing to let people know… If they have installed widgets like ‘Recent Comments’, things that create links with php-type variables tacked on the end, remove the widgets BEFORE you use WGET. When wget flattens the site, these links still come through and are unusable without the php behind them.

    Cheryl

  8. ammon

    I added an update to the steps:

    UPDATE (2.12.08): Reading a post from Christopher Price (who linked to this post) about WP permalinks, I’m thinking using this structure might afford the best results. I often found a page that displayed the raw HTML instead of being rendered. This just might fix that issue.

    Which will probably help with Cheryl’s questions.

  9. Matthew Leingang

    Nice writeup! Definitely helped as I was trying to do this very thing. A couple of additions:

    1. I don’t have mod_rewrite on my server so I made the permalink structure “/index.php/%postname%.html”

    5. I added the -k (correct links) option for wget to make sure the links to pages that were downloaded pointed to the right place.

    7. I removed the feed directories but I didn’t have a problem with other directories. So I left them in their place.

    11. If you have other content in your site besides the WordPress stuff, use cp -r instead of mv to avoid clobbering the rest.

  10. ammon

    Another update:
    UPDATE (7.14.09): When the rename with find doesn’t work, it’s probably because the post has comments, so there is a folder with the same name as the post’s filename. In this case, just move the file (with the .1 extension) into the folder of the same name, but change the name of the file to index.html

  11. Aaron

    Wow.

    That’s EXACTLY What I was looking for.

    My site is unfortunately running on IIS 6 so I’ll have to make some adjustments for it to work (We have only spotty support for ReWrite, using an ISAPI module).

    If I make any modifications, I’ll post here about them, and put them up online somewhere.

    Cheers!

  12. Aaron

    I ended up using HTTracker instead of wget, and just left the “/index.php/” in the URL re-writing. I did use a .php extension rather than making it a directory, though. Leaving it as a directory would have probably created a crazy directory structure. 🙂

    works great though! http://www.iue.edu/blogs/archives/

  13. Dale

    I have a wordpress theme that I really like and would prefer it in HTML. Is anyone willing to do this for me? Of course I would be willing to pay.

  14. Pingback: Convert WP to static HTML – part 2

  15. Pingback: WordPress, Canonical, and html: The Saga Continues | Christopher Price .net

  16. Pingback: Daily Digest — March 26th, 2011 — Amys Welt

  17. Shravan

    Excellent article mate… my site speed is kinda increased… but the caveat is that I can’t add new posts like I used to occasionally. Is there any workaround?

    1. ammon Post author

      Yeah, that’s kind of the point of turning your WordPress into a static page. The process is not so much for speed increase, but for security. If the site is not active anymore, then I turn it into plain HTML pages and it can’t be hacked.

  18. Pingback: Make Static: Archiving, Accelerating and Securing Websites, or Converting Dynamic Sites to Static Pages | rejon is Jon Phillips.

  19. Gaianna

    There’s a really simple way to do it without having to hack. If you have the theme Magazine basic I know it works. Go to Magazine Basic under appearance. Hit the Front Page tab. under number of posts insert 1. Under excerpt or content, select content. For some reason this sets up the front page and the rest of the pages as a static site ( : Hope it helps!

  20. ahmed

    I am currently using WP only because of download and paid plugins, no more. WP is actually crap when it comes to SEO compared to static HTML pages. I want to convert all the posts to static HTML pages and still use the plugins. How possible is that?

  21. lenny19

    My website has 6,700 posts, which were made while the WordPress permalink was on the default setting. When I try to change the permalink to post-name and use the .htaccess rules mentioned above, it won’t change my permalink structure. In fact, the links are all broken unless I switch back to the default settings.
    I’m presently trying to convert this blog by running it on xampp. Thanks for any idea how to proceed!

  22. Pingback: Was your site part of the 162,000 strong DDOS? | News WP
