Convert WP to static HTML – part 2

This is a follow-up to a previous post.

I’ve been converting more blogs to static HTML files, and things went differently enough this time that I wrote up a new how-to. Here are the steps I’ve been using to convert blogs that use the default Kubrick theme.

  1. Update the permalink structure for the site so that it uses the year, month, day, postname structure.
    UPDATE `database`.`prefix_options` SET `option_value` = '/%year%/%monthnum%/%day%/%postname%/' WHERE `prefix_options`.`option_name` = 'permalink_structure' LIMIT 1 ;
  2. Make sure the blog does not block search engines. If it does, wget honors the block and downloads only the index.html file, which took me a while to figure out. So if a recursive wget fetches only index.html, check your robots.txt or similar settings. Change the setting either in the admin section (under Settings->Privacy) or via SQL.
    UPDATE `database`.`prefix_options` SET `option_value` = '1' WHERE `prefix_options`.`option_name` = 'blog_public' LIMIT 1 ;
  3. Add the .htaccess file if it is not already there. In the rules below, the path
    /path/to/wordpress/blog/

    starts at the URL root, not the absolute file path. So for http://sitename.com/path/to/wordpress/blog/, the .htaccess file below goes in the ‘blog’ directory.

    RewriteEngine On
    RewriteBase /path/to/wordpress/blog/
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /path/to/wordpress/blog/index.php [L]
  4. Remove the meta links with the sidebar widget in the admin, or delete the appropriate lines from the theme files (for the default Kubrick theme, edit comments.php, sidebar.php, single.php, and footer.php), or see the last step. Delete the code that outputs the search form, comments, trackbacks, RSS links, and anything in the footer you want gone.
  5. When all is good, run wget to grab the files.
    wget --mirror -P blog-static -nH -np -p -k -E --cut-dirs=5 http://sitename.com/blog/
  6. Rename the blog directory.
    mv blog blog-old
  7. Rename the static directory to be live.
    mv blog-static blog
  8. Copy the images directory from the old theme to the appropriate static directory.
    cp -r blog-old/wordpress/wp-content/themes/default/images/ blog/wordpress/wp-content/themes/default/
  9. Alternatively, to get rid of unwanted links and the like after the fact, use the find command to locate all HTML files, then use perl to delete the lines. Don’t forget to escape forward slashes in the search pattern. Unfortunately, this method has to be repeated for every line of code you want to delete, so it’s much better to delete the lines from the theme files. The code below has an unnecessary space after each ‘<’ in the H3 tags so that it renders properly on this page; remove the spaces before running it.
    find . -name \*.html | xargs perl -ni -e 'print unless /< h3>Leave a Reply< \/h3>/'

    If you want to search and replace instead of remove, this find and perl one-liner replaces text in all HTML files.

    find . -name '*.html' | xargs perl -p -i -e "s/search text here/replace text there/g"

    The above searches all HTML files for the phrase “search text here” and replaces it with “replace text there”. You can substitute whatever you want in those two places. A ‘/’ (forward slash) character needs to be escaped with a ‘\’ (backslash) character. Perl uses standard regular expression syntax, so look that up if you need help formulating a search and replace pattern.

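Before swapping directories in steps 6 and 7, it can help to confirm that wget fetched more than the front page, since the failure described in step 2 leaves only index.html behind. This is a minimal sketch, assuming the mirror sits in a directory such as blog-static from step 5; the function name `mirror_looks_complete` and the demo directories are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original steps.
# Succeeds only if the directory holds more than one .html file.
mirror_looks_complete() {
    # $(( ... )) strips the padding some wc implementations add.
    count=$(( $(find "$1" -type f -name '*.html' | wc -l) ))
    [ "$count" -gt 1 ]
}

# Demo with throwaway directories standing in for blog-static.
good=$(mktemp -d)
bad=$(mktemp -d)
mkdir -p "$good/2010/01/02/hello-world"
touch "$good/index.html" "$good/2010/01/02/hello-world/index.html"
touch "$bad/index.html"

if mirror_looks_complete "$good"; then good_status=complete; else good_status=incomplete; fi
if mirror_looks_complete "$bad"; then bad_status=complete; else bad_status=incomplete; fi
echo "good: $good_status, bad: $bad_status"
rm -rf "$good" "$bad"
```

In practice the check could gate the rename, along the lines of `mirror_looks_complete blog-static && mv blog blog-old && mv blog-static blog`.
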
21 thoughts on “Convert WP to static HTML – part 2”

  1. Nice write-up, Ammon, thanks! Does this process also take care of URL redirects to the new .html?

    Also, why do you have to use the default theme? Can you do this so it uses the theme the blog had set prior to converting to HTML? I noticed when you did this to my Hist 120 class blog a few years ago, it reverted to the default theme instead of my custom class feed, which had some additional content that wasn’t in the database. Not complaining about that, just curious! 🙂

    1. Yeah, the process takes care of the URL redirects to the new .html pages too. There’s usually no clean up of the files after the wget process.

      Most of the blogs I’ve done so far used the default theme already, so I just stuck with that. It should work out fine with custom themes as well, since it just grabs what’s there (style sheets, images and all).

      Sorry about the missing content. Your blog unfortunately suffered from being one of the guinea pigs. I think I finally have this process down pretty good now.

  2. hello,

    First I want to thank you for writing the script above for everyone; it is so helpful, especially for beginners like me.
    Sadly, I don’t know how to run the script after uploading it to where wp-config.php is. Do I run it with MySQL, or using something else? Thanks!

    1. Hi tommy,

      Once you have the script on your server, you can run it from the command line. You’ll need a terminal (like PuTTY, if you’re using Windows). Make sure the permissions on the script give your user execute permission (it would need to be 766, shown as rwxrw-rw- if you do a ‘ls -lh’ to view the permissions). Then execute the script like so: ‘./wpstatic’.

      Hope that helps.

  3. Hello,
    Thanks much for the information here. I was planning to use a plugin, but it seems your process is more straightforward.
    Is there any way these steps could be applied to a multi-site installation?
    Thanks,
    Mark

  4. Hello.

    DESPERATE TO GO STATIC ASAP:

    The WordPress built-in editor randomly wipes out perfectly good code that had worked for several days. Suddenly, all corrupted. (Alternating blue-white table cells, alignment, all gone! — and other things.)

    I need to get out of WordPress. Is there any way to convert easily to a static site without losing all the stuff in my plugins?

    My pages are mostly empty or draft, but I don’t want to lose my sidebar & menu, and my video page image grid.

    Could you suggest HOW LONG it would take someone experienced to convert this to STATIC? And how much I should expect to pay for this – lowest end possible? I am very very limited in budget, which is why I was doing it all myself to start with. Thanks.

    1. Audentes,

      Sorry for the troubles. I would expect it would only take a half-hour or so for someone to run the script or commands to convert the WP install to static. They would need to make a copy of the database as well, if you wanted to save any of the plugin information. I’m out of the loop on how much to charge for contract work, so I can’t be much help there.

  5. Everything could be done in an easier way:
    by using the plugin “wordpress.org/extend/plugins/static-html-output-plugin/”
    or by using “httrack”

  6. Quick heads up: you can have wget ignore robots.txt with the -e robots=off option, which is probably simpler than changing the site’s robots / settings. The resulting HTML files might still contain the noarchive meta tags, which would make for a more consistent mirror of the original site.

  7. Great script, thanks! My cheapo shared server was getting overloaded with light traffic.

    Only problems I had were because I’m running OSX, but a few minor changes fixed it:

    – change all occurrences of “sed -r” to “sed -E”
    – fix path to mysql

  8. Very nice tutorial. Thanks for sharing it! I’ll have to do this for many blogs because HostGator shuts me down too often for server issues.. 🙁

  9. #2… I got stuck on that one for ages. I was successfully creating static copies of every other site besides the one I wanted to! Thanks very much for the guide.

    In case anyone is interested, another plugin that does static version creation for WP is really-static. It is a little finicky to setup but seems to do the job well. http://wordpress.org/extend/plugins/really-static/

    In my case I want my static server to get the files rather than my WP install sending them, so wget is a better solution for me.
