What is ‘tput sgr0’ doing in my Bash prompt?

I have a new work computer, so this time around, I’m doing everything from scratch, because that’s the best way to learn.

In playing with my Bash prompt, I used this site to generate the prompt: http://bashrcgenerator.com/

Another one that is great is here: http://ezprompt.net/

The prompt that is generated uses a command to clear the text color that I hadn’t seen before: tput sgr0

My prompt (which I put in the ~/.bash_profile file) is:

#PROMPT
# To enter an emoji, while in insert mode type Ctrl-v, then enter the UTF8 code
# for the emoji, ex. U26A1 (note, use capital letters), then type ESC key. You
# can get a list of UTF8 emoji codes here: http://graphemica.com/
export PS1="\[\033[38;5;39m\]\u\[$(tput sgr0)\]\[\033[38;5;15m\]@\[$(tput sgr0)\]\[\033[38;5;229m\]\h\[$(tput sgr0)\]\[\033[38;5;15m\] [\[$(tput sgr0)\]\[\033[38;5;76m\]\w\[$(tput sgr0)\]\[\033[38;5;15m\]]\n\[$(tput sgr0)\]\[\033[38;5;215m\]⚡\[$(tput sgr0)\] "

So, of course, I spent the next 40 minutes trying to figure out all I could about that command, and more specifically, what ‘sgr’ meant.

I first scoured Google search results, which turned up mostly general information about tput. Then I took to the manual pages: man tput was helpful in learning about what tput does. That led to man terminfo and finally to man ncurses. None of those man pages define ‘sgr’, but ncurses did give a better clue by stating that “The ncurses library can exploit the capabilities of terminals which implement the ISO-6429 SGR 39 and SGR 49 controls”.

So a Google search for ‘ISO-6429 SGR 39’ turns up an old 1990s ECMA standard for “Control Functions and Coded Character Sets”, Standard ECMA-48: https://www.ecma-international.org/publications/files/ECMA-ST/Ecma-048.pdf

(More on ECMA history here: https://www.ecma-international.org/memento/history.htm) [sidenote: ECMA may sound familiar. ECMAScript. Wait, isn’t that JavaScript? See here: https://medium.freecodecamp.org/whats-the-difference-between-javascript-and-ecmascript-cba48c73a2b5]

And there we go! Page 75 of the PDF (page 61 internally numbered), section 8.3.117!

SGR – SELECT GRAPHIC RENDITION

And the 0 means “default rendition (implementation-defined), cancels the effect of any preceding occurrence of SGR in the data stream regardless of the setting of the GRAPHIC RENDITION COMBINATION MODE (GRCM)”.
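In other words, those \033[38;5;XXXm strings in the prompt are just SGR sequences selecting a foreground color from the 256-color palette, and a parameter of 0 resets everything. A quick demo right in the shell (color 39 is the blue from my prompt):

# 38;5;39 = pick color 39 from the 256-color palette; 0 = default rendition
printf '\033[38;5;39mblue text\033[0m back to normal\n'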

To make things a little more readable, I made the color codes into variables:

#PROMPT
# To enter an emoji, while in insert mode type Ctrl-v, then enter the UTF8 code
# for the emoji, ex. U26A1 (note, use capital letters), then type ESC key. You
# can get a list of UTF8 emoji codes here: http://graphemica.com/
BLUE='\[\033[38;5;39m\]'
PALE_YELLOW='\[\033[38;5;229m\]'
RESET='\[$(tput sgr0)\]'
GREEN='\[\033[38;5;76m\]'
export PS1="${BLUE}\u${RESET}@${PALE_YELLOW}\h${RESET} [${GREEN}\w${RESET}]\n⚡${RESET} "

And there we go. Much too much time spent learning stuff! And my prompt now looks like this:

[screenshot of the finished prompt]

And all of that to figure out what ‘sgr’ means.

Grab all of the domain names from the Apache vhost files

Quick script I whipped up today to grab all of the domain names on a server.

#!/bin/bash

# Start with a clean output file
if [ -e alldomains ]
then
  rm alldomains
fi

# Quote the pattern so the shell doesn't expand the glob before find sees it
alldomains=( $(find /etc/httpd/conf.vhosts/ -name "*.conf") )

for domain in "${alldomains[@]}"
do
  # Pull the ServerName/ServerAlias lines (skipping comments), strip the
  # directives, any 'www.' prefix, and any ':80' suffix, then put one
  # name per line and drop the empty lines
  egrep "ServerName|ServerAlias" "$domain" | egrep -v "#" \
    | sed -e 's|ServerName||' -e 's|ServerAlias||' -e 's|www\.||' -e 's|:80||' \
    | tr -s ' ' '\n' | tr -d ' ' | sed -e '/^\s*$/d' >> alldomains
done

sort alldomains | uniq | sort -o alldomains

This gets all of the domains from the ServerName and ServerAlias lines, strips out the whitespace and empty lines, and creates a file with just a list of the unique domain names.

This also accounts for entries that use a ‘www’ prefix or have the port ‘:80’ on the end.

For instance, www.somedomain.com and somedomain.com are the same, so the script takes out the ‘www.’, which leaves two copies of somedomain.com; the final sort and uniq then drop the duplicate before writing the output file. The same goes for ‘:80’.
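Here’s a quick way to see the pipeline in action on a couple of made-up vhost lines (the domains are hypothetical, and sort | uniq stands in for the script’s final step):

printf '%s\n' 'ServerName www.somedomain.com:80' 'ServerAlias somedomain.com othersite.org' \
  | egrep "ServerName|ServerAlias" | egrep -v "#" \
  | sed -e 's|ServerName||' -e 's|ServerAlias||' -e 's|www\.||' -e 's|:80||' \
  | tr -s ' ' '\n' | tr -d ' ' | sed -e '/^\s*$/d' | sort | uniq
# outputs:
#   othersite.org
#   somedomain.com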


THATCamp 2012

Believe it or not, this was my first ever THATCamp experience. That seems hard to imagine, since I have worked at CHNM since before the inception of THATCamp. Ah, well, the stars finally aligned, and I was able to attend this year. And it was great.

I attended three workshops on Friday, and a couple of sessions on Saturday. Here are some thoughts…

Digital Data Mining with Weka: This was a neat crash course on what data mining is and is not, and on one free tool to help do some of it. It was more of a “here’s what’s out there, how does it apply to the humanities” overview than a hands-on, get-your-hands-dirty, do-some-real-work type of workshop. But it was good in that it opened up the possibility of doing some data mining in the future. Since they are relatively brief, here are my notes:

What is data mining?

examples:

  • recommend ads, products to users based on purchases
  • finding patterns in large amounts of text (writing algorithms to find those patterns)

Goal of data mining is to predict what users/people will do based on data from other users/people

Data mining Tasks: classification, clustering, association rule discovery, sequential pattern discovery, regression, deviation detection

Goal is to predict the classification (or whatever) of future data, based on the current data that you analyze.

Cross-validation (when done correctly you get reliable results; when done incorrectly you get misleading results)

Have multiple sets of data, one for training and the others for testing. Build the algorithm on the training data, then run it on the test data. Then cycle through so each data set takes a turn as the training set. You do it this way because you already know the results for each set, so you can tell if your algorithm is correct. When it’s good, you can use it on future data where you don’t know the result.

Interesting Things You Can Do With Git: This one was highly anticipated. I have been wanting/needing to learn git for a while now. For someone in the IT field who has written some code and even has a GitHub account with code on it, I’m ashamed to say I still don’t know how to use git effectively. There is not much you can do in 1.5 hours, though, and this was more of a theoretical “here are some ideas” session than a “here is how to do it” one.

The session on using blogs as assignments was maybe a bit premature for me. The session was really good, with great ideas, tips, etc. But teaching is still too far away for me to have put the mental effort into following along much. I spent most of the time trying to find a good Twitter client, but in the end just stuck with Adium.

Then I took some time to enjoy the hacker space. I decided it was time, and this was the perfect place, to set up a transcription tool for my dissertation archive. So, sitting at the very table where it is coded, I installed the Scripto plugin for Omeka. That’s a bit of a misnomer, since it is really a wrapper that lets Omeka and a MediaWiki install play nicely together. I went ahead and transcribed one of the documents in the archive as well. The archive is just in the testing phase, but here it is anyway: http://nazitunnels.org/archive

Nazi Tunnels Archive

The final event of THATCamp for me was one last session proposed by a fellow “camper”. She wanted help learning about the shell/terminal/command line. So I volunteered to help out with that. It ended up that there were about eight people that wanted help learning the command line, and four of us that knew what we were doing. So it ended up being a great ratio of help needed to those who could offer it. We started with the very basics, didn’t get much past a few commands (ls, cd, rm, nano, grep, cat), but we went slow enough that everybody who was trying to follow along was able to, and they all left with a clearer understanding of what the shell is for, and why it is useful. The proposer found a great tutorial/book for learning the command line that I’ll have to go through as well. You can always learn something new.

What was also great about that session: since it was basically run by those who needed the help, I got to see how people who struggle with these concepts learn them, so I will hopefully be better able to teach them to others in the future.

UPDATE: I forgot to mention the many cool sites and projects mentioned during Saturday morning’s Dork Shorts. Here’s a list of the ones I took notice of.

http://afripod.aodl.org/
https://github.com/AramZS/twitter-search-shortcode
http://cowriting.trincoll.edu/
http://www.digitalculture.org/2012/06/15/dcw-volume-1-issue-4-lib-report-mlaoa-and-e-lit-pedagogy/
http://luna.folger.edu/luna/servlet/BINDINGS~1~1
http://www.insidehighered.com/blogs/gradhacker
http://hacking.fugitivetexts.net/
http://jitp.commons.gc.cuny.edu/
http://www.neh.gov/
http://penn.museum/
http://www.playthepast.org/
http://anglicanhistory.org/
http://podcast.gradhacker.org/
http://dhcommons.org/
http://ulyssesseen.com/
http://m.thehenryford.org/

Filling in the missing dates with AWStats

Doh!

Sometimes AWStats will miss some days in calculating stats for your site, and that leaves a big hole in your records. Usually, as in my case, it’s because I messed up. I reinstalled some software on our AWStats machine, and forgot to reinstall cron. Cron is the absolutely necessary tool for getting the server to run things on a timed schedule. I didn’t notice this until several days later, leading to a large gap in the stats for April.
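For reference, the kind of crontab entry that keeps AWStats up to date looks something like this (the path, config name, and schedule are just placeholders, not my actual setup):

# run the AWStats update for one site every hour, on the hour
0 * * * * perl /path/to/awstats.pl -config=domain-name.com -update >/dev/null 2>&1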

What to do?

Fortunately, there is a fix. Unfortunately, it’s a bit labor-intensive, and depends on how you rotate your Apache logs (if at all, which you should). The AWStats documentation (see FAQ-COM350 and FAQ-COM360) has some basic steps to fix the issue, outlined below:

  1. Move the AWStats data files for the newer months to a temporary directory.
  2. Copy the Apache logs containing the stats for the month with the missing days to a temporary directory.
  3. Run the AWStats update tool, using AWStats’ logresolvemerge tool and a few changed parameters, to re-create the AWStats data file for that month.
  4. Replace the AWStats data files for the following months (undo step 1).

The Devil’s in the Details

Again, depending on how you have Apache logs set up, this can be an intensive process. Here’s how I have Apache set up, and the process I went through to get the missing days back into AWStats.

We have our Apache logs rotate each day for each domain on the server (or sub-directory that is calculated separately). This means I’ll have to do this process about 140 times. Looks like I need to write a script…

Step 1. Move the data files of newer months

AWStats can’t run the update on older months if there are more recent months located in the data directory, so we’ll need to move the more recent months’ stats to a temporary location out of the way. If the missing dates are in June, and it is currently August, you’ll need to move the data files for June, July, and August (they look like awstatsMMYYYY.domain-name.com.txt, where MM is the two-digit month and YYYY is the four-digit year) to a temporary directory so they are out of the way.
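For that June-through-August example (using 2012 as the year for illustration), the move might look something like this. The data directory is whatever DirData points to in your AWStats config; /var/lib/awstats here is just an assumption:

mkdir -p /tmp/awstats-hold
# June (06), July (07), and August (08) data files for this domain
mv /var/lib/awstats/awstats0{6,7,8}2012.domain-name.com.txt /tmp/awstats-hold/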

Step 2. Get the Apache logs for the month.

The first task is to get all of the logs for each domain for the month. This works out to about 30 or 31 files (if the month is already past), or however many days have passed in the current month. For me, each domain archives each day’s logs in the following format: domain.name.com-access_log-X.gz and domain.name.com-error_log-X.gz, where X is a sequential number. So the first problem is how to get the correct file names without having to look in each file to see if it has the right day. Fortunately for me, nothing touches these files after they are created, so their mtime (the time stamp of when they were last modified) is intact and usable. Now, a quick one-liner to grab all of the files within a certain date range and copy them to a working directory.

We’ll use the find command to find the correct files. Before we construct that command, we’ll need to create a couple of files to use for our start and end dates.

touch --date YYYY-MM-DD /tmp/start
touch --date YYYY-MM-DD /tmp/end

Now we can use those files in the actual find command. You may need to create the /tmp/apachelogs/ directory first.

find /path/to/apache/logs/archive/ -name "domain-name.com-*" -newer /tmp/start -not -newer /tmp/end -exec cp '{}' /tmp/apachelogs/ \;
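Putting those pieces together for a concrete (made-up) example, grabbing April 2012’s archived logs for one domain looks like this. Depending on when your log rotation runs, you may need to shift the start and end dates by a day:

mkdir -p /tmp/apachelogs
touch --date 2012-04-01 /tmp/start
touch --date 2012-05-01 /tmp/end
find /path/to/apache/logs/archive/ -name "domain-name.com-*" \
  -newer /tmp/start -not -newer /tmp/end -exec cp '{}' /tmp/apachelogs/ \;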

Now unzip those files so they are usable. Move into the /tmp/apachelogs/ directory, and run the gunzip command.

gunzip *log*

If you are doing the current month, then copy in the current apache log for that domain.

cp /path/to/apache/logs/current/domain-name.com* /tmp/apachelogs/

This puts all of the domain’s log files for the month into a directory that we can use in the AWStats update command.

Things to note: You need to make sure that each of the log files you have just copied uses the same format. You also need to make sure they only contain data for one month. You can edit the files by hand or throw some fancy sed or grep commands at the files to remove any extraneous data.
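For instance, if a rotated log straddles two months, something like this keeps only the April 2012 lines. It assumes the standard Apache common/combined log format, where timestamps look like [10/Apr/2012:06:25:01 -0400], and the file names are made up:

grep '\[../Apr/2012:' straddling-access_log > domain-name.com-april-access_log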

Step 3. Run the AWStats logresolvemerge and update tool

Now comes the fun part. We first run the logresolvemerge tool on the log files we created in the previous step to create one single log file for the whole month. While in the /tmp/apachelogs/ directory, run:

perl /path/to/logresolvemerge.pl *log* > domain-name.com-YYYY-MM-log

Now, we need to run the AWStats update tool with a few parameters to account for the location of the new log file.

perl /path/to/awstats.pl -update -configdir="/path/to/awstats/configs" -config="domain-name.com" -LogFile="/tmp/apachelogs/domain-name.com-YYYY-MM-log"

Step 4. Move back any remaining files

If you moved any of the AWStats data files (awstatsMMYYYY.domain-name.com.txt like for July and August in our example) now’s the time to move them back where they go.
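And since I said up above that it looked like I’d need to write a script to do this about 140 times, here is a rough sketch of a wrapper that runs steps 2 and 3 for every domain listed in a file (one domain per line). The paths, dates, config names, and the domains.txt file are all assumptions to adjust for your own setup, and it assumes step 1 (moving the newer data files) has already been done:

#!/bin/bash
# Rebuild one month of AWStats data for each domain listed in domains.txt

START=2012-04-01   # first day of the month with missing dates
END=2012-05-01     # first day of the following month
MONTH=2012-04

touch --date "$START" /tmp/start
touch --date "$END" /tmp/end

while read -r domain
do
  workdir="/tmp/apachelogs/$domain"
  mkdir -p "$workdir"

  # Step 2: gather and unzip the month's rotated logs for this domain
  find /path/to/apache/logs/archive/ -name "${domain}-*" \
    -newer /tmp/start -not -newer /tmp/end -exec cp '{}' "$workdir/" \;
  gunzip -f "$workdir"/*.gz 2>/dev/null
  # (if rebuilding the current month, also copy in the live log here)

  # Step 3: merge the logs and re-run the AWStats update for this config
  perl /path/to/logresolvemerge.pl "$workdir"/*log* > "/tmp/${domain}-${MONTH}-merged"
  perl /path/to/awstats.pl -update -configdir="/path/to/awstats/configs" \
    -config="$domain" -LogFile="/tmp/${domain}-${MONTH}-merged"
done < domains.txt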


Yeah, that fixed it!

Phew! The missing dates are back!