Category Archives: Unix

Safety Belts Off

When should we use secure practices, and when should we not?  Like all good advice, secure practices need to be weighed against practicality.

In a perfect world your system will have full test coverage, automated testing, database migration scripts, automated deployments, adhere to common coding practices, have identical dev and test environments, as well as excellent documentation.

In that perfect world the best way to protect production systems is not to allow logins – and to do everything through automation.

But what if that is not the case?  What if you’re supporting a legacy system that is hacky, poorly designed – and the business is relying on it at the same time?

In this case your best line of defense is to think creatively and to have full access to all production systems, so you can react quickly when things go wrong.  If automation cannot be relied upon, then it's hands-on: test, tweak if you have to, retest, and commit, until you can replace the system with something more stable.

Sometimes you need to take your safety belt off, if only for a short time.

Take some lines

Do you ever want to take some lines from the middle of a file, say lines 300-400?   You can do that in a kludgey way with Unix head and tail (note that it takes tail -101, not -100, to include both line 300 and line 400):

   cat ids.csv | head -400 | tail -101

The take command from vbin tools does just that.

   cat ids.csv |take 300 400
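If vbin is not installed, the same slice can be sketched with standard sed. The take_lines name below is made up for illustration; it is not the vbin implementation, just a stand-in that assumes the range is inclusive at both ends.

```shell
# take_lines START END [FILE...]: print lines START through END, inclusive.
# Hypothetical helper for illustration; not the vbin take command.
take_lines() {
    start=$1
    end=$2
    shift 2
    # -n turns off automatic printing, "p" prints the requested range,
    # and "q" quits at END so sed does not read the rest of a large file.
    sed -n "${start},${end}p;${end}q" "$@"
}

# Example: lines 300 through 400 of a 1000-line stream
seq 1 1000 | take_lines 300 400 | wc -l
```

The example prints a count of 101, because the range includes both end points.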

Here we flatten the results and build an SQL IN statement:

   cat ids.csv |take 20001 30000 |tr '\n' ' ' | sed "s/ /, /g" > get_customers.sql

By adding a little SQL before and after the list we get something like this:

select
   distinct u.customer_id, u.state
from
   imports i
   join usage u on i.id = u.import_id
where
   i.id in (
40722, 41483, 50364, 52623, 53049, 54795, 73451, 
 ... (thousands of ids here) ... 
986764, 986764, 986764, 986764, 986764, 986764
);
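One wrinkle with the tr and sed pipeline above: the final newline also becomes ", ", so the list ends with a stray comma that has to be trimmed by hand. A possible alternative, sketched here with the standard paste command, joins the lines without a trailing separator. The ids_to_in_list name is an assumption for illustration only.

```shell
# Join one-id-per-line input into "a, b, c" with no trailing comma.
# Hypothetical helper; the name is not from vbin.
ids_to_in_list() {
    # paste -s serializes all input lines into one; -d, joins them with commas.
    # sed then adds a space after each comma for readability.
    paste -sd, - | sed 's/,/, /g'
}

printf '%s\n' 40722 41483 50364 | ids_to_in_list
# prints: 40722, 41483, 50364
```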

 

So, go ahead, take some lines.

Also see clip, for panning right to left.

csv as a database table

CSV files can be used on the Unix command line like database tables.  grep can act as a where-clause, and awk can be used as a column selector.  However, the csv utility from vbin makes it easier.

Here it is.

   $ csv -p books.csv
   id, isbn,          name,                                    publicated, type
   1,  9781557427960, The Picture of Dorian Gray,              1891,       Novel
   2,  9780140283297, On the Road,                             1957,       Novel
   3,  9781851243969, Frankenstein; or, The Modern Prometheus, 1818,       Novel
   4,  9780345347954, Childhood's End,                         1966,       Novel
   5,  9780451457998, A Clockwork Orange,                      1962,       Novel
   6,  9780440184621, Tai-Pan,                                 1982,       Novel
   7,  9780486266848, Another Turn of the Screw,               1898,       Novel
   8,  9780486280615, Adventures of Huckleberry Finn,          1884,       Novel
   9,  9780143104889, A Princess of Mars,                      1917,       Novel
   13, 9781614270621, The Prophet,                             1923,       Poetry
   21, 9780374528379, Brothers Karamazov,                      1880,       Novel

The -p makes the output pretty and easy to read. Similar to MySQL’s desc output.  Here is another example.

   $ csv -s books.csv
   1. id
   2. isbn
   3. name
   4. publicated
   5. type

The -s shows header info. Useful for choosing or rearranging columns by number:

   $ csv -c4,3 books.csv |grep ^18 |sort -n
   1818,Frankenstein; or, The Modern Prometheus
   1880,Brothers Karamazov
   1884,Adventures of Huckleberry Finn
   1891,The Picture of Dorian Gray
   1898,Another Turn of the Screw

This example chooses the “publicated” and “name” columns (switching their order, something Unix cut cannot do), and selects only the rows from the 1800s.
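For comparison, here is roughly what the same query looks like with plain awk acting as both the column selector and the where-clause. This is only a sketch: it assumes no field contains an embedded comma, which is not true of the Frankenstein row above, and that is exactly the kind of thing a CSV-aware tool takes care of. The pick_1800s name is made up for this example.

```shell
# Print column 4 then column 3 for rows whose fourth column starts with 18.
# Assumes simple comma-separated data with no quoted or embedded commas.
pick_1800s() {
    # -F, splits input on commas; OFS=, joins the output fields with commas.
    # NR > 1 skips the header row.
    awk -F, -v OFS=, 'NR > 1 && $4 ~ /^18/ { print $4, $3 }' "$@"
}

printf '%s\n' \
    'id,isbn,name,publicated,type' \
    '2,9780140283297,On the Road,1957,Novel' \
    '8,9780486280615,Adventures of Huckleberry Finn,1884,Novel' |
    pick_1800s
# prints: 1884,Adventures of Huckleberry Finn
```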

Here’s a humdinger:

   ./providers.py list | grep -ve '^id' -e '^$' |csv - -c2| sort | while read p; do echo -n "$p,"; ./providers.py customers "$p" | wc -l ; done | csv - -p

This takes the list output of some providers.py script, removes the header line beginning with id and any blank lines, grabs the second column, sorts the values, and then sends each one back into the providers.py script to get a count of customers for that provider. The output might look something like this:

   ACE,      96
   NYSE,     1300
   OPC,      1400
   PGEG,     560
   VERT,     131
   VERT-SCO, 1430

The dash (-) allows csv to process <STDIN> rather than a given filename, like Unix’s gzip does.

grep '^$'

That’s wizardry for you.   How many times have you typed grep '^$'?

Do you grep?

grep, the Unix command that searches files for patterns, is one of the most useful Unix utilities.   The example here, grep '^$', passes a regular expression with two characters:  a caret (^), which means the beginning of a line, and a dollar sign ($), which means the end of a line.  Taken together they say: return every empty line, that is, every line that has nothing between its beginning and its end.

Here it is in action.  Show me everything that is meaningful in my Apache configuration file: that which is not a comment and is not a blank line.  The -e allows multiple patterns.  The -v, of course, reverses things: it shows everything not matching the patterns.

grep -ve '^#' -e '^$' apache2.conf | more
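One caveat: '^$' only matches lines that are completely empty, and '^#' only matches comments that start in the first column. A slightly more forgiving variant, sketched below, also drops whitespace-only lines and indented comments. The strip_noise name is an assumption for illustration.

```shell
# Drop comment lines and blank lines, tolerating leading whitespace.
# Hypothetical helper; reads the named files, or standard input if none are given.
strip_noise() {
    # [[:space:]]* allows any run of spaces or tabs before the # or the end of line.
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$@"
}

printf 'ServerRoot "/etc/apache2"\n\n   \n  # indented comment\n# comment\nTimeout 300\n' |
    strip_noise
```

Only the ServerRoot and Timeout lines survive.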

 

sav, savdiff, and unsav

vbin has a very useful command called sav.

You always need to make a backup copy of a file before messing with it.  A simple

   $ cp -p book.xml book.xml.sav

will do.  The -p preserves the original modification time, permissions, and (where permitted) ownership of the file.

sav does this for you. In other words:

   $ sav book.xml
   cp -p book.xml book.xml.sav

It’s that simple.  It’s just a convenient way to make a quick backup before you edit something.  If you want to see the changes since you backed it up, you can use savdiff:

   $ savdiff book.xml
   diff book.xml.sav book.xml

If you want to revert, you can use unsav:

   $ unsav book.xml
   cp -p book.xml.sav book.xml
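If you don't have vbin handy, all three commands can be approximated with one-line shell functions. This is only a sketch based on the behaviour described above; the real vbin scripts may do more, such as echoing the command they run.

```shell
# Minimal stand-ins for sav, savdiff, and unsav.
# Assumed behaviour, taken from the description above; not the vbin source.
sav()     { cp -p -- "$1" "$1.sav"; }   # back up FILE to FILE.sav
savdiff() { diff -- "$1.sav" "$1"; }    # show changes made since the backup
unsav()   { cp -p -- "$1.sav" "$1"; }   # restore FILE from FILE.sav

# Typical session:
#   sav book.xml       # before editing
#   savdiff book.xml   # after editing, review the changes
#   unsav book.xml     # changed your mind
```

Note that diff exits with status 1 when the files differ, so savdiff reports a non-zero status whenever there are changes to show.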

Why Unix is Beautiful 2

Here are some more examples of beautiful things you can do in Unix.

1.) Execute an SQL script against a database called db3, translate the output into CSV and report how long it took to execute.

$ time cat usage.sql | db3 |tr '\t' ',' > usage.csv

real 0m0.605s
user 0m0.004s
sys 0m0.001s

2.) Compare the output with yesterday's output.

$ diff usage_20141022.csv usage.csv > usage.diff

3.) Show a frequency histogram of changes since yesterday on columns 6 and 7.  (See future posts on histogram, and csv )


$ grep '^>' usage.diff |cut -c3- |csv - -c6,7 | histogram

| 172403 CONED,CONEDG
|   5000 NYSEC,NYSECG
|   4308 LPH,LPHG
|     15 NULL,CGO2
|      1 orig_util,new_util

4.) Look for all cases of ‘MATCH’ and for columns containing GAS, ignoring case, display to the screen and also capture to a file.

$ grep -ie match -e ",gas," usage.csv |tee gas_matches.csv
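The histogram command in step 3 comes from vbin. Until that future post arrives, a rough equivalent can be built from sort and uniq. The freq name is made up for this sketch, and its output lacks the leading bar that histogram prints.

```shell
# Count identical lines and list the most frequent first.
# A stand-in for the vbin histogram command; the output format differs.
freq() {
    # uniq -c only counts adjacent duplicates, so sort first;
    # the final sort -rn orders by count, largest first.
    sort | uniq -c | sort -rn
}

printf '%s\n' CONED,CONEDG LPH,LPHG CONED,CONEDG | freq
```

With this input, the CONED,CONEDG pair comes out on top with a count of 2.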

Why Unix is Beautiful

Unix with the Bash shell is beautiful because you can string a list of simple commands together to instruct computers to do complex things.  Folks sometimes refer to this as “sedgrepawk”.  Those are Unix commands with cryptic names – they sound like mysterious incantations of wizardry.  And in effect they are.

Here are some examples:

1. What users are on this box?

The /etc/passwd file contains a list of all users on the box.  New accounts are normally appended to the end, so to see the last 5 accounts added to the system you can type:

$ cat /etc/passwd | tail -5

The dollar sign is the prompt (you don’t type that).

The cat command lists out the contents of a file.

Piping that (|) into tail -5 lists the last 5 rows.

The results may look like this:

dlink:x:2000:2000:David Link:/home/dlink:/bin/bash
mysql:x:104:111:MySQL Server,,,:/nonexistent:/bin/false
jzlink:x:2001:2000:Julia Link:/home/jzlink:/bin/bash
ftp:x:105:112:ftp daemon,,,:/srv/ftp:/bin/false
postfix:x:106:113::/var/spool/postfix:/bin/false

This shows the name, x, user id, group id, description, home directory, and default shell for each user.  The x is a placeholder for the password.
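Because the fields are separated by colons, awk can pull out individual columns. The sketch below prints the name, user id, and shell for accounts with a user id of 1000 or more. That threshold is an assumption, since the cut-off between system and human accounts varies from system to system, and the list_users name is made up for this example.

```shell
# Print login name, user id, and shell for accounts with uid >= 1000.
# The 1000 threshold is an assumption; adjust it for your system.
list_users() {
    # -F: splits each line on colons; $1 = name, $3 = user id, $7 = shell.
    awk -F: '$3 >= 1000 { print $1, $3, $7 }' "$@"
}

# Sample lines in /etc/passwd format, taken from the listing above
printf '%s\n' \
    'dlink:x:2000:2000:David Link:/home/dlink:/bin/bash' \
    'ftp:x:105:112:ftp daemon,,,:/srv/ftp:/bin/false' |
    list_users
# prints: dlink 2000 /bin/bash
```

To run it against the real file, use list_users /etc/passwd.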

What is Unix?

The Unix Operating System, born 45 years ago, and the Bash Shell, born 25 years ago – are the two most powerful programming tools ever created.

Unix was the brainchild of Ken Thompson and Dennis Ritchie, who created it at Bell Labs in 1969.  Today most of us use Linux, the Unix-like system Linus Torvalds first released in 1991.

Bash was first released in 1989.   It is published under the GNU General Public License.

Combined, these two pieces of software make the most formidable tool in any developer's arsenal.   Unix is a multi-user, multitasking, file-system-centric core that talks to the computer's CPU (its brain), which in turn controls all the computer's resources.    Bash is a command-line shell wrapped around Unix, giving humans the ability to interact with the Unix core in a pseudo-English way.

This example, typed after the bash prompt ($) reads ‘List files in long format’, or simply ‘L S minus L’:

$ ls -l

The results might look like this:

drwxr-xr-x 2 dlink dev 4096 Sep 28 20:03 200px
-rw-r--r-- 1 dlink dev 230678 Feb 8 2014 Angel.jpg
-rw-r--r-- 1 dlink dev 46252 Dec 1 2013 Black_Horse.jpg
-rw-r--r-- 1 dlink dev 1379310 Apr 25 13:20 Buddha_watercolor.png
-rw-r--r-- 1 dlink dev 22673 Sep 28 19:45 Chacmool.jpg
-rw-r--r-- 1 dlink dev 52911 Dec 1 2013 Flowers_3-29-2013.jpeg
-rw-r--r-- 1 dlink dev 32078 Sep 28 19:45 Good Nature.jpg

The Macintosh Operating System is a Unix variant.   The Windows Operating System is not, and should be avoided for all serious development work.