Category Archives: Uncategorized

Unix is the most powerful development tool

The Unix operating system is the most powerful development tool.  It doesn’t matter if you’re working with a LAMP/LEMP stack, Java, NodeJS, or SQL or any other coding language, Unix allows you to develop code and computer systems more efficiently.

Developed in the 60s in Berkeley California, curiously at the same time LSD was being experimented with in Berkeley, it still proves today to be one of the great general purpose operating systems.  It has evolved very much since then and most us use a version of it today now called Linux.

The main Unix philosophy is that everything is a file, and can be treated as such.  Source code, executables, I/O, and processes are files.   This allows you to manage and manipulate everything on the computer system in a unified way.

Simple Unix commands like grep, sed, awk, cat, cut, head, and many others can be strong together to do very sophisticated things, like parse a spreadsheet and pull out meaningful data from selected parts, or managing large computer programming projects.

These days I work mostly with Python, PHP, SQL, HTML/Javascript/CSS and API calls, but what really pulls it all together and makes work a pleasure is Unix.  It is were it all starts.

 

 

Data is King, is KING!

Good Data makes life easy; If we model all our the business logic in the database then the application layer is trivial.

Case in point:  I’m implementing a User Access Module.   First I define a list of Access terms:  Things like dashboard (allow view dashboard),  orders (allow view orders report), download (allow CSV downloads), etc.

Then I come up with a list of Roles: CS Agent, and Executive, will do for now.

Then I create a role_access many-to-many join table with just id, role_id, and access_id.

Then I add a role_id to the users table.

Then I painfully craft all the static data in access, roles, and access_roles tables in csv files that gets loaded when the tables are created.

When that’s done it looks like this.

User_Access_Data_Model

From there the work in the application layer is easy.    The caller simply looks like this

if self.user.has_access('download'):
    output += self._getDownloadCSVButton()

Implementation of this functionality is also trival.  We add the following two lightweight methods to the user model class:

class User(object):
    ...
    @lazyproperty
    def access(self):
        '''Return users access list'''
        sql = '''
           select a.code as access
           from   users u
                  join role_access ra on ra.role_id = u.role_id
                  join access a on ra.access_id = a.id
           where  u.id = %s'''
       results = self.db.query(sql, params=(self.id,))
       return [r['access'] for r in results]

   def has_access(self, access_name):
       '''Return True or False if this user has the given access'''

       if access_name in self.access:
           return True
       return False

And that’s all there is to, it.  Express your business logic in the database and the rest follows easy.

 

Always diff before you commit

If you don’t diff before you commit you don’t know what you’re committing.

When programming anything even remotely interesting, we’re making changes all the time.   Change, change, change.  Phase 1, phase 2,  experiment 1, experiment 2, etc.  And we’re using Version Control Systems, like git.  (If not, stop reading and learn git)

If you believe as I do, you should only commit working code, then we must safeguard our git repos from danger from ourselves.  That’s why you should always run git diff and scrutinize the changes before git commit, to make sure debug, merge conflicts, and other unwanted changes are not making into the big time.

It’s like look both ways before you cross.   Diff before you commit – make it a part of your stitch in time.

Unix House Keeping

Where is everything? Where do files go in Unix? There are no bonafide rules. But we’re free to invent some. So here are mine.

The Heap

/data – Everything starts off in this subdirectory. Consider it the Heap from which all other areas are carved out of. We’re using the term data in its most broadest sense.

/data/apps – Installed apps that don’t install themselves into the woodworks.

/data/backups – Back up files

/data/downloads – Downloaded files

/data/www – Website Root dir

Symlinks

We then symlink them to root to make them easier to get to

# ln -s /data/apps /apps
# ln -s /data/backups /backups
# ln -s /data/downloads /downloads
# ln -s /data/www /www

This gives:

/apps
/backups
/downloads
/www

Websites

/apps/crowfly – Production git master branch

/home/dlink/crowfly – Development version – git feature branches

/www/crowfly.net –> /apps/crowfly/www – A symlink to the web portion or the public portion of a production application

/www/dev.crowfly.net –> /home/dlink/crowfly/www – A symlink to the web portion of a development branch.

/www/crowfly.net/purple-bat.com –> /apps/purple-bat/www – A symlink to the web portion of an application that does not have its own domain name.   It lives over the parent domain.   The source code of the parent needs to list the symlink in its .gitignore file.

Backups

/backups/flower_images_20141203.tgz – A backup of images tarred and zipped – with timestamp in name.

/backups/db/books_20150220.dump.gz – A database dump, zipped – with timestamp in name.  These are created like this:

mysqldump -uuser -ppass -hhost db | gzip - > /backups/db/books/20150220.dump.gz

and they are recovered like this:

alias bookdb='mysql -uuser -ppass -hhost db'
echo 'create database books' | booksdb
gunzip -c /backups/db/books_20150220.dump.gz | booksdb

Logs

There is such a long tradition of placing logs in /var/log that we don’t  moved to /data.

Application Logs

/var/log/plant/plant.log – Production log
/var/log/plant/stage_plant.log – Staging Log
/var/log/plant/dev_plant.log – Development Log
/var/log/plant/dlink_plant.log – Alternative Development Log (using username)
/home/dlink/log/dev_plant.log – Alternative Development Log (using user homedir)

/var/log/plant/plant_traceback.log – Production Traceback Log
/var/log/plant/stage_plant_traceback.log – Development Traceback log

Web Logs

/var/log/apache2/crowfly.net.log – Apache Log for site.
/var/log/apache2/crowfly.net.error.log – Apache Error Log for site (periods (.) not dashes)

Advantages

What’s nice about adopting a standard for where files go on all your systems (Your local mac included) is you can easily move files around.  Here is an examples.

cd /backups/db
scp flyingcrows.com:$PWD/world_2014*.dump.gz .

Data Layer is King

It is far better to have an excellent Database Design and a crappy Application Layer, than an excellent Application Layer with a crappy Database design.

Why is that?  Simply put, you can always skim your Application code off the top and write a new one.   But you can not simply swap out the Data Layer without completely undermining the Application.

The closer we can model our business logic in the database itself – the better it is for the business.  When new unforseen questions arise about the business, we can always run adhoc queries from a well designed database schema to get the answers.

For example:

  • Who are our top customers?
  • What is revenue month over month?  Year over year?
  • What is the distribution of sales by product types across all income channels?
  • What percentage of our customers pay late over 30 days?
  • What time of day do we have the most volume?
  • What is the percentage of repeat customers vs first time buyers?
  • Has revenue gone up since the last website revamp?

All of these questions are very easily answered with a little SQL magic, when the database is well designed.  And all of these questions can be horrendously hard to answer when it is not.

For all important systems, the Data Layer must be King.

 

 

Why SHOUT in your SQL?

Somewhere someone started the tradition of capitalizing keywords in SQL.   Its like shouting all the unimportant parts of the code and making what’s important harder to read.

   SELECT b.id as book_id, a.name as author, b.pub_date as publication_date FORM books b LEFT JOIN authors a on b.id = a.book_id WHERE foo_bar = 'Fibonacci';

I say stop doing that.  And use instead super indentation and formatting.  Pseudo English is easier to read when it is not capitalized.

   select
      b.id       as book_id,
      a.name     as author,
      b.pub_date as publication_date
   from
      books
      left join authors a on b.id = a.book_id
   where
      foo_bar = 'Fibonacci'

See: sql_prettyHelp