Category Archives: Design

Saleor Products Data Model

I’m looking to replace our Magento Ecommerce Server with something “Nice” – Something that is not complicated, not over abstracted, something that is well … fun to work with.

Saleor is an elegant solution. It is an Ecommerce app written in Python and Django.  I was able to download it and get it up and running in about 2 hours.    That says a lot.

Now, I need to write a product upload script.  I could not find a diagram of the Product Schema, so I drew one.  I did this by looking at products/models/base.py and images.py and the postgresql database tables.  It is open for comments and corrections.  Here it is:

 

 

Make it Simple

Complexity is only Simplicity x 1000 — A mathematician friend once told me that. The idea struck a chord with me and has stuck with me ever since.

Everything in this world is complex. When we design computer systems, our goal should be to encapsulate that complexity in simple business-level concepts. These in turn can be broken down into smaller and smaller simple systems, until no part of the system is complex on its own.

Here is an example. Say you need to download and update pricing information on a regular basis. The new process should simply be called ‘Update Prices’.

I – The process could have the following simple breakdown:

  • Update Prices
    • Download Pricing
    • Load New Pricing Data
    • Update Pricing Data

II  – The first one could further be broken down into

  • Download Pricing:
    • Get External Resource URI from config
    • Get External Source Dir from config
    • Get Local Destination Dir from config
    • SFTP Download files

III – The first one here could further be broken down into

  • Get External Resource URI from config
    • Create a Config Module that reads YAML config file
    • Create environment variable to set config file name
    • Provide getter methods for each config value
    • Create YAML config files, (one for dev, staging, and prod)

IV – The first one here, again, can be further broken down into

  • Create a Config Module that reads YAML config file
    • Load YAML parser
    • Set config filename = config.basedir + / + ENV(CONFIG)
    • yaml.parse(read filename)

And nothing in it is complicated. So break it down.  Make it simple!   Everyone will be happy.
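Level IV of the breakdown above might be sketched roughly as follows. This is only an illustration under some assumptions: it swaps JSON in for YAML so that it runs with nothing but the standard library, and the names Config, ConfigError, getExternalResourceUri and getLocalDestinationDir, as well as the config keys, are hypothetical.

```python
import json
import os


class ConfigError(Exception): pass


class Config(object):
    '''Reads the config file named by the CONFIG environment variable.'''

    def __init__(self, basedir):
        name = os.environ.get('CONFIG')
        if not name:
            raise ConfigError('Environment variable CONFIG is not set')
        # Set config filename = basedir + / + ENV(CONFIG)
        self.filename = os.path.join(basedir, name)
        if not os.path.exists(self.filename):
            raise ConfigError('Config file not found: %s' % self.filename)
        with open(self.filename) as fp:
            # JSON stands in for YAML here; a YAML parser would slot in
            # at this one line.
            self.data = json.load(fp)

    # One getter method per config value (hypothetical keys)
    def getExternalResourceUri(self):
        return self.data['external_resource_uri']

    def getLocalDestinationDir(self):
        return self.data['local_destination_dir']
```

With one such file each for dev, staging and prod, switching environments is only a matter of changing the CONFIG environment variable.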

 

 

DB, Config and Logging

The Database, Config and Logging are at the core of all application development.  Everything starts there.

A Database System stores all of the application’s data. The storage itself can be anything from a loose collection of files like JPEGs, to a NoSQL database like MongoDB, to a full-fledged relational database like MySQL or PostgreSQL. This subsystem sits right below any Model Objects.

A Configuration System allows you to configure your application without changing code. It primarily determines where things are, such as subdirectories, external web services, database credentials, log files, etc. You will have a different configuration file for each type of environment: dev, stage, and prod.

A Logging System should be built in from the word go. All backend tasks should be logged. One philosophy is that every event writes a single-line entry of Success or Failure. Each event can optionally also write multiple Warning or Debug messages. When done this way, your logs are themselves a database of all system events. You can easily answer questions like how many transactions were processed today, and how many of them failed.
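The one-line-per-event idea could be sketched like this with Python's standard logging module. The helper name log_event and the logger name 'events' are hypothetical, and this is just one possible shape for it, not how any particular library does it.

```python
import logging

logger = logging.getLogger('events')


def log_event(name, func, *args):
    '''Run func(*args) and write exactly one Success or Failure line.'''
    try:
        result = func(*args)
    except Exception as e:
        # One Failure line per failed event, then let the error propagate
        logger.error('%s: Failure: %s', name, e)
        raise
    # One Success line per successful event
    logger.info('%s: Success', name)
    return result
```

Calling logging.basicConfig(level=logging.INFO, filename=...) once at start-up sends these lines to a file. Counting the Failure lines for a given day then answers the "how many failed" question.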

These three subsystems are needed for all application development. Here is a link to a library I have written that provides these functions, which I reuse all the time.

Python Application Development – Core Library Classes – https://github.com/dlink/vlib

 

 

Less Is More

You know you’re doing a great job when, in adding new functionality to a code base, you wind up removing more code than you’ve added. Less is More.

I once heard a programmer boast that he’d written a million lines of C code. That seemed undesirable to me. Better you should boast that you’ve written an entire system in under 1,000 lines of code.

Doing more with less requires good design and a lot of reusable code. Doing more with less means you have less to maintain, and less that can go wrong.

So pride yourself on how small your code base is, not how large it is.

Programs should be like Gardens

I believe programs should be like gardens. They should be lovely to see. Gardens are things we walk around in and return to often. Gardens are things we wish to spend time in with others – to relax in and enjoy their beauty.

If we strive to make our programming systems like gardens, then our days as programmers will be filled with pleasure.

 

Variable Naming Conventions

A simple rule to help improve consistency and maintainability: camelCasing should be used when dealing with Objects. Everything else should use underscores.

Objects

  • Class definitions should be CapCase
  • Objects should be mixedCase
  • Class Methods should be mixedCase

class BookAuthors(object):    # <-- Class Definition
   def getAuthors(self, book_id):  # <-- Class Method
      ...
      return results

bookAuthors = BookAuthors()   # <-- Object

Everything Else

When not dealing with objects we should use underscores and lowercase

  • Simple instance variables
  • Functions
  • Filenames / Modules
  • SQL schema, table and column names

def sql_pretty(sql):
   ...
   return sql2

num_books = len(books)
fp_debug = open('/tmp/debug.log', 'a') 

sql = 'select book_id, author_id from books.book_authors where book_id = ?'

This varies slightly from the PEP 8 naming conventions. PEP 8 suggests that class methods and non-class methods (or functions) be treated the same way, with underscores. However, I prefer naming class methods with mixedCase to remind us we’re in the domain of objects.

With regard to SQL schema, table and column names: because column names can become very long, underscores make them more readable.

Defensive Programming is the way to go.

Defensive programming is like defensive driving: Anticipate everything that might go wrong. If a function is passed an Id to a database table, do not assume that it is a valid Id, or an integer, or even that it has a value.

What is most important in defensive programming is to communicate clear and precise error messages when things are not as they ought to be.

Here are some example error messages:

Less than ideal error messages:

AttributeError: 'NoneType' object has no attribute 'last_name' 

_mysql_exceptions.OperationalError: (1054, "Unknown column 'Jerry' in 'where clause'")

IndexError: list index out of range

KeyError: 'Jerry'

Better ones:

BookError: Book not found: id = Jerry

AuthorError: Author not found: id = 506

getCustomers command: Expected 3 parameters, only 2 given.

FoomWebsiteError: Unable to read from http://foom.com: HTTP 500

These better error messages are not hard to do if we think about it ahead of time. Here are some examples:

class BookError(Exception): pass

class Book(object):
   def get(self, id):
      # self.db is assumed to be a database handle set up elsewhere
      results = self.db.query('select * from books where id = ?', id)
      if not results:
         raise BookError('Book not found: id = %s' % id)
      return results[0]
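For something that can be run as-is, here is a rough sketch of the same idea against an in-memory SQLite database. The books table, its columns and the extra checks on id are assumptions made for illustration. They follow the advice above not to assume the id has a value or is an integer.

```python
import sqlite3


class BookError(Exception): pass


class Book(object):
    def __init__(self, db):
        self.db = db   # an open sqlite3 connection

    def get(self, id):
        # Do not assume the id has a value ...
        if id is None:
            raise BookError('Book id is required')
        # ... or that it is an integer ...
        try:
            book_id = int(id)
        except (TypeError, ValueError):
            raise BookError('Book id must be an integer: id = %s' % (id,))
        # ... or that it is a valid id in the table.
        row = self.db.execute(
            'select id, title from books where id = ?',
            (book_id,)).fetchone()
        if row is None:
            raise BookError('Book not found: id = %s' % (id,))
        return row
```

Each failure mode gets its own short message, so the caller sees what was wrong with the input rather than a stray KeyError or TypeError from deeper in the stack.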

Another example:

URL = 'http://foom.com'
class FoomWebsiteError(Exception): pass

class FoomWebsite(object):
   def scrapePage(self, path, params):
      website = Website(URL)
      page = website.go(path, params)
      if website.error:
         raise FoomWebsiteError('Unable to read from %s: %s'
            % (URL, website.error))
      lines = page.split('<p>')
      name = lines[3]
      return page

 

It is okay to have bugs if they are easy to find and easy to fix. Applying a little defensive programming to everything we write makes debugging a breeze, and helps everyone using the system.

Working Towards the Ideal

Whether building a new system or trying to untangle some untenable rat’s nest of code, one should have an image of the ideal state in mind.

If we draw up the best plan we can, thinking in terms of the perfect – regardless of its immediate feasibility – we can then put that on the wall and work towards it.

All changes to code come in two forms: Bug Fixes and Feature Enhancements. There are many decisions to make in each case. Holding up an Ideal helps us with those decisions. We choose the path that helps the code converge on some well-conceived plan rather than letting it oscillate around, as is often the case.

The ideal plan is rarely realized but that’s not the point.  By aiming at one target we help the general direction of all the arrows.

So we must spend a lot of time designing the ideal.  To do that we need to create documents such as these:

– Problem Statements
– Use Case Diagrams
– Data Model Entity Relationship Diagrams (ERD),
– Class Diagrams
– Sequence Diagrams
– Wireframes, and
– Mock Reports

It is not enough to simply understand a single aspect of the system and go to work implementing it. We must take the extra time to see how that component fits into the larger whole. The benefit is flexible, easy-to-use and fun-to-maintain code.

A saying often attributed to Abe Lincoln goes: Give me six hours to chop down a tree and I will spend the first four sharpening the axe.

Libraries over Frameworks

What is the benefit of using a framework like Ruby-on-Rails, Pylons or Drupal? Simply put, it helps us start and develop code from nothing quickly. But it does not have long-lasting staying power. If you plan to be using the system for years to come – the advantage of the quick start-up is outweighed by the restrictions placed upon you by the framework.

So often, in my experience working on legacy systems in various industries, the Framework itself becomes the enemy of quick bug fixes and feature enhancements.

I prefer to use libraries rather than using frameworks.  We all need libraries because we don’t want to reinvent everything like database connectivity, yaml parsing, json parsing, and interesting things like zipcode distance calculations, etc.

But frameworks are something we can write ourselves very easily.   The advantage of doing so is the ability to understand and have complete control over everything in the system, and only have those things we really need, and nothing else.

Lightweight “frameworks” like CherryPy and Flask for Python and Sinatra for Ruby, which can be considered libraries for HTTP routing rather than frameworks, are more elegant solutions than full-fledged MVC frameworks.
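To give a feel for how little an HTTP-routing layer needs, here is a minimal sketch built on plain WSGI, the standard Python web server interface. The names ROUTES, route, hello and app are hypothetical, and a real system would also need things like HTTP methods, query parameters and error handling.

```python
ROUTES = {}


def route(path):
    '''Decorator that registers a handler function for a URL path.'''
    def register(handler):
        ROUTES[path] = handler
        return handler
    return register


@route('/hello')
def hello(environ):
    return 'Hello, world'


def app(environ, start_response):
    '''WSGI entry point: look up the handler for the request path.'''
    handler = ROUTES.get(environ.get('PATH_INFO', '/'))
    if handler is None:
        start_response('404 Not Found', [('Content-Type', 'text/plain')])
        return [b'Not found']
    body = handler(environ).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [body]
```

It can be served with the standard library's wsgiref.simple_server.make_server('', 8000, app).serve_forever(). Everything in it is visible and under our control.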

 

 

Data Layer is King

It is far better to have an excellent Database Design and a crappy Application Layer, than an excellent Application Layer with a crappy Database design.

Why is that? Simply put, you can always skim your Application code off the top and write a new one. But you cannot simply swap out the Data Layer without completely undermining the Application.

The closer we can model our business logic in the database itself – the better it is for the business. When new unforeseen questions arise about the business, we can always run ad hoc queries from a well-designed database schema to get the answers.

For example:

  • Who are our top customers?
  • What is revenue month over month?  Year over year?
  • What is the distribution of sales by product types across all income channels?
  • What percentage of our customers pay more than 30 days late?
  • What time of day do we have the most volume?
  • What is the percentage of repeat customers vs first time buyers?
  • Has revenue gone up since the last website revamp?

All of these questions are very easily answered with a little SQL magic, when the database is well designed.  And all of these questions can be horrendously hard to answer when it is not.
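As a small illustration, here are two of the questions above answered against a made-up schema in an in-memory SQLite database. The customers and orders tables, their columns and the sample rows are all assumptions for the sake of the sketch.

```python
import sqlite3

db = sqlite3.connect(':memory:')
db.executescript('''
    create table customers (
        customer_id integer primary key,
        name        text
    );
    create table orders (
        order_id    integer primary key,
        customer_id integer references customers (customer_id),
        order_date  text,      -- ISO date, e.g. 2024-01-15
        amount      numeric
    );
    insert into customers values (1, 'Ann'), (2, 'Bob');
    insert into orders values
        (1, 1, '2024-01-15', 100),
        (2, 1, '2024-02-03', 250),
        (3, 2, '2024-02-20', 75);
''')

# Who are our top customers?
top_customers = db.execute('''
    select c.name, sum(o.amount) as revenue
    from customers c
    join orders o on o.customer_id = c.customer_id
    group by c.customer_id, c.name
    order by revenue desc
''').fetchall()

# What is revenue month over month?
revenue_by_month = db.execute('''
    select substr(order_date, 1, 7) as month, sum(amount) as revenue
    from orders
    group by month
    order by month
''').fetchall()
```

Each question is a handful of lines of SQL because customers, orders, dates and amounts are modeled directly as tables and columns.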

For all important systems, the Data Layer must be King.