Argparse -- or how to tell your program what you actually want

Computer programs usually need some input to operate. That can be the click of a cursor or some text that the program is operating on. There is, of course, an easy way to avoid dealing with input in Python scripts: you can just adapt the script every time you want to run a different setup.

While this might be a convenient way of dealing with input when playing around with a script, it quickly becomes infeasible as soon as we want to run large samples of input, e.g. on a cluster. We would have to edit the file every time we change the input. Apart from being a nightmare for the user, who has to touch the code every time, this also does not play nicely with the idea of version management (git, etc.).

Luckily, there is an easier way: we can just pass parameters (aka input) via the command line.

TL;DR

Python offers multiple ways to parse command line parameters. The most convenient one is to use the argparse package.

Don’t write it yourself – use argparse

The most direct way to access command line parameters is the system package sys. Via

sys.argv

you obtain a list of strings representing the command line call that was executed to start the program. As an example,

python foo.py -a 3 -b 5

leads to

['foo.py', '-a', '3', '-b', '5']

In principle, this already does the job. You have access to all command line parameters and you can parse them yourself now.

Wait, parse the arguments? Yes: all entries of the list are strings, so to add the two values for a and b, someone has to convert them to a numeric type first. Also, the inputs -a 3 -b 5 should lead to the same result as -b 5 -a 3, but the two lists in sys.argv are different.
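To make the effort concrete, here is a minimal sketch of what parsing by hand could look like. The helper parse_by_hand is hypothetical and only handles the simple "flag value" pattern from the example; anything more realistic (missing values, help texts, error messages) would need considerably more code.

```python
def parse_by_hand(argv):
    """Turn ['-a', '3', '-b', '5'] into {'a': 3.0, 'b': 5.0}."""
    values = {}
    # Walk over the list in (flag, value) pairs.
    for flag, raw in zip(argv[0::2], argv[1::2]):
        if not flag.startswith("-"):
            raise ValueError(f"expected a flag, got {flag!r}")
        # Every entry is a string, so we have to convert it ourselves.
        values[flag.lstrip("-")] = float(raw)
    return values


# In a real script the list would be sys.argv[1:], since sys.argv[0]
# is the name of the script itself.
first = parse_by_hand(["-a", "3", "-b", "5"])
second = parse_by_hand(["-b", "5", "-a", "3"])

print(first["a"] + first["b"])  # 8.0
print(first == second)          # True: the order no longer matters
```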

While writing a parser is a nice exercise, it is usually not the goal of the project. Originally, we just wanted to give some input parameters to our code.

Luckily, Python has the argparse package, which already does the heavy lifting of parsing for us. We create an ArgumentParser object that defines the interface, automatically generates a help message for the user, and converts the input to the intended types.

Defining a Command Line Interface

  1. As tradition demands, we start by importing the package: import argparse. Since command line arguments are only used when the program is called directly as an executable, we can move this import into the execution guard if __name__ == "__main__": (see full code below).
  2. We can now instantiate the ArgumentParser object that will do all the heavy lifting for us.
parser = argparse.ArgumentParser()

If you would like the help text (shown with the command line argument --help) to include the default values of the arguments, you can pass a pre-defined formatter class to the constructor

parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
  3. Now, we are ready to define the actual interface that our users are supposed to use. Arguments whose names start with a dash (-x or --x) are treated as optional by argparse; by convention, a single dash is used for one-letter short options and two dashes for long options. Arguments without a leading dash are positional and therefore required. If you have to match a legacy convention, you can make a dashed argument mandatory with the function argument required=True. The following snippet defines two arguments a and b and adds the help texts.
parser.add_argument("--a", help="Parameter a")
parser.add_argument("--b", help="Parameter b")
  4. Finally, we obtain the parsed parameters with a call to the method parse_args of our parser object.
args = parser.parse_args()

We can access the values of a and b simply as args.a and args.b. In principle, we could end here, but argparse has a couple of additional features that can make our life even simpler.
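The steps above can be put together in a short, self-contained sketch. To keep it runnable without a shell, the arguments are passed to parse_args as an explicit list (by default it reads sys.argv[1:]). The second parser, with its positional argument filename, is purely hypothetical and only illustrates how required arguments behave.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--a", help="Parameter a")
parser.add_argument("--b", help="Parameter b")

args = parser.parse_args(["--a", "3", "--b", "5"])
swapped = parser.parse_args(["--b", "5", "--a", "3"])

print(repr(args.a), repr(args.b))  # '3' '5' (still strings at this point)
print(args == swapped)             # True: the order does not matter

# Arguments that start with a dash are optional; if omitted, they are None.
print(parser.parse_args([]).a)     # None

# Hypothetical second interface: a positional argument (always required)
# and a dashed argument that is made mandatory with required=True.
strict = argparse.ArgumentParser()
strict.add_argument("filename", help="Positional argument")
strict.add_argument("--a", required=True, help="Mandatory option")

try:
    strict.parse_args([])  # both arguments are missing
except SystemExit as exc:
    # argparse prints a usage message to stderr and exits.
    missing_exit_code = exc.code
    print(f"argparse exited with status {missing_exit_code}")
```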

Setting defaults

Since we declared both parameters as optional, we should probably add a reasonable default value. Here, I chose to set a to 0.0 if it is not given.

parser.add_argument("--a", default=0.0, help="Parameter a")
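As a small sketch of how the default behaves, the following example parses an empty argument list and prints the generated help text. The program name foo.py is only set to make the help output independent of how the snippet is run; the explicit lists passed to parse_args stand in for the command line.

```python
import argparse

parser = argparse.ArgumentParser(
    prog="foo.py",  # assumed name, only used in the help text
    formatter_class=argparse.ArgumentDefaultsHelpFormatter,
)
parser.add_argument("--a", default=0.0, help="Parameter a")

# No --a on the command line: the default is used.
args = parser.parse_args([])
print(args.a)  # 0.0

# A value given by the user replaces the default. Without a type,
# it is still a string, while the default above is a float.
given = parser.parse_args(["--a", "3"])
print(repr(given.a))  # '3'

# The formatter class appends the default to the help text of --a.
help_text = parser.format_help()
print(help_text)
```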

Enforcing types

More importantly, argparse allows us to define types for the arguments.

parser.add_argument("--a", default=0.0, type=float, help="Parameter a")

Now, only values that can be converted to float are accepted. Otherwise, argparse reports an error and exits, so we don’t have to deal with type checking in our code. This makes the resulting program significantly more compact.
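A short sketch of both cases, again with explicit argument lists standing in for the command line. The invalid input is caught here only to show what happens; in a normal script you would simply let argparse print its message and stop the program.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--a", default=0.0, type=float, help="Parameter a")

# The string "3" is converted to a float before we ever see it.
args = parser.parse_args(["--a", "3"])
print(args.a + 1.0)  # 4.0

# A value that cannot be converted makes argparse print a usage
# message to stderr and exit the program.
try:
    parser.parse_args(["--a", "three"])
except SystemExit as exc:
    exit_code = exc.code
    print(f"argparse exited with status {exit_code}")
```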

argparse has a plethora of options like subparsers and more advanced settings that I did not cover in this post. If you are interested, have a look here.

The full code

The snippets above are taken from an example that I already used in the post about the logging functionality.

Here, we add a command line interface to the same script, which reads the numbers a and b and uses them as input parameters.

import logging
import sys

def args2logname(args):
    """Construct a name for the log file from the given arguments"""
    return f"log_a_{args.a}_b_{args.b}.log"

def add_numbers(a,b):
    """Dummy function to add two numbers

    Args:
        a (float): First Number
        b (float): Second Number

    Returns:
        Float: Addition of the two numbers
    """
    return a+b

def main(args):
    # Set up the logger
    h_stdout = logging.StreamHandler(stream=sys.stdout)
    h_stderr = logging.StreamHandler(stream=sys.stderr)
    h_stderr.addFilter(lambda record: record.levelno >= logging.WARNING)
    logging.basicConfig(
        level = args.level.upper(),
        format = "%(asctime)s [%(levelname)s] %(message)s",
        handlers = [
            logging.FileHandler(args2logname(args)),
            h_stdout,
            h_stderr
        ]
    )
    
    #Start of main function
    # Let's print the initial parameters given via the command line as INFO level
    logging.info("=====================")
    logging.info(f"Parameter a: {args.a:.2f}")
    logging.info(f"Parameter b: {args.b:.2f}")
    logging.info("=====================")
    logging.debug(f"Starting computation with parameters {args.a} and {args.b}")
    c=add_numbers(args.a,args.b)
    logging.debug(f"End computation with parameters {args.a} and {args.b}")

    logging.info(f"Result of computation: {c:.2f}")

# This is an execution guard.
# This part only gets executed when you call the file directly via "python <....>"
if __name__ == "__main__":
    # This takes care of the command line interface
    import argparse
    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)

    parser.add_argument("--a", type=float, default=0.0, help="Parameter a")
    parser.add_argument("--b", type=float, default=1.0, help="Parameter b")
    parser.add_argument("--level", default="info", help="logging level")

    args = parser.parse_args()
    print(args)
    main(args)
Patrick Emonts
Postdoctoral Researcher

My research interests include tensor networks, lattice gauge theories and quantum information.