7 Lesser-Known Command Line Tools That Ship with Python

Like most people, I mostly interact with Python using the default REPL or IPython. Yet I often reach for one of the command line tools that ship with the standard library. Each one is implemented as the “main” of a script or module, so you can run it with python -m followed by the module name. Here are 7 I use on a semi-regular basis.

1. & 2. Decompress and Archive Files

It’s not uncommon for me to be using a remote server, or someone else’s machine, where I don’t readily have access to tools to compress and decompress files from the command line. For .zip files, I reach for the zipfile module. I can unzip a file into the current directory with:

$ python -m zipfile -e myarchive.zip .

Or create one with:

$ python -m zipfile -c myarchive.zip file1.txt file2.jpg file3.bin

The tarfile module offers the same capabilities for tar files, using the same -e and -c options.
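The same round trip can be scripted when a one-liner is not enough. This is a minimal sketch using the zipfile module directly; the helper names make_archive and extract_archive and the file names are made up for illustration.

```python
import tempfile
import zipfile
from pathlib import Path


def make_archive(archive_path, files):
    """Create a .zip containing `files`, storing each under its base name."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            zf.write(f, arcname=Path(f).name)


def extract_archive(archive_path, dest):
    """Extract every member of the archive into the `dest` directory."""
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest)


# Round trip inside a throwaway directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    source_file = tmp / "file1.txt"
    source_file.write_text("hello")
    make_archive(tmp / "myarchive.zip", [source_file])
    extract_archive(tmp / "myarchive.zip", tmp / "out")
    restored = (tmp / "out" / "file1.txt").read_text()
```

Only extract archives you trust: member names come from whoever created the file.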

3. Serve Files Locally

When doing web development, I sometimes need to spin up a server to make sure things work properly. In a pinch, I reach for Python’s built-in server, http.server. By default, it serves the content of the current folder on port 8000, so you can browse it at http://localhost:8000.

$ python -m http.server

You should use this server for local development only. The docs warn that it is not recommended for production because it only implements basic security checks.
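If you want the same server from inside a script, for example in a quick test, something like the following should work. It is a sketch, not a hardened setup: serve_directory is a hypothetical helper, the server is bound to 127.0.0.1 so it is not reachable from other machines, and port 0 lets the operating system pick a free port.

```python
import http.client
import http.server
import tempfile
import threading
from functools import partial
from pathlib import Path


def serve_directory(directory, port=0):
    """Serve `directory` in a background thread, on localhost only."""
    handler = partial(http.server.SimpleHTTPRequestHandler, directory=str(directory))
    # Port 0 asks the OS for any free port.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "index.html").write_text("<h1>hello</h1>")
    server = serve_directory(tmp)
    host, port = server.server_address

    # Fetch the file back to check that the server works.
    conn = http.client.HTTPConnection(host, port, timeout=5)
    conn.request("GET", "/index.html")
    response = conn.getresponse()
    status, body = response.status, response.read().decode()
    conn.close()

    server.shutdown()
    server.server_close()
```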

4. Inspect JSON Data

When quickly exploring web APIs I haven’t worked with, I’ll often reach for the curl command line tool instead of going straight to Python. In that case, I’ll pipe the output through json.tool to make the JSON more readable. It works with files too, if you have the data locally.

$ cat data.json
[{"name": "hydrogen", "atomic_number": 1, "boiling_point": 20.271},
{"name": "titanium", "atomic_number": 22, "boiling_point": 3560}]

$ cat data.json | python3 -m json.tool
[
    {
        "name": "hydrogen",
        "atomic_number": 1,
        "boiling_point": 20.271
    },
    {
        "name": "titanium",
        "atomic_number": 22,
        "boiling_point": 3560
    }
]

When I need to do more than look at the JSON file, like filtering it or transforming it, I reach for the amazing jq JSON processor.
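When jq is not installed either, a few lines of Python cover the simple cases. The sketch below pretty-prints the same data with the json module and then filters it; the 1000 threshold is arbitrary and only there to show the idea.

```python
import json

raw = (
    '[{"name": "hydrogen", "atomic_number": 1, "boiling_point": 20.271},'
    ' {"name": "titanium", "atomic_number": 22, "boiling_point": 3560}]'
)

elements = json.loads(raw)

# Pretty-print with a four-space indent.
pretty = json.dumps(elements, indent=4)
print(pretty)

# Keep only the names of elements that boil above an arbitrary threshold.
high_boilers = [e["name"] for e in elements if e["boiling_point"] > 1000]
print(high_boilers)
```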

5. Look Up Documentation

I’m nearly always looking up documentation within IPython and Jupyter by appending a ? to a function call or by using the Shift-Tab-Tab shortcut, but sometimes I need a little more. That’s when I reach for the pydoc module. Given a module name, pydoc extracts all the docstrings, including private and dunder methods. The help() function is built on top of it.

$ python3 -m pydoc os.path
Help on module posixpath in os:

NAME
    posixpath - Common operations on Posix pathnames.

MODULE REFERENCE
    https://docs.python.org/3.11/library/posixpath

    The following documentation is automatically generated from the Python
    source files. It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations. When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    Instead of importing this module directly, import os and refer to
    this module as os.path. The "os.path" name is an alias for this
    module on Posix systems; on other systems (e.g. Windows),
    os.path provides the same operations in a manner specific to that
    platform, and is an alias to another module (e.g. ntpath).

    Some of this can actually be useful on non-Posix systems too, e.g.
    for manipulation of the pathname component of URLs.

FUNCTIONS
    abspath(path)
        Return an absolute path.

[… continues for a while …]

You can search by keyword in your whole environment. It’s pretty handy (but very slow!):

$ python -m pydoc -k array
ctypes.test.test_array_in_pointer
ctypes.test.test_arrays
test.test_array - Test the arraymodule.
test.test_bytes - Unit tests for the bytes and bytearray types.
array
numpy.core - Contains the core of NumPy: ndarray, ufuncs, dtypes, etc.

You can even start a web server to navigate the documentation of all the packages in your environment with the -b option:

$ python -m pydoc -b
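The same text is available from Python, which helps if you want to search it or save it to a file. A small sketch, assuming the documented pydoc.render_doc function and the pydoc.plaintext renderer (which avoids terminal bold sequences):

```python
import pydoc

# Render the documentation of os.path as plain text.
text = pydoc.render_doc("os.path", renderer=pydoc.plaintext)

# Show only the beginning; the full text is long.
print("\n".join(text.splitlines()[:5]))

# Plain strings are easy to search.
mentions_abspath = "abspath" in text
```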

6. Check the Date

I’ll admit, I’m a nerd. I may or may not have considered using a terminal-based calendar application as my main calendar…it didn’t stick, but I still sometimes reach for the calendar module when I need to see some dates in context, especially for dates “far” into the past or future. Turns out the first day of Y2K, January 1, 2000, was a Saturday.

$ python -m calendar 2000 1
    January 2000    
Mo Tu We Th Fr Sa Su
                1  2
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31
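The module is just as handy from Python when you need the answer in a script. A short sketch using calendar.weekday, which numbers days from Monday as 0, and calendar.month, which returns the same text the command line prints:

```python
import calendar

# Day of the week for January 1, 2000 (Monday is 0, so Saturday is 5).
weekday_index = calendar.weekday(2000, 1, 1)
is_saturday = weekday_index == calendar.SATURDAY

# The month rendered as text.
month_text = calendar.month(2000, 1)
print(month_text)
```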

7. Visit the Tab Nanny

I have used the Tab Nanny once or twice in my life, but modern editors take care of this for you: they don’t let you mix tabs and spaces in your Python code.

# The -t displays tab characters as ^I
$ cat -t that_is_no_good.py
def hello_world():
    who = "reader"
^Iprint(f"hello {who}")

$ python -m tabnanny that_is_no_good.py
that_is_no_good.py 3 '\tprint(f"hello {who}")\n'
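Python itself also refuses to run this file. The sketch below is not how tabnanny works internally; it simply compiles the same source and catches the TabError that the interpreter raises for inconsistent indentation. The helper name has_inconsistent_tabs is made up.

```python
# Same content as that_is_no_good.py: spaces on line 2, a tab on line 3.
source = 'def hello_world():\n    who = "reader"\n\tprint(f"hello {who}")\n'


def has_inconsistent_tabs(source, filename="<string>"):
    """Return True if compiling `source` fails because tabs and spaces are mixed."""
    try:
        compile(source, filename, "exec")
    except TabError:
        return True
    return False


print(has_inconsistent_tabs(source))
```

Unlike the Tab Nanny, this only tells you that a problem exists, not which line caused it.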

But what I like most about the Tab Nanny is the first line of the module’s docstring: “The Tab Nanny despises ambiguous indentation. She knows no mercy.”

 

Author: Alexandre Chabot-Leclerc, Vice President, Digital Transformation Solutions, holds a Ph.D. in electrical engineering and a M.Sc. in acoustics engineering from the Technical University of Denmark and a B.Eng. in electrical engineering from the Université de Sherbrooke. He is passionate about transforming people and the work they do. He has taught the scientific Python stack and machine learning to hundreds of scientists, engineers, and analysts at the world’s largest corporations and national laboratories. After seven years in Denmark, Alexandre is totally sold on commuting by bicycle. If you have any free time you’d like to fill, ask him for a book, music, podcast, or restaurant recommendation.

 
