One of the primary reasons people use Python is to analyze and manipulate text. If your program needs to work through a file, it is usually best to read the file one line at a time, which conserves memory and keeps processing fast. This is best done with a while loop as follows:
    import sys

    # Open the file named by the first command line argument.
    fileIN = open(sys.argv[1], "r")
    line = fileIN.readline()

    while line:
        # [some bit of analysis here]
        line = fileIN.readline()
This code takes the first command line argument as the name of the file to be processed. The call to open() opens that file and creates a file object, 'fileIN'. The first call to readline() then reads the first line of the file and assigns it to a string variable, 'line'. The while loop runs for as long as 'line' is a non-empty string. At the end of each pass, readline() fetches the next line and the condition is tested again. A blank line in the file still counts as non-empty because it contains a newline character. When no lines remain, readline() returns an empty string, the condition is false, and the loop ends. The program then exits.
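To make the loop concrete, here is a minimal sketch that fills in the analysis step with a simple line and word count. The function name count_lines_and_words and the choice of counting are assumptions made for illustration; only the open(), readline(), and while pattern comes from the text above.

```python
def count_lines_and_words(path):
    """Return (line_count, word_count) for the text file at `path`.

    Hypothetical helper: the counting stands in for
    "[some bit of analysis here]".
    """
    line_count = 0
    word_count = 0
    fileIN = open(path, "r")
    try:
        line = fileIN.readline()
        # readline() returns "" only at end of file; a blank line
        # is "\n", which is non-empty, so it does not stop the loop.
        while line:
            line_count += 1
            word_count += len(line.split())
            line = fileIN.readline()
    finally:
        # Release the file handle even if the analysis raises an error.
        fileIN.close()
    return line_count, word_count
```

In a script, this could be called as count_lines_and_words(sys.argv[1]) after importing sys, which matches the way the file name is supplied above.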
Reading the file in this way, the program does not bite off more data than it is set to process. Only the current line needs to be held in memory, so the memory footprint stays low regardless of the size of the file, and output can be produced incrementally instead of waiting for the whole file to load. This can be important if one is writing a CGI script that may see a few hundred instances of itself running at a time. To read more about while loops, see the tutorial Beginning Python.
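As an aside, the same one-line-at-a-time behavior is available without a while loop, because a Python file object can be iterated directly with a for loop. The sketch below is an alternative to the technique described above, not a restatement of it. The function name longest_line_length is hypothetical and chosen only to give the loop something to do.

```python
def longest_line_length(path):
    """Return the length of the longest line in the file at `path`,
    not counting the trailing newline."""
    longest = 0
    # The with statement closes the file when the block ends.
    with open(path, "r") as file_in:
        # Iterating over the file object yields one line per pass,
        # so the whole file is never loaded at once.
        for line in file_in:
            longest = max(longest, len(line.rstrip("\n")))
    return longest
```

Either form keeps memory use tied to the length of a line and not to the size of the file. By contrast, calling read() or readlines() with no size argument loads the entire file at once.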