std::list - Example and Explanation

Our discussion of std::list, like our discussion of std::vector will start with a bit of history. It wasn’t until the IT industry was inundated with tiny x86 minds that people “tried to do it all in RAM.” Before the PC we had real computers with real operating systems. Other than Assembler, our languages were robust. Every platform supported indexed files and so did the compiler for every language. When we needed a multiply linked list we did something like the following.

VAX BASIC

! VAX BASIC

        WHEN ERROR IN
            L_ERR% = 0%
            OPEN drawing_stats$ FOR OUTPUT AS FILE #dstat_chan%,    &
                ORGANIZATION INDEXED FIXED,                         &
                ALLOW NONE,                                         &
                RECORDTYPE FORTRAN,                                 &
                RECORDSIZE zillionare_stats_record_size,            &
                PRIMARY KEY D_STAT::ELM_NO,                         &
		MAP D_STAT_MAP
        USE
            L_ERR% = ERR
            PRINT "Unable to open drawing stat output file"; drawing_stats$
            PRINT "Error: ";L_ERR%;" ";ERT$( L_ERR%)
        END WHEN

DIBOL (DIgital’s Business Oriented Language)

    ;-----
    ;   This is DIBOL
    ;-----

    if (.NOT. file_err_flg)
    begin
        try
            open( mega_stat_chan=%syn_freechn,
            & o:i, 
            & MSTAT_FILE_NAME ,
            & FDL:"@MEGA_DAT:MEGA_STATS.FDL" ,
            & RECTYPE:1)
    
        catch (e, @Exception)
         begin
            l_x = %SYSERR
            writes( tt, %string( l_x) + 
            &       " result trying to open " + MSTAT_FILE_NAME )
            close( mega_chan)
            file_err_flg = %TRUE
            eof_flg = %TRUE
         end
        endtry
    end

COBOL

*  COBOL
...
FILE-CONTROL.


    SELECT DRAW-STATS
        ASSIGN TO 'DRAWING_STATS'
        ORGANIZATION IS INDEXED
        ACCESS MODE IS RANDOM
        RECORD KEY IS ELM_NO IN DSTATS-REC
        LOCK MODE IS AUTOMATIC
        FILE STATUS IS D-STAT.

DATA DIVISION.

FILE SECTION.

FD  DRAW-STATS
    IS GLOBAL
    LABEL RECORDS ARE STANDARD.

    COPY 'CDD_RECORDS.ZILLIONARE_STATS_RECORD' FROM DICTIONARY
        REPLACING ZILLIONARE_STATS_RECORD BY DSTATS-REC.

PROCEDURE DIVION.

*
*       Create new output files
*
    CALL 'LIB$SPAWN' USING BY DESCRIPTOR
        'CREATE/FDL=MEGA_DAT:MEGA_STATS.FDL DRAWING_STATS'.

    OPEN I-O DRAW-STATS.
...

FORTRAN 77

     ! FORTRAN 77

        INCLUDE 'MEGA_TEXT_LIB (FTN_MEGA_RECS)'

...
 220    OPEN (UNIT=K_DSTAT_CHAN, 
     1        FILE=DRAWING_STATS,
     2        STATUS='NEW',
     3        ORGANIZATION='INDEXED',
     4        ACCESS='KEYED',
     5        RECORDTYPE='FIXED',
     6        FORM='UNFORMATTED',
     7        RECL=K_ZILLIONARE_STATS_RECORD_SIZE/4,
     8        CARRIAGECONTROL='FORTRAN',
     9        KEY=(1:4:INTEGER),
     1        DISP='KEEP',
     2        IOSTAT=L_D_STAT,
     3        SHARED,
     4        ERR=996)

Our Key Was Part of the Data

Unlike std::map, std::set and the other sorted containers, our key was not separate from the data. Our key could be the entire record and the record could consist of just a single field. We had random access, and sequential access. If you had multiple keys for a record, once you performed a keyed hit to establish a key of reference, every sequential read after that read in that key’s order.

Take a look at continuation 9 in the FORTRAN IV snippet. It identifies bytes one through four as a key of integer type. Guess it doesn’t mean much to you without the record layout.

Drawing_Stats 
Elm_no          integer k0 
Hit_count       integer 
Last_draw_no    integer
Since_last      integer
Curr_seq        integer
Longest_seq     integer
Pct_hits        double
Max_btwn        integer
Ave_btwn        double

In the world of professional software development we number our keys. K0 is always the primary key.

Yes We Could Sort

If need be we could write things to a sequential file then use the system sort utility to create a new sorted file and read that as input. Usually we did this in a designated “scratch” area so the system managers could quickly reclaim the space if we happened to forget.

A large part of this discussion is pointing out the fact we would never sort more than a tiny handful of items that physically could not number in the triple digits.

Consider Resource Limitations

It was true in the 1970s and it is still true today; You will always have more physical storage than physical RAM. Not all storage may be online at all times, (i.e. tape library machines) but that number will always be bigger.

If you think these things are going away any time soon, think again. IBM can store 330TB uncompressed on a palm sized tape cartridge. Take a moment and count just how many tape cartridge slots you can see in this dark robot library picture.

If You Ever!

If you ever create something like QSqlTableModel or QTableView where the model has to physically load all of the cursor results into RAM instead of using the cursor contents directly, when I die I will come back and haunt your children in a manner that makes Freddy Krueger seem like Jimminy Cricket.

Freddy Krueger from Nightmare on Elm Street

I will haunt their offspring and every generation thereafter until your heirs decide it is best not to procreate and thus your DNA is eliminated from the gene pool.

Today’s databases can return massive amounts of data from a single table.

When you issue

SELECT * FROM MY_TABLE;

It is theoretically possible the resulting cursor could contain 32TB if the default was taken at database creation time. It could also return 128TB. I am ranting about this because I have seen people select from a database and load it into a list or other in-memory container without ever once checking the size of the data. Yes, I saw it because a program that had “run for years” was crashing nobody knew why.

Sadly QSqlTableModel is in production in a lot of places using PostgreSQL databases like a ticking time bomb where one day the application will just crash because the table in the database got too big to load. Don’t you create ticking time bombs!

A crash is actually the best possible outcome. Worse yet would be to just ignore what didn’t fit without crashing or throwing any kind of exception to be caught and handled. The user will continue on for years not having all of the data, eventually not having even correct data, because someone’s code quietly failed.

Never Jump Right In With the New Stuff

I need to issue you yet another cautionary tale like I did with trigraphs.

The day your shop changes to C++20 as it’s compilation standard you lose a bunch of operators for std::list.

Academics Hose Things

C++ is a language at an age where new academics are replacing the dying old academics. Judging from many of the decisions being made neither worked a day in their life writing and supporting production code. They certainly never spent even one day in a COBOL class.

    01 STATUS-VARIABLES.
       05 IN-STAT           PIC X(2).
       05 DRAW-STAT         PIC X(2).
       05 ARE-THERE-MORE-RECORDS    PIC X VALUE 'Y'.
          88 THERE-ARE-MORE-RECORDS       VALUE 'Y'.
          88 THERE-ARE-NO-MORE-RECORDS    VALUE 'N'.
       05 IS-OPEN-FLAG      PIC X.
          88 FILE-IS-OPEN   VALUE 'Y'.

...

    PERFORM B010-LOAD-DATA
        UNTIL THERE-ARE-NO-MORE-RECORDS.

Things that are yes/no boolean type entities start with either “are” or “is” depending on what will be more grammatically correct. 88 levels are where we assigned names to values a field could have. You only had to assign names for interesting values.

This rant is because everyone of these containers has an empty() method that should be isEmpty() [or is_empty() if we finally purge Pascal from C++]. If it had that name none of you would code it assuming it emptied the container. No, to empty a container you must use clear() because empty() only tells you if the container is empty. Isn’t that obvious?

The Code

// global includes before local
#include <iostream>
#include <list>
#include <algorithm>
#include <random>
#include <functional>

#include "config.h"

void dumpList(std::list<double> &dblList)
{
	if (!dblList.empty())  // this is the proper use of empty()
	{
		std::cout << " { ";
		for (double dbl : dblList)
		{
			std::cout << dbl << ", ";
		}
		std::cout << "}; \n";
	}
	else
	{
		std::cout << " The list is empty\n";
	}
}

bool tooBig( std::list<double> &dblList)
{
	bool retVal = false;

	if ((dblList.size() + 1000) > dblList.max_size())
	{
		retVal = true;
	}

	return retVal;
}

bool getUserInput(std::list<double> &dblList)
{
	if (tooBig( dblList))
	{
		return false;
	}

	bool retVal = true;
	double dbl {};

	std::cout << "Enter a floating point value or 999 to quit: ";
	std::cin >> dbl;

	if (dbl == 999)
	{
		retVal = false;
	}
	else if ((dblList.size() % 2) == 0)
	{
		dblList.push_front( dbl);	// add to front of list
	}
	else
	{
		dblList.push_back( dbl);	// add to back/end of list
	}

	return retVal;
}

int main(int argc, char **argv)
{
	std::cout << "ListContainerExample" << std::endl;
	std::cout << "Version " << ListContainerExample_VERSION_MAJOR << "."
			<< ListContainerExample_VERSION_MINOR << std::endl;

	std::list<double> myList { 3.14159, 0.32, -0.60, -505.123, 987.01 };

	std::cout << "list before user input: ";
	dumpList( myList);

	while (getUserInput(myList));  	// this is not an infinite loop
									// some might want to use a do while here
									// but that creates needless temporaries
	std::cout << "list after user input: ";
	dumpList( myList);

	// NEVER USE AUTO IN PRODUCTION CODE!
	//
	std::list<double>::iterator it = find(myList.begin(), myList.end(), -505.123);
	if (it != myList.end())
	{
		myList.insert(it, -0.60);
	}

	std::cout << "list after insert: ";
	dumpList( myList);

	myList.sort( std::less<double>());
	std::cout << "list after descending sort: ";
	dumpList( myList );

	myList.unique();
	std::cout << "list after removing duplicates: ";
	dumpList( myList );

	myList.sort();
	std::cout << "list after ascending sort: ";
	dumpList( myList );

	std::cout << "adding some more values\n";

	double lowerBound = 0;
	double upperBound = 10000;

	std::uniform_real_distribution<double> unif( lowerBound, upperBound);
	std::default_random_engine rndEngine;

	for (int jjj=0; jjj < 20; jjj++)
	{
		myList.push_back( unif(rndEngine) );
	}
	std::cout << "after adding numbers: ";
	dumpList( myList );

	// You really have to use a lambda here. One of the few times I support
	// use of a lambda. Mostly they should be avoided, especially if they return
	// values and could be executed across threads like a signal/slot arrangement.
	//
	myList.remove_if( [](double d) { return d < 0.0;});
	myList.sort();
	std::cout << "after removing negatives and sorting: ";
	dumpList( myList);

	return 0;
}

cmake_minimum_required(VERSION 3.10)

# Set some basic project attributes
project (ListContainerExample
	VERSION 0.1
	DESCRIPTION "A std::list Project")

set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_STANDARD 17)

# This project will output an executable file
add_executable(${PROJECT_NAME} ListContainerExample.cpp)

# Create a simple configuration header
configure_file(config.h.in config.h)

# Include the configuration header in the build
target_include_directories(${PROJECT_NAME} PUBLIC "${PROJECT_BINARY_DIR}")

ListContainerExample
Version 0.1
list before user input:  { 3.14159, 0.32, -0.6, -505.123, 987.01, }; 
Enter a floating point value or 999 to quit: 21.1
Enter a floating point value or 999 to quit: 18.3
Enter a floating point value or 999 to quit: 0.00004
Enter a floating point value or 999 to quit: 0.00001
Enter a floating point value or 999 to quit: 1.0006
Enter a floating point value or 999 to quit: 33
Enter a floating point value or 999 to quit: 254
Enter a floating point value or 999 to quit: 88.7
Enter a floating point value or 999 to quit: 999
list after user input:  { 88.7, 33, 1e-05, 18.3, 3.14159, 0.32, -0.6, -505.123, 987.01, 21.1, 4e-05, 1.0006, 254, }; 
list after insert:  { 88.7, 33, 1e-05, 18.3, 3.14159, 0.32, -0.6, -0.6, -505.123, 987.01, 21.1, 4e-05, 1.0006, 254, }; 
list after descending sort:  { -505.123, -0.6, -0.6, 1e-05, 4e-05, 0.32, 1.0006, 3.14159, 18.3, 21.1, 33, 88.7, 254, 987.01, }; 
list after removing duplicates:  { -505.123, -0.6, 1e-05, 4e-05, 0.32, 1.0006, 3.14159, 18.3, 21.1, 33, 88.7, 254, 987.01, }; 
list after ascending sort:  { -505.123, -0.6, 1e-05, 4e-05, 0.32, 1.0006, 3.14159, 18.3, 21.1, 33, 88.7, 254, 987.01, }; 
adding some more values
after adding numbers:  { -505.123, -0.6, 1e-05, 4e-05, 0.32, 1.0006, 3.14159, 18.3, 21.1, 33, 88.7, 254, 987.01, 1315.38, 4586.5, 2189.59, 6788.65, 9346.93, 5194.16, 345.721, 5297, 76.9819, 668.422, 6867.73, 9304.36, 5269.29, 6539.19, 7011.91, 7621.98, 474.645, 3282.34, 7564.1, 3653.39, }; 
after removing negatives and sorting:  { 1e-05, 4e-05, 0.32, 1.0006, 3.14159, 18.3, 21.1, 33, 76.9819, 88.7, 254, 345.721, 474.645, 668.422, 987.01, 1315.38, 2189.59, 3282.34, 3653.39, 4586.5, 5194.16, 5269.29, 5297, 6539.19, 6788.65, 6867.73, 7011.91, 7564.1, 7621.98, 9304.36, 9346.93, };

Summary

Don’t be upset if you didn’t know half of those STL creations, I had to look some up too. STL has gotten too big to utilize effectively and gotten so old parts of it are rotting away. When you are writing production code that needs to run correctly for years, possibly decades stay in the center lanes. I have a system at a client site I wrote nearly 30 years ago, it still handles around $50 million in orders per day and despite massive changes to the business has not required modification in all that time. You IoT developers don’t manage to have functioning code in production a week.

std::list does not have random access, you have to use find() to obtain an iterator.

Never use auto in production code outside of a template!

Being fat and lazy does not produce quality code even if your TDD says otherwise. The rules aren’t hard core for auto. With non native data types you are one include file away from having a different class derived from the same base class chosen as the “auto” type. This produces wild unpredictable code paths in code that wasn’t changed. Someone added the include to the include you are including and the happy path seemed to work.

Notice the tooBig() function? Never ASS-U-ME you have enough room.