Information Technology

Interesting PostgreSQL Oddity

I was working on a tiny database to send out some emails with varying subjects. Decided to use PostgreSQL so I could have a real database as the back end. SQLite doesn’t share (on most platforms) so I couldn’t add data while the program was running. Being able to add stuff and check on progress while running is one of those design things you can’t kick when you’ve been in IT as long as me.

Nothing odd about the definition of the table. It only has two columns.

CREATE TABLE agile_book_subjects (
	subject VARCHAR(80),
	last_used VARCHAR(60)

Opted to be a bit lazy when I created the test data. I created a text file with the following:

Subject Line 1,
Subject Line 2,
Subject Line 3,
Subject Line 4,
Subject Line 5,
Subject Line 6,
Subject Line 7,
Subject Line 8,
Subject Line 9,
Subject Line 10,
Subject Line 11,

Next I created a simple import script.

if [ -z "$1" ]; then
    echo "must pass csv file name to import"
    echo "first line must have column names"

export PGUSER=roland
export PGPASSWORD=spooge_1

psql -q -d email_marketing -c "copy agile_book_subjects from '$1' csv header;"

psql -q -d email_marketing -c "select count('*') from agile_book_subjects;"

Here is what happened

When I run it all looks well. I ass-u-me it would have gagged if it didn’t like the missing value.

11 records as expected

While the application is running I see only one subject being used. This is odd. I perform a few queries in another terminal.

Ascending sort seems borked

Well, if that doesn’t look like a Microsoft product! My ascending sort puts the only record with a value in the sort column on top. The select statement in the Java program is as follows:

SELECT subject FROM agile_book_subjects ORDER BY last_used ASC LIMIT 1;
Result of actual select used

If it is unable to sort on a null value, how did it get the first row? Better yet, explain this:

Nulls have a higher value than text?

How is it the nulls sort above the record with actual text? I’m on Ubuntu 20.04 LTS using stuff out of the repos if anyone is interested in the platform.

Of course the fix was to stick a 0 in for last_used then recreate the database and repopulate the table. Still, if it can’t handle nulls in a column as one would expect nulls, it shouldn’t have given me the first row. Is it really putting something like COBOL HIGH-VALUES in the column?

A value should always sort higher than a null value.

Kind of reminds me of this song