VERSION, SOVERSION, and Tiny x86 Minds

The tiny x86 mindset keeps making the same mistakes over and over again. Lately they’ve done it with SOVERSION. It wouldn’t be so bad if their decisions didn’t jack up the world for at least a decade. These stupid decisions go all the way back to IBM and its first PC, where they opted to place the address space for adapter cards (like video) in the 640-1024K range, thus creating a 640K barrier for DOS and a perpetual memory hole until things finally moved toward a more Motorola-style linear memory model. Because of the desire/requirement for backward compatibility with the original x86 instruction set, we in IT endured the fallout from that for a very long time.

Everybody has whined and complained about Make since Make was first introduced. Be honest! We’ve all whined about Make and source control systems at some point, but Make seemed to suffer particularly venomous attacks. I think it is because, at least in the early days, every Linux distro tweaked it and the environment just a little bit. Many had different C/C++ compilers as well. You kids today are so used to everything just using GNU. Getting there was a long journey, with a lot of commercial products trying to shove out the OpenSource.

Everybody Had Their Own

Every PC based C/C++/Fortran compiler system had its own Make and link. It got really ugly with the different commercial overlay linkers trying to dance inside of that 384K above the great 640K barrier.

Cross platform development was brutal. I remember not being able to fork over money fast enough for Watcom when their IDE allowed me to build for DOS-16, DOS-32, Windows 3.x, and OS/2.

[Image: Watcom IDE]

I even wrote some books on Zinc because it was one of the first real cross platform application frameworks on the market. Well, it was slightly more than a GUI and all of the others were just a GUI.

Cross Platform Make

So, yes, I understand the desire to have one cross platform make that will work everywhere. Sadly CMake already has a bunch of one-offs for Mac. It also kind of fails at library packaging, which is where I found myself lately.

I’ve gotten into Debian and RPM packaging after having been dragged there kicking and screaming. Now when I work on some new piece of OpenSource one of the first things I try to do is create packages for it. You don’t really understand how useful it is to have the packages until you create them. Knowing you have to create the packages influences your design. Knowing it has to work on Debian, RPM, and possibly Arch based systems means you stay in the center lanes.

Yes, I’m looking at you KDE developers!

One cannot install Kate on a non-KDE desktop without pulling in roughly two thirds of KDE, or so it seems from the list of additional dependencies.

SOVERSION

SOVERSION and SONAME were supposed to be a salve to help heal the wound that is library naming.

An Elephant is a mouse designed by committee.

We will skip discussing Mac since I don’t develop there. You can read the CMake documentation to learn of all the one-off things for Mac. In my fork of Scintilla to add CopperSpice support (called CsScintilla) I have a high level CMakeLists.txt containing this:

In the source level CMakeLists.txt (anyone else find using the same file name at two different directory levels a real problem?):

[Image: Target Properties]

What gets created in the build directory is this:

roland@roland-HP-EliteDesk-800-G2-SFF:~/sf_projects/csscintilla_build$ ls -al
total 3352
drwxrwxr-x  5 roland roland    4096 Aug  3 12:53 .
drwxrwxr-x 29 roland roland    4096 Aug  3 12:53 ..
-rw-rw-r--  1 roland roland   77538 Aug  3 12:53 build.ninja
-rw-rw-r--  1 roland roland   19372 Aug  3 12:53 CMakeCache.txt
drwxrwxr-x  4 roland roland    4096 Aug  3 12:53 CMakeFiles
-rw-rw-r--  1 roland roland    1734 Aug  3 12:53 cmake_install.cmake
-rw-r--r--  1 roland roland    3670 Aug  3 12:53 CPackConfig.cmake
-rw-r--r--  1 roland roland    4175 Aug  3 12:53 CPackSourceConfig.cmake
-rw-r--r--  1 roland roland    1683 Aug  3 12:53 csscintilla.spec
drwxrwxr-x  2 roland roland    4096 Aug  3 12:53 deb_build.etc
lrwxrwxrwx  1 roland roland      19 Aug  3 12:53 libCsScintilla.so -> libCsScintilla.so.5
-rwxrwxr-x  1 roland roland 3201664 Aug  3 12:53 libCsScintilla.so.1.0.1
lrwxrwxrwx  1 roland roland      23 Aug  3 12:53 libCsScintilla.so.5 -> libCsScintilla.so.1.0.1
-rw-rw-r--  1 roland roland   75432 Aug  3 12:53 .ninja_deps
-rw-rw-r--  1 roland roland    5551 Aug  3 12:53 .ninja_log
-rw-rw-r--  1 roland roland    2237 Aug  3 12:53 rules.ninja
drwxrwxr-x  3 roland roland    4096 Aug  3 12:53 src

Reality

I’m sure somewhere in someone’s head linking the .so to the .so having the SOVERSION at the end made sense. I can even understand linking the SOVERSION back to the VERSION (build version). A SOVERSION of 5 was deliberately chosen because Scintilla is currently at version 5.x.y. I could not marry my build VERSION to Scintilla, though, as that would make it impossible to fix a bug in just CsScintilla. Most of the examples you will find always use the Major version of the build number as the SOVERSION. They do this to hide the fact that SOVERSION is a bad design.

lrwxrwxrwx   1 root root      19 Jan  5  2020 libmidori-core.so -> libmidori-core.so.0
lrwxrwxrwx   1 root root      21 Jan  5  2020 libmidori-core.so.0 -> libmidori-core.so.0.6
-rw-r--r--   1 root root  358616 Jan  5  2020 libmidori-core.so.0.6
drwxr-xr-x   2 root root    4096 Nov 22  2020 libqmi
lrwxrwxrwx   1 root root      28 Mar 11  2020 libqscintilla2_qt5.so -> libqscintilla2_qt5.so.15.0.0
lrwxrwxrwx   1 root root      28 Mar 11  2020 libqscintilla2_qt5.so.15 -> libqscintilla2_qt5.so.15.0.0
lrwxrwxrwx   1 root root      28 Mar 11  2020 libqscintilla2_qt5.so.15.0 -> libqscintilla2_qt5.so.15.0.0
-rw-r--r--   1 root root 6521056 Mar 11  2020 libqscintilla2_qt5.so.15.0.0
lrwxrwxrwx   1 root root      16 Mar  5  2017 libregina.so.3 -> libregina.so.3.6
-rw-r--r--   1 root root  458808 Mar  5  2017 libregina.so.3.6
	

Please look at how Ubuntu names libraries. Just follow libqscintilla2. The .so links directly to the final target. Then, rather elegantly, and seemingly pointlessly, they stair step .so.Major to the final target. After that .so.Major.Minor gets linked there as well. It’s seemingly elegant. Kudos to whoever did it.

Sadly, that is not the norm.

lrwxrwxrwx  1 root root      12 Dec 16  2020 libm.so.6 -> libm-2.31.so
-rw-r--r--  1 root root  104396 Dec 16  2020 libnsl-2.31.so
lrwxrwxrwx  1 root root      14 Dec 16  2020 libnsl.so.1 -> libnsl-2.31.so
-rw-r--r--  1 root root   38804 Dec 16  2020 libnss_compat-2.31.so
lrwxrwxrwx  1 root root      21 Dec 16  2020 libnss_compat.so.2 -> libnss_compat-2.31.so
-rw-r--r--  1 root root   26264 Dec 16  2020 libnss_dns-2.31.so
lrwxrwxrwx  1 root root      18 Dec 16  2020 libnss_dns.so.2 -> libnss_dns-2.31.so
-rw-r--r--  1 root root   50920 Dec 16  2020 libnss_files-2.31.so
lrwxrwxrwx  1 root root      20 Dec 16  2020 libnss_files.so.2 -> libnss_files-2.31.so
-rw-r--r--  1 root root   22188 Dec 16  2020 libnss_hesiod-2.31.so
lrwxrwxrwx  1 root root      21 Dec 16  2020 libnss_hesiod.so.2 -> libnss_hesiod-2.31.so
-rw-r--r--  1 root root   55028 Dec 16  2020 libnss_nis-2.31.so
-rw-r--r--  1 root root   59096 Dec 16  2020 libnss_nisplus-2.31.so
lrwxrwxrwx  1 root root      22 Dec 16  2020 libnss_nisplus.so.2 -> libnss_nisplus-2.31.so
lrwxrwxrwx  1 root root      18 Dec 16  2020 libnss_nis.so.2 -> libnss_nis-2.31.so
-rw-r--r--  1 root root   13852 Dec 16  2020 libpcprofile.so
-rwxr-xr-x  1 root root 2454116 Dec 16  2020 libpthread-2.31.so
lrwxrwxrwx  1 root root      18 Dec 16  2020 libpthread.so.0 -> libpthread-2.31.so
-rw-r--r--  1 root root   88000 Dec 16  2020 libresolv-2.31.so
lrwxrwxrwx  1 root root      17 Dec 16  2020 libresolv.so.2 -> libresolv-2.31.so
-rw-r--r--  1 root root   38980 Dec 16  2020 librt-2.31.so
lrwxrwxrwx  1 root root      13 Dec 16  2020 librt.so.1 -> librt-2.31.so
-rw-r--r--  1 root root   22104 Dec 16  2020 libSegFault.so
lrwxrwxrwx  1 root root      19 May 29 02:49 libstdc++.so.6 -> libstdc++.so.6.0.28
-rw-r--r--  1 root root 1947492 May 29 02:49 libstdc++.so.6.0.28
-rw-r--r--  1 root root   42940 Dec 16  2020 libthread_db-1.0.so
lrwxrwxrwx  1 root root      19 Dec 16  2020 libthread_db.so.1 -> libthread_db-1.0.so
-rw-r--r--  1 root root   14000 Dec 16  2020 libutil-2.31.so
lrwxrwxrwx  1 root root      15 Dec 16  2020 libutil.so.1 -> libutil-2.31.so

Near the end of this list and scattered throughout it you will notice the .so is the final target. The .so.SOVERSION links back to the build. What really fries my bacon is the inconsistency when it comes to the placement of build version. I ASS-U-ME that 2.31 is the build and .so.1 is the SOVERSION, don’t you?
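To make the placement inconsistency concrete, here is a minimal Python sketch that sorts file names from the listings above into the two styles. The regular expressions and the labels "suffix", "infix", and "bare" are my own assumptions for illustration, not any official naming grammar.

```python
import re

# Version after ".so", e.g. libstdc++.so.6.0.28 or libnss_nis.so.2
SUFFIX_STYLE = re.compile(r"^(?P<name>lib.+?)\.so\.(?P<version>\d+(?:\.\d+)*)$")
# Version before ".so", e.g. libnsl-2.31.so or libthread_db-1.0.so
INFIX_STYLE = re.compile(r"^(?P<name>lib.+?)-(?P<version>\d+(?:\.\d+)*)\.so$")


def classify(filename):
    """Return (style, library name, version string or None) for one file name."""
    match = SUFFIX_STYLE.match(filename)
    if match:
        return ("suffix", match["name"], match["version"])
    match = INFIX_STYLE.match(filename)
    if match:
        return ("infix", match["name"], match["version"])
    if filename.endswith(".so"):
        # No version anywhere in the name, e.g. libpcprofile.so
        return ("bare", filename[: -len(".so")], None)
    return ("unknown", filename, None)


for sample in ("libstdc++.so.6.0.28", "libnsl-2.31.so", "libnsl.so.1", "libpcprofile.so"):
    print(sample, "->", classify(sample))
```

Anything that has to reason about these names (a packaging script, for example) ends up needing both patterns, which is the annoyance in a nutshell.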

CMake tried to straddle some unseen fence and didn’t do a good job. Perhaps the Linux version they started on had that wacky naming convention?

I can get behind a .so.SOVERSION pointing to a .so.VERSION because you can easily upgrade/downgrade by changing the link. What is really annoying in all of this anarchy is the inconsistency of placement.
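The upgrade/downgrade-by-relinking idea can be sketched in a few lines of Python. This is only an illustration under assumptions: libDemo and its version numbers are made up, the files are placeholders rather than real shared objects, and it runs in a throwaway temp directory on a system that supports symlinks.

```python
import os
import tempfile


def repoint(link_path, new_target):
    """Point an existing symlink at a different file in a single rename step."""
    tmp_link = link_path + ".tmp"
    os.symlink(new_target, tmp_link)
    # Renaming over the old link swaps it in one step, so the name never goes missing.
    os.replace(tmp_link, link_path)


with tempfile.TemporaryDirectory() as tmp:
    # Two hypothetical builds of the same library, same SOVERSION.
    for version in ("1.0.1", "1.0.2"):
        with open(os.path.join(tmp, f"libDemo.so.{version}"), "wb") as handle:
            handle.write(version.encode())

    link = os.path.join(tmp, "libDemo.so.5")
    os.symlink("libDemo.so.1.0.1", link)
    before = os.readlink(link)

    repoint(link, "libDemo.so.1.0.2")  # upgrade
    after_upgrade = os.readlink(link)

    repoint(link, "libDemo.so.1.0.1")  # downgrade
    after_downgrade = os.readlink(link)

    print(before, "->", after_upgrade, "->", after_downgrade)
```

Nothing that asks for libDemo.so.5 has to change; only the link moves.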

ABI vs. API

ABI = Application Binary Interface. This has to do with very low level binary things. When this changes it is generally not a subtle thing.

[Image: Original IBM PC XT, courtesy of vintage-computer.com]

[Image: AST 486SX/33]

From the days of the original IBM XT computer to the days well past the 486 based desktops, every compiler defaulted to using the original x86 instruction set. This kept software locked into horrific SEGMENT:OFFSET memory addressing and rather trapped us into the DOS 640K world. When a developer (or the compiler vendor) decided to switch compiler default options to compile for 32-bit instead of 16-bit this was an ABI change. Stuff compiled to use SEGMENT:OFFSET addressing would no longer work.

When it comes to ABI changes, the IT world tries to minimize the ones that break everything. We try to milk something for all it is worth. A really big ABI change in the Linux world happened with the move from libc5 to libc6. That caused a lot of pain and many rally cries to make the Linux kernel and the C library a single project.

API = Application Programming Interface.

In general, you can assume the API changes at least slightly every time you do a build. In C++ and many other languages the concept of optional parameters was introduced long ago. If you needed to add a parameter to some existing function or class method, you could add it to the end as an optional parameter. In this way you could add new behavior and capabilities without breaking old. That is how we are currently at version 2.34 of GNU libc yet still putting forth a libc6 API.
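A minimal sketch of the optional-parameter trick, written in Python for brevity. The function names and the tab_width parameter are hypothetical and not taken from any real library.

```python
def open_lexer_v1(language):
    """Original release: the tab width is hard-wired."""
    return {"language": language, "tab_width": 8}


def open_lexer_v2(language, tab_width=8):
    """Later release: new capability added as a trailing optional parameter."""
    return {"language": language, "tab_width": tab_width}


# A call site written against the first release still works unchanged.
old_style_call = open_lexer_v2("cpp")
# New code can opt in to the new behavior.
new_style_call = open_lexer_v2("cpp", tab_width=4)

print(old_style_call)
print(new_style_call)
```

This shows source-level compatibility only: existing call sites keep working without edits. Whether binaries built against the older version keep working in a compiled language is a separate ABI question.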

6 would be the SOVERSION. It’s the API.

2.34 would be the build VERSION.

The SOVERSION Linkage

The linkage is basically the gist of this rant.

[Image: CMake VERSION and SOVERSION in action]

The .so links to the .so.SOVERSION. Then the .so.SOVERSION links to the .so.VERSION. Unless the runtime environment is smart enough to track this down only once, I gotta believe there will be a wee bit of performance degradation.
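The two-hop chain is easy to reproduce and walk. The sketch below is Python, with a made-up libDemo standing in for the real library and a placeholder file instead of a real shared object; it follows the links one hop at a time.

```python
import os
import tempfile


def link_chain(path):
    """Follow symlinks one hop at a time and return every file name visited."""
    chain = [os.path.basename(path)]
    while os.path.islink(path):
        target = os.readlink(path)
        # A relative target is resolved against the directory holding the link.
        path = os.path.join(os.path.dirname(path), target)
        chain.append(os.path.basename(path))
    return chain


def build_chained_layout(directory, name="libDemo", soversion="5", version="1.0.1"):
    """Create .so -> .so.SOVERSION -> .so.VERSION, as in the build directory listing."""
    real_file = f"{name}.so.{version}"
    soversion_link = f"{name}.so.{soversion}"
    bare_link = f"{name}.so"
    with open(os.path.join(directory, real_file), "wb") as handle:
        handle.write(b"placeholder, not a real shared object")
    os.symlink(real_file, os.path.join(directory, soversion_link))
    os.symlink(soversion_link, os.path.join(directory, bare_link))
    return os.path.join(directory, bare_link)


with tempfile.TemporaryDirectory() as tmp:
    cmake_chain = link_chain(build_chained_layout(tmp))
    print(" -> ".join(cmake_chain))
```

This prints libDemo.so -> libDemo.so.5 -> libDemo.so.1.0.1. As far as I understand ELF linking, an already-built program asks the loader for the SONAME recorded in the binary (the .so.5 name here), so the bare .so link is normally consulted only at build time. That leaves one extra hop at run time, which the kernel resolves when the file is opened.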

One would think that Debian could at least force a naming and linkage standard.

Oh, all bets are off when you get to Windows and Mac.

By seasoned_geek

Roland Hughes started his IT career in the early 1980s. He quickly became a consultant and president of Logikal Solutions, a software consulting firm specializing in OpenVMS application and C++/Qt touchscreen/embedded Linux development. Early in his career he became involved in what is now called cross platform development. Given the dearth of useful books on the subject, he ventured into the world of professional author in 1995, writing the first of the "Zinc It!" book series for John Gordon Burke Publisher, Inc. A decade later he released a massive (nearly 800 pages) tome, "The Minimum You Need to Know to Be an OpenVMS Application Developer", which tried to encapsulate the essential skills gained over what was nearly a 20 year career at that point. From there "The Minimum You Need to Know" book series was born. Three years later he wrote his first novel, "Infinite Exposure", which got much notice from people involved in the banking and financial security worlds. Some of the attacks predicted in that book have since come to pass. While it was not originally intended to be a trilogy, it became the first book of "The Earth That Was" trilogy: "Infinite Exposure"; "Lesedi - The Greatest Lie Ever Told"; and "John Smith - Last Known Survivor of the Microsoft Wars". When he is not consulting, Roland Hughes posts about technology and sometimes politics on his blog. He also has regularly scheduled Sunday posts appearing on the Interesting Authors blog.