Friday, December 22, 2006

Linking 101

For many software developers, this post will be Linking 101. Recently, though, I was looking into an issue on Windows (yes, I know), where an application developed with Visual Studio 2005 was linking against a library developed with Visual C++ 6. I needed to understand a little more about how linking works on Windows. Many of these bullet points are open secrets. But if you have lived too long in the Unix world, the Windows secrets will be a mystery, and vice versa. So here's a short summary:

Windows
  • .dll is the extension for a Windows dynamic library. DLL Hell exists and even has a Wikipedia entry.
  • .lib is the extension for both static libraries and DLL wrappers. A DLL wrapper, or "import" LIB file, is what you link against. For each function the DLL exports, the LIB file contains a stub which will load and call into the DLL. You need these .lib files only when linking, not when deploying.
    • If a DLL is missing on Windows and it is loaded on demand (delay-loaded or opened explicitly), you only get hit at run time, when the module cannot be found. On Unix, the application typically will not start up. This has pros and cons. Pro: if you do not want to use a particular feature of the application that requires an expensive 3rd party library, you can still run the application. Con: if your application was doing a lot of work and then aborts, you will slap the machine a couple of times.
  • Convention: files ending in z.lib or zd.lib are static libraries (d for debug).
  • C Run Time (CRT) libraries depend on the development environment. If your program links against more than one CRT, you have to be very careful. Since memory management is (slightly) different between CRTs, pairing malloc() from one CRT with free() from another can corrupt your heap. Unfortunately, the bug will show up in weird and unrelated ways. The takeaway: avoid linking against two different CRTs. But if you have to, keep your program modules separate, so that memory is always freed by the module that allocated it. For reference:
    • msvcrt.dll is used by VC6.0
    • msvcr70.dll for VS .NET
    • msvcr71.dll for VS .NET 2003
    • msvcr80.dll for VS 2005
    • with "d" added to the name for debug CRTs
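
To make the "keep your modules separate" advice concrete, here is a minimal sketch of one common way to do it: the module that allocates memory also exports the function that frees it, so malloc() and free() always come from the same CRT. The names mod_strdup and mod_free are hypothetical, made up for this illustration, and the sketch assumes both functions are compiled into the same DLL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical module API. Both functions are assumed to live in the
   same DLL, so they share one CRT and therefore one heap. */

/* Returns a heap copy of text, allocated with this module's malloc(). */
char *mod_strdup(const char *text)
{
    size_t len;
    char *copy;

    assert(text != NULL);
    len = strlen(text) + 1;          /* include the terminating NUL */
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, text, len);
    return copy;                     /* NULL if the allocation failed */
}

/* Releases memory obtained from mod_strdup(), using the matching free(). */
void mod_free(char *ptr)
{
    free(ptr);
}
```

A caller built against a different CRT would hand the pointer back through mod_free() rather than calling its own free() on it.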
Unix
  • Static libraries (archives) are created and updated by the ar (archive) utility. Convention dictates that static libraries have a ".a" extension on their filename.
  • Dynamically linked libraries are created by the link editor, ld. The conventional file extension for a dynamic library is ".so" (shared object).
  • You tell the compiler to link with libpthread.so by giving the option -lpthread (the lib prefix and the extension are dropped).
  • Libraries are found through LD_LIBRARY_PATH (consulted by the run-time loader, and on some systems by the link editor as well) or through the compiler option -Lpathname.
  • The nm utility comes to the rescue when you run into undefined symbols: it lists the symbols that an object file or library defines and references.
  • The order of statically linked libraries on the compiler command line is significant (i.e., they should be listed after your own code). I believe gcc can also be instructed to make several passes through the static library arguments, so that undefined symbols defined in previously listed static libraries are picked up.
  • Knowing which library files to link against is sometimes a mystery. Take a look at the #includes and guess the library name. Often the man pages might mention something as well.
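
To see why the order of static libraries matters, here is a toy model in C of a linker that walks its archives once, left to right, and only pulls in an archive if it defines something that is currently undefined. This is a deliberately simplified sketch, not how any real linker is implemented: the names toy_lib and toy_link are invented for this example, and each archive is treated as a single unit instead of a collection of object files.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 16

/* A toy archive: what it defines and what it needs (NULL-terminated). */
struct toy_lib {
    const char *defines[4];
    const char *needs[4];
};

struct sym_set {
    const char *names[MAX_SYMS];
    size_t count;
};

static int set_has(const struct sym_set *s, const char *name)
{
    for (size_t i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0)
            return 1;
    return 0;
}

static void set_add(struct sym_set *s, const char *name)
{
    if (set_has(s, name))
        return;
    assert(s->count < MAX_SYMS);     /* toy model: fixed capacity */
    s->names[s->count++] = name;
}

static void set_remove(struct sym_set *s, const char *name)
{
    for (size_t i = 0; i < s->count; i++) {
        if (strcmp(s->names[i], name) == 0) {
            s->names[i] = s->names[--s->count];  /* swap in the last entry */
            return;
        }
    }
}

/* One left-to-right pass; returns the number of symbols left undefined. */
size_t toy_link(const char *const *program_needs,
                const struct toy_lib *libs, size_t nlibs)
{
    struct sym_set undefined = { {0}, 0 };
    struct sym_set defined = { {0}, 0 };

    for (size_t i = 0; program_needs[i] != NULL; i++)
        set_add(&undefined, program_needs[i]);

    for (size_t l = 0; l < nlibs; l++) {
        int wanted = 0;
        for (size_t d = 0; libs[l].defines[d] != NULL; d++)
            if (set_has(&undefined, libs[l].defines[d]))
                wanted = 1;
        if (!wanted)
            continue;   /* nothing needed from it yet, so it is skipped */

        for (size_t d = 0; libs[l].defines[d] != NULL; d++) {
            set_remove(&undefined, libs[l].defines[d]);
            set_add(&defined, libs[l].defines[d]);
        }
        /* The archive's own references become new undefined symbols. */
        for (size_t n = 0; libs[l].needs[n] != NULL; n++)
            if (!set_has(&defined, libs[l].needs[n]))
                set_add(&undefined, libs[l].needs[n]);
    }
    return undefined.count;
}
```

Take a hypothetical program that needs draw, a graphics archive that defines draw but needs fast_sqrt, and a math archive that defines fast_sqrt. Listing the graphics archive first leaves nothing undefined. Listing the math archive first leaves fast_sqrt unresolved, because that archive was already skipped by the time anyone asked for the symbol.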
An excellent book on gotchas, weird bugs and behaviors is Expert C Programming: Deep C Secrets, by Peter van der Linden. It is not a C programming textbook. It is a collection of C programming experiences (aka the stuff for which experienced programmers get paid the big bucks). I happened to work on a few projects (USB/CD-RW) with Peter when he was leading the I/O team in the Sun workstation group. He is a very interesting guy, who has written many programming books, from C to Java and Linux. Here's an interview with Peter from ITConversations.
