Your God is not everybody else's God. That's why there were four solo albums.--Dennis Miller
This article is about building your own Perl under Win32 with non-proprietary solutions. In other words, I won't be talking about building Perl with Microsoft's or Borland's development tools, nor about ActiveState's precompiled version of Perl. In fact, this article is my sales pitch to you, the Win32 developer, to build your own Perl from the source kit that you'll find on CPAN. I'll address various mechanisms for building Perl and discuss why you might or might not want to use them.
Why would you want to build your own Perl binary with non-proprietary solutions? Five reasons:
I'll show you two free development environments for Windows: Cygwin32 and Mingw32. Cygwin32 has a Unix emulation library that provides signals, sockets, fork(), and all the other things Windows lacks. Mingw32 is a set of Unix-like compiler tools (ld, ar, etc.) that doesn't attempt to emulate Unix at all (although it does fix a few of the things that Windows got wrong, like select()).
I'll show you how the two differ in their support for things like sockets, fork(), and permissions. Then I'll give step-by-step instructions for building Perl under each. Finally, I'll talk about building extensions--Perl modules that use C code under the hood.
In this article, Perl and the examples were built and tested on a 200MHz Pentium Pro with SCSI disks and 64MB of RAM running Windows NT Workstation 4.0.
Cygwin32 is a complete Unix emulation library, with a whole suite of Unix-like tools--essentially all the /bin and /usr/bin programs Unix folks are familiar with.
It comes with ports of the GNU development tools (gcc, ld, ar, and so on). There's also a replacement for gcc called egcs (pronounced "eggs"). egcs is far superior to Cygwin32's gcc for both performance and reliability.
Developers interact with Cygwin32 through the bash shell, although you can install others if you wish. All programs built with Cygwin32 rely on the CYGWIN.DLL library for Unix emulation, so this library must be distributed with any programs you compile.
Although Cygwin32 allows users to create libraries in the Unix style with ar and ranlib, one of the Cygwin goals was to have compilers and tools capable of working with Win32 object files written in Microsoft's Portable Executable (PE) format. The other goal was to support the Win32 API, specifically modifying the GNU binutils so that they could build DLLs (Dynamically Linked Libraries).
Since CYGWIN.DLL uses the Win32 API, it runs on all Win32 hosts--but with Windows 95/98 instead of NT your mileage will certainly vary. Given that Windows NT implements a POSIX subsystem and Windows 95 does not, you'll find that Windows 95/98 won't pass many of the tests that NT will. In my opinion, there's so much missing in Windows 95/98 that you should spend the $200 on Windows NT Workstation so that your software will build correctly.
In short, Cygwin32 is trying to emulate a Unix kernel. There is appropriate process control; processes can communicate with one another (fork() with parents and children); processes can trap signals, and so on.
The Minimalist GNU Win32 (Mingw32) package was written (according to its maintainer) "out of personal frustration with Cygwin32. I didn't need the Unix stuff, and thought that it was too slow and cumbersome." Mingw32 is nothing more than a set of header files and a library to let a GNU compiler (gcc or egcs) link against the standard Windows runtime libraries. As with Cygwin32, egcs is recommended.
Because Mingw32 doesn't have a Unix emulation layer, you don't need to distribute an emulation DLL with your compiled programs. However, you sacrifice the convenience of simply being able to compile any Unix-reliant source. When you build Perl, you'll be building a Windows port, not a Unix port. This means, for instance, that fork() won't work.
Like Cygwin32, Mingw32 can build standard Windows DLLs and link against PE libraries. Both Mingw32 and Cygwin32 are supported ways of building Perl on Windows. But each has its own traps and pitfalls, and we'll explore them now.
No matter how much a Win32 product resembles Unix, remember that you're still dealing with Win32, so building software still has its gotchas. This section describes them: filesystem permissions, files and paths, linefeeds, symbolic links, how Perl is invoked, select(), crypt(), and threading.
Filesystem Permissions. Unlike the Unix model of file permissions, Windows NT implements a security model which employs Access Control Lists (ACLs). Cygwin32 takes these ACL entries and works with them in a Unix-like fashion. Windows 95/98 doesn't even have ACLs, so Cygwin32 makes all files appear to be owned by a default user and group.
Neither Windows NT nor 95/98 can chown() files because neither operating system has any concept of UIDs under the Win32 API. You'll likely want to read the Cygwin documentation about security problems with shared memory and stored processes that can be modified by intruders.
Files and Paths. Cygwin32 supports both Win32 (C:\this\that) and POSIX-style (/this/that) paths. On a Unix machine, /floppy and /cdrom might be the mount points for two different devices; on a Win32 system these "mounts" (I use the term loosely) appear as a:\ and d:\. Current Win32 Perl ports allow you to open files on different devices like so:
#!perl
$floppy = 'a:/';
$cdrom = 'd:/';
opendir(FLOP, $floppy) or die("Can't open floppy $floppy: $!");
opendir(CD, $cdrom) or die("Can't open cdrom $cdrom: $!");
Cygwin allows you to mount (strictly speaking, 'map') these devices under a slash partition. A:\ could get mounted on /a and D:\ could get mounted on /d. Cygwin32's mount command works just like its Unix sibling:
mount <win32:deviceletter> </unix_mount_point>
mount D:/ /cdrom
Mounting devices this way makes your code more portable in an environment where code is shared by both Unix and NT machines. With mounted devices, you can read a directory just like you would on a Unix machine:
#!perl
$floppy = '/floppy';
opendir(FLOP, $floppy) or die("Can't open $floppy: $!");
Wow! So this is just like my Unix machine! Well, not exactly. You've probably noticed that Win32 can preserve case when it wants to; you can save a file under Windows NT as FoOBaR.TxT and Windows NT will tell you that there's a file named FoOBaR.TxT. But you cannot write another file in the same directory called FOOBAR.TXT, foobar.txt, or FooBar.txt, because Win32 filesystems remain case insensitive. The Cygwin32 developers believe that few Unix programs rely on case distinction, and chose not to add the overhead that case-sensitivity would require.
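If you're moving a source tree from Unix to Win32, it can help to find out ahead of time which filenames will collide. Here's a minimal sketch of that check; the case_collisions helper and the sample filenames are made up for illustration, and it only compares names within a single list (say, one directory's worth).

```perl
#!perl
use strict;

# Hypothetical helper: given a list of filenames, return groups of names
# that differ only in case and would therefore clash on a Win32 filesystem.
sub case_collisions {
    my %seen;
    push @{ $seen{lc($_)} }, $_ for @_;
    return grep { @$_ > 1 } map { $seen{$_} } sort keys %seen;
}

# Sample names, invented for this example.
for my $group (case_collisions(qw(FoOBaR.TxT foobar.txt README Makefile makefile))) {
    print "Clash: @$group\n";
}
```

Each group that comes back is a set of names that can coexist on Unix but not in one Win32 directory.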
Linefeeds. By default, Cygwin32 uses Win32 linefeeds and end-of-line terminators. If you're going to use any fopen(3) or open(2) calls in your programs (that's not Perl's open(), but the system's native open()), you must use the associated binary flags (b and O_BINARY respectively). If you don't want to mess with every piece of software that opens files, you can also mount your working directory with the -b flag. One way or another you'll need to address this issue; if you don't, you'll experience undesirable behavior like missing characters and truncated files when opened for writing. Programs like gdbm won't build or work correctly; nor will Perl's filehandles or modules like File::Copy.
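Inside a Perl script, the counterpart to those C flags is binmode(). The sketch below writes a few awkward bytes (a CR/LF pair and a Ctrl-Z) and reads them back with binmode() set on both handles. The roundtrip sub and the scratch filename are invented for the example; on a Unix machine binmode() changes nothing, so the check only becomes interesting on a text-mode Win32 build.

```perl
#!perl
use strict;

# Write raw bytes to a file and read them back, calling binmode() on both
# handles so that no CR/LF translation is applied on text-mode builds.
sub roundtrip {
    my ($file, $data) = @_;
    open(OUT, ">$file") or die("Can't write $file: $!");
    binmode(OUT);
    print OUT $data;
    close(OUT) or die("Can't close $file: $!");

    open(IN, "<$file") or die("Can't read $file: $!");
    binmode(IN);
    local $/;                 # slurp the whole file in one read
    my $copy = <IN>;
    close(IN);
    return $copy;
}

my $file = "binmode-test.$$";   # scratch file name, made up for this example
my $data = "line one\r\nline two\n\x1a trailing";
my $copy = roundtrip($file, $data);
unlink($file);
print $copy eq $data ? "bytes preserved\n" : "bytes changed\n";
```

If you comment out the two binmode() calls and the second message appears, you're looking at the text-mode problem described above.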
Symbolic links. A symbolic link is an entry in your filesystem that isn't a file at all, but really a pointer to another file elsewhere. In the case of CYGWIN.DLL, a file gets generated with a "magic cookie" or "magic header" when you create a symbolic link. It isn't really all that "magic" if you think about it--under Cygwin32, a symbolic link is nothing more than a file with another file listed in its header. If you're using Windows 95/98 (or any FAT file system), you'll be disappointed to note that the symbolic link command won't fail; it will just copy the original file wherever you've created the link. Yes, this stinks--but it's true. Symbolic links under Windows 95/98 mean loads of duplicate files.
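If your script cares whether it got a real link or a silent copy, it can ask afterward. This is only a sketch: link_kind and the scratch filenames are made up, and I'm assuming that the copying behavior described above also shows up through Perl's symlink() and the -l file test. The eval is there because symlink() is a fatal error on perls that don't implement it.

```perl
#!perl
use strict;

# Hypothetical helper: report whether $name is a symbolic link, an
# ordinary file (possibly a copy made in place of a link), or absent.
sub link_kind {
    my ($name) = @_;
    return 'link' if -l $name;
    return 'copy' if -e $name;
    return 'missing';
}

my $target = "symlink-target.$$";   # scratch names, made up for this example
my $link   = "symlink-link.$$";
open(T, ">$target") or die("Can't write $target: $!");
print T "hello\n";
close(T);

# symlink() dies where it is unimplemented, so trap that case.
my $ok = eval { symlink($target, $link) };
print "symlink() ", ($ok ? "succeeded" : "failed or is unimplemented"), "\n";
print "$link is: ", link_kind($link), "\n";
unlink($link, $target);
```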
Invoking Perl. If you're like me, you've been spending most of your time in a Unix environment. You've probably written some shell scripts, and if you're reading this article, you're also likely to have programmed in Perl. If this is the case, you're familiar with Unix's "shebang" path notion-- #!/use/this/program/to/execute/the/rest/of/the/file.
Cygwin32 supports shebang paths for programs executed in the Cygwin32 environment. This means that if you run your program from bash, the Cygwin32 Unix emulator will understand the #! notation, but if you run your program from NT's cmd, 95's COMMAND.COM, the 4DOS shell, or from the desktop or Explorer, Cygwin32 is not involved and your Perl script will not be recognized.
cmd is also impaired when it comes to quoting and writing lengthy one-liners. Under cmd, one-liners like this won't work:
perl -MTk -e '$mw = MainWindow->new(); \
$mw->Button(-text=> qw(Hi), -cmd => \
sub {exit})->pack(); \
MainLoop;'
Sadly, any simplistic one-liner using single quotes will fail under cmd:
perl -e 'foreach $number (1..100) { print("$number\n"); }'
cmd doesn't support single quotes at all! You'll need to use double quotes. Even then, this version of the one-liner above is broken:
perl -e "foreach $number (1..100) { print("$number\n"); }"
You'll need to escape the quotes inside of print:
perl -e "foreach $number (1..100) { print(\"$number\n\"); }"
So, if you plan on building Perl with Mingw32 so that it can be executed from cmd, please be prepared to flog yourself. Or, you can wrap your Perl script in a batch file with the pl2bat utility, as documented in your Perl distribution's README.win32.
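If you generate command lines from another script, the escaping shown above is easy to automate. Here's a rough sketch; cmd_quote is a name I made up, and it only backslash-escapes double quotes. It makes no attempt to deal with %VAR% expansion or with backslashes that sit directly in front of a quote.

```perl
#!perl
use strict;

# Hypothetical helper: wrap a Perl one-liner in double quotes for cmd,
# backslash-escaping any double quotes inside it.
sub cmd_quote {
    my ($code) = @_;
    $code =~ s/"/\\"/g;
    return qq{"$code"};
}

# Single quotes here keep $number and \n literal in the generated text.
my $code = 'foreach $number (1..100) { print("$number\n"); }';
print "perl -e ", cmd_quote($code), "\n";
```

The line it prints is the same escaped one-liner shown earlier.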
Select. Cygnus's developers were mortified when they realized that Win32's select() only supported socket handles. In other words, Win32 decided to break the Unix dictum of Everything Is A File; Cygnus unbroke this implementation. They needed to rewrite select() to support a number of file descriptors, including sockets and pipes; now select() works fine under Cygwin32. Mingw32 also has a working select().
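To see whether select() on your build handles more than sockets, you can try it on a pipe. The following is a small sketch that assumes a platform where select() works on pipe handles, as the text says Cygwin32's does; the readable sub is invented for the example.

```perl
#!perl
use strict;

# Return true if the filehandle has data waiting within $timeout seconds.
sub readable {
    my ($fh, $timeout) = @_;
    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;     # set the bit for this descriptor
    my $rout = $rin;
    my $nfound = select($rout, undef, undef, $timeout);
    return $nfound > 0;
}

pipe(READER, WRITER) or die("Can't create pipe: $!");
print "before write: ", (readable(\*READER, 0) ? "ready" : "not ready"), "\n";
syswrite(WRITER, "x", 1);
print "after write: ", (readable(\*READER, 1) ? "ready" : "not ready"), "\n";
```

On a select() that only understands sockets, expect the call to fail rather than report the pipe as ready.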
Crypt. Mingw32 and Cygwin32 are missing crypt() because of US government restrictions on the export of cryptographic software. If you need crypt(), obtain the libdes library listed at the beginning of this article. Versions newer than 3.06 will not work because some of the functions in fcrypt.c have changed.
You can build Perl without des_fcrypt(), but Perl's crypt builtin will then fail with a "crypt() not implemented due to excessive paranoia" error when crypt is called at run-time.
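Because that failure happens at run-time, a script can test for it before relying on crypt(). A minimal sketch, with have_crypt as a made-up name: the eval traps the fatal error, and an empty or undefined result is treated as "not available" too.

```perl
#!perl
use strict;

# Hypothetical helper: return 1 if this perl's crypt() is usable, 0 if
# calling it dies or returns nothing.
sub have_crypt {
    my $hash = eval { crypt('password', 'ab') };
    return (defined $hash && length $hash) ? 1 : 0;
}

print have_crypt()
    ? "crypt() is available\n"
    : "crypt() is not available in this perl\n";
```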
Threading. Mingw32 is thread-safe; Cygwin32 is not. Cygwin32's libc and libm aren't reentrant, which means that threading won't work--although it should in the future. Cygwin32 also locks memory shared by multiple processes, which can lead to race conditions in threaded programs. Mingw32 is thread-safe because it's using Microsoft libraries capable of threading; you cannot use this threading if you choose to use other libraries that aren't thread-safe.
Cygnus. You'll want a beefy machine with a good processor, lots of RAM, and SCSI. As noted above, Cygwin32 is incredibly effective but rather slow; the giveaway was when Configure took 35 minutes to run with the -ders flags. You should download the distribution and egcs replacement from the sites listed at the beginning of this article.
If you've already installed Cygwin32, you're now ready to build Perl. Be sure to mount the directory where you're keeping your Perl source in binary mode.
You should also be aware that some of the Cygwin32 socket calls are a bit flaky. I've been able to build Perl with vanilla Win32 sockets (winsock) instead of Cygwin32's Unix equivalents. According to the Cygwin32 FAQ:
To use the vanilla Win32 winsock, you just need to #define Win32_Winsock and #include "windows.h" at the top of your source file(s). You'll also want to add -lwsock32 to the compiler's command line so you link against libwsock32.a.
This means that you'll have to include windows.h in the appropriate .h file in your Perl distribution and add -lwsock32 to the list of libraries that you'll add when you run Configure.
If you attempt to run Configure interactively (that is, without a -d flag), you'll find that sh.exe skips every other line. This is a known Cygwin32 bug and the remedy is to use Sergei's "coolview" patch set, listed on the "Related Links" page of the Cygnus website. And if you're using Cygwin b20 or newer, this problem should have disappeared entirely.
I've found that the best (and fastest) way to get around a slow Configure process is to copy hints/cygwin.sh to $PERLSOURCEDIR/config.sh. Edit config.sh and plug in values to match your system's configuration; when you're done, run Configure with the -ders options.
Once Configure finishes, execute a make depend and then a make. Please also run a make test; under Windows NT, you'll pass about 90-95% of the tests. Then make install.
Mingw32. I've had the most success with Mingw32 (98% of the tests passed with Perl 5.005_02) and egcs-1.1. The Perl build, test, and install took about a half hour (26 minutes longer than it took under FreeBSD-2.2.7). Although the egcs compilers are still considered experimental, there are plenty of anecdotes about people's experiences in the perl5-porters mailing list archives. Mumit Khan, egcs developer, has been quite active in these discussions.
To begin the process, retrieve and install Mingw32. Don't use any Cygwin-based tools (like tar) to extract these distributions because of the end-of-line problems mentioned above; you need WinZip or Pkunzip. You shouldn't need to reboot your machine: just modify your PATH and hit the Apply button under Control Panel/System/Environment. Make sure that you restart your DOS shell before proceeding. Run mingw32.bat; this sets up your GCC_EXEC_PREFIX, necessary for things to build (see the Mingw32 README for details). It is suggested that you write a simple C program to ensure that gcc is working correctly before trying to compile Perl.
Under NT, make sure to run cmd from your DOS prompt. You might have weird results if you don't use cmd, and it's quite likely that tests will fail (for no reason) even if Perl seems to build correctly. Change directories into C:\base\perldir\win32 and edit makefile.mk.
CCHOME should point to your compiler, which will initially be gcc; you should have installed egcs-1.1 in C:\EGCS-1.1, so CCHOME should be set to C:\EGCS-1.1. Make sure that INC (or INCLUDE) points to $CCHOME\include and LIB (or LIBS) points to $CCHOME\lib. If you don't do this, you will get a slew of errors early in the build process.
You can build a threaded version of 5.005_02 under Mingw32, but you can't currently enable both threading and PERL_OBJECT, which is specific to Perl for Win32. You can get only one of them in a Perl interpreter.
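Once the build is installed, you can ask it whether threading made it in. This sketch reads the usethreads entry from the Config module; config_flag is a made-up helper, and I'm assuming an enabled option is recorded with the value define. I haven't shown a check for PERL_OBJECT because I'm not certain which Config entry, if any, records it.

```perl
#!perl
use strict;
use Config;

# Hypothetical helper: true if the named Config entry is set to 'define'.
sub config_flag {
    my ($key) = @_;
    return (defined $Config{$key} && $Config{$key} eq 'define') ? 1 : 0;
}

print "threads: ", (config_flag('usethreads') ? "enabled" : "disabled"), "\n";
```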
If you choose to build with crypt() support, point CRYPT_SRC at the file that implements des_fcrypt()--fcrypt.c. Or, if you've already built libdes, you can link against CRYPT_LIB. Frankly, it's pretty easy to just use CRYPT_SRC.
Type dmake to build the software, dmake test to run the test suite, and dmake install once you're convinced that everything is okay. When all is done, you should have a functional perl.
You'll need to include Perl's location in your PATH, as shown in README.win32:
set PATH=c:\perl\5.005\bin;c:\perl\5.005\bin\MSWin32-x86;%PATH%
One of the advantages of building your own Perl is the ability to work with your own compiler--and therefore XS, so you can create Perl hooks into speedy compiled C code. When you built your own Perl, a file called Config.pm was written containing all of the information about your system and development environment (compiler type, libraries, paths, and so on). When you build modules, particularly those which call external code via XS, you'll be working with make and your compiler. Like your Unix compatriots, you should be able to do this:
C:\> perl Makefile.PL
C:\> dmake
C:\> dmake test
C:\> dmake install
And now things should work.
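If things don't work, the first thing to check is what Config.pm recorded about your build, since that's what Makefile.PL relies on. Here's a short sketch that prints a handful of entries; build_summary is a made-up name, and the keys are ones I picked as likely to matter. Your values will differ from mine.

```perl
#!perl
use strict;
use Config;

# Hypothetical helper: return "key=value" strings for a few build settings.
sub build_summary {
    my @keys = qw(osname archname cc ccflags libs make);
    return map {
        "$_=" . (defined $Config{$_} ? $Config{$_} : '(undefined)')
    } @keys;
}

print "$_\n" for build_summary();
```

If cc or make names a tool you don't have on your PATH, extension builds will fail before they ever reach your C code.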
Here's a list of popular modules that I've gotten working with little or no hacking under Cygwin32 and Mingw32:
Module | Cygwin32 or Mingw32? |
---|---|
Archive::Tar | both |
CPAN | Cygwin32 only--needs gzip |
Compress::Zlib | both |
Data::Dumper | both |
IO:: | both |
LWP | both |
MD5 | both |
MIME::Base64 | both |
Perl/Tk | Mingw32 only |
Term::ReadKey | both |
Term::ReadLine | both |
WWW::Search | both |
libnet | both |
Nathan Patwardhan is a consultant with Collective Technologies where he works with Unix, Perl, and occasionally C. Nathan has published two Perl books with O'Reilly and Associates, including the recent release: Perl in a Nutshell. Nathan wishes that Flavor Flav had his own flavor of Unix, or at very least a timeserver.