Apparently, from what I’ve been reading today, Python is COM compliant, but Perl is not. What is COM and why do I need it? Did I just learn the wrong scripting language? How many am I going to have to learn?

COM = Component Object Model.

It’s a very long and sordid story, but COM (which was originally cooked up by people at DEC) was for a long time the preferred way to develop component oriented software for Microsoft Windows operating systems.

A COM API presents a binary interface, so you can create bindings/wrappers to a COM interface from multiple languages. The majority of people doing COM development on Windows use C++ or VB, but lots of other languages allow you to call methods in COM interfaces.

COM was followed by COM+ sometime around the release of Windows 2000. COM+ added some useful extensions to COM, but kind of got lost in the .NET frenzy that soon followed it. As it turns out, many of the .NET interfaces don’t do much more than wrap existing COM and COM+ interfaces.

If you want to build Windows applications, you can always write directly to the Win32 APIs. This can be very ugly and painful and is guaranteed to shorten your life, especially if you are not programming in C or C++. COM makes life easier, with usually only a minor performance penalty. Assuming you are very careful and use it in exactly the right way. I’ve seen a lot of really bad, really slow COM code in my day.

You can call COM interfaces from Python because Mark Hammond and others built a set of Python libraries as an add-on to the core Python release. You need to download these additional libraries if you want COM support. I believe that ActiveState’s distribution of Python for Windows includes them.
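For what it's worth, here is a rough sketch of what calling a COM interface from Python looks like with those add-on libraries (pywin32). It assumes Windows, an installed copy of Excel, and the `win32com` package; the helper names `open_excel` and `write_cell` are made up for this post and are not part of any library.

```python
def open_excel():
    """Start Excel through COM. Needs Windows, Excel, and pywin32."""
    # win32com comes from the pywin32 add-on, not the standard library,
    # so it is imported lazily here.
    import win32com.client
    return win32com.client.Dispatch("Excel.Application")


def write_cell(excel, row, col, value):
    """Add a new workbook and put `value` into one cell of its first sheet.

    `excel` is the Excel Application object returned by open_excel().
    Row and column numbers start at 1 in the Excel object model.
    """
    workbook = excel.Workbooks.Add()
    sheet = workbook.Worksheets(1)
    sheet.Cells(row, col).Value = value
    return workbook


# Typical use on a Windows box (left as comments so the file imports anywhere):
#   excel = open_excel()
#   excel.Visible = True
#   write_cell(excel, 1, 1, "Hello from Python")
```

The same pattern should apply to any other application that exposes a COM automation interface; only the program ID string and the object model change.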

Mark and Andy Robinson wrote a fine book called Python Programming on Win32. I have a copy of it and I highly recommend it if you need to access Windows specific features from Python. This book contains a couple chapters on COM programming from Python.

The good news, assuming you consider programming Windows specific code to be a good thing, is that people are hard at work integrating Python with the .NET Common Language Runtime. You will then have direct access to all the .NET APIs from Python. Theoretically, that will put Python on the same footing as the other languages that work with .NET (although the reality that Microsoft doesn’t want to talk about is that if you aren’t using C#, you are at a disadvantage with respect to both features and performance). Still, it’s better than a sharp stick in the eye, a.k.a., Win32 and COM.

There are also similar projects ongoing to integrate Perl with .NET.

There are in fact several independent projects working on Python and .NET integration. Early results were decidedly bad. Same for the Perl experiments. Interpreted languages and the CLR don’t mix well. I’m not sure if they have made any better progress since I last checked in on them six months or so ago.

[quote]Did I just learn the wrong scripting language?

IMHO, no. Python is awesome.

[quote]How many am I going to have to learn?

It depends on what you need to do and who is paying you. If you want to be flexible on the job market, you really need to learn a couple languages. Sometimes the APIs you need to access are available only from certain languages, or even just one.

Also, different tasks are often suited to different languages. Perl is great for string processing, among many other things. If you can get past the ugly, ugly syntax, it’s a fine all purpose tool. Python is also an excellent all purpose tool, but with better syntax.

Some people consider Ruby to be even better than Python, but you will find less documentation, less sample code, and fewer fellow programmers to ask questions to. JavaScript is great for client side browser scripting. PHP is goodness when you need to do a relatively even mix of server-side data access and HTML generation, but you don’t want to (or aren’t allowed to) use Java servlets and JSPs.

After you learn a couple languages, learning new ones becomes easier. You will find most of the same programming primitives (variables, loops, method calls, etc.) in each language. The hard part is keeping the syntax straight. I’m in my 15th year of professional programming, and I would estimate that Python is somewhere around the 20th language I have needed to learn. That sounds like a lot, but many times I needed to learn only enough to read other people’s code and figure out what was going on with it. Also, I have almost completely forgotten the first ten or so. Except for C.


So what is .NET?

I saw some interesting demos of Perl scripts that change values in Excel spreadsheets and fonts of Word Docs at . These use those Perl Win32 OLE libraries I guess, but I don’t know what more COM and .NET would do for me.

What I’m up to is wondering if what I’ve learned with Perl to do CGI forms for MySQL queries can be applied to programming the new revision of Arc/Info from ESRI, Arc 8. But since Python does ArcInfo scripting, has COM, has .NET, has CGI and DBI libraries, maybe I’d better go back and start learning Python.

ESRI has distressed many of their users by moving from UNIX to NT (XP) with their latest release, Arc 8, and by dropping their own scripting language for VB. This is the third platform change for me since I started out with Arc/Info 5 on a Prime Computer platform in 1989.

I’ve been trying to learn how to program Arc 8 with VB, but have been getting discouraged. I did some searches of on Perl, and found some user papers from the 97-98 era where they used Perl CGI forms to collect user input and send them to ArcView or Arcplot to make the map, then display the output.

Searches on Python had lots of hits straight from ESRI announcements about Arc 9 and scripting languages, often in a phrase like “programmable with many popular scripting languages, like Python”, but never even that much of a mention of Perl. Interestingly, there is even a Python extension to AV 3 where the authors’ summary says ESRI won’t let them post open source software on ArcScripts, so only a link is published there. The ArcScripts search form pull-down for languages doesn’t have Perl or Python. They’re going to have to modify their page and policy if Python is going to be used extensively in Arc 9. But it looks now like I should have learned Python: it is COM compliant, has .NET extensions, and has CGI libraries, and it is in ESRI’s sights for Arc 9.

In short, I think the answer is yes. It appears that the Win32::OLE Perl package will help you script ArcInfo using Perl.

It’s too bad ESRI abandoned their Unix version for a Windows only version. I remember looking into ArcInfo long ago when I was writing software on SGI and Sun boxes for an ocean bottom mapping application using high resolution sonar.

From browsing the ESRI website, it looks like ESRI decided to make their product scriptable by adding a COM interface to ArcInfo, ArcView, and ArcEditor. This means you can script those apps using any programming language that allows you to call COM interfaces. VisualBasic is probably the most commonly used language for COM programming, but you can also use Visual C++, Python, and others.

I’m not sure if this is what ESRI has in mind, but Python is easy to embed as the scripting environment for another application. This makes a lot of sense if you need an OS platform independent scripting interface. JavaScript is also relatively easy to embed. I don’t know about Perl.

OLE Automation and Its Relationship to COM
Before Visual Basic 4, you couldn’t use VB to directly call a standard COM interface. You could use VB only to call an IDispatch interface, which is a special kind of COM interface. If a COM object implements the IDispatch interface, it is said to provide an OLE Automation interface.

The problem was that you could not use VB (nor most scripting languages) to access C pointers. A raw COM interface is a vtable, an array of function pointers that the caller reaches through a pointer. The special IDispatch interface provided the information that VB code needed, without requiring access to pointers. I fortunately haven’t had to deal with COM and OLE Automation for several years, so some of the details have slipped away from my memory. I think the above is reasonably accurate, though.

Automation interfaces are typically slower, since the IDispatch interface adds an extra layer of indirection.
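To make the extra layer of indirection concrete, here is a toy model in plain Python. It is not real COM: `Calculator`, `VTABLE`, and `DispatchWrapper` are invented for illustration, and the two wrapper methods only loosely mirror what IDispatch does (look up an ID for a name, then invoke by that ID).

```python
class Calculator:
    """Stand-in for a COM object. Purely illustrative, not real COM."""

    def add(self, a, b):
        return a + b

    def negate(self, a):
        return -a


# "vtable" style: a fixed table of functions. The caller must already know
# which slot holds which method, the way a compiled C++ client does.
VTABLE = (Calculator.add, Calculator.negate)
ADD_SLOT, NEGATE_SLOT = 0, 1


class DispatchWrapper:
    """Toy version of an IDispatch-style interface.

    The caller only needs method names as strings. Every call costs a
    name lookup plus a generic invoke step before the real method runs.
    """

    def __init__(self, target):
        self._target = target
        self._ids = {"add": ADD_SLOT, "negate": NEGATE_SLOT}

    def get_id_of_name(self, name):
        # Step 1: translate a method name into a numeric ID at run time.
        return self._ids[name]

    def invoke(self, dispatch_id, *args):
        # Step 2: call through the ID with a generic argument list.
        return VTABLE[dispatch_id](self._target, *args)


calc = Calculator()

# Direct call through the table: one hop.
direct = VTABLE[ADD_SLOT](calc, 2, 3)

# Late-bound call: name lookup, then invoke, then the real method.
disp = DispatchWrapper(calc)
late_bound = disp.invoke(disp.get_id_of_name("add"), 2, 3)

print(direct, late_bound)  # both are 5
```

A scripting language only ever needs the string "add" and a generic invoke call, which is why the late-bound route works without pointers, and also why it is slower.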

Microsoft’s website has lots of info on .NET here.

Like OLE, ActiveX, and many other Microsoft marketing terms, .NET is a term used to cover a wide range of technologies. To most developers, .NET is the programming model Microsoft would like you to convert to. That is, they would prefer that most developers stop trying to write applications that use low level Win32 and COM calls, since a large number of these Windows apps leak memory like a sieve and crash frequently. And when I say “they”, I specifically mean the people working with Microsoft Consulting Services that I have met with.

While some new Microsoft apps will be written on top of .NET, the majority will be written using the low level interfaces. Microsoft created .NET in part to keep IT developers from getting into so much trouble creating buggy apps with the older, difficult to use correctly, programming interfaces.

One thing that .NET provides is a Common Language Runtime. The CLR is very much like a Java Virtual Machine. In fact, much of .NET is modeled on Java and J2EE, except the part of Java that doesn’t lock you into the claws of a convicted monopolist. That was a purely Microsoft developed innovation.

COM vs. .NET
So why use .NET if you’ve got COM interfaces? Well, the .NET runtime manages memory allocation for you (very much like the JVM does). Memory allocation errors are very possibly the greatest cause of unpredictable instability in applications. Memory allocation errors are not only easy to make, but they are often very difficult to debug.

If you are using COM interfaces, you have to deal with reference counting of the interfaces. If you don’t keep track of it properly, you’ll leak memory all over the place. Microsoft added another technology called ATL (Active Template Library) that made C++ COM programming a little easier, but it’s still tough sledding for most programmers.
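Here is a small toy model of the reference counting problem, again in plain Python rather than real COM. The class `ComObject` and the two client functions are made up for illustration; the `add_ref` and `release` methods imitate the AddRef and Release calls every COM interface carries.

```python
class ComObject:
    """Toy reference-counted object. Illustrative only, not real COM."""

    def __init__(self):
        self.ref_count = 1      # the creator holds the first reference
        self.destroyed = False

    def add_ref(self):
        self.ref_count += 1
        return self.ref_count

    def release(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            # In real COM the object frees itself at this point.
            self.destroyed = True
        return self.ref_count


def careful_client(obj):
    """Takes a reference and always gives it back."""
    obj.add_ref()
    try:
        pass  # ... use the object here ...
    finally:
        obj.release()


def sloppy_client(obj):
    """Takes a reference and forgets to release it: a leak."""
    obj.add_ref()
    # ... use the object, then return without calling release() ...


good = ComObject()
careful_client(good)
good.release()                              # creator lets go
print(good.destroyed)                       # True: count reached zero

leaked = ComObject()
sloppy_client(leaked)
leaked.release()                            # creator lets go
print(leaked.destroyed, leaked.ref_count)   # False 1: never freed
```

One forgotten release is enough to keep the object alive forever, and nothing in the program tells you which client forgot.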

Also, the .NET interfaces border on sanity and cleanliness. The Win32 interfaces appear to have been designed by 57 independent groups of programmers who disagreed violently on interface conventions. When Microsoft added COM interfaces to many of the Windows subsystems and to apps like Access, Excel, and Word, they cleaned this up a bit. They apparently formed only 43 independent groups of programmers who disagreed violently on interface conventions.


Ironically, Friday I opened the next module in my VB class and it’s all COM. I’ll see if I can figure this out.

You can use the Comprehensive Perl Archive Network, or CPAN for short, to find Perl modules that will help you achieve your goals.

ArcView isn’t an exception.
A quick search for “shapefile” yielded two modules that could help you create new shapefiles from information queried out of DBI::MySQL and such.

This is just a quick and dirty answer, however it’s got potential. Being a Geography undergrad, I’d like to work more on generating shapefiles using Perl.

Best of luck to you.

John J Reiser
newrisedesigns dot com