Recently, the Free Software Foundation (FSF), founded by MIT alum Richard Stallman, filed a lawsuit against Cisco Systems for copyright infringement. The suit contends that Cisco distributed software originally written and distributed under the FSF’s General Public License (GPL), and has failed to fulfill the requirements of the GPL under which the software is published. The GPL terms oblige distributors to disclose that their products contain code licensed under the GPL and to make that code available to any end user who requests to see it. As it happens, Linksys (now owned by Cisco) distributed various Linux-based products that rely on GPL-licensed components, but Linksys is said to have repeatedly failed to fulfill the GPL-specified obligations.
This is the first lawsuit that the FSF has ever filed for GPL infringement.
To explore the tangled matter of open source, intellectual property and the Cisco case, Yours Truly just sat down and talked with Dr. Mahshad Koohgoli, CEO of Protecode (www.protecode.com). Protecode has unique products that can examine a software developer’s project and automatically detect, log, identify and pedigree-tag software content, and report on associated intellectual property and licensing attributes and compliance with an organization’s policies, thus establishing intellectual property ownership and creating a software Bill of Materials (BoM).
RG: You have two principal products that can analyze software?
MK: We have products and services that analyze an enterprise’s software ‘portfolio’ of product code, and we can detect open source and other third-party software that has gone into that portfolio, library, or product. We can automatically identify the licensing and copyright obligations for various stretches of code and check them against the policies that an organization has established. If we see a violation, we flag it. Basically, we get involved during any transaction where the ownership of intellectual property must be established, whether you’re trying to license your product, or sell it, when you have to provide IP indemnity ‘background checks’, or are trying to co-develop something with a partner, or if a merger and acquisition is occurring with your company and everything is being vetted. So, any time you want to know exactly what code you have in your product or organization, and any obligations associated with it, we have solutions that provide such capability. These can work either in real time as you’re developing the program and are bringing in bits and pieces of code, or they can be run in a broad analysis mode, examining what you’ve already developed, or what you have in your library of useful subroutines. We then generate a detailed report.
RG: Tell me about the bulk analyzer that just methodically examines everything.
MK: Our Enterprise IP Analyzer is a software application that you run and point to a directory, or part of the repository of code that you want to analyze. Our application will go through every file there and it tries to find out any similarity between a file and any open source code that’s available in the public domain, or any proprietary code that can be identified from the signatures in our database. The Analyzer looks for the signatures and generates a report of all the open source projects detected and all of the licensing associated with whatever code you have in that directory.
You can specify that you don’t want any code in your project having GPL [GNU General Public License] licensing obligations, for example, and our solution will flag anything that was originally done under GPL. That’s our bulk analyzer.
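The bulk scan described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Protecode’s implementation: the function names, the report fields and the signature database (a plain dictionary) are hypothetical, and an exact hash of whitespace-normalized text stands in for whatever similarity matching the real product uses.

```python
import hashlib
import os
import tempfile


def signature(text):
    """Hash the text after dropping indentation and blank lines (assumed scheme)."""
    normalized = "\n".join(line.strip() for line in text.splitlines() if line.strip())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def scan_directory(root, known_signatures, banned_licenses):
    """Return one report row per file whose signature appears in the database."""
    report = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                sig = signature(handle.read())
            match = known_signatures.get(sig)
            if match is None:
                continue  # nothing known about this file
            project, license_name = match
            report.append({
                "file": os.path.relpath(path, root),
                "project": project,
                "license": license_name,
                # Policy check: flag licenses the organization has ruled out.
                "violation": license_name in banned_licenses,
            })
    return report


def demo():
    """Scan a throwaway directory containing one re-indented copy of known code."""
    known_code = "int add(int a, int b)\n{\n    return a + b;\n}\n"
    # Hypothetical database entry: signature -> (project, license).
    known = {signature(known_code): ("example-gpl-project", "GPL-2.0")}
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "copied.c"), "w", encoding="utf-8") as f:
            f.write("int add(int a, int b)\n{\n  return a + b;\n}\n")
        with open(os.path.join(root, "original.c"), "w", encoding="utf-8") as f:
            f.write("int main(void) { return 0; }\n")
        return scan_directory(root, known, banned_licenses={"GPL-2.0"})
```

Calling `demo()` reports only `copied.c`, marked as a violation because the example policy bans GPL-2.0. A production tool would need fuzzier matching, since exact hashing misses code that has been edited after copying.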
RG: And what about real-time analysis?
MK: As I said, we also offer a real-time analyzer, called IP Assistant. It’s an Eclipse plug-in on the development platform that sits in the background. As developers work on their software, they may bring in a piece of code that’s a cut-and-paste from something on the web, or from another file somebody has carried in on a USB memory stick, or somebody’s library of routines on a CD. In any case, the IP Assistant runs unobtrusively in the background while the developer is working away, and any time it ‘sees’ external content entering your programming workspace, it springs into action. First it logs it as, say, a C file that came from location X and went into this file at a certain time under the auspices of a certain person. Then it analyzes the code and attempts to identify the nature of the content via a number of techniques. We have databases of code signatures that we have accumulated over the years. We probably have a signature of every example of open source in our database, as well as those for much commercial proprietary code. Our program looks inside the file for ‘footprints’ such as ‘Copyright by’, and it will look at the URL from which the content came and try to glean something from that. Any information the program discovers is also logged in real time. We then check it against the policies that the project manager or administrator has established.
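The logging and ‘footprint’ steps mentioned here might look something like the sketch below. It is an assumption-laden illustration, not the IP Assistant’s actual logic: the regular expression, the `LICENSE_HINTS` table, the function names and the log fields are all hypothetical, and the sample URL is a placeholder.

```python
import re
from datetime import datetime, timezone

# Hypothetical footprint pattern: "Copyright by X", "Copyright (C) X", "Copyright X".
COPYRIGHT_RE = re.compile(r"copyright\s+(?:\(c\)\s*|by\s+)?(.+)", re.IGNORECASE)

# Hypothetical phrase-to-license hints; a real tool would know far more.
LICENSE_HINTS = {
    "GNU General Public License": "GPL",
    "Apache License": "Apache",
    "MIT License": "MIT",
}


def find_footprints(content):
    """Return (copyright holders, license tags) spotted in the text."""
    holders = []
    for line in content.splitlines():
        match = COPYRIGHT_RE.search(line)
        if match:
            holders.append(match.group(1).strip())
    lowered = content.lower()
    licenses = sorted(
        {tag for phrase, tag in LICENSE_HINTS.items() if phrase.lower() in lowered}
    )
    return holders, licenses


def log_incoming(content, source, destination, author, banned_licenses):
    """Record where content came from, what was found in it, and any policy hits."""
    holders, licenses = find_footprints(content)
    return {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "source": source,
        "destination": destination,
        "author": author,
        "copyright_holders": holders,
        "licenses": licenses,
        "policy_violations": [tag for tag in licenses if tag in banned_licenses],
    }


sample = (
    "/* Copyright (C) 2004 Example Author\n"
    " * Released under the GNU General Public License. */\n"
    "int x;\n"
)
entry = log_incoming(
    sample,
    source="https://example.com/snippet.c",  # placeholder URL
    destination="driver.c",
    author="developer-1",
    banned_licenses={"GPL"},
)
```

Here `entry` records the source, destination, author and time of the paste, along with the detected copyright holder and a GPL policy hit. Text footprints are easy to strip out, which is presumably why the interview also mentions signature databases as a separate technique.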
So with our products, developers know exactly what they have in their organization and in a particular software project in terms of the intellectual property attributes of all the code.
RG: Has this become a major problem? Are there a lot of lazy programmers out there who are pressed for time?
MK: Well, these days it’s so easy to access code. Good developers these days don’t really have to write much code – they know where to get code. We see open source growing by leaps and bounds. Open source is wonderful; it has many great attributes if it’s controlled in terms of access. If open source is managed correctly, it can hasten development, reduce cost and shorten software time-to-market. But open source is not really ‘free’. Every piece of open source is governed by a license. That license may not have monetary value but there are obligations specified by that license with which anyone using the code must comply. You have to be careful because open source code can sneak into your project from anywhere – you may have outsourced some development or you may have subcontractors contributing code. You’re getting code, but you don’t know where exactly it came from. You have no idea whether any specific piece of code was created from scratch or somebody in an outsourcing company grabbed it from, say, the web as a last minute fix. After all, Google has a search engine called Google Code Search, a great utility for software developers who are looking for any kind of code. Anything you want is out there, but it isn’t really free. It’s copyrighted. Somebody owns it. As long as you know the license associated with what you’re bringing into your project, and you know that the license fits into your business model and objectives, then that’s fine. The problem is that you can’t expect developers or even their managers to interpret the license terms and make that decision. First of all, they’re difficult to interpret, worse than a mortgage contract. You need expert legal interpretation on what license is acceptable and what isn’t.
Unfortunately, because of this complexity, some organizations just outright ban the use of open source. That’s a shame, because if you’ve got controlled, managed access to open source, it can benefit your organization greatly. That’s why we created our tools, to take away the pain of interpreting licenses from developers, project managers and their organizations, so they can concentrate on their development and their business. If they bring in something from the outside that has a license that violates the organization’s policies, then we can alert them via an email to the legal department, or we can ask the developer to insert a comment in the code to rewrite it or remove it, or we can do nothing and simply log it and later highlight it in a generated report.
RG: What about the Cisco brouhaha?
MK: What happened with Cisco started years ago. It’s an interesting story. A chip company, which shall remain nameless, had an Ethernet controller chip, and they engaged outsourcers in a Far East country to develop some software drivers for it. They sold the chip, together with the software drivers of course, to a company called Linksys, which incorporated it into the design of their four-port router. It was one of the most popular of the smaller routers. In the process of doing this, Linksys added their own code around these software drivers and then they marketed the unit. It just so happens that the outsourcer the chip company had employed at some point inserted a piece of code subject to a GNU General Public License, or GNU GPL.
One of the properties of the GPL license is that you can use the code and you don’t have to pay anything, but any code that’s written around this piece of code, and any modifications or derivatives of the code, also become GPL. That means that you have to make the code publicly available. You basically must open up your code. Now Linksys probably wasn’t aware that they had such code. Then, of course, Cisco acquired Linksys and inherited the technology and the router and the code. Cisco then began using the code in other sections of their own products. So, suddenly, a whole bunch of products at Cisco are now ‘contaminated’ with GPL code, which means that Cisco is in the unenviable position of having to ‘open up’ their valuable proprietary code to the public. This is bad. But I’m pretty sure it was unintentional.
Of course, if Cisco had used our technology, they would have received a bill of software materials and they would know exactly what they were getting in the code and all aspects of the intellectual property of the code. You know, in the world of hardware, there are inventory systems, and there exist the concepts of approved components, approved suppliers, qualified components, and so forth. But there’s nothing like that in software. Our company, Protecode, is probably the only organization right now that can create a software bill of materials and support concepts such as approved software components and suppliers, automatically and without any manual record-keeping, which is impractical in most cases.
RG: I guess it’s time that programmers shape up and start taking more seriously the pedigree of the code snippets on which they increasingly rely.
MK: Yes indeed. We’re ready and waiting to help them.
Richard Grigonis is Executive Editor of TMC’s IP Communications Group. To read more of Richard’s articles, please visit his columnist page.
Edited by Tim Gray