Is there a point to distributing .egg files?

June 26, 2008

So far I’ve found that distributing .egg files is mostly useless:

  • Source tarballs as created with the distutils/setuptools sdist command are not only equivalent to eggs, they often contain more information (such as top-level README.txt or INSTALL.txt files). Also, not everybody has embraced easy_install yet, and a tarball is the least surprising option for the old-school folks.
  • .egg files are marked with the Python version they were created with. So if you only upload an .egg file that was created with, say, Python 2.4, and you don’t provide a source tarball, Python 2.5 users will be out of luck trying to easy_install your package (even though it may perfectly work on Python 2.5).
  • If the package contains C extensions, you pretty much can’t risk uploading an .egg file because it’ll contain binaries. With Linux and Mac OS X this is unacceptable due to the various ways Python itself can be built on these platforms (so a user’s Python will likely be incompatible with anything you’ve built). One notable exception is Windows which (thanks to its homogeneity) makes it possible to distribute binaries without problems. Then again, because Windows users rarely have the right compiler installed, it pretty much requires you to distribute binaries.

So why are we still uploading .egg files to PyPI? Isn’t it enough and even better to just upload the source tarball? (And a Windows binary egg only if the package contains C extensions.)
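To make the version-marking problem concrete, here is a minimal sketch of how the tags in an egg's file name restrict where it can be used. It assumes the usual `name-version-pyX.Y[-platform].egg` layout; the helper names (`parse_egg_filename`, `usable_on`) and the `example` package are made up for illustration, and the matching is a simplification, not easy_install's actual logic.

```python
import re

# Assumed layout of an egg file name: name-version-pyX.Y[-platform].egg
EGG_NAME = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)-py(?P<pyver>\d+\.\d+)"
    r"(?:-(?P<platform>.+))?\.egg$"
)


def parse_egg_filename(filename):
    """Split an egg file name into its name, version, pyver and platform tags."""
    match = EGG_NAME.match(filename)
    if match is None:
        raise ValueError("not an egg file name: %r" % filename)
    return match.groupdict()


def usable_on(filename, pyver, platform):
    """Simplified check: the Python version must match exactly, and a
    platform tag (present only on binary eggs) must match as well."""
    tags = parse_egg_filename(filename)
    if tags["pyver"] != pyver:
        return False
    return tags["platform"] is None or tags["platform"] == platform


# A pure-Python egg built with 2.4 is rejected for a 2.5 interpreter.
print(usable_on("example-1.0-py2.4.egg", "2.5", "linux-i686"))        # False
print(usable_on("example-1.0-py2.4.egg", "2.4", "linux-i686"))        # True
# A Windows binary egg is rejected on Linux even for the same Python version.
print(usable_on("example-1.0-py2.4-win32.egg", "2.4", "linux-i686"))  # False
```

Note that the tags record only the Python version and the platform. Nothing in the name describes how the interpreter itself was built, which is why binary eggs are risky outside Windows.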

20 Responses to “Is there a point to distributing .egg files?”

  1. Malthe Says:

    There is no point. I think the primary reason for this practice is that the first examples/tutorials on distributing your package using PyPI had a “bdist_egg” in them.

    Personally, I haven’t uploaded .eggs since I first learned what it meant.


  2. “Eggs are to Python as Jars are to Java” — i.e., they’re a deployment format, not a distribution format.

    In other words, you’re pretty much correct. The main reason to distribute an .egg file is to support adding plugins to applications that use them, but don’t use easy_install or some variation thereof to install them.

  3. Rocky Says:

    Regarding eggs with C extensions … definitely distributing non-Windows compiled eggs is a very bad idea.

    As for “why distribute eggs at all?” … it speeds up the build/install process. For a project with a zillion eggs (read: Plone) that can really make a difference during development. That’s it really. So basically my rules of thumb are:
    1) Never distribute .eggs with C extensions unless it’s for Windows
    2) *ALWAYS* distribute sdists
    3) Optionally distribute .eggs without C extensions to improve build/install time

  4. Tarek Ziadé Says:

    I agree with what you are saying.

    Although I think eggs can be a good way to distribute packages that are not (yet) compatible with the latest version of Python, to make sure the users use the right Python when they easy_install it.

  5. Martijn Pieters Says:

    We do still upload eggs when the tarball will trigger the infamous tar module bug present in 2.4 (paths of certain lengths are always treated as directories). In that case only a Python 2.4 egg is created. The alternative is fielding all the support requests that state that your tarball is broken.

  6. Tarek Ziadé Says:

    @Martijn, Ah right, I had to do it with a few packages, 2.4 tarfile is a pain … I am wondering if we couldn’t patch it inside setuptools or zc.buildout, to get the Python 2.5 one.


  7. I upload eggs to PyPI for ConfigObj because TurboGears requires an easy_install compatible distribution method.

    easy_install *wouldn’t* work for ConfigObj without eggs because setuptools uses the name in the URL of any file served with a ‘.py’ extension as the name of the downloaded file. The source distribution for ConfigObj is served from a CGI ‘downman.py’ (my download manager) – and easy_install thus names the resulting zip or tarball ‘downman.py’, even though the CGI sends the right HTTP header with the correct name.


  8. How portable are Windows eggs? Aren’t there any 32- versus 64-bit OS issues?

  9. philikon Says:

    @Marius: Since regular Windows binary eggs are marked with the ‘win32’ platform, I imagine that those for 64-bit Windows would be marked with ‘win64’ or something along those lines. Also, I would expect that ‘win32’ eggs would work on a 64-bit Windows platform.

  10. Brodie Rao Says:

    I always use sdist when uploading to PyPI, and I’ve personally stopped explicitly supporting setuptools in my setup.py scripts. While I love using easy_install myself, I prefer to have a more consistent user experience in terms of installation, and I was basically completely turned off to it when I discovered that it “sandboxes” setup scripts, not allowing them to write outside of the package’s path/egg.

    I don’t think it’s worth the trouble to support, it doesn’t provide that many compelling features on the package maintainer’s end, and you can still usually just easy_install any source distribution anyway.

    Now what I’d really like to see is an easy_install that doesn’t use setuptools and just searches and downloads from PyPI.


  11. Many users of my PyEphem module seem not to have their “python2.4-dev” package, or whatever their distribution calls it, installed; and might not even have the right compiler installed for building extensions to Python, in addition. My impression had been that, merely by having Python on their system, it is possible for them to install “setuptools” and then “easy_install” my package, and, if I’ve provided a binary egg, that’s all they need to do. Lacking the binary egg, they’ve got to figure out how to install the Python development suite for whatever operating system they are using, which is way beyond the knowledge of lots of people who just want to try writing astronomy code in this new language they’ve heard of called “Python”.

    Maybe you’re thinking of a population that’s more developer-heavy?


  12. I hadn’t appreciated this before and am guilty of uploading many a .egg file. I’ll stop, though. 🙂

    To PJE’s point, JAR files are also the main distribution mechanism in Java land, so I think the analogy there could cause some confusion. WAR and EAR files are more of a deployment concern for web apps, but will contain many libraries and other files, like templates and configuration files, so that’s not a perfect analogy either.

    Cheers,
    Martin

  13. Chris Galvan Says:

    Without any additional modifications, yes, eggs do not lend themselves to use as a distribution mechanism, but there is a potential there that has not been realized by most people. The main obstacle is binary compatibility, but this can be overcome with post-install scripts that fix up the rpaths and Mach-O headers, for Linux flavors and Mac OS X respectively, of the binaries inside the eggs.

    As an example, hdf5’s extensions are linked against zlib libraries and you can’t be certain what version of zlib an end-user may have on their system. However, if you ship the zlib runtime libraries with the egg and then fix up the rpaths (or Mach-O headers) to point inside the egg, you have a self-contained egg that will work regardless of the end-user’s installed version of zlib.

    The company I work for (Enthought) uses eggs for large-scale distribution and so far it has worked on Windows, 2 flavors of Linux and Mac OS X. There are already tools in place that can be used to modify rpaths and Mach-O headers in order to deliver dynamically-linked binaries. For Linux, there is a tool called chrpath and for OS X there is macholib.

    There is definitely more potential in eggs than is currently credited towards them, as I have seen them successfully used as a distribution mechanism across a range of platforms. Some changes would need to be made in setuptools though in order for everyone to take advantage of them.

  14. philikon Says:

    @Chris: That’s interesting, though my experience with Linux binaries has been far more problematic than just rpath problems. For instance, some Linux distributions compile their Python with the UCS4 setting for unicode, some with UCS2 or whatever it is. As far as I can tell, there’s no way to work around that and the egg identifier doesn’t account for this compile-time setting.

  15. Chris Galvan Says:

    @philikon: Yes, that is a more difficult issue to solve, and accounting for specific compile-time settings in the egg identifiers could clutter the egg names pretty quickly 🙂

    In order to circumvent these issues, we (Enthought) ship a fully built version of Python with our extra packages included that can be installed along-side existing Python installations without conflicts. You can then get updates to these packages from our egg repositories, which have all been built against a Python installation that will be compatible with the end-users’.

    While this doesn’t solve the issue you mentioned for users who simply want to install a few select packages within their current installation, it does solve our large-scale distribution problem.

  16. Justin Ryan Says:

    Has anyone considered that the whole idea of eggs as deployment is a bit of a mess as well? I haven’t profiled, but a lot more work must be happening to traverse more numerous, deep directory structures, many of which are mirrors of each other, confusing the always-beautiful packages-are-directories meme that makes Python so easy to deal with at a large scale, or should.

    When you have a package like plone.* or zope.* which has 5, 10, 20 subpackages depending on how many levels you look at, and each of those has its own entire tree with a ‘zope’ or ‘plone’ subdirectory and some boilerplate __init__.py. Surely there must be some overhead to reconciling these mirrored or, depending on how you look at it, overlapping namespaces.

    I agree that, eggs or sdist, it’s nice to have a simple, atomic package representing a given bit of functionality which may work on its own or at least without *all* of the packages from the master, but it seems that omelettes are ideal for deployment, once you’ve selected the packages.

    What we do now in a typical buildout is akin to extracting each package for a gnu/linux system, and symlinking its contents into place, and symlinking the contents into another place.

    Who’s with me!?😉

  17. Justin Ryan Says:

    Whups, there should have been a comma after __init__.py above, and a lower ‘s’ on surely.

  18. Chris Galvan Says:

    @Ryan: I’m not sure exactly what problem you are explaining 🙂 Using setuptools, you can have the same namespaces in different eggs and they won’t overlap (overrule) each other, if this is what you are afraid of. Instead, the namespace would be extended to contain the contents from each egg.

    Was this the problem you were describing or am I way off?🙂


  19. Has anyone considered that the whole idea of eggs as deployment is a bit of a mess as well? I haven’t profiled, but a lot more work must be happening to traverse more numerous, deep directory structures, many of which are mirrors of each other, confusing the always-beautiful packages-are-directories meme that makes Python so easy to deal with at a large scale, or should.

    • philikon Says:

      Dude, this blogpost is 2 years old…😉

      Anyway, setuptools isn’t the only way to deploy eggs. pip deploys them quite nicely in the packages-are-directories way we know and love.
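The namespace behaviour described in comment 18 (the same namespace spread over several eggs, with the contents merged rather than overruled) can be tried out with a small sketch. This is an illustration under stated assumptions: it uses the standard library's `pkgutil.extend_path` as a stand-in for setuptools' own `pkg_resources.declare_namespace`, plain directories instead of real eggs, and made-up names (`demo_ns`, `alpha`, `beta`).

```python
import importlib
import os
import sys
import tempfile

# Boilerplate __init__.py shared by every portion of the namespace.
# (pkgutil-style; setuptools-based packages use pkg_resources.declare_namespace.)
NAMESPACE_INIT = "__path__ = __import__('pkgutil').extend_path(__path__, __name__)\n"


def make_distribution(root, namespace, module, value):
    """Create root/namespace/__init__.py and root/namespace/<module>.py."""
    package_dir = os.path.join(root, namespace)
    os.makedirs(package_dir)
    with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
        handle.write(NAMESPACE_INIT)
    with open(os.path.join(package_dir, module + ".py"), "w") as handle:
        handle.write("VALUE = %r\n" % value)


def load_from_two_roots():
    """Put two separate 'demo_ns' trees on sys.path and import from both."""
    # The temporary directories are left in place for simplicity.
    first = tempfile.mkdtemp()
    second = tempfile.mkdtemp()
    make_distribution(first, "demo_ns", "alpha", "from first root")
    make_distribution(second, "demo_ns", "beta", "from second root")
    sys.path[:0] = [first, second]
    importlib.invalidate_caches()
    alpha = importlib.import_module("demo_ns.alpha")
    beta = importlib.import_module("demo_ns.beta")
    return alpha.VALUE, beta.VALUE


print(load_from_two_roots())  # ('from first root', 'from second root')
```

Both submodules import even though they live in different directory trees, which is the "extended, not overruled" behaviour. The cost raised in comment 16 is also visible here: every portion carries its own copy of the namespace directory and boilerplate `__init__.py`.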

