February 2009 Archives

Misunderstanding the AGPL

People seem to misunderstand the AGPL. (Or I do.) Twice today, I've been told that the AGPL somehow requires any software interacting with AGPL-licensed code to be under the AGPL as well, like it's some sort of infectious disease. Like Ebola or something.

For example, someone posted the following on the MongoDB site as an anonymous comment:

My understanding (IANAL) of GNU AGPL, and probably that of the first commenter, is that GNU AGPL applies the GPL to applications accessing the server via network protocols, as well as to applications linking the libraries. That's the reason for the GNU GPL or GNU LGPL suggestion, so as to not force all applications that communicate with the server over the network to also be GNU GPL licensed (there by preventing most closed sourced or most commercial applications).

Clearly this is not the intent of the AGPL (although we shouldn't give Stallman and Moglen any ideas...). While the AGPL clearly exhibits the purest known form of FSF-style "Freedom(tm)" (more than the GPL or that fifth-column of FSF licenses, the LGPL), and it does capture the "All your modifications are belong to ... others" spirit of Freedom(tm), it clearly can't force a relicensing or limitation upon code that touches it.

That would just be silly, and we all know that nothing silly ever happens with open source licensing.

Twofer Twofer

As I noted yesterday, I've been turned down for my two JavaOne talks.

Today, Sun sent me the same mail. Only difference is that the mails end with "Pthbbbbbbbbbb". I guess they wanted to rub it in.

Alas.

Baffled with Ruby

At 10gen, we've been writing drivers for the MongoDB database. MongoDB is fast, so we want to make sure that our drivers are fast. A user reported some performance numbers, and even though the driver is very new, they were slower than expected. Even though my Ruby isn't very good, I decided to take a look.

From my experience with our NIO-based Java drivers, I know that limiting memory churn when serializing the data to the wire format is critical, so that's the first place I looked. I didn't really like what I saw - a key array was allowed to resize on each write - but as I don't really know anything about Ruby, I wanted to prove that reusing a pre-allocated array was faster.

To make a long story short, I've run into the following mystery, and would appreciate any hint I can get. This "behavior" happens on Ruby 1.8.6 on both OS X and Ubuntu.

The code :
require 'benchmark'

# Wraps a buffer and overwrites its leading elements in place.
class BufferContainer
  def initialize(initial_data)
    @buf = initial_data
  end

  # Range-copy `array` over the first array.length slots of the buffer.
  def put_array(array)
    @buf[0, array.length] = array
  end
end

size = 1000
thearray = [0, 1, 2, 3, 4]

for i in (1..3)
  size *= 10
  a = Array.new(size)
  # RUN WITH THIS COMMENTED OUT FIRST
  # a[size] = 1

  puts "size = #{size} #{a.length}"
  puts Benchmark.measure {
    10000.times {
      buf = BufferContainer.new(a)
      buf.put_array(thearray)
    }
  }
end

So what's the problem? For three different sizes of a pre-allocated destination array, I'm doing range copies, always of the same size, always at the same location.

   @buf[0, array.length] = array

Everything is pre-allocated, so there should be no reason to create garbage. However when I run this, I see :

size = 10000 10000
  0.050000   0.000000   0.050000 (  0.053745)
size = 100000 100000
  0.400000   0.000000   0.400000 (  0.409549)
size = 1000000 1000000
 17.770000   0.130000  17.900000 ( 18.338888)

What is this saying? For 10,000 iterations of the range copy, if the pre-allocated destination array is sized to 10,000, it takes 0.05 seconds; if sized to 100,000, it takes 0.4 seconds (roughly an order of magnitude more); and if sized to 1,000,000, it takes 18 seconds. (!)
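To put numbers on that scaling, here is a small JavaScript sketch that turns the wall-clock timings above into a per-copy cost and a growth factor between sizes. The helper name summarize is made up for illustration; the only real data is the three timings copied from the run above.

```javascript
// Wall-clock timings copied from the benchmark output above.
// Each run performed 10,000 range copies into the destination array.
const ITERATIONS = 10000;
const runs = [
  { size: 10000, seconds: 0.053745 },
  { size: 100000, seconds: 0.409549 },
  { size: 1000000, seconds: 18.338888 },
];

// summarize() is a hypothetical helper, not part of any driver.
function summarize(runs, iterations) {
  return runs.map((run, i) => ({
    size: run.size,
    // Average cost of one range copy, in microseconds.
    microsPerCopy: (run.seconds / iterations) * 1e6,
    // Growth relative to the previous (10x smaller) array; null for the first run.
    growth: i === 0 ? null : run.seconds / runs[i - 1].seconds,
  }));
}

for (const row of summarize(runs, ITERATIONS)) {
  const growth = row.growth === null ? "-" : row.growth.toFixed(1) + "x";
  console.log(`size=${row.size} ${row.microsPerCopy.toFixed(1)} us/copy growth=${growth}`);
}
```

A constant-time copy would show a growth factor near 1x. Instead, the first tenfold increase in array size costs about 7.6x and the second about 44.8x.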

My expectation was that it would be constant in time - no allocations or memory moves should be happening - it should be a simple copy. I spent a lot of time trying to boil down this testcase, and then playing with it. I think this is entirely due to some GC issue (my primary theory), or lazy array creation, or something. I feel kind of stupid that I spent the time I did, but fixing this driver is important, I'm not a Rubyist so I don't know who to ask (The Google was no help), and I really hate a mystery.

On a hunch, I added the line in the code that is currently commented out (a[size] = 1), as I figured that if it was some weird lazy creation issue, this would force it to create the array.

Uncomment that line and see what happens.

I'll dig up the Ruby source tomorrow, but if someone wants to save a Reluctant Rubyist the time....

Update : 2009-02-27 : Seems to be a bug in the implementation of Array, as it's reported gone in v1.9.1 of Ruby. It's also not in JRuby, but that's not a surprise since that's a rewrite. I'm honestly surprised that no one has seen this before, or that if they have, it wasn't fixed in 1.8.x.

Twofer

Both proposals for JavaOne rejected.

Alas.

Erlang Driver for MongoDB

Elias Torres just committed the first version of a native Erlang driver for MongoDB.

This will force me to learn Erlang. Then I can hang with the cool kids. Maybe.

PHP and MongoDB

At 10gen, we just noted yesterday that we have a PHP Driver for the MongoDB database pretty well along.

I've been working with PHP from the POV of making sure that this impl is tested in our Hudson CI farm, and while I find it somewhat baroque to use (e.g. I have to edit /etc/php.ini as root to add the module....), it's cool that we have this and I'm looking forward to seeing what PHP people think.

This gives them a way to leverage a different kind of data store either instead of, or more likely, alongside their RDBMS.

IntelliJ IDEA 8.1 and Git!

Now with Git support.

Whee....

MongoDB v0.8

We've just released (well, last week...) the first release of MongoDB, an open source, high-performance [and now I'm going off official company script] "queryable persistent cache". Ok, it's a database, but I've discovered that when I introduce it as that, all of the preconceived notions, assumptions, use cases, tools, problem domains, etc. that every programmer has after working w/ RDBMSs completely confuse the discussion.

This release is really a baseline release for us at 10gen after we re-focused the company on Jan 1 to the persistence layer of our appserver stack. The appserver will continue as an Apache Licensed project (http://www.babbleapp.org).

This release contains :

  • The MongoDB database (of course)
  • A new, slick, Google V8-based command-line client that lets you interact with the database in JavaScript.
  • Basic tools like import/export, backup/restore.
  • Drivers for Java, Ruby, Python and C++, with PHP probably going to be made available (although not yet releasable) today.
  • A RoR ActiveRecord Connector.
  • An implementation of the ActiveRecord pattern in Ruby (not to be confused with the RoR AR component).

So this think-of-it-not-as-a-database database has some interesting properties. It stores JSON-like documents. I say "JSON-like" because rather than just strings, numbers and booleans, it can store other types like dates and binary data, and distinguish between integer and floating point numbers. It's pretty quick - on my mac laptop, I can do 300k inserts/sec from a Java client (doing them in small blocks of 100 documents per network message), and random reads at about 30k/sec. (Awake readers will note that I'm not transactionally persisting that much data to disk at that rate... disks don't go that fast... a subject for another post). I can do fancy indexing on the "documents" - not just primary, but also index into sub-objects. E.g., if I have a document that in JSON would be structurally represented as :

{
   foo : {
       bar : ....,
       woogie : ....
   },
   x : ....
}

I can create indexes on things like foo.woogie. I can have multiple indexes per collection (think of a collection like a table).
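To make the idea of indexing "into" a sub-object concrete, here is a small in-memory JavaScript sketch. It is not how MongoDB implements indexes. getPath and buildIndex are hypothetical helpers that resolve a dotted key like foo.woogie against nested documents and group the documents by the value found there.

```javascript
// Resolve a dotted path such as "foo.woogie" against a nested document.
// Returns undefined if any step along the path is missing.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value !== null && value !== undefined ? value[key] : undefined),
    doc
  );
}

// Build a toy "index": a Map from the value found at `path` to the
// list of documents carrying that value. Illustration only.
function buildIndex(docs, path) {
  const index = new Map();
  for (const doc of docs) {
    const key = getPath(doc, path);
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc);
  }
  return index;
}

// Sample documents shaped like the structure shown above.
const docs = [
  { foo: { bar: 10, woogie: "a" }, x: 1 },
  { foo: { bar: 20, woogie: "b" }, x: 2 },
  { foo: { bar: 10, woogie: "a" }, x: 3 },
];

const byWoogie = buildIndex(docs, "foo.woogie");
console.log(byWoogie.get("a").map((d) => d.x)); // [ 1, 3 ]
```

The point of the sketch is only that a lookup by foo.woogie no longer has to scan every document once the grouping exists.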

It also has a rich query language that lets you do a lot of the things that you'd expect when coming from a SQL background, and lets you express those queries in a way that is compatible with thinking in the document structure you're working in (in JS notation with the "what I think about in SQL" above it in the comment):

  //  select * from mycollection where foo.bar == 10
  db.mycollection.find({"foo.bar" : 10});
  //  select x from mycollection where foo.bar == 10 skip 10 limit 10 order by foo.woogie
  db.mycollection.find({"foo.bar" : 10}, {x:1}).skip(10).limit(10).sort({"foo.woogie":1});

Where the first example lets you find all documents in the mycollection collection where the value of bar of the foo element is 10. The second example goes further, skipping the first 10 elements, only returning 10 elements, ordering by the woogie subfield of foo, and limiting the return to partial documents that only contain the x field.
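For readers who want to see those semantics spelled out, here is a rough in-memory JavaScript sketch of what the second query asks for. runQuery and getPath are hypothetical helpers, not MongoDB APIs, and the sketch assumes the ordering is applied before the skip and limit.

```javascript
// Resolve a dotted path such as "foo.bar" against a nested document.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value !== null && value !== undefined ? value[key] : undefined),
    doc
  );
}

// runQuery() is a hypothetical stand-in for a filtered, ordered, paged query.
// Assumption: ordering happens before skip and limit.
function runQuery(docs, { filter = {}, fields = null, sortBy = null, skip = 0, limit = Infinity }) {
  // Keep documents whose value at every dotted path equals the wanted value.
  let result = docs.filter((doc) =>
    Object.entries(filter).every(([path, wanted]) => getPath(doc, path) === wanted)
  );
  if (sortBy !== null) {
    // Sort a copy in ascending order of the value at the dotted path.
    result = [...result].sort((a, b) => {
      const av = getPath(a, sortBy);
      const bv = getPath(b, sortBy);
      return av < bv ? -1 : av > bv ? 1 : 0;
    });
  }
  result = result.slice(skip, skip + limit);
  if (fields !== null) {
    // Keep only the requested top-level fields (partial documents).
    result = result.map((doc) =>
      Object.fromEntries(fields.filter((f) => f in doc).map((f) => [f, doc[f]]))
    );
  }
  return result;
}

// 30 made-up documents; the even-numbered ones have foo.bar == 10.
const docs = Array.from({ length: 30 }, (_, i) => ({
  foo: { bar: i % 2 === 0 ? 10 : 20, woogie: 30 - i },
  x: i,
}));

const page = runQuery(docs, {
  filter: { "foo.bar": 10 },
  fields: ["x"],
  sortBy: "foo.woogie",
  skip: 10,
  limit: 10,
});
console.log(page);
```

With this sample data, 15 documents match, so after skipping 10 only 5 partial documents come back, each holding just the x field.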

Also, you can do document updates - rather than replacing the whole document if you want to modify it (which is a horror show if you have large documents), you can just update elements of the document in-place :

// update mycollection set total = 10 where id = 12345
db.mycollection.update({id:12345}, {$set:{total:10}});
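As a rough illustration of why an in-place modifier is cheaper than replacing the document, here is an in-memory JavaScript sketch. applySet and update are hypothetical helpers that mimic the effect of the $set call above on plain objects. This toy version changes every matching document, like the SQL in the comment, which may differ from what the real server does.

```javascript
// applySet() is a hypothetical helper that mimics the effect of {$set: {...}}
// on a plain in-memory object. It only handles top-level field names.
function applySet(doc, setSpec) {
  for (const [field, value] of Object.entries(setSpec)) {
    doc[field] = value; // modify in place; other fields are untouched
  }
  return doc;
}

// Toy update(): apply the $set part of `modifier` to every document whose
// fields equal all of the criteria. Returns how many documents changed.
function update(docs, criteria, modifier) {
  let changed = 0;
  for (const doc of docs) {
    const matches = Object.entries(criteria).every(([k, v]) => doc[k] === v);
    if (matches) {
      applySet(doc, modifier.$set || {});
      changed += 1;
    }
  }
  return changed;
}

// Made-up sample data.
const orders = [
  { id: 12345, total: 3, items: ["a", "b"] },
  { id: 67890, total: 7, items: ["c"] },
];

update(orders, { id: 12345 }, { $set: { total: 10 } });
console.log(orders[0]); // { id: 12345, total: 10, items: [ 'a', 'b' ] }
```

Only the total field is written; the rest of the document, however large, is never rebuilt or resent.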

MongoDB also has some nice replication and semi-HA master pairing features, and sharding is on the way.

What's it good for? Well, as I argue when people give me the chance to speak about it, databases are changing - just look at what is available in the so-called "cloud" arena. It tends not to be a RDBMS if it's scalable. The storage engine under AppEngine, Amazon's SimpleDB, any of the Dynamo implementations, etc. all change your programming model to one that isn't "tables and joins". Or look at the excellent CouchDB, a JSON store. If the RDBMS isn't being replaced outright (like it has to be in "the cloud"), it can be augmented with other persistence technologies that are better suited for a portion of the data requirements of a system.

So what's it good for? It works fine as a database, but you can't think relational. If you want to just replace MySQL with something else, but don't want to rethink your data model, MongoDB isn't for you. Because of its pedigree and initial design requirements, it works very well as an "object" store for dynamic languages. JS objects, Python and Ruby hashes all go in and out very effortlessly :

db.mycollection.save({a:10, b:2});

We've had it supporting news-ish/blog-ish websites in production for a year now, and it does fine there. It does fine as a large object store - think big binary blobs here, like images and videos. We have a POC in progress where we leverage the server-side JS execution feature to provide transaction-like isolation for high-performance shopping-cart/inventory management. (4k a second at last check on a mac desktop). It has some interesting potential as a persistent cache - one where you aren't afraid to restart the cache for fear of the hammering the backend data store will receive.

I think that this DB has a lot of potential, and I look forward to seeing what other kinds of problems it can solve. Download it and try it. We have it available for OS X 32-bit and 64-bit, Linux 32-bit and 64-bit, Windows 32-bit, Solaris 32-bit. Let us know what you think.

http://www.mongodb.org

Android #3

Brief updates for anyone who cares.

  • Battery life isn't as bad as I saw that fateful day last week. Keeping GPS off seems to give me a full day of use, which is enough for me.
  • The included mail IMAP client has lots of "opportunity for improvement". I've switched to K-9, a fork of the original Google code, which has some nice features. Still, after the iPhone, there's lots of work still to be done here.
  • I can't find a way to read a PDF. The iPhone spoiled me, and now I think of this as table stakes.
  • I miss my NewsGator reader from the iPhone. Typing in feed URLs directly on the phone is no fun. Also, the RSS reader market is wide-open. Yes, I know I can use Google Reader, but the key for RSS on a phone is being able to read when offline - like on a plane or subway.

I'm really hoping to find some time soon to play w/ the SDK...

A user reported some difficulty in getting an instance of MongoDB running on Ubuntu 8.10.

I spun up an instance of Ubuntu 8.10 32-bit (AMI from http://alestic.com/ - recommended...), installed the JDK, downloaded and installed MongoDB, ran it. Wrote email to user. Spun down instance.

Took about 20 minutes all told.

Definitely worth the $0.10 that AMZN will bill us.

Things can be strange out of context...

I said the following on IM yesterday...

...and seeing your Python, I know you wield a colon w/ the best of them...

Perfectly innocent. Nothing to see here. Move along.

"First, get rid of the good people"

[This has been sitting around for a while... I'll leave it at the three people I have and try for more later]

I'm a long-time "Sun Kremlinologist", mostly driven by my relationship with Sun on behalf of the Apache Software Foundation. I'm also a Java Weenie(TM), and for the longest time, Sun (NASDAQ:---->JAVA<----) was the center of that universe. (The relatively recent ticker change seemed like a futile grasp for the old glories.) I've made a lot of really good friends at Sun over the years, and thus have been watching with admittedly hopeful eyes that Sun might pull itself out of the "nose down, engines off" flight path that it's on ("Saves energy! Quieter!"). Whether you love Sun or hate Sun, the industry just wouldn't be the same without them. I mean, they've been responsible for some really great things (Java, for example) and recently, quite a bit of entertainment - who else could you rely on for such constant hilarity as their broken IP policy or "Java FX"? (Seriously? You want to take on Adobe *and* Microsoft at the same time in a mature market?)

Anyway, their recent round of layoffs was personally distressing. The title of this piece comes from an IM conversation I had with one of the victims - I was trying to figure out the thinking behind the layoffs, and the best I could come up with was "First, get rid of the good people", as it seems that they let go a lot of good older, senior people, people who knew the industry, had the relationships, got things done, etc.

So, like the "Stray Sunbeam" series being done by Tim Bray, I'd like to mention some of the people I know and suggest that if you are looking for solid people who get things done, you hire them. My POV is biased - these are friends. (Some I don't know, but mentioning anyway). Also, they don't know I'm doing this, and some will probably be mad at me. Now that Cheney's bunker is empty, maybe I can use that if they come after me.

  • Onno Kluyt : My good friend and long-time nemesis on the JCP, Onno and I have developed our personal and professional relationships to the point where we could literally be screaming at each other over Sun's utterly foolish and destructive passive-aggressive patent policies (who ever guessed that the "Participation Age" would require a patent license?) and still sit down over dinner moments later. He's a seasoned manager and corporate politician, and I have no idea what he wants to do next. He's also brutally honest - when I first engaged with Joost, and was visiting the Netherlands, I asked him what to look for in the local food, as I like to eat local when I travel. His response? "Have you ever seen a Dutch restaurant outside of Holland?" (If you couldn't tell by the unpronounceable name, he's Dutch).
  • Sara Dornsife : Another good friend, I met Sara during the open source Java wars - she was doing community and developer marketing for... ok, it never was clear - Sun's dev and community marketing strategery never really was that obvious ("Ensure that we do what we can to ostracize the most popular Java IDE - Eclipse - and fracture the market with our own because we can put our logo on it.") She's very versatile and execution oriented, is able to connect with people, understands developers and is able to get things done. I guess the best way to sum it up is that she understands that when trying to get some communities together, it sometimes requires a tequila shot.
  • Ray Gans : Yet another person I consider a friend, Ray was responsible for a lot of activity around the open-sourcing of their implementation of the Java SE spec (i.e. "Open"JDK). Despite the fact that I was seen as a "dangerous enemy" for helping put together Apache Harmony, an open source implementation of the same spec that gave Sun the push to do OpenJDK (see "NIH, fear of, e.g. Eclipse"), he had the wisdom and the open mind to solicit my input into how they should OSS Java (Ray, you picked the wrong license...), graciously invited me to attend the OpenJDK launch, and since then has always ensured that I was welcome to hang with the Sun crowd at conferences and similar events. I think he has a wealth of experience in many areas, but his work in OpenJDK - which I do think is a milestone in open source history - is valuable to anyone either wanting to open a living codebase, or engage in a serious way with open source. He was a corporate guy - not an open source partisan - who jumped into an alien world, and he did very well.

Android #2

Fight the power!

The gPhone certainly does, or at least gobble it. Yesterday, the phone didn't make it to noon before nearly expiring with a low battery. Far worse than the iPhone, which I thought was pretty bad. I was out of the office, wandering about Brooklyn for a meeting, and maybe the WiFi searching was the problem. I also remember using the GPS in maps to be sure I was wandering the right way, and left it on that app. I wonder if that did it - I've done that before w/ iPhone when letting it track current position.

I've turned off the wireless this morning when leaving the house - I'll see if that helps.

Android #1

Search. Where's search? A company whose name has become synonymous with search produces a phone stack w/o mail and contact search?

Maybe I'm dumb and I just can't find it.

Also, AT&T is going to make this difficult. I put the iPhone SIM in and telephony and SMS work, but no data services. I went to the AT&T store, and they said that the iPhone data plan is different than the PDA data plan. They cost the same. The services are the same. But they are different (I suppose so Steve J can get his tithe). I'll call AT&T Central Services and see if I can work something out. Until then, I only get email on the thing when I'm at home.

Re the email client, I'm not convinced it works right wrt IMAP.... Things don't seem to be in sync.

I also forgot to turn comments on for yesterday's post, so that's fixed now.

Update 1 : CrazyBob reminded me that ironically, I failed to turn on comments today. /me wanders off to see if I can change a default.

Update 2 : CrazyBob pointed me at http://groups.google.com/group/android-developers/browse_thread/thread/8fa165d1d2a5f1f0 to fix the data service problem. It works. :) Thanks Bob

Harmonious Android

I managed to acquire a Google Android Dev phone from Munificent Chris DiBona, as he shall henceforth be known. The Android phone has a very special place in my heart simply because code from Apache Harmony ("Putting the 'open' in an open JDK") helps power the phone. So even if I don't use it, it certainly will be a treasured geeky memento for a part of my life that continues to consume passion and energy. (My wife can't stand my collection of geeky mementos.) Anyway...

First, thank you Chris.

Second, thank you Chris.

Third, it's amazingly fast (remember, I've been using an iPhone for the last year+). I'll need to ping Dan and tell him how impressive that is.

Fourth, it has this really nice "geek aesthetic" that you don't find on the very slick, very polished iPhone, and that's a compliment.

Fifth... this gave me pause - the phone doesn't appear to support any bluetooth profiles? I pine for the days of my T68i, where I could pair with my mac, effectively turning the mac into an extension of the phone - I could initiate and answer calls, send and receive SMS, etc. I really, really miss that integration, and wonder why I can't do it on the GPhone. Something to figure out.

The SDK is downloading as I write this - this should be fun.

(P.S. Chris, thanks again!)

About this Archive

This page is an archive of entries from February 2009 listed from newest to oldest.
