Sunday, April 22, 2012

After Two Years: AS3 Vector.<T> Review

Let me start by saying that this post is pretty much a rant with examples. On a given day, I complain a lot, mostly about complainers (not including myself), objectively stupid people, or arrogant people/organizations who make stupid choices (the mythical programming unicorns from which all evil code emerges).

Concerning objectively stupid people: if you wrote a blog post about "Why Adobe Killed Flash," you are stupid. There is a decline in Flash's popularity and, all of a sudden, you think you're the Nostradamus of technology. No one cares that you lost your contracting job making banner ads for Viagra.

The evil mythical programming unicorns are my topic for today. More specifically, the Flash Platform's Vector.<T> class and the mindless semantic differences between it and the Array class. I suppose this is a fairly out-of-date topic, since the Vector.<T> class appeared in the Flash Player 10.0 release. However, I have not noticed any changes or improvements to the APIs since the 10.0 release, and thus the complaining commences:

Just a note: I use the "shorthand" instantiation of both Array and Vector. If you're unfamiliar with this syntax you can read more here, or here's a quick rundown:
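A minimal sketch of the two forms, assuming the snippet runs inside any method or frame script. Variable names are illustrative, and as far as I know the Vector literal needs a compiler from the Flex SDK 4 generation or later.

```actionscript
// Array: literal shorthand vs. constructor
var a:Array = [1, 2, 3];
var b:Array = new Array(1, 2, 3);

// Vector.<T>: literal shorthand vs. constructor plus push()
var v:Vector.<int> = new <int>[1, 2, 3];
var w:Vector.<int> = new Vector.<int>();
w.push(1, 2, 3);
```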


Conversion


Array

The conversion from Array to Vector.<T> is done via a top-level conversion method, similar to XML(xml:String). This is a good choice in my opinion. This is done like so:
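A minimal sketch of that conversion, using the global Vector.<T>() function (variable names are illustrative):

```actionscript
var source:Array = [1, 2, 3];

// Vector.<T>(array) returns a new Vector whose elements are copied
// from the Array and coerced to the base type T.
var ints:Vector.<int> = Vector.<int>(source);

trace(ints.length); // 3
```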


Vector

The conversion from Vector to Array doesn't exist as a utility in AS3. Um, ok. So, I guess we write our own. Since we want to be able to convert any T-type Vector, let's write a static helper function that accepts a * parameter:
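Such a helper could look roughly like this. It is only a sketch: the class name VectorUtil and the method name toArray() are hypothetical, and nothing here validates the argument yet.

```actionscript
package {
    // Hypothetical utility class; the name is illustrative only.
    public class VectorUtil {
        // Accepts an untyped (*) argument so any Vector.<T> can be passed in.
        public static function toArray(vector:*):Array {
            var result:Array = [];
            var n:int = vector.length; // untyped lookup, resolved at run time
            for (var i:int = 0; i < n; ++i) {
                result[i] = vector[i];
            }
            return result;
        }
    }
}
```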


There are problems with this, most notably:
  • No typing on the vector object, so it's going to execute substantially slower.
  • There's no type checking at compile time, so we could pass anything to this method and end up with unexpected run-time errors.

No typing on the vector object
The first bullet point could be solved if we got rid of the generic helper method and just inlined the conversion via an anonymous function or a loop.
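Both inline variants can be sketched as follows, assuming a Vector.<String> input; the variable names are illustrative. Because the vector is fully typed in each case, the element access no longer goes through an untyped reference.

```actionscript
var names:Vector.<String> = new <String>["a", "b", "c"];

// Option 1: an anonymous function that is invoked immediately
var viaClosure:Array = (function(v:Vector.<String>):Array {
    var out:Array = [];
    for (var i:int = 0; i < v.length; ++i) {
        out[i] = v[i];
    }
    return out;
})(names);

// Option 2: a plain for loop
var viaLoop:Array = new Array(names.length);
for (var j:int = 0; j < names.length; ++j) {
    viaLoop[j] = names[j];
}
```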


I personally like the anonymous function, just to make it clear that we're executing a conversion function on the Vector input and getting an Array output, but it might seem too "cute" for co-workers, so you may find work a more friendly place if you just use a for loop.

Unexpected run-time errors
The second bullet point, we can "solve" in a few ways as well. Let's say that my application has decent error handling, which requires a semi-intelligent use of Error objects when expected exceptions occur. Since the point is to avoid unexpected run-time errors, let's validate our input, and throw a more reasonable error, which can be caught and handled:


Using these new helpers, let's validate our original helper method before performing the iteration:
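A sketch of what validation plus conversion might look like together. The names VectorUtil, isVector() and toArray() are hypothetical. The type check assumes, as far as I know, that Vector.<*> matches every object-typed Vector while Vector.<int>, Vector.<uint> and Vector.<Number> have to be tested separately.

```actionscript
package {
    // Hypothetical utility class; all names are illustrative only.
    public class VectorUtil {
        // True if value is any kind of Vector.<T>.
        public static function isVector(value:*):Boolean {
            return value is Vector.<*>
                || value is Vector.<int>
                || value is Vector.<uint>
                || value is Vector.<Number>;
        }

        // Converts any Vector.<T> to an Array, failing early with an
        // error that callers can reasonably catch and handle.
        public static function toArray(vector:*):Array {
            if (!isVector(vector)) {
                throw new ArgumentError("toArray() expects a Vector.<T> instance.");
            }
            var result:Array = [];
            var n:int = vector.length;
            for (var i:int = 0; i < n; ++i) {
                result[i] = vector[i];
            }
            return result;
        }
    }
}
```

This still does nothing for compile-time checking; it only turns an obscure run-time failure into a predictable one.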


Thoughts

Why does it have to be so complex? Why is it so simple to convert an Array to a Vector, but not the other way around? The reason I have a problem with the lack of conversion is due to the nature of Vector and Array. Vectors are not only nice type-friendly collection objects, they're also very fast during iteration. They're also unlike Array in that they're strictly indexed (in order). Because of this, certain operations like element removal via splice() and shift() are slower on a Vector than on an Array. This makes sense because the Vector has the additional overhead of maintaining index order internally.

So, if I'm executing some sort of filtering logic on a collection of elements, it's typically best for me to stick with Array. However, in terms of creating a user-friendly API, I prefer strongly typed Vector structures.

What I am left with is the decision to either stick with Array and avoid Vector, or implement two-way conversions (which we've already seen is a bit of a headache). Perhaps there are better ways to go about this. Let's continue.

Concatenation


Array

Array + Array concatenation is fairly simple, and probably what anyone would expect:
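For instance (a minimal sketch with illustrative names):

```actionscript
var a:Array = [1, 2];
var b:Array = [3, 4];

// concat() returns a new Array and leaves both inputs untouched.
var c:Array = a.concat(b);

trace(c); // 1,2,3,4
```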


The concat() method will also accept multiple Array objects:
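A quick sketch with three arrays (names are illustrative):

```actionscript
var a:Array = [1, 2];
var b:Array = [3, 4];
var c:Array = [5, 6];

// Every Array argument is appended in order.
var all:Array = a.concat(b, c);

trace(all); // 1,2,3,4,5,6
```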


Vector

Vector + Vector concatenation is also fairly straightforward, and works similarly to Array:
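The Vector equivalent, sketched with illustrative names:

```actionscript
var a:Vector.<int> = new <int>[1, 2];
var b:Vector.<int> = new <int>[3, 4];

// As with Array, concat() returns a new Vector.<int>.
var c:Vector.<int> = a.concat(b);

trace(c); // 1,2,3,4
```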


And just like Array's concat(), we can specify multiple Vector.<T> objects to concatenate:
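Again as a sketch (names are illustrative):

```actionscript
var a:Vector.<int> = new <int>[1, 2];
var b:Vector.<int> = new <int>[3, 4];
var c:Vector.<int> = new <int>[5, 6];

// All arguments share the base type int, so this is accepted.
var all:Vector.<int> = a.concat(b, c);

trace(all); // 1,2,3,4,5,6
```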


Pretty much the same, right? Let's go back to Array.

Array

Array's concat() also lets you concatenate individual elements:
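A minimal sketch (names are illustrative):

```actionscript
var a:Array = [1, 2];

// Non-Array arguments are appended as single elements.
var b:Array = a.concat(3, 4);

trace(b); // 1,2,3,4
trace(a); // 1,2 (the original is preserved)
```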




You can also use a combination of individual elements and Array objects:
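For example, mixing both kinds of argument in one call (a sketch with illustrative names):

```actionscript
var a:Array = [1, 2];

// Array arguments are flattened one level; everything else is
// appended as a single element.
var b:Array = a.concat(3, [4, 5], 6);

trace(b); // 1,2,3,4,5,6
```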


Vector

For some reason, whoever wrote the Vector concat() method decided to subtly rip off the entire AS3 community by quietly killing the ability to pass individual elements to the concat() method.
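A sketch of what that looks like in practice. As far as I can tell from the Vector documentation, a non-Vector argument results in a TypeError at run time; the call compiles because concat() takes a rest parameter. The exact error message is deliberately not reproduced here.

```actionscript
var v:Vector.<int> = new <int>[1, 2];

try {
    // Compiles fine, but 3 is not a Vector.<int>.
    var w:Vector.<int> = v.concat(3);
    trace(w);
} catch (e:Error) {
    // Expected to be a TypeError according to the docs.
    trace("concat(3) failed: " + e.message);
}
```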



So, clearly you are only allowed to pass Vector objects with the same T-type to the concat() method. This seems wasteful:
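Two workarounds, sketched with illustrative names: wrap the loose elements in a throwaway Vector, or copy the original and push onto the copy.

```actionscript
var v:Vector.<int> = new <int>[1, 2];

// Workaround 1: allocate a temporary Vector just to hold the elements.
var viaWrap:Vector.<int> = v.concat(new <int>[3, 4]);

// Workaround 2: copy (concat() with no arguments), then append.
var viaCopy:Vector.<int> = v.concat();
viaCopy.push(3, 4);

trace(viaWrap); // 1,2,3,4
trace(viaCopy); // 1,2,3,4
trace(v);       // 1,2
```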


We have to write three lines of code to create, copy, and append for Vector, while this can be done in two lines (preserving the original) with Array.

Thoughts

Again, we have a very similar set of APIs on two collection classes, and while the semantics of both seem identical (even after reading the documentation), they have subtle, yet annoying, differences.

Sorting


Array

I'm certainly a fan of the native array sorting that can be done in Flash. It's very easy to use, and much faster than a homemade AS3 sorting function. Let's talk about the two sorting methods available on Array:

Custom Sorting Function
The sort() method allows you to pass a few different types of parameters, which you can read about in the AS3 Array sort() documentation, but for the sake of time, let's use the custom sort function method:
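A minimal sketch of a custom compare function on an Array of ints (names are illustrative):

```actionscript
var values:Array = [3, 1, 2];

// The compare function returns a negative number, zero, or a positive
// number, depending on whether a sorts before, with, or after b.
values.sort(function(a:int, b:int):int {
    return a - b;
});

trace(values); // 1,2,3
```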



Vector

Not surprisingly, the Vector class has a sort() method as well and, would you believe it, it takes the exact same parameters as Array's sort(). At least, from what I can read in the ASDocs and from all my tests, they take the same type of arguments.
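The same custom compare function, sketched against a Vector.<int> (names are illustrative):

```actionscript
var values:Vector.<int> = new <int>[3, 1, 2];

// Same contract as Array.sort(): sorts in place using the callback.
values.sort(function(a:int, b:int):int {
    return a - b;
});

trace(values); // 1,2,3
```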


Alas, it seems the same lazy asshole who wrote the Vector concat() must have also copied and pasted the sort() method from Array, because according to the Vector ASDocs, here's how I should use the sort options with the sort() method:
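A sketch of that usage, assuming a player and playerglobal version whose Vector.sort() accepts sort options in place of a compare function, as the ASDocs describe. Note that the option constants live on Array, not on Vector.

```actionscript
var values:Vector.<int> = new <int>[3, 1, 2];

// The flags are Array constants, combined with bitwise OR.
values.sort(Array.NUMERIC | Array.DESCENDING);

trace(values);
```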


Of course! Now it's ok to pass Array data to a Vector method!

Array

Being able to use a custom sort() function on Array is a huge plus, since there may not always be a clear way to sort a collection of elements. Even though the guts of the sort() method run natively, if you use a custom sorting function, it has to be called on N elements, and obviously, that code execution isn't native. Either way, sort(compare) is still fast, and to make it possible to do a semi-custom sort, the Array APIs added an elegant and simple hook for sorting objects using their properties: sortOn():
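A minimal sortOn() sketch; the player objects and their fields are made up for illustration.

```actionscript
var players:Array = [
    { name: "Bea", score: 70 },
    { name: "Al",  score: 95 },
    { name: "Cy",  score: 82 }
];

// Sort by the "score" property, numerically, highest first.
players.sortOn("score", Array.NUMERIC | Array.DESCENDING);

trace(players[0].name); // Al
```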


From an AS3 perspective, this is such a powerful tool. Native speed, custom property hooks, and basic sorting strategies all implemented in one line of code that's immediately easy to understand.

Vector

Vector does not have the method: sortOn(). This forces us to use the sort() method, or manually swap things in AS3. I just don't understand!
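The closest equivalent I know of is a compare function that reads the property by hand. A sketch, with made-up player objects:

```actionscript
var players:Vector.<Object> = new <Object>[
    { name: "Bea", score: 70 },
    { name: "Al",  score: 95 },
    { name: "Cy",  score: 82 }
];

// Stand-in for sortOn("score", Array.NUMERIC | Array.DESCENDING):
// the comparison runs in AS3 for every pair the sort examines.
players.sort(function(a:Object, b:Object):int {
    return b.score - a.score;
});

trace(players[0].name); // Al
```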

Final Thoughts

I've been using the Vector class since its addition to the Flash APIs for the Flash Player 10 release, and I've definitely seen how much speed improvement you can get by simply swapping in Vector for Array. Let's also not forget how great it is to finally be able to associate a single type with a collection of elements. Let me also commend the Flash Platform team for taking a non-existing generics standard, and cutting a few corners to bring developers something that "was as close as possible" at the time.

However, in the past few years, the only real reason (unfortunately) I have used Vector is for type clarity, and because it didn't really matter whether I used Array or Vector. To me, Vector.<int> is a lot less confusing to look at than just plain ole Array. In most of the Flash games we've written at Electrotank, we just don't use a lot of super heavy iteration like you'd see in the "300,000 pixel pushing particles demo," and we quickly noticed that mindlessly swapping in Vector for every occurrence of Array was NOT the way to go. In fact, the conversion problem I mentioned earlier will turn your code into spaghetti very quickly, and many times we saw slower performance (especially using splice() and shift()).

Of course, I don't cover all of the differences between Array and Vector in this article, as I've chosen to focus on the ones that affect me the most. In my opinion, using a faster, strongly typed data structure shouldn't be a burden on the developer. In fact, I may go as far as to say that you should be able to pass a Vector anywhere you can pass an Array. At the minimum, a native conversion from Vector -> Array is needed.

My disappointment is in the lack of effort that has been applied to the Vector class in the past few years. The subtle differences between Vector and Array make them far too difficult to use together. Collections are supposed to be available to a developer to use for different circumstances, and in most cases (like Java, for example), one collection can be easily converted into another. As a result, you end up with well organized, high performing code.

Unfortunately, the Flash Platform didn't see it the same way.

The claims I've made in this post are based on my general experience with AS3 and Flash Player 10 and higher. I'm a bit over the top at times, and some of the views expressed may be exaggerated, so take it all with a grain of salt. It'd be unfair for me to dish out comments like this if I wasn't prepared to accept corrections or disagreement, so please feel free to set me straight, or disagree!

7 comments:

Jobe Makar said...

That's all you have to say?

Matt Bolt said...

Of course it isn't, but I would need at least two lifetimes to get through it all.

Benjamin Jordan said...

Excellent article. I've always been frustrated with the lack of native vector to array conversion. I honestly see very little use for the Vector class except in public APIs because it's so difficult to use, and its speed benefits are so rarely applicable. It seems so out of place when compared to the rest of the language.

Alan Klement said...

RE: Vector -> Array conversion.

From a computer engineering POV. This is a really bad idea to do. The memory allocation and behavior of Array is dynamic - which is why it's slow. Every time you augment an Array, it's likely that the VM:

1) Creating a new memory block the same as the old one +1.
2) Copy the existing array to the new block (near the old Array)
3) Add the additional element
4) Kill old array

With Vectors, (being more static) the memory size is immutable, then the VM can allocate the proper amount of memory.

So if you're going Vector -> Array....you probably have to:

1) Figure out and convert how much memory you need when going Vector --> Array
2) Create the memory block(s)
3) Copy over data (this may take longer b/c the Vector memory is probably not near the Array memory as with Array -> Array)
4) Kill Vector memory

Matt Bolt said...

@Alan

You're correct concerning the static memory allocation for Vector as long as you set the fixed boolean to true in the constructor. If you use the Array -> Vector conversion, the end result is still a Vector with mutable size.

You sparked my curiosity, so I jumped into the Tamarin source and had a look at Array and Vector: Array and Vector allocations from Tamarin Source

It does appear that Array uses a slightly different allocation method, but purely based on its "double" personality, where it behaves as a hash table and/or an index based Array.

From this first pass through, I'm mostly in agreement with you as to why they didn't add the conversion, but I don't see the exact reason why.

I suppose my initial complaint wouldn't change - Converting from Vector -> Array would be much faster running natively than it would be using actionscript to copy.

Jeff said...

What frustrates me even more is the half-assed use of Vector in the spark List and DataGrid components. Their "selectedItems" properties force you to use "Vector.<Object>". So not only are you forced to deal with Vectors, but those Vectors are typed uselessly.

Full language support should allow the creation of a List.<Number> whose "selectedItems" would return a "Vector.<Number>".

Full language support would also have a parent class or interface for Vector and Array, allowing the developer to abstractly deal with either with the same code.

When Vector came out, we all (my coworkers and I) jumped the gun by converting many uses of Array in our core libraries to typed Vector implementation. As you undoubtedly experienced as well, the Vector-Array headaches mounted, and we eventually converted nearly everything back to Array for improved simplicity.

Matt Bolt said...

I totally didn't realize they forced Vector into the list components like that. Considering that the flex collections are hard enough to deal with as is, slapping a mindless Vector.<Object> in there doesn't help at all. Thanks for sharing.

I have another one:
Vector.<T>::map() which maps a Vector.<T> to a Vector.<T>. That's NOT an implementation of map()... limiting the transformation to the same base type is ridiculous and renders that method useless.