Hype Document Loader (for PHP)

If anybody is interested, I developed a server-based PHP Document Loader for Hype generated scripts based on CJSON. This is an early version. Happy Easter.

3 Likes

↑ look at project
1.0.0 Initial release under existing CJSON license
1.0.1 fixes on indices and rendering return values
1.0.2 added loaded in constructor, add fetch_generated_script
1.0.3 Refactored to preg_split over match, additional nameValue rules

I was playing around and it is still work in progress but also added some first examples. One of them being a symbol/Hype output compression that already (WIP) does 50%+ size reduction if the symbol is used across more than one scene… I compressed a complex 1,9MB Hype generated file to 950kb…

sounds painful

Actually not. Coming up with a decent full match regex is much more painful then using a regex to split at specific points. I can recommend the tool from Grant Skinner. It allows save states and has a convenient GitHub login (OAuth).

Using preg_split over a explode allows to split on a key containing the build number that naturally changes from version to version. See…

Lol.
I was making a bad joke.

Still not ringing the joke bell… seems like the nuances are "lost in translation" on me.

↑ look at project
1.0.4 added injection option, added inject_code_before_init

Now, the compression allows to inject the lookup into the loader rather than setting it in window. The compression snippet on GitHub has been updated accordingly but is only a simple approach. I got a version that does additional string substitutions and gains another 10% (mail if interested).

Some words about the compression example and how it works. Hype offers symbols but if you use them extensively (specially if they contain multiple items) and you copy them across scenes you will realize that your exports become huge even though you might not expect that.

The reason behind this becomes clear if you run the exports through a deserialization process:

As you can see the data of a simple symbol is twice in the data tree describing this example file with two scenes (each holding the same symbol). As confirmed this is duo to an initial roll-out of the symbols feature in Hype that didn't consider the nature of instances in the exported data but rather only in the IDE. This also doesn't really show if your project isn't using symbols or uses symbols only in a limited way. Until recently, I wasn't really aware of this either. The project I am involved with started using around eighteen scenes with two layouts (2*18=36) and despite using symbols the export file was growing to fast. Hence, I investigated and learned about the described issue. There is not much one can do about it without changing the way Hype creates exports or by fiddling around with the exported data.

As we can't wait for this to be fixed in a future version of Hype, I ported the Document Loader approach to optimize the data duplications by using a reference to lookup repeating sub-branches in the symbol data. Creating a true instance approach goes beyond what can be done in this project and probably beyond what we as user should be taking care of anyway. But I am still proud to say that the lookup approach yields around 50% compression out of the box. In case of the project in question I combined it with a further string lookup and got to 60% compression (40% file size compared to the original export). You might wonder why I wrote the Document Loader and minification as PHP. That is because the project I am working on uses Hype as an online template engine and hence optimizations are done on the server, but you can use the script for one time optimization before deployment and one could probably integrate it into a Hype Bundle or classic export script (directly as PHP wrapped in a shell or port it to Python). But I currently don't have the time to do so… so this brings me to the reason why I am posting about this: I hope to raise awareness about this issue and hope the script helps somebody in a similar situation and/or explain why this compression oversight might be something we need fixed.

The current code solves my immediate problem but could probably be improved, and I also looked at true compression methods (LZWA, GZ) over the simple lookup that yield even better efficiency to the point of 70-85% compression, but they would require more in-depth integration and time.

4 Likes

Most modern web servers will on-the-fly gzip compress javascript files (commonly apache with mod_deflate). This works great on redundant text data like the Hype data file. It is hard to use compression with json data directly since it still needs to be in a text format, which means expanding it to base64 which usually loses most gains. If the server uses gzip, then double-compression doesn't often deliver any gains anyhow. (and then beyond this, you need to include the decompression code!)

1 Like

The discussion about further compression is totally valid but a topic that is beside the symbol compression that should be implemented. Looking into these "true" compressions is an additional route to explore, and I discarded it for now…

Good point and I am aware of server-side compression given the server is set up correctly and the browser does its thing.

Either way we both agree that there is also the discussion if a binary compression plus the decoder would be smaller than any string-based format. Either way I think that exports should offer as much "compression" as possible despite the web server possibly kicking in. Certainly, one reason people use minification and preprocessor and Hype uses a shorthand lookup. The symbol instance topic is something that should and could be addressed without deploying a true compression as seen in the example files just by adding some minimal JavaScript and restructuring in the document loader. I think we also agree that having a smaller file to begin with is great for ad restrictions, admins looking at log files and traffic (as many count download volume against environmental impact these days).

1 Like

I ran some further tests. The symbol compression benefits zip compression. My results have been that the symbol compressed files get even further compressed that if not prior symbol compressed and directly zipped. So, zipping seems to fail on some more minute details of repeating patterns. I also just put together a JavaScript based zip decoder MVP in a generated_js file that was previously already symbol compressed, then zipped and then base encoded. That thing unpacks itself and runs without any flaws and is 214kb including the decoder down from previously 1,9mb. I get that a similar or even better compression might kick in with on-the-fly mod_deflate stuff, but it is very impressive to see what can be done just with JS alone and fun to play around with.

2 Likes

Since you're hacking away at it, another area (I believe you have mentioned in the past) to potentially do is minification of the javascripts :slight_smile:.

1 Like