## CSSMatrix 3D Transformations

Well, I’ve spent a good weekend figuring out CSS3 3D transformations. In short, I now think they are intelligently designed to fit web design needs; however, coming from a Direct3D background, I kept looking for a camera in the W3C API.

In this post I show how to make a touch-sensitive, interactive 3D model of the iPad using matrix transformations.

Although the iPad is not a perfect cuboid, for simplicity we assume it is. Google and download pictures of the 6 sides of the iPad. Read this excellent Introduction to CSS 3-D Transforms on how to make 3D cuboids using CSS3. It is fairly easy and straightforward.

I want to rotate the model by touching the screen or moving the mouse. We’re not digging into the details of touch event handling here; check out Touching and Gesturing on the iPhone for a nice discussion of the subject.

Obviously we move our fingers on the phone’s screen, or the mouse on a flat 2D surface, but we want to rotate our iPad in 3D space. Luckily any rotation in 3D space can be decomposed into 3 elemental rotations around the axes of a coordinate system (frame of reference), using Euler angles, meaning that a user can rotate the model to any desired state with no more than 3 gestures.
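A naive version of this idea can be sketched in a few lines. This is only an illustration, not the code behind the demo: the function names, the `degreesPerPixel` factor and the sign conventions are my assumptions. It accumulates two angles from drag deltas and builds a transform string from them.

```javascript
// Naive approach: accumulate two Euler angles from drag deltas.
// degreesPerPixel and the sign conventions are assumptions for illustration.
function applyDrag(angles, dx, dy, degreesPerPixel = 0.5) {
  return {
    x: angles.x - dy * degreesPerPixel, // vertical drag rotates around X
    y: angles.y + dx * degreesPerPixel  // horizontal drag rotates around Y
  };
}

// Build the value one would assign to the element's transform property.
function toTransform(angles) {
  return `rotateX(${angles.x}deg) rotateY(${angles.y}deg)`;
}

const afterDrag = applyDrag({ x: 0, y: 0 }, 40, -20);
console.log(toTransform(afterDrag)); // rotateX(10deg) rotateY(20deg)
```

In a browser the resulting string would be assigned to the model element's `style.webkitTransform` inside a touch or mouse move handler.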

It works; you can freely rotate the model in any direction. But something’s not quite right: the UI response to gestures is not intuitive. Sometimes when you move your fingers to the left the model rotates upward; another time it rotates down and to the right. The problem is that every time you rotate the model you change the orientation of its axes. You can leave your program as it is; it is sellable, and in fact I’ve bought programs with this bug before. The rest of this post explains a solution to this problem.

## Linear Transformations

All CSS3 transformations (rotate, scale, skew) are reversible (degenerate cases such as scaling by 0 aside): you can rotate an object 40 degrees clockwise, scale it to 2x, then shrink it to half and rotate it 40 degrees counterclockwise, and you end up with the object in its initial state. Another interesting feature of CSS3 transformations is that although an image can be distorted by a transformation, straight lines don’t curve or bend; they remain straight under any transformation. Each point of the original image is always mapped to one and only one point of the transformed image. These are the characteristics of linear maps.

Any linear map can be represented by a transformation matrix. In our case, in a 3-dimensional space, it is a 3×3 matrix. I won’t dig into the technical details of matrices, simply because we don’t need to know those details. Check out your old analytic geometry textbook.

Given a transformation matrix M, any point P of our object will be transformed by this matrix product:

P’ = M * P

(Here we use the fact that points can be represented by their position vectors, hence column matrices)

There’s one wrinkle with translation. In good old geometry, a translation can be represented by a vector (when you translate an object, you move it along a path that has a direction and a length), and you find the translated coordinates of a point P by adding the translation vector to its original coordinates: P’ = P + T. There’s a trick for combining translation with the other linear transformations: use 4×4 matrices.

If M is a 4×4 transformation matrix and P is a 4×1 column vector representing the coordinates of a point in space (with the last row of the vector set to 1, so that the translation part of M takes effect), then:

P’ = M * P

P’ (the matrix product of M and P) is the coordinate of our point after the object has been transformed by M.

M can represent any combination of scaling, rotation, translation, or skew.
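Here is a minimal sketch of the 4×4 trick using plain JavaScript arrays (not `CSSMatrix`). The helper names are mine, and it assumes the usual convention where a point carries 1 in its fourth row; with 0 there, translation has no effect, which is what you want for a pure direction.

```javascript
// Multiply a 4x4 matrix (array of 4 rows) by a 4x1 column vector.
function transformPoint(m, p) {
  return m.map(row => row.reduce((sum, value, i) => sum + value * p[i], 0));
}

// Translation expressed as a 4x4 matrix: the offsets sit in the last column.
function translation(tx, ty, tz) {
  return [
    [1, 0, 0, tx],
    [0, 1, 0, ty],
    [0, 0, 1, tz],
    [0, 0, 0, 1]
  ];
}

const T = translation(5, 0, -2);
console.log(transformPoint(T, [10, 20, 30, 1])); // point moves: [15, 20, 28, 1]
console.log(transformPoint(T, [10, 20, 30, 0])); // direction is unchanged: [10, 20, 30, 0]
```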

## CSSMatrix

CSS gives us the option to define our desired transformation by a 4×4 transformation matrix. This method is virtually useless in the declarative CSS way: for a 3D transformation you have to calculate the matrix elements and pass 16 parameters to the `matrix3d()` function (for an example of how obscure the code might become, check out the `rotate3d()` definition in W3C’s CSS 3D transforms draft). But it is easy and very convenient to use this matrix in JavaScript code, thanks to the DOM’s CSSMatrix interface. Currently (Sep. 2011) WebKit implements this interface with the `WebKitCSSMatrix` type.

We initialize an instance of `WebKitCSSMatrix` by passing a valid string value of the `-webkit-transform` CSS property. So one can construct it with something like this:

    new WebKitCSSMatrix("scale3d(1,2,1)")

or

    new WebKitCSSMatrix("scale3d(1,2,1) rotate3d(0,0,1, 45deg) translate3d(100px, 0, -20px)")

Here’s where a useful little function of the window object comes in handy: `window.getComputedStyle()`. `getComputedStyle()` takes a DOM element and returns an instance of `CSSStyleDeclaration`, which represents all the style properties currently set for the element. It is also a dictionary. You can get the current transform value by `window.getComputedStyle(element)["-webkit-transform"]` or by the `window.getComputedStyle(element).webkitTransform` property. Its value is in the form of `matrix()` or `matrix3d()`. To get the current `CSSMatrix` that is applied to an element use:

`m = new WebKitCSSMatrix(window.getComputedStyle(element).webkitTransform)`

`CSSMatrix` is indeed a 4×4 matrix (its properties are named m11 to m44); its `toString()` method returns its CSS representation (in `matrix()` or `matrix3d()` form).

It also provides a handful of useful functions for matrix manipulation. These functions don’t mutate the object; they return a new instance of `CSSMatrix`:

- multiply
- inverse
- translate
- scale
- rotate
- rotateAxisAngle
- skewX
- skewY

Check out Apple’s documentation.

I learnt the hard way that the `multiply()` function doesn’t work exactly as I understood from the documentation. The text says, and I naturally expected, that given matrices A and B, `A.multiply(B)` must be equal to `A * B` in math notation. But it turned out that it is actually equal to `B * A`.

Back to our original problem: let’s differentiate between the model’s frame of reference and the world (device viewport) frame of reference. Your viewport (the computer’s screen) has a static frame of reference (for our purposes). The viewport axes, Up (Y), Right (X) and Facing you (Z), are attached to the device; they don’t change with respect to the device. But the directions of your 3D model’s axes (X’, Y’, Z’) change as you rotate it inside the viewport. Our inconsistent UI response happened because when we move our fingers upward on the device, we expect the model to rotate around the device X axis, but `rotate3d(1,0,0,#deg)` actually rotates the model around its own X’ axis.

Luckily the `rotate3d(x,y,z,#deg)` function can rotate the model around any arbitrary axis (defined by the vector [x,y,z] here). So the problem boils down to finding which axis of the rotated object is parallel to the device X and Y axes after an arbitrary rotation.

We know that rotation can be represented by a linear transformation matrix. If V’ is an arbitrary axis on the model, that was parallel to V (an axis in device frame of reference) prior to the rotation, then we can find what axis on object is now parallel to V after the rotation, by:

V’’ = M * V’

(where M is the transformation matrix that defines the rotation)

If the model is not transformed (when `window.getComputedStyle(element).webkitTransform` is the identity matrix), V’’ is parallel to V’, which is parallel to V.

Note that we represent an axis by a vector parallel to it, so the column vector $\begin{bmatrix}1\\0\\0\end{bmatrix}$ represents the X axis, $\begin{bmatrix}0\\1\\0\end{bmatrix}$ represents Y and $\begin{bmatrix}0\\0\\1\end{bmatrix}$ represents Z. A rotation can be represented by a 3×3 matrix:

$$M = \begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix}$$

There you go. You can extract the `CSSMatrix` m11..m33 elements and write a little bit of JavaScript to produce V’’. Or use the `CSSMatrix.multiply()` function, which takes a `CSSMatrix` as its argument; then you have to construct a 4×4 representation of the axes by padding the column vector and setting all the other elements to 0.

In JavaScript:

    var deviceXAxis = new WebKitCSSMatrix("matrix3d(1,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0)");
    var deviceYAxis = new WebKitCSSMatrix("matrix3d(0,0,0,0, 1,0,0,0, 0,0,0,0, 0,0,0,0)");

The result of `deviceXAxis.multiply(transformation)` is the object’s X’’ axis that is currently parallel to the device X axis.
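The underlying idea can be checked with plain arrays, outside the browser. This is a sketch under my own assumptions, not WebKit's implementation: matrices are arrays of rows, vectors are columns, and the transformation is a pure rotation (so its transpose is its inverse). Under those assumptions, the model-space axis that currently points along a device axis is the transposed rotation applied to that device axis.

```javascript
// Multiply a 3x3 matrix (array of rows) by a column vector.
function apply(m, v) {
  return m.map(row => row[0] * v[0] + row[1] * v[1] + row[2] * v[2]);
}

// Transpose; for a pure rotation the transpose is also the inverse.
function transpose(m) {
  return m[0].map((_, col) => m.map(row => row[col]));
}

// Example rotation: 90 degrees around Z (maps model X onto device Y).
const M = [
  [0, -1, 0],
  [1,  0, 0],
  [0,  0, 1]
];

const deviceX = [1, 0, 0];
// Model-space axis that currently points along device X:
const modelAxis = apply(transpose(M), deviceX);
console.log(modelAxis);           // [0, -1, 0]
console.log(apply(M, modelAxis)); // back to device X: [1, 0, 0]
```

Rotating around `modelAxis` in model space is then the same as rotating around the device X axis on screen.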

The following function rotates the model around device X and Y axes (resulting in a natural user experience):

    function rotateModel (xRot, yRot) {
        // get the current transformation matrix:
        var m = new WebKitCSSMatrix(window.getComputedStyle(cube).webkitTransform);
        // Model Y’ axis that is now parallel to device Y axis:
        var yAxis = ipad.deviceYAxis.multiply(m);
        // Rotate around Y’:
        var m1 = m.rotateAxisAngle(yAxis.m11, yAxis.m21, yAxis.m31, yRot);
        // Model X’ axis that is now parallel to device X axis:
        var xAxis = ipad.deviceXAxis.multiply(m1);
        // Rotate around X’:
        var m2 = m1.rotateAxisAngle(xAxis.m11, xAxis.m21, xAxis.m31, xRot);
        // Apply the final rotation matrix to the model:
        cube.style.webkitTransform = m2.toString();
    }

## HTML5 mobile games

For everybody who got here looking to sell their games to my company: you can catch me on Twitter (@homam), Google, or Windows Live (whichever you like :)).

## Mutually Dependent Systems

In this post I am discussing a problem that I have faced several times in the past year. Simplicity is always a goal in design as it saves resources during development and maintenance. But it’s not always clear which design is simpler. Sometimes a seemingly complex design turns out to be simpler to develop, maintain and extend.

In a master-slave architecture, assume S1 is the master. It produces one or more tasks from a given job and transfers them to S2 (the slave); S2 does the tasks and returns the results to S1. S2, the slave, depends on S1, the master.

If, the next time S1 assigns a task to S2, it uses information from the result of a previous task that had been assigned to S2, then S1 also depends on S2 and we have a mutually dependent couple.

In our terminology the systems are mutually dependent if and only if S1 uses the information it gained as a result of a previous task that it had already assigned to S2. It doesn’t matter if S2 has completed the previous task or not, but it should have reported something to S1 that is useful for S1 for a next assignment of a task to S2.

If S1 is only using the fact that S2 is busy or free then we don’t call it a mutual dependency. S1 must use the information that is generated by processing a task at S2. For example a MapReduce system is not a mutually dependent system.
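The distinction can be sketched in code. Everything here is hypothetical and only for illustration: `runOnS2` stands in for a slave that doubles its input, and the two `run…` functions play the role of S1.

```javascript
// S2 (the slave): a stand-in that just doubles its input.
function runOnS2(task) {
  return { value: task.input * 2 };
}

// Not mutually dependent: every task is derived from the job alone.
function runIndependent(job) {
  return job.map(input => runOnS2({ input }).value);
}

// Mutually dependent: S1 folds the previous result into the next task.
function runMutuallyDependent(job) {
  const results = [];
  let previous = 0;
  for (const input of job) {
    const result = runOnS2({ input: input + previous });
    results.push(result.value);
    previous = result.value;
  }
  return results;
}

console.log(runIndependent([1, 2, 3]));       // [2, 4, 6]
console.log(runMutuallyDependent([1, 2, 3])); // [2, 8, 22]
```

In the first case the tasks could be handed out in any order or in parallel; in the second, S1 cannot build the next task until S2 has reported back.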

Why is it important? You should have already guessed that S2 is the name of a class of slave systems that work with S1. There could be many instances of S2. Let’s define a homogeneous mutually dependent system as a system in which all slaves of S1 are in the same class.

Two slaves are of the same class if they share a common interface for communicating with S1.

Now assume that S3 is also a slave of S1. S3 is in a different class from S2 if either its input or its output interface is different from S2’s.

When designing mutually dependent systems we always have to decide whether to keep the mutual dependencies or to break them by introducing new nodes. It’s mainly a decision about complexity. The other factor that may affect your decision is the responsiveness of the system: introducing a new node will usually make it slower to respond.

For instance, a new node must not be added if S1 waits for S2 to return. Generally you should keep the number of nodes as small as possible if the operations are not asynchronous.

Homogeneous mutual dependency is OK (when the systems are simple and synchronous), but things get much dirtier as we introduce new classes to the system. On the other hand, if extensibility is a goal you should try to avoid mutual dependencies.

In conclusion, use mutually dependent systems in live systems, where a rapid response is required, and try to avoid them by introducing middle nodes if you have many classes of slaves or if extensibility is a goal.

## Canvas Intellisense in Visual Studio

I was playing with the HTML5 Canvas element to see how useful it could be for future web-based game development. I like that it is easier than GDI. I haven’t yet done much performance testing, but it is definitely faster than making games by animating DOM elements.

Recently I had some free time, so I decided to create vsdoc documentation for the Canvas element interface for Visual Studio. I added intellisense (auto-completion) and some help and tips.

Download canvas-vsdoc.js and canvas-utils.js from CodePlex.

It is tuned to work with VS2010, but we can make it work with VS2008 too.

canvas-vsdoc.js contains the intellisense documentation.

canvas-utils.js has a few utility functions (like detecting if the browser supports Canvas) and some enumeration types for things like Line Joins, Repeations, Text Aligns, etc.
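For reference, detecting Canvas support usually comes down to a check like the one below. This is a generic sketch, not the code in canvas-utils.js; `supportsCanvas` is a name I made up, and it takes the document as a parameter so it can also be exercised outside a browser.

```javascript
// Feature-detect Canvas support. `doc` is normally the global `document`;
// it is a parameter here so the function can be exercised outside a browser.
function supportsCanvas(doc) {
  const element = doc.createElement("canvas");
  return Boolean(
    element &&
    typeof element.getContext === "function" &&
    element.getContext("2d")
  );
}
```

In a page you would call `supportsCanvas(document)` before touching any drawing code and fall back to something else when it returns `false`.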

To use the intellisense you need to reference canvas-vsdoc.js in the beginning of your JavaScript file, like this:

    /// <reference path="canvas-vsdoc.js" />

Note you can just drop the .js file and Visual Studio will write the reference.

Then use a utility method to get a reference to canvas element:

var canvas = Canvas.vsGet(document.getElementById("canvas1"));

Canvas.vsGet(element) receives an HTML element and returns the element itself at runtime. At design time it returns a Canvas.vsDoc.VSDocCanvasElement object that contains the documentation.

Then you can use the canvas element as usual:

    var ctx = canvas.getContext("2d");
    ctx.arc(50, 50, 25, 0, Math.PI, true);
    …

Please note that canvas-vsdoc.js must not be included at runtime, but canvas-utils.js should be included (if you want to use Canvas.vsGet() and the other utilities).

In VS2008 you have to trick the environment by assigning the variable that refers to the 2D context to Canvas.vsDoc.Canvas2dContext, with something like this:

    var ctx = canvas.getContext("2d");
    if (typeof DESIGN_TIME != "undefined" && DESIGN_TIME)
        ctx = Canvas.vsDoc.Canvas2dContext;

The DESIGN_TIME global variable is defined inside canvas-vsdoc.js. At runtime it should be undefined or false.

Just a note: if you still want to work in IE, you will find this Google extension very interesting: http://code.google.com/chrome/chromeframe/

Update: Visual Studio 11 natively supports canvas intellisense.

## iframe Cookies in Safari

Older Hyzonia games depend on session and authentication cookies. This dependency has been fixed in the newer games by storing session ID in JavaScript variables. The cookie independent services explicitly require a session ID to be sent by their clients.

In this post I am not going to dig into the details of session management in the Hyzonia platform; I just want to highlight a series of problems in the old schema that led us to redesign the session management behavior.

Hyzonia games can be embedded in publishers’ websites using a piece of code we call Hyzobox. Hyzobox basically renders an iframe in the webpage. The internet domain where the actual game is hosted could be different from the publisher’s domain. If you have ever tried this before, you know that you run into a lot of cross-site security issues.

To address cross-site scripting issues we developed the Hyzobox In/Out API. A publisher can control certain things in the game and be notified about the events that are occurring inside the game using In/Out. It is a JavaScript-based solution and, surprisingly, it is supported in all major browsers. The In/Out API is not public yet, but we are using it extensively on www.hyzogames.com. For instance, whenever you win a game, Hyzogames.com will be notified about this event (winning) and may show you a message box.

But cookies are another issue. Different browsers behave very differently when it comes to handling cookies in iframes. For starters, for it to work in IE you need a P3P header like this:

    CP="IDC DSP COR ADM DEVi TAIi PSA PSD IVAi IVDi CONi HIS OUR IND CNT"

There’s a lot to say here. I have a long-standing view that P3P is generally useful but that this kind of usage is pointless. Anyway, for now just add it to your response and relax.
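Adding the header is a one-liner in most server stacks. As a sketch, here is how it could look against a Node-style response object (anything with a `setHeader` method); this is an illustration I chose, not how the Hyzonia servers do it, and `addP3PHeader` is a made-up name.

```javascript
// P3P compact policy string from the post.
const P3P_POLICY =
  'CP="IDC DSP COR ADM DEVi TAIi PSA PSD IVAi IVDi CONi HIS OUR IND CNT"';

// Adds the P3P header to a Node-style response object (anything with setHeader).
function addP3PHeader(response) {
  response.setHeader("P3P", P3P_POLICY);
  return response;
}
```

With Node’s built-in `http` module you would call `addP3PHeader(res)` inside the request handler, before writing the body of any page that sets cookies from within the iframe.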

But Safari still rejects the cookies that iframes try to write. The rationale is that Safari only wants to accept cookies from websites that the user visits directly. It’s not a bad idea for privacy. Let’s assume that you are visiting a fan website for The Grudge! thegrudgefans.com is using Google AdSense (put any evil multibillion dollar internet ad service instead of Google :D) to show you some ads, or even just running in the background. AdSense is running inside an iframe and it writes a cookie on your computer indicating you’re a fan of nonsense horror teen movies. Now it is written on your face that you’re a fan of The Grudge. AdSense can use this cookie anywhere else on the internet. OK, you get the idea.

The problem was that this privacy feature in Safari was breaking our Hyzobox User Integration (a kind of Single Sign On service). Safari users can always tick a checkbox in the preferences to accept all cookies, but that’s not the default. The workaround is that the page that writes the cookies must be loaded as the result of a direct user request. Literally, this means that before writing any cookie you have to provide a hyperlink (an explicit anchor tag) in your iframe that takes the user to the page that writes the cookie.

## Overbranding

It’s a classic question I ask every UX/UI designer I interview: what’s wrong in this picture?

Tip: The same problem exists here in our own product:

## Higher spatial dimensions

A few days ago I had an interesting discussion about the existence of other dimensions. In my experience most people who have heard about the Big Bang, relativity or even inflationary theory are either unfamiliar with this concept or have only a vague understanding of it, even though higher spatial dimensions are extremely easy to understand and even have applications in everyday circumstances.

Most of us are familiar with the idea of flatlanders (thanks to Carl Sagan): 2-dimensional flat creatures that live on a plane, or a surface. A flatlander can never see a 3rd dimension, but he can deal with it mathematically. A flatlander can certainly understand points, lines, circles and all other 2D geometrical objects. A flatlander needs 2 pieces of information to identify any point in his universe: X and Y. He should have no problem imagining line-lander creatures who need one piece of information to identify positions in their universe: X. He can think of his flatland as made of an infinite number of line-lands. To make a flatland you take a line-land and drag it in a direction orthogonal to it.

Flatlander knows that any position in line-land can be described by:

And any position in flatland can be described by:

Or

Y is a dimension that is unknown to line-landers and is the direction that we dragged the line-land to create a flatland.

We as 3-dimensional creatures can think that our 3D world is made of a flatland, dragged in a direction orthogonal to it. Any position in our world can be described by:

Or

Z is the new direction that is orthogonal to flatland.

You already got the picture, a 4-dimensional creature can drag our volume-land in a direction orthogonal to it to create his 4D world and so on.

(W is the new dimension).

In other words:

A line-lander has no idea of a flatland. The position is only X for him. He describes a 0-land by X.

A flatlander describes a line-land by , a set of s.

A volume-lander describes a flatland by , a set of s

A 4D-lander describes a volume-land by , a set of s.

From elementary mathematics you must remember that a set of objects can be described by a function. For example, if $F$ is a function defined on the real numbers that returns another real number for each real number it receives, then

$$Y = F(X)$$

is the description of a line-land from a flatlander’s point of view.

Other kinds of line-lands include:

And so on.

The line-lander only knows about X. If somebody tells him that his line-land is part of a flatland, he doesn’t immediately find out which function F describes his line-land in the flatland. But F is easily known to the flatlanders who are studying the line-land. Generally F describes the shape of the universe in a higher spatial dimension.

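Let’s talk about an interesting example. Assume that the line-land world as it is seen by our flatlanders is a circle of radius $a$. It means that $Y = F(X) = \pm\sqrt{a^2 - X^2}$. The poor line-lander has no idea about the 2-dimensional shape of his world, but he can find it out.

The line-lander finds out that his world is indeed bounded if he starts walking in one direction and arrives back at the starting point.

A sphere is a 3-dimensional circle; F for a sphere is:

$$Z = F(X, Y) = \pm\sqrt{a^2 - X^2 - Y^2}$$

Let’s rewrite these two equations in a more usual form:

1D Circle: $X^2 + Y^2 = a^2$

2D Sphere: $X^2 + Y^2 + Z^2 = a^2$

Look at the pattern:

| Dimension | Shape | Equation |
|---|---|---|
| 0D | A Pair of Points | $X^2 = a^2$ |
| 1D | Circle | $X^2 + Y^2 = a^2$ |
| 2D | Sphere | $X^2 + Y^2 + Z^2 = a^2$ |
| 3D | 3-Sphere | $X^2 + Y^2 + Z^2 + W^2 = a^2$ |
| ND | N-Sphere | $X_1^2 + X_2^2 + \dots + X_{N+1}^2 = a^2$ |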

The surface of a sphere could be described by many circles. The smallest circle at the north pole is a point, the radius of the circles grows as we reach the equator and then again shrinks back to 0 at south pole. This is best described in this form of spherical coordinates:

is a circle. A sphere is a collection of circles stacked on each other. The circles at north and south pole have 0 radius and the radius of the circle at equator is maximum (a). We can think that the radius of these circles change by: . This is orthogonal to (and is in the new dimension).

So describes a sphere.
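One standard way to write the stacked-circles picture is the usual spherical parametrization. The notation is an assumption on my part: $\varphi$ is the polar angle measured from the north pole and $\theta$ is the angle around the vertical axis.

```latex
\begin{aligned}
x &= a \sin\varphi \cos\theta, \\
y &= a \sin\varphi \sin\theta, \\
z &= a \cos\varphi,
\qquad 0 \le \varphi \le \pi,\quad 0 \le \theta < 2\pi .
\end{aligned}
```

For a fixed $\varphi$, the pair $(x, y)$ traces a circle of radius $a\sin\varphi$, which is 0 at the poles ($\varphi = 0$ and $\varphi = \pi$) and reaches its maximum $a$ at the equator ($\varphi = \pi/2$). Squaring and adding the three lines gives back $x^2 + y^2 + z^2 = a^2$.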

In the same sense you can think that a circle is made of many pairs of points. At the top of the circle the distance between the pairs is 0, it reaches a maximum in equator and again 0 in the bottom.

Now we can extend this model to higher dimension spheres. Take a 3D sphere with radius 0, increase the radius to a maximum and then shrink it back to 0; you have a 3-sphere.

is a 3-sphere (a 4 dimensional sphere)

This way you can make other higher dimensional shapes. Note that here we only talked about spatial dimensions.

For example a stack of lines in Y-direction makes a square, a stack of flat squares in Z direction makes a cube and a stack of cubes in W direction makes a tesseract.
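The dragging construction is easy to play with in code. As a small sketch (the function name is mine), the corners of an n-dimensional unit cube can be built by starting from a single point and, for each new dimension, copying the shape at coordinate 0 and at coordinate 1 along the new axis.

```javascript
// Build the vertices of an n-dimensional unit cube by repeated "dragging":
// each step copies the previous shape at 0 and at 1 along a new axis.
function hypercubeVertices(dimensions) {
  let vertices = [[]]; // a 0-dimensional point
  for (let d = 0; d < dimensions; d++) {
    vertices = vertices.flatMap(v => [[...v, 0], [...v, 1]]);
  }
  return vertices;
}

console.log(hypercubeVertices(2));        // the 4 corners of a square
console.log(hypercubeVertices(3).length); // 8 corners of a cube
console.log(hypercubeVertices(4).length); // 16 corners of a tesseract
```

Each extra dimension doubles the number of corners, so a tesseract has 2⁴ = 16 of them.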

Take a look at the projection of a tesseract (4D cube) in 3D space here.

## Hollywood Principle

Today I was working on refactoring some names in the Hyzobox In/Out API. It is my personal favorite piece of code in the whole platform. In summary, it allows the publisher of a game to customize the game dynamically at runtime by injecting code and executing functions in the context of the game (In), and to be notified about the events that are occurring inside the game (Out).

Inversion of Control is very natural in JavaScript and it is used extensively in the In/Out API. It’s quite different from the popular prototype pattern, but most everyday JavaScript programmers use IoC even though they don’t usually notice it.

Here is a sample in our API:

We register an event listener in Hyzobox, waiting for landing view of the game to be loaded:

    var hb = Hyzobox.createInstance();
    var landingview_loaded = function(hbEvent) {
        // do something with hbEvent
    };
    hb.addEventListener('landingview_loaded', landingview_loaded);

It’s clear that we don’t have control over when the landingview_loaded event will be fired and its handler will be called.

hbEvent argument that is of type Hyzobox.Event has a data attribute that is of type Object. In this example the data is a LandingView instance. We can interact with this object in the event handler:

    var landingview_loaded = function(hbEvent) {
        var view = hbEvent.data;
        view.inject('a-container-id', 'some text');
    };

In this example we are injecting ‘some text’ to an element identified by ‘a-container-id’ inside landing view.

addEventListener() is part of the Out API and inject() is part of the In API. If we want to listen to the events that are occurring inside a particular view, we should register their handlers after the view has been loaded:

    var landingview_loaded = function(hbEvent) {
        var view = hbEvent.data;
        view.attachEventListener('playNow', function playNowHandler(playNowHbEvent) {
            // do something with playNowHbEvent
        });
    };

And so on, we can have many nested Ins and Outs.

It’s easy to see that in these examples control has been transferred to the event handlers. The caller raises the events, but it’s the responsibility of the handlers to control the functionality.

I don’t want to go into the details now, because the work on these APIs hasn’t been finished yet. Once they are released, we will post them in the Hyzobox documentation and on the Hyzonia tech blog in the following weeks.
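The pattern itself fits in a few lines. The sketch below is a toy stand-in, not the Hyzobox implementation: `TinyEmitter` and its `fire` method are names I made up to show who calls whom.

```javascript
// A tiny stand-in for an event source; TinyEmitter is hypothetical, not Hyzobox.
class TinyEmitter {
  constructor() {
    this.listeners = new Map();
  }

  // Client code only registers interest; it never calls the handler itself.
  addEventListener(type, handler) {
    if (!this.listeners.has(type)) this.listeners.set(type, []);
    this.listeners.get(type).push(handler);
  }

  // The emitter, not the registering code, decides when handlers run.
  fire(type, data) {
    for (const handler of this.listeners.get(type) || []) {
      handler({ type, data });
    }
  }
}

const log = [];
const emitter = new TinyEmitter();
emitter.addEventListener("landingview_loaded", e => log.push(e.data));
emitter.fire("landingview_loaded", "view-1");
console.log(log); // ["view-1"]
```

"Don’t call us, we’ll call you": the registering code hands over a function and the emitter calls it back when the event happens.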

## Microfinancing, Lending vs donating

I want to talk about what a brilliant idea microfinancing is.

I grew up in an Islamic family, though in my childhood they were not as conservative as they are now. It’s a tradition in Iran, and I think also an Islamic rule, to donate some portion of your surplus to poor people, usually people you know from work or from the neighborhood. Currently in Iran there is a well-organized Islamic donation process, and I think people are obliged to donate one fifth of their yearly surplus to this system. And no matter where you are, we are all used to seeing charities everywhere and on different occasions (like Thanksgiving or New Year). Just a disclaimer: I’m far from an expert in this area, but I’ve always believed charities are not a solution to poverty; they make it worse, they distribute poverty. They keep poor people poor. If the poor person is jobless, donations don’t help him get hired, and if the poor person is receiving a very low wage, charity doesn’t urge the employer to pay more.

During the past months there were times when I found myself dangerously inclined toward social ideas, and then I would swing back to ‘prosperity’ and capitalism. But the fact that you always have to start from somewhere above zero to have a life in a capitalistic society had been bothering me. Now I think microfinancing is a reasonable solution to this problem.

Why would you want to donate to somebody and lose some money? Why don’t you lend him the money so he can start a business and pay you back? I understand that charities might be vital for some regions of the world, but I’m sure most of the donations that my family makes in Iran will not be used in those regions.

‘Carlos needs $1,100 to buy some pigs.’ He will repay the loan in a year, but no bank will lend to him, simply because the cost of processing the loan is higher than the profit. But this website, www.kiva.org, makes it possible for Carlos to raise the money in a matter of a week. Carlos will not be another homeless beggar; he is starting his own business, and his future depends on how hard he works and how well he manages the business. He will not need charity, and in a few months he should even be able to hire employees and extend his business.

I was impressed that my like-minded people are among the top lenders of Kiva. Yeah! I think superstitious people have a hard time digesting this idea that they, not God, can help people to stand on their own two feet and overcome poverty. And of course the LGBT community that are helping their entrepreneurs to achieve their own equality.

That’s pretty much all I wanted to say. I have something beautiful to think about tonight.