Caustic Shareables

Jonas Kgomo

Shareable Caustics

To unlock the next generation of creativity, the extended reality ecosystem needs an ideal file type for 3D interactables: open, interoperable and cross-platform. To understand what that could look like, we analyze modern viewing methods.

Shareable Caustics is a class of interactive methods for objects in mixed reality. It is commonplace to share files like videos, audio and text over the internet. However, there is no standard for sharing 3D objects, audio or text files combined with a specific interaction.

A marketplace for mixed reality products means a lot of sharing capabilities will be explored. Sharing CGI (computer-generated imagery) will enable ergonomically sustainable usage and clearer affordances for personas (users). One way to frame this is as a non-zero-sum game: an interaction in which some combinations of actions provide a net gain or loss to the two parties involved.

How will we interact with reality?

Although this might seem like a simple question, I think its answer is as deep as the question is simple. From a spatial computing point of view it is dependent on the input; from an information point of view it is dependent on the channel. From a theoretical physics point of view it is dependent on the universe, and from a mathematical point of view on constraints. From a biological point of view we are multisensory organisms capable of manifold inputs.

Digital Dualism

Nathan Jurgenson is a sociologist and social media theorist at Snapchat. His perspective rejects the digital dualist position that the digital and physical are separate spheres and instead promotes the idea that atoms and bits enmesh to create our augmented reality.  As Thiel would say, we've seen "innovation in the world of bits, but not in the world of atoms."


How do you send a digital vinyl record to your friend and allow them to play it remotely on their device? Moreover, what do we even call that interaction file we just sent? What do we call catching Pokémon when they are something else, and what does catching mean? These are the motivations for going through a list of shareable methods for interaction design.


The question above can be posed more technically as: "What is a review or an event? What are the specific fields in the data structure?" Microformats let both users and machines understand and answer these questions unequivocally. The result is a web of data sources, services for exploring and manipulating data, and ways for users to connect them together. The Aggregation Network Model is a microformat for publishing 3D objects.

 <a class="h-card" href="">Tim Coates</a>
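To make the "user and machine" point concrete, here is a minimal sketch of how a program could pick the h-card out of the markup above. It uses only Python's standard `html.parser`; the names `HCardParser` and `parse_hcards` are made up for this illustration, and the sketch assumes a flat element with no nested tags, which a real microformats parser would not.

```python
from html.parser import HTMLParser


class HCardParser(HTMLParser):
    """Collects the text and href of every element whose class list has 'h-card'."""

    def __init__(self):
        super().__init__()
        self.cards = []
        self._current = None  # the h-card being read, if any

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if "h-card" in classes:
            self._current = {"name": "", "url": attr_map.get("href") or ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["name"] += data

    def handle_endtag(self, tag):
        # Simplification: the first closing tag ends the card (no nesting support).
        if self._current is not None:
            self._current["name"] = self._current["name"].strip()
            self.cards.append(self._current)
            self._current = None


def parse_hcards(html):
    parser = HCardParser()
    parser.feed(html)
    return parser.cards


print(parse_hcards('<a class="h-card" href="">Tim Coates</a>'))
```

The same class name that a person reads as "this is a contact card" is what the program keys on, which is the whole idea behind a microformat.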


In terms of interoperability, the two important competing 3D model formats are glTF 2.0 and USDZ.

In this series we ask the question: what shall we call the myriad of 3D shareable assets in digital realities, considering the interactions they exhibit?


Apple launched USDZ, a new open file format that enables several new experiences. It builds on the Universal Scene Description (USD) format developed by Pixar, which was designed to exchange source art assets in the content pipeline of a film studio. USD is fantastic for that use case, but it was never designed for the needs of real-time client-side rendering applications like the ones Apple is promoting it for. Safari on iOS 12 supports viewing 3D models in USDZ and allows you to see them in Augmented Reality (AR).

glTF 2.0

glTF™ (GL Transmission Format) is a royalty-free specification for the efficient transmission and loading of 3D scenes and models by applications. glTF is the current best practice for runtime delivery formats. With PBR-based materials, glTF 2.0 is a stable base for the future and will support practical runtime implementations for many graphics APIs.

Features include: graphics API neutrality, Physically Based Rendering (PBR) material definitions (material information stored in textures), deployment as a single file (Binary glTF), and morph targets (an enhanced animation system).
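To give a feel for what a glTF 2.0 asset looks like on the inside, here is a minimal sketch that builds a single-triangle scene with one PBR material as a Python dictionary and serialises it to JSON. The function name `build_triangle_gltf`, the triangle coordinates and the material values are my own choices for illustration; the geometry is embedded as a base64 data URI to keep everything in one file (this is the JSON form, not the Binary glTF container).

```python
import base64
import json
import struct


def build_triangle_gltf():
    # Three vertices (x, y, z) of one triangle, packed as little-endian float32.
    positions = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
    blob = struct.pack("<9f", *positions)
    uri = "data:application/octet-stream;base64," + base64.b64encode(blob).decode("ascii")

    return {
        "asset": {"version": "2.0"},
        "scene": 0,
        "scenes": [{"nodes": [0]}],
        "nodes": [{"mesh": 0}],
        "meshes": [{"primitives": [{"attributes": {"POSITION": 0}, "material": 0}]}],
        # PBR metallic-roughness material (values are illustrative).
        "materials": [{
            "pbrMetallicRoughness": {
                "baseColorFactor": [0.8, 0.2, 0.2, 1.0],
                "metallicFactor": 0.0,
                "roughnessFactor": 0.5,
            }
        }],
        "buffers": [{"uri": uri, "byteLength": len(blob)}],
        # 34962 = ARRAY_BUFFER (vertex data).
        "bufferViews": [{"buffer": 0, "byteOffset": 0, "byteLength": len(blob), "target": 34962}],
        # 5126 = FLOAT; POSITION accessors must declare min and max.
        "accessors": [{
            "bufferView": 0,
            "componentType": 5126,
            "count": 3,
            "type": "VEC3",
            "min": [0.0, 0.0, 0.0],
            "max": [1.0, 1.0, 0.0],
        }],
    }


print(json.dumps(build_triangle_gltf(), indent=2))
```

Saving the printed JSON with a `.gltf` extension should give a file that a glTF viewer can open, although I have not validated it against every viewer.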

Windows Mixed Reality

WMR is a mixed reality platform introduced as part of the Windows 10 operating system, which provides holographic and mixed reality experiences with compatible head-mounted displays.


Sceneform assets are assets encoded for Google's Sceneform API. They store information about the model, and they work inside Android Studio for ARCore. At runtime, you can use the SFB file, which is the binary form of the SFA.


A point cloud is a set of data points in some coordinate system, such as $\mathbb{R}^3$. The PLY file format defines vertices and (polygon) faces. This means the intersection(s) between a plane and the polygons defined in a PLY file are polylines (or polygons).
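The plane-and-polygon statement can be sketched in a few lines. The function below takes the vertices of one face (as they might be read from a PLY file) and returns the points where its edges cross a plane; joining those points across all faces gives the polyline. The names `signed_distance` and `slice_polygon` are hypothetical, and the sketch deliberately ignores vertices that lie exactly on the plane.

```python
def signed_distance(point, plane_point, plane_normal):
    """Signed distance (scaled by |normal|) from a point to the plane."""
    return sum((p - q) * n for p, q, n in zip(point, plane_point, plane_normal))


def slice_polygon(vertices, plane_point, plane_normal, eps=1e-9):
    """Return the points where the polygon's edges cross the plane."""
    crossings = []
    count = len(vertices)
    for i in range(count):
        a = vertices[i]
        b = vertices[(i + 1) % count]  # wrap around to close the polygon
        da = signed_distance(a, plane_point, plane_normal)
        db = signed_distance(b, plane_point, plane_normal)
        # An edge crosses the plane only if its endpoints are on opposite sides.
        if (da > eps and db < -eps) or (da < -eps and db > eps):
            t = da / (da - db)  # fraction of the way from a to b
            crossings.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return crossings


triangle = [(0, 0, -1), (2, 0, 1), (0, 2, 1)]
print(slice_polygon(triangle, plane_point=(0, 0, 0), plane_normal=(0, 0, 1)))
# → [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
```

For a convex face the result is either empty or two points, that is, one segment of the polyline.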

To summarise these interactions, we have classified them into three levels:

Hard Caustics | Soft Caustics | Augmented Caustics

1. Hard Caustics (Haustics)

These are hard surfaces in the physical world that can receive signals from the real world and pass them to the digital one. Interaction in this case is designed as a Natural User Interface. A keyboard falls under this class, and it is the most general and ubiquitous model. In a more general sense, we have a detached surface that responds to gestures. One example of this is hypersurfaces, which include a series of modular UIs embedded as IoT devices and/or sensors. Of course, we also consider non-locality (spooky action at a distance) as a subject falling under this class.

This method is purely based on haptics as an input.

2. Soft Caustics (Saustics)

This is hardware designed to help interact within software. In the WAVE VR application these soft caustics are known as waves. Input is usually limited to radiation in the electromagnetic spectrum and to hardware (joysticks, knuckles, etc.).


Soli is a Google project using radar as a base technology for sensing microgestures. Its chip transmits millimetre-wave radar: electromagnetic waves with wavelengths from 1 mm to 10 mm, which are longer than infrared rays and X-rays but shorter than most radio waves and microwaves.
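To place that wavelength range on the spectrum, we can convert it to frequency with $f = c / \lambda$. This is a small sketch; the function name `frequency_ghz` is my own and the only physical input is the speed of light in vacuum.

```python
SPEED_OF_LIGHT = 299_792_458  # metres per second, in vacuum


def frequency_ghz(wavelength_mm):
    """Frequency in gigahertz for a wavelength given in millimetres."""
    wavelength_m = wavelength_mm * 1e-3
    return SPEED_OF_LIGHT / wavelength_m / 1e9


for wavelength in (1.0, 10.0):
    print(f"{wavelength} mm -> {frequency_ghz(wavelength):.1f} GHz")
```

So the millimetre-wave band quoted above corresponds to roughly 30 GHz to 300 GHz.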

Hardware that can interact with software as a shareable.

Leap Motion

The Leap Motion Controller tracks infrared light with a wavelength of 850 nanometers, which is outside the visible light spectrum. The data takes the form of a grayscale stereo image of the near-infrared light spectrum, separated into the left and right cameras. Typically, the only objects you’ll see are those directly illuminated by the Leap Motion Controller’s LEDs. The algorithms behind this sensor are not depth-mapping but computer-vision oriented.

3. Augmented Caustics (Augstics)

Interaction here is, at a minimum, screen based; however, approaches that span gesture tracking, motion tracking, eye tracking, and speech are evolving and enabling more inclusive interaction. A seemingly simple interaction of “scanning” a floor decal and looking up to see an entertaining 3D animation is surprisingly difficult. Breaking down this user interaction, it actually includes a lot of parts, such as the physical design of the decal, all the 2D UI, and all the 3D AR elements.



Bill Gates showing the information equivalence of a lot of paper sheets: "This CD-ROM can hold more information than all the paper that's here below me" - Bill Gates, 1994.

The photo was shot in 1994 for a National Geographic story called the Information Revolution. For one of the pictures, meant to illustrate the power of digital storage, the idea was to show how much information can be stored on a CD. At the time, it was 330,000 single-spaced 8x10 sheets of paper.


Information Entropy

In 1948, Claude Shannon, a young engineer and mathematician working at the Bell Telephone Laboratories, published "A Mathematical Theory of Communication," a seminal paper that marked the birth of information theory. In that paper, Shannon defined what "information" meant for communication engineers and proposed a precise way to quantify it: in his theory, the fundamental unit of information is the bit.

He also showed how data could be "compressed" before transmission and how virtually error-free communication could be achieved. The concepts Shannon developed in his paper are at the heart of today's digital information technology. CDs, DVDs, cell phones, fax machines, modems, computer networks, hard drives, memory chips, encryption schemes, MP3 music, optical communication, high-definition television: all these things embody many of Shannon's ideas. What he showed is you can communicate reliably even though the communication medium is unreliable; that's what digital means.

The one that's most interesting for me is he proved the first threshold theorem. What that means is I could send my voice to you today as a wave, or I could send it to you as a symbol. What he showed is if I send it to you as a symbol, for a linear increase in the resource used to represent the symbol, there is an exponential reduction in the error of you getting the symbol correctly as long as the noise is below a threshold. If the noise is above the threshold, you're doomed. If it's below a threshold, a linear increase in the symbol gives you an exponential reduction in error.
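The threshold behaviour can be seen in a toy setting. The sketch below is not Shannon's construction; it assumes the simplest possible scheme, a repetition code: send each bit $n$ times over a channel that flips each copy independently with probability $p$, and decode by majority vote. The function name `majority_error` is my own. The decoded bit is wrong when more than half of the copies are flipped:

$$P_{\text{err}}(n, p) = \sum_{k > n/2} \binom{n}{k} p^k (1 - p)^{n - k}$$

```python
from math import comb


def majority_error(n, p):
    """Probability that a majority vote over n copies is wrong (n odd)."""
    if n % 2 == 0:
        raise ValueError("use an odd n so the vote cannot tie")
    # Sum over every case where more than half of the copies were flipped.
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n // 2 + 1, n + 1))


for p in (0.1, 0.6):  # one noise level below the 0.5 threshold, one above
    print(f"flip probability {p}")
    for n in (1, 3, 5, 11, 21):
        print(f"  n={n:2d}  error={majority_error(n, p):.6f}")
```

With noise below one half, each added copy (a linear cost) shrinks the error roughly geometrically; with noise above one half, adding copies makes things worse, which is the "you're doomed" side of the threshold.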

Thus the human body might be the most unreliable device, but according to von Neumann that doesn't stop it from being an excellent communication channel.


This blog is written by Jonas.
Thoughts and comments are always welcome. As always, thanks for reading.