review v6

parent 6339d63aaa
commit cac5284129
@@ -22,6 +22,11 @@
   author = {{Apple Inc.}},
   howpublished = {\url{https://developer.apple.com/visionos/}}
 }
+@misc{AppleVisionOSRenderPipeline,
+  title = {Understanding the visionOS Render Pipeline},
+  author = {{Apple Inc.}},
+  howpublished = {\url{https://developer.apple.com/documentation/visionos/understanding-the-visionos-render-pipeline}}
+}
 @misc{ARCoreCloudAnchors,
   title = {Cloud Anchors allow different users to share AR experiences},
   author = {{Google Inc.}},
File diff suppressed because one or more lines are too long
Before: 81 KiB | After: 81 KiB
main.tex
@@ -153,15 +153,13 @@ We introduce AURA --- the \textbf{A}ugmented Reality \textbf{U}nified \textbf{R}
 \maketitle
 
 %% \section{Introduction} %for journal use above \firstsection{..} instead
-Augmented Reality (AR) is emerging as a key paradigm for immersive computing, blending digital content with the physical world in real time. Despite rapid advances in hardware and toolkits, developing AR applications today remains a fragmented endeavor. Different platforms (e.g., mobile AR toolkits, smartglasses) and engines (Unity, Unreal, etc.) impose distinct content representations and siloed runtime contexts. This fragmentation makes it difficult to coordinate multiple AR experiences together. In practice, users are confined to one AR application at a time, and there is no straightforward way to allow multiple AR apps to augment the environment concurrently \cite{huynh2022layerable}. These gaps highlight a pressing need for a unifying framework that can represent AR content in a platform-agnostic, shareable manner across applications.
+Augmented Reality (AR) is emerging as a key paradigm for immersive computing, blending digital content with the physical world in real time. Despite rapid advances in hardware and toolkits, developing AR applications today remains a fragmented endeavor. Different platforms (e.g., mobile AR toolkits, smartglasses) and engines (Unity, Unreal, etc.) impose distinct content representations and siloed runtime contexts. This fragmentation makes it difficult to coordinate multiple AR experiences together. In practice, users are confined to one AR application at a time, and there is no straightforward way to allow multiple AR apps to augment the environment concurrently \cite{huynh2022layerable}.
 
 Most existing AR development platforms, such as Unity's XR Interaction Toolkit and Unreal Engine's AR APIs, are geared toward building single-application experiences. These tools provide abstractions for tracking, rendering, and input (for example, ARKit provides world tracking and scene understanding \cite{ARKitDocumentation}, and ARCore offers similar capabilities on Android \cite{ARCoreDocumentation}), but they lack a standardized model for representing application structure, coordinating access to physical components, or supporting interaction logic across applications. Unity's XR Interaction Toolkit, for instance, is a ``high-level, component-based, interaction system for creating VR and AR experiences'' \cite{AndroidXRUnity}, focused on common interactions within one application. This single-app focus limits scalability, complicates cross-platform deployment, and hinders the development of multi-application AR environments.
 
-The motivation for AURA stems from several observations. First, current AR experiences are typically built in isolation: there is no standard way for one AR app to expose its content or to incorporate content from others. This not only limits content reuse but also prevents compelling multi-application scenarios --- for example, a navigation app and a social AR app cannot easily co-exist and share the view \cite{lebeck2019multiple}. Second, prior efforts to standardize AR content (such as AR markup languages or model-based UI design tools) have not been widely adopted by today's AR platforms, which favor imperative and engine-specific approaches. Third, as AR moves towards mainstream use, there is a growing need for cross-application interoperability and consistency, akin to how web browsers unified content on the Internet. AURA directly addresses these needs by providing a unified representation layer that sits between AR applications and the underlying AR runtime or operating system. By doing so, it facilitates interoperability, simplifies content creation, and enables safer multi-source augmentation (ensuring that virtual objects from different apps can coexist without conflict).
+The motivation for AURA stems from several observations. First, current AR experiences are typically built in isolation: there is no standard way for one AR application to expose its content or to incorporate content from others. This not only limits content reuse but also prevents compelling multi-application scenarios --- for example, a navigation app and a social AR app cannot easily co-exist and share the view \cite{lebeck2019multiple}. Second, prior efforts to standardize AR content (such as AR markup languages or model-based UI design tools) have not been widely adopted by today's AR platforms, which favor imperative and engine-specific approaches. Third, as AR moves towards mainstream use, there is a growing need for cross-application interoperability and consistency, akin to how web browsers unified content on the Internet. AURA directly addresses these needs by providing a unified representation layer that sits between AR applications and the underlying AR runtime or operating system. By doing so, it facilitates interoperability, simplifies content creation, and enables safer multi-source augmentation (ensuring that virtual objects from different apps can coexist without conflict).
 
-We introduce AURA --- the \textbf{A}ugmented Reality \textbf{U}nified \textbf{R}epresentation \textbf{A}rchitecture --- a different approach to structuring AR applications and coordinating their behaviour within shared spatial environments. At its core, AURA introduces a manifest format that allows applications to formally declare their spatial and behavioural requirements, including entities, components, agents, and interaction logic. The AURA-based system, in turn, guides the runtime context definition: available physical components, spatial subdivisions (\textit{ambients} and \textit{worlds}), and mappings between devices and applications.
-
-AURA adopts terminology and structuring principles inspired by the Entity-Component-System (ECS) architecture, widely used in interactive software. Modern AR engines increasingly embrace ECS for performance and modularity (e.g., Apple's RealityKit uses an ECS design for AR scene content \cite{AppleRealityKitChapter9}, and Unity's DOTS ECS framework enables data-oriented high-performance AR/VR simulations \cite{unity_dots}). In AURA's architecture, entities are logical sets of components (which represent tracked surfaces, spatial areas, or physical devices), while the AURA-based runtime plays the role of the ``system'' in ECS --- it observes the physical space and mediates application runtime and agent interactions.
+AURA is a different approach to structuring AR applications and coordinating their behaviour within shared spatial environments. At its core, AURA introduces a manifest format that allows applications to formally declare their spatial and behavioural requirements, including entities, components, agents, and interaction logic. The AURA-based system, in turn, guides the runtime context definition: available physical components, spatial subdivisions (\textit{ambients} and \textit{worlds}), and mappings between devices and applications.
 
 Importantly, AURA does not enforce a single global spatial model. Instead, each application is bound to an ambient-scoped view, enabling coexistence through a layered abstraction of space. The \textit{``unified''} nature of AURA refers to its consistent runtime contract: all applications interface with the system using a common schema, while the system tracks general interaction and coordinates access to components without requiring direct inter-application communication.
 
@@ -196,9 +194,9 @@ AURA builds upon concepts from software architecture, human-computer interaction
 
 \subsection{Entity Component System Architecture in AR}
 
-Entity-Component-System (ECS) is a software architecture pattern that promotes composition over inheritance, separating data (components) from behaviour (systems) for better modularity and performance. ECS has proven especially useful in game development and interactive simulations, where many objects share behaviour patterns but differ in data. Modern AR engines have begun to adopt ECS principles. For example, Apple's RealityKit employs a modular ECS design \cite{AppleRealityKitChapter9}: AR developers define entities and attach components (such as geometries, physics bodies, or anchors) to those entities, and define systems that operate on all entities with certain components. Unity's DOTS framework similarly provides an ECS paradigm to improve performance in complex scenes \cite{UnityECSConcepts}, including AR/VR scenarios, by processing entities in bulk with optimized memory access patterns. In these ECS-based engines, an \textit{entity} is typically a container or identifier, \textit{components} encapsulate attributes like position or visual mesh, and \textit{systems} run logic (e.g., physics or animation) on all entities possessing the requisite components.
+The Entity-Component-System (ECS) architecture separates data (components) from logic (systems), enabling modular, efficient representation of interactive environments. This pattern has seen widespread adoption in game development and is increasingly embraced in AR engines. For instance, Apple's RealityKit employs an ECS-based design, where developers define entities and attach components (e.g., geometries, anchors, physics bodies) to them, with systems processing entities based on their components \cite{AppleRealityKitChapter9}. Similarly, Unity's DOTS ECS framework improves runtime performance by processing large numbers of entities using data-oriented memory access patterns, including in AR/VR contexts \cite{UnityECSConcepts}.
 
-The ECS pattern is relevant to AR not only for performance, but also for managing the dynamic, heterogeneous content that AR applications involve. AR scenes consist of various elements --- surfaces, 3D models, UI widgets, sensors --- that can be naturally modeled as components attached to entities in a scene graph or structure. Adopting an ECS-like approach in a higher-level architecture can enable AR content to be described in a flexible, engine-neutral way. AURA leverages ECS-inspired structuring to describe AR application needs: each AURA Entity represents a logical object or grouping in the AR scene (for example, a ``PhotoWall'' or an ``Avatar'' might be entities), and each Component attached to it represents a specific facet or resource (e.g., a planar surface in the physical world, a spatial coordinate system, a displayable 3D model, or an I/O device like a camera or microphone). Unlike Unity or RealityKit where entities and components are used in code at runtime, AURA's use of ECS is declarative --- the manifest lists entities and their required components, and the system (runtime) ensures that those requirements are met by binding the entities to actual physical resources.
+AURA draws on ECS principles to structure declarative AR applications in a runtime-independent way. Each AURA Entity represents a logical construct (e.g., a ``PhotoWall''), while Components encapsulate spatial or functional capabilities such as display surfaces, coordinate systems, or I/O devices. Unlike engine-level ECS systems, where entities and components exist in runtime code, AURA's ECS model is expressed declaratively through JSON-LD manifests. The runtime acts as the system layer: it matches component requirements to physical resources and mediates interactions, enabling applications to remain engine-neutral and spatially adaptive.
 
 \subsection{Declarative Manifest Models in Robotics, Web, and XR}
 
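As an illustration of the declarative, ECS-inspired manifest idea in the hunk above, the sketch below models a minimal application manifest as the Python equivalent of a JSON-LD document and checks that every entity's component requirements refer to a declared component. All field and entity names (`PhotoWall`, `minWidth`, `requires`, the context IRI) are hypothetical assumptions for illustration, not a published AURA schema.

```python
# Hypothetical AURA-style application manifest, written as the Python
# equivalent of a JSON-LD document. Field names are illustrative only.
APP_MANIFEST = {
    "@context": "https://example.org/aura/v1",  # placeholder context IRI
    "application": "PhotoWallDemo",
    "components": {
        # Abstract component *requirements*; the runtime maps these to
        # concrete physical components at bootstrap time.
        "wall": {"kind": "surface", "minWidth": 1.5, "minHeight": 1.0},
        "projector": {"kind": "display", "modality": "output"},
    },
    "entities": {
        # Each entity lists the component requirements that back it.
        "PhotoWall": {"requires": ["wall", "projector"]},
    },
}

def undeclared_requirements(manifest):
    """Return (entity, requirement) pairs referencing no declared component."""
    declared = set(manifest["components"])
    return {
        (name, req)
        for name, entity in manifest["entities"].items()
        for req in entity["requires"]
        if req not in declared
    }
```

A runtime could run such a check before spatial matching begins, rejecting manifests whose entities reference undeclared component requirements.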
@@ -208,8 +206,6 @@ On the web, declarative formats are the norm for content: HTML, CSS, and related
 
 There have also been efforts to introduce declarative approaches specifically in XR (extended reality) content. Notably, A-Frame (by Mozilla) is a web framework that allows AR/VR scenes to be written in HTML using custom tags. A-Frame internally follows an entity-component architecture and lets developers declare 3D entities and their components in markup form, which are then realized by the runtime in the browser \cite{AFrame}. The success of A-Frame (and related WebXR frameworks) in lowering the barrier for creating XR content demonstrates the power of declarative abstractions. However, these frameworks operate at the level of a single application's content (inside one webpage or session).
 
 Another relevant concept is the model-based user interface (UI) design from HCI, where high-level models of interaction are defined and then transformed into concrete UI implementations. Languages like UIML and XAML allowed designers to specify interfaces in an abstract way. In the AR domain, prior work on model-based approaches (discussed next) attempted to do something similar for AR interfaces. AURA can be seen as a model-based approach for AR system configuration: the manifest is the model of an AR app's interface with the world, which the AURA system then ``implements'' by allocating real devices, coordinate spaces, and interaction handlers.
 
 \subsection{AR Content Description Languages and Frameworks}
 
 The idea of a standardized description for AR applications has been explored in prior research, though with limited adoption in practice. One early effort was SSIML/AR (Scene Structure and Integration Modeling Language for AR), a visual language for the abstract specification of AR user interfaces. SSIML/AR, introduced by Vitzthum~\cite{vitzthum2006ssimlar}, extended a 3D UI modeling language (SSIML) to cover AR-specific constructs, allowing developers to model virtual objects, interactions, and the relationships between real and virtual entities at a high level. While SSIML/AR demonstrated the feasibility of describing AR application structure abstractly, it was primarily a design-time tool and did not see integration with mainstream AR runtimes.
@@ -226,18 +222,18 @@ Supporting multiple concurrently running AR applications in a shared physical en
 
 \paragraph{Spatial Anchoring and Persistence.} Modern AR platforms support application-level persistence through spatial anchors. Microsoft's HoloLens enables persistent and shareable anchors \cite{MicrosoftSpatialAnchors} via Azure Spatial Anchors \cite{MicrosoftLearnAzureSpatialAnchors}, though anchor management remains scoped per application. Apple's ARKit supports ARWorldMaps for persistence \cite{ARWorldMap}, while visionOS extends this with automatic anchor restoration \cite{AppleVisionOS}. Magic Leap introduces environmental ``Spaces'' to persist and share spatial mappings \cite{MagicLeapSpacesApp}. Meta's Mixed Reality Utility Kit supports anchor persistence \cite{MetaSpatialAnchorsPersist}, while ARCore offers similar functionality with Cloud Anchors \cite{ARCoreCloudAnchors}. Despite this, all platforms treat anchors as per-app constructs without system-level coordination. In contrast, AURA introduces declarative anchor management via manifests, enabling coordinated anchor access, deduplication, and shared reference across applications.
 
-\paragraph{Multi-Application Coexistence and Spatial Partitioning.} HoloLens enforces single-immersive-app execution, allowing only auxiliary 2D apps to coexist spatially \cite{lebeck2019multiple}. Magic Leap One enabled limited concurrency via ``prisms'' that spatially confine app content\cite{MagicLeapPrisms}, though overlap is not strictly prevented\cite{lebeck2019multiple}. VisionOS adopts a ``Shared Space'' model \cite{AppleVisionOS}, allowing multiple windowed apps to coexist within a common coordinate frame while preserving visual separation via UI depth cues. AURA diverges by allowing multiple applications to share or isolate space via dynamic \textit{ambients} and \textit{worlds}. Conflict resolution is handled proactively through spatial declarations, allowing flexible partitioning or cooperative content merging.
+\paragraph{Multi-Application Coexistence and Spatial Partitioning.} HoloLens enforces single-immersive-app execution, allowing only auxiliary 2D apps to coexist spatially \cite{lebeck2019multiple}. Magic Leap One enabled limited concurrency via ``prisms'' that spatially confine app content \cite{MagicLeapPrisms}, though overlap is not strictly prevented \cite{lebeck2019multiple}. Apple's visionOS adopts a ``Shared Space'' model \cite{AppleVisionOS}, allowing multiple windowed apps to coexist within a common coordinate frame while preserving visual separation via UI depth cues. AURA diverges by allowing multiple applications to share or isolate space via \textit{ambients} and \textit{worlds}. Conflict resolution is handled proactively through spatial declarations, allowing flexible partitioning or cooperative content merging.
 
-\paragraph{Runtime Architecture and Resource Arbitration.} Most commercial platforms simplify runtime scheduling by enforcing app exclusivity (e.g., ARCore) or restricting sensor access (e.g., HoloLens). VisionOS offers more advanced scheduling via a centralized compositor and shared spatial services, but lacks app-level coordination mechanisms. Magic Leap distributes access with minimal arbitration. AURA introduces a runtime mediator that interprets resource requirements from manifests, enforcing policies such as exclusive vs. shareable component access. Inspired by OS-level resource scheduling, AURA enables fair usage of sensors, rendering pipelines, and memory through its declarative runtime interface.
+\paragraph{Runtime Architecture and Resource Arbitration.} While commercial AR platforms have advanced features like persistent anchoring and SLAM sharing, comprehensive multi-application support remains limited. Platforms such as HoloLens and ARCore typically isolate applications, whereas Magic Leap and visionOS permit concurrent execution with minimal inter-app coordination. For instance, Apple's visionOS enables multiple applications to run simultaneously in a Shared Space, allowing users to interact with various app windows concurrently \cite{AppleVisionOSRenderPipeline}. However, visionOS lacks explicit mechanisms for app-level coordination of shared resources, relying instead on system-level management for resource allocation and scheduling.
 
-While commercial AR platforms have converged on persistent anchoring and SLAM sharing, true multi-application support remains limited or implicit. Their designs either isolate apps (HoloLens, ARCore), loosely separate them (Magic Leap, visionOS), or avoid concurrent execution altogether (Quest, Android). AURA bridges this gap by treating spatial context as a shared, declaratively-managed resource. It provides a unified runtime that understands each application's spatial, behavioural, and resource requirements upfront, enabling proactive conflict resolution, safe coexistence, and extensible inter-app communication.
+In contrast, AURA introduces a runtime mediator that interprets resource requirements from application manifests, enforcing policies such as exclusive or shared component access. This approach aims to enable fair usage of sensors, rendering pipelines, and memory through its declarative runtime interface. By treating spatial context as a shared, declaratively managed resource, AURA provides a unified runtime that comprehends each application's spatial, behavioural, and resource requirements upfront, facilitating proactive conflict resolution, safe coexistence, and extensible inter-application communication.
 
 \section{AURA Architecture and Runtime}\label{sec:aura-architecture}
 
 \begin{figure*}
 \centering
 \includesvg[inkscapelatex=false, width=0.7\linewidth]{figures/bootstrap.drawio.svg}
-\caption{The AURA bootstrap process. Applications declare abstract requirements, the system proposes compatible components, user feedback finalizes mappings, and runtime execution begins.}\label{fig:aura_bootstrap}
+\caption{The AURA bootstrap process. Applications declare abstract requirements, the system proposes compatible components, user feedback refines the mappings (e.g., by suggesting that certain components back certain entities), the application acknowledges the ambient and announces its final mappings, and runtime execution begins.}\label{fig:aura_bootstrap}
 \end{figure*}
 
 \begin{figure*}
@@ -246,7 +242,7 @@ While commercial AR platforms have converged on persistent anchoring and SLAM sh
 \caption{The System Manifest defines Worlds ($\omega_1$, $\omega_2$) and their associated Ambients ($\alpha_1..\alpha_6$). Each Application is tied to a unique Ambient (1:1 relationship). When an Application (e.g., $A_1$) accesses a Component's storage, the associated Ambient (e.g., $\alpha_1$) mediates access to the correct scoped storage view (e.g., $C_1$'s $\omega_1$ data). Thus, $A_1$, $A_2$, and $A_3$ share the same view of $C_1$, while $A_4$ and $A_5$/$A_6$ operate over isolated data contexts.}\label{fig:shared_memory}
 \end{figure*}
 
-AURA provides a structured and extensible runtime model for AR systems by decoupling application behaviour from system-level resource management. It defines a declarative interface through which applications describe their structure, spatial requirements, and interaction patterns, while the system governs spatial safety, coordination, and execution. This section formalizes the core abstractions in AURA, presents the manifest structure, and details runtime semantics including spatial bootstrapping, component-based memory, and cross-application communication strategy.
+AURA provides a structured and extensible runtime model for AR systems by decoupling application behaviour from system-level resource management. It defines a declarative interface through which applications describe, among other aspects, their structure (via entities), spatial requirements (via components), and interaction patterns (via heuristics), while the system governs spatial safety, coordination, and execution. This section formalizes the core abstractions in AURA, presents the manifest structure, and details runtime semantics including spatial bootstrapping, component-based memory, and cross-application communication strategy.
 
 \subsection{Fundamental Concepts and Vocabulary}\label{sec:core-concepts}
 
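The world-scoped storage model in the figure caption above could be sketched as follows; class and method names are hypothetical assumptions for illustration. Each component keeps one storage view per world, and an application reaches it only through its ambient, which belongs to exactly one world, so ambients in the same world share a view while other worlds stay isolated.

```python
# Sketch of world-scoped component storage mediated by ambients.
# Names (Component, Ambient, view, read, write) are illustrative assumptions.
class Component:
    def __init__(self, name):
        self.name = name
        self._views = {}                 # world id -> scoped key/value store

    def view(self, world):
        # One isolated storage view per world, created lazily.
        return self._views.setdefault(world, {})

class Ambient:
    def __init__(self, name, world):
        self.name, self.world = name, world

    def write(self, component, key, value):
        # The ambient mediates access: writes land in its world's view only.
        component.view(self.world)[key] = value

    def read(self, component, key):
        return component.view(self.world).get(key)
```

In the caption's terms, ambients $\alpha_1$ and $\alpha_2$ (both in $\omega_1$) would observe the same view of $C_1$, while $\alpha_4$ (in $\omega_2$) would read an isolated context.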
@@ -256,7 +252,7 @@ AURA provides a structured and extensible runtime model for AR systems by decoup
 \toprule
 & \textbf{Application Manifest} & \textbf{System Manifest} \\
 \midrule
-\textbf{Purpose} & Defines a AR application's structure, Component requirements, Entities and behaviour. & Defines the physical AR environment, available concrete components, and ambient and world setup. \\
+\textbf{Purpose} & Defines an AR application's structure, Component requirements, Entities, and behaviour. & Defines the physical AR environment, available concrete components, and ambient and world setup. \\
 \midrule
 \textbf{Written by} & Application's author. & AR System Administrator (or generated by the host platform based on user input). \\
 \midrule
@@ -264,23 +260,23 @@ AURA provides a structured and extensible runtime model for AR systems by decoup
 \midrule
 \textbf{Key Elements} & Agents, Component requirements, Entities, Variables, Heuristics. & Components (e.g., physical surfaces), Ambients, Worlds. \\
 \midrule
-\textbf{Responsability} & Defines only application-specific requirements. & Ensures system-wide consistency and cross-application compatibility. \\
+\textbf{Responsibility} & Defines only application-specific requirements. & Ensures fair multi-application coexistence and interoperability. \\
 \bottomrule
 \end{tabularx}
 \end{table*}
 
-AURA establishes a declarative and runtime-coordinated interface between applications and spatial environments. AURA separates responsabilities between the Application and the System. The application is required to provide a static \textbf{application manifest} to the System. The system is encouraged to dynamically generate a storable \textbf{system manifest}, although not mandatory. The \cref{tab:manifest_comparison} outlines the key differences between both types of manifests.
+AURA establishes a declarative and runtime-coordinated interface between applications and spatial environments, separating responsibilities between the application and the system. Each application must provide a static \textbf{application manifest} describing its structure, requirements, and expected behaviour. While the runtime does not strictly require it, we recommend that implementations of AURA maintain a dynamically generated and persistable \textbf{system manifest}, which reflects the physical environment and current component configuration. This supports introspection, debugging, and reproducibility across sessions. \Cref{tab:manifest_comparison} outlines the key differences between both manifest types.
 
 AURA introduces a shared vocabulary to describe the relationships between abstract application logic and concrete spatial and system resources. These concepts enable multi-application environments to remain spatially coherent, interoperable, and conflict-free.
 
 \paragraph{Component}
-A system-defined unit representing a trackable physical structure (e.g., table top, wall), associated device interface (e.g., projector output), or spatial regions. Components are declared in the \textit{system manifest}, including their spatial attributes, capabilities (e.g., input/output modalities), and may expose structured shared storage. In the \textit{application manifest}, developers declare component \textit{requirements} via abstract descriptions (e.g., minimum dimensions, modalities), which the system maps to real-world instances at runtime. Components serve as the physical anchor for interaction and memory exchange. Virtual objects created and managed internally by the application --- such as photos, UI widgets, or 3D models --- are not declared as components in AURA. However, they may be spatially anchored to or interact with components at runtime, for example by displaying content on a surface component or updating that component's shared memory.
+A system-defined unit representing a trackable physical structure (e.g., table top, wall), an associated device interface (e.g., projector output), or a spatial region. Components are declared in the \textit{system manifest}, including their spatial attributes and capabilities (e.g., input/output modalities). In the \textit{application manifest}, developers declare component \textit{requirements} via abstract descriptions (e.g., minimum dimensions, modalities), which the system maps to real-world instances at runtime. Components serve as the physical anchor for interaction and memory exchange. Elements created and managed internally by the application --- such as photos, UI widgets, or 3D models --- are not declared as components in AURA. However, they may be spatially anchored to or interact with components at runtime, for example by displaying content on a surface component or updating that component's shared memory.
 
 \paragraph{Entity}
-An application-defined abstraction representing a logical unit of interaction or visualization within the AR environment. Entities are declared in the \textit{application manifest} and are mapped at runtime to one or more components, which provide their physical presence. An entity can request the system to provide variables --- both discrete (e.g., proximity triggers) and continuous (e.g., the velocity of a component) --- that it uses to drive its behaviour. It may also include heuristics that describe commonly observed patterns through invariant-based state definitions over components. While entities are not responsible for rendering or tracking directly, they define how application logic attaches to the environment through the system-managed spatial structure.
+An application-defined abstraction representing a logical unit of interaction or visualization within the AR environment. Entities are declared in the \textit{application manifest} and are mapped at runtime to one or more components, which provide their physical presence. An entity can request the system to provide variables --- both discrete (e.g., proximity triggers) and continuous (e.g., the velocity of a component) --- that it uses to drive its behaviour. It may also include heuristics that describe commonly observed patterns through invariant-based state definitions over components (see \cref{sec:heuristics}). While entities are not responsible for rendering or tracking directly, they define how application logic attaches to the environment through the system-managed spatial structure.
 
 \paragraph{Agent}
-An agent represents an active participant in the AR environment, such as a user, camera, projector, or other sensor/actuator. Agents are declared in the \textit{application manifest} and annotated with a role (e.g., \textit{interactor}, \textit{sensor}) and associated services. These services define the types of system capabilities regarding the agent --- such as position tracking, gesture detection, or visual output --- along with the access mode (e.g., event-driven or polling). At runtime, agents are dynamically matched to available system-level agents, and their data streams become observable to applications within the associated ambient. Multiple instances of a given agent type may coexist depending on application constraints.
+An agent represents an active participant in the AR environment. Depending on the system, agents may provide services such as position tracking, gesture detection, or visual output --- along with the access mode (e.g., event-driven or polling). In the \textit{application manifest}, the application describes desired and required services. At runtime, application agents are dynamically matched to available system-level agents. Multiple instances of a given agent type may coexist depending on application constraints.
 
 \paragraph{Ambient}
 An ambient defines a spatially and logically bounded context in which a single AR application is executed. Declared in the \textit{system manifest}, an ambient groups a set of components and agents under a coherent coordinate system and enforces application-level isolation. Each ambient is assigned exactly one application at runtime, ensuring exclusive access to its associated components for rendering and interaction. Ambients serve as the primary unit of spatial scoping in AURA: they restrict where an application can observe, modify, or produce output, and they mediate data flow between applications and the system. Although ambients are disjoint in terms of assigned components, they may participate in shared coordination via higher-level \textit{world}s.
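The agent-matching step described in the revised Agent paragraph could be sketched as below, assuming hypothetical data shapes: the application requests required and desired services per agent role, and the runtime keeps only system agents that cover all required services, preferring those that also cover the most desired ones.

```python
# Sketch of matching application agent roles to system-level agents.
# `requested`: {role: {"required": set of services, "desired": set of services}}
# `system_agents`: {agent id: set of provided services}
# Shapes and service names are illustrative assumptions.
def match_agents(requested, system_agents):
    """Map each role to a ranked list of covering agents, or None if unmet."""
    matches = {}
    for role, services in requested.items():
        covering = [
            agent for agent, provided in system_agents.items()
            if services["required"] <= provided      # all required services
        ]
        if not covering:
            return None                              # a required service is unmet
        # Prefer agents that also cover the most desired services.
        covering.sort(
            key=lambda a: len(services["desired"] & system_agents[a]),
            reverse=True,
        )
        matches[role] = covering
    return matches
```

Returning `None` on an unmet required service mirrors the manifest contract: a manifest whose required services cannot be satisfied should fail bootstrap rather than run degraded.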
@ -289,11 +285,11 @@ An ambient defines a spatially and logically bounded context in which a single A
|
||||
A world is a higher-level construct that groups multiple ambients into a shared spatial and semantic context. Declared in the \textit{system manifest}, a world defines global coordination rules --- such as data visibility, and access policies --- that apply across its ambients. Worlds enable interoperability between applications by allowing scoped observation of shared component storage and indirect event communication. While ambients enforce application isolation, worlds provide a mechanism for structured coexistence and interaction between applications that occupy distinct but related regions of space.
|
||||
|
||||
\paragraph{Manifest}
The pair of declarations that form AURA's contract: the application manifest defines what an app wants to do, while the system manifest declares what is possible in the environment. Together, they define a binding between abstract logic and physical reality. An application manifest is static, while the system manifest may be updated at runtime.
\subsection{Runtime Initialization and Spatial Matching}\label{sec:aura-bootstrap}

As shown in \cref{fig:aura_bootstrap}, execution in AURA begins with an application submitting its manifest. The system analyzes spatial, sensory, and interaction requirements and identifies candidate components within the physical environment. The user may optionally assist in selecting which components should back each application entity. However, the application is provided with the candidate components and decides which ones it will in fact use for each entity.

Once mappings are finalized, the system transmits the relevant \textbf{ambient configuration} to the application. The application acknowledges the selected components and entity attribution, and execution begins. The runtime monitors spatial triggers, schedules component updates, routes input and output, and manages state transitions, all based on the declarative contract provided in the manifests.
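To make the matching step concrete, the following sketch (in Python; the function and key names are illustrative, not AURA's actual API) filters system-manifest components by type and minimum spatial dimensions, as an application manifest might request:

```python
# Illustrative sketch of AURA-style component matching.
# All names and the requirement schema are hypothetical.

def matches(requirement, component):
    """A system component can back a requirement if the type matches and
    the tracked surface is at least as large as requested."""
    spatial = component["spatial_attributes"]
    return (component["type"] == requirement["type"]
            and spatial["width"] >= requirement["min_width"]
            and spatial["height"] >= requirement["min_height"])

def candidate_components(app_requirement, system_components):
    """Return ids of all system components that satisfy the requirement."""
    return [c["id"] for c in system_components if matches(app_requirement, c)]

requirement = {"type": "static_surface", "min_width": 1.0, "min_height": 0.5}
system_components = [
    {"id": "table_surface", "type": "static_surface",
     "spatial_attributes": {"width": 1.5, "height": 0.75}},
    {"id": "shelf_edge", "type": "static_surface",
     "spatial_attributes": {"width": 0.4, "height": 0.3}},
]
print(candidate_components(requirement, system_components))  # ['table_surface']
```

The resulting candidate list is what the runtime (optionally with user guidance) would present to the application for final attribution.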

\subsection{Component-Based Memory Sharing}\label{sec:component-memory}

Components in AURA expose structured data stores that act as shared memory spaces. This memory is accessible to the application assigned to the ambient the component belongs to, and may be readable across ambients if the component is part of a world.

AURA supports data exchange between applications by attaching shared memory to components. This enables coordination without direct dependencies between applications. In \cref{fig:shared_memory} we show how worlds and ambients influence data exchange, ensuring proper interoperability.

\paragraph{Component Storage}
Each component in the system manifest includes a structured data store. Applications can read or write to this memory under defined constraints, enabling indirect coordination. For example, placing a virtual photo on a surface may involve storing its metadata in the corresponding component's data structure.
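As a minimal illustration of this coordination pattern (the store layout and function names are hypothetical, not AURA's schema), an application could record a placed photo in a component's data store so that other observers can discover it:

```python
# Hypothetical sketch of component storage: applications coordinate
# indirectly by writing structured records into a component's data store.

component_store = {"table_surface": []}  # storage keyed by component id

def place_photo(store, component_id, photo_id, position):
    """Record a placed photo in the component's storage so that other
    applications observing the component can discover it."""
    store[component_id].append(
        {"kind": "photo", "id": photo_id, "position": position})

place_photo(component_store, "table_surface", "img_042", (0.2, 0.1))
print(component_store["table_surface"][0]["id"])  # img_042
```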

\paragraph{Data Scoping}
When components are part of shared worlds, their memory can be exposed to multiple applications for observation. Writes, however, are ambient-scoped and conflict-free. See \cref{fig:shared_memory}.
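The scoping rule can be modeled in a few lines (an illustrative sketch; in AURA this is enforced by the runtime, not by applications): writes succeed only from the ambient that owns the component, while reads succeed from any ambient that shares a world with the owner.

```python
# Minimal model (hypothetical names) of AURA's scoping rule:
# writes are ambient-scoped, reads are world-scoped.

WORLDS = {"office_world": {"ambient_gallery", "ambient_notes"}}
OWNER = {"table_surface": "ambient_gallery"}  # component -> owning ambient

def can_write(ambient, component):
    """Only the ambient the component belongs to may write."""
    return OWNER[component] == ambient

def can_read(ambient, component):
    """The owner, or any ambient sharing a world with it, may read."""
    owner = OWNER[component]
    return owner == ambient or any(
        ambient in members and owner in members for members in WORLDS.values())

print(can_write("ambient_notes", "table_surface"))  # False
print(can_read("ambient_notes", "table_surface"))   # True
```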

\paragraph{Snapshot Consistency and System Responsibility}
The system --- not individual applications --- must be responsible for producing consistent snapshots of component memory. Since AURA does not provide global synchronization guarantees across applications, any attempt by applications to independently maintain histories of component updates could result in inconsistent event orderings. By delegating this responsibility to the runtime, AURA ensures a coherent view of temporal changes across the shared environment.
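A sketch of the intended behaviour, assuming a runtime-managed memory object (class and method names are illustrative): each write produces a versioned, immutable snapshot, so every observer that requests a given version sees the same state.

```python
import copy

# Illustrative sketch: the runtime, not applications, produces versioned
# snapshots of component memory so all observers see one ordering of changes.

class ComponentMemory:
    def __init__(self):
        self._data, self._version, self._log = {}, 0, []

    def write(self, key, value):
        self._data[key] = value
        self._version += 1
        # deep-copied snapshot: later writes cannot mutate it
        self._log.append((self._version, copy.deepcopy(self._data)))

    def snapshot(self, version):
        """Return the state exactly as it was at the given version."""
        return next(s for v, s in self._log if v == version)

mem = ComponentMemory()
mem.write("occupied", True)
mem.write("occupied", False)
print(mem.snapshot(1))  # {'occupied': True}
```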
\subsection{Representing Virtual Objects and Interaction Semantics}\label{sec:virtual-objects}

AURA distinguishes between system-managed spatial resources (components) and application-defined virtual entities. While virtual objects (e.g., icons, photos, avatars) are not directly declared in manifests, their behaviour is often tied to components via spatial anchoring or memory updates.

\paragraph{Interoperability Constraints}
Applications may implement alternative handling for virtual objects (e.g., proprietary object representations or tracking). However, such approaches break the declarative guarantees of AURA and prevent meaningful interoperability. Shared object semantics should, whenever possible, leverage AURA's shared memory and manifest declarations.

\subsection{Cross-Application Interoperability Mechanisms}
\label{sec:xdg-portals}

In addition to structured memory, we encourage the use of host-level \textbf{interoperability services}, such as the XDG Desktop Portals~\cite{XDGDesktopPortals}. These portals provide secure, permissioned channels for applications to request access to shared services, in a user-consented and sandboxed manner, without requiring tight integration.

We envision the following AURA-specific uses for portals:
\begin{itemize}
\item \textbf{File and clipboard exchange:} An application may use a portal to export captured data (e.g., annotated photos) to the host system.
\item \textbf{Permissioned interaction bridges:} Applications may launch dialogs or service requests (e.g., sharing a file or data blob) via a system-defined portal agent.
\item \textbf{User-confirmed output redirection:} Applications may request visual output redirection (e.g., streaming to a shared surface) via a media-sharing portal interface.
\item \textbf{Window sharing:} Legacy applications without an AURA manifest may be embedded via portals and anchored as virtual agents or visual surfaces within an ambient.
\end{itemize}

%This model preserves AURA's declarative and ambient-scoped design while extending its reach into general-purpose OS environments and user-facing application ecosystems.
This bridges the gap between tightly scoped in-AR coordination (via manifests and shared memory) and more general multi-app interoperability through user-facing OS-level services.

\subsection{Heuristics and Observational State Modeling}\label{sec:heuristics}

AURA allows applications to define \textbf{heuristics} --- semantic abstractions over component-level variables --- to express observable states and their transitions. These definitions are included in the \textit{application manifest} and enable the system to assist applications by evaluating logical conditions on tracked variables and reporting state changes. This mechanism supports a declarative programming model where application logic can respond to high-level events such as ``object rising'' or ``hand approaching surface'' rather than continuously polling low-level signals.

} } }
\end{lstlisting}
In the \texttt{idle} state definition, the system may apply a tolerance threshold (e.g., \texttt{abs(V.z) < 0.01}) to mitigate false positives caused by sensor noise or floating-point imprecision, thereby enhancing robustness. In future work, we aim to investigate adaptive strategies for threshold tuning, clarifying the boundary between system-level inference and application-specified semantics --- particularly in cases where precise control over state transitions is essential to application behaviour.
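The following sketch shows how such a tolerance-based invariant could drive state-change notifications (the function names and epsilon value are illustrative, not part of the manifest schema):

```python
# Sketch of tolerance-based invariant evaluation; names are illustrative.

def idle_invariant(vz, eps=0.01):
    """The 'idle' invariant holds while vertical velocity is near zero."""
    return abs(vz) < eps

def evaluate_state(samples, eps=0.01):
    """Return the transition events the system would deliver to the
    application, given (time, vertical velocity) samples."""
    state, events = None, []
    for t, vz in samples:
        new_state = "idle" if idle_invariant(vz, eps) else "moving"
        if new_state != state:
            events.append((t, new_state))
            state = new_state
    return events

print(evaluate_state([(0, 0.004), (1, 0.006), (2, 0.8), (3, 0.002)]))
# [(0, 'idle'), (2, 'moving'), (3, 'idle')]
```

Note that the small nonzero readings at $t=0,1,3$ are absorbed by the tolerance rather than producing spurious transitions.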

\paragraph{System Role and Guarantees}
Although heuristics are defined by the application, the system is responsible for continuously evaluating these invariants based on current component data and notifying the application when state changes occur. The variant conditions let the system know when it should increase monitoring of a given component. The system does not interpret or generalize the semantics of these states --- its role is limited to detecting when invariant conditions are satisfied or violated. This preserves encapsulation and ensures that state meaning remains application-specific.

Because multiple applications may observe the same component within a shared world, heuristic evaluation serves as a standardized mechanism for interpreting continuous signals without tight coupling. However, only the application that defines a heuristic receives its evaluated states --- other applications may define different heuristics on the same component, or none at all.
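This delivery rule can be sketched as follows (a hypothetical model, not AURA's API): two applications define different heuristics over the same component, and each is notified only of its own evaluated state.

```python
# Illustrative sketch: heuristic results go only to the defining app,
# even when several applications observe the same component.

heuristics = {  # (app, component) -> predicate over component variables
    ("juggling_app", "ball_1"):
        lambda v: "idle" if abs(v["vz"]) < 0.01 else "airborne",
    ("stats_app", "ball_1"):
        lambda v: "slow" if abs(v["vz"]) < 0.5 else "fast",
}

def notify(component, variables):
    """Each observing app receives only its own heuristic's state."""
    return {app: f(variables)
            for (app, comp), f in heuristics.items() if comp == component}

print(notify("ball_1", {"vz": 0.3}))
# {'juggling_app': 'airborne', 'stats_app': 'slow'}
```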

\subsection{Semantic Interoperability and Ontology-Driven Reasoning}\label{sec:semantic-interop}

Each application begins with basic metadata and spatial constraints:
}
\end{lstlisting}

Here, the application requests a minimum ambient size (\texttt{dimensions}), along with the coordinate system it expects for spatial referencing (\texttt{coordinate system}).

\paragraph{Agents}
Agents represent users or actuation devices. Here we define a tracked user:
\begin{lstlisting}[language=json, numbers=none, caption={Agent Definition.}]
"agents": [{
  "type": "User",
  "id": "user_{}",
  "role": "interactor",
  "instances": "+",
  "services": [
    {"service_type": "hand_tracking",
     "call_type": "polling", "required": true},
    {"service_type": "position_tracking",
     "call_type": "trigger_feed", "required": true},
    {"service_type": "hand_visual_output", "required": false}]
}]
\end{lstlisting}

\paragraph{Components}
Components specify required physical characteristics and access modes. Both entities will use a flat surface that provides video feed and visual output capabilities:
\begin{lstlisting}[language=json, numbers=none, caption={Application Manifest's Component for Flat Surface.}]
"components": [{
  "id": "flat_surface",
  "type": "static_surface",
\end{lstlisting}

\textbf{Note:} AURA does not assume specific output devices like projectors or HMDs. Instead, it abstracts output capabilities via \texttt{visual\_output}, which could map to projection, screen display, or AR headset rendering. Similarly, input requirements such as \texttt{bw\_video\_feed} (black and white video feed) can be provided by RGB cameras, depth sensors, or other modalities.
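One way to picture this capability abstraction (the satisfaction table below is hypothetical, not part of AURA): an abstract requirement is met if any available device class can provide it.

```python
# Hypothetical capability-satisfaction table: abstract requirements on
# the left, concrete device classes that can fulfil them on the right.

SATISFIES = {
    "visual_output": {"projector", "screen", "hmd_render"},
    "bw_video_feed": {"rgb_camera", "bw_camera", "depth_sensor"},
}

def satisfiable(required_capabilities, available_devices):
    """True if every abstract capability is covered by some device."""
    available = set(available_devices)
    return all(SATISFIES[cap] & available for cap in required_capabilities)

print(satisfiable(["visual_output", "bw_video_feed"],
                  ["projector", "rgb_camera"]))  # True
print(satisfiable(["visual_output"], ["rgb_camera"]))  # False
```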

\paragraph{Entities}
The application defines two entities, both referring to the same application-manifest component \texttt{flat\_surface}. This does not imply that both entities will be backed by the exact same system-provided component. Application-manifest components are abstract definitions of what is required; they only take concrete form inside an entity, since it is through an entity's \texttt{instances} field that the system learns how many components with certain characteristics it must provide to the application.
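The way \texttt{instances} declarations translate into component demand can be sketched as follows (illustrative code; \texttt{+} means at least one, \texttt{*} zero or more, an integer exactly that many):

```python
# Illustrative sketch of how entity 'instances' declarations give the
# system a lower bound on components to provide per abstract component id.

def minimum_components(entities):
    """Lower bound on required components per abstract component id."""
    needed = {}
    for e in entities:
        n = e["instances"]
        lower = 1 if n == "+" else 0 if n == "*" else n
        for comp in e["components"]:
            needed[comp] = needed.get(comp, 0) + lower
    return needed

entities = [
    {"id": "photo_gallery", "components": ["flat_surface"], "instances": 1},
    {"id": "photo_viewer", "components": ["flat_surface"], "instances": "*"},
]
print(minimum_components(entities))  # {'flat_surface': 1}
```

For the example application, at least one flat surface must be available; additional surfaces only become necessary as viewer instances are created.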

The entity \texttt{photo\_gallery} consists of a single interactive visualiser for browsing photos.
\begin{lstlisting}[language=json, numbers=none, caption={Photo Gallery Entity Definition.}]
"entities": [{
  "id": "photo_gallery",
  "components": ["flat_surface"],
  "instances": 1,
  "discrete_variables": {
    "active": "distance(agent, flat_surface) < 0.3"
  } },
\end{lstlisting}

The entity \texttt{photo\_viewer} can have zero or more instances and consists of a photo displayer.
\begin{lstlisting}[language=json, numbers=none, caption={Photo Viewer Entity Definition.}]
{
  "id": "photo_viewer",
  "components": ["flat_surface"],
  "instances": "*",
  "discrete_variables": {
    "active": "distance(agent, flat_surface) < 0.3"
  }
}]
\end{lstlisting}

Components in the system manifest represent real tracked structures or devices.
\begin{lstlisting}[language=json, numbers=none]
"id": "table_surface",
"name": "Office's table top",
"type": "static_surface",
"spatial_attributes": {
  "x": 0, "y": 0, "z": 0.75,
  "width": 1.5, "height": 0.75
},
\end{lstlisting}

\paragraph{Ambients and Worlds}
Each application is assigned to an ambient, which acts as its exclusive spatial and resource scope. Here the ambient makes a concrete (system) component available to the assigned application. It may also suggest that the component be associated with a particular application entity --- usually with user guidance --- although the actual attribution is decided by the application at runtime and can be updated at any time.
\begin{lstlisting}[language=json, numbers=none, caption={Ambient and World Declaration.}]
"ambients": [{
  "id": "ambient_gallery",
\end{lstlisting}