%% Commands for TeXCount
%TC:macro \cite [option:text,text]
%TC:macro \citep [option:text,text]
%TC:macro \citet [option:text,text]
%TC:envir table 0 1
%TC:envir table* 0 1
%TC:envir tabular [ignore] word
%TC:envir displaymath 0 word
%TC:envir math 0 word
%TC:envir comment 0 0
%%
%% The first command in your LaTeX source must be the \documentclass
%% command.
%%
%% For submission and review of your manuscript please change the
%% command to \documentclass[manuscript, screen, review]{acmart}.
%%
%% When submitting camera ready or to TAPS, please change the command
%% to \documentclass[sigconf]{acmart} or whichever template is required
%% for your publication.
%%
%%
%%\documentclass[sigconf,authordraft]{acmart}
\documentclass[manuscript,review,anonymous]{acmart}
%%\documentclass[manuscript,review]{acmart}
%%
%% \BibTeX command to typeset BibTeX logo in the docs
\AtBeginDocument{%
\providecommand\BibTeX{{%
Bib\TeX}}}
%% Rights management information. This information is sent to you
%% when you complete the rights form. These commands have SAMPLE
%% values in them; it is your responsibility as an author to replace
%% the commands and values with those provided to you when you
%% complete the rights form.
\setcopyright{acmlicensed}
\copyrightyear{2018}
\acmYear{2018}
\acmDOI{XXXXXXX.XXXXXXX}
%% These commands are for a PROCEEDINGS abstract or paper.
\acmConference[Conference acronym 'XX]{Make sure to enter the correct
conference title from your rights confirmation email}{June 03--05,
2018}{Woodstock, NY}
%%
%% Uncomment \acmBooktitle if the title of the proceedings is different
%% from ``Proceedings of ...''!
%%
%%\acmBooktitle{Woodstock '18: ACM Symposium on Neural Gaze Detection,
%% June 03--05, 2018, Woodstock, NY}
\acmISBN{978-1-4503-XXXX-X/2018/06}
%% Diogo's packages
\usepackage{cleveref}
\usepackage{svg}
\usepackage{tabularx}
\usepackage{booktabs}
\usepackage{listings}
% Diogo Custom colours
\colorlet{punct}{red!60!black}
\definecolor{delim}{RGB}{20,105,176}
\colorlet{numb}{magenta!60!black}
\lstdefinelanguage{json}{
basicstyle=\footnotesize\ttfamily,
numbers=left,
numberstyle=\scriptsize,
stepnumber=1,
numbersep=8pt,
showtabs=false,
showstringspaces=false,
breaklines=true,
frame=single,
literate=
*{0}{{{\color{numb}0}}}{1}
{1}{{{\color{numb}1}}}{1}
{2}{{{\color{numb}2}}}{1}
{3}{{{\color{numb}3}}}{1}
{4}{{{\color{numb}4}}}{1}
{5}{{{\color{numb}5}}}{1}
{6}{{{\color{numb}6}}}{1}
{7}{{{\color{numb}7}}}{1}
{8}{{{\color{numb}8}}}{1}
{9}{{{\color{numb}9}}}{1}
{:}{{{\color{punct}{:}}}}{1}
{,}{{{\color{punct}{,}}}}{1}
{\{}{{{\color{delim}{\{}}}}{1}
{\}}{{{\color{delim}{\}}}}}{1}
{[}{{{\color{delim}{[}}}}{1}
{]}{{{\color{delim}{]}}}}{1},
}
%%
%% Submission ID.
%% Use this when submitting an article to a sponsored event. You'll
%% receive a unique submission ID from the organizers
%% of the event, and this ID should be used as the parameter to this command.
%%\acmSubmissionID{123-A56-BU3}
%%
%% For managing citations, it is recommended to use bibliography
%% files in BibTeX format.
%%
%% You can then either use BibTeX with the ACM-Reference-Format style,
%% or BibLaTeX with the acmnumeric or acmauthoryear styles, which include
%% support for advanced citation of software artefacts from the
%% biblatex-software package, also separately available on CTAN.
%%
%% Look at the sample-*-biblatex.tex files for templates showcasing
%% the biblatex styles.
%%
%%
%% The majority of ACM publications use numbered citations and
%% references. The command \citestyle{authoryear} switches to the
%% "author year" style.
%%
%% If you are preparing content for an event
%% sponsored by ACM SIGGRAPH, you must use the "author year" style of
%% citations and references.
%% Uncommenting
%% the next command will enable that style.
%%\citestyle{acmauthoryear}
%%
%% end of the preamble, start of the body of the document source.
\begin{document}
%%
%% The "title" command has an optional parameter,
%% allowing the author to define a "short title" to be used in page headers.
\title{AURA: The Augmented Reality Unified Representation Architecture}
%%
%% The "author" command and its associated commands are used to define
%% the authors and their affiliations.
%% Of note is the shared affiliation of the first two authors, and the
%% "authornote" and "authornotemark" commands
%% used to denote shared contribution to the research.
\author{Diogo Peralta Cordeiro}
\email{mail@diogo.site}
\orcid{0000-0002-0260-5121}
\author{João Borges de Sousa}
\email{jtasso@fe.up.pt}
\orcid{0000-0002-2528-4666}
\affiliation{%
\institution{University of Porto}
\city{Porto}
\country{Portugal}
}
%%
%% By default, the full list of authors will be used in the page
%% headers. Often, this list is too long, and will overlap
%% other information printed in the page headers. This command allows
%% the author to define a more concise list
%% of authors' names for this purpose.
\renewcommand{\shortauthors}{Cordeiro et al.}
%%
%% The abstract is a short summary of the work to be presented in the
%% article.
\begin{abstract}
Augmented Reality (AR) integrates digital content into physical space across diverse platforms, including projection systems, head-mounted displays, and mobile devices. However, current AR frameworks lack a standardized, platform-independent method for applications to declare their spatial and system-level requirements. This fragmentation complicates cross-platform development and hinders the coexistence of multiple AR applications within the same environment.
We introduce AURA --- the \textbf{A}ugmented Reality \textbf{U}nified \textbf{R}epresentation \textbf{A}rchitecture --- which defines a manifest format through which applications specify their spatial components, interactive elements, participating agents, and required system resources. AURA enables multiple applications to run concurrently by assigning them to scoped containers and managing their access to shared physical surfaces. AURA also supports dynamic, context-aware behaviour via event-driven triggers and system-mediated data exchanges at runtime. Through examples, we demonstrate how AURA facilitates cross-platform development and application interoperability.
\end{abstract}
%%
%% The code below is generated by the tool at http://dl.acm.org/ccs.cfm.
%% Please copy and paste the code instead of the example below.
%%
\begin{CCSXML}
<ccs2012>
<concept>
<concept_id>10003120.10003121.10003124.10010392</concept_id>
<concept_desc>Human-centered computing~Mixed / augmented reality</concept_desc>
<concept_significance>500</concept_significance>
</concept>
<concept>
<concept_id>10003120.10003138.10003139.10010904</concept_id>
<concept_desc>Human-centered computing~Ubiquitous computing</concept_desc>
<concept_significance>300</concept_significance>
</concept>
<concept>
<concept_id>10003120.10003123</concept_id>
<concept_desc>Human-centered computing~Interaction design</concept_desc>
<concept_significance>100</concept_significance>
</concept>
</ccs2012>
\end{CCSXML}
\ccsdesc[500]{Human-centered computing~Mixed / augmented reality}
\ccsdesc[300]{Human-centered computing~Ubiquitous computing}
\ccsdesc[100]{Human-centered computing~Interaction design}
%%
%% Keywords. The author(s) should pick words that accurately describe
%% the work being presented. Separate the keywords with commas.
\keywords{Spatial Computing, Application Manifest, Multi-application Systems, Intelligent Environments, Machine Perception, Interactive Spaces}
%% A "teaser" image appears between the author and affiliation
%% information and the body of the document, and typically spans the
%% page.
\begin{teaserfigure}
\centering
\includesvg[inkscapelatex=false, width=0.7\linewidth]{figures/teaser.drawio.svg}
\caption{Overview of AURA's role during multi-application deployment.
a) An AURA-based system scans the room, and independently created and launched applications submit their spatial and behavioural requirements through AURA manifests.
b) The AURA-powered Head-Mounted Display (HMD) identifies suitable components in the environment and, with user input, assigns them to each application by creating dedicated ambients (e.g., $\alpha_1$, $\alpha_2$). This allows Applications 1 and 2 to coexist within the same room.}
\label{fig:teaser}
\end{teaserfigure}
\received{20 February 2007}
\received[revised]{12 March 2009}
\received[accepted]{5 June 2009}
%%
%% This command processes the author and affiliation and title
%% information and builds the first part of the formatted document.
\maketitle
\section{Introduction}
Augmented Reality (AR) is emerging as a key paradigm for immersive computing, blending digital content with the physical world in real time. Despite rapid advances in hardware and toolkits, developing AR applications today remains a fragmented endeavour. Different platforms (e.g., mobile AR toolkits, smart-glasses) and engines (Unity, Unreal, etc.) impose distinct content representations and siloed runtime contexts. This fragmentation makes it difficult to coordinate multiple AR experiences. In practice, users are confined to one AR application at a time, and there is no straightforward way to allow multiple AR apps to augment the environment concurrently \cite{huynh2022layerable}.
Most existing AR development platforms, such as Unity's XR Interaction Toolkit and Unreal Engine's AR APIs, are geared toward building single-application experiences. These tools provide abstractions for tracking, rendering, and input (for example, ARKit provides world tracking and scene understanding \cite{ARKitDocumentation}, and ARCore offers similar capabilities on Android \cite{ARCoreDocumentation}), but they lack a standardized model for representing application structure, coordinating access to physical components, or supporting interaction logic across applications. Unity's XR Interaction Toolkit, for instance, is a ``high-level, component-based, interaction system for creating VR and AR experiences'' \cite{AndroidXRUnity}, focused on common interactions within one application. This single-app focus limits scalability, complicates cross-platform deployment, and hinders the development of multi-application AR environments.
The motivation for AURA stems from several observations. First, current AR experiences are typically built in isolation: there is no standard way for one AR application to expose its content or to incorporate content from others. This not only limits content reuse but also prevents compelling multi-application scenarios --- for example, a navigation app and a social AR app cannot easily co-exist and share the view \cite{lebeck2019multiple}. Second, prior efforts to standardize AR content (such as AR markup languages or model-based UI design tools) have not been widely adopted by today's AR platforms, which favour imperative and engine-specific approaches. Third, as AR moves towards mainstream use, there is a growing need for cross-application interoperability and consistency, akin to how web browsers unified content on the Internet. AURA directly addresses these needs by providing a unified representation layer that sits between AR applications and the underlying AR runtime or operating system. By doing so, it facilitates interoperability, simplifies content creation, and enables safer multi-source augmentation (ensuring that virtual objects from different apps can coexist without conflict).
AURA takes a different approach to structuring AR applications and coordinating their behaviour within shared spatial environments. At its core, AURA introduces a manifest format that allows applications to formally declare their spatial and behavioural requirements, including entities, components, agents, and interaction logic. The AURA-based system, in turn, guides the runtime context definition: available physical components, spatial subdivisions (\textit{ambients} and \textit{worlds}), and mappings between devices and applications.
Importantly, AURA does not enforce a single global spatial model. Instead, each application is bound to an ambient-scoped view, enabling coexistence through a layered abstraction of space. The \textit{``unified''} nature of AURA refers to its consistent runtime contract: all applications interface with the system using a common schema, while the system tracks general interaction and coordinates access to components without requiring direct inter-application communication.
To ground the architecture in a tangible scenario, \cref{fig:teaser} illustrates how AURA supports runtime coordination across multiple independent AR applications. In (a), two applications submit their manifests to an AURA-based system, which parses their spatial and behavioural requirements. In (b), the system creates an ambient for each application, mapping them onto the physical space while preventing conflicts over shared surfaces. The mapping process remains user-adjustable, and interaction occurs via an HMD-based system --- though AURA abstracts away the underlying platform, as discussed later. AURA enables:
\begin{itemize}
\item Standardized definition of application structure, including components, entities, interaction logic, devices, and performance requirements;
\item Concurrent execution of multiple AR applications in shared physical environments without conflict, enabling true coexistence;
\item Event- and request-driven data exchange via structured, system-managed memory tied to spatial components, enabling interoperability across applications.
\end{itemize}
In this paper, we present the core AURA specification and demonstrate how it enables scalable, interoperable, and modular AR systems. We focus on its support for application isolation, declarative interaction modelling, and conflict-free coordination of shared resources. Future work will explore runtime extensions such as probabilistic reasoning, formal validation, and distributed orchestration across heterogeneous AR platforms.
\section{Background and Related Work}
AURA builds upon concepts from software architecture, human-computer interaction (HCI), and AR system design. In this section, we review the key areas of prior work that inform our approach, and we clarify how AURA differentiates itself from or extends these foundations. We first discuss the Entity-Component-System pattern as used in interactive systems and its relevance to AR content structuring. We then draw parallels to application manifest frameworks in other domains, which inspire AURA's declarative approach. Next, we examine past efforts in modelling and specifying AR applications (such as SSIML/AR and AR markup languages) that aimed to generalize AR development. Finally, we consider the challenge of cross-application AR and how existing platforms and research address --- or fail to address --- concurrent AR experiences.
%\subsection{Augmented Reality Technologies}
%\begin{figure}[htbp]
% \centering
% \includesvg[inkscapelatex=false, width=0.7\linewidth]{figures/map_of_field.svg}
% \caption{Augmented Reality as a form of Extended Reality.}\label{fig:ar_in_xr}
%\end{figure}
%As illustrated in \cref{fig:ar_in_xr}, Augmented Reality is distinct from other forms of Extended Reality (XR). While Virtual Reality (VR) immerses users entirely within a simulated environment, and Mixed Reality (MR) tightly blends physical and virtual elements through real-time spatial alignment, AR focuses on enriching the user's perception of the physical world by adding digital elements in one or more modalities --- most commonly visual, but also aural, haptic or olfatory. AR systems can be broadly classified into three categories~\cite{van2010survey}:
%\begin{enumerate}
% \item \textbf{Projection-Based AR:} These systems use projectors to overlay digital content onto physical surfaces, enabling shared experiences without the need for individual HMDs. Such systems have been explored extensively in applications like collaborative design, education, and entertainment.
% \item \textbf{Video See-Through AR:} In this approach, a live video feed of the real world is combined with virtual graphics before display. Most mobile AR applications (e.g., on smartphones or tablets using ARKit/ARCore) and some headsets, such as Apple Vision Pro, use video see-through; the device's camera provides the view of the world and virtual content is composited into the feed in real time.
% \item \textbf{Optical See-Through AR:} These are headsets that use transparent displays to superimpose light on the user's view of the real world, such as Microsoft HoloLens or Magic Leap. This allows users to see the real environment directly with their eyes while holographic images appear integrated into their surroundings.
%\end{enumerate}
%Each of these technological classes comes with different capabilities and interaction modalities, as well as their own set of strengths and challenges, such as the accuracy of the alignment of virtual and physical elements, and user experience consistency. However, regardless of display method, developers face similar software challenges in structuring AR content and interactions. Current AR development frameworks like ARKit/ARCore abstract many low-level details (tracking, sensor fusion, rendering) \cite{ARKitDocumentation}, yet developers still work in imperative terms (placing objects, writing event handlers) within a single app's context. AURA instead introduces a higher-level, declarative layer on top of such runtimes, enabling the definition of AR content and behaviour in a way that is agnostic to the underlying AR rendering technology and that permits multiple applications to share the same physical space.
\subsection{Entity Component System Architecture in AR}
The Entity-Component-System (ECS) architecture separates data (components) from logic (systems), enabling modular, efficient representation of interactive environments. This pattern has seen widespread adoption in game development and is increasingly embraced in AR engines. For instance, Apple's RealityKit employs an ECS-based design, where developers define entities and attach components (e.g., geometries, anchors, physics bodies) to them, with systems processing entities based on their components \cite{AppleRealityKitChapter9}. Similarly, Unity's DOTS ECS framework improves runtime performance by processing large numbers of entities using data-oriented memory access patterns, including in AR/VR contexts \cite{UnityECSConcepts}.
AURA draws on ECS principles to structure declarative AR applications in a runtime-independent way. Each AURA Entity represents a logical construct (e.g., a ``PhotoWall''), while Components encapsulate spatial or functional capabilities such as display surfaces, coordinate systems, or I/O devices. Unlike engine-level ECS systems, where entities and components exist in runtime code, AURA's ECS model is expressed declaratively through JSON-LD manifests. The runtime acts as the system layer: it matches component requirements to physical resources and mediates interactions, enabling applications to remain engine-neutral and spatially adaptive.
\subsection{Declarative Manifest Models in Robotics, Web, and XR}
The idea of using a declarative specification to describe an application's structure and requirements has precedents in other domains. In robotics, for instance, the Unified Robot Description Format (URDF) provides an XML-based model to describe a robot's links, joints, sensors, and physical properties in a standardized way. URDF and its successors (e.g., SDF in Gazebo simulation) allow different software modules to understand a robot's structure without hard-coding it, facilitating interoperability between planning, simulation, and control systems. This inspires AURA's approach to describing AR spaces (e.g., a manifest can declare that it needs a ``table surface'' component, analogous to how a robot's URDF declares that it has a camera or an arm joint).
On the web, declarative formats are the norm for content: HTML, CSS, and related web technologies describe the structure and style of user interfaces without prescribing how the browser must implement them. This separation of content declaration from runtime execution has been key to the web's cross-platform consistency. Similarly, mobile app ecosystems use manifest files (e.g., the Android \texttt{AndroidManifest.xml}) to declare application capabilities, required sensors, and permissions, which the operating system uses to configure and manage the app. These manifests do not describe UI layout, but they do describe an app's contract with the system (activities, services, permissions, etc.). AURA's manifest concept plays a parallel role for AR: it declares what an AR application will need (e.g., ``a horizontal surface at least 1m$^2$ for placing content'' or ``access to the user's GPS location''), as well as what interactions it supports, so that the AR operating environment can mediate and satisfy these needs in a safe, combined way.
There have also been efforts to introduce declarative approaches specifically in XR (extended reality) content. Notably, A-Frame (by Mozilla) is a web framework that allows AR/VR scenes to be written in HTML using custom tags. A-Frame internally follows an entity-component architecture and lets developers declare 3D entities and their components in markup form, which are then realized by the runtime in the browser \cite{AFrame}. The success of A-Frame (and related WebXR frameworks) in lowering the barrier for creating XR content demonstrates the power of declarative abstractions. However, these frameworks operate at the level of a single application's content (inside one webpage or session).
\subsection{AR Content Description Languages and Frameworks}
The idea of a standardized description for AR applications has been explored in prior research, though with limited adoption in practice. One early effort was SSIML/AR (Scene Structure and Integration Modelling Language for AR), a visual language for the abstract specification of AR user interfaces. SSIML/AR, introduced by Vitzthum~\cite{vitzthum2006ssimlar}, extended a 3D UI modeling language (SSIML) to cover AR-specific constructs, allowing developers to model virtual objects, interactions, and the relationships between real and virtual entities at a high level. While SSIML/AR demonstrated the feasibility of describing AR application structure abstractly, it was primarily a design-time tool and did not see integration with mainstream AR runtimes.
In industry, several AR markup or description formats have been proposed. The Open Geospatial Consortium's Augmented Reality Markup Language (ARML) 2.0 is an XML-based standard for AR content, finalized in 2015 \cite{OGC_ARML2_Adoption}. ARML 2.0 allows AR content providers to specify virtual objects with their visual properties and anchors in the real world, and it even defines an ECMAScript-based interface for dynamic behaviours and user interactions. ARML was used by some early AR browsers like Wikitude, Layar, and Junaio to share AR content across platforms \cite{OGC_ARML2_Adoption}. However, ARML did not achieve broad adoption in the era of ARKit/ARCore --- partly because those platforms opted for their own APIs, and partly because ARML's scope was limited to relatively static content (points of interest, 3D annotations) rather than full application logic.
Another example was Metaio's AREL (Augmented Reality Experience Language), introduced around 2011 as part of the Junaio AR browser. AREL combined XML and JavaScript to let developers define AR scenes and basic interactions which the browser could then execute. It was essentially a declarative layer on top of Metaio's AR engine. Metaio's approach was ahead of its time, but once Metaio was acquired (and its technology folded into Apple's ARKit), AREL was discontinued. The broader AR industry shifted focus to native SDKs and engine-based development, leaving a gap in terms of a unifying content specification.
AURA's manifest draws upon these lessons: like SSIML/AR and ARML, it seeks a cross-platform description format for AR content. But it goes further by explicitly modelling not just content placement, but also \textbf{behavioural logic and multi-application coordination}. AURA manifests are written in JSON (specifically JSON-LD for semantic clarity) instead of XML, aligning with modern web technologies and developer preferences. AURA is designed to work at runtime, not just as an offline description. This means the system can load manifests on the fly, reason about resource assignments, and manage multiple running AR applications, which prior markup languages did not address. In that sense, AURA serves a role analogous to an operating system's window manager or app coordinator, using manifest declarations as input.
\subsection{Multi-Application AR and Spatial Computing Platforms}
Supporting multiple concurrently running AR applications in a shared physical environment presents challenges related to visual consistency, interaction integrity, and resource management. Although commercial platforms have made strides in spatial persistence and immersive UI, robust multi-application AR remains largely unsupported. Prior work such as Huynh et al.'s ``Layerable Apps'' \cite{huynh2022layerable} and Lebeck et al.'s early analysis of AR runtime conflicts \cite{lebeck2019multiple} highlights user interest in multi-app usage and the architectural shortcomings that impede it. In this section, we compare the architectural approaches of leading platforms --- including Microsoft HoloLens, Apple Vision Pro and ARKit, Magic Leap, Meta Quest, and Android ARCore --- across three dimensions: (1) spatial anchoring and persistence, (2) multi-application coexistence and spatial partitioning, and (3) runtime architecture and resource arbitration. We also contrast these designs with AURA's manifest-based runtime coordination model.
\paragraph{Spatial Anchoring and Persistence.} Modern AR platforms support application-level persistence through spatial anchors. Microsoft's HoloLens enables persistent and shareable anchors~\cite{MicrosoftSpatialAnchors} via Azure Spatial Anchors~\cite{MicrosoftLearnAzureSpatialAnchors}, though anchor management remains scoped per application. Apple's ARKit supports ARWorldMaps for persistence \cite{ARWorldMap}, while visionOS extends this with automatic anchor restoration \cite{AppleVisionOS}. Magic Leap introduces environmental ``Spaces'' to persist and share spatial mappings \cite{MagicLeapSpacesApp}. Meta's Mixed Reality Utility Kit supports anchor persistence \cite{MetaSpatialAnchorsPersist}, while ARCore offers similar functionality with Cloud Anchors \cite{ARCoreCloudAnchors}. Despite this, all platforms treat anchors as per-app constructs without system-level coordination. In contrast, AURA introduces declarative anchor management via manifests, enabling coordinated anchor access, avoiding duplication, and providing shared references across applications.
\paragraph{Multi-Application Coexistence and Spatial Partitioning.} HoloLens enforces single-immersive-app execution, allowing only auxiliary 2D apps to coexist spatially \cite{lebeck2019multiple}. Magic Leap One enabled limited concurrency via ``prisms'' that spatially confine app content~\cite{MagicLeapPrisms}, though overlap is not strictly prevented~\cite{lebeck2019multiple}. Apple's visionOS adopts a ``Shared Space'' model \cite{AppleVisionOS}, allowing multiple windowed apps to coexist within a common coordinate frame while preserving visual separation via UI depth cues. AURA diverges by allowing multiple applications to share or isolate space via \textit{ambients} and \textit{worlds}. Conflict resolution is handled proactively through spatial declarations, allowing flexible partitioning or cooperative content merging.
\paragraph{Runtime Architecture and Resource Arbitration.} While commercial AR platforms have advanced features like persistent anchoring and SLAM sharing, comprehensive multi-application support remains limited. Platforms such as HoloLens and ARCore typically isolate applications, whereas Magic Leap and visionOS permit concurrent execution with minimal inter-app coordination. For instance, Apple's visionOS enables multiple applications to run simultaneously in a Shared Space, allowing users to interact with various app windows concurrently \cite{AppleVisionOSRenderPipeline}. However, visionOS lacks explicit mechanisms for app-level coordination of shared resources, relying instead on system-level management for resource allocation and scheduling.
In contrast, AURA introduces a runtime mediator that interprets resource requirements from application manifests, enforcing policies such as exclusive or shared component access. This approach aims to enable fair usage of sensors, rendering pipelines, and memory through its declarative runtime interface. By treating spatial context as a shared, declaratively managed resource, AURA provides a unified runtime that comprehends each application's spatial, behavioural, and resource requirements upfront, facilitating proactive conflict resolution, safe coexistence, and extensible inter-application communication.
\section{AURA Architecture and Runtime}\label{sec:aura-architecture}
\begin{figure*}
\centering
\includesvg[inkscapelatex=false, width=0.7\linewidth]{figures/bootstrap.drawio.svg}
\caption{The AURA bootstrap process. Applications declare abstract requirements, the system proposes compatible components, user feedback finalizes mappings (and may suggest that certain components be used for certain entities), the application acknowledges the ambient and announces its final mappings, and runtime execution begins.}\label{fig:aura_bootstrap}
\end{figure*}
\begin{table*}
\caption{Comparison Between the Application and System Manifests.}\label{tab:manifest_comparison}
\begin{tabularx}{\textwidth}{l*{2}{X}}
\toprule
& \textbf{Application Manifest} & \textbf{System Manifest} \\
\midrule
\textbf{Purpose} & Defines an AR application's structure, component requirements, entities, and behaviour. & Defines the physical AR environment, available concrete components, and ambient and world setup. \\
\midrule
\textbf{Written by} & Application's author. & AR system administrator (or generated by the host platform based on user input). \\
\midrule
\textbf{Scope} & Focuses on an individual AR application. & Covers the entire AR environment, supporting multiple applications. \\
\midrule
\textbf{Key Elements} & Agents, component requirements, entities, variables, heuristics. & Components (e.g., physical surfaces), ambients, worlds. \\
\midrule
\textbf{Responsibility} & Defines only application-specific requirements. & Ensures fair multi-application coexistence and interoperability. \\
\bottomrule
\end{tabularx}
\end{table*}
\begin{figure*}
\centering
\includesvg[inkscapelatex=false, width=\linewidth]{figures/sarm_basic_storage_concept.drawio.svg}
\caption{The System Manifest defines Worlds ($\omega_1$, $\omega_2$) and their associated Ambients ($\alpha_1$--$\alpha_6$). Each Application is tied to a unique Ambient (1:1 relationship). When an Application (e.g., $A_1$) accesses a Component's storage, the associated Ambient (e.g., $\alpha_1$) mediates access to the correct scoped storage view (e.g., $C_1$'s $\omega_1$ data). Thus, $A_1$, $A_2$, and $A_3$ share the same view of $C_1$, while $A_4$ and $A_5$/$A_6$ operate over isolated data contexts.}\label{fig:shared_memory}
\end{figure*}
AURA provides a structured and extensible runtime model for AR systems by decoupling application behaviour from system-level resource management. It defines a declarative interface through which applications describe, among other aspects, their structure (via entities), spatial requirements (via components), and interaction patterns (via heuristics), while the system governs spatial safety, coordination, and execution. This section formalizes the core abstractions in AURA, presents the manifest structure, and details runtime semantics including spatial bootstrapping, component-based memory, and cross-application communication strategy.
\subsection{Fundamental Concepts and Vocabulary}\label{sec:core-concepts}
AURA establishes a declarative and runtime-coordinated interface between applications and spatial environments, separating responsibilities between the application and the system running it. Each application must provide a static \textbf{application manifest} describing its structure, requirements, and expected behaviour. While the runtime does not strictly require it, we recommend that implementations of AURA maintain a dynamically generated and persistent \textbf{system manifest}, which reflects the physical environment and current component configuration. This supports introspection, debugging, and reproducibility across sessions. \Cref{tab:manifest_comparison} outlines the key differences between both manifest types.
AURA introduces a shared vocabulary to describe the relationships between abstract application logic and concrete spatial and system resources. These concepts enable multi-application environments to remain spatially coherent, interoperable, and conflict-free.
\paragraph{Component}
A system-defined unit representing a trackable physical structure (e.g., a table top or wall), an associated device interface (e.g., projector output), or a spatial region. Components are declared in the \textit{system manifest}, including their spatial attributes and capabilities (e.g., input/output modalities). In the \textit{application manifest}, developers declare component \textit{requirements} via abstract descriptions (e.g., minimum dimensions, modalities), which the system maps to real-world instances at runtime. Components serve as the physical anchor for interaction and information exchange. Elements created and managed internally by the application --- such as photos, UI widgets, or 3D models --- are not declared as components in AURA. However, they may be spatially anchored to or interact with components at runtime, as discussed in \cref{sec:virtual-elements}.
\paragraph{Entity}
An application-defined abstraction representing a logical unit of interaction or visualization within the AR environment. Entities are declared in the \textit{application manifest} and are mapped at runtime to one or more components, which provide their physical presence. An entity may observe discrete (e.g., proximity triggers) and continuous (e.g., component velocity) variables derived from the system's observation of components, which it uses to drive its behaviour. It may also include heuristics that describe commonly observed patterns through invariant-based state definitions over components (see \cref{sec:heuristics}). While entities are not responsible for rendering or tracking directly, they define how application logic attaches to the environment through the system-managed spatial structure.
\paragraph{Agent}
An agent represents an active participant in the AR environment. Depending on the system, agents may provide services such as position tracking, gesture detection, or visual output --- along with the access mode (e.g., event-driven or polling). In the \textit{application manifest}, the application describes desired and required services. During runtime, application agents are dynamically matched to available system-level agents. Multiple instances of a given agent type may coexist depending on application constraints.
\paragraph{Ambient}
An ambient defines a spatially and logically bounded context in which a single AR application is executed. Declared in the \textit{system manifest}, an ambient groups a set of components and agents under a coherent coordinate system and enforces application-level isolation. Each ambient is assigned exactly one application at runtime, ensuring exclusive access to its associated components for rendering and interaction. Ambients serve as the primary unit of spatial scoping in AURA: they restrict where an application can observe, modify, or produce output, and they mediate data flow between applications and the system. Although ambients are disjoint in terms of assigned components, they may participate in shared coordination via higher-level \textit{world}s.
\paragraph{World}
A world is a higher-level construct that groups multiple ambients into a shared spatial and semantic context. Declared in the \textit{system manifest}, a world defines global coordination rules --- such as data visibility, and access policies --- that apply across its ambients. Worlds enable interoperability between applications by allowing scoped observation of shared component storage and event communication. While ambients enforce application isolation, worlds provide a mechanism for structured coexistence and communication between applications that occupy distinct but related regions of space.
\paragraph{Manifest}
The pair of declarations that form AURA's contract: the application manifest defines what an app wants to do, while the system manifest declares what is possible in the environment. Together, they define a binding between abstract logic and physical reality. An application manifest is static, while the system manifest may be updated at runtime.
\subsection{Runtime Initialization and Spatial Matching}\label{sec:aura-bootstrap}
As illustrated in \cref{fig:aura_bootstrap}, execution in AURA begins when a user launches one or more applications --- potentially authored by different third-party developers --- and each submits its manifest to the AURA-based system. The system analyses the spatial, sensory, and interaction requirements declared in the manifests and identifies suitable components in the physical space. The user may optionally assist in mapping these components to application entities, offering spatial preferences or intent. Based on this input, the system provides each application with an ambient that includes the available components and a proposed entity-to-component assignment. Each application then decides which of the assigned components to bind to its entities. At runtime, the system monitors spatial triggers, schedules component updates, routes input and output, and manages state transitions --- all according to the declarative contracts defined in the manifests and the final entity-to-component bindings.
This process ensures that:
\begin{itemize}
\item Component use is explicitly negotiated;
\item Runtime conflicts (e.g., overlapping visual output or sensor contention) are avoided up front;
\item System and application share a consistent understanding of spatial layout.
\end{itemize}
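To make this handshake concrete, the sketch below shows one possible shape for the ambient specification that the system returns to an application at the end of the bootstrap. AURA does not fix this message schema; field names such as \texttt{proposed\_assignments} are illustrative only.
\begin{lstlisting}[language=json, numbers=none, caption={Illustrative Ambient Proposal Returned by the System (hypothetical schema).}]
"ambient_proposal": {
  "ambient": "ambient_gallery",
  "components": ["table_surface", "east_wall"],
  "proposed_assignments": {
    "photo_gallery": "east_wall",
    "photo_viewer": "table_surface"
  }
}
\end{lstlisting}
The application answers with its final entity-to-component bindings, which it may revise at any time during execution.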
%\begin{enumerate}
% \item The application submits a manifest detailing component requirements, agents, and entities.
% \item The system resolves these requests against available resources and proposes candidate mappings.
% \item The user may review and confirm these mappings via a configuration interface or preset rules.
% \item The system finalizes the ambient and component assignments and sends the ambient specification back to the application.
% \item The application communicates back the entity-component attribution to the system. Which may be updated anytime during runtime by the application.
% \item Runtime execution begins, with the system mediating spatial tracking, input/output dispatch, and variable observation.
%\end{enumerate}
%At runtime, the system:
%\begin{enumerate}
% \item Parses application manifests and validates spatial constraints;
% \item Validates that all non-disjoint ambient areas don't have common components;
% \item Resolves component mappings and injects coordinates into the application's declared referential;
% \item Routes sensor and actuator access through system-controlled proxies, enforcing access limitations;
% \item Emits events for observable triggers, allowing the application to respond;
% \item Mediates access to component data and sensors.
%\end{enumerate}
%Components may be shared between ambients as long as they are never simultaneously assigned --- disjoint regions. Applications may observe changes to a shared component's storage (within the same world) but cannot modify or output to it if it is not explicitly assigned within their ambient --- present in ambient's region.
%This declarative startup enables safe multi-application coexistence, supports mixed automated and user-directed spatial decisions, and ensures clear boundary definitions for application behaviour.
\subsection{Ambient Isolation and Security Guarantees}\label{sec:aura-security}
AURA enforces strong isolation through its ambient-based execution model:
\begin{itemize}
\item Shared components must not be assigned to overlapping ambients;
\item Applications can only observe and write to components within their assigned ambient;
\item Outputs (e.g., visual rendering or actuator control) must target components currently available in the ambient;
\item Cross-ambient observation requires explicit world-scoped coordination.
\end{itemize}
These constraints mitigate several threats outlined by Ruth et al.~\cite{Ruth2019ARsecurity}, including attempts by one user to (1) share unwanted or undesirable content, (2) see private content belonging to another user, or (3) manipulate another user's AR content. By scoping visibility and interaction to well-defined ambients, AURA provides strong default boundaries that prevent such interactions unless explicitly mediated.
AURA also focuses on isolating applications from one another by design: components are non-shareable across ambient instances unless explicitly mediated. This helps limit multi-app interference but does not yet include inter-app communication controls or shared-world consistency guarantees, which remain future work.
\subsection{Component-Based Memory Sharing}\label{sec:component-memory}
Components in AURA expose structured data stores that act as shared memory spaces. This memory is accessible to the application assigned to the ambient the component belongs to, and may be readable across ambients if the component is part of a world.
AURA supports data exchange between applications by attaching shared memory to components. This enables coordination without direct dependencies between applications. In \cref{fig:shared_memory} we show how worlds and ambients shape data exchange, ensuring interoperability.
\paragraph{Component Storage}
Each component in the system manifest includes a structured data store. Applications can read or write to this memory under defined constraints, enabling indirect coordination. For example, placing a virtual photo on a surface may involve storing its metadata in the corresponding component's data structure.
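As an illustration, such an entry might look like the following sketch; the layout of the store is application-defined, and field names like \texttt{placed\_photo} and \texttt{written\_by} are hypothetical.
\begin{lstlisting}[language=json, numbers=none, caption={Illustrative Entry in a Component's Data Store (hypothetical fields).}]
"storage": {
  "placed_photo": {
    "type": "photo",
    "uri": "app://com.devX.photo_gallery_ar_app/photos/42",
    "anchor": {"x": 0.4, "y": 0.2},
    "written_by": "ambient_gallery"
  }
}
\end{lstlisting}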
\paragraph{Data Scoping}
When components are in an ambient that is contained in a world, their memory is shared with all the other ambients that refer to the same component in the same world. Writes, however, are ambient-scoped and conflict-free. See \cref{fig:shared_memory}.
\paragraph{Event-Driven Access}
Applications may subscribe to updates on component storage, such as when another application modifies a shared component. For example, an application monitoring a movable surface could receive events when another application moves or drops a virtual photo on it from a different ambient in the same world.
\paragraph{Request-Based Access}
Applications can query current component storage or historical logs of changes, enabling snapshot-based reasoning.
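A minimal sketch of both access modes follows, assuming a subscription list declared by the application and a runtime-delivered event payload; neither schema is normative.
\begin{lstlisting}[language=json, numbers=none, caption={Illustrative Storage Subscription and Delivered Event (assumed schema).}]
"subscriptions": [{
  "component": "movable_surface",
  "on": "storage_update"
}],
"delivered_event": {
  "type": "storage_update",
  "component": "movable_surface",
  "world": "room1",
  "keys_changed": ["placed_photo"]
}
\end{lstlisting}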
\paragraph{Snapshot Consistency and System Responsibility}
The system --- not individual applications --- must be responsible for producing consistent snapshots of component memory. Since AURA does not provide global synchronization guarantees across applications, any attempt by applications to independently maintain histories of component updates could result in inconsistent event orderings. By delegating this responsibility to the runtime, AURA ensures a coherent view of temporal changes across the shared environment.
\subsection{Interaction Semantics with Component-Based Memory}\label{sec:virtual-elements}
AURA distinguishes between system-managed components and application-defined virtual elements. Virtual elements --- such as photos, tools, avatars, or icons --- are not explicitly declared in manifests, but their behaviour can be grounded and made interoperable by binding them to component-based shared memory. This model allows for spatial anchoring, inter-application observability, and indirect coordination across loosely coupled applications.
\paragraph{Data-Driven Object Sharing Across Applications}
Consider a scenario where a user is cooking in the kitchen using a recipe application projected onto the countertop. Simultaneously, their family is seated in the living room, browsing a photo gallery displayed on the dining table via a separate AR application. Both applications are deployed in distinct ambients --- one for the kitchen, one for the living room --- but are part of the same world (a ``kitchenette'') and share access to a common physical component: a lightweight physical object (e.g., a serving plate or designated tag) declared in both ambients.
The user decides to share a recipe photo with the family. First, the recipe app copies the photo's metadata into the agent's (hand) component store. Then, the user transfers the data into the shared object's component store --- effectively ``placing'' the virtual photo onto the shared object. The user then physically tosses the object from the kitchen into the living room. Upon catching it, the family interacts with the photo gallery app. Since the gallery app can observe the shared component, it retrieves the updated memory and displays the photo. The family then drags the photo from the shared object onto the dining table, which becomes the new anchor for the virtual content.
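After the hand-off, the shared object's component store might contain an entry like the sketch below; the application identifier, ambient name, and field names are hypothetical.
\begin{lstlisting}[language=json, numbers=none, caption={Illustrative Store of the Shared Object After the Hand-Off (hypothetical fields).}]
"storage": {
  "carried_item": {
    "type": "photo",
    "uri": "app://com.devX.recipe_app/photos/pie.jpg",
    "placed_by": "ambient_kitchen"
  }
}
\end{lstlisting}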
\paragraph{Cross-Ambient Semantics via Worlds}
This interaction is only possible because both ambients --- kitchen and living room --- are part of the same AURA world. Within a world, shared components remain uniquely identified and consistently observable across ambients. Although the applications are logically separated, AURA's runtime sees the physical object as a single system-managed component. Thus, any update to its memory in one ambient becomes visible in the other, enabling seamless, asynchronous coordination. This declarative and spatially scoped approach to data exchange ensures interoperability without requiring direct app-to-app communication, supporting safe and modular interaction between independently authored applications.
\paragraph{Interoperability Constraints}
While applications are free to use proprietary formats or tracking logic for virtual objects, doing so outside AURA's shared memory model undermines interoperability. To ensure system-level coordination and reuse of spatial context, application logic should rely on AURA-declared components and ambient-bound storage. By doing so, developers preserve the benefits of modularity, observability, and runtime-managed consistency.
\subsection{Cross-Application Interoperability Mechanisms} \label{sec:xdg-portals}
Beyond structured, spatially scoped memory, AURA can benefit from integration with host-level \textbf{interoperability services} such as XDG Desktop Portals~\cite{XDGDesktopPortals}. These standardized interfaces enable applications to access system services --- like file dialogs, media capture, or clipboard operations --- in a sandboxed, user-mediated manner. They complement AURA's declarative model by enabling permissioned, OS-level interaction without requiring direct inter-application communication.
We envision the following AURA-specific extensions or use cases for portal-based services:
\begin{itemize}
\item \textbf{File and clipboard exchange:} Applications may export artefacts (e.g., annotated images, sensor logs) to the host filesystem or clipboard via standardized save or copy dialogs.
\item \textbf{Permissioned service bridging:} AURA applications can delegate cross-app requests (e.g., blob sharing, identity selection) to a system-defined portal agent, enabling interactions across app boundaries with explicit user consent.
\item \textbf{Visual output redirection:} Applications may request that rendered content be redirected to a different component or shared surface (e.g., projector, virtual screen), mediated by a media-sharing portal.
\item \textbf{Embedding legacy applications:} Apps that do not provide an AURA manifest --- such as conventional desktop programs --- may be wrapped via window-sharing portals and presented as visual agents or embedded surfaces within an ambient.
\end{itemize}
This hybrid model bridges the gap between AURA's tightly scoped, manifest-driven coordination and broader interoperability with traditional system services, allowing AR applications to coexist seamlessly with host applications and general-purpose OS environments.
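As a sketch of how such bridging could be declared, an application manifest might list the portals it intends to use; the \texttt{portal\_services} field below is a hypothetical extension rather than part of the current specification, although the named portal interfaces exist in XDG Desktop Portals.
\begin{lstlisting}[language=json, numbers=none, caption={Hypothetical Manifest Extension Requesting Portal-Mediated Services.}]
"portal_services": [{
  "portal": "org.freedesktop.portal.FileChooser",
  "purpose": "export annotated images"
}, {
  "portal": "org.freedesktop.portal.ScreenCast",
  "purpose": "visual output redirection"
}]
\end{lstlisting}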
\subsection{Heuristics and Observational State Modelling}\label{sec:heuristics}
AURA allows applications to define \textbf{heuristics} --- semantic abstractions over component-level variables --- to express observable states and their transitions. These definitions are included in the \textit{application manifest} and enable the system to assist applications by evaluating logical conditions on tracked variables and notifying them of relevant state changes. This mechanism supports a declarative programming model where application logic can respond to high-level events such as ``object rising'' or ``hand approaching surface'' rather than continuously polling the system's sensor signals.
Heuristics are based on:
\begin{itemize}
\item \textbf{Discrete variables}, derived from Boolean conditions (e.g., \texttt{distance(agent, surface) < 0.3}), that encode binary spatial or contextual predicates.
\item \textbf{Continuous variables}, representing scalar or vector quantities (e.g., position, velocity, acceleration, light intensity), whose values evolve over time and are sampled by the runtime.
\end{itemize}
A heuristic consists of named \textit{states}, each defined by an \textit{invariant} --- a logical condition that must hold for the state to be considered active. Applications may optionally define a \textit{transition model} to describe likely or expected state progressions. While not enforced by the system, these transitions can support runtime optimizations such as smoothing, prediction, or performance improvements in tracking and projection. Developers may also specify an optional \textit{activation condition} --- a separate Boolean expression used to indicate when monitoring a state should become more frequent or prioritized.
The following example defines a \texttt{juggling ball} entity that transitions through states based on the vertical component of its velocity ($V.z$):
\begin{lstlisting}[language=json, numbers=none, caption={Juggling Ball Entity's Heuristic-Driven State Model.}]
"continuous_variables": {
"V": {"exp": "ball_surface.velocity", "sample_period_ms": 20}
}, "discrete_variables": {
"T": {"exp": "room_sensor.temperature > 30", "sample_period_ms": 1000},
"H": {"exp": "room_sensor.humidity > 10"}
}, "heuristics": {
"ball_surface": {
"states": {
"up": { "invariant": "V.z > 0", "activation": "distance(agent, ball_surface) < 0.5" },
"down": { "invariant": "V.z < 0", "activation": "distance(agent, ball_surface) < 0.5" },
"idle": { "invariant": "V.z = 0", "activation": "distance(agent, ball_surface) < 0.5" },
"miss": { "invariant": "V.z is unknown", "activation": "distance(agent, ball_surface) < 0.5" }
},
"transitions": { "idle": "up", "up": "idle", "idle": "down" }
} }
\end{lstlisting}
In the \texttt{idle} state definition, the system may apply a tolerance threshold (e.g., \texttt{abs(V.z) < 0.01}) to mitigate false positives caused by sensor noise or floating-point imprecision, thereby enhancing robustness. In future work, we aim to investigate adaptive strategies for threshold tuning, clarifying the boundary between system-level inference and application-specified semantics --- particularly in cases where precise control over state transitions is essential to application behaviour.
\paragraph{System Role and Guarantees}
Although heuristics are authored by the application, the system is responsible for evaluating them at runtime and reporting state transitions. It does not interpret their meaning or enforce transitions --- it merely tracks when invariants hold or are violated. This ensures that the semantics of the states remain under application control, preserving modularity and encapsulation.
Multiple applications may observe the same component and define different heuristics for it. Each application receives only the evaluated results of its own heuristics, even if they are in the same world, allowing for individualized interpretations without conflict. The activation mechanism helps AURA prioritize state monitoring dynamically, based on spatial or contextual relevance, without requiring global synchronization or explicit inter-application coordination.
\subsection{Semantic Interoperability and Ontology-Driven Reasoning}\label{sec:semantic-interop}
The AURA vocabulary is designed with extensibility in mind, leveraging JSON-LD for semantic richness. To ensure compatibility with emerging AR technologies and applications, developers can embed domain-specific vocabularies, link to external ontologies, and declare new types of services, components, or agents.
\textbf{Example Extensions:}
\begin{itemize}
\item Declaring IoT devices with established ontologies (e.g., SAREF, SOSA/SSN).
\item Describing specialized capabilities (e.g., ``surgical\_instrument\_tracking'').
\item Defining new interaction types for domain-specific use cases, such as medical or industrial applications.
\item Including metadata for accessibility features, such as haptic feedback or audio descriptions.
\item Enabling system reasoning over application intent (e.g., prioritizing alerts over background info).
\end{itemize}
Semantic metadata can also assist the runtime in resolving conflicts (e.g., prioritizing critical overlays), adapting interaction logic, or composing behaviours from multiple applications within a world.
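For instance, a manifest could link a component to SOSA/SSN terms through its JSON-LD context. The sketch below is illustrative: the AURA context URL is a placeholder, while the \texttt{sosa} namespace IRI is the published one.
\begin{lstlisting}[language=json, numbers=none, caption={Illustrative JSON-LD Context Linking a Component to the SOSA Ontology.}]
"@context": ["https://aura.example/context/v1",
  {"sosa": "http://www.w3.org/ns/sosa/"}],
"components": [{
  "id": "room_sensor",
  "type": "iot_sensor",
  "semantic_type": "sosa:Sensor",
  "observes": ["temperature", "humidity"]
}]
\end{lstlisting}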
\section{AURA Manifests: Example and Usage}
We illustrate AURA's architecture through an AR photo gallery application inspired by Microsoft LightSpace~\cite{wilson2010combining}, showcasing how photos can be explored and manipulated across physical surfaces in a multi-user setting. We describe how AURA defines system structure and application behaviour through two complementary JSON-LD-based manifests: the \textit{application manifest} and the \textit{system manifest}. Together, they define the logical, spatial, and operational interface between AR applications and their runtime environment.
\subsection{Application Manifest: Declaring Behaviour and Requirements}
The application manifest is authored by its developer and provides a static declaration of an application's structure and requirements. It includes metadata, agent roles, components, entity definitions, and interaction triggers.
\paragraph{Application Metadata}
Each application begins with basic metadata and spatial constraints:
\begin{lstlisting}[language=json, numbers=none, caption={Application Metadata and Spatial Requirements.}]
"@type": "ApplicationManifest",
"application": {
"id": "com.devX.photo_gallery_ar_app",
"name": "Photo Gallery Application",
"version": "1.0",
"description": "Interactive photo gallery and viewer.",
"dimensions": {
"minimum_width": 1.5, "minimum_height": 1.0
},
"coordinate_system": ["type": "left-handed", origin: "bottom-left"],
}
\end{lstlisting}
Here, the application requests a minimum ambient size (\texttt{dimensions}), along with the coordinate system and origin it expects for spatial referencing (\texttt{coordinate\_system}).
\paragraph{Agents}
Agents represent users or actuation devices. Here we define a tracked user:
\begin{lstlisting}[language=json, numbers=none, caption={Agent Definition.}]
"agents": [{
"type": "User",
"id": "user_{}",
"instances": "+",
"services": [
{"service_type": "hand_tracking",
"call_type": "pooling", "required": true},
{"service_type": "position_tracking",
"call_type": "trigger_feed", "required": true},
{"service_type": "hand_visual_output", "required": false}]
}]
\end{lstlisting}
The application declares that it requires at least one agent (\texttt{instances}) in order to execute; the system may halt the application if there are no agents in the ambient, since the application explicitly depends on such an agent to be useful.
With the \texttt{services} field, the application requires access to hand tracking on a polling basis, to agent coordinates for use in trigger conditions, and to the ability to show visual output in the agent's hand. The application may then use these resources directly (system calls) or indirectly (system watchers that require this data).
\paragraph{Components}
Components specify required physical characteristics and input/output capabilities. Both entities will use a flat surface that provides video feed and visual output capabilities:
\begin{lstlisting}[language=json, numbers=none, caption={Application Manifest's Component for Flat Surface.}]
"components": [{
"id": "flat_surface",
"type": "static_surface",
"dimensions": {
"minimum_width": 0.3,
"minimum_height": 0.3
},
"input": [{
"type": "bw_video_feed",
"minimum_width": 640,
"minimum_height": 480
}],
"output": [{
"type": "visual_output",
"minimum_width": 1280,
"minimum_height": 720
}]
}]
\end{lstlisting}
\textbf{Note:} AURA avoids assuming specific output devices such as projectors or HMDs. Instead, it abstracts output capabilities via \texttt{visual\_output}, which could map to projection, screen display, or AR headset rendering. Similarly, input requirements such as \texttt{bw\_video\_feed} (black-and-white video feed) can be satisfied by RGB cameras, depth sensors, or other modalities.
\paragraph{Entities}
The application defines two entities, both referring to the same application manifest component \texttt{flat\_surface}. This does not imply that both entities will receive the exact same system-provided component. Application manifest components are abstract definitions of what is required; they only take concrete form inside an entity, since it is from an entity's \texttt{instances} field that the system learns how many components with certain characteristics it must provide to the application.
The entity \texttt{photo\_gallery} consists of a single interactive visualiser for browsing photos. If its \texttt{instances} field were instead set to ``1'', then moving the component representing this entity out of the ambient's physical space could lead the system to halt the application, as one instance of the entity would be required for it to run.
\begin{lstlisting}[language=json, numbers=none, caption={Photo Gallery Entity Definition.}]
"entities": [{
"id": "photo_gallery",
"components": ["flat_surface"],
"instances": "[0-1]",
"discrete_variables": [
"active": "distance(agent, flat_surface) < 0.3"
] },
\end{lstlisting}
The entity \texttt{photo\_viewer} can have zero or more instances and consists of a photo displayer.
\begin{lstlisting}[language=json, numbers=none, caption={Photo Viewer Entity Definition.}]
{
"id": "photo_viewer",
"components": ["flat_surface"],
"instances": "*",
"discrete_variables": [
"active": "distance(agent, flat_surface) < 0.3"
]
}]
\end{lstlisting}
In both entities, we request that the system notify the application of changes to the valuation of the discrete variable ``active''. When a user approaches within 0.3\,m of a gallery or photo viewer surface, the system emits an event. This allows the application to react accordingly by requesting access to the associated video feed and/or visual output.
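The exact shape of this notification is left to the runtime; a hypothetical payload might look like:
\begin{lstlisting}[language=json, numbers=none, caption={Hypothetical Notification for a Discrete Variable Change.}]
"event": {
  "type": "variable_changed",
  "entity": "photo_viewer",
  "variable": "active",
  "value": true
}
\end{lstlisting}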
\subsection{System: Allocation and Coordination}
A \textbf{system manifest} is generated at setup time or authored by a system administrator. It defines the physical layout, available components, agents, ambients, and their grouping into worlds. It ensures applications are isolated unless explicitly allowed to share.
\paragraph{Components}
Components in the system manifest represent real tracked structures or devices.
\begin{lstlisting}[language=json, numbers=none, caption={Component Definition.}]
"id": "table_surface",
"name": "Office's table top",
"type": "static_surface",
"spatial_attributes": {
"x": 0, "y": 0, "z": 0.75,
"width": 1.5, "height": 0.75
},
"capabilities": ["video_feed", "visual_output"]
\end{lstlisting}
\paragraph{Ambients and Worlds}
Each application is assigned to an ambient, which acts as its exclusive spatial and resource scope. Here the ambient makes a concrete (system) component available to the assigned application. It may also suggest that such a component be associated with a certain application entity (usually with user guidance); however, the actual attribution can only be decided by the application at runtime and may be updated at any time.
\begin{lstlisting}[language=json, numbers=none, caption={Ambient and World Declaration.}]
"ambients": [{
"id": "ambient_gallery",
"components": ["table_surface"],
"dimensions": {
"width": 2, "height": 1.5
},
"coordinate_system": ["type": "left-handed", origin: "bottom-left"],
"application": "com.devX.photo_gallery_ar_app"
}]
\end{lstlisting}
Ambients can be grouped into worlds for shared observation and data exchange:
\begin{lstlisting}[language=json, numbers=none, caption={World Declaration.}]
"worlds": [{
"id": "room1",
"ambients": ["ambient_gallery"]
}]
\end{lstlisting}
After deployment, the system proposes candidate components matching the application's requirements, which the application may confirm or remap as part of the bootstrap procedure (see \cref{sec:aura-bootstrap}).
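The final acknowledgement that concludes the bootstrap might take the following form, again assuming an illustrative message schema rather than a fixed one:
\begin{lstlisting}[language=json, numbers=none, caption={Illustrative Entity-to-Component Binding Acknowledgement.}]
"ambient_ack": {
  "ambient": "ambient_gallery",
  "bindings": {
    "photo_gallery": ["table_surface"]
  }
}
\end{lstlisting}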
\section{Conclusion and Future Work}
This paper introduced AURA, a declarative specification and runtime model for spatially structured, multi-application AR systems. By separating application logic from physical resource allocation and introducing structured data sharing, heuristic-based state modelling, and ambient scoping, AURA enables modular development, scalable coordination, and conflict-free coexistence across applications.
Currently, AURA assumes a centralized runtime and lacks global synchronization guarantees, which may lead to diverging histories when applications observe shared state independently. Heuristics are application-defined and deterministic, and semantic arbitration is not yet formalized.
Future work includes implementing a reference runtime with arbitration and event routing, formal validation of manifests and ambient compatibility constraints, and distributed orchestration across heterogeneous AR platforms. We are particularly interested in probabilistic extensions to heuristic evaluation for improving tracking and rendering on moving objects, as well as developing an ambient calculus for structured data exchange between components.
%%
%% The next two lines define the bibliography style to be used, and
%% the bibliography file.
\bibliographystyle{ACM-Reference-Format}
\bibliography{biblio}
\end{document}
\endinput
%%
%% End of file `sample-sigconf-authordraft.tex'.