AR Development May 19, 2026 · 10 min read

Progressive WebAR Experiences for Enterprise: Marker-Based vs Markerless — When to Pick Which

WebAR for Enterprise: What It Actually Delivers vs. What Gets Oversold

Progressive WebAR experiences for enterprise almost always start the same way: a polished demo, a QR code, a photorealistic product appearing on a conference table. What follows the prototype is where the real design work begins. The gap between that controlled demo and a deployment that survives real devices, real lighting, and real users is almost entirely determined by one decision made early in the brief — whether you anchor content to a marker or let the device track the environment itself.

These are not stylistic choices. Marker-based and markerless AR are architecturally different, and each carries a distinct set of trade-offs across tracking stability, device coverage, interaction depth, deployment complexity, and cost. Getting this decision wrong means rebuilding after the first field test.

We've seen this play out directly. The Meet Eva Here installation at ArtScience Museum Singapore — an AI avatar Unity build that tracked and responded to visitor gestures in a real, unpredictable museum environment — taught us more about what WebAR interaction models can and cannot sustain than any controlled prototype. The visitors were unscripted. The lighting shifted. The physical space had no markers. The system had to hold.

This post maps the two approaches across five dimensions. At the end, you'll find a scoring matrix and a direct verdict on when to pick which.


What "Progressive WebAR Experiences for Enterprise" Actually Means

Before the comparison, a definition that matters: progressive WebAR experiences for enterprise are not single-shot demos. They are layered deployments — starting with the broadest possible device support and a lightweight interaction model, then adding fidelity and complexity as the device and browser prove capable. The word "progressive" is deliberate. It mirrors progressive enhancement in web development: design for the floor, then build toward the ceiling.

The floor is a marker-based image trigger on a mid-range Android device running Chrome. The ceiling is markerless SLAM tracking with persistent spatial anchors, gesture input, and real-time AI response on a flagship iOS device with ARKit. Most enterprise deployments need to work somewhere between those two points — and the brief should specify exactly where before a studio opens a scene file.
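To make the floor-to-ceiling idea concrete, here is a minimal sketch of tier selection. The capability flags, the tier names, and the extra non-AR fallback below the floor are all assumptions made for illustration; in a browser the flags would typically be filled in from checks such as window.isSecureContext and navigator.xr.isSessionSupported("immersive-ar").

```typescript
// Hypothetical capability flags, populated elsewhere by feature detection.
interface DeviceCapabilities {
  secureContext: boolean;        // page is served over HTTPS
  cameraAvailable: boolean;      // a camera stream can be requested
  immersiveArSupported: boolean; // WebXR "immersive-ar" session is available
}

// Tier names are illustrative, not taken from any framework.
type ExperienceTier = "no-ar-fallback" | "marker-based" | "markerless";

// Start from the floor and step up only when the device proves capable.
function pickExperienceTier(caps: DeviceCapabilities): ExperienceTier {
  if (!caps.secureContext || !caps.cameraAvailable) {
    // Assumption: without a camera we show a plain 3D view instead of AR.
    return "no-ar-fallback";
  }
  if (caps.immersiveArSupported) {
    return "markerless";
  }
  return "marker-based";
}
```

The point of the shape is that the marker-based tier is the default outcome, and markerless is something the device has to earn at runtime.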

The stack underneath all of this is not monolithic. ARKit on iOS and ARCore on Android implement visual-inertial SLAM pipelines that detect planes and track device motion. WebXR sits above them, exposing pose data to JavaScript. Frameworks like A-Frame and Babylon.js then render content via WebGL or WebGPU. These are three independently evolving layers, and fragmentation at any one of them propagates into the user experience. The WebXR AR module is well supported in Chrome on ARCore-capable Android devices; on iOS Safari, support for immersive AR sessions remains partial. That single fact shapes every enterprise deployment decision.


Dimension 1: Tracking Stability Under Real Conditions

Marker-based: Stable, predictable, and relatively forgiving of hardware variation. Because the system knows the marker's geometry in advance, pose estimation is fast and accurate — provided the marker is visible and reasonably lit. Tracking breaks cleanly: when the marker leaves the frame, the experience pauses. When it returns, it resumes. This binary behaviour is actually an asset in controlled environments because it is debuggable and explainable to users.
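The pause-and-resume behaviour described above can be sketched as a tiny state machine. The state names and the per-frame markerVisible flag are assumptions for illustration; a real image-tracking library would report visibility through its own events.

```typescript
// Illustrative states for a marker-anchored experience.
type MarkerState = "searching" | "tracking" | "paused";

// One transition per camera frame, driven only by marker visibility.
function nextMarkerState(current: MarkerState, markerVisible: boolean): MarkerState {
  if (markerVisible) {
    return "tracking"; // marker found or found again: run or resume
  }
  // Marker not visible: pause if we were running, otherwise keep searching.
  return current === "searching" ? "searching" : "paused";
}

// Replay a sequence of per-frame visibility flags and collect the states.
function replayMarkerFrames(frames: boolean[]): MarkerState[] {
  const states: MarkerState[] = [];
  let state: MarkerState = "searching";
  for (const visible of frames) {
    state = nextMarkerState(state, visible);
    states.push(state);
  }
  return states;
}
```

Because every transition depends on a single boolean, a tracking problem in the field can be reproduced by replaying the visibility sequence.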

Markerless: Depends on SLAM — simultaneous localization and mapping — which estimates both device pose and a map of the environment from camera and sensor data in real time. In well-lit spaces with textured surfaces, ARKit and ARCore perform well. In low-light environments, reflective surfaces, or feature-poor spaces (blank white walls, polished floors), tracking degrades. Drift accumulates over time and distance; virtual objects slowly misalign from their intended positions. Mitigating this requires spatial anchors, periodic relocalization, or a Visual Positioning System layered on top.
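One way to reason about drift handling is a simple threshold check: compare where content was anchored with where the tracker currently places it, and relocalize when the gap grows too large. This is a sketch under assumptions; the Vec3 type, the function names, and the threshold are hypothetical, and a production system would lean on the platform's own anchor APIs.

```typescript
interface Vec3 {
  x: number;
  y: number;
  z: number;
}

// Euclidean distance between two positions, in the same units as the inputs.
function distanceBetween(a: Vec3, b: Vec3): number {
  const dx = a.x - b.x;
  const dy = a.y - b.y;
  const dz = a.z - b.z;
  return Math.sqrt(dx * dx + dy * dy + dz * dz);
}

// True when observed drift exceeds the tolerated threshold (metres assumed).
function needsRelocalization(anchored: Vec3, observed: Vec3, thresholdMeters: number): boolean {
  return distanceBetween(anchored, observed) > thresholdMeters;
}
```

The threshold is a design decision: a product placed on a table tolerates far less drift than a wayfinding arrow across a room.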

Meet Eva Here was a markerless environment. Visitors walked freely around the avatar, triggering responses through gesture and movement. There were no markers to fall back on. The interaction model had to be designed around the assumption that tracking quality would vary — and that a tracking hiccup mid-interaction should degrade gracefully rather than break the experience visibly. That constraint shaped every animation state and input threshold in the build.
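As an illustration of graceful degradation (not the logic used in the installation itself), a grace window can keep short tracking gaps from changing what the user sees. The mode names and millisecond thresholds below are hypothetical.

```typescript
// Illustrative avatar modes; names are assumptions, not from the project.
type AvatarMode = "responsive" | "idle-hold" | "reset";

// Map time since the last good tracking frame to a behaviour mode.
// graceMs and resetMs are placeholder thresholds to tune per deployment.
function avatarModeFor(msSinceGoodTracking: number, graceMs: number = 500, resetMs: number = 5000): AvatarMode {
  if (msSinceGoodTracking <= graceMs) {
    return "responsive"; // brief hiccup: carry on as if nothing happened
  }
  if (msSinceGoodTracking <= resetMs) {
    return "idle-hold"; // longer gap: hold a neutral idle animation
  }
  return "reset"; // tracking is gone: return to the start state
}
```

Short gaps are absorbed silently, and only a sustained loss changes the visible state.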

Verdict on this dimension: Marker-based wins on stability. Markerless wins on spatial freedom. If your deployment environment is uncontrolled, budget explicitly for drift handling.


Dimension 2: Device and Browser Coverage

Marker-based: Broader compatibility. Image tracking via JavaScript computer vision libraries can run without WebXR, which means it works on devices and browsers that don't fully support the WebXR AR module. This includes older iOS Safari versions and some corporate-managed Android devices. The compute requirement is lower, which matters on mid-range and budget hardware common in enterprise device fleets.

Markerless: Narrower. Full markerless world tracking via WebXR requires a browser that supports the immersive-ar session type. On Android, Chrome handles this well. On iOS, Safari's WebXR support is incomplete, so certain session types require workarounds or hybrid viewer approaches. Camera access adds a further constraint: browsers expose the camera and WebXR only in secure (HTTPS) contexts, and enterprise device policies can restrict camera access further. Both are easy to miss in a development environment and painful to discover in a pilot.

The practical implication: If your target audience is a mixed fleet of corporate devices — some managed, some BYOD, spanning iOS and Android across two or three generations — a marker-based WebAR approach will reach more of them reliably. Markerless should be scoped to a known device matrix, tested on the actual hardware before the brief is signed.
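A device matrix can be turned into a rough coverage number before the brief is signed. The sketch below assumes a hypothetical fleet inventory in which each entry records whether that device passed image-tracking and immersive-ar tests on real hardware.

```typescript
// Hypothetical inventory entry; the support flags come from hands-on testing.
interface FleetDevice {
  model: string;
  count: number;
  supportsImageTracking: boolean;
  supportsImmersiveAr: boolean;
}

// Share of the fleet (weighted by device count) that satisfies a predicate.
function fleetCoverage(fleet: FleetDevice[], supports: (d: FleetDevice) => boolean): number {
  const total = fleet.reduce((sum, d) => sum + d.count, 0);
  if (total === 0) {
    return 0; // empty fleet: report zero rather than dividing by zero
  }
  const covered = fleet.filter(supports).reduce((sum, d) => sum + d.count, 0);
  return covered / total;
}
```

Calling fleetCoverage once with d => d.supportsImageTracking and once with d => d.supportsImmersiveAr gives two numbers that can be compared directly against the audience the pilot needs to reach.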

Verdict on this dimension: Marker-based wins on coverage. Markerless requires a validated device matrix before committing.


Dimension 3: Interaction Depth and Fidelity

Marker-based: Interaction is spatially constrained by the marker. Users can look at a product from different angles, trigger animations, or access information layers — but the experience is anchored to a fixed point. This is appropriate for product visualization, packaging activation, and printed collateral. It is not appropriate for experiences that require users to walk through a space, place content on arbitrary surfaces, or interact with large-scale environments.

Markerless: Enables richer spatial interaction — placing objects on detected planes, walking around them, scaling them, and anchoring multiple pieces of content across a room. This is the model that makes enterprise use cases like virtual showrooms, field service overlays, and spatial training environments possible. It also enables gesture and body tracking at the level we used in Meet Eva Here, where the avatar responded to visitor proximity and movement rather than a static trigger point.

The fidelity ceiling is set by the browser's rendering pipeline. Compared to a native build, WebGL in a mobile browser leaves a smaller practical budget for polygon counts, texture memory, and shader complexity. A markerless experience that looks photorealistic in a Unity editor will need asset optimization passes before it performs acceptably in a WebGL context on a mid-range device. This is not a reason to avoid markerless AR; it is a reason to scope the asset pipeline correctly from the start.
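Scoping the asset pipeline usually starts with an explicit budget per device tier. The check below is a sketch; the field names are hypothetical and the actual budget numbers have to come from profiling on the target hardware, not from this example.

```typescript
interface AssetStats {
  name: string;
  triangles: number;
  textureBytes: number;
}

// Budget values are supplied by the team after profiling; none are implied here.
interface AssetBudget {
  maxTriangles: number;
  maxTextureBytes: number;
}

// Return a human-readable list of budget violations (empty when within budget).
function budgetViolations(asset: AssetStats, budget: AssetBudget): string[] {
  const issues: string[] = [];
  if (asset.triangles > budget.maxTriangles) {
    issues.push(`${asset.name}: ${asset.triangles} triangles exceeds ${budget.maxTriangles}`);
  }
  if (asset.textureBytes > budget.maxTextureBytes) {
    issues.push(`${asset.name}: ${asset.textureBytes} texture bytes exceeds ${budget.maxTextureBytes}`);
  }
  return issues;
}
```

Running a check like this in the build step catches over-budget assets before they reach a device test.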

Verdict on this dimension: Markerless wins on interaction depth. Factor in a WebGL optimization pass for any asset-heavy scene.


Dimension 4: Deployment and Maintenance Complexity

Marker-based: Simpler to deploy and maintain. The marker is the integration point — update the URL behind the trigger and the experience changes without touching physical assets. Analytics are straightforward: scan events, session duration, and interaction triggers map cleanly to standard web analytics. Security is easier to reason about because the camera access scope is narrow and the session is short.
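To show how cleanly marker-based sessions map onto ordinary web analytics, here is a sketch that reduces a flat event log to scan counts, interaction counts, and mean session duration. The event shape is an assumption for illustration and is not tied to any particular analytics product.

```typescript
// Hypothetical event record emitted by the WebAR page.
interface ArEvent {
  sessionId: string;
  type: "scan" | "interaction" | "end";
  timestampMs: number;
}

interface ArSummary {
  scans: number;
  interactions: number;
  sessions: number;
  meanSessionMs: number;
}

function summarizeArEvents(events: ArEvent[]): ArSummary {
  const spans: Record<string, { first: number; last: number }> = {};
  const spanList: { first: number; last: number }[] = [];
  let scans = 0;
  let interactions = 0;
  for (const e of events) {
    if (e.type === "scan") scans += 1;
    if (e.type === "interaction") interactions += 1;
    const span = spans[e.sessionId];
    if (span === undefined) {
      // First event of this session: open a new time span.
      const created = { first: e.timestampMs, last: e.timestampMs };
      spans[e.sessionId] = created;
      spanList.push(created);
    } else {
      span.first = Math.min(span.first, e.timestampMs);
      span.last = Math.max(span.last, e.timestampMs);
    }
  }
  // Session duration is measured from first to last event in the session.
  const totalMs = spanList.reduce((sum, s) => sum + (s.last - s.first), 0);
  const meanSessionMs = spanList.length === 0 ? 0 : totalMs / spanList.length;
  return { scans, interactions, sessions: spanList.length, meanSessionMs };
}
```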

Markerless: More moving parts. Plane detection must be calibrated for the target environment. Spatial anchors may need to be pre-mapped. Updates to the 3D scene require regression testing across the device matrix because SLAM behaviour can change underneath the experience: ARKit ships tracking changes with iOS releases, ARCore ships them through Google Play Services for AR updates, and a scene that tracked well on iOS 16 may behave differently on iOS 18. Analytics are harder to instrument because the interaction model is spatial rather than click-based.

For enterprise clients with IT security requirements, the markerless path also involves more negotiation around camera permissions, data handling, and HTTPS enforcement. These are solvable problems, but they take time — and they should be scoped into the project timeline, not discovered during UAT.

Verdict on this dimension: Marker-based wins on deployment simplicity. Markerless requires ongoing device matrix maintenance and security scoping.


Dimension 5: Cost and Time to First Value

Marker-based: Lower development cost, faster to first prototype. A marker-based WebAR experience with a single 3D asset, basic animation, and image tracking can be built and deployed in weeks. The interaction model is constrained, which means fewer edge cases, fewer test scenarios, and faster QA cycles. For an enterprise team validating whether WebAR belongs in their sales or training stack at all, marker-based is the right starting point.

Markerless: Higher cost, longer timeline. SLAM-based tracking requires more engineering, more device testing, and more design work to handle tracking failure gracefully. The Meet Eva Here project — which involved gesture tracking, real-time AI avatar response, and an unpredictable museum audience — required a significantly more complex build than a marker-based equivalent would have. That complexity was justified by the use case: a static marker trigger would not have delivered the visitor interaction the ArtScience Museum needed. But for an enterprise team starting out, that complexity should be earned, not assumed.

Verdict on this dimension: Marker-based wins on time to first value. Markerless investment is justified only when the use case genuinely requires spatial freedom.


Scoring Matrix and Verdict

Dimension | Marker-Based | Markerless
Tracking stability | ✅ High | ⚠️ Environment-dependent
Device/browser coverage | ✅ Broad | ⚠️ Requires validated matrix
Interaction depth | ⚠️ Constrained | ✅ Spatial, rich
Deployment complexity | ✅ Low | ⚠️ High
Time to first value | ✅ Fast | ⚠️ Longer

Pick marker-based WebAR if:

  • Your deployment environment is controlled — packaging, printed collateral, demo booths, trade-show signage
  • Your device fleet is mixed or unvalidated
  • You are running a first WebAR pilot and need to prove value before committing to a larger build
  • Your IT security team needs a narrow camera permission scope
  • Timeline is under 8 weeks

Pick markerless WebAR if:

  • Your use case requires users to move through a space — virtual showrooms, spatial training, field service overlays
  • You have a validated device matrix and can test on actual hardware
  • Interaction depth and spatial freedom are core to the value proposition, not nice-to-haves
  • You have budget for drift mitigation, spatial anchor management, and ongoing device matrix maintenance
  • The experience needs to respond to user presence and movement, as in the Meet Eva Here AI avatar installation
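The two checklists above can be condensed into a small decision helper. This is a deliberate simplification: the input fields are hypothetical, and the function treats "users need to move through a space" as the deciding factor, then lists which markerless prerequisites are still unmet.

```typescript
// Hypothetical brief inputs, loosely mirroring the checklists above.
interface BriefInputs {
  usersMoveThroughSpace: boolean;
  deviceMatrixValidated: boolean;
  budgetForDriftAndMaintenance: boolean;
  timelineWeeks: number;
}

interface TrackingRecommendation {
  model: "marker-based" | "markerless";
  openQuestions: string[]; // prerequisites to resolve before committing
}

function recommendTrackingModel(brief: BriefInputs): TrackingRecommendation {
  if (!brief.usersMoveThroughSpace) {
    // No need for spatial freedom: stay on the simpler path.
    return { model: "marker-based", openQuestions: [] };
  }
  const openQuestions: string[] = [];
  if (!brief.deviceMatrixValidated) {
    openQuestions.push("validate the device matrix on actual hardware");
  }
  if (!brief.budgetForDriftAndMaintenance) {
    openQuestions.push("budget for drift mitigation and matrix maintenance");
  }
  if (brief.timelineWeeks < 8) {
    openQuestions.push("timeline under 8 weeks favours marker-based");
  }
  return { model: "markerless", openQuestions };
}
```

A non-empty openQuestions list is the signal to pause: the use case points to markerless, but the brief is not yet ready to commit to it.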

The most common mistake we see in enterprise WebAR briefs is specifying a markerless experience because it looks better in the demo reel, then discovering mid-project that the target device fleet doesn't support it reliably. The second most common mistake is defaulting to marker-based because it's simpler, then delivering an experience so constrained that it doesn't justify the investment.

Progressive WebAR experiences for enterprise are built by matching the tracking model to the deployment environment first — before the brief, before the concept, before the asset pipeline. Everything else follows from that decision.


If you're scoping a WebAR deployment and want a straight read on which tracking model fits your environment, device fleet, and timeline — we're straightforward about what the stack can and cannot sustain. Talk to the VVS team before the brief is written, not after the first prototype breaks in the field.
