
Art Gallery — Gamepad-Driven Generative Soundscape

A gallery generative soundscape that responds to visitor presence with a DualSense as the curator's tuning surface — full patch architecture and sensor wiring.

By Aidxn Design

An art gallery gamepad rig is the curator's hand on a generative composition that is mostly responding to the room itself. Visitor presence drives the density. A slow internal clock drives the scene rotation. The gamepad sits on the curator's desk and exists for one job: tuning the weights when the room sounds wrong, without taking the system offline. It is the difference between an install that drifts toward chaos over a six-week run and one that stays composed.

TL;DR
  • Presence sensor + MIDI CC + generative patch + curator gamepad — four layers, one install.
  • Visitor count drives density, not pitch or rhythm. Subtle, not gimmicky.
  • The DualSense whitelist passes only the mapped curator inputs, so a visitor who picks the controller up off the curator's desk hits dead buttons.
  • Tune for a week. Generative systems need live observation before they settle.
  • This pairs with our museum install post — read both for the full pattern.

The architecture, top down

┌──────────────────┐
│ Presence sensor  │  (mmWave radar / depth cam / PIR)
└────────┬─────────┘
         │  visitor count + rough position
         ▼
┌──────────────────┐
│ Raspberry Pi     │  MIDI emitter
└────────┬─────────┘
         │  CC 20 (count), CC 21 (motion)
         ▼  RTP-MIDI over LAN
┌──────────────────────────────────────┐
│ Gallery host (Mac mini)              │
│  ├─ Universal Controller MIDI        │  ←─── DualSense
│  └─ Max/MSP or Ableton patch         │       (curator override)
└────────┬─────────────────────────────┘
         │  audio
         ▼
┌──────────────────┐
│ Gallery speakers │  55–65 dB SPL
└──────────────────┘

Four discrete components, each individually replaceable. The presence sensor is the riskiest piece: depth cams have firmware quirks, and mmWave radar is sensitive to room geometry. Pick one and test it for a full opening day before committing.

Presence sensing — what to use

The sensor is the input layer. Four options, compared on cost and reliability:

Sensor                       | Cost  | Reliability | What you get
-----------------------------|-------|-------------|--------------------------------------------------------
PIR sensor (HC-SR501)        | ~$3   | Low         | Binary "movement detected"; no count, no position
mmWave radar (LD2410)        | ~$15  | High        | Person count (0–3), distance, presence with no movement
Depth camera (Azure Kinect)  | ~$400 | Medium      | Body count, skeletal positions, rough age/height
LiDAR scanner (RPLIDAR A2)   | ~$280 | High        | 2D floor map, multi-person tracking, no faces

For most galleries, mmWave radar is the right choice. It runs forever, ignores temperature changes, and gives a reliable 0–3 person count for the typical contemplation distance from a sound piece (1–4 m).

The Pi-to-MIDI bridge

The sensor talks to a Raspberry Pi (or any small Linux board) over UART or I²C. The Pi emits MIDI CCs over RTP-MIDI (network MIDI) to the gallery host. RTP-MIDI is built into macOS and runs over LAN with sub-3 ms latency — fine for a slow soundscape where parameters drift over seconds.

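# pi-sensor-to-midi.py: minimal sketch
# Reads the radar and publishes visitor count and motion as MIDI CCs.

import time
import mido
from sensor_ld2410 import LD2410   # vendor driver

CC_COUNT = 20    # visitor count, 0..3 scaled to 0..126
CC_MOTION = 21   # motion level, 0..100 sent unscaled

def clamp7(value):
    """Clamp to the 7-bit range (0..127) that a MIDI CC value allows."""
    return max(0, min(127, int(value)))

# virtual=True only creates a local port; a separate RTP-MIDI daemon
# (for example rtpmidid) is assumed to bridge it onto the LAN.
port = mido.open_output('Gallery RTP-MIDI', virtual=True)
sensor = LD2410(uart='/dev/ttyAMA0')

while True:
    count = sensor.person_count()        # 0..3
    motion = sensor.motion_level()       # 0..100
    port.send(mido.Message('control_change',
              channel=0, control=CC_COUNT, value=clamp7(count * 42)))
    port.send(mido.Message('control_change',
              channel=0, control=CC_MOTION, value=clamp7(motion)))
    time.sleep(0.5)   # 2 Hz update is enough for a gallery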

The generative patch — what it actually does

Three or four scenes that share a tonal palette but differ in density and rhythm. The visitor-count CC biases the scene weights — more visitors weight toward denser scenes, fewer toward sparser. The motion CC drives a per-scene density parameter independently. Both are smoothed with a 30-second slew so the soundscape never reacts to a single person walking past.
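The weighting and smoothing can be sketched roughly as follows. This is a minimal illustration, not the actual patch: it assumes three scenes ordered sparse to dense, uses a one-pole smoother with a 30-second time constant as a stand-in for the slew, and the names `Slew` and `scene_weights` and the triangular weighting curve are all hypothetical.

```python
import math


class Slew:
    """One-pole smoother: drifts toward the target with time constant tau_s."""

    def __init__(self, tau_s=30.0, initial=0.0):
        self.tau_s = tau_s
        self.value = initial

    def step(self, target, dt_s):
        # Fraction of the remaining distance covered in dt_s seconds.
        alpha = 1.0 - math.exp(-dt_s / self.tau_s)
        self.value += alpha * (target - self.value)
        return self.value


def scene_weights(cc_value, n_scenes=3):
    """Map a smoothed CC value (0..127) to normalised scene weights.

    Scenes are ordered sparse -> dense; overlapping triangular curves
    cross-fade between neighbouring scenes as the value rises.
    """
    x = max(0.0, min(1.0, cc_value / 127.0)) * (n_scenes - 1)
    raw = [max(0.0, 1.0 - abs(x - i)) for i in range(n_scenes)]
    total = sum(raw)
    return [w / total for w in raw]


# Usage: feed each incoming CC 20 value through the smoother at 2 Hz.
count_slew = Slew(tau_s=30.0)
weights = scene_weights(count_slew.step(84, dt_s=0.5))
```

With this shape, a single visitor walking past moves the smoothed value by under 2% per update, so the scene weights barely shift unless the count stays high for tens of seconds.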

Within each scene, every parameter moves with bounded randomness. Filter cutoff drifts between 800 Hz and 3.5 kHz. Reverb tail drifts between 2 and 5 seconds. Note density (in voices/minute) drifts between 12 and 45 depending on the visitor-count input. The patch never plays the same five seconds twice.
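One simple way to get this kind of bounded drift is a random walk that reflects off its limits. The sketch below is an assumption about how such a parameter could be modelled, not the patch itself; `BoundedDrift` and the step sizes are hypothetical, and only the ranges come from the text above.

```python
import random


class BoundedDrift:
    """Random walk that reflects off its bounds, so the value wanders
    but never leaves [lo, hi]."""

    def __init__(self, lo, hi, max_step, rng=None):
        self.lo, self.hi, self.max_step = lo, hi, max_step
        self.rng = rng or random.Random()
        self.value = self.rng.uniform(lo, hi)

    def step(self):
        v = self.value + self.rng.uniform(-self.max_step, self.max_step)
        # Reflect instead of clipping so the value does not stick to an edge.
        if v > self.hi:
            v = 2 * self.hi - v
        elif v < self.lo:
            v = 2 * self.lo - v
        # Final clamp guards against an oversized max_step.
        self.value = min(self.hi, max(self.lo, v))
        return self.value


# Ranges from the text; step sizes are illustrative guesses.
cutoff_hz = BoundedDrift(800.0, 3500.0, max_step=40.0)
reverb_s = BoundedDrift(2.0, 5.0, max_step=0.05)
```

Calling `step()` once or twice a second keeps each parameter moving slowly enough to stay below conscious notice while still never repeating exactly.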

The curator's gamepad

The DualSense sits on the curator's desk in the back office, plugged into the gallery host via a long USB-C extension. It only emits the inputs whitelisted in the bridge — visitors who pick it up by accident hit dead buttons.

Input            | MIDI     | Curator effect
-----------------|----------|-----------------------------------------
Triangle         | Note 60  | Force scene A (sparse, washy)
Square           | Note 61  | Force scene B (mid-density, melodic)
Cross            | Note 62  | Force scene C (dense, rhythmic)
Circle           | Note 63  | Return to automatic (presence-driven)
Touchpad X       | CC 16    | Density bias, ±50% of presence signal
Touchpad Y       | CC 17    | Tonal tilt, darker / brighter EQ
L2 (analogue)    | CC 11    | Master gain ride (-12 dB to 0 dB)
R2 (analogue)    | CC 12    | Reverb size override
L1+R1 held 2 s   | Note 127 | Emergency mute (fades to silence in 1 s)
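On the receiving side, the same mapping can double as a whitelist: anything not in the table is dropped. The sketch below is a hypothetical illustration in plain Python, not Universal Controller MIDI's configuration format. The action names, `route`, and `gain_db` are invented for the example, and the linear-in-dB gain curve is an assumption.

```python
# (message type, note or CC number) -> curator action, mirroring the table.
CURATOR_MAP = {
    ("note_on", 60): "force_scene_a",
    ("note_on", 61): "force_scene_b",
    ("note_on", 62): "force_scene_c",
    ("note_on", 63): "return_to_auto",
    ("control_change", 16): "density_bias",
    ("control_change", 17): "tonal_tilt",
    ("control_change", 11): "master_gain",
    ("control_change", 12): "reverb_size",
    ("note_on", 127): "emergency_mute",
}


def route(msg_type, number, value):
    """Return (action, value) for a whitelisted message, else None."""
    action = CURATOR_MAP.get((msg_type, number))
    if action is None:
        return None  # off the whitelist: silently dropped
    return action, value


def gain_db(cc_value):
    """Map CC 11 (0..127) linearly onto the -12 dB..0 dB gain ride."""
    cc_value = max(0, min(127, cc_value))
    return -12.0 + 12.0 * cc_value / 127.0
```

Keeping the whitelist as a single table means a new curator control is one extra row, and an unmapped button can never reach the patch.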

The tuning week

Generative systems do not work on opening day. They work on day seven, after a week of live observation and weight adjustment. The pattern is consistent across every install we have helped with:

  • Day 1–2. Something is wrong — too dense, too quiet, the wrong scene at the wrong time. The curator nudges the weights via the touchpad. Take notes.
  • Day 3–4. The big wrongness is gone. Subtle wrongnesses appear: the patch always goes sparse mid-afternoon, even though the room is busy then. Adjust the presence sensor's smoothing window.
  • Day 5–6. The patch is mostly right. The curator notices one specific scene is too foregrounded against a specific artwork. Lower its weight by 15%.
  • Day 7+. Hands off. Watch the install run.

What good gallery soundscapes have in common

We have studied a decent number of gallery sound installs over the years — both as visitors and as install techs. The ones that work share a small list of properties:

  • Quiet enough to allow conversation. 55–65 dB SPL at a metre.
  • Sparse enough to allow attention to the visual work. Foreground/background discipline.
  • Slow enough that the rate of change is below conscious notice. Anything faster than 30 s feels like reactive sound effects.
  • Bounded enough that you never hear the same phrase twice in a visit. Markov chains and slow LFOs over a fixed palette.
  • Composed enough that it sounds intentional, not algorithmic. The curator's hand matters.
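As an illustration of the "Markov chain over a fixed palette" idea from the list above, here is a minimal sketch. The palette, the transition weights, and the `melody` function are hypothetical examples, not values from any install.

```python
import random

# Hypothetical fixed palette: D minor pentatonic as MIDI note numbers.
PALETTE = [62, 65, 67, 69, 72]

# Transition weights: row = current palette index, column = next index.
# Higher numbers make that move more likely; zero forbids it.
TRANSITIONS = [
    [1, 4, 2, 1, 0],
    [3, 1, 4, 1, 1],
    [1, 3, 1, 3, 2],
    [1, 1, 4, 1, 3],
    [0, 1, 2, 4, 1],
]


def melody(length, rng=None, start=0):
    """Walk the chain for `length` steps and return the notes visited."""
    rng = rng or random.Random()
    idx = start
    notes = []
    for _ in range(length):
        notes.append(PALETTE[idx])
        idx = rng.choices(range(len(PALETTE)), weights=TRANSITIONS[idx])[0]
    return notes
```

Because every note comes from the same palette, the output stays tonally coherent, while the weighted transitions make an exact repeat of a long phrase unlikely within one visit.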

The Met's research conservation team have written publicly about the long-run challenges of media-art installation, and the AES has run a few panels on the same. Worth reading if you are pitching a six-month install.

Where the gamepad earns its place

A custom hardware tuning panel would do the same job. It would cost $2,000+, take six weeks to commission, and be unique to this install. A DualSense plus Universal Controller MIDI costs under $200, ships tomorrow, and is replaceable from any electronics shop. For a gallery on a six-week show schedule with a small AV budget, the maths is not close.

Plug in the sensor, write the patch, lock the controller in curator mode, tune for a week. After that, the install runs itself, and the gamepad is the curator's hand for the rare days it sounds wrong.
