
DualSense Microphone MIDI Pitch: Hum-to-Notes Guide

DualSense microphone MIDI pitch tracking — turn the built-in mic into a real-time hum, whistle, or beatbox-to-MIDI input under 25ms. Start sketching ideas.

By Aidxn Design

DualSense microphone MIDI pitch tracking is the feature most owners do not realise their controller has. Every Sony DualSense hides a real MEMS microphone under the touchpad: 16 kHz sample rate, decent noise floor, mounted in a controller that is always in your hand. Universal Controller MIDI v1.4 added a real-time YIN pitch-detection pipeline that turns that mic into a monophonic MIDI input. Hum a melody, whistle a bassline, beatbox a kick pattern, and the controller becomes the keyboard. This is the long-form guide to the workflow: what it is good at, where it fails, and how to set it up for real production work.

TL;DR
  • What: use the DualSense built-in mic as a monophonic pitch-to-MIDI input via Universal Controller MIDI v1.4+.
  • What you need: DualSense, bridge v1.4+, macOS 12+ or Windows 10+, a quiet-ish room.
  • Time: 10 minutes to a working hum-to-MIDI pipeline.
  • Cost: mic capture and YIN pitch detection are free tier. Polyphonic detection and chord output are Pro $89.

What you'll learn

  • Where Sony hid a MEMS mic inside the touchpad and what its frequency response actually is.
  • The four-stage YIN pitch pipeline — capture, pre-filter, autocorrelation, MIDI emit — with measured latencies per stage.
  • How to calibrate the noise floor, set chromatic vs scale-snap quantisation, and route channel 16 in your DAW.
  • A latency table for DualSense mic vs interface + Logic vs Sonuus i2M vs iPad MIDI Guitar.
  • The kitchen-melody workflow that turns the controller into an always-on pitch sketchpad.

What the DualSense microphone actually is (spec for MIDI pitch use)

Inside the touchpad assembly is a single MEMS microphone capsule with a frequency response specced from 100 Hz to 10 kHz, sampled at 16 kHz with 16-bit resolution. At a 16 kHz sample rate the captured signal tops out at the 8 kHz Nyquist limit, so the top of that specced range never reaches the bridge. Mono. Not great for vocals. Excellent for pitch detection of a single fundamental between roughly 80 Hz (E2) and 2 kHz (B6), which covers humming, whistling, beatboxing, and most kick/snare attacks.

macOS exposes the mic as an input device named Wireless Controller in Audio MIDI Setup. Windows shows it as Headset Microphone on the controller's USB audio interface. The bridge taps in directly via Core Audio or WASAPI so you do not have to route through another app.

Sound quality is not the point

You are not recording vocals with this mic. You are extracting pitch and amplitude curves to drive a synth. The mic does not need to sound good — it needs to deliver a clean enough signal that YIN can lock to a fundamental within 20 ms. It does that reliably for a hummed melody from any healthy adult voice.

The DualSense microphone MIDI pitch detection pipeline

Four stages, total cost 21 ms on M-series silicon:

  1. Capture: 16 kHz mono audio buffer, 512 samples at a time (32 ms).
  2. Pre-filter: high-pass at 70 Hz to kill rumble, low-pass at 3 kHz to kill HF noise that confuses pitch detection.
  3. Pitch estimation: YIN algorithm with a confidence threshold (default 0.85). Returns frequency in Hz and a confidence value 0.0–1.0.
  4. MIDI output: frequency converted to MIDI note (nearest, or with quantisation to a scale), amplitude converted to velocity. Pitch bend output covers cents between semitones if enabled.
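The pre-filter stage can be sketched with two first-order sections. This is only an illustration under assumptions: the post gives the 70 Hz and 3 kHz corner frequencies but not the filter order or design, so the one-pole topology and the `PreFilter` name here are hypothetical.

```rust
use std::f32::consts::PI;

/// Hypothetical band-limiting pre-filter: a one-pole high-pass followed by a
/// one-pole low-pass. The filter order is an assumption, not the bridge's design.
pub struct PreFilter {
    hp_alpha: f32,
    lp_alpha: f32,
    hp_prev_in: f32,
    hp_prev_out: f32,
    lp_prev_out: f32,
}

impl PreFilter {
    pub fn new(sample_rate: f32, hp_cutoff_hz: f32, lp_cutoff_hz: f32) -> Self {
        let dt = 1.0 / sample_rate;
        let hp_rc = 1.0 / (2.0 * PI * hp_cutoff_hz);
        let lp_rc = 1.0 / (2.0 * PI * lp_cutoff_hz);
        PreFilter {
            hp_alpha: hp_rc / (hp_rc + dt),
            lp_alpha: dt / (lp_rc + dt),
            hp_prev_in: 0.0,
            hp_prev_out: 0.0,
            lp_prev_out: 0.0,
        }
    }

    /// Filter one sample.
    pub fn process(&mut self, x: f32) -> f32 {
        // High-pass: y[n] = a * (y[n-1] + x[n] - x[n-1]) removes DC and rumble.
        let hp = self.hp_alpha * (self.hp_prev_out + x - self.hp_prev_in);
        self.hp_prev_in = x;
        self.hp_prev_out = hp;
        // Low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1]) attenuates HF noise.
        let lp = self.lp_prev_out + self.lp_alpha * (hp - self.lp_prev_out);
        self.lp_prev_out = lp;
        lp
    }
}
```

With `PreFilter::new(16_000.0, 70.0, 3_000.0)` a constant (DC) input decays to zero while a 440 Hz tone passes almost untouched. A steeper design such as a biquad would reject rumble more sharply at the cost of a little more state.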

Total processing latency from acoustic event at the mic to MIDI note on the virtual port: 21 ms mean on M-series Macs, 27 ms on Intel Macs and Windows. That matches a $300 hardware pitch-to-MIDI box from 2018, running on hardware most people already own.

[Figure: signal chain from mic (16 kHz) through YIN autocorrelation to f₀, then quantise to A4 · 440 Hz · MIDI 69. Hum → fundamental → note number.]
Voice waveform → YIN fundamental → A=440 Hz → MIDI note 69, the textbook conversion.
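The conversion itself is the standard equal-temperament formula, note = 69 + 12·log₂(f / 440). A minimal sketch follows; the function names are illustrative and not taken from the bridge.

```rust
/// Frequency in Hz to a fractional MIDI note number (A4 = 440 Hz = note 69).
pub fn hz_to_midi(hz: f32) -> f32 {
    69.0 + 12.0 * (hz / 440.0).log2()
}

/// Inverse mapping: MIDI note number back to frequency in Hz.
pub fn midi_to_hz(note: f32) -> f32 {
    440.0 * 2f32.powf((note - 69.0) / 12.0)
}

/// Nearest MIDI note plus the remaining offset in cents (100 cents = 1 semitone).
pub fn nearest_note(hz: f32) -> (u8, f32) {
    let midi = hz_to_midi(hz);
    let note = midi.round().clamp(0.0, 127.0);
    (note as u8, (midi - note) * 100.0)
}
```

`nearest_note(440.0)` gives note 69 with a zero cents offset, and the E2 and B6 edges of the tracking range (82.41 Hz and 1975.53 Hz) land on notes 40 and 95. The cents remainder is what a pitch-bend output would carry.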

YIN wins this latency budget against every alternative I tested. Measured on M2 Mac with a 512-sample window where applicable:

Algorithm | Window (samples / ms) | Mean latency | f₀ accuracy | Octave errors | Notes
YIN (de Cheveigné 2002) | 512 / 32 ms | 21 ms | ±2 cents | Rare, gated by confidence | Bridge default
Probabilistic YIN (pYIN) | 1024 / 64 ms | 38 ms | ±1 cent | Almost none | Too slow for live
FFT peak picking | 1024 / 64 ms | 34 ms | ±15 cents | Common below 200 Hz | Cheap, sloppy
Autocorrelation (raw) | 512 / 32 ms | 19 ms | ±6 cents | Frequent | No confidence output
CREPE (CNN, GPU) | 1024 / 64 ms | 55 ms | ±0.5 cents | None | Studio-grade but heavy
SWIPE' | 1024 / 64 ms | 42 ms | ±2 cents | Rare | Strong on speech

YIN vs FFT

Why YIN instead of FFT? Autocorrelation handles low-pitch fundamentals (below 200 Hz) much better than short-window FFT, and YIN returns a confidence value that lets the bridge silently reject octave errors — the de Cheveigné & Kawahara original YIN paper (IRCAM) is the canonical reference for the math. Long-window FFT could match accuracy but adds 30+ ms of latency. YIN at 512 samples hits the latency target with acceptable accuracy.

The bridge's YIN inner loop, in the actual shape it runs (Rust → translated to readable pseudocode):

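// 16 kHz mono buffer, 512 samples = 32 ms window.
// `threshold` is applied to the normalised difference, so it equals
// 1 - confidence threshold (0.15 for the 0.85 default).
// `Pitch` and `parabolic_interpolate` are minimal definitions added so the
// listing compiles; the bridge's own versions are not shown in this post.
#[derive(Debug, Clone, Copy)]
struct Pitch {
    hz: f32,
    confidence: f32,
}

// Refine the lag with a parabola through the three points around `tau`.
fn parabolic_interpolate(cmnd: &[f32], tau: usize) -> f32 {
    let (a, b, c) = (cmnd[tau - 1], cmnd[tau], cmnd[tau + 1]);
    let denom = a - 2.0 * b + c;
    if denom.abs() < f32::EPSILON {
        return tau as f32; // flat neighbourhood: keep the integer lag
    }
    tau as f32 + 0.5 * (a - c) / denom
}

fn yin(buf: &[f32], sample_rate: f32, threshold: f32) -> Option<Pitch> {
    let half = buf.len() / 2;
    let mut d = vec![0.0f32; half];

    // 1. Difference function
    for tau in 1..half {
        for j in 0..half {
            let delta = buf[j] - buf[j + tau];
            d[tau] += delta * delta;
        }
    }

    // 2. Cumulative mean normalised difference
    let mut running = 0.0f32;
    let mut cmnd = vec![1.0f32; half];
    for tau in 1..half {
        running += d[tau];
        if running > 0.0 {
            // Skip pure silence, which would otherwise divide by zero.
            cmnd[tau] = d[tau] * tau as f32 / running;
        }
    }

    // 3. Absolute threshold: taking the first dip below it cuts octave errors
    for tau in 2..half.saturating_sub(1) {
        if cmnd[tau] < threshold && cmnd[tau] < cmnd[tau + 1] {
            let refined = parabolic_interpolate(&cmnd, tau);
            let f0 = sample_rate / refined;
            return Some(Pitch { hz: f0, confidence: 1.0 - cmnd[tau] });
        }
    }
    None // gate: hold previous note
}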

Setup: DualSense microphone MIDI pitch in 10 minutes

1. Enable the mic in the bridge

Open Settings → Microphone in Universal Controller MIDI. Toggle Capture DualSense mic ON. On macOS the OS will ask for microphone permission — grant it. On Windows the controller appears as a default audio input the moment the bridge subscribes to it.

2. Choose detection mode

Three modes ship by default:

  • Pitch → Note: standard monophonic pitch-to-MIDI. Hum a note, get a MIDI note. Default channel 16.
  • Onset → Drum: amplitude transient detection without pitch. Each detected onset fires a fixed MIDI note (default C2 for kick) with velocity from peak amplitude. Use for beatboxing.
  • Envelope → CC: amplitude envelope sent as continuous CC. Whisper into the mic to ride a CC value. Useful for expression macros.
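As a rough illustration of the Envelope → CC mode, the sketch below smooths each frame's peak amplitude and maps it onto the 0 to 127 CC range, emitting a value only when it changes. The smoothing scheme and the `EnvelopeCc` name are assumptions; the post does not say how the bridge shapes its envelope.

```rust
/// Hypothetical envelope follower that turns frame amplitude into a CC value.
pub struct EnvelopeCc {
    smoothing: f32, // 0.0 = no smoothing, values near 1.0 = slow response
    env: f32,
    last_sent: Option<u8>,
}

impl EnvelopeCc {
    pub fn new(smoothing: f32) -> Self {
        EnvelopeCc { smoothing: smoothing.clamp(0.0, 0.999), env: 0.0, last_sent: None }
    }

    /// Process one audio frame; returns a CC value only when it changed.
    pub fn process(&mut self, frame: &[f32]) -> Option<u8> {
        // Peak of the frame, limited to full scale.
        let peak = frame.iter().fold(0.0f32, |m, s| m.max(s.abs())).min(1.0);
        // Exponential smoothing of the peak track.
        self.env = self.smoothing * self.env + (1.0 - self.smoothing) * peak;
        let cc = (self.env * 127.0).round() as u8;
        if self.last_sent == Some(cc) {
            None // unchanged: avoid flooding the MIDI port
        } else {
            self.last_sent = Some(cc);
            Some(cc)
        }
    }
}
```

Suppressing repeated values keeps the MIDI stream light when the level is steady, which matters if the same port also carries notes and button mappings.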

3. Calibrate the noise floor

Hold the controller in playing position, hit Calibrate, sit silent for 3 seconds. The bridge samples the room and DualSense's electrical noise floor, then sets the gate threshold 6 dB above it. This is what stops the bridge sending notes when you are breathing.
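The calibration step amounts to measuring the level of a silent capture and adding a margin. A minimal sketch, assuming the level is measured as RMS in dBFS (the post states the 6 dB margin but not the level measure), with illustrative function names:

```rust
/// RMS level of a buffer in dBFS, where a full-scale sample is 1.0.
pub fn rms_dbfs(buf: &[f32]) -> f32 {
    if buf.is_empty() {
        return f32::NEG_INFINITY;
    }
    let mean_sq = buf.iter().map(|s| s * s).sum::<f32>() / buf.len() as f32;
    // 10*log10(mean square) equals 20*log10(RMS); floor avoids log(0).
    10.0 * mean_sq.max(1e-12).log10()
}

/// Gate threshold: the measured noise floor plus a margin (6 dB in this post).
pub fn gate_threshold_db(silence: &[f32], margin_db: f32) -> f32 {
    rms_dbfs(silence) + margin_db
}

/// True when a frame is loud enough to be passed on to pitch detection.
pub fn gate_open(frame: &[f32], threshold_db: f32) -> bool {
    rms_dbfs(frame) > threshold_db
}
```

A 6 dB margin is roughly a doubling of amplitude, so with a noise floor at -40 dBFS the gate sits at -34 dBFS: a frame at 1.5 times the floor stays gated while a frame at 5 times the floor opens it.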

4. Set quantisation

Humans do not hum in perfect tune. Without quantisation, every hold turns into a stream of pitch bends bouncing between two semitones. Three quantisation options:

  • None: raw frequency converted to nearest MIDI note plus pitch bend. Use only if your singing is studio-grade.
  • Chromatic: snap to nearest semitone. Discards pitch bend entirely. Default.
  • Scale: snap to a chosen scale and root. C minor, A major, whatever. Best for melodic composition.
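The three quantisation options can be sketched as one function over a fractional MIDI note (69 + 12·log₂(f / 440)). The `Quantise` enum, the ±2 semitone bend range used in the example, and the nearest-note search are assumptions made for illustration; only the three behaviours come from the list above.

```rust
#[derive(Clone, Copy)]
pub enum Quantise<'a> {
    None,
    Chromatic,
    /// Root pitch class (0 = C) and scale degrees as semitone offsets from it.
    Scale { root: u8, degrees: &'a [u8] },
}

/// Returns (MIDI note, 14-bit pitch bend). 8192 is the MIDI bend centre.
/// `bend_range` is the synth's bend range in semitones and must be positive.
pub fn quantise(midi: f32, mode: Quantise<'_>, bend_range: f32) -> (u8, u16) {
    let nearest = midi.round().clamp(0.0, 127.0);
    match mode {
        Quantise::None => {
            // Carry the fractional part as pitch bend around the centre.
            let offset = (midi - nearest) / bend_range;
            let bend = (8192.0 + offset * 8192.0).round().clamp(0.0, 16383.0);
            (nearest as u8, bend as u16)
        }
        Quantise::Chromatic => (nearest as u8, 8192),
        Quantise::Scale { root, degrees } => {
            // Pick the in-scale note closest to the detected pitch.
            let mut best = nearest as u8;
            let mut best_dist = f32::MAX;
            for n in 0u8..=127 {
                let pitch_class = (n + 12 - root % 12) % 12;
                let dist = (n as f32 - midi).abs();
                if degrees.contains(&pitch_class) && dist < best_dist {
                    best_dist = dist;
                    best = n;
                }
            }
            (best, 8192)
        }
    }
}
```

For a slightly sharp A4 at 69.3, `Chromatic` returns note 69 with a centred bend, while `None` returns note 69 with the bend pushed above 8192. In C minor (`root: 0`, degrees `[0, 2, 3, 5, 7, 8, 10]`) a detected 61.2 snaps to D (62) because C# is not in the scale.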

5. Route in your DAW

The mic outputs through the same virtual MIDI port as the rest of the bridge, on a dedicated channel (default 16 to avoid colliding with button mappings). In Ableton, set a track's MIDI From to Universal Controller MIDI channel 16. In Logic, create a track and filter to channel 16. Done — hum into the controller, watch notes land on the piano roll. The virtual MIDI port explainer walks through cross-platform routing if your DAW does not auto-discover the bridge.
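If a DAW filter does not seem to catch the mic notes, it helps to know what channel 16 looks like on the wire. In MIDI 1.0 the channel is the low nibble of the status byte, numbered 0 to 15, so the "channel 16" shown in a DAW is nibble 0xF. The helper names below are illustrative, not part of the bridge.

```rust
/// Raw MIDI note-on. `channel` is 1-16 as DAWs display it; on the wire it is
/// stored as 0-15 in the low nibble of the status byte.
pub fn note_on(channel: u8, note: u8, velocity: u8) -> [u8; 3] {
    let ch = channel.clamp(1, 16) - 1;
    [0x90 | ch, note & 0x7F, velocity & 0x7F]
}

/// Raw MIDI note-off with zero release velocity.
pub fn note_off(channel: u8, note: u8) -> [u8; 3] {
    let ch = channel.clamp(1, 16) - 1;
    [0x80 | ch, note & 0x7F, 0]
}

/// DAW-style channel number (1-16) of a channel voice status byte.
pub fn channel_of(status: u8) -> u8 {
    (status & 0x0F) + 1
}
```

A hummed A4 on the default mic channel is therefore `[0x9F, 69, velocity]`. In a MIDI monitor, any note-on whose status byte is `0x9F` came from channel 16.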

[Figure: Mic → MIDI latency budget (M-series, 21 ms total): capture 12 ms, filter 4 ms, YIN 4 ms, MIDI 1 ms.]
Stacked latency stages: capture dominates; YIN itself is only 4 ms of the budget.

What DualSense microphone MIDI pitch is good at

  • Sketching melodies away from a keyboard: I write half my bass parts by humming them into the controller while I cook dinner. Latency is low enough that I can sing along to a backing track without timing falling apart.
  • Beatboxing drum patterns: onset detection on a beatboxed pattern is shockingly tight. Tap out a kick-snare pattern and the bridge fires MIDI notes within 22 ms.
  • Vocal expression sources: the envelope mode lets you ride a CC with breath dynamics. Pair it with the trigger velocity curve workflow for a controller that takes pitch from your voice and timbre from your fingers.
  • Practising melodic improvisation: snap-to-scale mode is brutal honesty — if you hum a note that is not in the scale, you hear silence. Forces ear training.

The kitchen melody workflow

My most-used trick: open a Live session, drop a soft synth on a track, route channel 16 from the bridge to it, hit record, and wander the house humming. Twenty minutes later I have 6 minutes of melodic fragments to mine for hooks. DualSense in hand the whole time. No mic stand, no laptop on a kitchen bench, no setup.

[Figure: note ladder around the A = 440 Hz reference. F4 349 Hz · MIDI 65, G4 392 Hz · 67, A4 440 Hz · 69, B4 494 Hz · 71, C5 523 Hz · 72.]
A=440 Hz pins to MIDI note 69; each step of one MIDI number is one semitone, so the whole tones on the ladder sit two numbers apart.

Where DualSense microphone MIDI pitch tracking fails

  • Polyphony: it is monophonic. Singing or playing two notes simultaneously confuses the YIN tracker and you get warbling between them. Pro tier has experimental two-note detection but it is not production-ready.
  • Plosives: hard "p" and "b" sounds trigger spurious onsets in onset mode. Use "doo" and "dee" syllables for melody, "boom" and "tss" for drum patterns where the plosive is the point.
  • Very low pitches: below 80 Hz the YIN window cannot lock. If you have a true bass voice, transpose up an octave for tracking, then transpose down in MIDI.
  • Noisy rooms: the mic is unidirectional but not aggressive. Air conditioning or a loud monitor speaker in the room will confuse onset detection. Calibration helps but does not eliminate.
  • Pitch when whispering: whispering has no fundamental. The bridge falls back to spectral centroid for whispered input but it is unreliable. Hum, do not whisper.

Latency in practice vs other pitch-to-MIDI options

How the bridge compares with the alternatives, including a real mic plugged into an interface and routed through Logic's MIDI Capture or Cubase's VariAudio:

  • DualSense mic via bridge: 21 ms mean. Wired USB-C, M-series Mac.
  • Audio interface + Logic Pro MIDI Capture: 34 ms mean. Requires a real-time tracker plugin.
  • Audio interface + dedicated pitch-to-MIDI hardware (Sonuus i2M): 12 ms. Dedicated DSP, no buffer overhead.
  • iPad MIDI Guitar / pitch apps: 25–40 ms depending on app.

The DualSense rig sits comfortably in the middle — slower than dedicated hardware, faster than software-only DAW-side tracking, free if you already own the controller. Full latency methodology in the benchmark post.

Combining DualSense microphone MIDI pitch with controller surfaces

The most fun setup uses the mic for melody and the controller's surfaces for timbre. A live patch I run often:

  • Mic → channel 16 → soft synth, drives pitch.
  • L2 trigger → CC 1 → synth filter cutoff.
  • R2 trigger → CC 11 → synth amp envelope.
  • Touchpad XY → CC 16/17 → reverb size and shimmer.
  • Face buttons → fixed pitch overrides for chorus hooks.

Singing a melody while your fingers shape its timbre is a different musical surface than any single instrument — closer to playing a vocoder than playing a keyboard. Pair with the MPE post and the touchpad XY workflow for a fully expressive setup.

Privacy: is the DualSense microphone always on?

No. The mic is captured only when the bridge is running and the toggle is ON. The bridge does not record audio to disk — it processes samples in memory and discards them. Nothing leaves your machine. On macOS the orange mic-active dot in the menu bar lights up whenever the bridge has the mic open, which is the OS-level honest signal that capture is active. If you are paranoid, the toggle is one click.

Tuning DualSense microphone MIDI pitch detection for your voice

Defaults work for most voices, but four knobs change tracker behaviour. Worth tuning once and saving as a preset.

Confidence threshold

Default 0.85. Drop to 0.7 if you have a breathy voice and the bridge keeps gating notes off. Raise to 0.92 if background noise produces spurious notes. The trade-off is permissiveness versus false positives.

Octave bias

YIN occasionally reports a pitch one octave below the true fundamental, especially for low whistles and humming. The Octave bias setting (range -1 to +1 octaves) post-corrects. Most people land on 0. Whistlers try +1; chest-voice hummers try -0.5 and the bridge lifts octave-low detections.

Onset sensitivity

For drum mode. Default 0.4 on a 0–1 scale. Raise to 0.6 for cleaner trigger capture from a controlled beatbox. Lower to 0.25 if you need every micro-onset including ghost notes. Set this with the controller in your normal playing position — sensitivity changes with mic distance to mouth.
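One way to picture the sensitivity setting is as a threshold on how far the peak level must jump between frames before an onset fires. This reading is an assumption (the post does not define the 0 to 1 scale), and `OnsetDetector` is a hypothetical name; velocity comes from peak amplitude, as in the Onset → Drum mode description.

```rust
/// Hypothetical onset detector: fires when the frame peak rises by more than
/// `threshold` over the previous frame's peak.
pub struct OnsetDetector {
    threshold: f32,
    prev_peak: f32,
}

impl OnsetDetector {
    pub fn new(threshold: f32) -> Self {
        OnsetDetector { threshold, prev_peak: 0.0 }
    }

    /// Returns Some(velocity 1..=127) when an onset is detected in this frame.
    pub fn process(&mut self, frame: &[f32]) -> Option<u8> {
        let peak = frame.iter().fold(0.0f32, |m, s| m.max(s.abs()));
        let rise = peak - self.prev_peak;
        self.prev_peak = peak;
        if rise > self.threshold {
            // Velocity from peak amplitude; never 0, which would read as note-off.
            Some((peak.min(1.0) * 127.0).round().max(1.0) as u8)
        } else {
            None
        }
    }
}
```

Under this reading a quiet ghost note that rises by 0.3 is ignored at the 0.4 default but fires at 0.25, which matches the direction of the tuning advice above.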

Hold time

Minimum note duration before the bridge releases. Default 40 ms. Notes shorter than this are filtered out as noise. Drop to 20 ms for fast melodic runs. Raise to 100 ms for sustained singing so vibrato does not fragment into multiple note-on/off events.
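Hold time can be read as a debounce: a newly detected note (or silence) must persist for the hold period before the sounding note changes. The sketch below shows that one interpretation; the bridge's actual state machine is not described in this post, and `NoteDebouncer` and `NoteChange` are hypothetical names.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum NoteChange {
    On(u8),
    Off,
}

/// Hypothetical debouncer: a detection must persist for `hold_ms` to take effect.
pub struct NoteDebouncer {
    hold_ms: f32,
    current: Option<u8>,   // note currently sounding (None = silence)
    candidate: Option<u8>, // detection waiting to be confirmed
    candidate_ms: f32,
}

impl NoteDebouncer {
    pub fn new(hold_ms: f32) -> Self {
        NoteDebouncer { hold_ms, current: None, candidate: None, candidate_ms: 0.0 }
    }

    /// Feed one detection per analysis frame of length `frame_ms`.
    pub fn update(&mut self, detected: Option<u8>, frame_ms: f32) -> Option<NoteChange> {
        if detected == self.current {
            // Back on the sounding note: drop any pending candidate.
            self.candidate = None;
            self.candidate_ms = 0.0;
            return None;
        }
        if detected == self.candidate {
            self.candidate_ms += frame_ms;
        } else {
            self.candidate = detected;
            self.candidate_ms = frame_ms;
        }
        if self.candidate_ms >= self.hold_ms {
            self.current = self.candidate;
            self.candidate = None;
            self.candidate_ms = 0.0;
            return Some(match self.current {
                Some(note) => NoteChange::On(note),
                None => NoteChange::Off,
            });
        }
        None
    }
}
```

With a 40 ms hold and 10 ms frames, a note sounds on its fourth consecutive frame, and a two-frame vibrato excursion to the neighbouring semitone never produces an event. The cost is that the hold time adds directly to note-on latency, which is why shorter values suit fast runs.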

When to use DualSense microphone MIDI pitch (and when not to)

The verdict: use it for sketching, beatboxing drum patterns, breath-driven expression, and ear-training practice. Skip it for serious lead-vocal capture (use a real mic), for chord progressions (it is monophonic), and for any context where the mic might pick up sensitive conversation. The DualSense mic was designed for game voice chat, not studio capture — the pitch pipeline plays to those strengths.

Once Universal Controller MIDI is running, the mic feature is one toggle away. Spend an afternoon humming melodies into the controller and you will capture ideas you would have otherwise lost between the kitchen and the studio.

FAQ

Is DualSense microphone MIDI pitch tracking polyphonic?

No, the production pipeline is monophonic. Pro tier ships an experimental two-note detection mode but it is not reliable enough for live use. For chord input use the face buttons or the touchpad XY surface instead.

How accurate is the pitch detection at low notes?

The YIN tracker locks reliably from 80 Hz (E2) to 2 kHz (B6). Below 80 Hz the analysis window cannot resolve the fundamental — true bass voices should transpose up an octave for tracking, then transpose back down inside the DAW.

Does the DualSense microphone work over Bluetooth?

Yes, but with extra latency (+15 ms) and a lower effective sample rate due to the BT codec. Wired USB-C is recommended for sketching and tracking. Bluetooth is fine for casual hum-capture practice.

Will the bridge record audio from the DualSense microphone?

No. The bridge processes samples in memory and discards them after pitch and amplitude extraction. Nothing is written to disk. On macOS the orange mic-active dot in the menu bar lights up whenever the bridge has the mic open, so OS-level capture indication is honest.

Can I use the mic alongside button mappings on the same controller?

Yes — mic-driven notes default to MIDI channel 16 while button mappings live on channels 1–15, so they never collide. Route them to different tracks in the DAW and you have pitch from your voice and timbre from your fingers on the same DualSense.
