# Input device selection

> Pin a preferred microphone by CoreAudio UID in Settings and fall back to the system default when it is unplugged. Also covers the Bluetooth A2DP→HFP switch delay, the `audioReady` HUD cue, and the `readyLevelThreshold` / `readyFallbackDelay` backstop in `AudioCaptureService`.

- Repository: cartesia-ai/InkIt
- GitHub: https://github.com/cartesia-ai/InkIt
- Human docs: https://www.grok-wiki.com/public/docs/cartesia-ai-inkit-18975554254b
- Complete Markdown: https://www.grok-wiki.com/public/docs/cartesia-ai-inkit-18975554254b/llms-full.txt

## Source Files

- `InkIt/AudioDeviceManager.swift`
- `InkIt/AudioCaptureService.swift`
- `InkIt/AudioPCMConverter.swift`
- `InkIt/SettingsStore.swift`
- `InkIt/SettingsView.swift`

---


InkIt pins dictation input by CoreAudio device UID (`kAudioDevicePropertyDeviceUID`), not by the transient `AudioDeviceID`. Settings persists the UID in UserDefaults. At each hotkey press, `AudioCaptureService` resolves the UID to the current `AudioDeviceID` and applies it with `AVAudioEngine.inputNode.auAudioUnit.setDeviceID(_:)` before reading the input format and streaming mono 16 kHz PCM to Cartesia.

## Why pin a microphone

macOS routes the system default input dynamically. Bluetooth headsets and AirPods often become the default input when connected, and they switch from their stereo output profile (A2DP) to a narrowband hands-free mic profile (HFP) only when capture starts. That takeover can degrade transcription without any visible setting change.

Pinning decouples InkIt's capture device from macOS routing: you can keep dictating through a wired or built-in mic even when Bluetooth gear is connected.

<Note>
Pinning stores a preference, not an exclusive lock. InkIt does not change the system default input device in macOS Sound settings.
</Note>

## Choose an input device in Settings

The picker lives in **Settings → Dictation → Microphone → Input device**. Search for "mic", "input", or "bluetooth" to surface the same control inline.

<Steps>
<Step title="Open the microphone picker">
Open Settings from the Home gear icon, select the **Dictation** pane, and find the **Microphone** section.
</Step>
<Step title="Select a device">
Choose **System default** to follow macOS, or pick a specific input from the list. Bluetooth devices appear with a `(Bluetooth)` suffix.
</Step>
<Step title="Read the advisory caption">
An orange caption appears when the pinned device is disconnected (fallback active) or when a Bluetooth mic is selected (accuracy warning).
</Step>
<Step title="Verify on next dictation">
Hold or toggle your dictation shortcut. The notch HUD shows a pulsing dot until the mic is live, then switches to the live waveform — that transition is the cue to start speaking.
</Step>
</Steps>

| Picker value | Stored UID | Capture behavior |
| --- | --- | --- |
| System default | `""` (empty string) | Routes to the device reported by `kAudioHardwarePropertyDefaultInputDevice` on every take |
| Named device | CoreAudio UID string | Routes to that device when attached |
| Pinned but unplugged | Stale UID (still stored) | Falls back to system default; Settings shows an orange warning |
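
The mapping from enumerated devices to picker rows can be sketched as follows. This is an illustrative sketch, not InkIt's code: `InputDevice`, `PickerOption`, and `pickerOptions(for:)` are hypothetical names, and the sample UIDs in the usage note are invented. Only the empty-string tag for **System default** and the `(Bluetooth)` suffix come from the behavior described above.

```swift
// Hypothetical model of one enumerated input device.
struct InputDevice {
    let uid: String        // stable CoreAudio UID
    let name: String       // human-readable device name
    let isBluetooth: Bool  // true for Bluetooth transports
}

// Hypothetical picker row: `tag` is the value stored in preferredInputDeviceUID.
struct PickerOption: Equatable {
    let tag: String
    let label: String
}

/// Builds the picker rows: "System default" first (empty tag),
/// then one row per device, with a "(Bluetooth)" suffix where it applies.
func pickerOptions(for devices: [InputDevice]) -> [PickerOption] {
    let systemDefault = PickerOption(tag: "", label: "System default")
    let named = devices.map { device in
        PickerOption(
            tag: device.uid,
            label: device.isBluetooth ? "\(device.name) (Bluetooth)" : device.name
        )
    }
    return [systemDefault] + named
}
```

For example, passing a built-in mic and a Bluetooth headset yields three rows, the last of which is labeled with the headset's name followed by `(Bluetooth)`.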

## Persistence

<ParamField body="preferredInputDeviceUID" type="string" default='""'>
UserDefaults key written by `SettingsStore`. Empty string means follow the macOS default. Non-empty values are stable CoreAudio UIDs that survive reboot and replug; the transient `AudioDeviceID` is never saved.
</ParamField>

The picker binds directly to `settings.preferredInputDeviceUID`. Changes take effect on the next dictation — there is no separate Apply step.
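
A minimal sketch of such a UserDefaults-backed preference, assuming nothing about how `SettingsStore` is actually written. The `InputDevicePreference` type and its `followsSystemDefault` helper are hypothetical; only the key name and the empty-string convention come from this page.

```swift
import Foundation

// Hypothetical wrapper around the stored preference.
struct InputDevicePreference {
    static let key = "preferredInputDeviceUID"
    let defaults: UserDefaults

    /// Stored CoreAudio UID; a missing key reads as "" (follow the system default).
    var uid: String {
        get { defaults.string(forKey: Self.key) ?? "" }
        nonmutating set { defaults.set(newValue, forKey: Self.key) }
    }

    /// True when no device is pinned.
    var followsSystemDefault: Bool { uid.isEmpty }
}
```

Injecting the `UserDefaults` instance, rather than reaching for `.standard` inside the type, keeps the sketch easy to exercise against a throwaway suite.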

## Device enumeration

Two layers cooperate: a static CoreAudio query usable off the main actor, and a main-actor manager that keeps the Settings list fresh.

```text
  Settings UI                    Capture (per take)
  ─────────────                  ──────────────────
  AudioDeviceManager             AudioDevices (static)
    │ refresh on attach/detach     │ inputDevices()
    │ + default-input change       │ deviceID(forUID:)
    ▼                              │ defaultInputDeviceID()
  MicrophonePickerRow              ▼
    Picker ← preferredInputDeviceUID   AudioCaptureService.start()
                                         setDeviceID(pinned ?? default)
```

`AudioDevices.inputDevices()` returns every hardware device with at least one input stream, reading UID and name from CoreAudio and flagging `isBluetooth` when the transport type is `kAudioDeviceTransportTypeBluetooth` or `kAudioDeviceTransportTypeBluetoothLE`.

`AudioDeviceManager` registers property listeners on:

- `kAudioHardwarePropertyDevices` — device attach and detach
- `kAudioHardwarePropertyDefaultInputDevice` — system default changes

The picker refreshes automatically while Settings is open. Each mount of the picker creates its own manager instance, which starts listening on mount and stops on unmount.

## Capture-time routing

On every `startDictation`, `AppCoordinator` copies the stored preference into the capture service and starts the engine:

```swift
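// Copy the stored preference before every take so Settings changes apply immediately.
audio.preferredDeviceUID = settings.preferredInputDeviceUID
// Forward each captured audio chunk to the STT client.
try audio.start { data in client?.sendAudio(data) }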
```

Inside `AudioCaptureService.start()`:

1. Resolve `pinnedID = AudioDevices.deviceID(forUID: preferredDeviceUID)` — returns `nil` when the UID is empty or the device is unplugged.
2. Call `setDeviceID(pinnedID ?? defaultInputDeviceID())` while the engine is stopped and **before** reading `inputFormat(forBus: 0)`.
3. Build an `AudioPCMConverter` from the active device's native format to mono `pcm_s16le` at 16 kHz.
4. Install an input tap, start the engine, and arm the readiness backstop.

<Warning>
`AVAudioEngine` is reused across takes. When the preference is cleared (System default), InkIt explicitly resets to the current default on each `start()`. Without that reset, a previously pinned device would stick on the engine instance.
</Warning>
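
The resolution rule in steps 1 and 2 can be sketched in plain Swift. This is a sketch under stated assumptions, not the code in `AudioCaptureService`: `DeviceID` stands in for CoreAudio's `AudioDeviceID` (a `UInt32`), and `AttachedInput`, the `in:` parameter, and `captureDeviceID(preferredUID:attached:systemDefault:)` are hypothetical. The device list and default ID are passed in so the logic runs without CoreAudio.

```swift
typealias DeviceID = UInt32  // stand-in for CoreAudio's AudioDeviceID

// Hypothetical snapshot of one attached input device.
struct AttachedInput {
    let id: DeviceID   // transient, changes across replug and reboot
    let uid: String    // stable CoreAudio UID
}

/// Returns nil when the UID is empty or no attached device carries it.
func deviceID(forUID uid: String, in attached: [AttachedInput]) -> DeviceID? {
    guard !uid.isEmpty else { return nil }
    return attached.first { $0.uid == uid }?.id
}

/// Always returns a concrete device: the pinned one if attached, else the
/// system default. Returning a value in every case mirrors the explicit
/// reset that stops a reused engine from sticking to a stale pin.
func captureDeviceID(preferredUID: String,
                     attached: [AttachedInput],
                     systemDefault: DeviceID) -> DeviceID {
    deviceID(forUID: preferredUID, in: attached) ?? systemDefault
}
```

Because the function never returns "leave the engine as is", the same code path serves a pinned device, a cleared preference, and an unplugged pin.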

If `start()` throws, `AppCoordinator` surfaces `Audio start failed: …` in the notch HUD and cancels the STT session.

## Bluetooth A2DP→HFP and the audioReady HUD cue

Bluetooth headsets spend roughly 200–500 ms switching from output (A2DP) to the hands-free mic profile (HFP) after capture begins. During that window the hardware emits digital silence — speech in the gap is lost at the hardware level, not in InkIt's buffer.

InkIt addresses this with a two-stage readiness signal:

| Stage | UI | Meaning |
| --- | --- | --- |
| `audioReady == false` | `HUDPreparingDot` — single softly pulsing dot | Mic profile still coming up; wait before speaking |
| `audioReady == true` | `HUDWaveform` driven by `inputLevel` | Device is delivering real audio; safe to speak |

`AppCoordinator.audioReady` resets to `false` at the start of each recording. `AudioCaptureService.onReady` fires exactly once per take on the main queue; the coordinator sets `audioReady = true`.

```mermaid
sequenceDiagram
    participant User
    participant Coordinator as AppCoordinator
    participant Capture as AudioCaptureService
    participant HUD as NotchHUD

    User->>Coordinator: Hotkey press
    Coordinator->>Coordinator: audioReady = false
    Coordinator->>Capture: preferredDeviceUID + start()
    HUD->>HUD: HUDPreparingDot (pulsing)
    Note over Capture: Bluetooth A2DP→HFP switch<br/>~200–500 ms digital silence
    alt Input level > readyLevelThreshold
        Capture->>Coordinator: onReady()
    else readyFallbackDelay elapses
        Capture->>Coordinator: onReady() (timer backstop)
    end
    Coordinator->>HUD: audioReady = true
    HUD->>HUD: HUDWaveform (live level)
    User->>User: Start speaking
```

The dot→waveform transition in the notch is intentional: it tells you when to begin talking, especially on Bluetooth.

<Info>
Settings warns when a Bluetooth device is pinned: *"Bluetooth mics use a narrowband profile that can lower transcription accuracy. A wired or built-in mic usually works better."* Pinning a wired mic prevents AirPods from hijacking capture even when they remain the macOS default.
</Info>

## Readiness backstop constants

Readiness is detected in the input tap. Each buffer's peak level is log-compressed to a normalized 0…1 float (with a −50 dB floor). When the level exceeds `readyLevelThreshold`, `signalReadyIfNeeded()` fires `onReady` on the main queue and cancels the fallback timer.
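
One plausible form of that level computation is sketched below. The exact curve InkIt uses is not shown on this page, so the linear-in-decibels mapping is an assumption, and `normalizedLevel(peak:floorDB:)` and `peak(of:)` are hypothetical names. Only the 0…1 range and the −50 dB floor come from the description above.

```swift
import Foundation

/// Peak absolute sample value in one buffer of Float32 samples.
func peak(of samples: [Float]) -> Float {
    samples.reduce(0) { max($0, abs($1)) }
}

/// Maps a linear peak amplitude to a normalized 0...1 meter level.
/// Assumption: the mapping is linear in decibels between floorDB and 0 dBFS.
func normalizedLevel(peak: Float, floorDB: Float = -50) -> Float {
    guard peak > 0 else { return 0 }          // log10(0) is -infinity
    let db = 20 * log10(peak)                 // amplitude to dBFS
    let scaled = (db - floorDB) / -floorDB    // floorDB maps to 0, 0 dBFS maps to 1
    return min(max(scaled, 0), 1)             // clamp anything outside the range
}
```

Under this assumed mapping, a normalized level of 0.03 corresponds to a peak of about −48.5 dBFS, so true digital silence (all-zero samples) always stays at 0.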

<ParamField body="readyLevelThreshold" type="Float" default="0.03">
Normalized peak level above which the device is treated as genuinely delivering audio, as opposed to the digital silence a Bluetooth mic emits during profile switch. Sits just above the meter noise floor.
</ParamField>

<ParamField body="readyFallbackDelay" type="TimeInterval" default="0.6">
Seconds after `start()` before readiness is signaled unconditionally. Covers the worst-case Bluetooth profile switch so the HUD never stays stuck on the preparing cue in a silent room where no buffer crosses the threshold.
</ParamField>

`signalReadyIfNeeded()` is idempotent: whichever path wins (signal or timer), the fallback `DispatchWorkItem` is cancelled and `hasSignaledReady` prevents duplicate callbacks.
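
A minimal sketch of that idempotent pattern, not InkIt's actual code. The `ReadinessGate` type, its `arm` and `observe` methods, and the injectable queue are hypothetical. The constant names and defaults, `hasSignaledReady`, `signalReadyIfNeeded()`, and the `DispatchWorkItem` fallback come from the text above.

```swift
import Foundation

final class ReadinessGate: @unchecked Sendable {
    static let readyLevelThreshold: Float = 0.03
    static let readyFallbackDelay: TimeInterval = 0.6

    private let queue: DispatchQueue   // must be serial; all state is touched only on it
    private let onReady: () -> Void
    private var hasSignaledReady = false
    private var fallback: DispatchWorkItem?

    init(queue: DispatchQueue = .main, onReady: @escaping () -> Void) {
        self.queue = queue
        self.onReady = onReady
    }

    /// Resets state for a new take and schedules the timer backstop.
    func arm(fallbackDelay: TimeInterval = ReadinessGate.readyFallbackDelay) {
        queue.async {
            self.fallback?.cancel()
            self.hasSignaledReady = false
            let item = DispatchWorkItem { [weak self] in self?.signalReadyIfNeeded() }
            self.fallback = item
            self.queue.asyncAfter(deadline: .now() + fallbackDelay, execute: item)
        }
    }

    /// Called with each buffer's normalized level; safe to call from any thread.
    func observe(level: Float) {
        guard level > Self.readyLevelThreshold else { return }
        queue.async { self.signalReadyIfNeeded() }
    }

    /// Whichever path arrives first wins; later calls are no-ops.
    private func signalReadyIfNeeded() {
        guard !hasSignaledReady else { return }
        hasSignaledReady = true
        fallback?.cancel()
        fallback = nil
        onReady()
    }
}
```

Funneling both the level path and the timer path through one serial queue means the `hasSignaledReady` check needs no lock, and `onReady` is delivered on that same queue.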

## Unplugged fallback

When a pinned UID no longer matches any attached input device:

- Capture silently routes to the system default — dictation continues.
- Settings shows: *"Pinned mic isn't connected — using the system default until it's back"*
- The stored UID is **not** cleared; reconnecting the device restores the pin automatically.
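
The decision behind the Settings caption can be sketched as a pure function. The names `InputDevice`, `MicAdvisory`, and `advisory(pinnedUID:attached:)` are hypothetical, and the assumption that no caption is shown while **System default** is selected is inferred from the statement that Settings warns when a Bluetooth device is pinned.

```swift
// Hypothetical model of one attached input device.
struct InputDevice {
    let uid: String
    let name: String
    let isBluetooth: Bool
}

// Hypothetical caption states for the microphone picker.
enum MicAdvisory: Equatable {
    case noAdvisory
    case pinnedDeviceMissing   // fallback to system default is active
    case bluetoothAccuracy     // narrowband profile may lower accuracy
}

/// The stored UID is only read here, never cleared, so a reconnected
/// device matches again and the warning disappears on the next refresh.
func advisory(pinnedUID: String, attached: [InputDevice]) -> MicAdvisory {
    guard !pinnedUID.isEmpty else { return .noAdvisory }   // assumed: no caption for System default
    guard let device = attached.first(where: { $0.uid == pinnedUID }) else {
        return .pinnedDeviceMissing
    }
    return device.isBluetooth ? .bluetoothAccuracy : .noAdvisory
}
```

Re-evaluating this function whenever the device list changes is enough to make the warning appear on unplug and clear on reconnect without touching the stored preference.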

## Troubleshooting

| Symptom | Likely cause | What to check |
| --- | --- | --- |
| First word clipped on AirPods | Spoke during A2DP→HFP switch | Wait for waveform (not pulsing dot) before speaking |
| Transcription quality dropped after connecting headphones | Bluetooth narrowband HFP profile | Pin built-in or wired mic in Settings |
| Picker shows warning but dictation works | Pinned device unplugged | Reconnect device or switch to System default |
| HUD stuck on preparing dot > 1 s | Audio start failure or permission issue | Microphone permission; check for `Audio start failed` error |
| Wrong mic despite pin | Stale engine state (rare) | Preference applies every take; restart InkIt if behavior persists |

Enable **Settings → General → Advanced → Debug logging** and inspect `~/Library/Logs/InkIt-debug.log` for `startDictation` traces when diagnosing capture routing.

## Related pages

<CardGroup>
<Card title="Dictation pipeline" href="/dictation-pipeline">
Hotkey press through 16 kHz PCM capture, Cartesia STT streaming, and paste — where pinned-device audio enters the pipeline.
</Card>
<Card title="Dictation state machine" href="/dictation-state-machine">
`DictationState` lifecycle and how `audioReady` couples to the recording HUD during `.recording`.
</Card>
<Card title="Settings reference" href="/settings-reference">
Full `SettingsStore` key inventory including `preferredInputDeviceUID`.
</Card>
<Card title="Permissions model" href="/permissions-model">
Microphone permission requirements before `AudioCaptureService.start()` can succeed.
</Card>
<Card title="Runtime troubleshooting" href="/runtime-troubleshooting">
Bluetooth mic profile delay, debug logging, and other non-API runtime issues.
</Card>
</CardGroup>
