audio-voice-recovery

Audio forensics and voice recovery guidelines for CSI-level audio analysis. This skill should be used when recovering voice from low-quality or low-volume audio, enhancing degraded recordings, performing forensic audio analysis, or transcribing difficult audio. Triggers on tasks involving audio enhancement, noise reduction, voice isolation, forensic authentication, or audio transcription.


Forensic Audio Research: Audio Voice Recovery Best Practices

Comprehensive audio forensics and voice recovery guide providing CSI-level capabilities for recovering voice from low-quality, low-volume, or damaged audio recordings. Contains 45 rules across 8 categories, prioritized by impact to guide audio enhancement, forensic analysis, and transcription workflows.

When to Apply

Reference these guidelines when:

  • Recovering voice from noisy or low-quality recordings
  • Enhancing audio for transcription or legal evidence
  • Performing forensic audio authentication
  • Analyzing recordings for tampering or splices
  • Building automated audio processing pipelines
  • Transcribing difficult or degraded speech

Rule Categories by Priority

| Priority | Category | Impact | Prefix | Rules |
| --- | --- | --- | --- | --- |
| 1 | Signal Preservation & Analysis | CRITICAL | `signal-` | 5 |
| 2 | Noise Profiling & Estimation | CRITICAL | `noise-` | 5 |
| 3 | Spectral Processing | HIGH | `spectral-` | 6 |
| 4 | Voice Isolation & Enhancement | HIGH | `voice-` | 7 |
| 5 | Temporal Processing | MEDIUM-HIGH | `temporal-` | 5 |
| 6 | Transcription & Recognition | MEDIUM | `transcribe-` | 5 |
| 7 | Forensic Authentication | MEDIUM | `forensic-` | 5 |
| 8 | Tool Integration & Automation | LOW-MEDIUM | `tool-` | 7 |

Quick Reference

1. Signal Preservation & Analysis (CRITICAL)

2. Noise Profiling & Estimation (CRITICAL)

3. Spectral Processing (HIGH)

4. Voice Isolation & Enhancement (HIGH)

5. Temporal Processing (MEDIUM-HIGH)

6. Transcription & Recognition (MEDIUM)

7. Forensic Authentication (MEDIUM)

8. Tool Integration & Automation (LOW-MEDIUM)

Essential Tools

| Tool | Purpose | Install |
| --- | --- | --- |
| FFmpeg | Format conversion, filtering | `brew install ffmpeg` |
| SoX | Noise profiling, effects | `brew install sox` |
| Whisper | Speech transcription | `pip install openai-whisper` |
| librosa | Python audio analysis | `pip install librosa` |
| noisereduce | ML noise reduction | `pip install noisereduce` |
| Audacity | Visual editing | `brew install audacity` |

Workflow Scripts (Recommended)

Use the bundled scripts to generate objective baselines, create a workflow plan, and verify results.

  • scripts/preflight_audio.py - Generate a forensic preflight report (JSON or Markdown).
  • scripts/plan_from_preflight.py - Create a workflow plan template from the preflight report.
  • scripts/compare_audio.py - Compare objective metrics between baseline and processed audio.

Example usage:

```bash
# 1) Analyze and capture baseline metrics
python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

# 2) Generate a workflow plan template
python3 skills/.experimental/audio-voice-recovery/scripts/plan_from_preflight.py --preflight preflight.json --out plan.md

# 3) Compare baseline vs processed metrics
python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md
```

Forensic Preflight Workflow (Do This Before Any Changes)

Align preflight with SWGDE Best Practices for the Enhancement of Digital Audio (20-a-001) and SWGDE Best Practices for Forensic Audio (08-a-001). Establish an objective baseline state and plan the workflow so processing does not introduce clipping, artifacts, or false "done" confidence. Use scripts/preflight_audio.py to capture baseline metrics and preserve the report with the case file.

Capture and record before processing:

  • Record evidence identity and integrity: path, filename, file size, SHA-256 checksum, source, format/container, codec
  • Record signal integrity: sample rate, bit depth, channels, duration
  • Measure baseline loudness and levels: LUFS/LKFS, true peak, peak, RMS, dynamic range, DC offset
  • Detect clipping and document clipped-sample percentage, peak headroom, exact time ranges
  • Identify noise profile: stationary vs non-stationary, dominant noise bands, SNR estimate
  • Locate the region of interest (ROI) and document time ranges and changes over time
  • Inspect spectral content and estimate speech-band energy and intelligibility risk
  • Scan for temporal defects: dropouts, discontinuities, splices, drift
  • Evaluate channel correlation and phase anomalies (if stereo)
  • Extract and preserve metadata: timestamps, device/model tags, embedded notes
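Several of the level measurements above can be computed directly from PCM samples. A minimal stdlib sketch for a float signal normalized to [-1.0, 1.0] (the 0.999 clip threshold is an illustrative choice, not a standards-mandated value):

```python
import math

def baseline_levels(samples, clip_threshold=0.999):
    """Peak, RMS, and DC offset for a float signal in [-1.0, 1.0],
    plus the clipped-sample percentage at an illustrative threshold."""
    n = len(samples)
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    return {
        "peak_dbfs": 20 * math.log10(peak) if peak > 0 else float("-inf"),
        "rms_dbfs": 20 * math.log10(rms) if rms > 0 else float("-inf"),
        "dc_offset": sum(samples) / n,
        "clipped_pct": 100.0 * sum(1 for s in samples if abs(s) >= clip_threshold) / n,
    }

# A full-scale 440 Hz sine: peak 0 dBFS, RMS about -3.01 dBFS, no DC offset
sine = [math.sin(2 * math.pi * 440 * i / 48000) for i in range(48000)]
metrics = baseline_levels(sine)
```

Note these are simple peak/RMS figures; loudness per ITU-R BS.1770 (LUFS) and true peak require gating and oversampling that a dedicated tool should provide.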

Procedure:

  1. Prepare a forensic working copy, verify hashes, and preserve the original untouched.
  2. Locate ROI and target signal; document exact time ranges and changes across the recording.
  3. Assess challenges to intelligibility and signal quality; map challenges to mitigation strategies.
  4. Identify required processing and plan a workflow order that avoids unwanted artifacts. Generate a plan draft with scripts/plan_from_preflight.py and complete it with case-specific decisions.
  5. Measure baseline loudness and true peak per ITU-R BS.1770 / EBU R 128 and record peak/RMS/DC offset.
  6. Detect clipping and dropouts; if clipping is present, declip first or pause and document limitations.
  7. Inspect spectral content and noise type; collect representative noise profile segments and estimate SNR.
  8. If stereo, evaluate channel correlation and phase; document anomalies.
  9. Create a baseline listening log (multiple devices) and define success criteria for intelligibility and listenability.
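Step 7's SNR estimate can be roughed out from a noise-only segment and a speech-bearing segment, under the simplifying assumption that the noise is stationary (the bundled preflight script may estimate it differently):

```python
import math
import random

def estimate_snr_db(mixture_segment, noise_segment, floor=1e-12):
    """Estimate SNR in dB by subtracting the noise-only power estimate
    from the speech-plus-noise mixture power. Assumes stationary noise."""
    p_mix = sum(s * s for s in mixture_segment) / len(mixture_segment)
    p_noise = sum(s * s for s in noise_segment) / len(noise_segment)
    p_speech = max(p_mix - p_noise, floor)  # guard against negative estimates
    return 10 * math.log10(p_speech / p_noise)

# Synthetic check: a 0.5-amplitude tone in gaussian noise (sigma = 0.01)
random.seed(0)
noise = [random.gauss(0.0, 0.01) for _ in range(48000)]
mixture = [0.5 * math.sin(2 * math.pi * 440 * i / 48000) + n
           for i, n in enumerate(noise)]
snr_db = estimate_snr_db(mixture, noise)  # close to 10*log10(0.125/0.0001), ~31 dB
```

With non-stationary noise this single-number estimate is optimistic; report it per segment and say how the noise-only region was chosen.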

Failure-pattern guardrails:

  • Do not process until every preflight field is captured.
  • Document every process, setting, software version, and time segment to enable repeatability.
  • Compare each processed output to the unprocessed input and assess progress toward intelligibility and listenability.
  • Avoid over-processing; review removed signal (filter residue) to avoid removing target signal components.
  • Keep intermediate files uncompressed and preserve sample rate/bit depth when moving between tools.
  • Perform a final review against the original; if unsatisfactory, revise or stop and report limitations.
  • If the request is not achievable, communicate limitations and do not declare completion.
  • Require objective metrics and A/B listening before declaring completion.
  • Do not rely solely on objective metrics; corroborate with critical listening.
  • Take listening breaks to avoid ear fatigue during extended reviews.

Quick Enhancement Pipeline

```bash
# 1. Analyze original (run preflight and capture baseline metrics)
python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

# 2. Create working copy with checksum
cp evidence.wav working.wav
sha256sum evidence.wav > evidence.sha256

# 3. Apply enhancement
ffmpeg -i working.wav -af "\
  highpass=f=80,\
  adeclick=w=55:o=75,\
  afftdn=nr=12:nf=-30:nt=w,\
  equalizer=f=2500:t=q:w=1:g=3,\
  loudnorm=I=-16:TP=-1.5:LRA=11\
" enhanced.wav

# 4. Transcribe
whisper enhanced.wav --model large-v3 --language en

# 5. Verify original unchanged
sha256sum -c evidence.sha256

# 6. Verify improvement (objective comparison + A/B listening)
python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md
```
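For orientation, the kind of delta an objective comparison reports can be sketched as follows (the bundled compare_audio.py computes richer metrics; the peak/RMS deltas here are only illustrative):

```python
import math

def level_dbfs(samples):
    """Peak and RMS levels in dBFS for a signal normalized to [-1.0, 1.0]."""
    def to_db(v):
        return 20 * math.log10(v) if v > 0 else float("-inf")
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return to_db(peak), to_db(rms)

def compare(before, after):
    """Report before-to-after peak and RMS deltas in dB."""
    (pb, rb), (pa, ra) = level_dbfs(before), level_dbfs(after)
    return {"peak_delta_db": pa - pb, "rms_delta_db": ra - rb}

# A 2x gain change should surface as roughly +6.02 dB on both metrics
quiet = [0.1 * math.sin(2 * math.pi * 440 * i / 48000) for i in range(48000)]
report = compare(quiet, [2 * s for s in quiet])
```

Level deltas alone cannot confirm intelligibility gains, which is why the guardrails above pair objective metrics with A/B critical listening.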

How to Use

Read individual reference files for detailed explanations and code examples:

Reference Files

| File | Description |
| --- | --- |
| AGENTS.md | Complete compiled guide with all rules |
| references/_sections.md | Category definitions and ordering |
| assets/templates/_template.md | Template for new rules |
| metadata.json | Version and reference information |


Source repository: pproenca/dot-skills, by pproenca.
