System Overview
The system uses 4 stationary spotter drones arranged around a perimeter, each carrying a fixed multi-camera array. Each drone produces bearing measurements (azimuth + elevation) to any flying target it sees. A ground station fuses bearings from all spotters into precise 3D tracks, then cues a free-flying chaser drone to follow individual threats at close range.
Component summary
| Component | Role | Mobility | Compute |
|---|---|---|---|
| Spotter drones (×4) | Detect & report bearings to all visible targets | Stationary (tethered hover) | FPGA only — sync & framing |
| Tether base stations (×4) | Power, fiber, mic array, RTK rover | Stationary on ground | None (passthrough) |
| Ground station | Fusion, tracking, classification, planning | Stationary | 4× GPU workstation |
| Chaser drone (×1+) | Close-range follow of cued target | Free-flying | Onboard Jetson for visual servoing |
Design principles
- Decouple detection from pursuit. Spotters never move during operation; chaser does the dynamic work.
- Dumb sensors, smart center. All AI on the ground; drones just stream timestamped frames.
- Fixed cameras, no gimbals. No moving optics on spotters; reliability + zero slew latency.
- Fiber over RF for stationary nodes. Better bandwidth, latency, and security.
- Bearings-only fusion. Triangulation across 4+ spotters gives roughly 1 m range accuracy and ~0.15 m cross-range accuracy at 1 km (141 m diagonal baseline).
- Optical + acoustic redundancy. Each modality covers the other's weather/noise blind spots.
Spotter Constellation Geometry
Four spotter drones in a 100 m × 100 m square, with staggered altitudes from 60–200 m. The horizontal layout gives full 360° azimuth coverage; the vertical staggering improves elevation accuracy by ~5–8× over a coplanar layout.
Top-down view
Side view (altitude staggering)
Why stagger altitudes?
If all drones hover at the same altitude, every baseline is horizontal, so the sightlines to a target near that altitude are nearly coplanar and the vertical component of the solution is poorly conditioned (high geometric dilution of precision). A vertical spread comparable to the horizontal baseline produces a well-conditioned 3D solution.
| Configuration | Horizontal accuracy @ 1 km | Vertical accuracy @ 1 km |
|---|---|---|
| All drones at 100 m (coplanar) | ±1.1 m | ±8–15 m (poor near drone altitude) |
| Staggered 60/105/155/200 m | ±1.1 m | ±1.5–2 m |
Minimum viable configuration
Four spotters is the comfortable design point, not a hard requirement. The system degrades gracefully as you reduce the count:
| # Spotters | What you get | What you give up |
|---|---|---|
| 1 | Per-drone stereo from 4 leg cameras (~70 cm baseline). 360° azimuth coverage. Bearing & rough range out to ~150 m. | No long-range 3D. Stereo range error grows quadratically with range. |
| 2 (minimum recommended) | True triangulation with baseline = drone separation (50–200 m). Range accuracy of roughly 0.75–3 m at 1 km, depending on baseline (at σθ = 150 µrad). Full 3D tracks. | No outlier rejection. Single occlusion blocks a target. Elevation poorly conditioned if both drones at similar altitude. |
| 3 | Over-determined system: outlier rejection works. One drone can fail without losing tracks. | Coverage geometry has gaps if poorly placed. |
| 4 (sweet spot) | Square layout = uniform coverage. Two redundant baselines per axis. Vertical staggering possible. Robust to single-drone failure. | Logistics start to matter (4× tethers, power, calibration). |
| 5+ | Each additional drone improves accuracy and extends coverage. With 9+ in a grid, becomes a continuous sensing fabric. | Scales linearly in cost & complexity. |
Per-Drone Camera Architecture
Each spotter carries 5 fixed cameras: one upward-facing fisheye for hemispheric detection, and four wide-tele cameras on the legs for 360° azimuthal coverage. No gimbals, no yawing during operation.
Camera placement (top-down)
Vertical FOV (side view)
Why fixed cameras + low altitude
- Fly the spotter low (25–40 m) so the upward fisheye sees only sky → clean background, easy detection.
- No moving parts on cameras → no slew latency, no gimbal failures, lower mass, simpler calibration.
- 4 simultaneous detection feeds per drone → no slew delay when threat appears in any direction.
- Stereo as a bonus: adjacent leg cameras have overlapping FOV → ~50 cm baseline gives short-range stereo depth out to ~150 m.
- Threats are mostly above the drone anyway. Blind cone below is acceptable for sky-target tracking.
Camera spec summary (per drone)
| Camera | Position | FOV | Resolution | Role |
|---|---|---|---|---|
| Fisheye | Top, facing up | ~180° | 4K | Hemispheric detection |
| Tele 1–4 | Each leg, radial outward | 50° | 4K, 60 fps | Bearing measurement, range tracking |
Bearing-Only Triangulation
Each spotter reports an angular vector to the target (azimuth, elevation). With ≥2 spotters seeing the same target, the intersection of sightlines yields a 3D position. With 4 spotters, the system is over-determined → outlier rejection + uncertainty estimation come for free.
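
A minimal sketch of the sightline intersection, assuming each spotter contributes an origin and a direction vector in a shared world frame. Measured sightlines never meet exactly, so the sketch returns the point that minimizes the summed squared perpendicular distance to all of them. The function names (`triangulate`, `solve3`) and the sample target position are illustrative and not taken from the system.

```python
import math

def unit(v):
    """Normalize a 3-vector."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(3):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]

def triangulate(origins, directions):
    """Least-squares point closest to all sightlines.

    Each sightline i has origin p_i and unit direction d_i. The projector
    (I - d_i d_i^T) removes the component along the line, so the normal
    equations are: sum(I - d d^T) x = sum((I - d d^T) p).
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for p, d in zip(origins, directions):
        d = unit(d)
        for i in range(3):
            for j in range(3):
                proj = (1.0 if i == j else 0.0) - d[i] * d[j]
                A[i][j] += proj
                b[i] += proj * p[j]
    return solve3(A, b)

# Four spotters at the corners of a 100 m square, altitudes staggered.
spotters = [[0.0, 0.0, 60.0], [100.0, 0.0, 105.0],
            [100.0, 100.0, 155.0], [0.0, 100.0, 200.0]]
target = [300.0, 400.0, 120.0]  # hypothetical target position
# Noise-free bearings pointing from each spotter to the target.
bearings = [[t - s for t, s in zip(target, p)] for p in spotters]
print([round(c, 3) for c in triangulate(spotters, bearings)])
```

The same function works with two sightlines, but only with three or more do the per-sightline residuals become usable for outlier rejection.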
Two-spotter triangulation (2D, conceptual)
Triangulation accuracy in the 100 m × 100 m camp
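With 4 drones at the corners, the longest available baseline is the diagonal (~141 m). Assumed angular precision is σθ = 150 µrad, which covers centroid error plus atmospheric shimmer and is roughly 0.6–0.7 px on a 4K sensor spanning a 50° FOV (one pixel subtends about 230–240 µrad).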
| Target distance | Range accuracy ΔR | Cross-range accuracy |
|---|---|---|
| 200 m | ±0.04 m | ±0.03 m |
| 500 m | ±0.27 m | ±0.08 m |
| 1 km | ±1.1 m | ±0.15 m |
| 2 km | ±4.3 m | ±0.30 m |
| 3 km | ±9.6 m | ±0.45 m |
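
The table values are consistent with the usual small-angle approximations for bearing-only triangulation: ΔR ≈ R²·σθ/B for range and R·σθ for cross-range. The sketch below recomputes the table under that assumption; the function names are illustrative.

```python
def range_error_m(range_m, baseline_m, sigma_rad):
    """Range uncertainty for bearing-only triangulation (small-angle approximation)."""
    return range_m ** 2 * sigma_rad / baseline_m

def cross_range_error_m(range_m, sigma_rad):
    """Cross-range uncertainty: the angular error projected at the target range."""
    return range_m * sigma_rad

SIGMA_RAD = 150e-6   # angular precision stated in the text
BASELINE_M = 141.0   # diagonal of the 100 m x 100 m square

for r in (200, 500, 1000, 2000, 3000):
    dr = range_error_m(r, BASELINE_M, SIGMA_RAD)
    dc = cross_range_error_m(r, SIGMA_RAD)
    print(f"{r:>5} m  range +/-{dr:.2f} m  cross-range +/-{dc:.2f} m")
```

Because range error scales with R², doubling the target distance quadruples it, while doubling the baseline halves it.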
Fusion algorithm (ground station)
- Per-camera detection: YOLO/RT-DETR on each frame → bounding boxes with class scores.
- Per-camera tracking: ByteTrack assigns persistent track IDs within each camera stream.
- Bearing extraction: bbox centroid + camera intrinsics + drone pose (RTK GPS + IMU) → unit sightline vector in world frame.
- Multi-camera association: JPDA or MHT matches tracks across the 20 camera streams (4 drones × 5 cameras).
- 3D state estimation: Extended Kalman Filter per target, state = [x,y,z,vx,vy,vz], measurement = unit sightline vectors.
- Track output: 60 Hz updates with covariance ellipsoids; cued downstream to chaser controller.
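
The bearing extraction step can be sketched as follows. This is a simplified illustration, not the system's implementation: it assumes an ideal pinhole camera with no lens distortion, the principal point at the image center, and a camera orientation described only by azimuth and elevation (no roll). The world frame is taken as x east, y north, z up, with azimuth measured counterclockwise from +x. All names and defaults are hypothetical.

```python
import math

def pixel_to_sightline(u, v, width=3840, height=2160, hfov_deg=50.0,
                       cam_az_deg=0.0, cam_el_deg=0.0):
    """Convert a bbox centroid (u, v) in pixels to a unit sightline in the world frame."""
    # Focal length in pixels from the horizontal field of view.
    f_px = (width / 2) / math.tan(math.radians(hfov_deg) / 2)
    # Ray in the camera frame: x right, y down, z forward.
    xc, yc, zc = u - width / 2, v - height / 2, f_px

    az, el = math.radians(cam_az_deg), math.radians(cam_el_deg)
    ca, sa, ce, se = math.cos(az), math.sin(az), math.cos(el), math.sin(el)
    # Camera axes expressed in the world frame.
    forward = (ce * ca, ce * sa, se)
    right = (sa, -ca, 0.0)
    down = (se * ca, se * sa, -ce)

    w = [xc * right[i] + yc * down[i] + zc * forward[i] for i in range(3)]
    n = math.sqrt(sum(c * c for c in w))
    return [c / n for c in w]

def to_az_el_deg(d):
    """Azimuth and elevation (degrees) of a unit vector."""
    return math.degrees(math.atan2(d[1], d[0])), math.degrees(math.asin(d[2]))

# A detection at the image center looks straight down the optical axis.
print(to_az_el_deg(pixel_to_sightline(1920, 1080, cam_az_deg=30.0, cam_el_deg=10.0)))
```

In a full pipeline the camera axes would come from the calibrated mount rotation combined with the drone's IMU attitude, and the sightline origin from the RTK position.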
Acoustic Detection & Fusion
Acoustic sensing complements optical: it covers what vision can't (fog, behind obstacles, night) and provides early warning. Mics live on the tether base stations, not on the drones (rotor self-noise is fatal to airborne mics).
Tetrahedral mic array (per base station)
Active rotor noise cancellation
Each spotter drone sends real-time ESC telemetry (rotor RPM @ 200 Hz) down the fiber. The ground station knows the exact instantaneous blade-pass frequency + harmonics of every spotter and uses an adaptive filter (LMS/RLS) to subtract those signatures from each microphone stream.
- Suppression: 15–25 dB against own constellation rotors
- Residual: aerodynamic broadband noise (un-cancellable, but doesn't mask narrow-band targets)
- Result: mics can detect foreign drone signatures even with own drones aloft
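
A minimal sketch of the adaptive cancellation idea, using a two-weight LMS canceller with sine and cosine references at a known blade-pass frequency. It is a deliberately reduced case: one rotor tone, a constant frequency, no harmonics, and synthetic signals. The function name, step size `mu`, and all signal parameters are assumptions chosen for illustration.

```python
import math

def lms_rotor_cancel(mic, blade_pass_hz, fs, mu=0.01):
    """Subtract a tone at blade_pass_hz from mic samples with a two-weight LMS filter.

    The references are sin/cos at the known frequency, so the two weights
    learn the amplitude and phase of the rotor tone as seen by the mic.
    """
    w_sin = w_cos = 0.0
    cleaned = []
    for k, d in enumerate(mic):
        phase = 2 * math.pi * blade_pass_hz * k / fs
        x_sin, x_cos = math.sin(phase), math.cos(phase)
        e = d - (w_sin * x_sin + w_cos * x_cos)  # residual after cancellation
        w_sin += 2 * mu * e * x_sin              # LMS weight update
        w_cos += 2 * mu * e * x_cos
        cleaned.append(e)
    return cleaned

fs, n = 8000, 8000
# Own-rotor tone at 200 Hz with a phase the filter does not know in advance.
rotor = [1.0 * math.sin(2 * math.pi * 200 * k / fs + 0.7) for k in range(n)]
# Weaker foreign-drone tone at a different frequency.
target = [0.1 * math.sin(2 * math.pi * 330 * k / fs) for k in range(n)]
mic = [r + t for r, t in zip(rotor, target)]

cleaned = lms_rotor_cancel(mic, 200.0, fs)

def rms(x):
    return math.sqrt(sum(s * s for s in x) / len(x))

# After convergence the output should be close to the foreign tone alone.
leftover = [c - t for c, t in zip(cleaned[-1000:], target[-1000:])]
print(f"rotor RMS before: {rms(rotor):.3f}, leftover RMS after: {rms(leftover):.4f}")
```

A fielded version would track the time-varying RPM from ESC telemetry and run one canceller per harmonic and per rotor. Suppression on synthetic tones like these is far better than what real acoustics allow.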
Realistic acoustic detection range
| Target | Source level @ 1m | Quiet range | Noisy range |
|---|---|---|---|
| Mavic-class quad | 75 dB | ~300 m | ~30 m |
| FPV racing drone | 90 dB | ~1,800 m | ~180 m |
| Heavy hexa/octo | 90–95 dB | ~2,000 m | ~250 m |
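
The ranges in this table are roughly what simple spherical spreading (a 20·log10(r) loss) predicts if the detection threshold at the microphone is about 25 dB in quiet conditions and about 45 dB in noisy ones. Those two thresholds are not stated in the source; they are assumptions inferred from the table, and the sketch ignores atmospheric absorption, wind, and array gain.

```python
def detection_range_m(source_level_db, threshold_db):
    """Range at which a source heard at source_level_db (at 1 m) falls to threshold_db.

    Assumes spherical spreading only: level(r) = source_level - 20*log10(r).
    """
    return 10 ** ((source_level_db - threshold_db) / 20)

QUIET_THRESHOLD_DB = 25.0  # assumed, inferred from the table
NOISY_THRESHOLD_DB = 45.0  # assumed, inferred from the table

for name, level in (("Mavic-class quad", 75.0), ("FPV racing drone", 90.0)):
    quiet = detection_range_m(level, QUIET_THRESHOLD_DB)
    noisy = detection_range_m(level, NOISY_THRESHOLD_DB)
    print(f"{name}: quiet ~{quiet:.0f} m, noisy ~{noisy:.0f} m")
```

Every 20 dB of extra background noise cuts the detection range by a factor of 10, which is why the noisy-range column is so much shorter.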
Optical + acoustic fusion
The two modalities are complementary failure-domain partners:
| Condition | Optical | Acoustic |
|---|---|---|
| Clear day | excellent | good |
| Fog / rain | poor | good |
| Night | poor (without IR) | good |
| Strong wind | good | poor |
| Behind tree/structure | blocked | diffracts around |
Chaser Drone
The chaser is the only mobile drone in the system. It's free-flying (battery, RF link), launched from the ground station area, cued by the spotter constellation, and follows individual targets at close range (10–30 m typical).
Primary mission: identify, then act
The spotter constellation knows where a threat is. The chaser's job is to determine what it is and respond accordingly. At close range it gets the high-resolution imagery the spotters can't, runs a fine-grained classifier, and selects an action.
| Phase | Function | Output |
|---|---|---|
| 1. Cue | Receive 3D track from constellation, fly to standoff position (15 m behind, 5 m below). | Visual lock acquired |
| 2. Identify | Close-range tele imagery + fine-grained classifier: species, drone model, payload, markings, behavior, transponder/RF emissions. | Classification + confidence |
| 3. Decide | Friend / foe / unknown / observation-of-interest. Rules engine on ground station applies policy. | Action selection |
| 4. Act | Action depends on classification (see below). | Execution |
Action options by classification
| Classification | Action |
|---|---|
| Friend / authorized | Log, disengage, return chaser to standby. Keep passive track via constellation. |
| Wildlife (study target) | Maintain trail at standoff distance, record imagery and trajectory, follow to nest / roost / origin. |
| Unknown / unidentified | Continue close observation, escalate to human operator for decision. |
| Foe / unauthorized intrusion | Operator-defined response: persistent track, warn (broadcast / strobe), follow to source, or hand off to a kinetic asset. Note: kinetic engagement is out of scope of this passive sensing system. |
Cue + intercept geometry
Two-stage control
- Long-range (cued): chaser receives 30 Hz target state from ground, flies to a "trail position" computed by intercept solver — typically 15 m behind and 5 m below predicted threat position.
- Close-range (visual servoing): within ~50 m, chaser's onboard gimbaled tele camera locks on. Onboard Jetson runs a fast tracker; PID controllers on yaw/pitch/throttle keep threat centered + maintain target bbox size (proxy for distance).
- Loss recovery: if chaser loses threat, ground station continues to provide cues from the constellation — no global re-acquisition needed.
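
The two stages can be sketched as below. The trail-position function is a simple stand-in for the intercept solver (constant-velocity prediction, offset along the target's horizontal heading), and the PID class is a textbook controller. The lookahead time, gains, and all names are hypothetical.

```python
import math

def trail_position(pos, vel, lookahead_s=0.5, behind_m=15.0, below_m=5.0):
    """Trail point behind and below the predicted target position.

    pos and vel are [x, y, z] in meters and m/s, with z up.
    """
    predicted = [p + v * lookahead_s for p, v in zip(pos, vel)]
    speed_h = math.hypot(vel[0], vel[1])
    if speed_h < 1e-6:
        # Hovering target has no heading: sit directly below it.
        return [predicted[0], predicted[1], predicted[2] - below_m]
    hx, hy = vel[0] / speed_h, vel[1] / speed_h
    return [predicted[0] - behind_m * hx,
            predicted[1] - behind_m * hy,
            predicted[2] - below_m]

class PID:
    """Minimal PID controller for one axis of the visual servo loop."""

    def __init__(self, kp, ki=0.0, kd=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Long-range stage: target heading east at 10 m/s.
print(trail_position([100.0, 0.0, 50.0], [10.0, 0.0, 0.0]))

# Close-range stage: yaw error is the bbox center offset as a fraction of frame width.
yaw_pid = PID(kp=2.0)
bbox_cx, frame_w = 2400.0, 3840.0
print(yaw_pid.step((bbox_cx - frame_w / 2) / frame_w, dt=1 / 60))
```

The same controller shape applies to pitch (vertical bbox offset) and throttle (difference between the measured and desired bbox size).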
What the chaser carries
| Component | Spec | Mass |
|---|---|---|
| Airframe | ~Mavic-class quad, 30+ min flight | ~900 g |
| Gimbaled tele camera | 4K, 70–200 mm equiv, 3-axis gimbal | ~400 g |
| Onboard compute | Jetson Orin Nano (visual servoing only) | ~150 g |
| RTK GPS + RF link | 5.8 GHz to ground, RTK rover | ~80 g |
Tether & Communications
Each spotter is tethered to its base station with a hybrid power + fiber + safety cable. The tether removes the need for onboard batteries, avoids RF interference and bandwidth limits, and confines the drone to a fixed station, which is exactly what we want for spotters.
Tether cross-section
Why fiber, not wireless?
| Property | 5.8 GHz / WiFi | Fiber over tether |
|---|---|---|
| Bandwidth | 100–500 Mbps | 10–100 Gbps |
| Latency | 5–20 ms | <1 ms (~5 µs/km) |
| Interference | RF, jamming, weather | none |
| Multi-drone bandwidth | shared spectrum, contention | dedicated per drone |
| Power on drone for radio | several watts | none (passive optics) |
| Time sync | ~1 ms over wireless PTP | sub-µs PTP over fiber |
What this enables
- Stream raw 4K @ 60 fps from all 5 cameras of all 4 drones simultaneously to ground compute. No on-drone inference needed.
- Workstation-class compute on the ground (4× RTX 4090 / H100) instead of Jetson on the drone.
- Drone becomes a dumb sensor head: cameras and an FPGA for sync, nothing else (the mics sit on the base stations). Lower mass, longer hover, fewer failure modes.
- Microsecond time sync across all 4 drones via PTP-over-fiber → bearing fusion is geometrically clean, not noise-limited by clock skew.
- RTK GPS corrections distributed over fiber from a single ground base → cm-level position accuracy on every drone.
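
To see why time sync matters for bearing fusion, consider how far a target moves across the sightline during the clock offset between two spotters. The sketch below converts that motion into an equivalent bearing error, assuming the worst case of a target moving perpendicular to the sightline. The function name is illustrative; the speed, range, and skew values are taken from figures elsewhere in this document.

```python
def skew_bearing_error_urad(speed_mps, range_m, skew_s):
    """Apparent bearing error (microradians) caused by a timestamp offset.

    Worst case: the target moves perpendicular to the sightline, so it
    shifts by speed * skew meters, seen from range_m away.
    """
    return speed_mps * skew_s / range_m * 1e6

# FPV drone at 40 m/s, 200 m away.
for label, skew in (("1 ms (wireless PTP)", 1e-3), ("1 us (PTP over fiber)", 1e-6)):
    err = skew_bearing_error_urad(40.0, 200.0, skew)
    print(f"{label}: {err:.1f} urad")
```

With millisecond skew the timing error alone is comparable to the assumed 150 µrad sensor precision; with microsecond skew it is negligible.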
Ground Station
The ground station is where all the intelligence lives. It receives raw sensor streams from the 4 spotters over fiber, runs detection, fusion, tracking, and classification, drives the operator UI, and cues the chaser drone.
Ground station block diagram
Data flow rates
| Stream | Direction | Bandwidth |
|---|---|---|
| Raw 4K @ 60 fps × 5 cams (per drone) | spotter → ground | ~30 Gbps uncompressed (at 12 bits/px) / ~50 Mbps H.265 |
| Mic array audio (per base) | base → ground | ~3 Mbps |
| ESC telemetry, IMU, RTK pose | spotter → ground | ~2 Mbps |
| Time sync (PTP) | bidirectional | negligible |
| Chaser cue | ground → chaser (RF) | ~10 KB/s |
| Chaser telemetry + video | chaser → ground (RF) | ~10 Mbps |
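
The uncompressed video figure can be checked with a short calculation. The sketch assumes 3840×2160 frames and 12 bits per pixel (for example 12-bit raw Bayer, or 8-bit YUV 4:2:0); the bit depth is an assumption, not something the source states.

```python
def raw_video_gbps(width=3840, height=2160, fps=60, bits_per_px=12, cams=5):
    """Uncompressed video bandwidth in Gbps for one spotter drone."""
    return width * height * fps * bits_per_px * cams / 1e9

per_drone = raw_video_gbps()
print(f"per drone: {per_drone:.1f} Gbps, four drones: {4 * per_drone:.1f} Gbps")
print(f"per drone at 24 bits/px: {raw_video_gbps(bits_per_px=24):.1f} Gbps")
```

At 24-bit RGB the per-drone figure doubles, so the pixel format is worth fixing early when sizing the fiber links.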
Specs, Range & Speed
Detection range (per camera)
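Pixels on target drive these ranges. With 4K across a 50° FOV (roughly 230 µrad per pixel), the detection ranges below correspond to about 2–3 px across the target body and the tracking ranges to about 4–5 px; requiring a ~10×10 px detection box or a ~20×20 px tracking box would cut each range by roughly 4×. Assumes wide-tele camera, daylight, sky background.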
| Target | Body size | Detection | Tracking |
|---|---|---|---|
| Micro target | ~12 cm | ~250 m | ~120 m |
| Small target | ~30 cm | ~600 m | ~300 m |
| Medium target | ~50 cm | ~1,000 m | ~500 m |
| Large target | 80–100 cm | ~1,800 m | ~900 m |
| Extra-large target | 100+ cm | ~2,000+ m | ~1,000 m |
| FPV drone | ~25 cm | ~400 m | ~200 m |
| Mavic-class quad | ~35 cm | ~600 m | ~300 m |
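
A short sketch for sanity-checking the table against a pixel budget. It uses the small-angle approximation and treats the 50° FOV as spread evenly over 3840 horizontal pixels, which is an approximation of the real lens projection. Function names are illustrative.

```python
import math

def ifov_rad(width_px=3840, hfov_deg=50.0):
    """Approximate angle subtended by one pixel, in radians."""
    return math.radians(hfov_deg) / width_px

def pixels_on_target(size_m, range_m, width_px=3840, hfov_deg=50.0):
    """Number of pixels spanned by a target of size_m at range_m."""
    return size_m / range_m / ifov_rad(width_px, hfov_deg)

def range_for_pixels(size_m, pixels, width_px=3840, hfov_deg=50.0):
    """Range at which a target of size_m spans the given number of pixels."""
    return size_m / (pixels * ifov_rad(width_px, hfov_deg))

for name, size, rng in (("Mavic-class quad", 0.35, 600.0),
                        ("FPV drone", 0.25, 400.0)):
    print(f"{name} at {rng:.0f} m: {pixels_on_target(size, rng):.1f} px")
print(f"0.35 m target spans 10 px at {range_for_pixels(0.35, 10):.0f} m")
```

Range scales inversely with the pixel requirement, so halving the pixels needed for detection doubles the detection range.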
Speed handling (frame rate + shutter)
For accurate tracking, motion blur should stay under ~1 px per exposure. With 4K @ 60 fps, 50° FOV, and adaptive shutter:
| Target | Velocity | Range | Required shutter | Trackable? |
|---|---|---|---|---|
| Slow target (gliding) | 15 m/s | 500 m | 1/500 s | trivial |
| Fast diving target | 90 m/s | 500 m | 1/2000 s | easy |
| FPV drone | 40 m/s | 200 m | 1/1000 s | easy |
| Cruise missile | 250 m/s | 1 km | 1/2000 s | fine |
| Hypersonic | 1,500 m/s | 1 km | 1/8000 s | edge of envelope |
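
The shutter column can be checked by computing how many pixels a target crosses during one exposure. The sketch assumes the worst case of motion perpendicular to the sightline and spreads the 50° FOV evenly over 3840 pixels; the function name is illustrative.

```python
import math

def blur_px(speed_mps, range_m, shutter_s, width_px=3840, hfov_deg=50.0):
    """Motion blur in pixels during one exposure (target crossing the sightline)."""
    ifov = math.radians(hfov_deg) / width_px      # radians per pixel
    angular_motion = speed_mps * shutter_s / range_m
    return angular_motion / ifov

rows = (
    ("Slow target (gliding)", 15.0, 500.0, 1 / 500),
    ("Fast diving target", 90.0, 500.0, 1 / 2000),
    ("FPV drone", 40.0, 200.0, 1 / 1000),
    ("Cruise missile", 250.0, 1000.0, 1 / 2000),
    ("Hypersonic", 1500.0, 1000.0, 1 / 8000),
)
for name, v, r, shutter in rows:
    print(f"{name}: {blur_px(v, r, shutter):.2f} px")
```

Blur depends on angular rate rather than speed alone, which is why a nearby FPV drone is harder on the shutter than a much faster target at 1 km.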
Coverage envelope (4-drone constellation, 100 m square)
| Zone | Range from center | What's tracked |
|---|---|---|
| Inner | 0–300 m | any threat, full 3D track, sub-meter accuracy |
| Mid | 300 m – 1 km | medium+ threats; micro targets detection-limited |
| Outer | 1–2 km | large threats, drones |
| Detection frontier | 2–3 km | large slow-moving targets against sky only |
Scaling: distributed constellation
Replace the single cell with a sparse grid of drones at 500–700 m spacing. Each drone participates in triangulation with all of its neighbors.
| Layout | # Drones | Coverage | Best baseline | Range accuracy @ 1 km |
|---|---|---|---|---|
| 4-drone cell, 100 m | 4 | ~1 km dome | 141 m | ±1 m |
| 3×3 grid, 500 m | 9 | ~1.5 km × 1.5 km | 1,400 m | ±0.11 m |
| 5×5 grid, 500 m | 25 | ~2.5 km × 2.5 km | 2,800 m | ±0.05 m |
| 4×4 grid, 1 km | 16 | ~3 km × 3 km | 4,200 m | ±0.04 m |
Latency budget (target → chaser cue)
| Stage | Time |
|---|---|
| Camera exposure | 0.5–2 ms |
| Frame transit (fiber) | <1 ms |
| Detection (GPU) | ~10 ms |
| Cross-camera fusion | ~5 ms |
| EKF update | ~2 ms |
| RF cue to chaser | ~5 ms |
| Total | ~23–25 ms |
Operational limits
- Wind: <25 m/s for tethered hover (typical commercial limit)
- Tether altitude: 50–100 m typical legal limit (varies by jurisdiction)
- Endurance: spotters indefinite (tethered); chaser 25–35 min per battery
- Visibility: optical degrades with rain/fog; acoustic compensates
- Night: requires IR or low-light cameras (additional cost)
- Legal: drone operation rules vary; tracking wildlife may need permits